[Python] BERT Model Warning (Uninitialized Weights)


Fix: BERT Model Warning (Uninitialized Weights)

This warning appears because BertForSequenceClassification adds a brand-new classification head on top of the pretrained encoder. The head's weights aren't in the bert-base-uncased checkpoint, so they are randomly initialized, and transformers warns you that the model needs fine-tuning before its predictions mean anything.
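For reference, here is a minimal reproduction of the warning (the exact wording varies across transformers versions, so the message below is paraphrased):

from transformers import BertForSequenceClassification

# The classification head is new, so its weights are not in the checkpoint
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
# Warning (paraphrased): Some weights of BertForSequenceClassification were not
# initialized from the model checkpoint at bert-base-uncased and are newly
# initialized: ['classifier.weight', 'classifier.bias']. You should probably
# TRAIN this model on a down-stream task to be able to use it for predictions.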

🔹 Solution: Load a Pretrained Sentiment Model

Instead of using a fresh BERT model, use a pretrained sentiment classifier:

For example, replace:

model_name = "bert-base-uncased"
model = BertForSequenceClassification.from_pretrained(model_name, num_labels=2).to(device)

… with:

model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
model = BertForSequenceClassification.from_pretrained(model_name).to(device)

This removes the warning and gives meaningful sentiment predictions out of the box. Note that num_labels=2 is gone: this checkpoint is a 5-way classifier, and the label count is read from its config.
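A quick sanity check, if you want to confirm what the head expects (the id2label strings shown here come from the checkpoint's config, so treat them as indicative):

print(model.config.num_labels)  # 5
print(model.config.id2label)    # e.g. {0: '1 star', 1: '2 stars', ..., 4: '5 stars'}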

Full code example:

import torch
from transformers import BertTokenizer, BertForSequenceClassification
from torch.utils.data import DataLoader, TensorDataset

# Step 1: Check if CUDA is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Step 2: Load Pretrained BERT
model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name).to(device)

# Step 3: Create Sample Dataset
sentences = ["I love this product!", "This is the worst experience ever."]

# Tokenize sentences (padded/truncated to a shared length, returned as PyTorch tensors)
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt").to(device)

# Step 4: Create DataLoader (inference only, so no labels are needed)
dataset = TensorDataset(inputs["input_ids"], inputs["attention_mask"])
dataloader = DataLoader(dataset, batch_size=2)

# Step 5: Run BERT Model on CUDA
model.eval()  # Set model to evaluation mode
all_preds = []
with torch.no_grad():
    for batch in dataloader:
        input_ids, attention_mask = batch
        outputs = model(input_ids, attention_mask=attention_mask)
        logits = outputs.logits
        predictions = torch.argmax(logits, dim=1)
        print("Predictions:", predictions.cpu().numpy())  # Move results to CPU for printing
        all_preds.extend(predictions.cpu().numpy().tolist())

# Step 6: Map the 0-4 class indices to human-readable sentiment labels
label_names = ["Very Negative", "Negative", "Neutral", "Positive", "Very Positive"]
for sentence, pred in zip(sentences, all_preds):
    print(f"Sentence: '{sentence}' → Sentiment: {label_names[pred]}")

Output:

Using device: cuda
Predictions: [4 0]
Sentence: 'I love this product!' → Sentiment: Very Positive
Sentence: 'This is the worst experience ever.' → Sentiment: Very Negative
  • The nlptown/bert-base-multilingual-uncased-sentiment model is trained for sentiment analysis with 5 labels (0–4):

    • 0 → Very Negative

    • 1 → Negative

    • 2 → Neutral

    • 3 → Positive

    • 4 → Very Positive

This means:

  • The first sentence ("I love this product!") got a score of 4 (Very Positive)

  • The second sentence ("This is the worst experience ever.") got a score of 0 (Very Negative)
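By the way, if you just need predictions without managing tokenization and batching yourself, the transformers pipeline API wraps the same checkpoint in a couple of lines. A minimal sketch (the exact label strings and scores come from the model, so yours may differ):

from transformers import pipeline

classifier = pipeline("sentiment-analysis",
                      model="nlptown/bert-base-multilingual-uncased-sentiment")
print(classifier(["I love this product!", "This is the worst experience ever."]))
# e.g. [{'label': '5 stars', 'score': 0.9...}, {'label': '1 star', 'score': 0.8...}]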
