[Python] BERT Model Warning (Uninitialized Weights)
Mohamad's interest is in Programming (Mobile, Web, Database and Machine Learning). He is studying at the Center For Artificial Intelligence Technology (CAIT), Universiti Kebangsaan Malaysia (UKM).
Fix: BERT Model Warning (Uninitialized Weights)
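This warning appears because the bert-base-uncased checkpoint contains only the pretrained encoder; it has no weights for the classification layer. BertForSequenceClassification therefore initializes the classifier layer randomly, and the model has to be fine-tuned on a labeled dataset before its predictions mean anything.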
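To see why a randomly initialized classifier is not useful yet, here is a minimal pure-Python sketch. It is not the actual transformers code: `random_linear_head` and `head_logits` are hypothetical helper names, the 8-value feature vector is a made-up stand-in for BERT's pooled output, and the 0.02 standard deviation is chosen to mirror BERT's default `initializer_range`.

```python
import random

def random_linear_head(hidden_size, num_labels, seed):
    """Build a weight matrix (num_labels x hidden_size) from random draws."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(hidden_size)]
            for _ in range(num_labels)]

def head_logits(head, features):
    """Apply the linear head: one dot product per label."""
    return [sum(w * x for w, x in zip(row, features)) for row in head]

# Made-up feature vector standing in for the encoder output of one sentence
features = [0.5, -1.2, 0.3, 0.8, -0.4, 1.1, -0.7, 0.2]

# Two "fresh" heads that differ only in their random seed
head_a = random_linear_head(len(features), num_labels=2, seed=0)
head_b = random_linear_head(len(features), num_labels=2, seed=1)

logits_a = head_logits(head_a, features)
logits_b = head_logits(head_b, features)

# Same input, different logits: the output depends on the random draw,
# not on anything learned about sentiment.
print("Head A logits:", logits_a)
print("Head B logits:", logits_b)
```

The same sentence features give different logits for each seed, which is the situation the warning is pointing at.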
🔹 Solution: Load a Pretrained Sentiment Model
Instead of loading the base BERT checkpoint, load a checkpoint that already includes a trained sentiment classifier.
For example, replace:
model_name = "bert-base-uncased"
model = BertForSequenceClassification.from_pretrained(model_name, num_labels=2).to(device)
… with:
model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
model = BertForSequenceClassification.from_pretrained(model_name).to(device)
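This checkpoint ships with a trained classification head, so the warning disappears and the predictions are meaningful without any fine-tuning. Note that it predicts 5 classes rather than 2, which is why num_labels is no longer passed.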
Full code example:
import torch
from transformers import BertTokenizer, BertForSequenceClassification
from torch.utils.data import DataLoader, TensorDataset

# Step 1: Check if CUDA is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Step 2: Load the pretrained sentiment model and its tokenizer
model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name).to(device)

# Step 3: Create Sample Dataset
sentences = ["I love this product!", "This is the worst experience ever."]

# Tokenize sentences and move the tensors to the selected device
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt").to(device)

# Step 4: Create DataLoader (inference only, so no labels are needed)
dataset = TensorDataset(inputs["input_ids"], inputs["attention_mask"])
dataloader = DataLoader(dataset, batch_size=2)

# Readable names for the class indices 0-4
sentiment_names = ["Very Negative", "Negative", "Neutral", "Positive", "Very Positive"]

# Step 5: Run BERT Model on the selected device
model.eval()  # Set model to evaluation mode
all_predictions = []
with torch.no_grad():
    for input_ids, attention_mask in dataloader:
        outputs = model(input_ids, attention_mask=attention_mask)
        logits = outputs.logits
        predictions = torch.argmax(logits, dim=1)  # Highest-scoring class per sentence
        all_predictions.extend(predictions.cpu().tolist())
        print("Predictions:", predictions.cpu().numpy())  # Move results to CPU for printing

# Print each sentence with its predicted sentiment
for sentence, label in zip(sentences, all_predictions):
    print(f"Sentence: '{sentence}' → Sentiment: {sentiment_names[label]}")
Output:
Using device: cuda
Predictions: [4 0]
Sentence: 'I love this product!' → Sentiment: Very Positive
Sentence: 'This is the worst experience ever.' → Sentiment: Very Negative
The nlptown/bert-base-multilingual-uncased-sentiment model is trained for sentiment analysis on product reviews and predicts a rating from 1 to 5 stars, so its 5 class indices (0–4) can be read as:
0 → Very Negative
1 → Negative
2 → Neutral
3 → Positive
4 → Very Positive
This means:
The first sentence ("I love this product!") got a score of 4 (Very Positive)
The second sentence ("This is the worst experience ever.") got a score of 0 (Very Negative)
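If you also want to know how confident a prediction is, the logits can be turned into probabilities with a softmax before picking the top class. The sketch below is pure Python so it runs without downloading the model; `softmax` and `interpret` are hypothetical helper names, and `example_logits` is made up for illustration, not real model output.

```python
import math

# Readable names for class indices 0-4, as listed above
SENTIMENT_NAMES = ["Very Negative", "Negative", "Neutral", "Positive", "Very Positive"]

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def interpret(logits):
    """Return (class index, sentiment name, probability of that class)."""
    probs = softmax(logits)
    index = max(range(len(probs)), key=probs.__getitem__)  # same idea as argmax
    return index, SENTIMENT_NAMES[index], probs[index]

# Hypothetical logits for one sentence (illustrative values only)
example_logits = [-2.1, -1.3, 0.2, 1.5, 3.0]

index, name, confidence = interpret(example_logits)
print(f"class {index} -> {name} (confidence {confidence:.2f})")
```

With the real model, you could pass one row of the logits tensor as a plain list, for example `interpret(logits[0].tolist())`.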