Help with Sentiment Analysis code is required - unexpected outcome!

  • BarryA
    New Member
    • Jun 2022
    • 19

    Hello everyone,

    I'm hoping you can help me with a problem in my Sentiment Analysis project. I've been trying to apply a simple sentiment analysis model to a dataset of movie reviews, but the results are not what I expected. Here's the relevant section of my code:
    Code:
    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    
    # Load the movie reviews dataset
    data = pd.read_csv('movie_reviews.csv')
    
    # Preprocess the data
    # ... (code for data preprocessing)
    
    # Split the dataset into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(data['review'], data['sentiment'], test_size=0.2, random_state=42)
    
    # Vectorize the text data using CountVectorizer
    vectorizer = CountVectorizer()
    X_train_vectorized = vectorizer.fit_transform(X_train)
    X_test_vectorized = vectorizer.transform(X_test)
    
    # Train the Logistic Regression model
    model = LogisticRegression()
    model.fit(X_train_vectorized, y_train)
    
    # Evaluate the model
    accuracy = model.score(X_test_vectorized, y_test)
    print(f"Accuracy: {accuracy}")

    When I run the code, the accuracy is consistently around 50%, which is no better than random guessing on a two-class problem. I suspect there's a problem with how I'm vectorizing the text input or training the model, but I can't figure out what's causing it.
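
    One way to put that 50% figure in context is to compare it with a majority-class baseline, i.e. the accuracy of always predicting the most frequent label. The following is only a minimal sketch, assuming the labels can be passed as a plain sequence such as `y_test`; the helper name `majority_baseline` and the sample labels are hypothetical and not part of the project above.

```python
from collections import Counter


def majority_baseline(labels):
    """Return (most common label, accuracy of always predicting that label)."""
    labels = list(labels)
    if not labels:
        raise ValueError("labels must not be empty")
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


# Hypothetical labels standing in for y_test (not real data from the project)
sample_labels = ["pos", "neg", "pos", "pos", "neg", "pos", "neg", "pos"]
label, baseline = majority_baseline(sample_labels)
print(f"Majority class: {label}, baseline accuracy: {baseline:.3f}")
```

    If `model.score(...)` comes out at or below this baseline, the classifier has learned little beyond the class proportions, which would point towards the data or labels rather than the choice of model.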

    I verified the dataset and read more about it in this article, and it appears to load successfully with both 'review' and 'sentiment' columns. I also tried a simple Naive Bayes classifier, but it didn't improve the accuracy much.
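
    Beyond confirming that the two columns exist, a slightly deeper check of the loaded data might look like the sketch below. It assumes the column names 'review' and 'sentiment' from the code above; the function name `summarize_reviews` and the toy DataFrame are made up for illustration and do not reflect the actual contents of movie_reviews.csv.

```python
import pandas as pd


def summarize_reviews(df, text_col="review", label_col="sentiment"):
    """Collect basic health checks for a text-classification DataFrame."""
    missing_cols = [c for c in (text_col, label_col) if c not in df.columns]
    if missing_cols:
        raise KeyError(f"Missing columns: {missing_cols}")
    text = df[text_col]
    return {
        "rows": len(df),
        "missing_text": int(text.isna().sum()),
        "missing_labels": int(df[label_col].isna().sum()),
        # Reviews that are present but contain only whitespace
        "blank_text": int(text.dropna().astype(str).str.strip().eq("").sum()),
        # Repeated review texts (every occurrence after the first)
        "duplicate_reviews": int(text.dropna().duplicated().sum()),
        "label_counts": df[label_col].value_counts(dropna=False).to_dict(),
    }


# Toy data for illustration only (hypothetical, not the real dataset)
toy = pd.DataFrame({
    "review": ["Great film", "Terrible plot", "Great film", "  ", None],
    "sentiment": ["positive", "negative", "positive", "negative", "positive"],
})
report = summarize_reviews(toy)
for key, value in report.items():
    print(f"{key}: {value}")
```

    On the real data this would be called as `summarize_reviews(data)`. Unexpected label values, many blank reviews, or a heavily skewed `label_counts` would each be worth looking into before changing the model.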

    Could you please review the code and let me know if you spot any flaws or possible improvements that would raise the accuracy of my sentiment analysis model?

    Thank you in advance for your help!