Validation accuracy capped at a certain value, and graph shows rapid rises and falls

Using Keras, I’m trying to make a model that will predict whether a user will like a movie or not, based on their IMDb data. My dataset is a list of movie ratings and has about 900 samples. The model classifies samples into one of three categories based on rating (1-4 bad, 5-7 good, 8-10 great). The model’s accuracy is capped at about 0.6, and however I tinker with the settings, it never goes beyond that. The accuracy graph also concerns me, because it shows very rapid rises and falls. My question is basically whether anyone has advice on what I could do to improve my model and make it more accurate and more consistent.

My code:

import csv
import math
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from keras import models, layers
from keras.utils import to_categorical

data = {'Name': [],
    'Rating': [],
    'Running': [],
    'Year': [],
    'Genre': [],
    'Votes': [],
    'Director': [],
    'Writer': [],
    'Production Company': [],
    'Actor1': [],
    'Actor2': [],
    'Actor3': [],
    'Actor4' : []}

labels = []

with open('ratings_expanded.csv', 'r', encoding='ISO-8859-1') as file:
    reader = csv.reader(file, delimiter=',')
    try:
        for row in reader:
            #data['Name'].append(row[0])
            data['Rating'].append(float(row[1]))
            data['Running'].append(int(row[2]))
            data['Year'].append(int(row[3]))
            data['Genre'].append(row[4])
            data['Votes'].append(int(row[5]))
            data['Director'].append(row[6])
            data['Writer'].append(row[7])
            data['Production Company'].append(row[8])
            # strip the leading space the CSV leaves after each comma
            actors = [a.strip() for a in row[9].split(',')]
            data['Actor1'].append(actors[0])
            data['Actor2'].append(actors[1])
            data['Actor3'].append(actors[2])
            data['Actor4'].append(actors[3])

            labels.append(int(row[10]))

    except Exception as e:
        # note: this stops reading at the first malformed row
        print(str(e))

# bin the 1-10 user ratings into three classes:
# 1-4 -> 1 (bad), 5-7 -> 2 (good), 8-10 -> 3 (great)
labels_clean = []
for l in labels:
    if 1 <= l < 5:
        labels_clean.append(1)
    elif 5 <= l < 8:
        labels_clean.append(2)
    else:
        labels_clean.append(3)


df = pd.DataFrame(data, columns=['Rating', 'Running', 'Year', 'Genre', 'Director', 'Writer', 'Production Company', 'Actor1', 'Actor2', 'Actor3', 'Actor4'])

def Encoder(df):
    # label-encode every non-numeric column so the network gets numbers
    columnsToEncode = list(df.select_dtypes(include=['category', 'object']))
    le = LabelEncoder()
    for feature in columnsToEncode:
        try:
            df[feature] = le.fit_transform(df[feature])
        except Exception:
            print('Error encoding ' + feature)
    return df
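
# e.g. LabelEncoder maps each distinct string to an integer
# (classes are numbered in sorted order):
# ['Drama', 'Thriller', 'Drama'] -> [0, 1, 0]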

df_processed = Encoder(df)
dataset = df_processed.values


labels = to_categorical(np.asarray(labels_clean)-1)
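# to_categorical one-hot encodes the (now 0-based) classes, e.g.
# classes [1, 2, 3] -> (after the -1 shift) [[1, 0, 0], [0, 1, 0], [0, 0, 1]]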


l = len(dataset)
x_train = dataset[:math.floor(l * 0.75)]
y_train = labels[:math.floor(l * 0.75)]


x_val = dataset[math.floor(l * 0.25):]
y_val = labels[math.floor(l * 0.25):]
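# note: this slice starts at the 25% mark rather than 75%, so it
# overlaps the training rows (corrected in the answer below)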

model = models.Sequential()
model.add(layers.Dense(128, activation='relu', input_dim=11))
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(32, activation='relu'))
model.add(layers.Dense(16, activation='relu'))
model.add(layers.Dense(3, activation='softmax'))

model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(x_train, y_train,
                    epochs=20,
                    batch_size=64,
                    validation_data=(x_val, y_val))
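
The graph itself can be reproduced from the history object that fit() returns; a minimal plotting sketch, assuming matplotlib is available:

import matplotlib.pyplot as plt

# keys are 'acc' / 'val_acc' on older Keras versions
plt.plot(history.history['accuracy'], label='train')
plt.plot(history.history['val_accuracy'], label='validation')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()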

This is the accuracy graph I was talking about:

[accuracy graph]

This is an example of one row of my dataset:

Misery,7.8,107,1990,"Drama, Thriller",194775,Rob Reiner,"Stephen King, William Goldman",Castle Rock Entertainment,"James Caan, Kathy Bates, Richard Farnsworth, Frances Sternhagen, Lauren Bacall, Graham Jarvis, Jerry Potter, Thomas Brunelle, June Christopher, Julie Payne, Archie Hahn, Gregory Snegoff, Wendy Bowers, Misery the Pig",8
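
As a quick sanity check, csv.reader keeps the quoted, comma-containing fields (genre, writers, actors) as single columns; a minimal sketch using the row above, with the actor list abbreviated:

import csv

row_text = ('Misery,7.8,107,1990,"Drama, Thriller",194775,Rob Reiner,'
            '"Stephen King, William Goldman",Castle Rock Entertainment,'
            '"James Caan, Kathy Bates, Richard Farnsworth, Frances Sternhagen",8')

row = next(csv.reader([row_text]))
print(row[4])   # Drama, Thriller  -- one column despite the embedded comma
print(row[10])  # 8                -- the user rating used as the label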

And this is how my dataset looks encoded:

[screenshot of the encoded dataset]

So, as I already said, my question is: why is the model behaving like this, and is there a good course of action to improve its accuracy? Thanks in advance.

Answer

I solved my problem by normalizing the data (note that the validation slice below now also starts at the 75% mark, instead of overlapping the training rows):

l = len(dataset)
x_train = dataset[:math.floor(l * 0.75)]
y_train = labels[:math.floor(l * 0.75)]

x_val = dataset[math.floor(l * 0.75):]
y_val = labels[math.floor(l * 0.75):]

# standardize each feature to zero mean and unit variance,
# using statistics computed on the training set only
mean = x_train.mean(axis=0)
x_train -= mean
std = x_train.std(axis=0)
x_train /= std

# apply the *training* mean and std to the validation set,
# so no information from the validation rows leaks into training
x_val -= mean
x_val /= std

This resulted in an increase in accuracy to ~80%.
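
For what it’s worth, the same standardization can be written with scikit-learn’s StandardScaler (the same library LabelEncoder comes from); a sketch:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)  # learn mean/std on the training set
x_val = scaler.transform(x_val)          # reuse the training statistics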