
How can I count how many times each rating (0.5, 1, 1.5, ..., 5) was given to a movie across all users?

I have a data frame of user ratings and would like to know how many times each rating value (0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5) was given by users to a particular movie, Ocean's Eleven (2001), so that I can calculate the Pearson correlation using the formula.

Below is my code:

import numpy as np
import pandas as pd

# Raw strings keep the backslashes literal ("\r" would otherwise be read as a carriage return)
ratings_data = pd.read_csv(r"D:\ratings.csv")
movies_name = pd.read_csv(r"D:\movies.csv")

movies_data = pd.merge(ratings_data, movies_name, on='movieId')
 
movies_data.groupby('title')['rating'].mean()

movies_data.groupby('title')['rating'].count()

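# Build the summary frame first so that the new column has something to attach to
average_ratings_count = pd.DataFrame(movies_data.groupby('title')['rating'].mean())
average_ratings_count['rating_counts'] = movies_data.groupby('title')['rating'].count()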

https://i.stack.imgur.com/1eFLV.png

matrix_user_ratings = movies_data.pivot_table(index='userId', columns='title', values='rating')

oceanRatings = matrix_user_ratings["Ocean's Eleven (2001)"]
oceanRatings.head(20)

userId
1     NaN
2     NaN
3     NaN
4     NaN
5     NaN
6     NaN
7     4.0
8     NaN
9     NaN
10    NaN
11    NaN
12    NaN
13    NaN
14    NaN
15    NaN
16    NaN
17    NaN
18    4.0
19    NaN
20    NaN
Name: Ocean's Eleven (2001), dtype: float64

From this output I can only see that there are two 4.0 ratings among the first 20 users, but I have 600+ users because I am using the MovieLens dataset.


Answer

You can use groupby on the merged data frame, where rating is still a column:

oceanRatings = movies_data[movies_data['title'] == "Ocean's Eleven (2001)"].groupby('rating')['rating'].count()

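Or call value_counts() on the pivoted column (NaN entries are dropped by default, and sort_index() orders the result by rating value instead of by count):

oceanRatings = matrix_user_ratings["Ocean's Eleven (2001)"].value_counts().sort_index()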
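
If you also want the rating values that nobody used to appear with a count of 0, a minimal sketch could look like the one below. The small data frame is a made-up stand-in for the merged MovieLens data, and the names rating_levels and rating_counts are only illustrative.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the merged MovieLens frame (hypothetical data, not the real dataset)
movies_data = pd.DataFrame({
    "userId": [1, 2, 3, 4, 5, 6],
    "title": ["Ocean's Eleven (2001)"] * 5 + ["Heat (1995)"],
    "rating": [4.0, 4.0, 3.5, 5.0, 4.0, 2.0],
})

# Same pivot as in the question: one row per user, one column per title
matrix_user_ratings = movies_data.pivot_table(
    index="userId", columns="title", values="rating"
)

# Every rating value on the half-star scale: 0.5, 1.0, ..., 5.0
rating_levels = np.arange(1, 11) / 2

rating_counts = (
    matrix_user_ratings["Ocean's Eleven (2001)"]
    .value_counts()                         # users without a rating (NaN) are dropped
    .reindex(rating_levels, fill_value=0)   # add unused levels, ordered by rating
)
print(rating_counts)
```

With reindex, the result always has ten rows in rating order, which may be easier to plug into a hand-written formula than a result that only lists the values that occur.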
User contributions licensed under: CC BY-SA