AI Movies Recommendation System Based on K-Means Clustering Algorithm

In this article, we’ll build an artificial intelligence movie recommendation system using k-means, a clustering algorithm. We’ll recommend movies to users that are relevant to them based on their rating history. We’ll import only those records where a user has rated a movie 4+, since we want to recommend only the movies users like most. Throughout this article we use the Python programming language with its associated libraries, i.e. NumPy, Pandas, Matplotlib and Scikit-Learn, and we assume the reader is familiar with Python and these libraries.

Introduction to AI Movies Recommendation System

In today’s busy life, people don’t have time to search for their desired items; they want them delivered with as little effort as possible. Recommendation systems have therefore become important, both to help users make the right choice and to grow a product. Since data grows day by day, finding a relevant item in such a large database has become difficult: we often can’t find an item of interest from just a title, and sometimes it is even harder than that. A recommendation system helps by surfacing the most relevant items in the database for each individual user.

K-Means Clustering Algorithm

K-Means is an unsupervised machine learning algorithm that can be used to categorize data into different groups. In this article, we’ll use it to categorize users based on their 4+ ratings on movies. I won’t describe the mathematical background of the algorithm, only a little intuition; if you want the mathematics, many authors have written articles on it. Since the mathematics behind the algorithm is handled by the Scikit-Learn library, we will only understand the idea and implement it.

Figure 1 — Scatter Plot Before K-Means Clustering
Figure 2 — Scatter Plot After K-Means Clustering
Figure 3 — Graphical Abstract of K-Means Algorithm
The algorithm works as follows:
  1. Choose the number of clusters k and select k initial points, called centroids, which need not come from our dataset. To avoid the random initialization trap, which can get stuck in bad clusters, we’ll use k-means++ initialization, which Scikit-Learn provides in its k-means implementation.
  2. The algorithm assigns each data point to its closest centroid, which gives us k clusters.
  3. Each centroid is then re-centered to a position which is the actual centroid (mean) of its own cluster, and becomes the new centroid.
  4. All clusters are reset and each data point is again assigned to its new closest centroid.
  5. If the new clusters are the same as the previous ones, OR the maximum number of iterations has been reached, the algorithm stops and gives us the final clusters of our dataset. Otherwise, it returns to step 3.
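The steps above can be sketched on a toy dataset (hypothetical 2-D points, not our movie data) with Scikit-Learn’s KMeans:

```python
import numpy as np
from sklearn.cluster import KMeans

# Two obvious blobs of 2-D points (toy data, not the movie ratings)
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
              [8.0, 8.0], [8.3, 7.9], [7.8, 8.2]])

# k-means++ initialization avoids the random initialization trap (step 1)
kmeans = KMeans(n_clusters = 2, init = 'k-means++', n_init = 10, random_state = 0)
labels = kmeans.fit_predict(X)  # steps 2-5: assign, re-center, repeat until stable

# Points in the same blob end up in the same cluster
print(labels)
print(kmeans.cluster_centers_)
```

Each blob’s points share one label, and the two fitted centers sit near (1, 1) and (8, 8).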

Elbow Method

The elbow method is a common way to find the optimal number of clusters. For this, we need the within-cluster sum of squares (WCSS): the sum of squared distances of each point from the centroid of its cluster. Its mathematical formula is the following:

WCSS = Σ_{i=1}^{k} Σ_{x ∈ C_i} ||x − c_i||², where c_i is the centroid of cluster C_i.

Figure 4 — Elbow Method Plot
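As a quick sanity check (on hypothetical 2-D points, not our movie data), Scikit-Learn’s inertia_ attribute of a fitted KMeans model is exactly this WCSS, computed by hand below:

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy data, purely illustrative
X = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0],
              [8.0, 8.0], [1.0, 0.6], [9.0, 11.0]])

kmeans = KMeans(n_clusters = 2, init = 'k-means++', n_init = 10, random_state = 0).fit(X)

# WCSS = sum over points of squared distance to the centroid of their assigned cluster
wcss = sum(np.sum((x - kmeans.cluster_centers_[c]) ** 2)
           for x, c in zip(X, kmeans.labels_))

print(wcss, kmeans.inertia_)  # the two values agree
```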

Methodology Used in this Article

In this article, we’ll build a clustering-based algorithm that categorizes users into groups of similar interest using k-means. We will use only data where users have rated movies 4+, on the supposition that if a user rates a movie 4+, he/she probably likes it. We downloaded The Movies Dataset, a MovieLens dataset, from Kaggle.com. The following sections describe the whole project: Importing Dataset -> Data Engineering -> Building K-Means Clustering Model -> Analyzing Optimal Number of Clusters -> Training Model and Predicting -> Fixing Clusters -> Saving Training -> Finally, Making Recommendations for Users. The complete project can be downloaded from my GitHub repository AI Movies Recommendation System Based on K-means Clustering Algorithm. A Jupyter notebook of this article is also provided in the repository, so you can download it and play with it.

Importing All Required Libraries

import pandas as pd
print('Pandas version: ', pd.__version__)

import numpy as np
print('NumPy version: ', np.__version__)

import matplotlib
print('Matplotlib version: ', matplotlib.__version__)

from matplotlib import pyplot as plt

import sklearn
print('Scikit-Learn version: ', sklearn.__version__)

from sklearn.feature_extraction.text import CountVectorizer

from sklearn.cluster import KMeans


import pickle
print('Pickle version: ', pickle.format_version)

import sys
print('Sys version: ', sys.version[0:5])

from sys import exc_info

import ast
Pandas version:  0.25.1
NumPy version: 1.16.5
Matplotlib version: 3.1.1
Scikit-Learn version: 0.21.3
Pickle version: 4.0
Sys version: 3.7.4

Data Engineering

This section is divided into two subsections. First, we will import the data and reduce it to a sub-DataFrame, so that we can focus on the model and inspect what kinds of movies users have rated and what recommendations they get as a result. Second, we’ll perform feature engineering so that the data is in a form valid for a machine learning algorithm.

Preparing Data for Model

We have downloaded the MovieLens dataset from Kaggle.com. First we’ll import the ratings dataset, because we want users’ ratings on movies; then we’ll filter the data to keep only ratings of 4+.

ratings = pd.read_csv('./Prepairing Data/From Data/ratings.csv', usecols = ['userId', 'movieId','rating'])
print('Shape of ratings dataset is: ',ratings.shape, '\n')
print('Max values in dataset are \n',ratings.max(), '\n')
print('Min values in dataset are \n',ratings.min(), '\n')
Shape of ratings dataset is:  (26024289, 3) 

Max values in dataset are
userId 270896.0
movieId 176275.0
rating 5.0
dtype: float64

Min values in dataset are
userId 1.0
movieId 1.0
rating 0.5
dtype: float64
# Filtering data for only 4+ ratings
ratings = ratings[ratings['rating'] >= 4.0]
print('Shape of ratings dataset is: ',ratings.shape, '\n')
print('Max values in dataset are \n',ratings.max(), '\n')
print('Min values in dataset are \n',ratings.min(), '\n')
Shape of ratings dataset is:  (12981742, 3) 

Max values in dataset are
userId 270896.0
movieId 176271.0
rating 5.0
dtype: float64

Min values in dataset are
userId 1.0
movieId 1.0
rating 4.0
dtype: float64
movies_list = np.unique(ratings['movieId'])[:200]
ratings = ratings.loc[ratings['movieId'].isin(movies_list)]
print('Shape of ratings dataset is: ',ratings.shape, '\n')
print('Max values in dataset are \n',ratings.max(), '\n')
print('Min values in dataset are \n',ratings.min(), '\n')
Shape of ratings dataset is:  (776269, 3) 

Max values in dataset are
userId 270896.0
movieId 201.0
rating 5.0
dtype: float64

Min values in dataset are
userId 1.0
movieId 1.0
rating 4.0
dtype: float64
users_list = np.unique(ratings['userId'])[:100]
ratings = ratings.loc[ratings['userId'].isin(users_list)]
print('Shape of ratings dataset is: ',ratings.shape, '\n')
print('Max values in dataset are \n',ratings.max(), '\n')
print('Min values in dataset are \n',ratings.min(), '\n')
print('Total Users: ', np.unique(ratings['userId']).shape[0])
print('Total Movies which are rated by 100 users: ', np.unique(ratings['movieId']).shape[0])
Shape of ratings dataset is:  (447, 3) 

Max values in dataset are
userId 157.0
movieId 198.0
rating 5.0
dtype: float64

Min values in dataset are
userId 1.0
movieId 1.0
rating 4.0
dtype: float64

Total Users: 100
Total Movies which are rated by 100 users: 83
users_fav_movies = ratings.loc[:, ['userId', 'movieId']]
users_fav_movies = users_fav_movies.reset_index(drop = True)
users_fav_movies.T
users_fav_movies.to_csv('./Prepairing Data/From Data/filtered_ratings.csv')

Data Featuring

In this section, we will create a sparse matrix to use in k-means. For this, let’s define a function which returns a list of movies for each user in the dataset.

def moviesListForUsers(users, users_data):
    # users = a list of user IDs
    # users_data = a dataframe of users' favourite movies or users' watched movies
    users_movies_list = []
    for user in users:
        # Build a comma-separated string of the user's movie IDs, e.g. '1, 47'
        users_movies_list.append(str(list(users_data[users_data['userId'] == user]['movieId'])).split('[')[1].split(']')[0])
    return users_movies_list
users = np.unique(users_fav_movies['userId'])
print(users.shape)
(100,)
users_movies_list = moviesListForUsers(users, users_fav_movies)
print('Movies list for', len(users_movies_list), ' users')
print('A list of first 10 users favourite movies: \n', users_movies_list[:10])
Movies list for 100  users
A list of first 10 users favourite movies:
['147', '64, 79', '1, 47', '1, 150', '150, 165', '34', '1, 16, 17, 29, 34, 47, 50, 82, 97, 123, 125, 150, 162, 175, 176, 194', '6', '32, 50, 111, 198', '81']
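As a quick aside, here is how CountVectorizer turns such comma-separated ID strings into a binary user-by-movie matrix; the two sample strings below are hypothetical:

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical favourite-movie strings for two users
docs = ['1, 47', '1, 150']

# Tokens are anything that's not a comma or a space, so movie IDs survive intact
cv = CountVectorizer(token_pattern = r'[^\,\ ]+', lowercase = False)
matrix = cv.fit_transform(docs).toarray()

# Column order: movie IDs sorted as strings
features = sorted(cv.vocabulary_, key = cv.vocabulary_.get)
print(features)  # ['1', '150', '47']
print(matrix)    # row per user, 1 where the user favourited that movie
```

Note that the IDs are sorted as strings, so '150' comes before '47'; this is why we must keep the feature names alongside the matrix.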
def prepSparseMatrix(list_of_str):
    # list_of_str = a list of strings of users' favourite movies, separated by commas ","
    # Returns the sparse matrix and the feature names on which the sparse matrix is defined,
    # i.e. movie IDs in the same order as the columns of the sparse matrix
    cv = CountVectorizer(token_pattern = r'[^\,\ ]+', lowercase = False)
    sparseMatrix = cv.fit_transform(list_of_str)
    return sparseMatrix.toarray(), cv.get_feature_names()
sparseMatrix, feature_names = prepSparseMatrix(users_movies_list)
df_sparseMatrix = pd.DataFrame(sparseMatrix, index = users, columns = feature_names)
df_sparseMatrix
first_6_users_SM = users_fav_movies[users_fav_movies['userId'].isin(users[:6])].sort_values('userId')
first_6_users_SM.T
df_sparseMatrix.loc[np.unique(first_6_users_SM['userId']), list(map(str, np.unique(first_6_users_SM['movieId'])))]

Clustering Model

To cluster the data, first of all we need to find the optimal number of clusters. For this purpose, we’ll define a class for the elbow method containing two functions: one to run the k-means algorithm for different numbers of clusters, and another to show the plot.

class elbowMethod():
    def __init__(self, sparseMatrix):
        self.sparseMatrix = sparseMatrix
        self.wcss = list()
        self.differences = list()
    def run(self, init, upto, max_iterations = 300):
        for i in range(init, upto + 1):
            kmeans = KMeans(n_clusters = i, init = 'k-means++', max_iter = max_iterations, n_init = 10, random_state = 0)
            kmeans.fit(self.sparseMatrix)
            self.wcss.append(kmeans.inertia_)
        self.differences = list()
        for i in range(len(self.wcss) - 1):
            self.differences.append(self.wcss[i] - self.wcss[i + 1])
    def showPlot(self, boundary = 500, upto_cluster = None):
        if upto_cluster is None:
            WCSS = self.wcss
            DIFF = self.differences
        else:
            WCSS = self.wcss[:upto_cluster]
            DIFF = self.differences[:upto_cluster - 1]
        plt.figure(figsize = (15, 6))
        plt.subplot(121).set_title('Elbow Method Graph')
        plt.plot(range(1, len(WCSS) + 1), WCSS)
        plt.grid(b = True)
        plt.subplot(122).set_title('Differences in Each Two Consecutive Clusters')
        len_differences = len(DIFF)
        X_differences = range(1, len_differences + 1)
        plt.plot(X_differences, DIFF)
        plt.plot(X_differences, np.ones(len_differences) * boundary, 'r')
        plt.plot(X_differences, np.ones(len_differences) * (-boundary), 'r')
        plt.grid()
        plt.show()
elbow_method = elbowMethod(sparseMatrix)
elbow_method.run(1, 10)
elbow_method.showPlot(boundary = 10)
elbow_method.run(11, 30)
elbow_method.showPlot(boundary = 10)

Fitting Data on Model

Now let’s first create the k-means model with the chosen number of clusters and run it to make predictions.

kmeans = KMeans(n_clusters=15, init = 'k-means++', max_iter = 300, n_init = 10, random_state = 0)
clusters = kmeans.fit_predict(sparseMatrix)
users_cluster = pd.DataFrame(np.concatenate((users.reshape(-1,1), clusters.reshape(-1,1)), axis = 1), columns = ['userId', 'Cluster'])
users_cluster.T
def clustersMovies(users_cluster, users_data):
    # Returns a list of dataframes, one per cluster, holding each movie and the number of users in the cluster who rated it 4+
    clusters = list(users_cluster['Cluster'])
    each_cluster_movies = list()
    for i in range(len(np.unique(clusters))):
        users_list = list(users_cluster[users_cluster['Cluster'] == i]['userId'])
        users_movies_list = list()
        for user in users_list:
            users_movies_list.extend(list(users_data[users_data['userId'] == user]['movieId']))
        users_movies_counts = list()
        users_movies_counts.extend([[movie, users_movies_list.count(movie)] for movie in np.unique(users_movies_list)])
        each_cluster_movies.append(pd.DataFrame(users_movies_counts, columns = ['movieId', 'Count']).sort_values(by = ['Count'], ascending = False).reset_index(drop = True))
    return each_cluster_movies
cluster_movies = clustersMovies(users_cluster, users_fav_movies)
cluster_movies[1].T
for i in range(15):
    len_users = users_cluster[users_cluster['Cluster'] == i].shape[0]
    print('Users in Cluster ' + str(i) + ' -> ', len_users)
Users in Cluster 0 ->  35
Users in Cluster 1 -> 19
Users in Cluster 2 -> 1
Users in Cluster 3 -> 5
Users in Cluster 4 -> 8
Users in Cluster 5 -> 1
Users in Cluster 6 -> 12
Users in Cluster 7 -> 2
Users in Cluster 8 -> 1
Users in Cluster 9 -> 1
Users in Cluster 10 -> 1
Users in Cluster 11 -> 11
Users in Cluster 12 -> 1
Users in Cluster 13 -> 1
Users in Cluster 14 -> 1

Fixing Small Clusters

Since there are many clusters with very few users, and we don’t want any user to be alone in a cluster, let’s say we want at least 6 users in each cluster. So we have to move the users of each small cluster into the large cluster whose movies are most relevant to that user.
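The reassignment rule can be sketched in isolation: for each user from a removed cluster, compute the fraction of the user’s movies already present in each surviving cluster, and pick the cluster with the largest fraction (the movie IDs below are hypothetical):

```python
# Hypothetical movie lists for two surviving clusters
clusters = [[1, 16, 17, 47, 50], [6, 32, 111, 198]]

# A user from a removed (too small) cluster
user_movies = [47, 50, 198]

# Fraction of the user's movies that each cluster already contains
overlap = [sum(m in c for m in user_movies) / len(user_movies) for c in clusters]
best = max(range(len(clusters)), key = lambda i: overlap[i])

print(overlap)  # cluster 0 holds 2 of 3 movies, cluster 1 holds 1 of 3
print(best)     # so the user moves to cluster 0
```

The function below applies this same idea, and additionally appends the user’s movies that the winning cluster was missing.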

def getMoviesOfUser(user_id, users_data):
    return list(users_data[users_data['userId'] == user_id]['movieId'])

def fixClusters(clusters_movies_dataframes, users_cluster_dataframe, users_data, smallest_cluster_size = 11):
    # clusters_movies_dataframes: a list containing one dataframe of movies per cluster
    # users_cluster_dataframe: a dataframe containing user IDs and their cluster number
    # smallest_cluster_size: the smallest cluster size we want to keep; smaller clusters are removed
    each_cluster_movies = clusters_movies_dataframes.copy()
    users_cluster = users_cluster_dataframe.copy()
    # Convert each dataframe in each_cluster_movies to a list containing only movie IDs
    each_cluster_movies_list = [list(df['movieId']) for df in each_cluster_movies]
    # First, prepare a list containing the lists of users in each cluster -> [[Cluster 0 Users], [Cluster 1 Users], ..., [Cluster N Users]]
    usersInClusters = list()
    total_clusters = len(each_cluster_movies)
    for i in range(total_clusters):
        usersInClusters.append(list(users_cluster[users_cluster['Cluster'] == i]['userId']))
    uncategorizedUsers = list()
    i = 0
    # Now remove small clusters and put their users into another list named "uncategorizedUsers".
    # When we remove a cluster, we must also shift down the cluster numbers of users in the clusters after the deleted one.
    # E.g. if we delete cluster 4, there will be users whose clusters are 5, 6, 7, ..., N; we bring their cluster numbers back to 4, 5, 6, ..., N-1.
    for j in range(total_clusters):
        if len(usersInClusters[i]) < smallest_cluster_size:
            uncategorizedUsers.extend(usersInClusters[i])
            usersInClusters.pop(i)
            each_cluster_movies.pop(i)
            each_cluster_movies_list.pop(i)
            users_cluster.loc[users_cluster['Cluster'] > i, 'Cluster'] -= 1
            i -= 1
        i += 1
    for user in uncategorizedUsers:
        elemProbability = list()
        user_movies = getMoviesOfUser(user, users_data)
        if len(user_movies) == 0:
            print(user)
        user_missed_movies = list()
        for movies_list in each_cluster_movies_list:
            count = 0
            missed_movies = list()
            for movie in user_movies:
                if movie in movies_list:
                    count += 1
                else:
                    missed_movies.append(movie)
            elemProbability.append(count / len(user_movies))
            user_missed_movies.append(missed_movies)
        user_new_cluster = np.array(elemProbability).argmax()
        users_cluster.loc[users_cluster['userId'] == user, 'Cluster'] = user_new_cluster
        if len(user_missed_movies[user_new_cluster]) > 0:
            each_cluster_movies[user_new_cluster] = each_cluster_movies[user_new_cluster].append([{'movieId': new_movie, 'Count': 1} for new_movie in user_missed_movies[user_new_cluster]], ignore_index = True)
    return each_cluster_movies, users_cluster
movies_df_fixed, clusters_fixed = fixClusters(cluster_movies, users_cluster, users_fav_movies, smallest_cluster_size = 6)
j = 0
for i in range(15):
    len_users = users_cluster[users_cluster['Cluster'] == i].shape[0]
    if len_users < 6:
        print('Users in Cluster ' + str(i) + ' -> ', len_users)
        j += 1
print('Total Cluster which we want to remove -> ', j)
Users in Cluster 2 ->  1
Users in Cluster 3 -> 5
Users in Cluster 5 -> 1
Users in Cluster 7 -> 2
Users in Cluster 8 -> 1
Users in Cluster 9 -> 1
Users in Cluster 10 -> 1
Users in Cluster 12 -> 1
Users in Cluster 13 -> 1
Users in Cluster 14 -> 1
Total Cluster which we want to remove -> 10
print('Length of total clusters before fixing is -> ', len(cluster_movies))
print('Max value in users_cluster dataframe column Cluster is -> ', users_cluster['Cluster'].max())
print('And dataframe is following')
users_cluster.T
Length of total clusters before fixing is ->  15
Max value in users_cluster dataframe column Cluster is -> 14
And dataframe is following
print('Length of total clusters after fixing is -> ', len(movies_df_fixed))
print('Max value in users_cluster dataframe column Cluster is -> ', clusters_fixed['Cluster'].max())
print('And fixed dataframe is following')
clusters_fixed.T
Length of total clusters after fixing is ->  5
Max value in users_cluster dataframe column Cluster is -> 4
And fixed dataframe is following
print('Users cluster DataFrame for cluster 11 before fixing:')
users_cluster[users_cluster['Cluster'] == 11].T
Users cluster DataFrame for cluster 11 before fixing:
print('Users cluster DataFrame for cluster 4 after fixing, which should be the same as cluster 11 before fixing:')
clusters_fixed[clusters_fixed['Cluster'] == 4].T
Users cluster DataFrame for cluster 4 after fixing, which should be the same as cluster 11 before fixing:
print('Size of movies dataframe after fixing -> ', len(movies_df_fixed)) 
Size of movies dataframe after fixing ->  5
for i in range(len(movies_df_fixed)):
    len_users = clusters_fixed[clusters_fixed['Cluster'] == i].shape[0]
    print('Users in Cluster ' + str(i) + ' -> ', len_users)
Users in Cluster 0 ->  45
Users in Cluster 1 -> 21
Users in Cluster 2 -> 8
Users in Cluster 3 -> 15
Users in Cluster 4 -> 11
for i in range(len(movies_df_fixed)):
    print('Total movies in Cluster ' + str(i) + ' -> ', movies_df_fixed[i].shape[0])
Total movies in Cluster 0 ->  64
Total movies in Cluster 1 -> 39
Total movies in Cluster 2 -> 15
Total movies in Cluster 3 -> 50
Total movies in Cluster 4 -> 25
class saveLoadFiles:
    def save(self, filename, data):
        try:
            file = open('datasets/' + filename + '.pkl', 'wb')
            pickle.dump(data, file)
        except:
            err = 'Error: {0}, {1}'.format(exc_info()[0], exc_info()[1])
            print(err)
            return [False, err]
        else:
            file.close()
            return [True]
    def load(self, filename):
        try:
            # Note: don't close the file in the except branch; if open() failed, "file" was never bound
            file = open('datasets/' + filename + '.pkl', 'rb')
        except:
            err = 'Error: {0}, {1}'.format(exc_info()[0], exc_info()[1])
            print(err)
            return [False, err]
        else:
            data = pickle.load(file)
            file.close()
            return data
    def loadClusterMoviesDataset(self):
        return self.load('clusters_movies_dataset')
    def saveClusterMoviesDataset(self, data):
        return self.save('clusters_movies_dataset', data)
    def loadUsersClusters(self):
        return self.load('users_clusters')
    def saveUsersClusters(self, data):
        return self.save('users_clusters', data)
saveLoadFile = saveLoadFiles()
print(saveLoadFile.saveClusterMoviesDataset(movies_df_fixed))
print(saveLoadFile.saveUsersClusters(clusters_fixed))
[True]
[True]
load_movies_list, load_users_clusters = saveLoadFile.loadClusterMoviesDataset(), saveLoadFile.loadUsersClusters()
print('Type of Loading list of Movies dataframes of 5 Clusters: ', type(load_movies_list), ' and Length is: ', len(load_movies_list))
print('Type of Loading 100 Users clusters Data: ', type(load_users_clusters), ' and Shape is: ', load_users_clusters.shape)
Type of Loading list of Movies dataframes of 5 Clusters:  <class 'list'>  and Length is:  5
Type of Loading 100 Users clusters Data: <class 'pandas.core.frame.DataFrame'> and Shape is: (100, 2)

Recommendations for Users

Now we’ll create a class that recommends to a user the most favourited movies in his/her cluster which the user has not already added to favourites. Also, when a user adds another movie to his favourites list, we have to update the cluster movies dataset as well.
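The core recommendation step is a filtered difference: take the cluster’s movies in descending Count order and drop those the user has already favourited. A minimal sketch with hypothetical IDs:

```python
# Cluster movies already sorted by how many users favourited them (descending)
cluster_movies = [50, 1, 47, 150, 198]
user_movies = [1, 198]  # the user's own favourites

# Recommend cluster favourites the user hasn't rated yet, keeping popularity order
recommendations = [m for m in cluster_movies if m not in user_movies]
print(recommendations)  # [50, 47, 150]
```

The class below does the same thing with the persisted cluster dataframes, plus the bookkeeping for updating a cluster when a user favourites a new movie.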

class userRequestedFor:
    def __init__(self, user_id, users_data):
        self.users_data = users_data.copy()
        self.user_id = user_id
        # Find the user's cluster
        users_cluster = saveLoadFiles().loadUsersClusters()
        self.user_cluster = int(users_cluster[users_cluster['userId'] == self.user_id]['Cluster'])
        # Load the movies dataframe of the user's cluster
        self.movies_list = saveLoadFiles().loadClusterMoviesDataset()
        self.cluster_movies = self.movies_list[self.user_cluster] # dataframe
        self.cluster_movies_list = list(self.cluster_movies['movieId']) # list
    def updatedFavouriteMoviesList(self, new_movie_Id):
        if new_movie_Id in self.cluster_movies_list:
            self.cluster_movies.loc[self.cluster_movies['movieId'] == new_movie_Id, 'Count'] += 1
        else:
            self.cluster_movies = self.cluster_movies.append([{'movieId': new_movie_Id, 'Count': 1}], ignore_index = True)
        self.cluster_movies.sort_values(by = ['Count'], ascending = False, inplace = True)
        self.movies_list[self.user_cluster] = self.cluster_movies
        saveLoadFiles().saveClusterMoviesDataset(self.movies_list)

    def recommendMostFavouriteMovies(self):
        try:
            user_movies = getMoviesOfUser(self.user_id, self.users_data)
            cluster_movies_list = self.cluster_movies_list.copy()
            for user_movie in user_movies:
                if user_movie in cluster_movies_list:
                    cluster_movies_list.remove(user_movie)
            return [True, cluster_movies_list]
        except KeyError:
            err = "User history does not exist"
            print(err)
            return [False, err]
        except:
            err = 'Error: {0}, {1}'.format(exc_info()[0], exc_info()[1])
            print(err)
            return [False, err]
movies_metadata = pd.read_csv(
    './Prepairing Data/From Data/movies_metadata.csv',
    usecols = ['id', 'genres', 'original_title'])

movies_metadata = movies_metadata.loc[
    movies_metadata['id'].isin(list(map(str, np.unique(users_fav_movies['movieId']))))].reset_index(drop = True)
print("Let's take a look at the movie metadata for all the movies we have in our dataset")
movies_metadata
Let's take a look at the movie metadata for all the movies we have in our dataset
user12Movies = getMoviesOfUser(12, users_fav_movies)
for movie in user12Movies:
    title = list(movies_metadata.loc[movies_metadata['id'] == str(movie)]['original_title'])
    if title != []:
        print('Movie title: ', title, ', Genres: [', end = '')
        genres = ast.literal_eval(movies_metadata.loc[movies_metadata['id'] == str(movie)]['genres'].values[0].split('[')[1].split(']')[0])
        for genre in genres:
            print(genre['name'], ', ', end = '')
        print(end = '\b\b]')
        print('')
Movie title:  ['Dancer in the Dark'] , Genres: [Drama , Crime , Music , ]
Movie title: ['The Dark'] , Genres: [Horror , Thriller , Mystery , ]
Movie title: ['Miami Vice'] , Genres: [Action , Adventure , Crime , Thriller , ]
Movie title: ['Tron'] , Genres: [Science Fiction , Action , Adventure , ]
Movie title: ['The Lord of the Rings'] , Genres: [Fantasy , Drama , Animation , Adventure , ]
Movie title: ['48 Hrs.'] , Genres: [Thriller , Action , Comedy , Crime , Drama , ]
Movie title: ['Edward Scissorhands'] , Genres: [Fantasy , Drama , Romance , ]
Movie title: ['Le Grand Bleu'] , Genres: [Adventure , Drama , Romance , ]
Movie title: ['Saw'] , Genres: [Horror , Mystery , Crime , ]
Movie title: ["Le fabuleux destin d'Amélie Poulain"] , Genres: [Comedy , Romance , ]
user12Recommendations = userRequestedFor(12, users_fav_movies).recommendMostFavouriteMovies()[1]
for movie in user12Recommendations[:15]:
    title = list(movies_metadata.loc[movies_metadata['id'] == str(movie)]['original_title'])
    if title != []:
        print('Movie title: ', title, ', Genres: [', end = '')
        genres = ast.literal_eval(movies_metadata.loc[movies_metadata['id'] == str(movie)]['genres'].values[0].split('[')[1].split(']')[0])
        for genre in genres:
            print(genre['name'], ', ', end = '')
        print(']', end = '')
        print()
Movie title:  ['Trois couleurs : Rouge'] , Genres: [Drama , Mystery , Romance , ]
Movie title: ["Ocean's Eleven"] , Genres: [Thriller , Crime , ]
Movie title: ['Judgment Night'] , Genres: [Action , Thriller , Crime , ]
Movie title: ['Scarface'] , Genres: [Action , Crime , Drama , Thriller , ]
Movie title: ['Back to the Future Part II'] , Genres: [Adventure , Comedy , Family , Science Fiction , ]
Movie title: ["Ocean's Twelve"] , Genres: [Thriller , Crime , ]
Movie title: ['To Be or Not to Be'] , Genres: [Comedy , War , ]
Movie title: ['Back to the Future Part III'] , Genres: [Adventure , Comedy , Family , Science Fiction , ]
Movie title: ['A Clockwork Orange'] , Genres: [Science Fiction , Drama , ]
Movie title: ['Minority Report'] , Genres: [Action , Thriller , Science Fiction , Mystery , ]

Thank You

MS (Computational Mathematics), Data Scientist and Machine Learning Engineer, Mathematician, Programmer, Research Scientist, Writer.
