Update README.md
Browse files
README.md
CHANGED
@@ -54,8 +54,97 @@ slope_one.SlopeOne: A simple yet accurate collaborative filtering algorithm.
|
|
54 |
|
55 |
co_clustering.CoClustering: A collaborative filtering algorithm based on co-clustering.
|
56 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
57 |
Every model was used and evaluated. When compared with one another, the different methods produced different error estimates.
|
58 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
59 |
## Intended uses
|
60 |
You can use the raw model for either masked language modeling or next sentence prediction, but it's mostly intended to
|
61 |
be fine-tuned on a downstream task. See the [model hub](https://www.google.com) to look for
|
|
|
54 |
|
55 |
co_clustering.CoClustering: A collaborative filtering algorithm based on co-clustering.
|
56 |
|
57 |
+
It is possible to pass a custom dataframe as an argument to this class. The dataframe in question needs to have 3 columns with the following names: ['userID', 'itemID', 'rating'].
|
58 |
+
|
59 |
+
```python
|
60 |
+
class Method:
    """Dispatcher that trains and tests a collaborative-filtering algorithm.

    Expects a pandas DataFrame with exactly the columns
    ['userID', 'itemID', 'rating'] (ratings on a 1-5 scale).
    """

    def __init__(self, df):
        # DataFrame holding the ['userID', 'itemID', 'rating'] columns.
        self.df = df
        # Fully-qualified names accepted by run(); the prefix selects the backend.
        self.available_methods = [
            'surprise.NormalPredictor',
            'surprise.BaselineOnly',
            'surprise.KNNBasic',
            'surprise.KNNWithMeans',
            'surprise.KNNWithZScore',
            'surprise.KNNBaseline',
            'surprise.SVD',
            'surprise.SVDpp',
            'surprise.NMF',
            'surprise.SlopeOne',
            'surprise.CoClustering',
        ]

    def show_methods(self):
        """Print the index and name of every supported method."""
        print('The available methods are:')
        for i, method in enumerate(self.available_methods):
            print(str(i) + ': ' + method)

    def run(self, the_method):
        """Dispatch *the_method* to the matching backend by its name prefix.

        Unknown prefixes print a notice instead of raising, keeping the
        original best-effort behavior.
        """
        self.the_method = the_method
        if self.the_method.startswith('surprise'):
            self.run_surprise()
        elif self.the_method.startswith('Gensim'):
            self.run_gensim()
        elif self.the_method.startswith('Transformers-'):
            self.run_transformers()
        else:
            print('This method is not defined! Try another one.')

    def run_surprise(self):
        """Fit the selected surprise algorithm on a 70/30 train/test split.

        Stores the raw predictions in ``self.predictions`` and a tidy
        DataFrame (user_id, item_id, rating, predicted_rating) in
        ``self.predictions_df``.
        """
        import importlib
        import pandas as pd
        from surprise import Reader
        from surprise import Dataset
        from surprise.model_selection import train_test_split

        reader = Reader(rating_scale=(1, 5))
        data = Dataset.load_from_df(self.df[['userID', 'itemID', 'rating']], reader)
        trainset, testset = train_test_split(data, test_size=.30)
        algo_name = self.the_method.replace("surprise.", "")
        # Resolve the algorithm class via importlib/getattr instead of the
        # original eval(f"exec(...)") + locals() trick: no code injection,
        # and locals() mutation inside a function is undefined behavior.
        the_algorithm = getattr(importlib.import_module('surprise'), algo_name)()
        the_algorithm.fit(trainset)
        self.predictions = the_algorithm.test(testset)
        list_predictions = [(uid, iid, r_ui, est) for uid, iid, r_ui, est, _ in self.predictions]
        self.predictions_df = pd.DataFrame(
            list_predictions,
            columns=['user_id', 'item_id', 'rating', 'predicted_rating'])
|
110 |
+
```
|
111 |
Every model was used and evaluated. When compared with one another, the different methods produced different error estimates.
|
112 |
|
113 |
+
|
114 |
+
The surprise library provides 4 different methods to assess the accuracy of the rating predictions: rmse, mse, mae and fcp. For further discussion of each metric, please see the package documentation.
|
115 |
+
|
116 |
+
```python
|
117 |
+
|
118 |
+
class Evaluator:
    """Scores a predictions DataFrame with one of surprise's accuracy metrics.

    Expects a DataFrame with columns
    ['user_id', 'item_id', 'rating', 'predicted_rating'] as produced by
    ``Method.run_surprise``.
    """

    def __init__(self, predictions_df):
        # Metric names accepted by run(); all map onto surprise.accuracy.
        self.available_evaluators = ['surprise.rmse', 'surprise.mse',
                                     'surprise.mae', 'surprise.fcp']
        self.predictions_df = predictions_df

    def show_evaluators(self):
        """Print the index and name of every supported evaluator."""
        print('The available evaluators are:')
        for i, evaluator in enumerate(self.available_evaluators):
            print(str(i) + ': ' + evaluator)

    def run(self, the_evaluator):
        """Dispatch *the_evaluator* by name prefix; unknown names print a notice."""
        self.the_evaluator = the_evaluator
        if self.the_evaluator.startswith('surprise'):
            self.run_surprise()
        else:
            print('This evaluator is not available!')

    def run_surprise(self):
        """Rebuild surprise Prediction objects and compute the chosen metric.

        Stores the rebuilt predictions in ``self.predictions`` and the metric
        value in ``self.acc``.
        """
        import surprise
        from surprise import accuracy

        predictions = [
            surprise.prediction_algorithms.predictions.Prediction(
                row['user_id'], row['item_id'], row['rating'],
                row['predicted_rating'], {})
            for index, row in self.predictions_df.iterrows()]
        self.predictions = predictions
        metric_name = self.the_evaluator.replace("surprise.", "")
        # Keep the original attribute format for backward compatibility.
        self.the_evaluator = 'accuracy.' + metric_name
        # getattr instead of eval(f'...'): same lookup, no arbitrary code
        # execution from a string.
        self.acc = getattr(accuracy, metric_name)(predictions, verbose=True)
|
147 |
+
```
|
148 |
## Intended uses
|
149 |
You can use the raw model for either masked language modeling or next sentence prediction, but it's mostly intended to
|
150 |
be fine-tuned on a downstream task. See the [model hub](https://www.google.com) to look for
|