Model Training Module
HybridModel
A hybrid recommendation system combining content-based and collaborative filtering.
This class implements a hybrid approach that leverages both content-based filtering (using movie genres) and collaborative filtering (using user-item interactions). It uses FAISS for efficient similarity search in the collaborative filtering component and combines recommendations from both approaches.
Source code in src/train.py
__init__(movies, ratings)
Initialize the hybrid recommendation model.
Sets up the hybrid model by creating sparse matrices for collaborative filtering, training the FAISS index, and preparing all necessary mappings.
Source code in src/train.py
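As an illustration of the "necessary mappings" step, here is a minimal sketch of turning raw user and movie IDs into dense row and column indices. It is not the code in src/train.py: the function name `build_index_mappings` and the `(user_id, movie_id, rating)` tuple layout are assumptions made for this example, and the FAISS index training is deliberately left out.

```python
def build_index_mappings(ratings):
    """Map raw IDs to dense indices and collect coordinate triplets.

    ratings: iterable of (user_id, movie_id, rating) tuples (assumed layout).
    """
    user_to_row, movie_to_col = {}, {}
    rows, cols, vals = [], [], []
    for user_id, movie_id, rating in ratings:
        # Assign the next free index the first time an ID is seen.
        row = user_to_row.setdefault(user_id, len(user_to_row))
        col = movie_to_col.setdefault(movie_id, len(movie_to_col))
        rows.append(row)
        cols.append(col)
        vals.append(rating)
    return user_to_row, movie_to_col, (vals, (rows, cols))


user_to_row, movie_to_col, triplets = build_index_mappings(
    [(42, 7, 4.0), (42, 9, 3.5), (17, 7, 5.0)]
)
```

The returned `(vals, (rows, cols))` tuple has the coordinate form that a sparse constructor such as SciPy's `csr_matrix((data, (row_ind, col_ind)), shape=...)` accepts, assuming SciPy is what backs the sparse matrices here.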
find_closest_title(input_title)
Find the closest matching movie title using fuzzy string matching.
This method handles typos and variations in movie titles by finding the most similar title in the movies database using sequence matching.
Source code in src/train.py
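The fuzzy lookup can be sketched with the standard library's `difflib`, which provides sequence matching. This is an illustrative stand-in rather than the method's actual body: the standalone function `closest_title`, its `titles` argument, and the `cutoff` value are assumptions for this example.

```python
import difflib


def closest_title(input_title, titles, cutoff=0.6):
    """Return the title most similar to input_title, or None if none is close."""
    # Compare case-insensitively, but return the original spelling.
    by_lowercase = {}
    for title in titles:
        by_lowercase.setdefault(title.lower(), title)
    matches = difflib.get_close_matches(
        input_title.lower(), list(by_lowercase), n=1, cutoff=cutoff
    )
    return by_lowercase[matches[0]] if matches else None


titles = ["Toy Story (1995)", "Jumanji (1995)", "Heat (1995)"]
print(closest_title("Toy Stroy (1995)", titles))
```

Returning `None` when nothing clears the cutoff lets the caller decide whether to skip an unrecognized title or report it to the user.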
hybrid_recommend(user_ratings, content_weight=0.4, top_n=5)
Generate hybrid recommendations combining content-based and collaborative filtering.
This method implements a hybrid recommendation approach that:
1. Generates content-based recommendations using movie genres
2. Creates collaborative filtering recommendations using user similarity
3. Combines both approaches to provide diverse, high-quality recommendations
4. Handles fuzzy matching for movie titles to improve usability
Source code in src/train.py
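One plausible reading of the `content_weight` parameter is a weighted sum of the two score sets. The sketch below shows that reading only; how src/train.py actually merges the two recommendation lists is not stated here. The function `blend_scores`, the `{title: score}` dictionaries, and the min-max normalization are all assumptions made for this example.

```python
def blend_scores(content_scores, collab_scores, content_weight=0.4, top_n=5):
    """Blend two {title: score} dicts and return the top_n titles."""

    def normalize(scores):
        # Min-max scale to [0, 1] so the two sources are comparable.
        if not scores:
            return {}
        low, high = min(scores.values()), max(scores.values())
        span = high - low
        return {k: (v - low) / span if span else 1.0 for k, v in scores.items()}

    content = normalize(content_scores)
    collab = normalize(collab_scores)
    blended = {}
    for title in set(content) | set(collab):
        # A title missing from one source contributes 0 from that source.
        blended[title] = (
            content_weight * content.get(title, 0.0)
            + (1.0 - content_weight) * collab.get(title, 0.0)
        )
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(blended.items(), key=lambda item: (-item[1], item[0]))
    return [title for title, _ in ranked[:top_n]]


print(blend_scores({"A": 0.9, "B": 0.1}, {"B": 5.0, "C": 1.0}, top_n=2))
```

Under this scheme, `content_weight=0.4` leaves the remaining 0.6 for the collaborative scores, so a higher value shifts the ranking toward genre similarity.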
load_hybrid_model()
Load a pre-trained hybrid recommendation model from disk.
This helper function loads a previously saved hybrid model using joblib, ensuring proper class reference resolution for successful deserialization.
Source code in src/train.py
train_model()
Train and save the hybrid recommendation model.
This function orchestrates the complete model training process:
1. Loads movie and rating data from CSV files
2. Applies data preprocessing and downsampling for performance
3. Trains the hybrid model combining content-based and collaborative filtering
4. Saves the trained model to disk using joblib
To reduce training time and memory use, the function downsamples the data, keeping only the 20,000 most active users and the 10,000 most rated movies.
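The downsampling step can be sketched in plain Python as follows. This is an illustration, not the code in src/train.py: the function name `downsample`, the `(user_id, movie_id, rating)` tuple layout, and the choice to count activity on the full dataset before applying both filters together are assumptions made for this example.

```python
from collections import Counter


def downsample(ratings, max_users=20_000, max_movies=10_000):
    """Keep ratings by the most active users for the most rated movies.

    ratings: list of (user_id, movie_id, rating) tuples (assumed layout).
    """
    # Count how many ratings each user gave and each movie received.
    user_counts = Counter(user_id for user_id, _, _ in ratings)
    movie_counts = Counter(movie_id for _, movie_id, _ in ratings)
    top_users = {u for u, _ in user_counts.most_common(max_users)}
    top_movies = {m for m, _ in movie_counts.most_common(max_movies)}
    # Keep a rating only if both its user and its movie survive.
    return [r for r in ratings if r[0] in top_users and r[1] in top_movies]


sample = [
    (1, 10, 4.0), (1, 11, 3.0), (1, 12, 5.0),
    (2, 10, 2.0), (2, 11, 4.5),
    (3, 10, 1.0),
]
print(downsample(sample, max_users=2, max_movies=2))
```

Filtering users first and then recounting movies on the reduced set would be an equally reasonable design and can select a slightly different set of movies.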