# Features
There are three types of features (user, media, and context), plus cross features that combine them.

# User Features
- Age / gender / language / country
- Average session time: indicates whether the user prefers long or short movies
- Last genre watched
- User_actor_histogram, user_genre_histogram
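
A histogram feature like `user_genre_histogram` can be sketched as the share of each genre in a user's watch history. This is a minimal illustration, not a definitive implementation: the function name `genre_histogram` and the input format (a flat list of genre strings) are assumptions.

```python
from collections import Counter


def genre_histogram(watched_genres):
    """Return the normalized share of each genre in a watch history.

    `watched_genres` is assumed to be a list with one genre string per
    watched title, e.g. ["comedy", "drama", "comedy"].
    """
    counts = Counter(watched_genres)
    total = sum(counts.values())
    if total == 0:
        # A brand-new user has no history, so the histogram is empty.
        return {}
    return {genre: n / total for genre, n in counts.items()}


history = ["comedy", "drama", "comedy", "action"]
hist = genre_histogram(history)
print(hist)
```

The same function would work for an actor histogram if it is given one actor name per watched title instead of a genre.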
# Context Features
- Season of the year: people watch summer or Christmas movies in the matching season
- Upcoming holiday
- Days to upcoming holiday
- Time of day: people watch different content at different times of day
- Device: users might watch shorter content on a phone
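
The date-based context features above could be derived roughly as follows. Everything specific here is an assumption for illustration: the `HOLIDAYS` table is a hypothetical two-entry calendar, the time-of-day bucket boundaries are arbitrary, and the season mapping assumes the northern hemisphere.

```python
from datetime import date

# Hypothetical holiday calendar; a real system would likely load one per country.
HOLIDAYS = {
    date(2024, 7, 4): "independence_day",
    date(2024, 12, 25): "christmas",
}


def upcoming_holiday(today, holidays=HOLIDAYS):
    """Return (holiday_name, days_until) for the nearest holiday on or after today."""
    upcoming = [((day - today).days, name) for day, name in holidays.items() if day >= today]
    if not upcoming:
        return None, None
    days, name = min(upcoming)
    return name, days


def time_of_day_bucket(hour):
    """Map an hour (0-23) to a coarse bucket; the boundaries are assumed."""
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 17:
        return "afternoon"
    if 17 <= hour < 22:
        return "evening"
    return "night"


def season_of_year(month):
    """Map a month (1-12) to a season, assuming the northern hemisphere."""
    if month in (12, 1, 2):
        return "winter"
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    return "autumn"


print(upcoming_holiday(date(2024, 12, 20)))
print(time_of_day_bucket(20), season_of_year(7))
```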
# Media Features
- IMDb / Rotten Tomatoes rating
- Revenue of the movie before it arrived on Netflix
- Time_on_platform: how long the media has been on Netflix
- Media_watch_history_last_1_day: a large number of watches could indicate a blockbuster
- Genre
- Duration
- Content tags
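
The two time-based media features could be computed along these lines. This is a sketch under assumptions: the function names and the idea that watch events arrive as a list of `datetime` timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def time_on_platform_days(added_at, now):
    """Whole days since the title was added to the platform."""
    return (now - added_at).days


def watches_in_last_day(watch_timestamps, now):
    """Count watch events in the 24 hours up to and including `now`."""
    cutoff = now - timedelta(days=1)
    return sum(1 for ts in watch_timestamps if cutoff < ts <= now)


now = datetime(2024, 6, 2, 12, 0)
added_at = datetime(2024, 5, 1, 0, 0)
watches = [now - timedelta(hours=2), now - timedelta(hours=23), now - timedelta(hours=25)]

print(time_on_platform_days(added_at, now))
print(watches_in_last_day(watches, now))
```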
# Cross Features
- User-genre historical interaction: over the last 3 months / 1 year
- User and movie embedding similarity
- Language / age match
- User affinity for the media's actors / director
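
One common way to measure user and movie embedding similarity is cosine similarity; the notes do not say which measure is intended, so treat this as one reasonable choice. The embeddings are assumed to be plain lists of floats of equal length, and `language_match` is a hypothetical helper for the language match feature.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Similarity is undefined for a zero vector; return 0.0 by convention.
        return 0.0
    return dot / (norm_u * norm_v)


def language_match(user_language, media_language):
    """1 if the user's language equals the media's language, else 0."""
    return int(user_language == media_language)


user_embedding = [0.2, 0.8, 0.1]
movie_embedding = [0.1, 0.9, 0.0]
print(cosine_similarity(user_embedding, movie_embedding))
print(language_match("en", "en"))
```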
# Training Data Generation
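Most user feedback is implicit: most users won't rate something even after they finish watching it. So labels are derived from implicit signals such as how much of a video was watched.

Viewing follows weekly patterns, so use a whole month as training data and the next 2 weeks as testing data.

Labels:
- Positive: 80%+ of the video watched
- Negative: less than 10% of the video watched
- In between (e.g. 55%): the user liked it but got bored; ignore these examples

Downsample the negative examples.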
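
The labeling, time split, and downsampling steps could be sketched as below. The event format (dicts with `user_id`, `media_id`, `date`, `watch_fraction`), the 30-day reading of "a whole month", and the `negative_keep_rate` parameter are all assumptions made for illustration.

```python
import random
from datetime import date, timedelta


def label_from_watch_fraction(fraction):
    """1 for 80%+ watched, 0 for under 10%, None for the ignored middle."""
    if fraction >= 0.8:
        return 1
    if fraction < 0.1:
        return 0
    return None


def split_by_time(events, train_start, train_days=30, test_days=14):
    """Use one month of events for training and the next two weeks for testing."""
    train_end = train_start + timedelta(days=train_days)
    test_end = train_end + timedelta(days=test_days)
    train = [e for e in events if train_start <= e["date"] < train_end]
    test = [e for e in events if train_end <= e["date"] < test_end]
    return train, test


def build_examples(events, negative_keep_rate=0.5, seed=0):
    """Label events, drop ambiguous ones, and randomly downsample negatives."""
    rng = random.Random(seed)
    examples = []
    for e in events:
        label = label_from_watch_fraction(e["watch_fraction"])
        if label is None:
            continue
        if label == 0 and rng.random() >= negative_keep_rate:
            continue
        examples.append((e["user_id"], e["media_id"], label))
    return examples


events = [
    {"user_id": "u1", "media_id": "m1", "date": date(2024, 1, 5), "watch_fraction": 0.95},
    {"user_id": "u1", "media_id": "m2", "date": date(2024, 1, 10), "watch_fraction": 0.05},
    {"user_id": "u2", "media_id": "m1", "date": date(2024, 1, 20), "watch_fraction": 0.55},
    {"user_id": "u2", "media_id": "m3", "date": date(2024, 2, 3), "watch_fraction": 0.85},
]
train, test = split_by_time(events, train_start=date(2024, 1, 1))
train_examples = build_examples(train, negative_keep_rate=1.0)
print(train_examples)
```

In this sketch downsampling is applied only when building the training examples; the test set would typically be built with `negative_keep_rate=1.0` so it keeps every negative and reflects the real label distribution.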