# Features

There are three types of features: user, media, and context. Cross features combine them.


# User Features

  • Age / gender / language / country
  • Average session time: whether the user prefers long or short movies
  • Last genre watched
  • User_actor_histogram, user_genre_histogram
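
As a rough illustration of the histogram features above, the sketch below builds a normalized genre histogram from a user's watch history. The function name `genre_histogram`, the fixed genre vocabulary, and the sample history are assumptions made for the example, not part of the original notes.

```python
from collections import Counter

def genre_histogram(watched_genres, vocabulary):
    """Return the share of watches per genre, in vocabulary order."""
    # Count only genres that belong to the fixed vocabulary.
    counts = Counter(g for g in watched_genres if g in vocabulary)
    total = sum(counts.values())
    if total == 0:
        # New user or no known genres: fall back to an all-zero vector.
        return [0.0] * len(vocabulary)
    return [counts[g] / total for g in vocabulary]

# Hypothetical vocabulary and watch history, for illustration only.
GENRES = ["action", "comedy", "drama"]
history = ["comedy", "drama", "comedy", "comedy"]
hist = genre_histogram(history, GENRES)
print(hist)
```

The same function could produce a user_actor_histogram by passing actor names and an actor vocabulary instead.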

# Context Features

  • Season of the year: people watch summer / Christmas movies
  • Upcoming holiday
  • Days to upcoming holiday
  • Time of day: users watch different content at different times of day
  • Device: users might watch shorter content on a phone
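
Two of the context features above can be derived from a timestamp alone. The following is a minimal sketch: the `HOLIDAYS` list and the hour boundaries for each bucket are assumptions chosen for illustration, and a real system would likely use a per-country holiday calendar.

```python
from datetime import date, datetime

# Hypothetical holiday calendar (assumption, not from the notes).
HOLIDAYS = [date(2024, 7, 4), date(2024, 12, 25)]

def days_to_upcoming_holiday(today, holidays=HOLIDAYS):
    """Days until the next holiday on or after `today`, or None if none is left."""
    upcoming = [h for h in holidays if h >= today]
    if not upcoming:
        return None
    return (min(upcoming) - today).days

def time_of_day_bucket(ts):
    """Map a timestamp to a coarse bucket; the boundaries are arbitrary choices."""
    hour = ts.hour
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 17:
        return "afternoon"
    if 17 <= hour < 22:
        return "evening"
    return "night"

print(days_to_upcoming_holiday(date(2024, 12, 20)))
print(time_of_day_bucket(datetime(2024, 12, 20, 20, 30)))
```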

# Media Features

  • IMDb / Rotten Tomatoes rating
  • Revenue of the movie before it came to Netflix
  • Time_on_platform: how long the media has been on Netflix
  • Media_watch_history_last_1_day: a high watch count could indicate a blockbuster
  • Genre
  • Duration
  • Content tags
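
The two time-based media features above could be computed roughly as follows. This sketch assumes watch events are available as plain `datetime` values and that the date a title was added is known; both function names are hypothetical.

```python
from datetime import datetime, timedelta

def time_on_platform_days(added_at, now):
    """Whole days since the title was added to the platform."""
    return (now - added_at).days

def watches_last_1_day(watch_timestamps, now):
    """Count watch events in the 24 hours up to and including `now`."""
    cutoff = now - timedelta(days=1)
    return sum(1 for t in watch_timestamps if cutoff < t <= now)

# Illustrative data only.
now = datetime(2024, 1, 10, 12, 0)
watches = [
    datetime(2024, 1, 10, 11, 0),
    datetime(2024, 1, 9, 13, 0),
    datetime(2024, 1, 9, 11, 0),
]
print(time_on_platform_days(datetime(2023, 12, 31), now))
print(watches_last_1_day(watches, now))
```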

# Cross Features

  • User-genre historical interaction: 3 months / 1 year
  • User and movie embedding similarity
  • Language / age match
  • User-actor / user-director historical interaction
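
The embedding-similarity feature above is commonly computed as a cosine similarity. The notes do not say which similarity measure is meant, so treat cosine as an assumption; the vectors below are made up for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical 3-dimensional user and movie embeddings.
user_embedding = [0.2, 0.8, 0.1]
movie_embedding = [0.1, 0.9, 0.0]
print(cosine_similarity(user_embedding, movie_embedding))
```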

# Training Data Generation

Most user actions are implicit: most users won't rate something they finished watching. So labels come from implicit signals (watch behavior) rather than explicit ratings.

Viewing follows weekly patterns, so use a whole month as training data and the next 2 weeks as testing data.

Label:

  • Positive: 80%+ of the video watched
  • Negative: <10% watched
  • In between (e.g. 55%): liked it but got bored; ignore

Downsample negative examples.
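
The labeling, time split, and negative downsampling described here can be sketched as below. The event tuple layout `(watch_date, user_id, media_id, fraction_watched)`, the function names, and the `negative_keep_rate` parameter are all assumptions for illustration; the notes do not specify a downsampling rate.

```python
import random
from datetime import date, timedelta

def label_from_watch_fraction(fraction):
    """1 for positive (>= 80% watched), 0 for negative (< 10%), None to ignore."""
    if fraction >= 0.8:
        return 1
    if fraction < 0.1:
        return 0
    return None  # in-between watches are ambiguous and dropped

def split_by_time(events, train_start, train_days=30, test_days=14):
    """Train on roughly a month of events, test on the following two weeks."""
    train_end = train_start + timedelta(days=train_days)
    test_end = train_end + timedelta(days=test_days)
    train = [e for e in events if train_start <= e[0] < train_end]
    test = [e for e in events if train_end <= e[0] < test_end]
    return train, test

def build_examples(events, negative_keep_rate=0.5, seed=0):
    """Label events and keep each negative with probability negative_keep_rate."""
    rng = random.Random(seed)
    examples = []
    for _, user_id, media_id, fraction in events:
        label = label_from_watch_fraction(fraction)
        if label is None:
            continue
        if label == 0 and rng.random() >= negative_keep_rate:
            continue  # downsample negatives
        examples.append((user_id, media_id, label))
    return examples

# Illustrative events only.
events = [
    (date(2024, 1, 5), "u1", "m1", 0.90),
    (date(2024, 1, 20), "u1", "m2", 0.05),
    (date(2024, 1, 25), "u2", "m1", 0.55),
    (date(2024, 2, 3), "u2", "m3", 0.95),
]
train, test = split_by_time(events, train_start=date(2024, 1, 1))
train_examples = build_examples(train)
```

Splitting by time rather than at random keeps future viewing out of the training set, which matches how the model would be used in production.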