Content-based filtering is one of the most common recommendation approaches. It makes recommendations based on items a user already likes or uses. For example, if a user likes the movie Frozen, content-based filtering will find movies similar to Frozen according to characteristics such as category, producers, actors, and running time.
“The steps in recommending products or contents to the user in content based filtering are as follows:
Identify the factors which describe and differentiate the products and the factors which might influence whether a user would buy the product or not,
Represent all the products in terms of those factors, descriptors or attributes,
Create a tuple or number vector for each product that represents the strength of each factors for the product,
Start to look at the users and their histories to create a user profile based on their history. It will have the same number of factors and their strength would indicate how much influenced the user is towards that factor,
Recommend the user those products that are nearest to them in terms of those factors.”
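The steps quoted above can be sketched in a few lines of Python. This is a minimal illustration, not the original article's implementation: the four factors, the movie scores, and the function names (`build_profile`, `recommend`) are all made up for the example, and cosine similarity is just one common way to measure which items are "nearest".

```python
from math import sqrt

# Hypothetical item vectors (steps 1-3): the strength of each factor per movie.
# Assumed factors, in order: [animation, musical, action, family-friendly]
ITEMS = {
    "Frozen":   [1.0, 1.0, 0.1, 1.0],
    "Moana":    [1.0, 0.9, 0.3, 1.0],
    "Tangled":  [1.0, 0.8, 0.2, 1.0],
    "Die Hard": [0.0, 0.0, 1.0, 0.1],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profile(liked):
    """Step 4: average the vectors of the items in the user's history."""
    vectors = [ITEMS[title] for title in liked]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def recommend(liked, k=2):
    """Step 5: return the k unseen items closest to the user profile."""
    profile = build_profile(liked)
    candidates = [title for title in ITEMS if title not in liked]
    candidates.sort(key=lambda title: cosine(profile, ITEMS[title]), reverse=True)
    return candidates[:k]


print(recommend(["Frozen"]))
```

With these made-up scores, a user who likes Frozen is steered toward the other animated musicals rather than the action film. In practice the factor scores would come from real item metadata, not hand-typed numbers.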
For more, please refer to the original article below:
This week we welcomed Chris, Denise, Jake, and Jennie from Symetra to come speak about their company and their data analytics division! We learned about some of the projects they work on, the challenges they face, and the best things about working at Symetra.
Symetra is looking for interns this summer! You can learn more and apply here:
For this week’s workshop, we went over the data science realm in general and talked about various tools leveraged by data scientists and analysts. Please see below for a summary of the workshop and link to the slides!
The BCG GAMMA Case Project was our biggest project as a club to date. Thirty participants (six teams of five) gathered at BCG’s Seattle office on October 30th for the case kickoff. The case focused on a real-life challenge BCG had consulted on in the past, and teams were tasked with building random forest models to determine the factors affecting customer churn. Over the three weeks following the kickoff, teams developed their models, primarily in R and Python. On November 21st, the teams presented their recommendations to BCG consultants.
The case was a textbook example of fulfilling our club’s mission to bridge the gap between traditionally technical and nontechnical disciplines. We accomplished this by balancing each team with both business/econ students and info/CS/data science students, and by pairing graduate and undergraduate students. The outcome? Everyone learned something new about their fellow Huskies!
We couldn’t have been more impressed by our peers at the University of Washington, who presented some amazing case projects to BCG GAMMA last week! Thank you to Allen Chen, Spencer Barnes, and Annie Lai for judging the presentations. It was a blast to participate! A major shout-out to Nam Pho for the initial connection and for mentoring club leadership throughout the process! It was a wonderful learning experience for all involved.
For this week’s special topic, we demonstrated how to build a random forest model in Python using real-world income data, and walked through an example of predicting a person’s income range from given information.
Covered in Special Topic:
Exploratory analysis in Python
Lambda functions in Python
One-Hot Encoding vs Label Encoding
Training and Testing
Case Study of predicting one individual’s income range
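To make the difference between label encoding and one-hot encoding concrete, here is a small sketch in plain Python that also uses lambda functions. It is not the workshop's code: the `education` column and its values are hypothetical, and a real project would more likely use a library such as pandas or scikit-learn for this step.

```python
# Hypothetical records; the column name and values are made up for illustration.
rows = [
    {"education": "Bachelors", "hours": 40},
    {"education": "HS-grad", "hours": 35},
    {"education": "Masters", "hours": 50},
    {"education": "HS-grad", "hours": 20},
]

# Fix a stable ordering of the distinct categories.
categories = sorted({row["education"] for row in rows})

# Label encoding: each category becomes a single integer.
# Compact, but it implies an ordering (0 < 1 < 2) that may not be meaningful.
label_of = {category: index for index, category in enumerate(categories)}
label_encoded = list(map(lambda row: label_of[row["education"]], rows))

# One-hot encoding: each category becomes its own 0/1 column.
# No implied ordering, at the cost of one column per category.
one_hot_encoded = list(
    map(lambda row: [int(row["education"] == c) for c in categories], rows)
)

print(categories)
print(label_encoded)
print(one_hot_encoded)
```

Tree-based models such as random forests can often work with either encoding, while linear models generally need one-hot encoding for unordered categories so that the integer labels are not read as magnitudes.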
For this week’s workshop, we covered the basics of predictive analytics by introducing the concepts of exploratory analysis, linear regression, decision trees, and random forests. Students learned how to explore a dataset in R and build a linear model to make predictions. Please see below for a summary of the workshop and a link to the slides!
Covered in Workshop:
Basic R syntax & the dplyr package
Real-world examples of predictive analytics
Importance & methods of cleaning data
Introduction of linear regression
Evaluation & interpretation of linear regression
Introduction of classification
Book and class recommendations for studying data science