Recap 1/15/2020: Basics of Data Science Workshop

For this week’s workshop, we gave a general overview of data science and discussed the tools that data scientists and analysts rely on. Please see below for a summary of the workshop and a link to the slides!

Covered in Workshop:

  • Debuted a sleek new look for our slides and logo!
  • Discussed our club’s purpose
  • Overview of various aggregated resources
  • The data science profession and its toolset
  • Overview of data visualization
  • Shared student projects (links on slides)
  • Discussed machine setup steps
  • Went over basic R syntax & dplyr package

Workshop Slides

Recap 11/20/2019: Special Topic: Random Forest Using Python

For this week’s special topic, we demonstrated how to build a random forest model in Python using real-world income data, and walked through an example of predicting a person’s income range from given information.

Covered in Special Topic:

  • Exploratory analysis in Python
  • Lambda functions in Python
  • Python DataFrame
  • One-Hot Encoding vs Label Encoding
  • Training and Testing
  • Model Training
  • Model Prediction
  • Case Study of predicting one individual’s income range
  • Confusion Matrix
  • Differences between accuracy, precision, and recall
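
As a quick refresher on the last two topics, here is a minimal sketch of how accuracy, precision, and recall can be derived from a binary confusion matrix. It is not taken from the workshop notebook: the function names, the ">50K" / "<=50K" labels, and the sample predictions are all made up for illustration, and it uses plain Python rather than any particular library.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive):
    """Return (tp, fp, fn, tn), treating `positive` as the positive class."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            # Predicted positive: correct (tp) or a false alarm (fp)
            counts["tp" if a == positive else "fp"] += 1
        else:
            # Predicted negative: a miss (fn) or correct (tn)
            counts["fn" if a == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["fn"], counts["tn"]


def accuracy_precision_recall(tp, fp, fn, tn):
    """Compute the three metrics, returning 0.0 when a denominator is zero."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0      # share of all predictions that are correct
    precision = tp / (tp + fp) if (tp + fp) else 0.0    # share of predicted positives that are real
    recall = tp / (tp + fn) if (tp + fn) else 0.0       # share of real positives that were found
    return accuracy, precision, recall


# Hypothetical labels, invented for this example only
actual = [">50K", "<=50K", ">50K", "<=50K", "<=50K", ">50K", "<=50K", "<=50K"]
predicted = [">50K", "<=50K", "<=50K", "<=50K", ">50K", ">50K", "<=50K", "<=50K"]

tp, fp, fn, tn = confusion_counts(actual, predicted, positive=">50K")
accuracy, precision, recall = accuracy_precision_recall(tp, fp, fn, tn)
print(f"tp={tp} fp={fp} fn={fn} tn={tn}")
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Accuracy counts every correct prediction, while precision and recall focus only on the positive class, which is why they can diverge sharply from accuracy when one income range is much more common than the other.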

Special Topic Notebook

Recap 11/13/2019: Predictive Analytics Introduction Workshop

For this week’s workshop, we covered the basics of predictive analytics by introducing exploratory analysis, linear regression, decision trees, and random forests. Students learned how to explore a dataset in R and build a linear model to make predictions. Please see below for a summary of the workshop and a link to the slides!

Covered in Workshop:

  • Went over basic R syntax & dplyr package
  • Real-world examples of predictive analytics
  • Importance & methods of cleaning data
  • Introduction of linear regression
  • Evaluation & interpretation of linear regression
  • Introduction of classification
    • Decision Trees
    • Random Forest
  • Book and class recommendations for studying data science
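
To illustrate the linear regression topics above, here is a minimal sketch of fitting and evaluating a one-variable linear model by ordinary least squares. The workshop itself used R; this version is written in plain Python purely for illustration, and the function names and data points are hypothetical rather than taken from the slides.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up data points that lie roughly on the line y = 2x
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.3f} intercept={intercept:.3f}")
print(f"R^2={r_squared(xs, ys, slope, intercept):.4f}")
print(f"prediction for x=6: {slope * 6 + intercept:.2f}")
```

The slope is read as the expected change in y for a one-unit increase in x, and R² gives a rough sense of how well the line fits the data.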

Workshop Slides