DS 6030 | Fall 2024 | University of Virginia
Homework #7: Stacking and Boosting
Stacking for Kaggle
You are to make at least one official entry in the House Prices: Advanced Regression Techniques Kaggle contest using stacking or model averaging; at least one component model must be a boosting model.
- You will need to register on Kaggle (it's free).
- Read the details of the contest. Understand the data and evaluation function.
- Make at least one submission that uses stacking or model averaging.
- If you get a score on the public leaderboard of \(\text{RMSE}<0.50\) (note that RMSE is calculated on the log scale), you receive full credit; otherwise, you’ll lose 10 points.
- I’ll allow teaming. Each team member can produce one component model and then use stacking or model averaging to combine predictions.
- You don’t need to team, but must still combine multiple models. At least one of the component models should be boosting.
- Each person must submit the following in Canvas:
  - Code (if teaming, your code and the shared stacking code)
  - Kaggle name (or team name) so we can ensure you had a valid submission
  - Your score and current ranking on the Kaggle leaderboard
- Top 5 scores get 2 bonus points
  - Teams will split their bonus points among team members
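To make the scoring and the model-averaging option above concrete, here is a minimal sketch in plain Python. It assumes the leaderboard metric is the RMSE between the log of the predicted price and the log of the observed price, as the note on the log scale suggests. The function names `rmse_log` and `average_on_log_scale` are hypothetical, and the numbers are made up for illustration. Averaging on the log scale is one reasonable choice here, not a requirement; averaging on the dollar scale is also a valid form of model averaging.

```python
import math

def rmse_log(observed, predicted):
    """RMSE between log(predicted) and log(observed); inputs must be positive."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(math.log(p) - math.log(o)) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def average_on_log_scale(model_predictions, weights=None):
    """Weighted average of log-predictions across models, mapped back with exp."""
    if weights is None:
        weights = [1.0] * len(model_predictions)  # equal weights by default
    total = sum(weights)
    n_obs = len(model_predictions[0])
    blended = []
    for i in range(n_obs):
        log_mean = sum(
            w * math.log(preds[i]) for w, preds in zip(weights, model_predictions)
        ) / total
        blended.append(math.exp(log_mean))
    return blended

# Made-up numbers for illustration only (not contest data).
observed = [200000.0, 150000.0, 320000.0]
boosting_preds = [210000.0, 140000.0, 300000.0]  # e.g. from a boosting model
linear_preds = [190000.0, 160000.0, 350000.0]    # e.g. from a teammate's model

blend = average_on_log_scale([boosting_preds, linear_preds])
print(rmse_log(observed, boosting_preds))
print(rmse_log(observed, linear_preds))
print(rmse_log(observed, blend))
```

In practice, the weights would be chosen on held-out data rather than on the data used to fit the component models.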
Note: Check out the Kaggle notebooks, which let you make submissions directly from the notebook. It’s very similar to using Rivanna’s OnDemand in that you can make an RMarkdown/Jupyter notebook or R/Python scripts that run on the cloud. Free CPU (4 cores, 30 GB RAM) - amazing! Let your laptops cool off after all their hard work this semester.
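For the stacking requirement, the following is a minimal sketch of a stacked model with one boosting component, assuming Python with scikit-learn and NumPy is available in your notebook. The data are synthetic stand-ins generated in the code, not the contest data, and the component models and hyperparameters are arbitrary illustrative choices. The target is modeled on the log scale to match how the score is calculated.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the House Prices data): 5 numeric features
# and a log-price target that is linear in the features plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
coefs = np.array([0.3, -0.2, 0.1, 0.0, 0.25])
log_price = 12.0 + X @ coefs + rng.normal(scale=0.1, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, log_price, test_size=0.25, random_state=0
)

# Component models: one boosting model plus one linear model.
# The final estimator is fit on 5-fold out-of-fold predictions of the components.
stack = StackingRegressor(
    estimators=[
        ("gbm", GradientBoostingRegressor(
            n_estimators=200, learning_rate=0.05, max_depth=3, random_state=0)),
        ("ridge", Ridge(alpha=1.0)),
    ],
    final_estimator=Ridge(alpha=1.0),
    cv=5,
)
stack.fit(X_train, y_train)

# Predictions are on the log scale; exponentiate to get prices for a submission.
pred_log = stack.predict(X_test)
rmse_log = float(np.sqrt(np.mean((pred_log - y_test) ** 2)))
pred_price = np.exp(pred_log)
print(f"hold-out RMSE on the log scale: {rmse_log:.3f}")
```

With the real data, the features would first need preprocessing (missing values, categorical encoding), which this sketch omits.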