Project 5: House Price Prediction
In this project, my goal was to build a predictive model for the House Prices Kaggle competition, which provides varied data about houses, such as quality, number of bathrooms, and areas, in order to predict the final sale price.
Most of the work went into exploratory data analysis and feature engineering, along with some data cleaning in both the training and test sets.
Random forest models were a poor fit for the data at hand, so I focused on models that were resilient to outliers, namely Lasso, Ridge, Elastic Net, SVM and XGBoost. These performed very well, but required differently processed training sets depending on their tendency to overfit.
I ended up with an ensemble model that scored 0.12321, ranking in the top quarter of the competition.
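For readers unfamiliar with the scoring, the competition evaluates submissions by the root mean squared error between the logarithm of the predicted price and the logarithm of the observed price, so lower is better. The sketch below shows that metric and one simple way to combine several models' predictions. The blending rule (a weighted average in log space), the equal weights, and all the numbers are assumptions made for illustration; they are not the ensemble actually used in this project.

```python
import math


def rmse_log(y_true, y_pred):
    """RMSE between the logs of the observed and predicted prices."""
    squared = [(math.log(p) - math.log(t)) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def blend(predictions, weights):
    """Weighted average of per-model predictions, taken in log space.

    `predictions` is a list of equal-length price lists, one per model.
    Averaging logs (a weighted geometric mean) is an illustrative choice,
    picked because the metric itself is computed on log prices.
    """
    total = sum(weights)
    n_houses = len(predictions[0])
    return [
        math.exp(
            sum(w * math.log(p[i]) for p, w in zip(predictions, weights)) / total
        )
        for i in range(n_houses)
    ]


# Made-up prices, for illustration only.
actual = [200000.0, 150000.0, 320000.0]
model_a = [210000.0, 140000.0, 300000.0]  # hypothetical linear model output
model_b = [190000.0, 155000.0, 335000.0]  # hypothetical boosted-tree output

blended = blend([model_a, model_b], [0.5, 0.5])  # hypothetical equal weights

score_a = rmse_log(actual, model_a)
score_b = rmse_log(actual, model_b)
score_blend = rmse_log(actual, blended)
```

In this toy example the two models err in opposite directions, so the blend scores better than either one alone. That is the usual motivation for ensembling, although real models rarely cancel out this neatly.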
Link to the GitHub repository (includes a ReadMe notebook)