Loan Default Prediction
ML classification for loan default probability using real-world financial data.
About this project
Data Science Coding Challenge: Loan Default Prediction using Machine Learning (UIC). Developed a predictive model that estimates the probability of loan default from real-world financial data. Replaced the baseline dummy classifier with Logistic Regression, adding feature preprocessing, feature engineering, and one-hot encoding for categorical variables. Aligned the training and test datasets so both share the same features, then generated probability outputs for over 100,000 loan records in a submission-ready format (LoanID, predicted_probability). Gained hands-on experience with data preprocessing, model training, evaluation (AUC), and submission pipelines in Python.
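The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes pandas and scikit-learn, uses a tiny made-up dataset, and the column names Income, LoanPurpose, and Default are hypothetical (only LoanID and predicted_probability come from the description). The scaling step and the hold-out split are also assumptions about what "feature preprocessing" and "evaluation (AUC)" might look like.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy stand-ins for the challenge data; all columns except LoanID are hypothetical.
train_df = pd.DataFrame({
    "LoanID": [f"T{i}" for i in range(40)],
    "Income": [30 + 2 * i for i in range(40)],
    "LoanPurpose": ["Auto", "Home", "Business", "Education"] * 10,
    "Default": [1 if (i < 15 or i % 7 == 0) else 0 for i in range(40)],
})
test_df = pd.DataFrame({
    "LoanID": ["A1", "A2", "A3"],
    "Income": [35, 60, 95],
    "LoanPurpose": ["Auto", "Other", "Home"],  # "Other" never appears in training
})

TARGET, ID_COL = "Default", "LoanID"

# One-hot encode the categorical columns of each dataset separately.
X_train = pd.get_dummies(train_df.drop(columns=[ID_COL, TARGET]), dtype=float)
X_test = pd.get_dummies(test_df.drop(columns=[ID_COL]), dtype=float)
y_train = train_df[TARGET]

# Align the test features to the training columns: categories missing from the
# test set become all-zero columns, categories unseen in training are dropped.
X_test = X_test.reindex(columns=X_train.columns, fill_value=0)

# Hold out part of the training data to estimate AUC (assumed evaluation setup).
X_fit, X_val, y_fit, y_val = train_test_split(
    X_train, y_train, test_size=0.25, stratify=y_train, random_state=0
)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_fit, y_fit)
val_auc = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])

# Refit on all training rows, then score the test set with P(Default = 1).
model.fit(X_train, y_train)
submission = pd.DataFrame({
    "LoanID": test_df[ID_COL],
    "predicted_probability": model.predict_proba(X_test)[:, 1],
})

print(f"validation AUC: {val_auc:.3f}")
print(submission)
```

The reindex call is what keeps the two datasets aligned: without it, a category that appears in only one file would give the model a different number of input columns at prediction time than it saw during training.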
Tech stack