Results
Using a cleaned, feature‑engineered line‑list dataset, I explored trends in age, gender, geography and exposure history. Tree‑ensemble models (XGBoost, Random Forest) reached 93% accuracy for recovery prediction and 96% for mortality prediction, outperforming baselines such as logistic regression.
Interactive dashboards and world maps highlight hot spots, outcome ratios and time‑series trends, giving stakeholders an at‑a‑glance view of evolving risks.
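Because the outcome classes are imbalanced, accuracy is easier to interpret next to a majority-class baseline and per-class recall. The sketch below is a minimal illustration on made-up labels, not the project's data or results; the function names and the toy label vectors are assumptions introduced here.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall_per_class(y_true, y_pred):
    """Recall for each class: correct predictions / true members of that class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of the per-class recalls."""
    recalls = recall_per_class(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Toy labels (hypothetical): 0 = majority outcome, 1 = minority outcome.
y_true = [0] * 18 + [1] * 2
majority_baseline = [0] * 20            # always predicts the majority class
model_preds = [0] * 17 + [1] + [1, 0]   # one false positive, one missed minority case

for name, preds in [("baseline", majority_baseline), ("model", model_preds)]:
    print(name, accuracy(y_true, preds), recall_per_class(y_true, preds),
          round(balanced_accuracy(y_true, preds), 3))
```

In this toy case both predictors score the same accuracy, but only the second one finds any minority-class cases, which is the kind of difference a single accuracy figure can hide.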
Challenges
The COVID‑19 case data had inconsistent date formats, missing values and imbalanced outcome classes. Careful preprocessing, feature selection and resampling were needed to keep the models reliable and to avoid bias toward the majority class.
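Two of these steps, normalising mixed date formats and rebalancing classes, can be sketched as follows. This is a simplified standard-library illustration, not the pipeline actually used: the date formats, the field names `confirmed` and `outcome`, and the sample records are all assumptions, and random oversampling stands in for whichever resampling method the project applied.

```python
import random
from collections import Counter
from datetime import date, datetime

# Assumed formats; a real line list may need others, tried in this order.
DATE_FORMATS = ("%Y-%m-%d", "%d.%m.%Y", "%d/%m/%Y")


def parse_date(raw):
    """Return a date for the first matching format, or None if missing or unparseable."""
    if raw is None or not raw.strip():
        return None
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue
    return None


def oversample(rows, label_key, seed=0):
    """Randomly duplicate minority-class rows until every class matches the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


# Hypothetical records with mixed date formats and a missing value.
records = [
    {"confirmed": "2020-02-14", "outcome": "recovered"},
    {"confirmed": "14.02.2020", "outcome": "recovered"},
    {"confirmed": "15/02/2020", "outcome": "recovered"},
    {"confirmed": "", "outcome": "recovered"},
    {"confirmed": "2020-02-20", "outcome": "died"},
]

cleaned = [{**r, "confirmed": parse_date(r["confirmed"])} for r in records]
balanced = oversample(cleaned, "outcome")
counts = Counter(r["outcome"] for r in balanced)
print(counts)
```

Resampling of this kind should be applied only to the training split, so that duplicated rows do not leak into the evaluation data. Slash-separated dates are also ambiguous between day-first and month-first conventions, so the format order needs to match how the source actually records them.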