Dataset Analysis | Dylan Skalman

These are some of the exploratory data analysis (EDA) and predictive modeling projects I’ve completed on real-world biomedical datasets.

Glioma EDA & Predictive Modeling

I analyzed a UCI dataset with 3 clinical and 20 molecular features to classify glioma grades. Logistic Regression evaluated with 10-fold cross-validation yielded 87.3% accuracy, 80.2% precision, and 92.9% recall.
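For readers unfamiliar with the three metrics quoted above, here is a minimal sketch of how accuracy, precision, and recall can be pooled across the held-out folds of a 10-fold cross-validation. It is not the project's actual code: the function names, the toy data, and the `threshold_fit_predict` stand-in classifier (used in place of Logistic Regression to keep the example dependency-free) are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    splits, start = [], 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        splits.append((train, test))
        start = stop
    return splits


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, and recall for 0/1 labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def cross_validated_metrics(
    X: Sequence[float],
    y: Sequence[int],
    fit_predict: Callable[[List[float], List[int], List[float]], List[int]],
    k: int = 10,
) -> Dict[str, float]:
    """Pool held-out predictions from every fold, then score them once."""
    pooled_true: List[int] = []
    pooled_pred: List[int] = []
    for train, test in k_fold_indices(len(y), k):
        preds = fit_predict(
            [X[i] for i in train], [y[i] for i in train], [X[i] for i in test]
        )
        pooled_true.extend(y[i] for i in test)
        pooled_pred.extend(preds)
    return binary_metrics(pooled_true, pooled_pred)


def threshold_fit_predict(
    X_train: List[float], y_train: List[int], X_test: List[float]
) -> List[int]:
    """Hypothetical stand-in model: cut halfway between the two class means."""
    pos = [x for x, t in zip(X_train, y_train) if t == 1]
    neg = [x for x, t in zip(X_train, y_train) if t == 0]
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return [1 if x >= cut else 0 for x in X_test]


# Toy, perfectly separable data (NOT the glioma dataset).
labels = [0, 1] * 10
feature = [3.0 if label == 1 else 1.0 for label in labels]
print(cross_validated_metrics(feature, labels, threshold_fit_predict, k=10))
```

Pooling the held-out predictions before scoring is one common convention; averaging per-fold scores is another, and the two can differ slightly when folds are small or imbalanced.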

View on Kaggle | Read the Write-up

Diabetes EDA & Predictive Modeling

A Kaggle dataset with variables such as BMI, glucose, and HbA1c level was cleaned and modeled using Linear Regression. Because the data source was unverified, I treated this as a fast-paced modeling experiment.

View on Kaggle

Heart Disease EDA & Predictive Modeling

Based on reprocessed UCI Cleveland data, this project compares Logistic Regression, Linear Regression, Decision Tree, and Random Forest models. Several variants of the dataset were explored to compare predictive accuracy.
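A comparison of this kind might look roughly like the sketch below, which scores the four model families with 5-fold cross-validated accuracy using scikit-learn. Everything specific here is an assumption rather than the project's actual setup: the data is synthetic (generated by `make_classification`, not the Cleveland records), the hyperparameters are arbitrary, and thresholding the Linear Regression output at 0.5 is just one plausible way to turn a regressor into a classifier.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_predict, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): 300 samples, 13 numeric features.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)

# Classifiers that can be scored directly on accuracy.
classifiers = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Decision Tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in classifiers.items()
}

# Linear Regression predicts a continuous value, so its held-out predictions
# are thresholded at 0.5 (assumption) before computing accuracy.
linear_out = cross_val_predict(LinearRegression(), X, y, cv=5)
linear_labels = (linear_out >= 0.5).astype(int)
scores["Linear Regression (thresholded)"] = accuracy_score(y, linear_labels)

# Report models from highest to lowest cross-validated accuracy.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

Running the same loop over each dataset variant would give a side-by-side table of accuracies per model and variant.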

Original Dataset | View on Kaggle | Download PDF