A continuation of STT 2860 with an emphasis on statistical modeling and reproducible reporting using professional tools. Hypothesis testing will be introduced via resampling, and estimation will be introduced via bootstrapping. Cross-validation will be used to evaluate and select models in a way that accounts for the bias-variance trade-off. Supervised learning techniques will include linear regression, regression trees, classification trees, and random forests. Unsupervised learning techniques will include hierarchical clustering, k-means, and, if time permits, an introduction to principal component analysis.
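As a rough illustration of the resampling ideas named in the description, the sketch below computes a percentile bootstrap interval for a mean and a two-sided permutation p-value for a difference in means. It is not taken from the course materials: it uses only the Python standard library, the function names `bootstrap_ci` and `permutation_test` are hypothetical, and the two small samples are made-up numbers chosen only to make the example run.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.mean, n_boot=2000, alpha=0.05, seed=0):
    """Approximate percentile bootstrap interval for stat(sample)."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and recompute the statistic each time.
    boots = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_boot))
    # Read off the empirical alpha/2 and 1 - alpha/2 quantiles.
    lower = boots[int((alpha / 2) * n_boot)]
    upper = boots[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def permutation_test(x, y, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation p-value for a difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(x) - statistics.mean(y))
    pooled = list(x) + list(y)
    extreme = 0
    for _ in range(n_perm):
        # Shuffle the group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:len(x)]) - statistics.mean(pooled[len(x):]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_perm + 1)


# Made-up illustrative measurements (assumption, not course data).
group_a = [12.1, 14.3, 15.2, 13.8, 16.0, 14.9, 13.3, 15.7]
group_b = [18.2, 19.5, 17.8, 20.1, 18.9, 19.2, 17.5, 20.4]

ci_low, ci_high = bootstrap_ci(group_a)
p_value = permutation_test(group_a, group_b)
print(f"95% bootstrap interval for the mean of group_a: ({ci_low:.2f}, {ci_high:.2f})")
print(f"permutation p-value for the difference in means: {p_value:.4f}")
```

Both functions take a `seed` so that repeated runs give the same result, in keeping with the emphasis on reproducible reporting.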
The labs are a series of assignments that combine various modeling techniques to predict body fat.
The bookdown documents reproduce various DataCamp assignments.
Multiple and Logistic Regression