Install: claude install-skill jaechang-hits/SciAgent-Skills
# statsmodels
## Overview
Statsmodels provides classical statistical modeling with rigorous inference for Python. It covers linear models, generalized linear models, discrete choice, time series, and comprehensive diagnostics. Unlike scikit-learn (prediction-focused), statsmodels emphasizes coefficient interpretation, p-values, confidence intervals, and model diagnostics.
## When to Use
- Fitting linear regression (OLS, WLS, GLS) with detailed coefficient tables and diagnostics
- Running logistic regression with odds ratios and marginal effects for clinical/epidemiological studies
- Analyzing count data with Poisson or negative binomial regression
- Time series forecasting with ARIMA, SARIMAX, or exponential smoothing
- Performing ANOVA, t-tests, or non-parametric tests with proper corrections
- Testing model assumptions (heteroskedasticity, autocorrelation, normality of residuals)
- Model comparison using AIC/BIC or likelihood ratio tests
- Using R-style formula interface (`y ~ x1 + x2 + C(group)`) for intuitive model specification
- For prediction-focused ML with cross-validation and hyperparameter tuning, use `scikit-learn` instead
- For Bayesian modeling with posterior inference, use `pymc` instead
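
As a rough illustration of the inference-oriented workflow described above, the following sketch fits an OLS model through the formula interface and reads off coefficients, p-values, confidence intervals, information criteria, and two residual diagnostics. The data are synthetic, and the column names (`x1`, `x2`, `group`, `y`) and true coefficients are assumptions made up for this example; they do not come from any real dataset.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.diagnostic import het_breuschpagan
from statsmodels.stats.stattools import durbin_watson

# Synthetic data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "x1": rng.normal(size=n),
    "x2": rng.normal(size=n),
    "group": rng.choice(["a", "b", "c"], size=n),
})
# Assumed data-generating process: y depends on x1 and x2, not on group
df["y"] = 1.0 + 2.0 * df["x1"] - 0.5 * df["x2"] + rng.normal(scale=0.5, size=n)

# R-style formula; C(group) treats the column as categorical
results = smf.ols("y ~ x1 + x2 + C(group)", data=df).fit()

# Full coefficient table with standard errors, t-statistics, and fit statistics
print(results.summary())

# Collect the main inferential quantities in one DataFrame
ci = results.conf_int()  # columns 0 and 1 hold the lower and upper 95% bounds
coef_table = pd.DataFrame({
    "coef": results.params,
    "p_value": results.pvalues,
    "ci_lower": ci[0],
    "ci_upper": ci[1],
})
print(coef_table)

# Information criteria for comparing candidate models
print("AIC:", results.aic, "BIC:", results.bic)

# Residual diagnostics: Breusch-Pagan (heteroskedasticity) and
# Durbin-Watson (first-order autocorrelation)
bp_lm, bp_lm_pvalue, bp_f, bp_f_pvalue = het_breuschpagan(
    results.resid, results.model.exog
)
dw = durbin_watson(results.resid)
print("Breusch-Pagan LM p-value:", bp_lm_pvalue)
print("Durbin-Watson statistic:", dw)
```

With real data, only the DataFrame and the formula string would need to change; the same `results` attributes apply to other formula-based models such as `smf.logit` or `smf.poisson`, although the appropriate diagnostics differ by model family.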
## Prerequisites
- **Python packages**: `statsmodels`, `numpy`, `pandas`, `scipy`
- **Optional**: `matplotlib` (for diagnostic plots), `patsy` (for the formula API; typically installed alongside statsmodels as a dependency rather than bundled with it)
- **Data**: Tabular data as pandas DataFrames or NumPy arrays
```bash
pip install s