
statsmodels-statistical-modeling

Python statistical modeling: regression (OLS, WLS, GLM), discrete (Logit, Poisson, NegBin), time series (ARIMA, SARIMAX, VAR), with rigorous inference, diagnostics, and hypothesis tests. Use scikit-learn for ML; statistical-analysis for test choice.
jaechang-hits/SciAgent-Skills · ★ 183 · AI & Automation · score 81
Install: claude install-skill jaechang-hits/SciAgent-Skills
# statsmodels

## Overview

Statsmodels provides classical statistical modeling with rigorous inference for Python. It covers linear models, generalized linear models, discrete choice, time series, and comprehensive diagnostics. Unlike scikit-learn (prediction-focused), statsmodels emphasizes coefficient interpretation, p-values, confidence intervals, and model diagnostics.

## When to Use

- Fitting linear regression (OLS, WLS, GLS) with detailed coefficient tables and diagnostics
- Running logistic regression with odds ratios and marginal effects for clinical/epidemiological studies
- Analyzing count data with Poisson or negative binomial regression
- Time series forecasting with ARIMA, SARIMAX, or exponential smoothing
- Performing ANOVA, t-tests, or non-parametric tests with proper corrections
- Testing model assumptions (heteroskedasticity, autocorrelation, normality of residuals)
- Model comparison using AIC/BIC or likelihood ratio tests
- Using R-style formula interface (`y ~ x1 + x2 + C(group)`) for intuitive model specification
- For prediction-focused ML with cross-validation and hyperparameter tuning, use `scikit-learn` instead
- For Bayesian modeling with posterior inference, use `pymc` instead

## Prerequisites

- **Python packages**: `statsmodels`, `numpy`, `pandas`, `scipy`
- **Optional**: `matplotlib` (for diagnostic plots), `patsy` (for formula API, included with statsmodels)
- **Data**: Tabular data as pandas DataFrames or NumPy arrays

```bash
pip install s