Overview
This page provides two practice examples showing how StatWiseAI can support research planning without requiring users to upload raw participant-level data. These examples focus on the Health and Retirement Study (HRS) and the National Health and Nutrition Examination Survey (NHANES).
Instead, users should rely on public documentation, codebooks, variable descriptions, analytic guidelines, statistical outputs, simulated data, or approved summary information.
Use Case 1: HRS longitudinal analysis planning
Background
The Health and Retirement Study (HRS) is a longitudinal panel study that surveys a representative sample of approximately 20,000 people over age 50 in the United States. HRS provides multidisciplinary data on aging and includes public documentation such as questionnaires, codebooks, and data descriptions.
This HRS use case is designed to help StatWiseAI users learn about:
- Longitudinal data.
- Repeated measures.
- Aging-related outcomes.
- Time-varying variables.
- Attrition and missing data.
- Cross-wave harmonization.
- Documentation review.
- Weights and survey design considerations.
Example research question
Are baseline depressive symptoms associated with later functional limitations among older adults?
Example prompt
I am using public documentation from the Health and Retirement Study to plan a longitudinal analysis. I want to examine whether baseline depressive symptoms are associated with later functional limitations among older adults. I am not uploading participant-level data. Help me identify relevant types of variables, possible longitudinal modeling strategies, missing data and attrition concerns, and assumptions I should check before analysis.
What a useful StatWiseAI response should include
A useful response should help the user think through:
- Which HRS documentation to review.
- How depressive symptoms may be measured.
- How functional limitations may be measured.
- Which waves or years may be relevant.
- Whether the analysis is cross-sectional or longitudinal.
- How baseline and follow-up should be defined.
- Whether repeated measures should be modeled.
- Whether attrition may bias estimates.
- Whether weights are needed.
- What covariates may be important.
- Whether the analysis is descriptive, associational, predictive, or causal.
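The attrition question can be rehearsed on simulated records before any real data are involved. The following is a minimal sketch under stated assumptions: the wave labels, the symptom score, and the dropout mechanism are all invented for illustration and are not HRS variables or HRS findings.

```python
import random
from statistics import mean

random.seed(42)

WAVES = [1, 2, 3, 4]  # hypothetical wave labels, not actual HRS waves


def simulate_panel(n_people=2000):
    """Return {person_id: {"baseline": score, "waves": [waves observed]}}."""
    panel = {}
    for pid in range(n_people):
        baseline = random.gauss(3.0, 1.5)  # invented baseline symptom score
        observed = [1]  # everyone is observed at the first wave
        for wave in WAVES[1:]:
            # Assumed mechanism: higher baseline scores raise dropout risk.
            p_drop = min(0.5, max(0.02, 0.05 + 0.04 * baseline))
            if random.random() < p_drop:
                break  # monotone attrition: once lost, never observed again
            observed.append(wave)
        panel[pid] = {"baseline": baseline, "waves": observed}
    return panel


def retention_by_wave(panel):
    """Share of the baseline sample still observed at each wave."""
    n = len(panel)
    return {w: sum(w in p["waves"] for p in panel.values()) / n for w in WAVES}


def baseline_by_completion(panel):
    """Mean baseline score for completers versus people lost before the last wave."""
    last = WAVES[-1]
    completers = [p["baseline"] for p in panel.values() if last in p["waves"]]
    dropouts = [p["baseline"] for p in panel.values() if last not in p["waves"]]
    return mean(completers), mean(dropouts)


panel = simulate_panel()
retention = retention_by_wave(panel)
completer_mean, dropout_mean = baseline_by_completion(panel)

for wave, share in retention.items():
    print(f"wave {wave}: {share:.1%} retained")
print(f"baseline mean, completers: {completer_mean:.2f}")
print(f"baseline mean, dropouts:   {dropout_mean:.2f}")
```

If dropouts differ from completers at baseline, as they do here by construction, a complete-case analysis may not represent the original sample, which is the kind of concern a useful response should raise.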
Follow-up prompts
What HRS documentation should I review before choosing variables?
What are possible modeling approaches for repeated measures in a longitudinal cohort?
What missing data and attrition issues should I consider?
What assumptions would need to hold before interpreting the association as causal?
What would a reviewer ask about this analysis plan?
Common mistakes to avoid
- Treating repeated observations as independent.
- Ignoring attrition.
- Mixing waves without checking whether variables are measured consistently.
- Making causal claims without a causal design.
- Ignoring the distinction between baseline covariates and time-varying covariates.
- Selecting variables without reviewing codebooks and questionnaires.
- Assuming that a harmonized variable is always appropriate without checking its construction.
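The first mistake in the list, treating repeated observations as independent, can be illustrated with simulated data. This sketch assumes a balanced panel with invented sizes and variances; it compares a naive standard error that counts every person-wave row as independent with one based on person-level means. It is a teaching illustration, not a recommended HRS analysis method.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(7)

N_PEOPLE, N_WAVES = 300, 4  # invented panel dimensions


def simulate_scores():
    """Return one list of repeated scores per simulated person."""
    people = []
    for _ in range(N_PEOPLE):
        # Stable between-person difference shared by all of a person's waves.
        person_level = random.gauss(0.0, 2.0)
        scores = [10.0 + person_level + random.gauss(0.0, 1.0) for _ in range(N_WAVES)]
        people.append(scores)
    return people


people = simulate_scores()
all_obs = [y for person in people for y in person]

# Naive SE: pretends all person-wave rows are independent observations.
naive_se = stdev(all_obs) / sqrt(len(all_obs))

# Cluster-aware SE: in a balanced panel the overall mean equals the mean of
# person means, so its uncertainty comes from the spread of those means.
person_means = [mean(person) for person in people]
cluster_se = stdev(person_means) / sqrt(len(person_means))

print(f"overall mean:     {mean(all_obs):.3f}")
print(f"naive SE:         {naive_se:.4f}")
print(f"cluster-aware SE: {cluster_se:.4f}")
```

When scores within a person are positively correlated, the naive standard error is too small. Mixed-effects models or cluster-robust standard errors are common ways to handle this in a real analysis.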
Use Case 2: NHANES survey-weighted analysis planning
Background
The National Health and Nutrition Examination Survey (NHANES) is a national survey that combines interviews, health examinations, and laboratory tests. CDC provides NHANES questionnaires, datasets, documentation, variable search tools, and data analysis tutorials.
This NHANES use case is designed to help StatWiseAI users learn about:
- Complex survey design.
- Survey weights.
- Strata and primary sampling units.
- Combining survey cycles.
- Public documentation review.
- Questionnaire, examination, laboratory, and demographic files.
- Population-level inference.
- Subsample weights.
- Common survey analysis mistakes.
Example research question
Is food insecurity associated with depressive symptoms among adults?
Example prompt
I am using public NHANES documentation to plan an analysis of the association between food insecurity and depressive symptoms among adults. I am not uploading raw participant-level data. Help me identify relevant documentation, survey design features, possible variables, covariates, and common analysis mistakes to avoid. Please include reminders about survey weights, strata, primary sampling units, and combining survey cycles.
What a useful StatWiseAI response should include
A useful response should help the user think through:
- Which NHANES cycles include the needed variables.
- Whether food insecurity and depressive symptoms are available in the same cycles.
- Which files contain the relevant variables.
- Which demographic covariates may be needed.
- Whether the analysis should use interview, examination, laboratory, or subsample weights.
- Whether cycles need to be combined.
- How strata and PSU variables should be specified.
- Whether the analytic sample is restricted by age or eligibility.
- Whether missing data may affect results.
- How to avoid overinterpreting cross-sectional associations.
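The point about combining cycles can be made concrete with a small sketch. It assumes the commonly described approach of dividing each 2-year weight by the number of equal-length cycles combined; some cycles have special rules, so the official analytic guidance should be checked first. The cycle labels, the `wt_2yr` field, and the weight values are placeholders, not NHANES variable names or data.

```python
def rescale_for_combined_cycles(two_year_weight, n_cycles):
    """Divide a 2-year weight by the number of combined cycles (assumed rule)."""
    if n_cycles < 1:
        raise ValueError("n_cycles must be at least 1")
    return two_year_weight / n_cycles


# Placeholder records: two hypothetical cycles, each weighted to the same
# hypothetical population total of 20,000.
records = [
    {"cycle": "cycle_A", "wt_2yr": 12000.0},
    {"cycle": "cycle_A", "wt_2yr": 8000.0},
    {"cycle": "cycle_B", "wt_2yr": 15000.0},
    {"cycle": "cycle_B", "wt_2yr": 5000.0},
]

n_cycles = len({r["cycle"] for r in records})
for r in records:
    r["wt_combined"] = rescale_for_combined_cycles(r["wt_2yr"], n_cycles)

original_total = sum(r["wt_2yr"] for r in records)
combined_total = sum(r["wt_combined"] for r in records)

print(f"cycles combined: {n_cycles}")
print(f"sum of 2-year weights (double counts the population): {original_total:,.0f}")
print(f"sum of rescaled weights: {combined_total:,.0f}")
```

Each cycle's weights already sum to roughly the full population, so stacking cycles without rescaling would count that population once per cycle.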
Why survey design matters
NHANES weights are created to account for the complex survey design, including oversampling, survey nonresponse, and post-stratification. CDC notes that weights are needed to calculate estimates representative of the U.S. civilian noninstitutionalized population. NHANES analysis also requires attention to strata and PSU variables for variance estimation.
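To see why weights matter, consider a simulated population in which one group is deliberately oversampled. The sketch below is an illustration under invented assumptions: the group names, population sizes, and outcome means are made up, and the weights reflect only unequal selection probabilities, whereas real NHANES weights also include nonresponse and post-stratification adjustments.

```python
import random

random.seed(11)

# Hypothetical design: the smaller group is sampled at a much higher rate.
GROUPS = {
    "general": {"pop_size": 8000, "sample_size": 500, "true_mean": 2.0},
    "oversampled": {"pop_size": 2000, "sample_size": 500, "true_mean": 6.0},
}


def draw_sample():
    """Draw simulated outcomes; weight = people represented by each respondent."""
    rows = []
    for name, g in GROUPS.items():
        weight = g["pop_size"] / g["sample_size"]
        for _ in range(g["sample_size"]):
            rows.append({"group": name, "y": random.gauss(g["true_mean"], 1.0), "weight": weight})
    return rows


def unweighted_mean(rows):
    return sum(r["y"] for r in rows) / len(rows)


def weighted_mean(rows):
    total_weight = sum(r["weight"] for r in rows)
    return sum(r["weight"] * r["y"] for r in rows) / total_weight


rows = draw_sample()
total_pop = sum(g["pop_size"] for g in GROUPS.values())
population_mean = sum(g["pop_size"] * g["true_mean"] for g in GROUPS.values()) / total_pop

print(f"population mean: {population_mean:.2f}")
print(f"unweighted mean: {unweighted_mean(rows):.2f}")
print(f"weighted mean:   {weighted_mean(rows):.2f}")
```

The unweighted mean is pulled toward the oversampled group, while the weighted mean should land close to the population value.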
Follow-up prompts
Which NHANES documentation should I review before choosing variables?
What survey design features must be accounted for?
How do I decide which NHANES weight to use?
What should I consider if I combine multiple cycles?
What are common mistakes when analyzing NHANES data?
Please draft code in R, Stata, SAS, SPSS, or Python using placeholder variable names only.
Common mistakes to avoid
- Ignoring survey weights.
- Ignoring strata and PSU variables.
- Using the wrong weight.
- Combining cycles without adjusting weights appropriately.
- Using variables from different files without checking eligibility.
- Treating NHANES as a simple random sample.
- Making causal claims from cross-sectional data.
- Ignoring small subgroup sample sizes.
- Failing to review official analytic guidance.
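Several of these mistakes come down to variance estimation. As a hedged sketch, the code below simulates clustered data and compares a naive simple-random-sample standard error with a Taylor-linearized standard error for a weighted mean, treating PSUs as sampled with replacement within strata. The strata, PSU, and weight values are simulated placeholders, and a real analysis would normally use established survey software rather than hand-written code.

```python
import random
from collections import defaultdict
from math import sqrt
from statistics import stdev

random.seed(3)


def simulate_rows(n_strata=10, psus_per_stratum=2, n_per_psu=30):
    """Simulated respondents with placeholder stratum, PSU, and weight fields."""
    rows = []
    for stratum in range(n_strata):
        for psu in range(psus_per_stratum):
            psu_shift = random.gauss(0.0, 1.0)  # shared by everyone in the PSU
            for _ in range(n_per_psu):
                rows.append({
                    "stratum": stratum,
                    "psu": psu,
                    "weight": random.uniform(500, 3000),
                    "y": 5.0 + psu_shift + random.gauss(0.0, 1.0),
                })
    return rows


def weighted_mean(rows):
    total_weight = sum(r["weight"] for r in rows)
    return sum(r["weight"] * r["y"] for r in rows) / total_weight


def linearized_se(rows):
    """Taylor-linearized SE of the weighted mean (with-replacement PSUs)."""
    total_weight = sum(r["weight"] for r in rows)
    estimate = weighted_mean(rows)
    # Sum the linearized values within each PSU, grouped by stratum.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for r in rows:
        z = r["weight"] * (r["y"] - estimate) / total_weight
        psu_totals[r["stratum"]][r["psu"]] += z
    variance = 0.0
    for totals in psu_totals.values():
        values = list(totals.values())
        n_h = len(values)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        center = sum(values) / n_h
        variance += n_h / (n_h - 1) * sum((v - center) ** 2 for v in values)
    return sqrt(variance)


def naive_se(rows):
    """SE that wrongly treats the data as a simple random sample."""
    ys = [r["y"] for r in rows]
    return stdev(ys) / sqrt(len(ys))


rows = simulate_rows()
design_se = linearized_se(rows)
srs_se = naive_se(rows)

print(f"weighted mean:   {weighted_mean(rows):.3f}")
print(f"naive SRS SE:    {srs_se:.4f}")
print(f"design-based SE: {design_se:.4f}")
```

Because respondents in the same simulated PSU share a common shift, the design-based standard error is larger than the naive one, which is why treating clustered survey data as a simple random sample can overstate precision.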
Practice activity
Choose either HRS or NHANES and write a prompt that includes:
- Research question.
- Dataset documentation source.
- Outcome.
- Main predictor or exposure.
- Covariates.
- Study design features.
- Missing data concerns.
- Preferred output format.
- Reminder not to use raw participant-level data.
Then ask StatWiseAI to identify assumptions, limitations, and questions that should be answered before analysis.
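If it helps to draft prompts consistently, the checklist above can be turned into a small template. The sketch below is one possible layout, not a StatWiseAI feature: the `AnalysisPlanPrompt` class and its field names are hypothetical, and the example values are taken from the NHANES use case on this page.

```python
from dataclasses import dataclass


@dataclass
class AnalysisPlanPrompt:
    """Hypothetical container for the checklist items in the practice activity."""
    research_question: str
    documentation_source: str
    outcome: str
    exposure: str
    covariates: list
    design_features: list
    missing_data_concerns: str
    output_format: str

    def render(self) -> str:
        """Assemble the fields into a single plain-text prompt."""
        lines = [
            f"Research question: {self.research_question}",
            f"Documentation source: {self.documentation_source}",
            f"Outcome: {self.outcome}",
            f"Main predictor or exposure: {self.exposure}",
            f"Covariates: {', '.join(self.covariates)}",
            f"Study design features: {', '.join(self.design_features)}",
            f"Missing data concerns: {self.missing_data_concerns}",
            f"Preferred output format: {self.output_format}",
            "I am not uploading raw participant-level data.",
            "Please identify assumptions, limitations, and questions that "
            "should be answered before analysis.",
        ]
        return "\n".join(lines)


prompt = AnalysisPlanPrompt(
    research_question="Is food insecurity associated with depressive symptoms among adults?",
    documentation_source="Public NHANES documentation",
    outcome="Depressive symptoms",
    exposure="Food insecurity",
    covariates=["age", "sex", "income"],  # placeholder covariates
    design_features=["survey weights", "strata", "primary sampling units"],
    missing_data_concerns="Item nonresponse on the outcome and exposure",
    output_format="Bulleted analysis plan",
).render()

print(prompt)
```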