Templates and Checklists

Overview

This page provides reusable prompt templates and checklists for working with StatWiseAI. Users may copy and adapt them for their own projects.

Do not enter proprietary, sensitive, private, identifiable, HIPAA-regulated, FERPA-regulated, or otherwise restricted information, including protected health information (PHI). Use only public documentation, metadata, codebooks, analytic guidance, statistical output, analytic code, simulated data, or approved summary information.

General analysis-planning prompt

I am using StatWiseAI to support analysis planning for [dataset or documentation source].
I am not uploading participant-level, proprietary, sensitive, PHI, HIPAA-regulated, or FERPA-regulated data.
My research question is: [insert question].
The study design is: [cross-sectional / longitudinal / survey / cohort / clinical / administrative / other].
The outcome is: [describe outcome and coding if known].
The main predictor or exposure is: [describe predictor or exposure].
Potential covariates include: [list covariates].
Important data features include: [survey weights / strata / PSU / repeated measures / clustering / missing data / longitudinal waves / linked files].
Please suggest possible analysis strategies, identify assumptions and limitations, and list questions I should answer before proceeding.
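
For users who keep their prompts in scripts, the template above can also be filled programmatically. The following is a minimal sketch, not a StatWiseAI feature: the field names (source, question, design, and so on) and the example values are hypothetical and chosen only to mirror the bracketed slots.

```python
from string import Template

# Field names such as `source` and `design` are illustrative assumptions;
# they simply stand in for the bracketed slots of the planning prompt.
PLANNING_PROMPT = Template(
    "I am using StatWiseAI to support analysis planning for $source.\n"
    "I am not uploading participant-level, proprietary, sensitive, PHI, "
    "HIPAA-regulated, or FERPA-regulated data.\n"
    "My research question is: $question.\n"
    "The study design is: $design.\n"
    "The outcome is: $outcome.\n"
    "The main predictor or exposure is: $exposure.\n"
    "Potential covariates include: $covariates.\n"
    "Important data features include: $features.\n"
    "Please suggest possible analysis strategies, identify assumptions and "
    "limitations, and list questions I should answer before proceeding."
)


def build_planning_prompt(**fields):
    """Fill every slot; Template.substitute raises KeyError if one is missing."""
    return PLANNING_PROMPT.substitute(**fields)


# Illustrative values only; replace them with details of your own project.
example = build_planning_prompt(
    source="public NHANES documentation",
    question="how physical activity relates to blood pressure in adults",
    design="cross-sectional survey",
    outcome="systolic blood pressure, continuous",
    exposure="self-reported weekly physical activity",
    covariates="age, sex, education",
    features="survey weights, strata, PSU, missing data",
)
print(example)
```

Using substitute rather than safe_substitute is deliberate in this sketch: a forgotten slot stops the script instead of sending a prompt with an unfilled placeholder.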

Dataset documentation prompt

I am reviewing public documentation for [dataset name].
I want to study [research topic].
I am not uploading raw data.
Please help me identify which types of documentation I should review, including codebooks, questionnaires, data dictionaries, analytic guidelines, and variable search tools.
Please also identify possible design features, missing data concerns, and documentation issues that could affect the analysis.

Model comparison prompt

I am considering several analysis options for [research question].
The outcome is [continuous / binary / count / time-to-event / repeated measure / other].
The main predictor is [description].
Important design features include [survey design / repeated measures / clustering / missing data / other].
Please compare 2–3 possible modeling approaches. For each approach, explain when it would be appropriate, what assumptions it requires, what limitations it has, and what I should verify before using it.

Code request prompt

Please draft code in [R / Python / SAS / Stata / SPSS].
Use placeholder variable names only.
The goal is to [describe analysis].
The outcome is [OUTCOME].
The main predictor is [EXPOSURE].
Covariates include [COVARIATE_1, COVARIATE_2, COVARIATE_3].
Important design variables include [WEIGHT, STRATA, PSU, CLUSTER, ID, TIME] as applicable.
Please include comments explaining what each placeholder means.
Please also include a checklist of what I should verify before running the code.
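
As an illustration of what a placeholder-only response to this prompt might look like, here is a minimal Python sketch. It assumes simulated data and a deliberately simple goal (a weighted mean of OUTCOME within each EXPOSURE group). It produces point estimates only and does not use STRATA or PSU, so it is not a substitute for design-based variance estimation.

```python
import random

random.seed(0)  # simulated records only; no real participants

# Placeholder names: OUTCOME = analysis outcome, EXPOSURE = binary predictor,
# WEIGHT = survey weight. Replace each with the documented variable name.
records = [
    {
        "OUTCOME": random.gauss(50, 10),
        "EXPOSURE": i % 2,  # alternate 0/1 so both groups are populated
        "WEIGHT": random.uniform(0.5, 3.0),
    }
    for i in range(200)
]


def weighted_mean(rows, value_key, weight_key):
    """Return sum(w * y) / sum(w) over rows; raise if the weights sum to zero."""
    total_weight = sum(r[weight_key] for r in rows)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(r[value_key] * r[weight_key] for r in rows) / total_weight


# Weighted mean of OUTCOME within each EXPOSURE level (point estimates only).
means_by_exposure = {
    level: weighted_mean(
        [r for r in records if r["EXPOSURE"] == level], "OUTCOME", "WEIGHT"
    )
    for level in (0, 1)
}
print(means_by_exposure)
```

Before adapting a sketch like this, verify that each placeholder maps to the intended variable and that the weight is the one the dataset documentation recommends for the analysis.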

AI-output review prompt

Review your previous response critically.
What assumptions did you make?
What information was missing from my prompt?
What could make your recommendation inappropriate?
What alternative approaches should I consider?
What should I verify in the dataset documentation before proceeding?
What should I ask a statistician, PI, or Co-I to review?

Reviewer-style critique prompt

Act as a statistical reviewer.
Review this analysis plan for potential problems related to design, measurement, missing data, model choice, assumptions, interpretation, reproducibility, and ethics.
Identify the most important concerns and suggest revisions.

HRS prompt template

I am using public documentation from the Health and Retirement Study.
I am not uploading participant-level data.
My research question is: [insert question].
I want to examine [outcome] in relation to [predictor/exposure] among [population].
Please help me identify relevant types of HRS documentation, possible variables, longitudinal design considerations, missing data and attrition concerns, and modeling options.
Please also identify assumptions and limitations I should document.

NHANES prompt template

I am using public NHANES documentation.
I am not uploading raw participant-level data.
My research question is: [insert question].
The outcome is [outcome].
The main predictor or exposure is [exposure].
Potential covariates include [covariates].
Please help me identify relevant NHANES files, survey design features, appropriate weights, strata and PSU considerations, cycle-combination issues, and common analysis mistakes to avoid.
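
One cycle-combination issue this prompt often surfaces is weight rescaling. The sketch below assumes equal-length two-year cycles for which the NHANES analytic guidelines support dividing the two-year weight by the number of pooled cycles; some periods (for example 1999-2002 and the 2017-March 2020 pre-pandemic files) follow different rules, so confirm the approach in the guidelines for the cycles you use. The rows are simulated and the column names are placeholders.

```python
def combined_cycle_weight(two_year_weight, n_cycles):
    """Rescale a two-year weight for an analysis that pools n_cycles cycles."""
    if n_cycles < 1:
        raise ValueError("n_cycles must be at least 1")
    return two_year_weight / n_cycles


# Simulated rows only. CYCLE, WEIGHT_2YR, STRATA, and PSU are placeholder names;
# in the public files the MEC exam weight is WTMEC2YR, for example.
rows = [
    {"CYCLE": "A", "WEIGHT_2YR": 12000.0, "STRATA": 1, "PSU": 1},
    {"CYCLE": "A", "WEIGHT_2YR": 8000.0, "STRATA": 1, "PSU": 2},
    {"CYCLE": "B", "WEIGHT_2YR": 10000.0, "STRATA": 2, "PSU": 1},
]

# Count the distinct cycles being pooled, then rescale every row's weight.
n_cycles = len({row["CYCLE"] for row in rows})
for row in rows:
    row["WEIGHT_COMBINED"] = combined_cycle_weight(row["WEIGHT_2YR"], n_cycles)

print([row["WEIGHT_COMBINED"] for row in rows])
```

The rescaled weight addresses only the pooling step. Choosing the right weight for the components analyzed (interview, exam, or subsample) remains a separate decision to verify in the documentation.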

Privacy checklist

Before using StatWiseAI, confirm:

I am not entering proprietary data.
I am not entering participant-level sensitive data.
I am not entering PHI.
I am not entering HIPAA-regulated information.
I am not entering FERPA-regulated information.
I am not entering identifiable information.
I am not entering restricted data.
I am using public documentation, simulated data, statistical output, code, or approved summary information.
I understand that prompt history and outputs are stored automatically and are accessible to the research team.

Analysis-planning checklist

Before starting an analysis, confirm:

The research question is clear.
The dataset or documentation source is identified.
The outcome is defined.
The main predictor or exposure is defined.
Covariates are listed.
The study design is understood.
Survey weights, strata, PSUs, clusters, or repeated measures are identified when relevant.
Missing data issues are considered.
Eligibility criteria are documented.
The analysis goal is clear: description, association, prediction, or causal inference.
Assumptions and limitations are documented.

AI-generated code checklist

Before using AI-generated code, confirm:

The code uses the correct software language.
Placeholder variable names are replaced correctly.
The model matches the outcome type.
The code handles the study design appropriately.
Weights, strata, PSUs, clusters, or repeated measures are included when needed.
Missing data are handled intentionally.
The code runs without errors.
Output is reviewed carefully.
Results are not overinterpreted.
The final code is saved in an approved project location.
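
The placeholder item in the checklist above can be partly automated. The following is a rough sketch, assuming the placeholder names from the code request prompt on this page; the function name is hypothetical. It flags whole-word matches only and will produce false positives if a real variable happens to share a placeholder's name (for example, a dataset column actually called ID).

```python
import re

# Placeholder names taken from the code request prompt; adjust as needed.
PLACEHOLDERS = (
    "OUTCOME", "EXPOSURE", "COVARIATE_1", "COVARIATE_2", "COVARIATE_3",
    "WEIGHT", "STRATA", "PSU", "CLUSTER", "ID", "TIME",
)
PLACEHOLDER_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in PLACEHOLDERS) + r")\b"
)


def find_leftover_placeholders(code):
    """Return (line_number, placeholder) pairs for whole-word matches."""
    hits = []
    for line_number, line in enumerate(code.splitlines(), start=1):
        for match in PLACEHOLDER_PATTERN.finditer(line):
            hits.append((line_number, match.group(1)))
    return hits


# Illustrative snippet in which two placeholders were never replaced.
sample = "model = fit(OUTCOME ~ age + sex,\n            weights=WEIGHT)\n"
print(find_leftover_placeholders(sample))
```

A clean result from a check like this shows only that the listed names are gone, not that the replacements are the right variables, so the manual review in the checklist still applies.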

Reproducibility checklist

For each major AI-assisted decision, document:

Date of AI use.
Prompt or prompt summary.
AI-generated recommendation.
Human decision.
Rationale for the decision.
Verification steps.
Reviewer or collaborator consulted.
Final code location.
Final output location.
Remaining limitations or concerns.
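
One way to keep these items consistent across decisions is a small structured record. The sketch below is an assumption about how a team might store the checklist, not a StatWiseAI feature: the class name, field names, and example values are all hypothetical, and the bracketed paths are left as placeholders.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class AIDecisionRecord:
    """One entry per major AI-assisted decision; fields mirror the checklist."""
    date_of_ai_use: str
    prompt_summary: str
    ai_recommendation: str
    human_decision: str
    rationale: str
    verification_steps: list
    reviewer_consulted: str
    final_code_location: str
    final_output_location: str
    remaining_concerns: str


def to_json_line(record):
    """Serialize a record as one JSON line, suitable for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


# Illustrative values only; nothing here describes a real project.
record = AIDecisionRecord(
    date_of_ai_use="[YYYY-MM-DD]",
    prompt_summary="Asked for modeling options for a binary outcome.",
    ai_recommendation="Survey-weighted logistic regression.",
    human_decision="Adopted after review.",
    rationale="Matches the outcome type and the survey design.",
    verification_steps=["Checked weights in codebook", "Reran on simulated data"],
    reviewer_consulted="[statistician, PI, or Co-I]",
    final_code_location="[approved project location]",
    final_output_location="[approved project location]",
    remaining_concerns="Missing data approach still to be reviewed.",
)
print(to_json_line(record))
```

Appending one such line per decision to a log file in an approved project location gives a record that is easy to search and to share with reviewers.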

Final reminder

StatWiseAI is a research support tool. It can help users think through analytic decisions, but users remain responsible for protecting data, verifying outputs, documenting decisions, and drawing scientifically appropriate conclusions.
