Overview
A “prompt” is the instruction or question a user gives to an AI system. In research, prompting is not just a technical skill. It is part of analytic reasoning.
A strong prompt tells StatWiseAI the research question, dataset documentation, variables, study design, assumptions, and type of output needed. A weak prompt often produces a generic answer. A strong prompt produces a response that is more specific, easier to review, and easier to document.
A weak prompt
“What analysis should I run?”
This prompt is too vague. It does not describe the research question, outcome variable, predictor variables, dataset, study design, or analytic goal.
A stronger prompt
“I am using public documentation from a longitudinal study to plan an analysis. I want to examine whether baseline depressive symptoms are associated with later functional limitations among older adults. I am not uploading participant-level data. Please help me identify relevant variables, possible longitudinal modeling strategies, missing data concerns, and assumptions I should check before analysis.”
This prompt is stronger because it gives StatWiseAI:
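- A research topic (depressive symptoms and functional limitations among older adults).
- A study design clue (a longitudinal study).
- A likely outcome (later functional limitations) and predictor (baseline depressive symptoms).
- A clear data privacy boundary (no participant-level data is uploaded).
- A specific request (variables, modeling strategies, and missing data concerns).
- A request for the assumptions to check before analysis.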
Recommended prompt structure
When asking StatWiseAI for help, include as many of the following as possible:
Research question
What are you trying to learn?
Example:
“I want to examine whether food insecurity is associated with depressive symptoms among adults.”
Dataset or documentation source
What dataset, documentation, codebook, or output are you using?
Example:
“I am using public NHANES documentation and variable descriptions.”
Study design
What is the general design?
Examples:
- Cross-sectional survey
- Longitudinal cohort
- Clinical trial
- Administrative dataset
- EHR dataset
- Public-use survey
- Repeated-measures dataset
Outcome variable
What is the outcome?
Example:
“The outcome is depressive symptoms, measured using a questionnaire scale.”
Predictor or exposure
What is the main predictor, exposure, or grouping variable?
Example:
“The main exposure is food insecurity.”
Covariates
What other variables may need to be considered?
Example:
“Potential covariates include age, sex, race/ethnicity, education, income, insurance status, and chronic conditions.”
Data structure
What features of the data matter?
Examples:
- Survey weights
- Strata
- Primary sampling units
- Repeated measures
- Clustering
- Multiple time points
- Missing data
- Linked files
- Restricted variables
- Small subgroup sample sizes
Type of help needed
Be specific about the task.
Examples:
- Help me identify relevant documentation.
- Suggest possible statistical models.
- Compare analytic approaches.
- Draft code using placeholder variable names.
- Review this output.
- Identify assumptions and limitations.
- Suggest sensitivity analyses.
- Create a reproducibility checklist.
Preferred output format
Tell StatWiseAI how to respond.
Examples:
- “Provide a checklist.”
- “Use a table.”
- “Compare 2–3 options.”
- “Write this for a beginner.”
- “Give me R code with comments.”
- “Do not write final conclusions.”
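One way to use the structure above is as a pre-flight checklist for a draft prompt. The following is a minimal sketch, assuming you keep your draft as a small record with one field per component. The names `PromptPlan` and `missing_components` are hypothetical and are not part of StatWiseAI; the code only inspects text you wrote and does not contact any service.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class PromptPlan:
    """Hypothetical record: one field per component in the recommended structure."""
    research_question: Optional[str] = None
    documentation_source: Optional[str] = None
    study_design: Optional[str] = None
    outcome: Optional[str] = None
    exposure: Optional[str] = None
    covariates: Optional[str] = None
    data_structure: Optional[str] = None
    help_needed: Optional[str] = None
    output_format: Optional[str] = None


def missing_components(plan: PromptPlan) -> list[str]:
    """Return the names of components that are unset or whitespace-only."""
    return [
        f.name
        for f in fields(plan)
        if not (getattr(plan, f.name) or "").strip()
    ]


# Draft built from the examples above; several components are still empty.
plan = PromptPlan(
    research_question="Is food insecurity associated with depressive symptoms among adults?",
    documentation_source="Public NHANES documentation and variable descriptions",
    outcome="Depressive symptoms, measured using a questionnaire scale",
    exposure="Food insecurity",
)

print(missing_components(plan))
```

A non-empty result does not mean the prompt is unusable, since the guidance is to include as many components as possible. It simply shows what StatWiseAI would otherwise have to guess.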
General prompt template
You may copy and adapt this template:
I am using StatWiseAI to support analysis planning for [dataset or documentation source].
My research question is: [insert research question].
I am not uploading participant-level, proprietary, sensitive, PHI, HIPAA-regulated, or FERPA-regulated data.
The study design is: [cross-sectional / longitudinal / survey / cohort / clinical / administrative / other].
The outcome is: [name or description, type, coding if known].
The main predictor or exposure is: [name or description].
Important covariates may include: [list].
Important data features include: [survey weights, repeated measures, clustering, missing data, time-to-event structure, linked files, etc.].
I need help with: [choosing a model / identifying variables / understanding documentation / generating code / reviewing output / planning sensitivity analyses].
Please provide: [checklist / table / step-by-step plan / code / explanation].
Please also identify assumptions, limitations, and questions I should answer before proceeding.
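If you reuse the template often, filling it programmatically can help ensure no bracketed placeholder is left in by accident. The sketch below is one possible approach, not a StatWiseAI feature: the field names in braces and the helpers `template_fields` and `build_prompt` are assumptions made for illustration, and the sample values are taken from the examples earlier on this page.

```python
from string import Formatter

# The general prompt template, with each bracketed slot replaced by a
# hypothetical named field.
TEMPLATE = (
    "I am using StatWiseAI to support analysis planning for {source}.\n"
    "My research question is: {research_question}.\n"
    "I am not uploading participant-level, proprietary, sensitive, PHI, "
    "HIPAA-regulated, or FERPA-regulated data.\n"
    "The study design is: {study_design}.\n"
    "The outcome is: {outcome}.\n"
    "The main predictor or exposure is: {exposure}.\n"
    "Important covariates may include: {covariates}.\n"
    "Important data features include: {data_features}.\n"
    "I need help with: {help_needed}.\n"
    "Please provide: {output_format}.\n"
    "Please also identify assumptions, limitations, and questions I should "
    "answer before proceeding."
)


def template_fields(template: str) -> list[str]:
    """List the named fields that appear in a format-style template."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


def build_prompt(values: dict[str, str]) -> str:
    """Fill the template, refusing to proceed if any field is blank."""
    missing = [
        name for name in template_fields(TEMPLATE)
        if not values.get(name, "").strip()
    ]
    if missing:
        raise ValueError("Fill in before sending: " + ", ".join(missing))
    return TEMPLATE.format(**values)


values = {
    "source": "public NHANES documentation and variable descriptions",
    "research_question": "whether food insecurity is associated with "
                         "depressive symptoms among adults",
    "study_design": "cross-sectional survey",
    "outcome": "depressive symptoms, measured using a questionnaire scale",
    "exposure": "food insecurity",
    "covariates": "age, sex, race/ethnicity, education, income, "
                  "insurance status, and chronic conditions",
    "data_features": "survey weights, strata, and primary sampling units",
    "help_needed": "choosing a model",
    "output_format": "a table comparing 2-3 options",
}

print(build_prompt(values))
```

Raising an error on a blank field is a deliberate choice here: a missing component is treated as something to resolve before sending, not something to paper over with a default.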
Follow-up prompts
A good AI interaction usually takes more than one prompt. After receiving a response, users should ask follow-up questions such as:
- What assumptions are you making?
- What information is missing from my prompt?
- What could make this recommendation inappropriate?
- What alternative approaches should I consider?
- What should I verify before using this recommendation?
- What would a statistical reviewer ask about this plan?
- What sensitivity analyses should I consider?
- How should I document this decision?
- Can you revise this using placeholder variable names only?
- Can you explain this in simpler language?
Prompting reminder
Do not enter proprietary data, sensitive data, participant-level data, PHI, HIPAA-regulated information, FERPA-regulated information, or other restricted information into StatWiseAI. Use public documentation, metadata, data dictionaries, codebooks, statistical outputs, analytic code, or simulated data for practice.
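For practice, simulated data with placeholder variable names can stand in for real records. The following is a minimal sketch using only the Python standard library. Every value is randomly generated, the names `exposure` and `outcome` are placeholders, and the group means of 5.0 and 7.0 are arbitrary choices for illustration, not estimates from any study.

```python
import random
import statistics


def simulate_rows(n: int, seed: int = 0) -> list[dict]:
    """Generate n made-up records; nothing here comes from a real dataset."""
    rng = random.Random(seed)  # fixed seed so the practice data is reproducible
    rows = []
    for row_id in range(n):
        exposure = rng.random() < 0.3  # placeholder binary exposure
        # Arbitrary illustrative means: 7.0 if exposed, 5.0 otherwise.
        outcome = rng.gauss(7.0 if exposure else 5.0, 3.0)
        rows.append({"row_id": row_id, "exposure": exposure, "outcome": outcome})
    return rows


def mean_outcome_by_exposure(rows: list[dict]) -> dict[bool, float]:
    """Average the placeholder outcome within each exposure group."""
    groups: dict[bool, list[float]] = {}
    for row in rows:
        groups.setdefault(row["exposure"], []).append(row["outcome"])
    return {level: statistics.mean(vals) for level, vals in groups.items()}


rows = simulate_rows(200, seed=0)
for level, mean in sorted(mean_outcome_by_exposure(rows).items()):
    print(f"exposure={level}: mean outcome = {mean:.2f}")
```

Code or output produced this way can be shared in a prompt without exposing restricted information, because no row corresponds to a real participant.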

