Guide to Documenting All Analysis Decisions
Documenting every analysis decision makes your research workflow transparent and easier to audit. Here's what to record and how:
1. Create a Decision Log/Analysis Journal
What to include:
- Date of each decision
- Who made the decision (if team research)
- What the decision was
- Why it was made
- What alternatives were considered
- Potential implications
Format example:
Date: 2024-01-15
Decision: Used median instead of mean for income variable
Rationale: Data showed significant right skew with outliers
Alternatives considered: Trimmed mean, log transformation
Impact: More robust to outliers; may lose information about high earners
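If you'd rather keep the log machine-readable, the same entry can be stored as structured data. The following is a minimal sketch, assuming a JSON Lines file (one JSON object per line); the `Decision` class, its field names, and the file name in the comment are illustrative choices, not an established standard.

```python
import datetime as dt
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class Decision:
    """One decision-log entry; field names are illustrative, not a standard."""
    decision: str
    rationale: str
    alternatives: list[str] = field(default_factory=list)
    impact: str = ""
    made_by: str = ""
    # Defaults to today's date in ISO format if none is supplied.
    date: str = field(default_factory=lambda: dt.date.today().isoformat())


def to_json_line(entry: Decision) -> str:
    """Serialize one entry as a single line of JSON."""
    return json.dumps(asdict(entry), ensure_ascii=False)


def append_to_log(entry: Decision, path: Path) -> None:
    """Append the entry to the log file without rewriting earlier entries."""
    with path.open("a", encoding="utf-8") as f:
        f.write(to_json_line(entry) + "\n")


# The entry from the format example above.
entry = Decision(
    decision="Used median instead of mean for income variable",
    rationale="Data showed significant right skew with outliers",
    alternatives=["Trimmed mean", "Log transformation"],
    impact="More robust to outliers; may lose information about high earners",
    date="2024-01-15",
)
print(to_json_line(entry))
# To persist it: append_to_log(entry, Path("decision_log.jsonl"))  # hypothetical file name
```

Appending rather than overwriting keeps earlier entries intact, which fits the idea of a journal you add to as decisions are made.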
2. Document Pre-Analysis Decisions
- Research questions and how they evolved
- Hypotheses (including those later abandoned)
- Sample size calculations and power analyses
- Inclusion/exclusion criteria for participants/data
- Variable selection rationale
- Pre-registration decisions (if applicable)
3. Data Preprocessing Documentation
Track every data transformation:
- Missing data handling (deletion, imputation method, rationale)
- Outlier treatment (identification method, how handled)
- Variable transformations (log, square root, standardization)
- Recoding decisions (binning, category merging)
- Data cleaning rules (what was removed and why)
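One way to track transformations is to route each step through a small wrapper that records what was done and how many rows it affected. This is only a sketch under simple assumptions: the data is a list of dictionaries, and the names `StepRecord` and `apply_step`, the sample rows, and the step itself are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StepRecord:
    """What a preprocessing step did, and to how many rows."""
    step: str
    rationale: str
    rows_before: int
    rows_after: int


def apply_step(
    rows: list[dict],
    step: str,
    rationale: str,
    func: Callable[[list[dict]], list[dict]],
    log: list[StepRecord],
) -> list[dict]:
    """Run one transformation and append a record of it to the log."""
    result = func(rows)
    log.append(StepRecord(step, rationale, len(rows), len(result)))
    return result


# Made-up example data, for illustration only.
raw = [
    {"id": 1, "income": 42000},
    {"id": 2, "income": None},
    {"id": 3, "income": 58000},
]

log: list[StepRecord] = []
clean = apply_step(
    raw,
    step="drop_missing_income",
    rationale="Listwise deletion of rows with no income value (illustrative)",
    func=lambda rs: [r for r in rs if r["income"] is not None],
    log=log,
)

for rec in log:
    print(f"{rec.step}: {rec.rows_before} -> {rec.rows_after} rows ({rec.rationale})")
```

The row counts before and after each step make it easy to reconstruct exactly what was removed and at which point.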
4. Statistical Analysis Decisions
Document:
- Test selection (why ANOVA vs. regression, etc.)
- Assumption checking (what you tested, results, how violations were handled)
- Significance levels (α chosen and justification)
- Multiple comparison corrections (which method, why)
- Model building strategy (stepwise, theory-driven, etc.)
- Software and package versions used
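For the last item, versions can be captured programmatically rather than typed by hand. Here is a minimal sketch for a Python workflow using only the standard library; the function name `environment_report` and the package names passed to it are assumptions, so substitute whatever your analysis actually imports.

```python
import platform
from importlib import metadata


def environment_report(packages: list[str]) -> dict[str, str]:
    """Collect the interpreter, OS, and installed package versions."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the gap explicitly instead of failing.
            report[name] = "not installed"
    return report


# Example package names; replace with the ones your analysis uses.
for key, value in environment_report(["numpy", "pandas", "scipy"]).items():
    print(f"{key}: {value}")
```

Saving this output next to your results ties each set of findings to the environment that produced it.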
5. Negative Results & Dead Ends
Crucially, document:
- Analyses that didn't work out
- Hypotheses not supported
- Models that didn't fit well
- Why certain approaches were abandoned
Recording these guards against accusations of "fishing" (reporting only the analyses that worked) and shows intellectual honesty.
6. Use Version Control
- For code: Git/GitHub with meaningful commit messages
- For documents: Track Changes in Word, or numbered file versions
- For data: Document any changes to datasets with version numbers
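For datasets, a version number is more trustworthy when paired with a checksum, so you can later verify that a file is the one an analysis actually used. The sketch below assumes a file-based dataset; `file_sha256`, `version_record`, and the record's keys are hypothetical names chosen for illustration.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def version_record(path: Path, version: str, note: str) -> dict[str, str]:
    """Describe one dataset version: file name, label, checksum, and what changed."""
    return {
        "file": Path(path).name,
        "version": version,
        "sha256": file_sha256(path),
        "note": note,
    }
```

Calling `version_record` each time a dataset changes, and storing the result in your decision log, gives every version a verifiable fingerprint: any edit to the file produces a different digest.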
7. Create an Analysis Script/Notebook
Use:
- R Markdown or Quarto
- Jupyter Notebooks (Python)
- Stata do-files with extensive comments
- SPSS syntax files (rather than point-and-click menus)
Benefit: code, narrative, and output live in one document
8. Standardized Documentation Template
Create a template with sections:
1. Research Question
2. Data Source & Collection
3. Variables (operational definitions)
4. Sample Characteristics
5. Data Cleaning Steps
6. Descriptive Statistics
7. Assumption Tests
8. Primary Analyses
9. Sensitivity Analyses
10. Deviations from Plan
9. Real-Time Documentation
- Document as you go, not retrospectively
- Use comments in code extensively
- Keep a research diary/lab notebook
- Note the date of each analysis session
10. Metadata and Codebooks
Create comprehensive documentation:
- Variable names and labels
- Coding schemes
- Measurement scales
- Sources for each variable
- Transformations applied
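A codebook covering the items above can be kept as a plain CSV so that it is readable in any tool. The following is a rough sketch: the column names mirror the list above, while `codebook_csv` and the two example variables are invented for illustration.

```python
import csv
import io

# One column per codebook item listed above.
FIELDS = ["name", "label", "coding", "scale", "source", "transformation"]


def codebook_csv(variables: list[dict]) -> str:
    """Render variable descriptions as CSV text, leaving missing fields blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for var in variables:
        writer.writerow({key: var.get(key, "") for key in FIELDS})
    return buffer.getvalue()


# Hypothetical variables, for illustration only.
variables = [
    {
        "name": "income",
        "label": "Annual household income",
        "coding": "USD; blank = missing",
        "scale": "ratio",
        "source": "Survey item Q12",
        "transformation": "none",
    },
    {
        "name": "edu",
        "label": "Highest education level",
        "coding": "1 = secondary; 2 = bachelor; 3 = postgraduate",
        "scale": "ordinal",
        "source": "Survey item Q4",
    },
]

print(codebook_csv(variables))
```

Writing the text to a file alongside the dataset keeps the definitions and the data together under the same version control.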
11. Decision Flowcharts
Create visual documentation:
- Participant flow diagrams
- Analysis decision trees
- "If-then" logic for data handling
12. Transparency Checklist
Before finalizing, confirm you've documented:
- [ ] All data exclusions
- [ ] All variable manipulations
- [ ] All measures collected (even if not analyzed)
- [ ] All experimental conditions
- [ ] Sample size determination
- [ ] Stopping rule for data collection
- [ ] All statistical tests conducted
Tools to Help
- OSF (Open Science Framework): For pre-registration and documentation
- Protocols.io: For detailed method protocols
- Electronic lab notebooks: LabArchives, Benchling
- Reference managers: Zotero, Mendeley (for tracking literature)
- Project management: Notion,