Write step-by-step data cleaning instructions for a messy spreadsheet
Raw exports from CRMs, survey tools, or ERPs are almost always dirty. This prompt produces a concrete, ordered cleaning checklist tailored to the specific problems in your dataset.
You are a data analyst who specializes in preparing raw data for analysis. I have a messy spreadsheet that needs cleaning before I can use it.
Spreadsheet description: {{SPREADSHEET_DESCRIPTION}}
Known problems (list everything you've noticed): {{KNOWN_PROBLEMS}}
Tool I'm using (e.g., Excel, Google Sheets, Python/pandas, R): {{TOOL}}
Final goal for this data (e.g., pivot table, chart, export to database): {{FINAL_GOAL}}
Follow these steps:
1. Group the problems I listed into categories: structural issues (wrong shape, merged cells, headers in wrong row), data-type issues (dates stored as text, numbers with currency symbols), consistency issues (inconsistent naming, mixed cases), and completeness issues (blanks, nulls).
2. For each problem, write a specific, numbered cleaning step with the exact formula, function, or code snippet needed in {{TOOL}}.
3. Order the steps so that no step breaks or undoes a later one (e.g., fix structure before applying formulas).
4. Add a validation check after each major step so I can confirm it worked before moving on.
5. Flag any problems that cannot be fixed programmatically and require a manual review or a decision from the data owner.
Do not suggest steps that require tools or permissions I have not mentioned.
How to use this prompt
- Copy the prompt above (Copy button on the top-right).
- Replace each {{VAR}} with your own value. Variables: {{SPREADSHEET_DESCRIPTION}}, {{KNOWN_PROBLEMS}}, {{TOOL}}, {{FINAL_GOAL}}.
- Paste it into one of the recommended tools below.
- Iterate: tighten constraints in the prompt if the output is generic.
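If you reuse the prompt often, the variable replacement described in the list above can be scripted. The following is a minimal sketch, not part of the prompt itself: the function name `fill_prompt`, the shortened template, and the example values are all hypothetical. It refuses to produce a prompt while any placeholder is missing or blank.

```python
import re

# Shortened copy of the prompt's input section, for illustration only.
PROMPT_TEMPLATE = (
    "Spreadsheet description: {{SPREADSHEET_DESCRIPTION}}\n"
    "Known problems (list everything you've noticed): {{KNOWN_PROBLEMS}}\n"
    "Tool I'm using: {{TOOL}}\n"
    "Final goal for this data: {{FINAL_GOAL}}\n"
)

# Matches placeholders of the form {{UPPER_CASE_NAME}}.
PLACEHOLDER = re.compile(r"\{\{([A-Z_]+)\}\}")


def fill_prompt(template: str, values: dict[str, str]) -> str:
    """Replace every {{NAME}} placeholder; raise if a value is missing or blank."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        value = values.get(name, "").strip()
        if not value:
            raise ValueError(f"No value supplied for {{{{{name}}}}}")
        return value

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical example values; replace with a description of your own data.
example_values = {
    "SPREADSHEET_DESCRIPTION": "CRM export, 12 columns, one row per contact",
    "KNOWN_PROBLEMS": "dates stored as text; duplicate contacts; blank emails",
    "TOOL": "Python/pandas",
    "FINAL_GOAL": "export to database",
}

filled = fill_prompt(PROMPT_TEMPLATE, example_values)
print(filled)
```

Failing on a blank value is a deliberate choice here: a prompt sent with an unfilled placeholder tends to produce the generic output the last step warns about.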
Why this prompt is structured this way
The prompt is split into explicit steps because LLMs tend to produce more complete, better-ordered output when the path is named rather than implied. Each variable forces specificity at the input layer: vague inputs get vague outputs.
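To make the payoff concrete, here is a rough sketch of the kind of step-plus-validation output specific inputs might yield, assuming {{TOOL}} is set to Python/pandas. The column names and sample rows are invented for illustration, and a real response would depend on the problems you list.

```python
import pandas as pd

# Hypothetical raw export; every column name and value is made up for illustration.
raw = pd.DataFrame({
    "customer": [" acme corp", "ACME Corp", "Globex ", None],
    "amount": ["$1,200.50", "$300", "n/a", "$45.00"],
    "signup_date": ["2024-01-15", "2024-02-30", "2024-03-01", "2024-04-10"],
})

df = raw.copy()

# Step 1 (consistency): trim whitespace and normalize case in names.
df["customer"] = df["customer"].str.strip().str.title()
# Validation: no leading or trailing whitespace remains in non-null names.
assert not df["customer"].dropna().str.contains(r"^\s|\s$").any()

# Step 2 (data type): strip currency symbols and thousands separators, then convert.
# Values that still are not numbers (such as "n/a") become NaN.
df["amount"] = pd.to_numeric(
    df["amount"].str.replace(r"[$,]", "", regex=True), errors="coerce"
)
# Validation: the column is now numeric.
assert pd.api.types.is_numeric_dtype(df["amount"])

# Step 3 (data type): parse text dates; impossible dates such as Feb 30 become NaT.
df["signup_date"] = pd.to_datetime(
    df["signup_date"], format="%Y-%m-%d", errors="coerce"
)
# Validation: the column now has a datetime dtype.
assert pd.api.types.is_datetime64_any_dtype(df["signup_date"])

# Step 4 (completeness): collect rows with any missing value for manual review
# instead of guessing at a fill value.
needs_review = df[df.isna().any(axis=1)]

print(df)
print(f"{len(needs_review)} rows need manual review")
```

The order mirrors the prompt's own logic: consistency and type fixes run first so that the completeness check also catches values that only turned out to be unusable once they were converted.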
Pair this prompt with a tool
Perplexity
$0/mo (Pro at $20). AI search engine with citations.
Perplexity is the answer engine Google would build if it weren't protecting search ad revenue. Cited answers, follow-up questions, focused source modes.
Claude (Anthropic)
$0/mo (Pro at $20). Frontier model with long context and strong reasoning.
Claude (Opus / Sonnet / Haiku tiers) is the assistant favored by writers and engineers who care about reasoning quality and tone. 1M token context on Opus.
ChatGPT (OpenAI)
$0/mo (Plus at $20). The category-defining general-purpose AI assistant.
ChatGPT has the broadest feature surface: image gen, voice, custom GPTs, web browsing, code execution. Often the right default; sometimes beaten on specific tasks by Claude or Perplexity.