# Skill Writing Guide
Best practices for writing effective Claude Code skills.
## Two Categories of Skills
- Capability uplift — teaches the agent something it couldn't do before (scaffold component, run audit, deploy)
- Encoded preference — captures your specific way of doing something the agent could already do (commit style, review checklist, naming conventions)
Know which you're building — it changes how much detail to include.
## Description Optimization
The description is the most important line. It determines when the skill gets triggered.
- List trigger contexts explicitly: "Use when the user wants to X, Y, or Z"
- Think about should-trigger / should-not-trigger scenarios
- A slightly "pushy" description is better than a vague one
- Test: would this description make the model select this skill for the right prompts?
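To make this concrete, here is a sketch of a vague description next to an explicit one (the skill and its wording are hypothetical):

```yaml
# Vague — the model has to guess when this applies:
description: Helps with accessibility.

# Explicit — lists trigger contexts the model can match against:
description: >-
  Use when the user wants to audit a page for accessibility,
  fix WCAG violations, or review HTML for screen-reader support.
```

The second version is "pushier," but it gives the model concrete phrases to match against the user's prompt.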
## Writing Instructions
### Explain WHY, not just rules
- Bad: "MUST use semantic HTML"
- Good: "Use semantic HTML elements (nav, main, aside) because screen readers depend on landmarks for navigation"
### Avoid heavy-handed MUSTs
- Reserve MUST/NEVER for genuine constraints (security, data loss)
- For preferences, explain the reasoning and let the agent make good decisions
### Progressive disclosure
Three levels of instruction loading:
- Frontmatter — always loaded (name, description). Keep minimal.
- Body — loaded when skill is invoked. Core instructions here.
- Bundled resources — loaded on demand via `Read`. Put reference tables, specs, examples here.
Use bundled resources (references/, scripts/, assets/) for content that would bloat the main SKILL.md.
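A minimal sketch of the first two levels in a single SKILL.md (skill name and wording are hypothetical):

```markdown
---
name: deploy-preview
description: Use when the user wants to deploy a preview build or share a staging link.
---

# Deploy Preview

1. Build the project and push it to the preview environment.
2. For environment-specific flags, read ${CLAUDE_SKILL_DIR}/references/environments.md.
```

Only the frontmatter is always in context; the numbered steps load on invocation, and the referenced file loads only if the agent actually reads it.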
### Every sentence should change behavior
- Delete filler: "It is important to...", "Make sure to...", "Please note that..."
- Delete obvious instructions the agent would do anyway
- Test: if you removed this sentence, would the output change? No → delete it.
## Structure Conventions
### Project conventions (this repo)
- Always set `disable-model-invocation: true`
- Use H1 for the skill title (short action phrase)
- Reference `$ARGUMENTS` early in the body
- Use `!`-prefixed backticks for live data injection (git diff, file listings)
- Numbered steps, imperative voice
- Output format in a fenced markdown block if structured
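Put together, a skill following these conventions might look like this sketch (skill name, command, and wording are hypothetical):

```markdown
---
name: review-diff
description: Use when the user asks to review staged changes or a branch diff.
disable-model-invocation: true
---

# Review Diff

Review the changes, focusing on the concerns named in $ARGUMENTS.

Current diff:
!`git diff --staged`

1. Summarize the intent of the change in one sentence.
2. Flag correctness, security, and naming issues.
3. Output findings as a markdown table: file, issue, severity.
```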
### Bundled resources pattern
```
.claude/skills/my-skill/
  SKILL.md      # Main instructions
  references/   # Specs, guides, schemas
  scripts/      # Shell scripts, templates
  assets/       # Static files
```

Reference from SKILL.md: `Read ${CLAUDE_SKILL_DIR}/references/spec.md`
## Length Guidelines
- Simple skills (encoded preference): 30-50 lines
- Standard skills (capability uplift): 50-100 lines
- Complex skills (multi-mode, research): 100-200 lines
- Maximum: 500 lines (if exceeding, split into bundled resources)
## Common Mistakes
- Overfitting to test cases — write general instructions, not scripts for specific inputs
- Too many rules — the agent ignores rules after ~20 constraints. Prioritize.
- No examples — for complex output formats, show one complete example
- Ignoring conversation context — skills without fork can use prior conversation. Leverage it.
- Forgetting edge cases — what happens with empty input? Invalid arguments? Missing files?
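Edge-case handling can be as short as a guard clause near the top of the skill body (wording hypothetical):

```markdown
If $ARGUMENTS is empty, ask the user which file or PR to target before doing anything else.
If the target file does not exist, report that and stop rather than guessing.
```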
## Improvement Workflow
1. Draft the skill
2. Test with 3-5 realistic prompts
3. Review output — does every instruction change behavior?
4. Remove filler, tighten descriptions
5. Add edge case handling for failures observed in testing
6. Re-test after changes
## Evaluation Criteria
When reviewing a skill, score against:
- Trigger accuracy — does the description match the right prompts?
- Instruction clarity — can the agent follow without ambiguity?
- Output quality — does the skill produce useful, consistent results?
- Conciseness — is every line earning its place?
- Robustness — does it handle edge cases and errors?