Your agent has 40 skills. It uses 3.
You spent a weekend wiring up your skills library. pdf-processing. code-reviewer. log-analyzer. deployment-helper. db-schema-explorer. The whole catalog. You felt productive.
Then Monday morning you asked your agent to do a thing one of those skills covers. It just answered the question with basic tools. No skill invoked. Tuesday too. Wednesday too.
You start to wonder why you bothered.
The reason isn't the skills. It's the descriptions.
How skills actually load
Every modern agent runtime (Claude Code, Cursor, OpenCode, OpenClaw, Gemini CLI) uses the same three-tier loading model for the SKILL.md format:
Level 1 - Frontmatter (always in context). At session start, the agent reads only the YAML frontmatter from every installed skill: name and description. That's roughly 100 tokens per skill, and it lives in the system prompt for the entire session. You can install 40 skills for roughly 4,000 tokens of overhead, effectively no context cost.
Level 2 - Instructions (loaded on trigger). When the agent decides a skill is relevant, it reads the full body of SKILL.md into context. This is where your step-by-step workflow lives. Recommended budget: under 5,000 tokens.
Level 3 - Reference files (loaded on demand). Anything inside the skill's references/, assets/, or scripts/ directories. Loaded only when SKILL.md tells the agent to read them.
The architecture is elegant. The cost at idle is near zero. The cost at execution scales with what's actually needed.
But notice what the agent has to work with at decision time: just the name and the description. That's the entire job interview. If your description doesn't pattern-match what the user actually said, your skill never gets called. The instructions you spent two hours writing? The agent never sees them.
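To make that concrete, here is the entire Level 1 payload for a hypothetical pdf-processing skill (illustrative frontmatter, not an official skill). This is all the agent sees until it decides to load Level 2:

```markdown
---
name: pdf-processing
description: Extracts text, tables, and form fields from PDF
  files. Use when the user mentions PDFs, scanned documents,
  forms, or pulling data out of a document.
---
```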
The five rules of descriptions that fire
Read roughly 200 SKILL.md files (Anthropic's official skills, the agentskills.io examples, a fistful of community libraries) and the pattern is clear. Descriptions that trigger reliably follow five rules. Descriptions that don't, break at least one.
1. Write in the third person. The description is injected into the system prompt. "I will help you process PDFs" reads like the agent's own voice and confuses the routing logic. "Processes PDFs and extracts tables" reads like a tool capability. The latter routes correctly every time. This rule alone fixes maybe a third of broken skills.
2. Cover what AND when. "Processes PDF files" describes capability. That's half the job. The other half is trigger language: "Use when the user mentions PDFs, forms, or document extraction." The model is doing pattern-matching against user input, not capability inference. Give it the words.
3. Use specific trigger phrases. Bad: "for working with documents." Good: "Use when the user mentions PDFs, forms, or document extraction." Generic trigger language gets out-competed by skills with concrete terms. If your skill is for log analysis, the description should mention the actual phrases users say: "check the logs," "find errors," "what's failing in production." (A worked example applying these rules follows the list.)
4. Don't restate the name. A skill called log-analyzer with a description that says "analyzes logs" is wasting the description field. The name already communicates the gist. The description's job is to expand on it: what kinds of logs, what kinds of analysis, what triggers it.
5. Don't write skills for trivial tasks. Even a perfect description won't trigger a skill on a simple task. Ask "read this PDF" and the agent will just read the PDF with basic tools. Skills reliably fire on multi-step or specialized work: extracting tables across multiple PDFs, filling forms, comparing versions. If your skill is one bash call wrapped in a markdown file, delete it.
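Put rules 1 through 4 together and the log-analyzer from rule 3 might look like this (an illustrative sketch, not an official skill):

```markdown
---
name: log-analyzer
description: Parses application and server logs to surface
  errors, stack traces, and anomalies. Use when the user says
  "check the logs", "find errors", "what's failing in
  production", or asks why a service is crashing or slow.
---
```

Third person, capability plus triggers, concrete phrases, and it expands on the name instead of restating it.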
A before-and-after
Take a fictional release-notes-writer skill. Most builders ship it like this:
```markdown
---
name: release-notes-writer
description: Writes release notes.
---
```

Six words. Restates the name. No trigger phrases. No third-person framing of when to use it. The agent will silently route around this skill on basically every relevant request.
Now the same skill with the rules applied:
```markdown
---
name: release-notes-writer
description: Generates user-facing release notes from a git
  commit range. Use when the user asks for release notes,
  a changelog, "what changed since last release", or a
  summary of commits between tags. Groups changes by type
  (features, fixes, breaking) and writes in plain user-facing
  language, not engineer shorthand.
---
```

This one fires. Third person. Capability plus triggers. Specific phrases users actually say. Expands on the name. Implies the multi-step nature of the work.
The test that matters
You can't trust your gut on whether a description works. The way to know is to write 20 eval queries: 10 that should trigger the skill and 10 that should not, then run them in a fresh session. Watch what fires.
Most builders skip this step. Most builders also have 40 skills that don't fire.
The Blueprint below has both pieces: the optimized SKILL.md template and the eval framework. Paste them. Run them. Within 30 minutes you'll know which of your skills are real and which are decoration.
The bigger picture: agentskills.io is now a portable open spec. A SKILL.md you write today works in Claude Code, Cursor, OpenCode, OpenClaw, Gemini CLI, and the 40+ runtimes that have adopted the format. The investment carries forward across tools. And the trigger problem is universal: every runtime routes on the description, so the engineering work pays off everywhere.
Get the description right once. Use it everywhere.
A SKILL.md that fires + a 20-query eval
Two parts. Copy both. The skill goes in ~/.claude/skills/release-notes-writer/SKILL.md (or your runtime's equivalent, such as ~/.openclaw/workspace/skills/release-notes-writer/SKILL.md). The eval is a markdown checklist you run in a fresh session.
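On disk, that's one folder and one file (Claude Code's default path shown; adjust for your runtime):

```
~/.claude/skills/
└── release-notes-writer/
    └── SKILL.md
```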
Part 1: The skill
```markdown
---
name: release-notes-writer
description: Generates user-facing release notes from a git
  commit range. Use when the user asks for release notes,
  a changelog, "what changed since last release", or a
  summary of commits between tags. Groups changes by type
  (features, fixes, breaking) and writes in plain user-facing
  language, not engineer shorthand.
---
# Release Notes Writer
## Quick start
1. Identify the commit range. If the user gave one, use it.
Otherwise: `git describe --tags --abbrev=0` for the last
tag, then range from that tag to HEAD.
2. Get commits: `git log <range> --pretty=format:"%h %s"`
3. Group by type using conventional commit prefixes:
feat → Features
fix → Fixes
BREAKING CHANGE / ! → Breaking changes
chore/docs/refactor → skip unless user asks for everything
4. Rewrite each entry in user-facing language. "fix: handle
null in user.email" becomes "Fixed a crash when users had
no email address on file."
5. Lead with breaking changes, then features, then fixes.
6. End with a one-line summary suitable for an email subject.
## Tone rules
- Past tense, active voice
- No engineer jargon (no "refactored", no "typo", no PR numbers)
- Skip anything users won't notice (test changes, CI tweaks)
- If a fix sounds dramatic, soft-pedal it. Don't scare users.
## Output format
Default to markdown with three sections (## Breaking changes,
## New features, ## Fixes). Skip empty sections. If user asks
for plaintext or a different structure, use that instead.Part 2: The eval
Run these in a fresh session with the skill installed. Mark each one fired / didn't fire. Goal: 10 of 10 should-trigger fire, 0 of 10 should-not-trigger fire. Anything else means the description needs another pass.
Should trigger (10):
Write release notes for v2.4.0
Generate a changelog from main..v2.3.0
What changed since the last release?
Summarize the commits between these two tags
I need user-facing release notes
Draft notes for the v2.4 announcement
Convert this git log into release notes
Help me write a changelog
What's new in this version compared to last?
Format these commits as release notes for users
Should NOT trigger (10):
Show me the commits since last week
Review this code change for bugs
Write documentation for the API
Summarize this PR description
Help me write a commit message
What's in HEAD right now?
Compare these two branches for me
Generate a project README
Translate these notes to French
Create a marketing email about the release
If a should-trigger fails, the description is missing trigger language users actually use. If a should-not-trigger fires, the description is too broad. Tighten and re-run.
This is the entire feedback loop. Twenty queries, fifteen minutes. Do this once per skill before you trust it in production.
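A minimal sketch of that checklist, using the queries above (the format is illustrative; keep whatever you'll actually fill in):

```markdown
## Eval: release-notes-writer (fresh session)

Should trigger (mark each that fired):
- [ ] Write release notes for v2.4.0
- [ ] Generate a changelog from main..v2.3.0
- [ ] ...the remaining 8 from the list above

Should NOT trigger (mark any that fired anyway):
- [ ] Show me the commits since last week
- [ ] Review this code change for bugs
- [ ] ...the remaining 8 from the list above

Score: __ /10 fired, __ /10 stayed quiet.
Anything but 10 and 0: revise the description and re-run.
```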
Skill review: skill-creator (the meta-skill)
What it does: Anthropic's official skill-creator skill, a SKILL.md whose only job is to generate other SKILL.md files. Lives at github.com/anthropics/skills. Bundles a copy of the agentskills.io spec and a working example as L3 references, so when the agent decides to create a skill, it reads the spec inline and produces something that conforms.
Setup difficulty: Trivial. Drop the folder into your skills directory. The skill itself is one file plus a references/ directory.
Verdict: This is the skill that should ship with every agent runtime by default. Not because the SKILL.md format is hard; it's a YAML header and some markdown. It's because the description engineering is hard, and skill-creator builds in an evaluation system. After it generates a skill, it offers to optimize the description by running 20 eval queries against the new file and reporting which ones trigger correctly. That's the right loop, automated.
Watch out for: The auto-generated descriptions can be verbose. The skill's bias is toward including more trigger language to maximize firing. That's correct on average but produces frontmatter that runs longer than the Anthropic style guide recommends (their official skills lean shorter). Trim by hand after generation.
Rating: Essential. If you're going to write more than three skills, install this first and make every new one go through it.
Configuration is the product. Six issues in.
When this newsletter started, the thesis was: the model is 20% of the work. The configuration is the other 80%.
Six issues in, the workspace files arc is complete. SOUL.md for identity. AGENTS.md for operational rules. USER.md for context about you. MEMORY.md for what the agent has learned. SKILL.md for what the agent can do. Every issue has been about a different file in the same directory.
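Laid out as a directory, the arc so far (layout is illustrative; exact paths vary by runtime):

```
workspace/
├── SOUL.md      # identity
├── AGENTS.md    # operational rules
├── USER.md      # context about you
├── MEMORY.md    # what the agent has learned
└── skills/
    └── <name>/SKILL.md   # what the agent can do
```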
The pattern that keeps showing up: people wait for a model upgrade to fix things that are configuration problems. Their agent forgets context, they upgrade to Opus. Their agent misses tasks, they switch to Sonnet 4.6. Their skills don't fire, they install more skills.
None of those move the needle. What moves the needle is reading the file that's not working and writing it better.
The next stretch of issues will go up the stack: agent-to-agent coordination, evals as a workflow, model routing as architecture. But the foundation is six markdown files. Get those right, and almost every other problem is fixable.
See you next week.
Michael
