Structured markdown skill on dark background

Guide

How to Write a Skill: The Complete Guide

Trigger, methodology, output format, overrides, progressive disclosure, testing, and publishing. The definitive skill authoring reference.

Amodal Team12 min read

What a skill is

A skill is a methodology encoded as markdown. It tells the agent how to think about a class of problem. Not code. Not a prompt template. Not an API integration. A structured playbook that captures the reasoning process a domain expert follows when they solve a specific kind of problem.

The agent reads the skill and follows it — the same way a junior analyst follows a runbook written by a senior. The skill doesn't execute anything. It doesn't call APIs. It tells the agent what to look for, what order to check things in, what the output should look like, and what mistakes to avoid.

Here's a complete skill in 10 lines:

skills/quick-lookup.md

# Quick Lookup

## Trigger
Activate when the user asks for a specific record by ID, name, or reference number.

## Methodology
1. Identify the system and record type from the user's query.
2. Query the system for the exact record.
3. Summarize the key fields in 2-3 sentences.

## Output format
Lead with the record ID and status. Then the 3-5 most relevant fields. No tables for single records.

## What NOT to do
- Never guess a record ID. If ambiguous, ask the user to clarify.
- Never show raw JSON. Always summarize.

That's it. No imports. No dependencies. No build step. A markdown file that a domain expert can write, a compliance officer can review, and an agent can follow.

Anatomy of a skill

Every skill has four sections. Each one serves a distinct purpose, and skipping any of them weakens the skill.

Trigger

When the skill activates. Written in natural language. The agent scans installed skill triggers to decide which skill to load for a given conversation. Be specific enough to avoid false positives, broad enough to catch legitimate requests.

Trigger section

## Trigger
Activate when the user asks about deal status, pipeline health,
stale opportunities, or "what needs attention" in the sales context.

Common mistake

Vague triggers like "Activate when the user needs help" match everything and dilute the agent's skill selection. Be precise about the problem class.

Methodology

Numbered steps. The reasoning process the agent follows. This is where domain expertise lives. Each step should be concrete and actionable — not "analyze the data" but "check the last activity date against the stage-specific staleness threshold."

Methodology section

## Methodology
1. Query the CRM for all open opportunities in the current quarter.
2. For each opportunity, check the last activity date against
   stage-specific staleness thresholds.
3. Cross-reference with email activity — any deal with no email
   in 2x the staleness window is critical.
4. Check for missing fields: no decision maker by Proposal stage
   is a red flag.
5. Rank by revenue × probability × urgency. Present the top 10.

Output format

What the response looks like. This prevents the agent from dumping raw data or choosing an unhelpful presentation. Be explicit about structure, ordering, and what to lead with.

Output format section

## Output format
Lead with the count: "X deals need attention."

For each deal:
- **Deal name** (Stage) — $amount
- Why it's flagged (stale, missing field, no activity)
- Recommended action (one sentence)

Sort by urgency, not by amount. End with a one-line summary
of total pipeline at risk.

What NOT to do

Guardrails. The things the agent should never do when executing this skill. These prevent the most common failure modes — hallucinating data, auto-escalating, making assumptions the user didn't authorize.

What NOT to do section

## What NOT to do
- Never fabricate deal data. If the CRM query fails, say so.
- Never auto-assign or reassign deals. Recommend, don't act.
- Never compute win probability yourself — use the CRM's value.
- If there are more than 20 flagged deals, show the top 10 and
  offer to show the rest. Don't dump a 20-item list.

Progressive disclosure

Skills scale because the agent doesn't load them all at once. The agent starts each session with a skill index — just the name and a one-line description of each installed skill. Roughly 100 tokens per skill.

Only when the agent determines a skill is relevant to the current conversation does it load the full content. This means an organization with 20, 30, even 50 skills installed pays almost no context cost for unrelated conversations.

20 skills × ~100 tokens each = 2,000 tokens for the full index

Full skill body = 500–1,000 tokens, loaded on demand

Typical conversation: 1–2 skills fully loaded

vs. monolithic prompt: all instructions loaded always

Key insight

You can have dozens of skills installed without impacting performance on unrelated tasks. The agent pays the full cost only for the skills it actually uses in a given conversation.

This is why skills are better than stuffing everything into a system prompt. A system prompt is a monolith — every instruction, for every scenario, loaded into every session. Skills are modular. The agent loads only what it needs, when it needs it.

Three examples at three complexity levels

Simple: Quick record lookup (10 lines)

The simplest useful skill. Look up a record, summarize it, present it. No cross-referencing, no multi-system queries. One system, one record, one answer.

skills/record-lookup.md

# Record Lookup

## Trigger
Activate when the user asks for a specific record, ticket, or entity by ID or name.

## Methodology
1. Identify the system (CRM, ticketing, billing) from context.
2. Query for the exact record.
3. Extract the 5 most relevant fields for the record type.

## Output format
One-line status summary. Then key fields as a compact list. No tables.

## What NOT to do
- Never guess IDs. Ask to clarify if ambiguous.
- Never show raw API responses.

Medium: Deal triage (25 lines)

Cross-references multiple data points from one or two systems. Has a real methodology with prioritization logic. Produces a structured, actionable report.

skills/deal-triage.md

# Deal Triage

## Trigger
Activate when the user asks about pipeline health, stale deals,
"what needs attention," or deal prioritization.

## Methodology
1. Query the CRM for all open opportunities closing this quarter.
2. For each deal, calculate days since last activity.
3. Apply staleness thresholds by stage:
   - Prospecting: 7 days
   - Qualification: 10 days
   - Needs Analysis: 14 days
   - Proposal/Negotiation: 14 days
4. Cross-reference with email: no email in 2x staleness = critical.
5. Flag deals missing key fields (no decision maker by Proposal,
   no next step by Qualification).
6. Rank by: urgency (critical > warning) then revenue descending.

## Output format
Lead with count: "X deals need attention (Y critical, Z warning)."
For each deal:
- **Name** (Stage) — $amount — days since activity
- Flag reason
- One-sentence recommended action
End with total pipeline at risk.

## What NOT to do
- Never reassign deals. Recommend only.
- Never compute win probability — use CRM values.
- If >15 deals flagged, show top 10, offer to expand.
- Never fabricate activity dates.

Complex: Threat investigation with task agents (40 lines)

The most powerful pattern. The skill tells the agent to dispatch parallel task agents for data gathering, then synthesize the results into a prioritized report. Each task agent gets its own isolated context — the primary agent stays clean for reasoning.

skills/threat-investigation.md

# Threat Investigation

## Trigger
Activate when the user asks to investigate an alert, incident,
suspicious activity, or IOC (IP, hash, domain, user).

## Methodology
1. Parse the alert or IOC from the user's message. Identify the
   entity type (IP, hash, domain, user, host).
2. Dispatch parallel task agents to gather context:
   - Agent 1: Query SIEM for all events involving this entity (24h)
   - Agent 2: Query EDR for host telemetry and process trees
   - Agent 3: Query threat intel for known indicators
   - Agent 4: Query identity provider for user activity and auth logs
   - Agent 5: Query network logs for lateral movement indicators
   - Agent 6: Query asset inventory for host details and owner
3. Synthesize task agent results into a timeline:
   - First seen, last seen, total event count
   - Sequence of actions in chronological order
   - Any indicators that correlate across systems
4. Assess severity:
   - Critical: confirmed malicious activity + lateral movement
   - High: confirmed malicious activity, contained to one host
   - Medium: suspicious activity, no confirmed IOC match
   - Low: anomalous but likely benign (check false_positives KB)
5. Generate recommended actions based on severity:
   - Critical: isolate host, disable account, page on-call
   - High: isolate host, create ticket, notify team
   - Medium: create ticket, add to watchlist
   - Low: document, update baselines if benign

## Output format
Lead with severity and one-sentence summary.
Then the timeline (chronological, max 15 entries).
Then correlated indicators across systems.
Then recommended actions as a numbered checklist.

## What NOT to do
- Never auto-isolate hosts or disable accounts without confirmation.
- Never assess severity without evidence. "Insufficient data" is valid.
- Never skip systems that are unavailable — note them as gaps.
- If the entity appears in the false_positives KB, say so immediately.

Context isolation

The 6 task agents each process 5–15K tokens of raw data, but the primary agent only sees their 200–500 token summaries. The skill describes WHAT to gather — the runtime handles the dispatch, isolation, and summarization automatically.

The override model

Marketplace skills encode community best practices. Your organization has specifics. The override model lets you inherit the base methodology and layer your customizations on top.

skills/acme-deal-triage.md

---
import: deal-triage
---
# Acme Deal Triage Overrides

## Our sales cycle
Mid-market: 45 days. Enterprise: 90 days.

## Staleness threshold adjustments
- Enterprise deals: all thresholds get 2x multiplier
- Partner-sourced deals: add 5 days to each threshold

## Additional flags
- Any enterprise deal in Proposal without a contact
  with Role = "Decision Maker" is critical.
- Any deal over $500K without an executive sponsor
  documented is a warning.

## Output format additions
After the standard deal list, add a section:
"Enterprise deals requiring exec attention" — filtered to
deals >$250K with any critical or warning flag.

The import: deal-triage frontmatter tells the runtime to load the base deal-triage skill first, then append your overrides. When the marketplace skill updates — better methodology, new edge cases handled — you get the improvements automatically. Your overrides stay intact.

You can override any section. Add new methodology steps, adjust thresholds, extend the output format, add guardrails. The base skill provides the framework. You fill in your specifics.

Version pinning

Overrides pin to the installed version of the base skill. When a new version publishes, you choose when to upgrade. Review the changelog, test, then update. No surprise behavior changes.

Testing skills

amodal dev starts a local chat session with your skills loaded. Test by having conversations and verifying three things: the right skill activates, the methodology is followed, and the output matches the format you specified.

Test: correct activation

Terminal

$ amodal dev

You: What deals need attention this quarter?
Agent: [deal-triage skill activates]
       4 deals need attention (1 critical, 3 warning)...

You: What's the weather in Chicago?
Agent: [no skill activates — general knowledge]
       I don't have access to weather data...

You: Investigate the alert from Crowdstrike
Agent: [threat-investigation skill activates]
       Dispatching 6 task agents to gather context...

Test: methodology adherence

Watch the agent's reasoning. Does it follow your numbered steps in order? Does it check what you told it to check? Does it skip steps or invent its own? If it deviates, your methodology steps aren't specific enough. Make them more concrete.

Test: output format

Does the response match your output format section? Does it lead with the count? Does it sort by urgency? Does it include the recommended actions? If not, tighten the output format description. Be prescriptive — "Lead with X. Then Y for each item. End with Z."

Test: guardrails

Try to make the agent violate its "What NOT to do" section. Ask it to auto-reassign deals. Ask for data that doesn't exist. Push it to dump raw JSON. If it complies, your guardrails need to be stronger. The "What NOT to do" section is your defense against the model's defaults — make it explicit and specific.

Iteration loop

Write the skill → test in amodal dev → adjust → test again. Most skills take 2–3 iterations to get right. The first draft captures the methodology. The second tightens the output format. The third adds the guardrails you only discover by testing.

Publishing to the marketplace

Once your skill is tested and working, publish it. One command validates, packages, and submits.

Terminal

$ amodal publish skill skills/deal-triage.md

Validating skill...
  ✓ Trigger section found
  ✓ Methodology section found (5 steps)
  ✓ Output format section found
  ✓ What NOT to do section found
  ✓ No secrets or credentials detected

Packaging...
  ✓ Skill ID: deal-triage
  ✓ Version: 1.0.0
  ✓ Size: 847 tokens

Publishing...
  ✓ Submitted to marketplace
  ✓ Visibility: org-private

Published deal-triage@1.0.0
Share: https://marketplace.amodalai.com/skills/deal-triage

Three visibility levels:

Org-private

Only your organization can install it. Default for first publish. Good for internal methodologies with proprietary knowledge.

Community

Anyone can install it. Goes through community review. Good for general-purpose methodologies that aren't org-specific.

First-party

Maintained by Amodal. Curated, tested, and updated. These are the reference implementations for common domains.

Version updates use amodal publish skill --version 1.1.0. Published versions are immutable — no silent changes. Consumers choose when to upgrade. Every version gets a changelog entry.

A skill is:

Markdown

Version-controlled

Human-readable

Composable

Shareable

No build step. No dependencies. No engineering required. A domain expert writes it, a compliance officer reviews it, and an agent follows it. The methodology outlives every platform migration and model upgrade.

Skills are the new code →From domain expert to architect →Why configuration beats intelligence →