
Claude Skills: The End of Prompt Engineering?

After spending months perfecting prompts, Skills made most of it obsolete. Here's what actually changed - and what didn't.


I’ve spent the last six months maintaining a 3000-line prompt library. Weekly team updates, brand compliance checks, security audits - all meticulously crafted prompts that I’d paste into Claude depending on the task.

Then Skills dropped yesterday and made 90% of that work obsolete.

Here’s the thing: I’m not mad about it. I’m actually kind of relieved. Because tbh, copy-pasting prompts from Notion into Claude is not how I want to spend my time.

But let’s be clear - Skills aren’t “the end of prompt engineering” like some blog posts are claiming. They’re more like… the beginning of prompt packaging. And that distinction matters a lot.



What Actually Are Skills?

Skills are essentially organized folders of instructions that Claude loads dynamically when it thinks they’re relevant. You write a SKILL.md file with some YAML frontmatter, drop it in ~/.claude/skills/, and Claude discovers it automatically.

Here’s what makes them different from just saving prompts in a text file:

1. Progressive disclosure - Claude sees skill names and descriptions at startup, but only loads the full content when needed. This is huge for context window management.

2. Composability - Multiple skills can stack together automatically. Need brand compliance and security auditing? Both skills activate if your task needs both.

3. Security model - Skills can restrict which tools Claude can use via allowed-tools. Want a read-only agent that can’t modify files? That’s actually possible now.

4. Cross-platform - Same skill works in Claude.ai, Claude Code, and the API. Write once, use everywhere.

The official anthropics/skills repo had 958 stars at the time of writing, just 12 hours after launch, and counting. People are clearly interested.


How I Actually Use Skills (Real Examples)

Weekly Team Updates

I used to have a saved prompt for writing team updates. Every Friday I’d copy it, fill in the details, and send it off. The prompt was ~400 tokens of “use these headings, match this tone, include metrics” instructions.

Now I have a skill:

---
name: team-updates
description: Write weekly team status updates in our standard format
---

# Team Updates

Generate a weekly team update with these sections:

## This Week's Wins
- Key accomplishments (2-4 bullets)
- Link to relevant PRs/docs where applicable
- Quantify impact when possible

## In Progress
- Current work with % completion estimates
- Blockers or dependencies (be explicit)

## Next Week
- Priorities for the coming week
- Major decisions needed

## Metrics
- Deployment count
- Test coverage change
- Customer-facing improvements

**Tone:** Direct and factual. No corporate speak. Use "we" not "the team."
**Length:** 200-300 words max. Nobody reads walls of text.

The difference? I don’t think about it anymore. I just start writing “Update for Oct 15…” and Claude loads the skill automatically based on context.

Is this revolutionary? No. Is it convenient? Yes. And honestly, convenience is what makes tools stick.

Brand Compliance for Artifacts

We ship a lot of Claude artifacts to customers. They need to follow brand guidelines - specific colors, fonts, component patterns.

My old workflow: Keep a Google Doc with the style guide, reference it manually, hope I don’t miss anything.

New workflow: brand-guidelines skill with our exact color codes, typography rules, and component patterns. Claude applies them automatically when building artifacts.

---
name: brand-guidelines
description: Apply Agnost AI brand colors and typography to artifacts
---

# Brand Guidelines

When creating visual artifacts, use these exact values:

## Colors
- Primary: #0066FF
- Secondary: #00D4FF
- Background: #F8F9FA
- Text: #1A1A1A
- Borders: #E5E7EB

## Typography
- Headings: Inter, 600 weight
- Body: Inter, 400 weight
- Code: JetBrains Mono

## Components
Use shadcn/ui components when available:
- Buttons: Use `<Button variant="default">` for primary actions
- Cards: Include subtle border, no drop shadow
- Forms: Always include validation states

Never use: Comic Sans, bright neon colors, or anything that would make our designer cry.

The skills model means this isn’t just a reference document - Claude actively applies these rules without me needing to remind it every time.

Security Auditing (Read-Only Agent)

This is where the allowed-tools feature gets interesting.

I needed an agent that could audit code for security issues but couldn’t modify anything. You can’t really do that with regular prompting - Claude will still ask for write permissions if it wants to fix something.

With Skills:

---
name: security-auditor
description: Audit code for security vulnerabilities (read-only)
allowed-tools: Read, Grep, Glob
---

# Security Auditor

Review code for common security vulnerabilities:

## Check For:
- Exposed API keys or credentials
- SQL injection vectors
- XSS vulnerabilities
- Insecure dependencies
- Improper auth checks

## Output Format:
For each finding:
1. Severity (Critical/High/Medium/Low)
2. File and line number
3. Description of the vulnerability
4. Suggested fix (but don't implement it)

You are read-only. Never modify files or suggest running commands that would change state.

The `allowed-tools: Read, Grep, Glob` restriction means Claude cannot write files even if it wants to. The environment enforces this at the tool level.

This is actually pretty powerful for team workflows. You can have junior devs run security audits without worrying they’ll accidentally break something.


The Technical Model: Why This Works

The Skills architecture is smarter than it first appears.

Progressive Disclosure (The Key Innovation)

Claude loads skills in stages:

  1. Metadata level - Name and description pre-loaded at startup (~50 tokens per skill)
  2. Core content - Full SKILL.md loaded when Claude determines relevance (~200-500 tokens)
  3. Supplementary files - Additional resources loaded on-demand (unbounded)

This is why Skills can scale. You could have 20 skills installed, but only pay the context window cost for the 2-3 that are actually relevant to your current task.

The filesystem acts as external memory. Want to bundle a 50-page API reference? Put it in a separate file that Claude loads only when needed.

security-auditor/
├── SKILL.md                    # Core instructions (~300 tokens)
├── owasp-top-10.md            # Reference material (10k tokens)
└── common-vulns-by-lang.md    # Language-specific guides (15k tokens)

Claude reads SKILL.md when the skill activates. It only loads the reference files if it needs specific details. Zero token cost until then.

Compare this to Custom Instructions (always loaded, ~1000 token limit) or Projects (always loaded, massive token cost). Skills are architected for scale.
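The staged loading above can be put in rough numbers. This sketch uses the token figures quoted in this post (they're estimates, not measured values) to compare progressive disclosure against loading every skill up front:

```python
# Back-of-envelope context cost of progressive disclosure.
# Token figures are this post's rough estimates, not measured values.
METADATA_TOKENS = 50   # name + description, always loaded at startup
CORE_TOKENS = 350      # full SKILL.md, loaded only when relevant

def context_cost(installed: int, active: int) -> int:
    """Tokens consumed with progressive disclosure."""
    return installed * METADATA_TOKENS + active * CORE_TOKENS

def naive_cost(installed: int) -> int:
    """Tokens consumed if every skill were fully loaded up front."""
    return installed * (METADATA_TOKENS + CORE_TOKENS)

print(context_cost(20, 3))  # 20 installed, 3 active -> 2050 tokens
print(naive_cost(20))       # everything loaded    -> 8000 tokens
```

Under these assumptions, 20 installed skills with 3 active cost about a quarter of what always-on loading would, and the gap widens as you install more.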

Composability: Why It Feels Like Package Managers

Multiple skills stack automatically when needed. No explicit dependencies required.

Say I’m building a customer-facing artifact. Claude might activate:

  • brand-guidelines (colors, typography)
  • artifacts-builder (React + Tailwind patterns)
  • accessibility (ARIA labels, keyboard nav)

I didn’t specify “use all three.” Claude determined they’re all relevant and composed them together.

This emergent behavior is… honestly kind of impressive? It works like npm or cargo - composable, modular tools that just work together.

The Anthropic team calls this “like a well-organized manual that starts with a table of contents.” I think it’s more like a package manager where Claude is the dependency resolver.

Security Model: Least Privilege by Default

The allowed-tools field enables some genuinely useful security patterns:

Read-only agents for code review:

allowed-tools: Read, Grep, Glob

Test-only agents that can’t touch production:

allowed-tools: Read, Write, Bash
# (no network access tools)

Documentation agents that only work with .md files:

allowed-tools: Read, Write, Glob

This isn’t perfect - there’s no way to restrict which files a skill can access, just which tool categories. But it’s a start.

And importantly: These restrictions are enforced by the execution environment, not just the prompt. Even if Claude “wants” to write a file, it can’t if Write isn’t in allowed-tools.
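To make the distinction between prompt-level and environment-level enforcement concrete, here's a hypothetical sketch of how a harness could gate tool calls against an allowlist. The tool names and dispatch API are illustrative, not Anthropic's actual implementation:

```python
# Hypothetical sketch of environment-level tool enforcement.
# Tool names and the dispatch API are illustrative only.
import glob

ALLOWED_TOOLS = {"Read", "Grep", "Glob"}  # from the skill's allowed-tools field

TOOLS = {
    "Read": lambda path: open(path).read(),
    "Grep": lambda pattern, path: [l for l in open(path) if pattern in l],
    "Glob": lambda pattern: glob.glob(pattern),
}

def dispatch(tool: str, **kwargs):
    # The harness checks the allowlist before the tool ever runs, so a
    # "Write" request fails regardless of what the model intended.
    if tool not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {tool!r} not permitted by this skill")
    return TOOLS[tool](**kwargs)
```

The point of the sketch: the check lives outside the model. No amount of prompt injection can talk its way past a dispatcher that never routes the call.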


API Details (For the API Users Among Us)

Skills work via the Messages API with special headers:

POST /v1/messages
Content-Type: application/json
x-api-key: your-key-here
anthropic-beta: skills-2025-10-02

{
  "model": "claude-sonnet-4.5-20250929",
  "max_tokens": 4096,
  "skills": [
    {
      "name": "security-auditor",
      "description": "Audit code for vulnerabilities",
      "content": "...",
      "allowed_tools": ["Read", "Grep", "Glob"]
    }
  ]
}

Current limitations:

  • 8 skills max per request
  • 8MB total size for all skill content
  • Beta feature, might change
  • Only Claude 4 models support it

The API version is more flexible than the filesystem version. You can generate skills dynamically, pull them from a database, whatever. The tradeoff is you lose the automatic discovery - you have to explicitly pass them in each request.

For production use, I’d combine both approaches:

  • Standard skills in ~/.claude/skills/ for consistent behavior
  • Dynamic API skills for user-specific or context-specific overrides
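For the dynamic side of that split, here's a minimal sketch of assembling a request body that bundles a skill. The headers and payload shape mirror the example above; since this is a beta surface, treat the exact field names as subject to change:

```python
# Sketch: building a Messages API request that bundles a skill.
# Field names mirror this post's example; the skills beta schema may change.
import json

def build_skill_request(api_key: str, skill_md: str) -> tuple[dict, dict]:
    headers = {
        "x-api-key": api_key,
        "anthropic-beta": "skills-2025-10-02",
        "content-type": "application/json",
    }
    body = {
        "model": "claude-sonnet-4.5-20250929",
        "max_tokens": 4096,
        "skills": [{
            "name": "security-auditor",
            "description": "Audit code for vulnerabilities",
            "content": skill_md,   # e.g. read from a database per user
            "allowed_tools": ["Read", "Grep", "Glob"],
        }],
        "messages": [{"role": "user", "content": "Audit this repo"}],
    }
    return headers, body

headers, body = build_skill_request("your-key-here", "# Security Auditor\n...")
print(json.dumps(body, indent=2))
```

Because the skill content is just a string in the payload, swapping in per-user or per-tenant skills is a database lookup away, which is exactly the flexibility the filesystem version doesn't give you.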

Enterprise is Actually Using This

Box, Rakuten, and Canva are already using Skills in production. That's… fast adoption for a feature that launched hours ago.

Rakuten’s quote is telling: “What once took a day, we can now accomplish in an hour.”

They’re using Skills for management accounting workflows - processing multiple spreadsheets, catching anomalies, generating reports. This is exactly the kind of repetitive-but-complex work where Skills shine.

Box and Canva haven’t shared specifics yet, but I’d bet they’re using Skills for similar “standardize the way we do X” workflows. Custom instructions don’t cut it when you need 500 tokens of context about your internal processes.


What Skills DON’T Solve

Let’s be real about the limitations.

1. Skills Are Still Prompts

You’re still doing prompt engineering. You’re just packaging it differently.

The same prompt engineering principles apply:

  • Be specific
  • Include examples
  • Handle edge cases
  • Test thoroughly

Skills don’t magically make bad prompts good. They just make good prompts more reusable.

2. No Cross-Conversation Memory

Every new chat starts from zero. Skills don’t remember what happened in previous conversations.

You can’t build a skill that “learns” from feedback or adapts to your preferences over time. It’s stateless.

This is a fundamental limitation of the architecture, not something Skills could solve. (Though honestly, I’m fine with this. Stateful AI agents freak me out a bit.)

3. Sharing is Manual

You can’t publish a skill to a marketplace and have others install it with a single click. If you want to share a skill, you:

  1. Put it in a git repo
  2. Tell people to clone it to ~/.claude/skills/
  3. Hope they don’t mess up the file structure

Compare this to Claude Code plugins (where you can publish to a marketplace) or VSCode extensions (where installation is one click). The DX here needs work.

4. Discovery is… Fine?

Claude decides when to activate skills based on their descriptions. This works surprisingly well, but it’s not perfect.

I’ve had skills fail to activate when they should’ve, and activate when they shouldn’t. The model is good at matching intent to description, but “good” isn’t “perfect.”

You end up writing very explicit descriptions with trigger phrases:

description: Use this skill when writing weekly team status updates, progress reports, or standup summaries. Activates when user mentions "team update", "weekly summary", or "status report".

It works. But it’s not exactly elegant.

5. Token Limits Still Apply

Skills don’t expand Claude’s context window. They’re smarter about context usage (progressive disclosure), but the limits are still there.

If you need to reference 10 large documents simultaneously, Skills won’t save you. You’ll hit context limits just like before.


Skills vs. Other Approaches

Custom Instructions

  • Always loaded (token cost)
  • 1000 token limit
  • Global to your account
  • Good for: Personal preferences, tone, format

Projects

  • Always loaded (massive token cost)
  • 200k token context window
  • Per-project context
  • Good for: Accumulated knowledge, ongoing work

Skills

  • Loaded on-demand (efficient token usage)
  • Effectively unbounded via filesystem
  • Composable across contexts
  • Good for: Repeatable workflows, standardized processes

The pattern I’m settling on:

  • Custom Instructions: Personal style preferences (50 tokens)
  • Projects: Active project context and docs (2-10k tokens)
  • Skills: Reusable workflows and procedures (loaded as needed)

They complement each other rather than compete.


Building Your First Skill (5 Minute Walkthrough)

Let’s build something useful: a skill that writes git commit messages following conventional commits.

mkdir ~/.claude/skills/commit-writer
cd ~/.claude/skills/commit-writer

Create SKILL.md:

---
name: commit-writer
description: Write git commit messages following conventional commits format
---

# Commit Message Writer

Generate conventional commit messages using this format:

## Format

<type>(<scope>): <subject>

<body>

<footer>


## Types
- feat: New feature
- fix: Bug fix
- docs: Documentation changes
- refactor: Code refactoring
- test: Adding tests
- chore: Build process or auxiliary tool changes

## Rules
1. Subject line: 50 chars max, imperative mood ("add" not "adds")
2. Body: Wrap at 72 chars, explain *why* not *what*
3. Footer: Breaking changes or issue references

## Example

feat(auth): add JWT token refresh logic

Users were getting logged out every 15 minutes. Now refresh tokens extend sessions automatically.

Closes #234


Before suggesting a commit:
1. Run `git diff --staged` to see what's actually being committed
2. Group changes by logical purpose
3. If changes span multiple concerns, suggest splitting into separate commits

Now use it:

# Stage some changes
git add .

# In Claude
"Help me commit these changes"

Claude loads the skill automatically and generates a properly formatted conventional commit message.

Total setup time: 3 minutes. Time saved per commit: ~30 seconds. At a few commits a day, that compounds into real hours over a year.

The ROI on skills is actually pretty good once you identify the right workflows.
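A quick back-of-envelope on that time saved, with the assumptions stated explicitly (commit volume here is a guess, not a measurement, so scale it to your own habits):

```python
# Back-of-envelope ROI for the commit-writer skill.
# commits_per_day is an assumption, not a measurement.
seconds_saved_per_commit = 30
commits_per_day = 5           # assumption: an active developer
working_days = 250

hours_per_year = seconds_saved_per_commit * commits_per_day * working_days / 3600
print(f"~{hours_per_year:.0f} hours/year")  # ~10 hours under these assumptions
```

Roughly ten hours a year for three minutes of setup. Not life-changing, but the same math applies to every workflow you package.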


The Composability Trick Nobody’s Talking About

Here’s where Skills get genuinely interesting: implicit composition.

I have separate skills for:

  • brand-guidelines (colors, typography)
  • accessibility (ARIA labels, semantic HTML)
  • responsive-design (mobile-first patterns)

When I ask Claude to “build a signup form,” it activates all three automatically. The result follows brand guidelines, includes proper ARIA labels, and works on mobile.

I didn’t explicitly say “use all three skills.” Claude recognized that a signup form benefits from all three domains and composed them together.

This emergent behavior is what makes Skills more than just “saved prompts in a folder.” The composition happens at the semantic level based on task requirements, not through explicit dependencies.

Compare this to how you’d achieve the same thing with saved prompts:

  1. Remember which prompts apply
  2. Copy-paste all three
  3. Hope they don’t contradict each other
  4. Manually resolve conflicts if they do

Skills handle the composition automatically. And because they’re all loaded into the same context, Claude can resolve contradictions intelligently.

This is the “package manager” model I mentioned earlier. It’s not just about reusability - it’s about automatic dependency resolution at the semantic level.


The Skeptical Take: What Still Bugs Me

I’m excited about Skills, but let’s be honest about the rough edges.

Version control is manual. If you update a skill, there’s no way to track changes or roll back. Git helps, but there’s no built-in versioning system.

Testing is hard. How do you know your skill works correctly across different contexts? You basically have to try it and see. There’s no test framework or validation tooling.

Debugging is opaque. When a skill doesn’t activate when you expect it to, you’re kinda just… guessing why? There’s no debug mode that shows “I considered these skills and rejected them because X.”

Sharing is clunky. Emailing someone a zip file of your skills folder is not a great DX in 2025.

No skill marketplace yet. There should be an anthropics/awesome-skills repo by now. There isn’t. (Update: Maybe someone should make one?)

These are all solvable problems. But they’re current limitations worth knowing about.


Where This Is Going

Skills have been out for 12 hours now. The fact that Box, Rakuten, and Canva are already using them in production tells you something about the value proposition.

I think we’re going to see:

1. Skill marketplaces - Just like MCP servers got their marketplace, skills will get one. The composability model makes this really valuable.

2. Domain-specific skill packs - Legal, medical, finance, etc. Highly specialized bundles that encode industry expertise.

3. Team skill libraries - Companies building internal skill collections that codify “how we do things here.”

4. AI-generated skills - Meta skills that generate other skills based on your workflow. (Yes, I know how that sounds. But also… yes?)

5. Cross-platform standardization - The SKILL.md format working everywhere (API, Claude.ai, Claude Code) means skills become truly portable.

The open question is whether Anthropic keeps this as a closed ecosystem or opens it up for third-party tooling. Right now it’s somewhere in between.


Should You Start Using Skills?

Depends what you’re doing with Claude.

Yes, if:

  • You have repetitive workflows you currently handle with saved prompts
  • You work on a team that needs standardized AI behavior
  • You’re building customer-facing artifacts with strict brand guidelines
  • You need security restrictions (read-only agents, tool limitations)

Maybe not if:

  • You’re using Claude for one-off tasks that change constantly
  • Your workflows are too variable to standardize
  • You’re not comfortable with YAML and filesystem management

The setup cost is low (5-10 minutes per skill). The payoff is cumulative. I've already ported 12 of my most-used prompts into skills, and I expect them to save me about an hour per week combined.

Is that revolutionary? No. Is it worth doing? Yeah, actually.


The Bottom Line

Skills didn’t end prompt engineering. They just moved it from “thing I do every time I use Claude” to “thing I package once and reuse.”

That’s still valuable. Maybe even more valuable than the hype suggests, just for different reasons.

The progressive disclosure model is smart. The composability is legitimately cool. The security restrictions enable new patterns. And the fact that it works across Claude.ai, Claude Code, and the API makes it genuinely portable.

But it’s still early. The tooling needs work. The DX could be better. And we’re all still figuring out the best practices.

If you’ve been copy-pasting prompts from a Notion doc for the last six months, maybe give Skills a try. Your future self will thank you.




Want visibility into which skills and MCP servers your team actually uses? Agnost AI provides analytics for Claude workflows - see what’s working, what’s collecting dust, and optimize your tooling based on real data.

Because the best workflow is the one your team actually uses.