When Hermes finishes a task, it doesn't simply move on. It creates a Skill object (structured code, test cases, and example pairs) and stores it for reuse on every similar task that follows. Skills compound over time: the agent in month three handles edge cases that the month-one agent missed, not because the model was retrained, but because the Skill library grew.
What does Hermes build when it completes a task?
Hermes creates a Skill object from each completed task. A Skill object is a structured document containing four elements:
- Task category — how Hermes classifies this type of task (e.g., "candidate-application", "invoice-chase", "client-status-update")
- Code approach — the method Hermes used to complete the task, stored as executable code
- Test cases — input/output pairs derived from the task, used to validate future skill applications
- Example pairs — a set of specific inputs and the correct outputs for each
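The four elements above can be pictured as a small data structure. The following is a minimal sketch under stated assumptions, not Hermes's actual schema: the class names `Skill` and `ExamplePair`, the field names, and the toy `extract_sender` function are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ExamplePair:
    """One specific input and the correct output for it."""
    input_text: str
    expected_output: str


@dataclass
class Skill:
    """Hypothetical in-memory view of the four elements (not Hermes's real schema)."""
    task_category: str                    # e.g. "candidate-application"
    code_approach: Callable[[str], str]   # the executable method
    test_cases: list[ExamplePair] = field(default_factory=list)
    example_pairs: list[ExamplePair] = field(default_factory=list)

    def passes_tests(self) -> bool:
        """Run the code approach against every stored test case."""
        return all(
            self.code_approach(case.input_text) == case.expected_output
            for case in self.test_cases
        )


def extract_sender(email: str) -> str:
    """Toy code approach: return the address on the 'From:' line, or ''."""
    for line in email.splitlines():
        if line.lower().startswith("from:"):
            return line.split(":", 1)[1].strip()
    return ""


skill = Skill(
    task_category="candidate-application",
    code_approach=extract_sender,
    test_cases=[ExamplePair("From: ada@example.com\nSubject: CV", "ada@example.com")],
)
```

Keeping the test cases next to the code means a Skill can be re-validated before it is applied, which is the role the list above gives to test cases.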
Skill creation is automatic: once context is defined, no further configuration is required. The trigger is task completion, not a scheduled training run, so a Skill built from a candidate application processed on Monday is available for the next candidate application that arrives on Tuesday.
Nous Research describes Hermes as "an intelligent personal assistant that gets more capable the longer it runs."[¹] Skill creation is the mechanism behind that claim: each completed task adds to a growing library of structured approaches.
How do skills improve over time?
Skills compound. Month three handles edge cases month one missed.
Skills improve through accumulation. Each time Hermes encounters a task that matches an existing Skill category, it applies the Skill, checks whether the output matches the expected format, and adds the new input-output pair to the Skill's example set. A Skill that has processed 50 candidate applications has seen more format variants than one that has processed 5, and is more accurate on them as a result.
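The apply, check, and record cycle described above might look roughly like this. It is a sketch only: `handle_task`, the `is_valid` format check, and the choice to return `None` as an escalation signal are assumptions made for illustration, not documented Hermes behavior.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Skill:
    """Hypothetical Skill reduced to the fields this sketch needs."""
    task_category: str
    code_approach: Callable[[str], str]
    example_pairs: list[tuple[str, str]] = field(default_factory=list)


def handle_task(
    library: dict[str, Skill],
    category: str,
    task_input: str,
    is_valid: Callable[[str], bool],
) -> Optional[str]:
    """Apply a matching Skill, check the output, and record the new pair.

    Returns None when no Skill matches or the output fails the check;
    this sketch treats both cases as an escalation.
    """
    skill = library.get(category)
    if skill is None:
        return None
    output = skill.code_approach(task_input)
    if not is_valid(output):
        return None
    # Only validated outputs are added to the example set.
    skill.example_pairs.append((task_input, output))
    return output


library = {
    "invoice-chase": Skill("invoice-chase", lambda text: f"Reminder: {text.strip()}"),
}


def starts_with_reminder(output: str) -> bool:
    return output.startswith("Reminder: ")


result = handle_task(
    library, "invoice-chase", " invoice 1042 is overdue ", starts_with_reminder
)
```

Recording a pair only after the check passes is a deliberate choice in this sketch: it keeps rejected outputs from becoming examples for later tasks.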
The Hermes learning mechanism operates at the inference layer, not by retraining the underlying model. Each Skill is a stored object holding the code, tests, and examples for one task category. The model itself does not change; what changes is the library of approaches Hermes has for your specific workflows.
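One common way to get improvement at the inference layer with a fixed model is to place stored examples into the prompt. Whether Hermes does exactly this is an assumption; the sketch below only illustrates the general idea that the same function, and by extension the same model, produces a richer request as the example set grows. The name `build_prompt` and the prompt layout are hypothetical.

```python
def build_prompt(
    task_input: str,
    examples: list[tuple[str, str]],
    max_examples: int = 3,
) -> str:
    """Assemble a prompt from the most recent stored examples (hypothetical layout)."""
    # Guard against max_examples == 0, where examples[-0:] would return everything.
    recent = examples[-max_examples:] if max_examples > 0 else []
    parts = [f"Input: {inp}\nOutput: {out}" for inp, out in recent]
    parts.append(f"Input: {task_input}\nOutput:")
    return "\n\n".join(parts)


# Same function at two points in time; only the example set differs.
month_one_examples = [("plain email", "parsed")]
month_three_examples = month_one_examples + [
    ("forwarded email", "parsed"),
    ("partial resume", "flagged"),
    ("two attachments", "parsed"),
]

early_prompt = build_prompt("new application", month_one_examples)
later_prompt = build_prompt("new application", month_three_examples)
```

Nothing about the model changes between the two calls. The difference is entirely in what the library can supply.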
The practical effect is a performance curve. In the first two to four weeks, Hermes handles the most common task variants correctly. In months two and three, the same instance handles edge cases that previously required escalation, because every completed task has added examples to the relevant Skill. A Hermes instance running a recruiting agency's candidate triage workflow handles variant email formats, partial resumes, and forwarded applications better in month three than it did in month one, because each of those cases has added to the candidate-application Skill.
Where are skills stored and can they be shared?
Skills are stored at agentskills.io, an open standard for agent skill exchange.[²] The agentskills.io registry stores Skill objects as structured files — code, tests, and examples — compatible with other agent systems including Cursor, GitHub Copilot, and Claude Code.
A Skill built by one Hermes instance can be exported to agentskills.io and imported by another. A business running two regional Hermes instances can share Skills between them: a candidate-application Skill built from the UK office's email patterns is available to the German office. Skills don't have to be built twice for the same task category.
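Sharing a Skill between two instances amounts to writing it out as a structured file and merging it on the other side. The sketch below shows that round trip with plain JSON. The file layout, the field names, and the functions `export_skill` and `import_skill` are hypothetical; this is not the agentskills.io file format or a real Hermes API.

```python
import json
import tempfile
from pathlib import Path


def export_skill(skill: dict, directory: Path) -> Path:
    """Write one Skill to <directory>/<task_category>.json and return the path."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{skill['task_category']}.json"
    path.write_text(json.dumps(skill, indent=2), encoding="utf-8")
    return path


def import_skill(path: Path, library: dict[str, dict]) -> None:
    """Load a Skill file and merge its examples into the local library."""
    incoming = json.loads(path.read_text(encoding="utf-8"))
    category = incoming["task_category"]
    existing = library.get(category)
    if existing is None:
        library[category] = incoming
        return
    # Keep local examples and append only the ones not already present.
    for pair in incoming["example_pairs"]:
        if pair not in existing["example_pairs"]:
            existing["example_pairs"].append(pair)


uk_skill = {
    "task_category": "candidate-application",
    "example_pairs": [{"input": "Application: J. Smith", "output": "shortlist"}],
}
germany_library = {
    "candidate-application": {
        "task_category": "candidate-application",
        "example_pairs": [{"input": "Bewerbung: A. Meyer", "output": "shortlist"}],
    }
}

with tempfile.TemporaryDirectory() as tmp:
    exported = export_skill(uk_skill, Path(tmp))
    import_skill(exported, germany_library)
```

Merging rather than overwriting is a design choice in this sketch: the receiving office keeps its own examples and gains the sender's.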
The open standard also means Skills aren't locked to Hermes. A Skill built from a recruiting workflow can be made available to other teams using different agent systems that support the agentskills.io format. For a full guide to deploying Hermes and setting up context definition — the step that determines skill quality — see the Hermes setup guide.
What does this mean for a business running Hermes?
The skill compounding effect has two practical implications.
First, Hermes gets better at your specific workflows, not at general tasks. Skills encode what your tasks actually look like — your clients, your output formats, your escalation patterns. A Hermes instance that has processed your recruitment workflows for three months understands your specific candidate types and reply conventions in a way a freshly deployed instance does not. The improvement is specific to your context, not a generic upgrade.
Second, error rates decrease over time. A task category that required correction 30% of the time in month one, for example, will require fewer corrections by month three, as long as the context was defined accurately at the start. Poorly defined context produces Skills that encode incorrect approaches, so getting context definition right in week one is the most important lever for skill quality in month three. For a broader understanding of how AI agents work, see "What is an AI agent?" For Hermes's full capabilities, see "What is Hermes?"
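To check whether corrections are actually falling in your own deployment, you can compute a correction rate per month from a task log. The sketch below assumes a hypothetical log of `(month, needed_correction)` tuples. The sample log is invented purely to exercise the function and does not represent measured Hermes results.

```python
from collections import defaultdict


def correction_rate_by_month(log: list[tuple[int, bool]]) -> dict[int, float]:
    """Return the share of tasks that needed correction, keyed by month number."""
    totals: dict[int, int] = defaultdict(int)
    corrected: dict[int, int] = defaultdict(int)
    for month, needed_correction in log:
        totals[month] += 1
        if needed_correction:
            corrected[month] += 1
    return {month: corrected[month] / totals[month] for month in sorted(totals)}


# Invented sample log: 10 tasks in month 1 (3 corrected), 10 in month 3 (1 corrected).
sample_log = (
    [(1, True)] * 3 + [(1, False)] * 7
    + [(3, True)] * 1 + [(3, False)] * 9
)
rates = correction_rate_by_month(sample_log)
```

Tracking the rate per task category as well as per month would show which Skills are improving and which may have been built on poorly defined context.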
Frequently asked questions
How does Hermes learn from completed tasks? Hermes creates a Skill object when a task is completed. The Skill object contains the task category, the code approach used, test cases derived from the task, and example input-output pairs. On the next similar task, Hermes applies the Skill and adds the new example to it. Skills improve as more examples accumulate.
Does Hermes retrain the underlying model? No. The Hermes learning mechanism operates at the inference layer. The underlying language model does not change. Skills are stored as structured Skill objects at agentskills.io — code, tests, and examples — that Hermes applies when handling similar tasks. The model stays fixed; the Skill library grows.
Can skills built by one Hermes instance be used by another? Yes. Skills are stored at agentskills.io, an open standard for agent skill exchange. A Skill built by one Hermes instance can be exported and imported by another. Skills are also compatible with other agent systems that support the agentskills.io format, including Cursor and Claude Code.
How long does it take for Hermes to improve noticeably? The most common task variants are typically handled correctly within the first two to four weeks as Skills accumulate from real task completions. Edge case handling improves through months two and three. The rate of improvement depends on task volume — more completed tasks produce more Skill examples faster.
Notes
1. Nous Research, Hermes documentation. https://hermes-agent.nousresearch.com/docs/
2. agentskills.io, open standard for agent skills. https://agentskills.io