Hermes Setup Guide

Installing Hermes takes under a day. The step that determines performance over the next six months is not deployment — it is context definition: telling the agent what your tasks actually look like, who handles exceptions, and what a correct output means. Get that right and Hermes starts improving from task one. Get it wrong and the first month of skills encodes the wrong patterns.

Deploy

Clone the Hermes repo, set environment variables, and start the Docker container on your server.

Connect platforms

Add API tokens or OAuth credentials for Slack, Gmail, Telegram, or whichever platforms your team uses.

Define context

Write example tasks, expected output formats, and escalation paths for each workflow Hermes will handle.

Test with real tasks

Run 20–50 live tasks in review-only mode and confirm outputs match the context definition before enabling actions.

Go live

Enable action permissions and set a weekly review cadence for the first month to track skill quality.

How do you deploy the Hermes instance?

Hermes runs via Docker and deploys on any standard VPS. A 2-vCPU, 4 GB RAM instance is sufficient for teams handling up to a few hundred tasks daily. Three things are required before starting the container: Docker and Docker Compose installed on the server, API access to the language model Hermes will use (compatible with OpenAI and Anthropic model APIs), and the Hermes repository cloned from Nous Research's GitHub.[¹]

Core configuration lives in a .env file: model API key, server port, and the agentskills.io connection token for skill storage. Running docker compose up starts the instance. The first run initialises the model connection and registers the deployment with agentskills.io.
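A quick pre-flight check on the .env file can catch an empty or missing value before the container starts. The sketch below is illustrative only: the variable names MODEL_API_KEY, SERVER_PORT, and AGENTSKILLS_TOKEN are assumptions made for this example, not names confirmed by the Hermes documentation, so substitute whatever keys your deployment actually uses.

```python
# Hypothetical variable names; replace with the keys your Hermes .env really uses.
REQUIRED_KEYS = ("MODEL_API_KEY", "SERVER_PORT", "AGENTSKILLS_TOKEN")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or set to an empty string."""
    return [key for key in required if not values.get(key)]
```

To use it, read the file with pathlib.Path(".env").read_text(), pass the text to parse_env, and hand the result to missing_keys; an empty list means every required key has a value.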

Hermes is released under the MIT licence and runs entirely on your team's infrastructure. Nous Research describes it as "an intelligent personal assistant that gets more capable the longer it runs"; it operates on your servers, with no data sent to a third-party agent service.[¹]

The most common issues at this stage: invalid API key format, port conflicts with existing services, and firewall rules blocking the webhooks Hermes needs to receive incoming platform messages. Most resolve within the first hour of setup.
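Port conflicts are the easiest of these to rule out before starting the container. A minimal sketch using only the Python standard library is shown here; the function name port_is_free is hypothetical, and the check only covers the host it runs on, not firewall rules upstream of it.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            # bind() raises OSError when another process already holds the port.
            sock.bind((host, port))
        except OSError:
            return False
    return True
```

Run it on the server with the port you intend to set in the .env file; a False result means another service already holds that port.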

How do you connect your platforms?

A single Hermes deployment handles all connected platforms simultaneously — no separate agent instance per channel. Each platform requires a token or OAuth credential. The Hermes admin interface provides step-by-step instructions for each connection:

Slack: Create a Slack App, add bot scopes (channels:read, chat:write, and a message-history scope such as channels:history), install to the workspace, and add the Bot User OAuth Token to the Hermes config
Gmail: Create a Google Cloud project, enable the Gmail API, generate OAuth2 credentials, and complete the consent flow
Telegram: Create a bot via @BotFather and add the bot token
Microsoft Teams, Discord, WhatsApp: Follow equivalent OAuth or token flows documented in the Hermes platform guide

Each new platform takes 15–30 minutes to connect. After connecting, the Hermes admin interface confirms status and shows incoming message activity for each channel.

What does context definition involve?

Context definition is where most Hermes setups underperform. Hermes begins building Skill objects, structured records of how to handle each task category, from the first completed task, so the skills built in the first month reflect the inputs received and the outputs produced. If the first 50 tasks are poorly framed or need constant correction, that pattern is what gets encoded, and it propagates into every skill built from those tasks. The quality of skills in month three reflects the quality of context definition in week one.

Context definition requires four inputs for each workflow Hermes will handle:

Example inputs — 5–10 real examples of tasks the workflow will receive (actual emails, messages, or requests, not invented ones)
Expected output format — what a correct output looks like, with annotated examples showing what made each output right
Exception handler — the name and contact of the person Hermes escalates to when it is uncertain
Task category label — how Hermes should name and group this task type in its skill library
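It can help to hold the four inputs in one structure and check them for completeness before handing them to Hermes. The sketch below is one way to do that under stated assumptions: the ContextDefinition class and its field names are hypothetical and are not a Hermes schema; only the four inputs and the 5–10 example range come from the list above.

```python
from dataclasses import dataclass


@dataclass
class ContextDefinition:
    """Hypothetical container for the four inputs of one workflow."""
    category_label: str          # how the task type is named in the skill library
    example_inputs: list         # real emails, messages, or requests
    expected_outputs: list       # annotated examples of correct outputs
    handler_name: str            # person Hermes escalates to
    handler_contact: str

    def problems(self):
        """Return a list of human-readable gaps; empty means complete."""
        issues = []
        if not 5 <= len(self.example_inputs) <= 10:
            issues.append("provide 5-10 real example inputs")
        if not self.expected_outputs:
            issues.append("add at least one annotated expected output")
        if not (self.handler_name.strip() and self.handler_contact.strip()):
            issues.append("name an exception handler with contact details")
        if not self.category_label.strip():
            issues.append("set a task category label")
        return issues
```

A definition whose problems() list is empty has all four inputs filled in; whether those inputs are actually good still depends on the people who do the work today.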

This step typically takes 1–3 business days per workflow — not because it is technically complex, but because determining what "correct" looks like requires input from the people doing the work today.

Figure: Context definition card with four fields: Example inputs (5–10 real tasks), Expected output format (annotated examples), Exception handler (name and contact), and Task category label (skill library name). Caption: Context definition gives Hermes the information it needs to build accurate skills from the start.

How do you test Hermes before going live?

Before enabling action permissions, run a test phase of 20–50 real tasks in review-only mode. Hermes processes incoming tasks and produces outputs, but takes no action in connected systems — no emails sent, no records created — until a human approves each output.

Review each output against the context definition. A correct output matches the expected format and uses the information from the input accurately. Flag outputs that miss the mark and add the correct version as an example pair to the context definition. After 20 consecutive correct outputs on a workflow, that workflow is ready for live operation.
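The readiness rule can be tracked with a simple review log. The sketch below assumes the log is a list of booleans in review order (most recent last) and interprets "20 consecutive correct outputs" as the streak at the end of that log; both the log format and the function names are assumptions for illustration, not part of Hermes.

```python
def consecutive_correct(results):
    """Count correct outputs at the end of the review log (most recent last)."""
    streak = 0
    for ok in reversed(results):
        if not ok:
            break
        streak += 1
    return streak


def ready_for_live(results, required=20):
    """True once the most recent `required` reviewed outputs were all correct."""
    return consecutive_correct(results) >= required
```

Under this interpretation a single flagged output resets the count, so a workflow that was ready yesterday can drop back to not ready after one miss.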

At go-live, enable action permissions per platform. Set a weekly review cadence for the first month: check a sample of recent outputs, note any recurring error patterns, and update context definitions where needed. Skill accumulation accelerates in weeks 2–4 as Hermes handles more task variants — by the end of month one, common task types are typically handled correctly. For a full explanation of how skills build and compound over time, see how Hermes learns.
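The weekly review itself can be kept lightweight: draw a random sample of recent outputs, tag each reviewed output with an error label where one applies, and count which labels recur. The sketch below assumes each output is a dict and that reviewers record a free-form "error" tag (or None); that record shape and the function names are hypothetical.

```python
import random
from collections import Counter


def weekly_sample(outputs, size=10, seed=None):
    """Pick up to `size` outputs at random for manual review."""
    rng = random.Random(seed)
    return rng.sample(outputs, min(size, len(outputs)))


def recurring_errors(reviewed, min_count=2):
    """Return (error_tag, count) pairs seen at least `min_count` times."""
    counts = Counter(r["error"] for r in reviewed if r.get("error"))
    return [(tag, n) for tag, n in counts.most_common() if n >= min_count]
```

Tags that appear in recurring_errors week after week are the ones worth turning into new example pairs in the context definition.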

Figure: Three-phase timeline. Week 1: deployment and first tasks. Weeks 2–4: skills building and correction rate dropping. Month 2 and beyond: steady state, with most common task variants handled correctly. Caption: Skill quality improves fastest in the first four weeks. Month two is typically steady state for common task types.

Frequently asked questions

What server does Hermes run on? Hermes runs on any standard VPS via Docker. A 2-vCPU, 4 GB RAM instance is sufficient for teams handling up to a few hundred tasks daily. Nous Research recommends a minimum of 2 GB RAM; 4 GB provides headroom for concurrent platform connections and skill processing.

How long does Hermes setup take? Deployment and platform connections take less than a day. Context definition, the step that determines skill quality, takes 1–3 business days per workflow, so the total depends on how many workflows are being configured and how readily the team can provide real task examples and output standards.

What platforms does Hermes support? Hermes connects to 20+ platforms from a single deployment, including Slack, Gmail, Telegram, Discord, WhatsApp, Microsoft Teams, and Signal. Each platform requires a separate token or OAuth credential. The Hermes admin interface documents the connection steps for each.

What happens if Hermes is uncertain about a task? Hermes escalates to the exception handler defined in the context definition for that workflow. The exception handler receives the task and Hermes's best attempt at an output, reviews it, and either approves or corrects it. Corrections are fed back into the skill for that task category.

Notes

[¹] Nous Research, Hermes documentation. https://hermes-agent.nousresearch.com/docs/