
Hermes Agent with Ollama: The Self-Improving AI Agent That Runs Entirely Local


The local AI agent landscape gained a significant player in 2026. Hermes Agent from Nous Research launched with a self-improving architecture that separates it from interactive assistants and pure automation tools. It maintains cross-session memory, builds accumulated skills, and runs entirely local through Ollama.

I spent two weeks running Hermes daily across coding tasks, research workflows, and infrastructure automation. Here is what the agent actually does, how to set it up, and which model configurations work best.


What Makes Hermes Different

Most AI agents are stateless. Start a new session and they have no memory of previous interactions. Claude Code works this way, and so does OpenAI's Operator. The agent performs well within a session but does not build on prior work.

Hermes maintains cross-session memory. It remembers your preferences, accumulated skills, and the context of prior projects. Over time, the agent becomes calibrated to your workflow rather than starting from a blank state every conversation.

The second differentiator is the skill system. Hermes ships with 70+ built-in skills covering research, coding, data analysis, and productivity tasks. More importantly, it can create new skills dynamically based on your workflows. If you run a repetitive multi-step process, Hermes can abstract it into a reusable skill that persists across sessions.

The architecture runs entirely through Ollama, which means you own the inference stack completely. No API calls leave your machine unless you explicitly choose a cloud model.

Installation: One Command

If you already have Ollama installed, Hermes setup takes under 5 minutes.

```bash
ollama launch hermes
```

Ollama handles the entire installation sequence:

  1. Installs Hermes if not already present (via the Nous Research install script)
  2. Prompts you to select a model (local via Ollama or cloud models)
  3. Configures the Ollama provider automatically at http://127.0.0.1:11434/v1
  4. Runs the setup wizard for optional messaging gateway connections
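
Step 3 points Hermes at Ollama's OpenAI-compatible API. As a sanity check that the endpoint is wired up, you can build a standard chat-completions request against it yourself; here is a minimal Python sketch (the model tag `qwen3.6:32b` is a placeholder for whatever you pulled):

```python
import json
from urllib import request

# Default OpenAI-compatible endpoint that the installer configures.
OLLAMA_BASE = "http://127.0.0.1:11434/v1"

def build_chat_request(model: str, prompt: str) -> request.Request:
    """Build a chat-completions request for the local Ollama provider."""
    payload = {
        "model": model,  # placeholder tag; use whichever model you pulled
        "messages": [{"role": "user", "content": prompt}],
    }
    return request.Request(
        f"{OLLAMA_BASE}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("qwen3.6:32b", "Reply with the word ok.")
print(req.full_url)
# Actually sending it (request.urlopen(req)) works once Ollama is running.
```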

The installer checks for dependencies (Python 3.11+, Node.js, ripgrep, ffmpeg) and installs any missing components via uv, the fast Python package manager. The entire dependency chain resolved in under 4 minutes on my M4 Max MacBook Pro.
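
If you want to preflight the same dependencies before running the installer, a quick check with the standard library covers the tool lookups (this mirrors the list above, not the installer's actual logic):

```python
import shutil
import sys

# Tools the Hermes installer looks for, per the article:
# Node.js, ripgrep (rg), ffmpeg, and Python 3.11+.
required = ["node", "rg", "ffmpeg"]
missing = [tool for tool in required if shutil.which(tool) is None]
python_ok = sys.version_info >= (3, 11)

print("python>=3.11:", python_ok)
print("missing tools:", missing)
```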

Model Selection

Local Models (Recommended)

| Model | Parameters | VRAM | Strengths |
|---|---|---|---|
| Gemma4 | 7B-27B | 8-16 GB | Reasoning, code generation |
| Qwen3.6 | 32B-72B | 24-48 GB | Coding, visual understanding, agentic tool use |

For Apple Silicon, Qwen3.6 32B runs at acceptable speeds on M4 Max with 64 GB unified memory. Gemma4 7B is snappy on any recent Mac with 16+ GB RAM.
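
The VRAM figures above follow the usual back-of-envelope rule: quantized weights take parameters × bits-per-weight / 8 bytes, plus overhead for KV cache and activations. A quick sketch (the 20% overhead factor is my working assumption, not an Ollama figure):

```python
def estimate_vram_gb(params_billion: float, bits_per_weight: int,
                     overhead: float = 1.2) -> float:
    """Rough VRAM to serve a model: quantized weights plus ~20% overhead."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return round(weight_bytes * overhead / 1e9, 1)

print(estimate_vram_gb(32, 4))  # 32B model at 4-bit quantization -> 19.2
print(estimate_vram_gb(7, 8))   # 7B model at 8-bit quantization -> 8.4
```

Those numbers land in the same range as the table, which is why a 64 GB M4 Max handles Qwen3.6 32B comfortably while Gemma4 7B fits on 16 GB machines.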

Cloud Models via Ollama Gateway

If you want the best model capability without local hardware constraints, Ollama's cloud gateway provides access to models that are not designed for consumer hardware:

| Model | Provider | Strengths |
|---|---|---|
| kimi-k2.5 | Minimax | Multimodal reasoning with subagents |
| glm-5.1 | Zhipu | Reasoning and code generation |
| qwen3.5 | Qwen | Reasoning, coding, agentic tool use with vision |
| minimax-m2.7 | Minimax | Fast, efficient coding and real-world productivity |

The gateway approach gives you access to larger models without the hardware requirement. The tradeoff is inference latency and API dependency.

Messaging Gateway: Access Hermes From Anywhere

Hermes includes a messaging gateway that connects Telegram, Discord, Slack, WhatsApp, Signal, and Email. After installation, run:

```bash
hermes gateway setup
```

This launches an interactive wizard that walks through platform authentication. Once connected, you can message your Hermes agent from any configured platform and receive responses powered by your local Ollama inference.

The practical use case: running Hermes on an always-on machine at home, then querying it via Telegram while traveling. The agent retains its cross-session memory regardless of which platform triggers it.

To reconfigure at any time:

```bash
hermes setup
```

Daily Usage: What Actually Works

Coding Workflows

Hermes handles boilerplate generation, refactoring tasks, and documentation writing competently with Qwen3.6 32B. The agent writes code to files, executes shell commands, and maintains context across files within a session.

The skill system shines for repetitive workflows. I taught Hermes a skill for the standard PR review checklist I run on every pull request. Now a single prompt triggers the entire sequence: fetch the PR diff, run linters, check test coverage, and format the review summary.
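
To make the shape of that last step concrete, here is an illustrative sketch of the summary-formatting stage; the function name, thresholds, and wording are mine, not Hermes' built-in skill output:

```python
def format_review_summary(pr_number: int, lint_errors: int,
                          coverage: float) -> str:
    """Collapse lint and coverage results into a one-line review verdict.

    Illustrative only: thresholds and phrasing are hypothetical,
    not what the Hermes skill actually emits.
    """
    ready = lint_errors == 0 and coverage >= 0.80
    verdict = "ready to merge" if ready else "needs work"
    return (f"PR #{pr_number}: {lint_errors} lint error(s), "
            f"{coverage:.0%} test coverage -> {verdict}")

print(format_review_summary(42, 0, 0.91))
print(format_review_summary(43, 3, 0.55))
```

The value of encoding this as a skill is that the whole fetch-lint-coverage-summarize sequence becomes one prompt instead of four manual steps.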

Research Tasks

The built-in research skills work well for technical literature review. Hermes navigates documentation, synthesizes information across sources, and produces structured output. For financial or market research, connecting a cloud model through the gateway improves output quality on complex synthesis tasks.

Infrastructure Automation

Hermes manages shell scripting, cron job setup, and Docker container orchestration through natural language commands. The cross-session memory means it learns your infrastructure preferences over time.
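
As an example of the kind of artifact those prompts produce, here is a tiny helper that renders a daily crontab line; the helper and the backup script path are hypothetical, shown only to illustrate the output format:

```python
def daily_cron_entry(hour: int, minute: int, command: str) -> str:
    """Render a crontab line that runs `command` once a day at hour:minute."""
    return f"{minute} {hour} * * * {command}"

# Hypothetical nightly backup at 02:00; the path is illustrative.
print(daily_cron_entry(2, 0, "/usr/local/bin/backup.sh >> /var/log/backup.log 2>&1"))
```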

Comparison with Claude Code

| Dimension | Hermes + Ollama | Claude Code |
|---|---|---|
| Memory | Cross-session persistent | Session-only |
| Skills | 70+ built-in, user-creatable | Prompt-based |
| Local inference | Native, full control | Cloud API required |
| Messaging gateway | Built-in | Not supported |
| Self-improvement | Learns from your usage | Static capabilities |
| Best for | Autonomous agents, always-on tasks | Interactive coding assistance |

Claude Code remains the stronger interactive development partner for active coding sessions. Hermes is the better choice for autonomous workflows, cross-session memory, and accessing AI capability from messaging platforms.

Limitations

Hermes is not a replacement for focused coding assistance during active development. The agent shines for background tasks and autonomous operation, but the inference latency on local models makes it less ideal for real-time pair programming.

The skill creation system, while powerful, requires upfront investment to teach the agent your specific workflows. Teams with standardized processes benefit most.

Browser tool support requires Playwright Chromium dependencies on macOS, which must be installed manually:

```bash
cd /Users/pooya/.hermes/hermes-agent && npx playwright install chromium
```

Without this step, browser-based skills do not function.

Verdict

Hermes Agent with Ollama fills a specific niche that other tools leave empty. If you want an always-on AI agent that learns your preferences, operates entirely on your hardware, and can be accessed via messaging apps, this configuration is unmatched in 2026.

For teams evaluating AI agent infrastructure, Hermes deserves serious evaluation alongside LangGraph, CrewAI, and Smolagents. The self-improving architecture and messaging gateway create workflows that stateless agents cannot replicate.


About Pooya Golchian
