What Do Agents Dream of?

June 14, 2026

Dreaming is a new feature in Hugin, Gimle's agent framework. See the Hugin website for the broader framework, and the earlier post for the state-machine architecture this builds on. Below I'll be referring to the standard Hugin example of dreaming.

It is by now acknowledged that one of the primary roles of sleep is to improve memory. Across a day the brain commits a bunch of specific episodes to memory and then overnight it replays them and distils the ones that recur into durable, general knowledge. To be fancy, you could say the brain consolidated episodic memory into semantic memory. Dreaming appears to be how that happens — the brain replaying and reorganising the day’s experience while it is offline.

Recently a lot of the excitement in AI has been about new ways to make a model better. For a long time there were two: you could train it, rewriting its weights, or you could give it more in its context, which is what it sees at the moment it runs. Lately a third has taken off: test-time compute, or “thinking”, where the model reasons for longer before it answers. All three are powerful, but none of them lets a deployed model keep what it worked out during a run: training happens before the run, and context and thinking are both gone once the run ends.

That last gap is the one dreaming is about. An agent without long-term memory finishes a task and forgets almost everything: whatever it worked out along the way lives in its context window and dies with the session. To overcome that we give it memory, which enables it to stop forgetting, but now raw memory accumulates. The same insight gets saved a dozen times across a dozen sessions, and nothing ever distils that growing heap of specific episodes into the few general lessons that actually matter.

Dreaming is a fourth way, and it copies what the human brain does. It is an offline pass that replays the notes an agent saved during its runs, distils them into durable lessons, and folds those lessons back into the agent’s prompt, so the next run begins already knowing them. There is no fine-tuning involved; the agent improves itself by consolidating its own memory.

How an agent dreams

For an agent to use dreaming to improve itself and its memories, three things have to be true.

First, during a run it has to save something worth keeping. In Hugin an agent does this with save_insight, a built-in tool that writes a small artifact to persistent storage. In Claude Code you will see it writing memories: small markdown files, each holding a learning from a particular session. These are episodic: specific, scattered, tied to the moment they were produced.

Second, something has to consolidate those episodes. That is the dream, an offline process that reads back the episodic artifacts, finds the patterns that recur, and writes them down as a new type of persistent memory. In Hugin, we call these Learning artifacts. In Claude Code this is simply a rewrite of the existing memories. In both cases they are durable prose lessons scoped to the agent that produced them.

Third, those learnings have to come back. On the next run the applicable learnings are injected straight into the agent’s prompt, so the knowledge is present before the agent does anything.

That is the whole closed loop needed for self-improving dreaming: runs produce episodes, the dream turns episodes into learnings, and learnings shape the next runs.

The dreaming loop. Runs deposit scattered episodic insights; the dream consolidates them into durable learnings, scoped to the configuration that produced them; those learnings are injected back into the prompt on the next run. The loop is closed and so experience compounds without retraining.
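
The three steps can be sketched in a few lines of plain Python. This is a toy illustration only: ToyAgent and its methods are hypothetical names invented for this sketch, not Hugin classes, and the "dream" here merely deduplicates, where a real dream would ask a model to find the recurring patterns.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyAgent:
    """Hypothetical stand-in for an agent; not one of Hugin's real classes."""
    learnings: List[str] = field(default_factory=list)  # semantic memory
    insights: List[str] = field(default_factory=list)   # episodic memory

    def prompt(self) -> str:
        # Step 3: learnings are injected into the prompt before the run starts.
        block = "\n".join(f"- {item}" for item in self.learnings)
        return f"You are a concierge.\n## What you've learned\n{block}"

    def run(self, observed: List[str]) -> str:
        # Step 1: the run saves whatever it judged worth keeping.
        used_prompt = self.prompt()
        self.insights.extend(observed)
        return used_prompt

    def dream(self) -> None:
        # Step 2: consolidate episodes offline. Here we only deduplicate,
        # preserving the order in which insights were first seen.
        for insight in self.insights:
            if insight not in self.learnings:
                self.learnings.append(insight)
        self.insights.clear()


agent = ToyAgent()
first_prompt = agent.run(
    ["prefers window seats", "prefers window seats", "is vegetarian"]
)
agent.dream()
second_prompt = agent.run([])
print(second_prompt)
```

The first prompt carries no learnings; the second carries each lesson exactly once, even though one insight was saved twice.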

How it works in Hugin

The injection point is a single template variable. Any agent that wants to receive its learnings puts {{ learnings }} somewhere in its system prompt:

# assistant_system.yaml — the agent's system template
name: assistant_system
template: |
  You are a personal travel concierge for one returning traveler.

  ## What you've learned about this traveler
  {{ learnings }}

  If the section above is empty, you don't yet know this traveler's
  preferences — help them anyway, and record durable preferences with
  save_insight so you serve them better next time.

This is opt-in and it costs nothing if you don’t use it: a template that never mentions {{ learnings }} is rendered as before. When the variable is present, the renderer selects the learnings that apply to this agent’s scope, formats them into a block, and substitutes it in. Before the first dream the block is empty.
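
To make the opt-in behaviour concrete, here is a minimal sketch of that substitution step. It is not Hugin's renderer: render_system_prompt is a hypothetical helper, and a regular expression stands in for whatever template engine the framework actually uses.

```python
import re
from typing import List

# Matches {{ learnings }} with any amount of whitespace inside the braces.
LEARNINGS_VAR = re.compile(r"\{\{\s*learnings\s*\}\}")


def render_system_prompt(template: str, learnings: List[str]) -> str:
    """Substitute a learnings block into a template, if it asks for one."""
    if not LEARNINGS_VAR.search(template):
        return template  # opt-in: templates without the variable are untouched
    block = "\n".join(f"- {text}" for text in learnings)
    # A function replacement keeps any backslashes in the learnings literal.
    return LEARNINGS_VAR.sub(lambda _match: block, template)


template = (
    "You are a personal travel concierge.\n\n"
    "## What you've learned about this traveler\n"
    "{{ learnings }}\n"
)
before = render_system_prompt(template, [])
after = render_system_prompt(template, ["The traveler prefers window seats."])
print(after)
```

With an empty list the variable is replaced by an empty block, which matches the state before the first dream.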

A consolidated lesson is its own artifact type, looking like this:

from dataclasses import dataclass, field
from typing import List, Optional

# Artifact is Hugin's base artifact class; its import is omitted in this excerpt.
@Artifact.register("Learning")
@dataclass
class Learning(Artifact):
    content: str                       # the lesson, prose ready to drop into a prompt
    scope_config: Optional[str] = None # the config this learning improves
    scope_task: Optional[str] = None   # optionally narrowed to one task
    source_artifact_ids: List[str] = field(default_factory=list)  # the episodes it came from
    confidence: float = 0.0            # the dream's self-assessed confidence
    derived_from: str = "dream"        # so dreams never re-consume their own output

Two things to point out. First, source_artifact_ids keeps every learning traceable back to the specific episodes it was distilled from. Second, derived_from = "dream" lets the consolidation pass exclude its own past output from each new dream, so it consolidates real experience rather than re-consolidating its own conclusions over and over again.
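
Those two fields can be sketched as follows. StoredArtifact is a simplified, hypothetical record rather than Hugin's Artifact class, and the "run" value for derived_from is an assumption made for this illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StoredArtifact:
    """Hypothetical, simplified record; Hugin's real Artifact class differs."""
    id: str
    content: str
    derived_from: str = "run"  # assumed marker for ordinary episodic output
    source_artifact_ids: List[str] = field(default_factory=list)


def dream_inputs(artifacts: List[StoredArtifact]) -> List[StoredArtifact]:
    # Skip anything a previous dream wrote, so only real experience is replayed.
    return [a for a in artifacts if a.derived_from != "dream"]


def trace_sources(
    learning: StoredArtifact, store: Dict[str, StoredArtifact]
) -> List[str]:
    # Follow source_artifact_ids back to the text of the original episodes.
    return [store[i].content for i in learning.source_artifact_ids if i in store]


artifacts = [
    StoredArtifact("a1", "Traveler asked for a window seat."),
    StoredArtifact("a2", "Traveler asked for a window seat again."),
    StoredArtifact(
        "l1",
        "Book a window seat by default.",
        derived_from="dream",
        source_artifact_ids=["a1", "a2"],
    ),
]
store = {a.id: a for a in artifacts}
print([a.id for a in dream_inputs(artifacts)])
print(trace_sources(store["l1"], store))
```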

Running a dream is a CLI command. It scans a storage path, groups the episodic artifacts by the configuration that produced them, and runs one consolidation pass per scope:

# Preview what it would learn, without persisting anything:
hugin dream --storage-path ./storage --config assistant --dry-run

# Then for real:
hugin dream --storage-path ./storage --config assistant
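
The grouping step can be pictured roughly like this. It is a sketch under assumptions, not the CLI's implementation: the (config, text) tuple shape, the function names, and the "researcher" config are all invented for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Each episode is (config_name, insight_text); this shape is an assumption.
Episode = Tuple[str, str]


def group_by_config(episodes: Iterable[Episode]) -> Dict[str, List[str]]:
    # Bucket every episodic insight under the config that produced it.
    groups: Dict[str, List[str]] = defaultdict(list)
    for config_name, text in episodes:
        groups[config_name].append(text)
    return dict(groups)


def plan_dream(episodes: Iterable[Episode], dry_run: bool = True) -> List[str]:
    """Return one line per scope describing its consolidation pass."""
    mode = "would consolidate" if dry_run else "consolidating"
    return [
        f"{config}: {mode} {len(texts)} episode(s)"
        for config, texts in sorted(group_by_config(episodes).items())
    ]


episodes = [
    ("assistant", "prefers window seats"),
    ("assistant", "is vegetarian"),
    ("researcher", "cite primary sources"),
]
for line in plan_dream(episodes):
    print(line)
```

Each scope gets its own pass, so insights from one configuration never feed the learnings of another.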

Under the hood the dream is itself just a Hugin agent. It is handed the episodic memories for a scope, asked to find “cross-cutting patterns, recurring mistakes, and durable lessons”, and calls a dreaming.save_learning tool for each one. This tool call stamps the scope automatically and records the dream’s confidence as a rating.

Three memory systems on three timescales. Context is the stack re-rendered at every model call — complete but forgotten at the task boundary. Artifacts are the insights an agent chooses to save — episodic, persisting across tasks. Learnings are what the dream distils from those artifacts — semantic, carried across sessions. Dreaming is the arrow that lifts episodic memory into the longest-lived layer.

Dreaming adds a third layer to Hugin’s memory mechanisms, each of which operates at a different timescale and level of abstraction. Learnings are scoped, so a lesson learned by an agent is injected only into that particular agent’s future runs, never globally. They are ranked by rating and recency and truncated to a small budget, so the injected block cannot grow without bound as the agent dreams night after night. And low-rated learnings decay out of selection over time. The longest-lived memory is also the most curated.
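
One way such a selection could work is sketched below. The scoring formula, the 30-day half-life, the threshold, and every name here are assumptions chosen for illustration; the post does not say how Hugin combines rating and recency.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredLearning:
    content: str
    scope_config: str
    rating: float     # assumed to lie in [0, 1]
    age_days: float   # time since the learning was written


def score(item: ScoredLearning, half_life_days: float = 30.0) -> float:
    # Exponential decay: a rating counts half as much every half_life_days.
    return item.rating * 0.5 ** (item.age_days / half_life_days)


def select_learnings(
    items: List[ScoredLearning],
    config: str,
    budget: int = 2,
    min_score: float = 0.1,
) -> List[str]:
    in_scope = [i for i in items if i.scope_config == config]  # scoping
    kept = [i for i in in_scope if score(i) >= min_score]      # decay
    ranked = sorted(kept, key=score, reverse=True)             # ranking
    return [i.content for i in ranked[:budget]]                # budget


pool = [
    ScoredLearning("Book a window seat by default.", "assistant", 0.9, 1),
    ScoredLearning("Request a vegetarian meal by default.", "assistant", 0.8, 1),
    ScoredLearning("Traveler once mentioned liking trains.", "assistant", 0.3, 90),
    ScoredLearning("Cite primary sources.", "researcher", 0.9, 1),
]
print(select_learnings(pool, "assistant"))
```

The old, low-rated learning falls below the threshold and drops out, and the learning from the other config is never considered for this agent.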

A worked example: an agent that wakes up smarter

The example in the repo is a travel concierge for a single returning traveler. On its first run it knows nothing about them. It learns, it dreams, and on the next run it already knows them.

The configuration and task are ordinary Hugin YAML — the only thing tying them to dreaming is the {{ learnings }} block in the shared system template above:

# assist.yaml — the task
name: assist
prompt: |
  The traveler says:

  {{ request.value }}

  Help them with this request. If they reveal a durable preference,
  record it with save_insight (one insight per preference). When done,
  call finish.
parameters:
  request:
    type: string
    default: >-
      Book me a flight to Lisbon next month. And please — I only ever
      fly in window seats, and I'm vegetarian.

First run. The concierge handles the Lisbon request and, hearing two durable preferences, calls save_insight twice. Its “What you’ve learned” section is empty — it is meeting this traveler for the first time. Two episodic artifacts now sit in storage.

The dream. hugin dream --config assistant replays those two insights and writes them out as Learning artifacts scoped to the assistant config.

Next run. A fresh request — “Plan me a weekend in Rome.” — that says nothing about seats or meals. But the concierge books a window seat and requests a vegetarian meal anyway, because the learnings are now in its prompt:

## What you've learned about this traveler
- The traveler prefers window seats; book a window seat by default.
- The traveler is vegetarian; request a vegetarian meal by default.

Nothing about the model changed. The weights are identical across the two runs. The only thing that changed is what the agent carries into the second run — and that was written by the first.

The payoff in three beats. On the first run the concierge knows nothing, hears two preferences, and saves them as raw insights. The dream consolidates those insights into durable learnings. On the next run the learnings are already in the prompt and are used for the booking.

Dreaming as a first step

There are usually three ways to improve the output of a model. Firstly, you can change its weights, i.e. retrain it, which is powerful, but expensive and opaque. Secondly, you can put more in the context, i.e. in-context learning. This is cheap and immediate, but isolated to a single session. Thirdly, there is test-time compute, or thinking. Here you change neither the weights nor the prompt — you simply let the model reason for longer before it answers, spending extra computation at the moment it runs. This can be remarkably effective, but, like in-context learning, the effort is isolated to a single session, unless you pair it with retraining.

Dreaming is a fourth way, in the gap the others leave. The improvement lives in the prompt, not the weights, so it is fast, and it lasts across sessions, not just within one, so it is persistent. It is kind of the best of both worlds.

The ways a model can improve, placed by cost and how long the gain lasts. In-context learning is cheap but forgotten at the session boundary; thinking buys more at inference but only for the current answer; training lasts but is expensive and opaque. Dreaming is the best of both worlds: cheap and immediate like context, yet durable across sessions like training.

That sounds like magic, so it is worth being clear: dreaming cannot teach the agent anything its weights could not already do — it is no substitute for training. But what it does, it does cheaply and legibly. Every learning is a plain sentence you can read, see where it applies, trace back to the runs that produced it, and delete if it is wrong. A self-improvement technique that you can audit and undo is a very different thing from self-improvement baked into the weights.

You can view dreaming as another compression technique, similar to context compaction, but lifted from a single conversation to a whole history of sessions, and run offline where there’s no latency to pay. Note that dreaming does not check whether a lesson actually helped, and it neither revises nor drops a lesson that did not. It simply assumes that dreamed memories are always good. A clear improvement would therefore be a correction loop that assesses the quality of consolidated memories and adjusts how the agent dreams based on past outputs.

Hugin on GitHub — the source, including the dreaming example you can run end to end.
Hugin Dreaming Example - the example used and mentioned above
Hugin: a state machine framework for agentic reasoning — the architecture the dream is built on.

eriksfunhouse.com
