
Open source memory layer so any AI agent can do what Claude.ai and ChatGPT do

Open Source · MCP Native · PostgreSQL + pgvector

Stash makes your AI remember you. Every session. Forever. No more explaining yourself from scratch.

Sound familiar?

😫 Without Stash

Hey, I'm building a SaaS for restaurants. Can you help?

Of course! Tell me about your project.

We talked about this last week... I already explained everything.

I'm sorry, I don't have access to previous conversations.

...again?

πŸ” You just wasted 10 minutes re-explaining yourself. Again.

VS

😌 With Stash

Hey, continuing work on my project.

Welcome back! Last time we finalized the pricing model for your restaurant SaaS. You were about to work on the onboarding flow. Want to pick up there?

Yes! Exactly that.

Great. You also mentioned you wanted to avoid Stripe's complexity, and I have that noted. Here's where we left off...

✓ Picked up instantly. Zero repetition. Full context.

New session: ❌ "Who are you again?" → ✓ Picks up where you left off

Your preferences: ❌ Re-explain every time → ✓ Already knows them

Past mistakes: ❌ Repeats the same errors → ✓ Remembers what didn't work

Long projects: ❌ Loses track of goals → ✓ Tracks goals across weeks

Token cost: ❌ Grows every session → ✓ Only recalls what matters

Switching models: ❌ Start from zero again → ✓ Memory is model-agnostic

What is Stash

Not just memory.

Stash is a persistent cognitive layer that sits between your AI agent and the world. It doesn't replace your model; it makes your model continuous. Episodes become facts. Facts become patterns. Patterns become wisdom.

"Your AI is the brain. Stash is the life experience."

your agent: Claude, GPT, local model, anything

episodes: raw observations, append-only

facts: synthesized beliefs with confidence

relationships: entity knowledge graph

patterns: higher-order abstractions

goals · failures · hypotheses: intent, learning, uncertainty

postgres + pgvector: battle-tested infrastructure

Namespaces

Memory organized

Not all memory is equal. What your agent learns about you is different from what it learns about a project, which is different from what it knows about itself. Namespaces let the agent organize what it learns into clean, separate buckets, just like folders on your computer.

Each namespace is a path. Paths are hierarchical. Reading from /projects automatically includes everything under /projects/stash, /projects/cartona, and so on. You never have to think about it; the agent does.

📁 Write to one namespace. Read from any subtree.

example namespace structure

πŸ“ / everything

πŸ“ /users/alice who alice is, her preferences

πŸ“ /projects all projects

πŸ“ /projects/restaurant-saas pricing, features, decisions

πŸ“ /projects/mobile-app design, tech stack, goals

πŸ“ /self agent self-knowledge

πŸ“„ /self/capabilities what I do well

πŸ“„ /self/limits what I struggle with

πŸ“„ /self/preferences how I work best

🔁 Recursive reads: Recall from /projects and get everything across all sub-projects automatically.

✏️ Precise writes: Remember always targets one exact namespace, with no accidental cross-contamination.

🔒 Clean separation: User memory never mixes with project memory. Agent self-knowledge stays in /self.
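The subtree semantics above can be sketched in a few lines. This is a hypothetical in-memory stand-in, not Stash's Postgres-backed implementation: `MemoryStore` and its methods exist only to illustrate the prefix-matching rule (the real remember/recall tools go through MCP).

```python
# Sketch of namespace semantics: writes target one exact path,
# reads match an entire subtree by path prefix.

class MemoryStore:
    def __init__(self):
        self.entries: dict[str, list[str]] = {}  # namespace -> memories

    def remember(self, namespace: str, text: str) -> None:
        # Precise write: exactly one namespace, never its children.
        self.entries.setdefault(namespace, []).append(text)

    def recall(self, namespace: str) -> list[str]:
        # Recursive read: the namespace itself plus everything beneath it.
        prefix = namespace.rstrip("/") + "/"
        return [
            text
            for ns, memories in self.entries.items()
            if ns == namespace or ns.startswith(prefix)
            for text in memories
        ]

store = MemoryStore()
store.remember("/projects/restaurant-saas", "pricing: flat monthly tier")
store.remember("/projects/mobile-app", "stack: React Native")
store.remember("/users/alice", "prefers concise answers")

print(len(store.recall("/projects")))     # 2: both project memories
print(len(store.recall("/users/alice")))  # 1: exact namespace only
```

Reading from `/` returns everything, since every path sits under the root.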

Stash vs RAG

RAG gives your AI documents. Stash gives it memory.

You've probably heard of RAG (Retrieval-Augmented Generation). It's clever, but it's not memory. Here's the difference, in plain English.

📚 RAG

"A very fast librarian"

You give it a pile of documents. When you ask a question, it searches those documents and hands you the relevant pages. That's it. It doesn't remember your conversation. It doesn't learn. It doesn't know you. Every question starts from scratch; it's just a smarter search engine over files you already wrote.

  • Only knows what's in your documents
  • Cannot learn from conversations
  • Cannot track goals or intentions
  • Cannot reason about cause and effect
  • Cannot notice contradictions over time
  • Stateless: no continuity whatsoever
  • You must write the knowledge first

VS

🧠 Stash

"A mind that grows"

Stash learns from everything your agent experiences: conversations, decisions, successes, failures. It synthesizes raw observations into facts, connects facts into a knowledge graph, detects contradictions, tracks goals, and builds an understanding of you that deepens over time. You don't write anything. It figures it out.

  • Learns from every conversation automatically
  • Builds a knowledge graph over time
  • Tracks your goals across weeks and months
  • Reasons about cause and effect
  • Self-corrects when beliefs contradict
  • Continuous: picks up exactly where you left off
  • Creates knowledge β€” you don't have to

📚

RAG is like...

A brilliant intern who reads your files perfectly, but forgets everything the moment they leave the room.

→

🧠

Stash is like...

A colleague who was there from day one, remembers every decision you ever made, and gets more valuable every single week.

Can you use both? Yes: RAG is great for searching documents. Stash is for remembering experience. They solve different problems. Stash just goes much, much further.

Why Stash is Different

Everyone gave AI memory. Nobody shared it.

Claude.ai has memory. ChatGPT has memory. But each works only for itself: locked to one platform, one model, one company. Stash works for everyone, everywhere, forever. And it goes far deeper than any of them.

(Claude.ai · ChatGPT · Stash)

Remembers you: ✓ · ✓ · ✓

Works with any AI model: ✗ · ✗ · ✓

Works with local / private models: ✗ · ✗ · ✓

You own your data: ✗ · ✗ · ✓

Background consolidation: ✗ · ✗ · ✓

Goals & intent tracking: ✗ · ✗ · ✓

Learns from failures: ✗ · ✗ · ✓

Causal reasoning: ✗ · ✗ · ✓

Agent self-model: ✗ · ✗ · ✓

What it gives your AI: A notepad · A notepad · A mind

The Problem

🧠

Brilliant brain, no experience

AI models reason brilliantly but remember nothing. Every session you re-explain who you are, what you need, and what you've already tried. You're training the same student every single day.

💸

Context windows are expensive

The workaround is stuffing full conversation history into every prompt. It's slow, expensive, and you still hit the limit. You're paying for tokens that repeat the same facts over and over.

🔄

Agents repeat their mistakes

Your agent tried something, it failed, and next session it tries the exact same thing again. There's no mechanism to carry lessons forward. Every failure is forgotten.

🔒

Memory is a platform privilege

Only a handful of AI platforms offer memory, and only for their own models. Your custom agent, your local LLM, your Cursor setup? They all start blind. Memory shouldn't be a premium feature.

Express Setup

Up and running

No infrastructure to set up. No dependencies to install manually. Docker Compose handles everything: Postgres, pgvector, Stash, all wired together and ready.

1

Clone the repository and cd into it

2

Copy .env.example → .env and set your API key + model preferences

3

Run docker compose up. That's it. Stash is live.

$ git clone https://github.com/alash3al/stash

$ cd stash

$ cp .env.example .env

    # edit .env with your API key,

    # models and STASH_VECTOR_DIM

$ docker compose up

✓ postgres + pgvector ready

✓ stash migrations applied

✓ mcp server listening

✓ consolidation running in background

$

⚠️ Set STASH_VECTOR_DIM in your .env before first run. It cannot be changed after initialization.

01 📝 Episodes: Raw observations stored as they happen

02 💡 Facts: Clustered episodes synthesized by LLM

03 🕸️ Relationships: Entity edges extracted from facts

04 🔗 Causal Links: Cause-effect pairs between facts

05 🌀 Patterns: Abstract higher-order insights

06 ⚖️ Contradictions: Self-correction and confidence decay

07 🎯 Goal Inference (NEW): Facts automatically tracked against active goals. Progress detected, contradictions surfaced.

08 💥 Failure Patterns (NEW): Detect repeated mistakes. Extract failure patterns as new facts. The agent stops repeating itself.

09 🔬 Hypothesis Scan (NEW): New evidence passively confirms or rejects open hypotheses. No manual intervention needed.
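As a toy illustration of stages 02 and 06, here is one way fact confidence might behave: more supporting episodes raise confidence, and a detected contradiction decays both sides until resolved. Everything here (`Fact`, `synthesize`, the 0.5 decay factor) is invented for illustration; Stash's real pipeline uses an LLM over pgvector clusters, not these formulas.

```python
# Toy model: episode clusters -> facts with confidence (stage 02),
# contradiction -> confidence decay on both beliefs (stage 06).
from dataclasses import dataclass

@dataclass
class Fact:
    statement: str
    confidence: float
    support: int  # number of episodes backing this belief

def synthesize(statement: str, episodes: list[str]) -> Fact:
    # More supporting episodes -> higher confidence, capped below 1.0.
    n = len(episodes)
    return Fact(statement, confidence=min(0.95, 0.5 + 0.1 * n), support=n)

def decay_on_contradiction(a: Fact, b: Fact, factor: float = 0.5) -> None:
    # Self-correction: neither belief is fully trusted until resolved.
    a.confidence *= factor
    b.confidence *= factor

fact = synthesize("user wants to avoid Stripe", [
    "user asked about Stripe",
    "user said Stripe feels too complex",
    "user wants to avoid Stripe",
])
rival = Fact("user plans to adopt Stripe", confidence=0.6, support=1)
decay_on_contradiction(fact, rival)
print(round(fact.confidence, 2), round(rival.confidence, 2))  # 0.4 0.3
```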

MCP Integration

Two commands.

Stash speaks MCP natively. Drop it into Claude Desktop, Cursor, or any MCP-compatible agent in under 5 minutes. No SDK. No vendor lock-in. Your agent remembers you everywhere.

28 tools covering the full cognitive stack, from raw remember and recall all the way to causal chains, contradiction resolution, and hypothesis management.

Claude Desktop · Cursor · OpenCode · Custom Agents · Local LLMs · Any MCP Client

$ ./stash mcp execute --with-consolidation

$ ./stash mcp serve --port 8080 --with-consolidation

✓ remember · recall · forget · init

✓ goals · failures · hypotheses

✓ consolidate · query_facts · relationships

✓ causal links · contradictions

✓ namespaces · context · self-model

$
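For Claude Desktop specifically, the stdio command above plugs into the standard `mcpServers` config. A sketch, where the server name and binary path are placeholders you'd adjust (the `mcp execute --with-consolidation` command comes from the docs above):

```json
{
  "mcpServers": {
    "stash": {
      "command": "/absolute/path/to/stash",
      "args": ["mcp", "execute", "--with-consolidation"]
    }
  }
}
```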

Agent Self-Model

Your agent can know itself

Call init and Stash creates a /self namespace scaffold. The agent uses its own memory layer to build and maintain a model of its own capabilities, limits, and preferences.

/self/capabilities

What I can do well

The agent remembers where it excels and recalls these when planning how to approach a task.

/self/limits

What I struggle with

Recorded failures and known weaknesses. The anti-repeat mechanism. Never make the same mistake twice.

/self/preferences

How I work best

Learned preferences for how to operate. The agent develops a working style over time, not just facts.

Autonomous Loop

An agent that improves itself

Give your agent a 5-minute research loop. It orients from past memory, researches a topic it chooses itself, invents new connections, consolidates what it learned, and closes gracefully, ready to pick up next time.

Run it as a cron job. Every 5 minutes, your agent gets smarter.

→ See the loop prompt

01 Orient: Recall context, active goals, open hypotheses, past failures

02 Research: Search the web on a topic the agent chooses itself

03 Think: Surface tensions, gaps, contradictions in what it now knows

04 Invent: Generate something new (a hypothesis, pattern, or discovery)

05 Consolidate: Run the pipeline. Synthesize raw episodes into structured knowledge

06 Reflect + Sleep: Write a session summary. Set context for next run. Stop.
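The six steps can be compressed into one function. The bodies below are stubs: the memory methods mirror real Stash MCP tools (recall, remember, consolidate), but the agent object and everything it does between memory calls is hypothetical.

```python
# The six loop steps as one function, with stand-in agent/memory objects.

def run_research_loop(agent, memory, namespace="/research"):
    context = memory.recall(namespace)              # 01 Orient
    findings = agent.research(context)              # 02 Research
    tensions = agent.think(context, findings)       # 03 Think
    idea = agent.invent(tensions)                   # 04 Invent
    memory.remember(namespace, findings)
    memory.consolidate()                            # 05 Consolidate
    memory.remember(namespace, f"summary: {idea}")  # 06 Reflect + sleep
    return idea

# Minimal stubs so the loop runs end to end:
class StubMemory:
    def __init__(self):
        self.log = []
    def recall(self, ns):
        return list(self.log)
    def remember(self, ns, item):
        self.log.append(item)
    def consolidate(self):
        pass

class StubAgent:
    def research(self, ctx):
        return "raw notes on chosen topic"
    def think(self, ctx, findings):
        return ["tension between X and Y"]
    def invent(self, tensions):
        return "hypothesis about " + tensions[0]

memory = StubMemory()
idea = run_research_loop(StubAgent(), memory)
print(len(memory.log))  # 2: the findings plus the session summary
```

Run it on a schedule and each invocation starts by recalling what the previous one wrote.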

⚡

Stash itself runs on OpenRouter: the author points his local Stash at it for access to hundreds of models with one API key and zero infrastructure.

☁️

Cloud APIs

OpenRouter gives you access to hundreds of models (GPT, Claude, Gemini, Mistral), all behind one OpenAI-compatible endpoint. Point Stash at it and pick any model for embedding and reasoning.

🏠

Local Models

Running Ollama locally? Stash works out of the box. Use Qwen, Llama, Mistral, or any model you've pulled; your memory stays fully private, fully offline.

πŸ”§

Self-Hosted

vLLM, LM Studio, llama.cpp server, Together AI, Groq: if it speaks the OpenAI API format, Stash speaks it back. The same provider serves both the embedding and reasoning models.

⚠️

Set STASH_VECTOR_DIM before your first run and never change it. pgvector locks the embedding dimension at initialization; changing it later requires a full database reset. The default embedding model is openai/text-embedding-3-small with STASH_VECTOR_DIM=1536.

.env examples: use the same provider for embedding + reasoning

OpenRouter:

STASH_OPENAI_BASE_URL=https://openrouter.ai/api/v1

STASH_OPENAI_API_KEY=sk-or-...

STASH_EMBEDDING_MODEL=openai/text-embedding-3-small

STASH_REASONER_MODEL=anthropic/claude-3-haiku

STASH_VECTOR_DIM=1536

Ollama (local):

STASH_OPENAI_BASE_URL=http://localhost:11434/v1

STASH_EMBEDDING_MODEL=nomic-embed-text

STASH_REASONER_MODEL=qwen2.5:3b

STASH_VECTOR_DIM=768

Groq:

STASH_OPENAI_BASE_URL=https://api.groq.com/openai/v1

STASH_EMBEDDING_MODEL=openai/text-embedding-3-small

STASH_REASONER_MODEL=llama-3.1-8b-instant

STASH_VECTOR_DIM=1536

Open source. Apache 2.0 licensed. Backed by PostgreSQL. Works with any MCP-compatible agent.