Memory System
The `MemoryTool` in `co_ai` serves as the shared memory backbone for the entire pipeline. It manages embeddings, hypotheses, reviews, prompt versions, logs, and more, all stored in a PostgreSQL database with pgvector support.
Core Design
The memory system consists of modular "stores", each implementing a specific aspect of persistent storage. All stores inherit from `BaseStore` and are registered with a central `MemoryTool`.
MemoryTool Overview
```python
memory = MemoryTool(cfg=cfg.db, logger=logger)
```

It registers multiple internal stores automatically:

```text
- embeddings: vector storage and similarity search
- hypotheses: hypothesis versions, links, metadata
- context: stores pipeline-level state/context
- prompts: prompt text, responses, evaluations
- reports: final YAML or HTML reports
```
Store Breakdown
EmbeddingStore
Stores vector embeddings of hypotheses and enables similarity search via `pgvector`.
Key Methods:
- `get_or_create(text)`
- `similar(text, top_k=5)`
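The two methods above can be illustrated with a small in-memory analogue. This is only a sketch under stated assumptions: `InMemoryEmbeddingStore`, `toy_embed`, and the `(text, score)` return shape are hypothetical, and the real store presumably delegates the ranking to pgvector rather than computing cosine similarity in Python.

```python
import math


def _cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryEmbeddingStore:
    """Hypothetical in-memory analogue of an embedding store."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn  # assumed: callable mapping text -> list of floats
        self._cache = {}          # text -> embedding

    def get_or_create(self, text):
        # Compute the embedding only the first time a text is seen.
        if text not in self._cache:
            self._cache[text] = self.embed_fn(text)
        return self._cache[text]

    def similar(self, text, top_k=5):
        # Rank every other stored text by cosine similarity to the query.
        query = self.get_or_create(text)
        scored = [
            (other, _cosine(query, vec))
            for other, vec in self._cache.items()
            if other != text
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


def toy_embed(text):
    # Toy embedding (vowel counts); a real embedding model would be used instead.
    return [float(text.lower().count(ch)) for ch in "aeiou"]


store = InMemoryEmbeddingStore(toy_embed)
for t in ("alpha", "beta", "gamma"):
    store.get_or_create(t)
results = store.similar("alpha", top_k=2)
print(results)
```

Caching by text mirrors the `UNIQUE` constraint one would expect on the stored text: the same string is never embedded twice.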
HypothesisStore
Stores hypotheses generated throughout the pipeline, along with evaluations and links to the prompt that created them.
Schema includes:
- `text`
- `goal`
- `confidence`
- `features` (JSON)
- `embedding` (vector)
- `prompt_id` (foreign key)
PromptStore
Tracks prompt versions, tuning attempts, evaluations, and strategies used during generation.
Key Features:
- Save new prompts with `save()`
- Log evaluation results with `store_evaluation()`
- Retrieve recent prompts via `get_latest_prompts(n)`
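A rough sketch of how these three operations might fit together is shown below. The argument lists, the `PromptRecord` fields, and the agent name `"generation"` are assumptions made for illustration; only the method names come from the list above, and the real store persists to PostgreSQL rather than a Python list.

```python
from dataclasses import dataclass, field


@dataclass
class PromptRecord:
    # Hypothetical record shape, loosely following the prompts table.
    agent_name: str
    prompt_text: str
    version: int
    is_current: bool = True
    evaluations: list = field(default_factory=list)


class InMemoryPromptStore:
    """Hypothetical in-memory analogue of a prompt store."""

    def __init__(self):
        self._records = []

    def save(self, agent_name, prompt_text):
        # Saving a new prompt retires earlier versions for the same agent.
        previous = [r for r in self._records if r.agent_name == agent_name]
        for record in previous:
            record.is_current = False
        new_record = PromptRecord(agent_name, prompt_text, version=len(previous) + 1)
        self._records.append(new_record)
        return new_record

    def store_evaluation(self, record, score, notes=""):
        # Attach an evaluation result to a previously saved prompt.
        record.evaluations.append({"score": score, "notes": notes})

    def get_latest_prompts(self, n):
        # Most recently saved first; insertion order stands in for a timestamp sort.
        return list(reversed(self._records))[:n]


prompts = InMemoryPromptStore()
v1 = prompts.save("generation", "Propose a hypothesis for: {goal}")
v2 = prompts.save("generation", "Propose a testable hypothesis for: {goal}")
prompts.store_evaluation(v2, score=0.8, notes="more specific")
print([p.version for p in prompts.get_latest_prompts(2)])
```

Keeping old versions with `is_current` set to false, instead of overwriting them, is what makes later comparison between prompt versions possible.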
ContextStore
Stores and retrieves pipeline context snapshots to allow state recovery or debugging between stages.
Stored As:
```yaml
- Timestamped entries
- One entry per run ID
```
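The "one timestamped entry per run ID" behaviour could look roughly like the sketch below. The class name, the `save`/`load` method names, and the run ID `"run-001"` are hypothetical, and JSON from the standard library stands in for YAML so that the sketch has no third-party dependency.

```python
import json
from datetime import datetime, timezone


class InMemoryContextStore:
    """Hypothetical in-memory analogue of a context store."""

    def __init__(self):
        self._snapshots = {}  # run_id -> {"saved_at": str, "payload": str}

    def save(self, run_id, context):
        # One entry per run ID: saving again replaces the earlier snapshot.
        self._snapshots[run_id] = {
            "saved_at": datetime.now(timezone.utc).isoformat(),
            "payload": json.dumps(context, sort_keys=True),
        }

    def load(self, run_id):
        # Return the stored context, or None if the run is unknown.
        entry = self._snapshots.get(run_id)
        return json.loads(entry["payload"]) if entry else None


contexts = InMemoryContextStore()
contexts.save("run-001", {"stage": "generation", "hypotheses": 3})
contexts.save("run-001", {"stage": "review", "hypotheses": 3})
print(contexts.load("run-001"))
```

Serializing the context on save, instead of holding a reference to the live dictionary, means later pipeline stages cannot accidentally mutate a stored snapshot.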
ReportLogger
Stores the final pipeline result YAML in the `reports` table and allows for post-run analysis or export.
Custom Stores
You can define and plug in your own store like so:
```python
class MyCustomStore(BaseStore):
    def __init__(self, db, logger=None):
        self.db = db
        self.logger = logger
        self.name = "my_custom_store"
```

Then register it:

```python
memory.register_store(MyCustomStore(db, logger))
```
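To make the extension pattern concrete, here is a self-contained sketch that can be run without the real package. `SketchBaseStore` and `SketchMemoryTool` are stand-ins, not the actual `BaseStore` and `MemoryTool`; the `get()` accessor, the `NoteStore` class, and the use of an in-memory SQLite connection in place of PostgreSQL are all assumptions made for the example.

```python
import sqlite3


class SketchBaseStore:
    """Stand-in for BaseStore; the real base class interface is not shown here."""

    name = "base"


class SketchMemoryTool:
    """Stand-in registry that keeps stores in a dict keyed by their name."""

    def __init__(self):
        self._stores = {}

    def register_store(self, store):
        self._stores[store.name] = store

    def get(self, name):
        # Hypothetical accessor; the real MemoryTool may expose stores differently.
        return self._stores[name]


class NoteStore(SketchBaseStore):
    """Hypothetical custom store that persists free-form notes."""

    def __init__(self, db, logger=None):
        self.db = db
        self.logger = logger
        self.name = "notes"
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)"
        )

    def add(self, body):
        # Insert one note and return its row ID.
        cur = self.db.execute("INSERT INTO notes (body) VALUES (?)", (body,))
        self.db.commit()
        return cur.lastrowid

    def all(self):
        # Return note bodies in insertion order.
        return [row[0] for row in self.db.execute("SELECT body FROM notes ORDER BY id")]


db = sqlite3.connect(":memory:")
memory = SketchMemoryTool()
memory.register_store(NoteStore(db))
note_id = memory.get("notes").add("first note")
print(memory.get("notes").all())
```

The store owns its own table setup and queries, so the registry only needs to know the store's `name`.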
Database Schema (Simplified)
Here are a few of the tables used:

```sql
CREATE TABLE embeddings (
    id SERIAL PRIMARY KEY,
    text TEXT UNIQUE,
    embedding VECTOR(1024),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE hypotheses (
    id SERIAL PRIMARY KEY,
    text TEXT,
    goal TEXT,
    confidence FLOAT,
    prompt_id INT,
    embedding VECTOR(1024),
    ...
);

CREATE TABLE prompts (
    id SERIAL PRIMARY KEY,
    agent_name TEXT,
    prompt_text TEXT,
    version INT,
    is_current BOOLEAN,
    ...
);
```
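Given the `embeddings` table above, a nearest-neighbour lookup could be issued along the following lines. This is a sketch, not the query the project actually runs: the helper names are hypothetical, the `%s` placeholders assume a psycopg-style driver, and cosine distance (pgvector's `<=>` operator) is an assumed choice of metric.

```python
def to_vector_literal(values):
    # pgvector accepts vectors in the text form '[v1,v2,...]'.
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def build_similarity_query(embedding, top_k=5):
    # Returns parameterized SQL plus its parameters; nothing is executed here.
    sql = (
        "SELECT text, embedding <=> %s::vector AS distance "
        "FROM embeddings "
        "ORDER BY embedding <=> %s::vector "
        "LIMIT %s"
    )
    literal = to_vector_literal(embedding)
    return sql, (literal, literal, top_k)


sql, params = build_similarity_query([0.1, 0.2, 0.3], top_k=3)
print(sql)
print(params)
```

In real use the query vector must have the same dimension as the column (1024 in the schema above); the three-element vector here only keeps the example readable.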
Config Example
Hydra config fragment:

```yaml
db:
  host: localhost
  port: 5432
  database: co_ai
  user: postgres
  password: yourpassword
```
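As an illustration of how these fields map onto a connection, the sketch below turns the `db` block into a PostgreSQL connection URI. The `build_dsn` helper is hypothetical, and a plain dictionary stands in for the Hydra config object; how `MemoryTool` actually consumes `cfg.db` is not shown in this document.

```python
from urllib.parse import quote


def build_dsn(db_cfg):
    # Percent-encode credentials so characters such as '@' or '/' stay valid in a URI.
    user = quote(str(db_cfg["user"]), safe="")
    password = quote(str(db_cfg["password"]), safe="")
    return (
        f"postgresql://{user}:{password}"
        f"@{db_cfg['host']}:{db_cfg['port']}/{db_cfg['database']}"
    )


db_cfg = {
    "host": "localhost",
    "port": 5432,
    "database": "co_ai",
    "user": "postgres",
    "password": "yourpassword",
}
dsn = build_dsn(db_cfg)
print(dsn)
```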
Benefits
- Consistent logging and traceability
- Unified access to evolving data
- Easy to extend with new stores
- Enables historical comparisons and tuning