🧠 Memory System

The MemoryTool in co_ai serves as the shared memory backbone for the entire pipeline. It manages embeddings, hypotheses, reviews, prompt versions, logs, and more, all stored in a PostgreSQL database with pgvector support.


📦 Core Design

The memory system consists of modular "stores", each implementing a specific aspect of persistent storage. These stores all inherit from a BaseStore and are registered into a central MemoryTool.

MemoryTool Overview

```python
memory = MemoryTool(cfg=cfg.db, logger=logger)
```

It registers multiple internal stores automatically:

```text
- embeddings: vector storage and similarity search
- hypotheses: hypothesis versions, links, metadata
- context: stores pipeline-level state/context
- prompts: prompt text, responses, evaluations
- reports: final YAML or HTML reports
```

📚 Store Breakdown

EmbeddingStore

Stores vector embeddings of hypotheses and enables similarity search via pgvector.

Key Methods:

  • get_or_create(text)
  • similar(text, top_k=5)
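
The two methods can be pictured with an in-memory stand-in. This is a minimal sketch, not the co_ai implementation: `InMemoryEmbeddingStore`, `toy_embed`, and the `embed` callable are hypothetical names, and plain-Python cosine similarity replaces the pgvector query. It illustrates only the assumed contract that `get_or_create` computes one vector per text and `similar` returns the `top_k` closest stored texts.

```python
import math
from typing import Callable, Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryEmbeddingStore:
    """Hypothetical stand-in for EmbeddingStore (no database involved)."""

    def __init__(self, embed: Callable[[str], List[float]]):
        self.embed = embed  # assumed: any text -> vector function
        self.rows: Dict[str, List[float]] = {}

    def get_or_create(self, text: str) -> List[float]:
        # Compute the embedding only the first time a text is seen.
        if text not in self.rows:
            self.rows[text] = self.embed(text)
        return self.rows[text]

    def similar(self, text: str, top_k: int = 5) -> List[Tuple[str, float]]:
        # Rank every stored text except the query itself by cosine similarity.
        query = self.get_or_create(text)
        scored = [(t, cosine(query, v)) for t, v in self.rows.items() if t != text]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


def toy_embed(text: str) -> List[float]:
    # Toy embedding for demonstration: counts of vowels, other characters, spaces.
    vowels = sum(c in "aeiou" for c in text.lower())
    spaces = text.count(" ")
    return [float(vowels), float(len(text) - vowels - spaces), float(spaces)]
```

In the real store the ranking would presumably be delegated to PostgreSQL, so the Python-side sort here is only a placeholder for that query.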

💡 HypothesisStore

Stores hypotheses generated throughout the pipeline, along with evaluations and links to the prompt that created them.

Schema includes:

  • text
  • goal
  • confidence
  • features (JSON)
  • embedding (vector)
  • prompt_id (foreign key)
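
To make the field list concrete, here is a hedged sketch of how one row might be mirrored in Python. `HypothesisRecord` and `to_row` are hypothetical names, and the `[0, 1]` range check on `confidence` is an assumption, since the source does not state the allowed range.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class HypothesisRecord:
    """Hypothetical in-memory mirror of one hypotheses row."""

    text: str
    goal: str
    confidence: float = 0.0
    features: Dict[str, Any] = field(default_factory=dict)  # stored as JSON
    embedding: Optional[List[float]] = None  # stored as a vector
    prompt_id: Optional[int] = None  # points at the prompt that produced it

    def __post_init__(self) -> None:
        # Assumption: confidence is a probability-like score in [0, 1].
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")

    def to_row(self) -> Dict[str, Any]:
        # Serialize features to a JSON string, as a JSON column would receive it.
        row = asdict(self)
        row["features"] = json.dumps(self.features)
        return row
```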

💾 PromptStore

Tracks prompt versions, tuning attempts, evaluations, and strategies used during generation.

Key Features:

  • Save new prompts with save()
  • Log evaluation results with store_evaluation()
  • Retrieve recent prompts via get_latest_prompts(n)
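
The interplay of these three methods can be sketched in memory. The method names come from the list above, but their argument lists are assumptions, and `InMemoryPromptStore` and `PromptRecord` are hypothetical. The versioning rule (bump the version and retire the previous prompt for the same agent) is one plausible reading of the `version` and `is_current` columns, not confirmed behavior.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PromptRecord:
    id: int
    agent_name: str
    prompt_text: str
    version: int
    is_current: bool = True
    evaluations: List[Dict] = field(default_factory=list)


class InMemoryPromptStore:
    """Hypothetical stand-in for PromptStore; argument lists are assumed."""

    def __init__(self) -> None:
        self.prompts: List[PromptRecord] = []

    def save(self, agent_name: str, prompt_text: str) -> PromptRecord:
        # Retire the agent's earlier prompts and assign the next version number.
        previous = [p for p in self.prompts if p.agent_name == agent_name]
        for p in previous:
            p.is_current = False
        record = PromptRecord(
            len(self.prompts) + 1, agent_name, prompt_text, len(previous) + 1
        )
        self.prompts.append(record)
        return record

    def store_evaluation(self, prompt_id: int, score: float, notes: str = "") -> None:
        # Attach an evaluation result to the prompt it scored.
        prompt = next(p for p in self.prompts if p.id == prompt_id)
        prompt.evaluations.append({"score": score, "notes": notes})

    def get_latest_prompts(self, n: int) -> List[PromptRecord]:
        # Newest first, based on insertion order.
        return list(reversed(self.prompts))[:n]
```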

📥 ContextStore

Stores and retrieves pipeline context snapshots to allow state recovery or debugging between stages.

Stored As:

  • YAML
  • Timestamped entries
  • One entry per run ID
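
The storage rules above can be sketched as a small in-memory class. Everything here is hypothetical (`InMemoryContextStore`, `save`, `load`), and JSON stands in for YAML only so the example needs nothing beyond the standard library. "One entry per run ID" is read here as "a later save overwrites the earlier snapshot", which is an assumption.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, Optional


class InMemoryContextStore:
    """Hypothetical stand-in: one timestamped snapshot per run ID."""

    def __init__(self) -> None:
        self.snapshots: Dict[str, Dict[str, str]] = {}

    def save(self, run_id: str, context: Dict[str, Any]) -> None:
        # A later save for the same run ID overwrites the earlier snapshot.
        self.snapshots[run_id] = {
            "saved_at": datetime.now(timezone.utc).isoformat(),
            "payload": json.dumps(context, sort_keys=True),  # JSON stands in for YAML
        }

    def load(self, run_id: str) -> Optional[Dict[str, Any]]:
        # Return the stored context, or None for an unknown run ID.
        entry = self.snapshots.get(run_id)
        return json.loads(entry["payload"]) if entry else None
```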

📜 ReportLogger

Stores the final pipeline result as YAML in the reports table, making it available for post-run analysis or export.


🧩 Custom Stores

You can define and plug in your own store like so:

class MyCustomStore(BaseStore):
    def __init__(self, db, logger=None):
        self.db = db
        self.logger = logger
        self.name = "my_custom_store"

Then register it:

memory.register_store(MyCustomStore(db, logger))
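
How registration might work internally can be sketched with a small registry. This is an illustration under assumptions, not the actual MemoryTool: `MemoryToolSketch` is hypothetical, the `BaseStore` below is a bare stand-in, and exposing stores as attributes keyed by `name` is a guess at the access pattern.

```python
from typing import Dict


class BaseStore:
    """Stand-in base class; the real BaseStore may define more."""

    name = "base"


class MemoryToolSketch:
    """Hypothetical registry exposing each store as an attribute."""

    def __init__(self) -> None:
        self._stores: Dict[str, BaseStore] = {}

    def register_store(self, store: BaseStore) -> None:
        # Refuse duplicates so one store cannot silently replace another.
        if store.name in self._stores:
            raise ValueError(f"store {store.name!r} is already registered")
        self._stores[store.name] = store

    def __getattr__(self, name: str) -> BaseStore:
        # Called only when normal attribute lookup fails.
        stores = self.__dict__.get("_stores", {})
        if name in stores:
            return stores[name]
        raise AttributeError(name)


class MyCustomStore(BaseStore):
    def __init__(self, db, logger=None):
        self.db = db
        self.logger = logger
        self.name = "my_custom_store"
```

With this shape, a registered store is reachable as `memory.my_custom_store`, so callers do not need to know which stores were built in and which were added later.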

🗄️ Database Schema (Simplified)

Here are a few tables used:

CREATE TABLE embeddings (
    id SERIAL PRIMARY KEY,
    text TEXT UNIQUE,
    embedding VECTOR(1024),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE hypotheses (
    id SERIAL PRIMARY KEY,
    text TEXT,
    goal TEXT,
    confidence FLOAT,
    prompt_id INT,
    embedding VECTOR(1024),
    ...
);

CREATE TABLE prompts (
    id SERIAL PRIMARY KEY,
    agent_name TEXT,
    prompt_text TEXT,
    version INT,
    is_current BOOLEAN,
    ...
);
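
Given the embeddings table above, a similarity search could be expressed with pgvector's cosine-distance operator `<=>`. The helper names below are hypothetical, and the `%s` placeholders assume a psycopg-style driver. The sketch only builds the query and its parameters and does not open a connection.

```python
from typing import List, Sequence, Tuple


def to_vector_literal(values: Sequence[float]) -> str:
    """Format floats in pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def nearest_texts_query(query_vec: Sequence[float], top_k: int = 5) -> Tuple[str, List]:
    # `<=>` is pgvector's cosine-distance operator: smaller means more similar.
    sql = (
        "SELECT text, embedding <=> %s::vector AS distance "
        "FROM embeddings ORDER BY distance LIMIT %s"
    )
    return sql, [to_vector_literal(query_vec), top_k]
```

Because the column is declared as VECTOR(1024), the query vector passed at run time would need 1024 dimensions as well.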

🛠 Config Example

Hydra config fragment:

db:
  host: localhost
  port: 5432
  database: co_ai
  user: postgres
  password: yourpassword
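
For tools that expect a single connection string, the fragment above can be turned into a PostgreSQL URI. `build_dsn` is a hypothetical helper, and a plain dict stands in for the Hydra config object. Whether MemoryTool itself consumes a URI or the individual fields is not stated in the source.

```python
from urllib.parse import quote


def build_dsn(db_cfg: dict) -> str:
    """Build a PostgreSQL connection URI from the db config fragment."""
    user = quote(str(db_cfg["user"]), safe="")
    # Percent-encode the password so characters such as '@' or '/' stay valid.
    password = quote(str(db_cfg["password"]), safe="")
    return (
        f"postgresql://{user}:{password}"
        f"@{db_cfg['host']}:{db_cfg['port']}/{db_cfg['database']}"
    )
```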

✅ Benefits

  • Consistent logging and traceability
  • Unified access to evolving data
  • Easy to extend with new stores
  • Enables historical comparisons and tuning