API Reference¶

Auto-generated from source docstrings.

Memory¶

`lakehouse_memory.memory.Memory` ¶

Composition root that wires episodic, semantic, and working stores.

Each Memory instance holds a SQL client, one vector index per store, and a Scope that filters all reads and tags all writes to a specific user / session / agent combination.

The canonical way to create a Memory wired to real Databricks resources is Memory.from_databricks. After construction, call provision() once to create the underlying Unity Catalog tables and Vector Search indexes.

Example::

mem = Memory.from_databricks(
    catalog="my_catalog",
    schema_name="agent_memory",
    workspace_url="https://my-workspace.azuredatabricks.net",
    access_token="dapi...",
    http_path="/sql/1.0/warehouses/abc123",
    vector_search_endpoint="my_vs_endpoint",
)
mem.provision()
scoped = mem.with_scope(user_id="u1", session_id="s1")
scoped.episodic.write(event_type="chat_message", payload={}, text="Hello")

`__init__(config, client, *, episodic_index, semantic_index, scope=None)` ¶

Initialise Memory with explicit collaborators.

Prefer Memory.from_databricks for production use. This constructor is useful in tests where you supply stub clients and no-op indexes.

Parameters:

Name	Type	Description	Default
`config`	`MemoryConfig`	Catalog/schema/embedding settings for this Memory instance.	required
`client`	`DatabricksClient`	SQL client used by all three stores for DDL and DML.	required
`episodic_index`	`VectorIndex`	Vector index used by the episodic store for similarity search. Pass a no-op `VectorIndex` to skip vector search for episodic events.	required
`semantic_index`	`VectorIndex`	Vector index used by the semantic store for similarity search. Pass a no-op `VectorIndex` to skip vector search for facts.	required
`scope`	`Scope \| None`	Optional identity scope applied to every read and write. Defaults to an empty `Scope()` (no filtering).	`None`

`as_langchain_chat_history(limit=100)` ¶

Return a LangChain BaseChatMessageHistory wired to the episodic store.

Requires the [langchain] optional extra::

pip install lakehouse-memory[langchain]

Parameters:

Name	Type	Description	Default
`limit`	`int`	Maximum number of recent chat messages to return when `messages` is accessed. Defaults to `100`.	`100`

Returns:

Type	Description
`LakehouseChatHistory`	A `LakehouseChatHistory` instance scoped to this Memory's scope.

`as_langchain_retriever(k=5)` ¶

Return a LangChain BaseRetriever wired to the semantic store.

Requires the [langchain] optional extra::

pip install lakehouse-memory[langchain]

Parameters:

Name	Type	Description	Default
`k`	`int`	Number of semantically-similar facts to return per query. Defaults to `5`.	`5`

Returns:

Type	Description
`LakehouseSemanticRetriever`	A `LakehouseSemanticRetriever` instance scoped to this Memory's scope.

`from_databricks(*, catalog, schema_name, workspace_url, access_token, http_path, vector_search_endpoint, scope=None, embedding=None)` `classmethod` ¶

Build a Memory wired to real Databricks resources.

Constructs a SqlConnectorClient and two Delta Sync-backed DatabricksVectorIndex objects (one for episodic events, one for semantic facts), and stashes the Vector Search (VS) credentials so that a later call to provision() can create the indexes without the caller passing them again.

This method does not provision anything: call mem.provision() after construction to idempotently create the Unity Catalog tables and Vector Search indexes.

Parameters:

Name	Type	Description	Default
`catalog`	`str`	Unity Catalog catalog name (e.g. `"my_catalog"`).	required
`schema_name`	`str`	Schema inside catalog where memory tables live (e.g. `"agent_memory"`).	required
`workspace_url`	`str`	Full Databricks workspace URL, including scheme (e.g. `"https://my-workspace.azuredatabricks.net"`).	required
`access_token`	`str`	Databricks personal access token (PAT) or service-principal secret used for both SQL Warehouse and Vector Search API calls.	required
`http_path`	`str`	SQL Warehouse HTTP path (e.g. `"/sql/1.0/warehouses/abc123"`).	required
`vector_search_endpoint`	`str`	Name of the existing Databricks Vector Search endpoint to back both indexes.	required
`scope`	`Scope \| None`	Optional identity scope to pre-apply to every store. Defaults to an empty `Scope()` (no filtering).	`None`
`embedding`	`EmbeddingConfig \| None`	Optional embedding endpoint configuration. Defaults to `EmbeddingConfig()` (`databricks-gte-large-en`, 1024 dims).	`None`

Returns:

Type	Description
`Memory`	A fully-wired `Memory` instance. Call `provision()` before reading or writing to ensure the underlying tables and indexes exist.

`provision(*, vector_search_endpoint=None, workspace_url=None, access_token=None)` ¶

Idempotently create the Unity Catalog (UC) schema and tables and, optionally, the Vector Search indexes.

Always creates the Unity Catalog schema (if absent) and the three memory tables (episodic, semantic, working). When a Vector Search endpoint is available — either supplied here or stashed by from_databricks — also creates the two Delta Sync indexes.

Safe to call multiple times; existing tables and indexes are left untouched.

Parameters:

Name	Type	Description	Default
`vector_search_endpoint`	`str \| None`	Name of the Databricks Vector Search endpoint to use when creating indexes. Falls back to the value stashed by `from_databricks`, if any. Pass `None` (and provide no stashed value) to skip index creation entirely.	`None`
`workspace_url`	`str \| None`	Workspace URL needed for Vector Search API calls. Falls back to the value stashed by `from_databricks`.	`None`
`access_token`	`str \| None`	Databricks PAT or service-principal secret for Vector Search API calls. Falls back to the value stashed by `from_databricks`.	`None`

Raises:

Type	Description
`ValueError`	If vector_search_endpoint is resolved but workspace_url or access_token cannot be determined.
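The fallback rules above can be sketched as a small resolution function. This is a minimal illustration written from the description on this page, not the library's source: `resolve_vs_credentials`, `VsCredentials`, and the two plain dicts standing in for the explicit arguments and the values stashed by `from_databricks` are all hypothetical names.

```python
from __future__ import annotations

from typing import NamedTuple


class VsCredentials(NamedTuple):
    """Hypothetical container for the resolved Vector Search settings."""

    endpoint: str
    workspace_url: str
    access_token: str


def resolve_vs_credentials(explicit: dict, stashed: dict) -> VsCredentials | None:
    """Prefer explicit arguments, then stashed values (assumed behaviour)."""

    def pick(key: str) -> str | None:
        value = explicit.get(key)
        return value if value is not None else stashed.get(key)

    endpoint = pick("vector_search_endpoint")
    if endpoint is None:
        # No endpoint from either source: index creation is skipped.
        return None

    workspace_url = pick("workspace_url")
    access_token = pick("access_token")
    if workspace_url is None or access_token is None:
        # Mirrors the documented ValueError case.
        raise ValueError(
            "vector_search_endpoint is set but workspace_url or access_token is missing"
        )
    return VsCredentials(endpoint, workspace_url, access_token)


stashed = {
    "vector_search_endpoint": "my_vs_endpoint",
    "workspace_url": "https://my-workspace.azuredatabricks.net",
    "access_token": "dapi...",
}
resolved = resolve_vs_credentials({"vector_search_endpoint": "other_endpoint"}, stashed)
```

In this sketch an explicitly passed endpoint overrides the stashed one, while the URL and token still come from the stash.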

`with_scope(*, user_id=None, session_id=None, agent_id=None)` ¶

Return a new Memory with scope fields merged from the given arguments.

Any field you pass overrides the corresponding field on the current scope; fields you omit (or pass as None) are inherited unchanged. The new instance shares the same SQL client and vector indexes as the original — no new connections are opened. Stashed VS credentials (workspace_url, access_token, endpoint) are forwarded so that provision() may still be called on the derived instance.

Parameters:

Name	Type	Description	Default
`user_id`	`str \| None`	Override the `user_id` dimension of the scope.	`None`
`session_id`	`str \| None`	Override the `session_id` dimension of the scope.	`None`
`agent_id`	`str \| None`	Override the `agent_id` dimension of the scope.	`None`

Returns:

Type	Description
`Memory`	A new `Memory` instance with the merged scope applied to all three stores.
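The sharing behaviour described for `with_scope` can be illustrated with a self-contained stand-in. `MemorySketch` is a hypothetical class that models the scope as a plain dict; it only demonstrates that the derived instance reuses the same client object while the original scope stays untouched, and it is not how the library is implemented.

```python
from __future__ import annotations

from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class MemorySketch:
    """Hypothetical stand-in for Memory: one shared client plus a scope."""

    client: object
    scope: dict[str, str] = field(default_factory=dict)

    def with_scope(self, *, user_id=None, session_id=None, agent_id=None) -> "MemorySketch":
        overrides = {"user_id": user_id, "session_id": session_id, "agent_id": agent_id}
        # Omitted (None) arguments are dropped, so existing values are inherited.
        set_overrides = {k: v for k, v in overrides.items() if v is not None}
        merged = {**self.scope, **set_overrides}
        # replace() copies the instance and keeps the same client reference.
        return replace(self, scope=merged)


base = MemorySketch(client=object(), scope={"user_id": "u1"})
derived = base.with_scope(session_id="s1")
```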

MemoryConfig¶

`lakehouse_memory.config.MemoryConfig` ¶

Bases: BaseModel

Top-level configuration for a Memory instance.

Holds the Unity Catalog coordinates (catalog + schema) that determine where the three memory tables are stored, plus the embedding configuration used by Vector Search. Instances are frozen (immutable) after construction.

Attributes:

Name	Type	Description
`catalog`	`str`	Unity Catalog catalog name. Must be non-empty.
`schema_name`	`str`	Schema inside catalog where the `episodic`, `semantic`, and `working` tables reside. Must be non-empty.
`embedding`	`EmbeddingConfig`	Embedding endpoint configuration used when creating and querying Vector Search indexes. Defaults to `EmbeddingConfig()`.

`fqn(table)` ¶

Return the fully-qualified Unity Catalog name for a table.

Parameters:

Name	Type	Description	Default
`table`	`str`	Unqualified table name (e.g. `"episodic"`).	required

Returns:

Type	Description
`str`	Three-part identifier `<catalog>.<schema_name>.<table>` suitable for use in SQL statements and Vector Search index names.
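As a rough illustration of the three-part name, the sketch below uses a frozen standard-library dataclass in place of the pydantic model. `MemoryConfigSketch` is a hypothetical name, and whether the real `fqn` quotes or escapes identifiers is not stated here, so the sketch simply joins the parts with dots.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryConfigSketch:
    """Hypothetical stand-in for MemoryConfig (the real class is a BaseModel)."""

    catalog: str
    schema_name: str

    def fqn(self, table: str) -> str:
        # <catalog>.<schema_name>.<table>, as described in the Returns table.
        return f"{self.catalog}.{self.schema_name}.{table}"


cfg = MemoryConfigSketch(catalog="my_catalog", schema_name="agent_memory")
episodic_table = cfg.fqn("episodic")
```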

EmbeddingConfig¶

`lakehouse_memory.config.EmbeddingConfig` ¶

Bases: BaseModel

Configuration for the embedding model used by Databricks Vector Search.

Specifies which Foundation Model API endpoint produces the embeddings that back the episodic and semantic vector indexes, along with the expected output dimensionality.

Attributes:

Name	Type	Description
`endpoint_name`	`str`	Databricks Foundation Model API endpoint name that generates embeddings. Must match the endpoint used when the Vector Search index was created. Defaults to `"databricks-gte-large-en"`.
`dimensions`	`int`	Dimensionality of the embedding vectors produced by endpoint_name. Must be positive. Defaults to `1024`.
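The defaults and the positivity constraint can be pictured with a small stand-in. `EmbeddingConfigSketch` is hypothetical and uses a standard-library dataclass; the real class is a pydantic model, so the exception it raises on invalid input may differ from the `ValueError` used here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingConfigSketch:
    """Hypothetical stand-in mirroring the documented defaults."""

    endpoint_name: str = "databricks-gte-large-en"
    dimensions: int = 1024

    def __post_init__(self) -> None:
        # The attribute table states that dimensions must be positive.
        if self.dimensions <= 0:
            raise ValueError("dimensions must be positive")


default_embedding = EmbeddingConfigSketch()
```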

Scope¶

`lakehouse_memory.scope.Scope` `dataclass` ¶

Identity scope that constrains memory reads and tags memory writes.

A Scope represents a specific combination of identity dimensions: user_id, session_id, and/or agent_id. Any subset of these may be set; unset fields act as wildcards — they are absent from SQL WHERE clauses and vector metadata filters, so they match all values.

Scope is the single source of truth for scope-related SQL and vector filter construction. Every store applies these filters automatically to every read operation and includes all set fields as columns on every write.

Instances are frozen (immutable). Use merge to derive a new Scope with some fields overridden.

Attributes:

Name	Type	Description
`user_id`	`str \| None`	Identifies the end-user whose memory is being accessed.
`session_id`	`str \| None`	Identifies the conversation session.
`agent_id`	`str \| None`	Identifies the agent (or agent variant) operating on memory.

`merge(other)` ¶

Return a new Scope with other's set fields overriding self's.

Fields that are None on other are inherited from self unchanged. This allows incremental narrowing of scope without losing previously set dimensions.

Parameters:

Name	Type	Description	Default
`other`	`Scope`	A `Scope` whose non-`None` fields will override the corresponding fields on `self`.	required

Returns:

Type	Description
`Scope`	A new frozen `Scope` instance with the merged field values.
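The override-or-inherit rule can be sketched with a frozen dataclass. `ScopeSketch` is a hypothetical stand-in written from the description above, not the library's own code.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ScopeSketch:
    """Hypothetical stand-in for Scope with the three documented fields."""

    user_id: str | None = None
    session_id: str | None = None
    agent_id: str | None = None

    def merge(self, other: "ScopeSketch") -> "ScopeSketch":
        # Take other's value when it is set, otherwise keep self's value.
        merged = {
            f.name: getattr(other, f.name)
            if getattr(other, f.name) is not None
            else getattr(self, f.name)
            for f in fields(self)
        }
        return ScopeSketch(**merged)


base = ScopeSketch(user_id="u1", agent_id="planner")
narrowed = base.merge(ScopeSketch(session_id="s1", agent_id="critic"))
```

Here `narrowed` keeps `user_id` from `base`, gains `session_id`, and has its `agent_id` overridden, while `base` itself is unchanged.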

`to_metadata_filter()` ¶

Build a metadata filter dict for Databricks Vector Search queries.

Returns:

Type	Description
`dict[str, str]`	A `{field_name: value}` dict containing only the fields that are set on this scope. An empty dict is returned when no fields are set (i.e., no filtering is applied).
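A minimal sketch of the filter construction, assuming the three scope fields are passed as keyword arguments. `metadata_filter_sketch` is a hypothetical helper that follows the documented return shape, not the method's actual body.

```python
from __future__ import annotations


def metadata_filter_sketch(
    user_id: str | None = None,
    session_id: str | None = None,
    agent_id: str | None = None,
) -> dict[str, str]:
    """Keep only the fields that are set; unset fields act as wildcards."""
    candidates = {"user_id": user_id, "session_id": session_id, "agent_id": agent_id}
    return {name: value for name, value in candidates.items() if value is not None}


session_filter = metadata_filter_sketch(user_id="u1", session_id="s1")
```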

`to_where_clause()` ¶

Build a SQL WHERE-clause fragment and a bound-parameter map.

Clauses are AND-joined in stable alphabetical field order so that the generated SQL is deterministic across Python versions and runtimes.

Returns:

Type	Description
`tuple[str, dict[str, str]]`	A 2-tuple `(clause, params)`, where `clause` is a SQL fragment such as `"agent_id = :agent_id AND user_id = :user_id"` and `params` is the corresponding `{name: value}` dict for parameterised queries. Both are empty (`""` and `{}`) when no fields are set.
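The alphabetical, AND-joined construction can be sketched as follows. `where_clause_sketch` is a hypothetical helper built from the description above; the `:name` placeholder style is taken from the example fragment in the Returns table.

```python
from __future__ import annotations


def where_clause_sketch(
    user_id: str | None = None,
    session_id: str | None = None,
    agent_id: str | None = None,
) -> tuple[str, dict[str, str]]:
    """Build a deterministic WHERE fragment plus its bound parameters."""
    candidates = {"user_id": user_id, "session_id": session_id, "agent_id": agent_id}
    # Sort by field name so the generated SQL is stable.
    set_fields = sorted((name, value) for name, value in candidates.items() if value is not None)
    clause = " AND ".join(f"{name} = :{name}" for name, _ in set_fields)
    return clause, dict(set_fields)


clause, params = where_clause_sketch(user_id="u1", agent_id="a1")
```

Values travel only in `params`, never in the SQL text, which is what makes the fragment safe to use in a parameterised query.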

LangChain Adapters¶

LakehouseChatHistory¶

`lakehouse_memory.adapters.langchain.LakehouseChatHistory` ¶

Bases: BaseChatMessageHistory

BaseChatMessageHistory backed by the episodic memory store.

Bridges Memory.episodic to the LangChain BaseChatMessageHistory interface so that LakehouseChatHistory can be dropped directly into RunnableWithMessageHistory.

Each chat turn is persisted as an episodic event with event_type="chat_message" and payload={"role": "human"|"ai", "content": ...}.
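The event shape for one chat turn might look like the sketch below. `chat_turn_to_event` is a hypothetical helper; the `event_type` and `payload` keys follow the description above, while storing the message content in `text` is an assumption based on the `episodic.write(...)` call in the Memory example.

```python
def chat_turn_to_event(role: str, content: str) -> dict:
    """Map one chat turn to keyword arguments for an episodic write."""
    if role not in ("human", "ai"):
        raise ValueError(f"unsupported role: {role!r}")
    return {
        "event_type": "chat_message",
        "payload": {"role": role, "content": content},
        "text": content,  # assumption: the message text is also the indexed text
    }


event = chat_turn_to_event("human", "Hello")
```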

Scope is inherited from the Memory instance supplied at construction time. To isolate history per session, call memory.with_scope(session_id="new-session-id") before constructing this object.

Note

clear() is an intentional no-op. Episodic memory is append-only by design; deleting history is not supported. To start a fresh conversation, change the session_id via memory.with_scope(session_id=...).

`__init__(memory, limit=100)` ¶

Initialise a LakehouseChatHistory.

Parameters:

Name	Type	Description	Default
`memory`	`Memory`	A `Memory` instance (typically already scoped to a session via `memory.with_scope(session_id=...)`).	required
`limit`	`int`	Maximum number of most-recent messages to return when `messages` is accessed. Defaults to `100`.	`100`

`clear()` ¶

No-op: episodic memory is append-only and does not support deletion.

For a fresh conversation, derive a new scoped instance via memory.with_scope(session_id="<new-session-id>") and pass it to a new LakehouseChatHistory.

LakehouseSemanticRetriever¶

`lakehouse_memory.adapters.langchain.LakehouseSemanticRetriever` ¶

Bases: BaseRetriever

BaseRetriever backed by the semantic memory store.

Bridges Memory.semantic to the LangChain BaseRetriever interface so that LakehouseSemanticRetriever can be dropped into any retrieval chain or used with create_retrieval_chain.

Each retrieved fact becomes a Document whose page_content is the fact text; all other fact columns (source, scope fields, timestamps, etc.) are placed in metadata (text is excluded since it is promoted to page_content).
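The split between `page_content` and `metadata` can be sketched on a plain dict. `fact_to_document_fields` is a hypothetical helper, and the column names in the sample row (`source`, `created_at`) are illustrative assumptions rather than the actual table schema.

```python
from __future__ import annotations


def fact_to_document_fields(fact: dict) -> tuple[str, dict]:
    """Return (page_content, metadata) for one fact row."""
    # Every column except "text" goes to metadata; "text" becomes page_content.
    metadata = {key: value for key, value in fact.items() if key != "text"}
    return fact["text"], metadata


row = {
    "text": "The user prefers metric units.",
    "source": "chat",  # assumed column name
    "user_id": "u1",
    "created_at": "2024-01-01T00:00:00Z",  # assumed column name
}
page_content, metadata = fact_to_document_fields(row)
```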

Scope is inherited from the Memory instance supplied at construction time.

Attributes:

Name	Type	Description
`memory`	`Memory`	The `Memory` instance whose semantic store is queried. Scope filtering (user / session / agent) is applied automatically from `memory.scope`.
`k`	`int`	Number of semantically-similar facts to return per query. Defaults to `5`.
