---
title: Simple Composable Memory
 | Developer Documentation
---

**NOTE:** This example of memory is deprecated in favor of the newer and more flexible `Memory` class. See the [latest docs](https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/memory/).

In this notebook, we demonstrate how to inject multiple memory sources into an agent. Specifically, we use the `SimpleComposableMemory` which is comprised of a `primary_memory` as well as potentially several secondary memory sources (stored in `secondary_memory_sources`). The main difference is that `primary_memory` will be used as the main chat buffer for the agent, where as any retrieved messages from `secondary_memory_sources` will be injected to the system prompt message only.

Multiple memory sources may be of use for example in situations where you have a longer-term memory such as `VectorMemory` that you want to use in addition to the default `ChatMemoryBuffer`. What you’ll see in this notebook is that with a `SimpleComposableMemory` you’ll be able to effectively “load” the desired messages from long-term memory into the main memory (i.e. the `ChatMemoryBuffer`).

## How `SimpleComposableMemory` Works?

We begin with the basic usage of the `SimpleComposableMemory`. Here we construct a `VectorMemory` as well as a default `ChatMemoryBuffer`. The `VectorMemory` will be our secondary memory source, whereas the `ChatMemoryBuffer` will be the main or primary one. To instantiate a `SimpleComposableMemory` object, we need to supply a `primary_memory` and (optionally) a list of `secondary_memory_sources`.

![SimpleComposableMemoryIllustration](https://d3ddy8balm3goa.cloudfront.net/llamaindex/simple-composable-memory.excalidraw.svg)

```
import os


os.environ["OPENAI_API_KEY"] = "sk-..."
```

```
from llama_index.core.memory import (
    VectorMemory,
    SimpleComposableMemory,
    ChatMemoryBuffer,
)
from llama_index.core.llms import ChatMessage
from llama_index.embeddings.openai import OpenAIEmbedding


vector_memory = VectorMemory.from_defaults(
    vector_store=None,  # leave as None to use default in-memory vector store
    embed_model=OpenAIEmbedding(),
    retriever_kwargs={"similarity_top_k": 1},
)


# let's set some initial messages in our secondary vector memory
msgs = [
    ChatMessage.from_str("You are a SOMEWHAT helpful assistant.", "system"),
    ChatMessage.from_str("Bob likes burgers.", "user"),
    ChatMessage.from_str("Indeed, Bob likes apples.", "assistant"),
    ChatMessage.from_str("Alice likes apples.", "user"),
]
vector_memory.set(msgs)


chat_memory_buffer = ChatMemoryBuffer.from_defaults()


composable_memory = SimpleComposableMemory.from_defaults(
    primary_memory=chat_memory_buffer,
    secondary_memory_sources=[vector_memory],
)
```

```
composable_memory.primary_memory
```

```
ChatMemoryBuffer(chat_store=SimpleChatStore(store={}), chat_store_key='chat_history', token_limit=3000, tokenizer_fn=functools.partial(<bound method Encoding.encode of <Encoding 'cl100k_base'>>, allowed_special='all'))
```

```
composable_memory.secondary_memory_sources
```

```
[VectorMemory(vector_index=<llama_index.core.indices.vector_store.base.VectorStoreIndex object at 0x11a2d24b0>, retriever_kwargs={'similarity_top_k': 1}, batch_by_user_message=True, cur_batch_textnode=TextNode(id_='97f800fe-1988-44d8-a6dc-7a07bfd30f8e', embedding=None, metadata={'sub_dicts': [{'role': <MessageRole.USER: 'user'>, 'additional_kwargs': {}, 'blocks': [{'block_type': 'text', 'text': 'Alice likes apples.'}], 'content': 'Alice likes apples.'}]}, excluded_embed_metadata_keys=['sub_dicts'], excluded_llm_metadata_keys=['sub_dicts'], relationships={}, metadata_template='{key}: {value}', metadata_separator='\n', text='Alice likes apples.', mimetype='text/plain', start_char_idx=None, end_char_idx=None, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'))]
```

### `put()` messages into memory

Since `SimpleComposableMemory` is itself a subclass of `BaseMemory`, we add messages to it in the same way as we do for other memory modules. Note that for `SimpleComposableMemory`, invoking `.put()` effectively calls `.put()` on all memory sources. In other words, the message gets added to `primary` and `secondary` sources.

```
msgs = [
    ChatMessage.from_str("You are a REALLY helpful assistant.", "system"),
    ChatMessage.from_str("Jerry likes juice.", "user"),
]
```

```
# load into all memory sources modules"
for m in msgs:
    composable_memory.put(m)
```

### `get()` messages from memory

When `.get()` is invoked, we similarly execute all of the `.get()` methods of `primary` memory as well as all of the `secondary` sources. This leaves us with sequence of lists of messages that we have to must “compose” into a sensible single set of messages (to pass downstream to our agents). Special care must be applied here in general to ensure that the final sequence of messages are both sensible and conform to the chat APIs of the LLM provider.

For `SimpleComposableMemory`, we **inject the messages from the `secondary` sources in the system message of the `primary` memory**. The rest of the message history of the `primary` source is left intact, and this composition is what is ultimately returned.

```
msgs = composable_memory.get("What does Bob like?")
msgs
```

```
[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text='You are a REALLY helpful assistant.\n\nBelow are a set of relevant dialogues retrieved from potentially several memory sources:\n\n=====Relevant messages from memory source 1=====\n\n\tUSER: Bob likes burgers.\n\tASSISTANT: Indeed, Bob likes apples.\n\n=====End of relevant messages from memory source 1======\n\nThis is the end of the retrieved message dialogues.')]),
 ChatMessage(role=<MessageRole.USER: 'user'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text='Jerry likes juice.')])]
```

```
# see the memory injected into the system message of the primary memory
print(msgs[0])
```

```
system: You are a REALLY helpful assistant.


Below are a set of relevant dialogues retrieved from potentially several memory sources:


=====Relevant messages from memory source 1=====


  USER: Bob likes burgers.
  ASSISTANT: Indeed, Bob likes apples.


=====End of relevant messages from memory source 1======


This is the end of the retrieved message dialogues.
```

### Successive calls to `get()`

Successive calls of `get()` will simply replace the loaded `secondary` memory messages in the system prompt.

```
msgs = composable_memory.get("What does Alice like?")
msgs
```

```
[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text='You are a REALLY helpful assistant.\n\nBelow are a set of relevant dialogues retrieved from potentially several memory sources:\n\n=====Relevant messages from memory source 1=====\n\n\tUSER: Alice likes apples.\n\n=====End of relevant messages from memory source 1======\n\nThis is the end of the retrieved message dialogues.')]),
 ChatMessage(role=<MessageRole.USER: 'user'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text='Jerry likes juice.')])]
```

```
# see the memory injected into the system message of the primary memory
print(msgs[0])
```

```
system: You are a REALLY helpful assistant.


Below are a set of relevant dialogues retrieved from potentially several memory sources:


=====Relevant messages from memory source 1=====


  USER: Alice likes apples.


=====End of relevant messages from memory source 1======


This is the end of the retrieved message dialogues.
```

### What if `get()` retrieves `secondary` messages that already exist in `primary` memory?

In the event that messages retrieved from `secondary` memory already exist in `primary` memory, then these rather redundant secondary messages will not get added to the system message. In the below example, the message “Jerry likes juice.” was `put` into all memory sources, so the system message is not altered.

```
msgs = composable_memory.get("What does Jerry like?")
msgs
```

```
[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text='You are a REALLY helpful assistant.')]),
 ChatMessage(role=<MessageRole.USER: 'user'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text='Jerry likes juice.')])]
```

### How to `reset` memory

Similar to the other methods `put()` and `get()`, calling `reset()` will execute `reset()` on both the `primary` and `secondary` memory sources. If you want to reset only the `primary` then you should call the `reset()` method only from it.

#### `reset()` only primary memory

```
composable_memory.primary_memory.reset()
```

```
composable_memory.primary_memory.get()
```

```
[]
```

```
composable_memory.secondary_memory_sources[0].get("What does Alice like?")
```

```
[ChatMessage(role=<MessageRole.USER: 'user'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text='Alice likes apples.')])]
```

#### `reset()` all memory sources

```
composable_memory.reset()
```

```
composable_memory.primary_memory.get()
```

```
[]
```

## Use `SimpleComposableMemory` With An Agent

Here we will use a `SimpleComposableMemory` with an agent and demonstrate how a secondary, long-term memory source can be used to use messages from on agent conversation as part of another conversation with another agent session.

```
from llama_index.llms.openai import OpenAI
from llama_index.core.tools import FunctionTool
from llama_index.core.agent.workflow import FunctionAgent
```

### Define our memory modules

```
vector_memory = VectorMemory.from_defaults(
    vector_store=None,  # leave as None to use default in-memory vector store
    embed_model=OpenAIEmbedding(),
    retriever_kwargs={"similarity_top_k": 2},
)


chat_memory_buffer = ChatMemoryBuffer.from_defaults()


composable_memory = SimpleComposableMemory.from_defaults(
    primary_memory=chat_memory_buffer,
    secondary_memory_sources=[vector_memory],
)
```

### Define our Agent

```
def multiply(a: int, b: int) -> int:
    """Multiply two integers and returns the result integer"""
    return a * b


def mystery(a: int, b: int) -> int:
    """Mystery function on two numbers"""
    return a**2 - b**2


multiply_tool = FunctionTool.from_defaults(fn=multiply)
mystery_tool = FunctionTool.from_defaults(fn=mystery)
```

```
llm = OpenAI(model="gpt-4.1-mini")
agent = FunctionAgent(
    tools=[multiply_tool, mystery_tool],
    llm=llm,
)
```

### Execute some function calls

When `.chat()` is invoked, the messages are put into the composable memory, which we understand from the previous section implies that all the messages are put in both `primary` and `secondary` sources.

```
response = await agent.run(
    "What is the mystery function on 5 and 6?", memory=composable_memory
)
print(str(response))
```

```
The mystery function on 5 and 6 returns -11.
```

```
response = await agent.run(
    "What happens if you multiply 2 and 3?", memory=composable_memory
)
print(str(response))
```

```
If you multiply 2 and 3, the result is 6.
```

### New Agent Sessions

Now that we’ve added the messages to our `vector_memory`, we can see the effect of having this memory be used with a new agent session versus when it is used. Specifically, we ask the new agents to “recall” the outputs of the function calls, rather than re-computing.

#### An Agent without our past memory

```
response = await agent.run(
    "What was the output of the mystery function on 5 and 6 again? Don't recompute."
    # memory=composable_memory
)


print(str(response))
```

```
I don't have the previous output of the mystery function on 5 and 6 stored. If you want, I can recompute it for you. Would you like me to do that?
```

#### An Agent with our past memory

We see that the agent without access to the our past memory cannot complete the task. With this next agent we will indeed pass in our previous long-term memory (i.e., `vector_memory`). Note that we even use a fresh `ChatMemoryBuffer` which means there is no `chat_history` with this agent. Nonetheless, it will be able to retrieve from our long-term memory to get the past dialogue it needs.

```
response = await agent.run(
    "What was the output of the mystery function on 5 and 6 again? Don't recompute.",
    memory=composable_memory,
)
print(str(response))
```

```
The output of the mystery function on 5 and 6 is -11.
```

```
response = await agent.run(
    "What was the output of the multiply function on 2 and 3 again? Don't recompute.",
    memory=composable_memory,
)
print(str(response))
```

```
The output of the multiply function on 2 and 3 was 6.
```

### What happens under the hood with `.run(user_input)`

Under the hood, `.run(user_input)` call effectively will call the memory’s `.get()` method with `user_input` as the argument. As we learned in the previous section, this will ultimately return a composition of the `primary` and all of the `secondary` memory sources. These composed messages are what is being passed to the LLM’s chat API as the chat history.