How to Build a Universal Long-Term Memory Layer for AI Agents Using Mem0 and OpenAI

In this tutorial, we build a universal long-term memory layer for AI agents using Mem0, OpenAI models, and ChromaDB. We design a system that can extract structured memories from natural conversations, store them semantically, retrieve them intelligently, and integrate them directly into personalized agent responses. We move beyond simple chat history and implement persistent, user-scoped memory with full CRUD control, semantic search, multi-user isolation, and custom configuration. Finally, we construct a production-ready memory-augmented agent architecture that demonstrates how modern AI systems can reason with contextual continuity rather than operate statelessly.

!pip install mem0ai openai rich chromadb -q

import os
import getpass
from datetime import datetime

print(“=” * 60)
print(“🔐 MEM0 Advanced Tutorial — API Key Setup”)
print(“=” * 60)

OPENAI_API_KEY = getpass.getpass(“Enter your OpenAI API key: “)
os.environ[“OPENAI_API_KEY”] = OPENAI_API_KEY

print(“\n✅ API key set!\n”)

from openai import OpenAI
from mem0 import Memory
from rich.console import Console
from rich.panel import Panel
from rich.table import Table
from rich.markdown import Markdown
from rich import print as rprint
import json

console = Console()
openai_client = OpenAI()

console.rule(“[bold cyan]MODULE 1: Basic Memory Setup[/bold cyan]”)

memory = Memory()

print(Panel(
“[green]✓ Memory instance created with default config[/green]\n”
” • LLM: gpt-4.1-nano (OpenAI)\n”
” • Vector Store: ChromaDB (local)\n”
” • Embedder: text-embedding-3-small”,
title=”Memory Config”, border_style=”cyan”
))

We install all required dependencies and securely configure our OpenAI API key. We initialize the Mem0 Memory instance along with the OpenAI client and Rich console utilities. We establish the foundation of our long-term memory system with the default configuration powered by ChromaDB and OpenAI embeddings.

console.rule(“[bold cyan]MODULE 2: Adding & Retrieving Memories[/bold cyan]”)

USER_ID = “alice_tutorial”

print(“\n📝 Adding memories for user:”, USER_ID)

conversations = [
[
{“role”: “user”, “content”: “Hi! I’m Alice. I’m a software engineer who loves Python and machine learning.”},
{“role”: “assistant”, “content”: “Nice to meet you Alice! Python and ML are great areas to be in.”}
],
[
{“role”: “user”, “content”: “I prefer dark mode in all my IDEs and I use VS Code as my main editor.”},
{“role”: “assistant”, “content”: “Good to know! VS Code with dark mode is a popular combo.”}
],
[
{“role”: “user”, “content”: “I’m currently building a RAG pipeline for my company’s internal docs. It’s for a fintech startup.”},
{“role”: “assistant”, “content”: “That’s exciting! RAG pipelines are really valuable for enterprise use cases.”}
],
[
{“role”: “user”, “content”: “I have a dog named Max and I enjoy hiking on weekends.”},
{“role”: “assistant”, “content”: “Max sounds lovely! Hiking is a great way to recharge.”}
],
]

results = []
for i, convo in enumerate(conversations):
result = memory.add(convo, user_id=USER_ID)
extracted = result.get(“results”, [])
for mem in extracted:
results.append(mem)
print(f” Conversation {i+1}: {len(extracted)} memory(ies) extracted”)

print(f”\n✅ Total memories stored: {len(results)}”)

We simulate realistic multi-turn conversations and store them using Mem0’s automatic memory extraction pipeline. We add structured conversational data for a specific user and allow the LLM to extract meaningful long-term facts. We verify how many memories are created, confirming that semantic knowledge is successfully persisted.

console.rule(“[bold cyan]MODULE 3: Semantic Search[/bold cyan]”)

queries = [
“What programming languages does the user prefer?”,
“What is Alice working on professionally?”,
“What are Alice’s hobbies?”,
“What tools and IDE does Alice use?”,
]

for query in queries:
search_results = memory.search(query=query, user_id=USER_ID, limit=2)
table = Table(title=f”🔍 Query: {query}”, show_lines=True)
table.add_column(“Memory”, style=”white”, max_width=60)
table.add_column(“Score”, style=”green”, justify=”center”)

for r in search_results.get(“results”, []):
score = r.get(“score”, “N/A”)
score_str = f”{score:.4f}” if isinstance(score, float) else str(score)
table.add_row(r[“memory”], score_str)

console.print(table)
print()

console.rule(“[bold cyan]MODULE 4: CRUD Operations[/bold cyan]”)

all_memories = memory.get_all(user_id=USER_ID)
memories_list = all_memories.get(“results”, [])

print(f”\n📚 All memories for ‘{USER_ID}’:”)
for i, mem in enumerate(memories_list):
print(f” [{i+1}] ID: {mem[‘id’][:8]}… → {mem[‘memory’]}”)

if memories_list:
first_id = memories_list[0][“id”]
original_text = memories_list[0][“memory”]

print(f”\n✏️ Updating memory: ‘{original_text}'”)
memory.update(memory_id=first_id, data=original_text + ” (confirmed)”)

updated = memory.get(memory_id=first_id)
print(f” After update: ‘{updated[‘memory’]}'”)

We perform semantic search queries to retrieve relevant memories using natural language. We demonstrate how Mem0 ranks stored memories by similarity score and returns the most contextually aligned information. We also perform CRUD operations by listing, updating, and validating stored memory entries.

console.rule(“[bold cyan]MODULE 5: Memory-Augmented Chat[/bold cyan]”)

def chat_with_memory(user_message: str, user_id: str, session_history: list) -> str:

relevant = memory.search(query=user_message, user_id=user_id, limit=5)
memory_context = “\n”.join(
f”- {r[‘memory’]}” for r in relevant.get(“results”, [])
) or “No relevant memories found.”

system_prompt = f”””You are a highly personalized AI assistant.
You have access to long-term memories about this user.

RELEVANT USER MEMORIES:
{memory_context}

Use these memories to provide context-aware, personalized responses.
Be natural — don’t explicitly announce that you’re using memories.”””

messages = [{“role”: “system”, “content”: system_prompt}]
messages.extend(session_history[-6:])
messages.append({“role”: “user”, “content”: user_message})

response = openai_client.chat.completions.create(
model=”gpt-4.1-nano-2025-04-14″,
messages=messages
)
assistant_response = response.choices[0].message.content

exchange = [
{“role”: “user”, “content”: user_message},
{“role”: “assistant”, “content”: assistant_response}
]
memory.add(exchange, user_id=user_id)

session_history.append({“role”: “user”, “content”: user_message})
session_history.append({“role”: “assistant”, “content”: assistant_response})

return assistant_response

session = []
demo_messages = [
“Can you recommend a good IDE setup for me?”,
“What kind of project am I currently building at work?”,
“Suggest a weekend activity I might enjoy.”,
“What’s a good tech stack for my current project?”,
]

print(“\n🤖 Starting memory-augmented conversation with Alice…\n”)

for msg in demo_messages:
print(Panel(f”[bold yellow]User:[/bold yellow] {msg}”, border_style=”yellow”))
response = chat_with_memory(msg, USER_ID, session)
print(Panel(f”[bold green]Assistant:[/bold green] {response}”, border_style=”green”))
print()

We build a fully memory-augmented chat loop that retrieves relevant memories before generating responses. We dynamically inject personalized context into the system prompt and store each new exchange back into long-term memory. We simulate a multi-turn session to demonstrate contextual continuity and personalization in action.

console.rule(“[bold cyan]MODULE 6: Multi-User Memory Isolation[/bold cyan]”)

USER_BOB = “bob_tutorial”

bob_conversations = [
[
{“role”: “user”, “content”: “I’m Bob, a data scientist specializing in computer vision and PyTorch.”},
{“role”: “assistant”, “content”: “Great to meet you Bob!”}
],
[
{“role”: “user”, “content”: “I prefer Jupyter notebooks over VS Code, and I use Vim keybindings.”},
{“role”: “assistant”, “content”: “Classic setup for data science work!”}
],
]

for convo in bob_conversations:
memory.add(convo, user_id=USER_BOB)

print(“\n🔐 Testing memory isolation between Alice and Bob:\n”)

test_query = “What programming tools does this user prefer?”

alice_results = memory.search(query=test_query, user_id=USER_ID, limit=3)
bob_results = memory.search(query=test_query, user_id=USER_BOB, limit=3)

print(“👩 Alice’s memories:”)
for r in alice_results.get(“results”, []):
print(f” • {r[‘memory’]}”)

print(“\n👨 Bob’s memories:”)
for r in bob_results.get(“results”, []):
print(f” • {r[‘memory’]}”)

We demonstrate user-level memory isolation by introducing a second user with distinct preferences. We store separate conversational data and validate that searches remain scoped to the correct user_id. We confirm that memory namespaces are isolated, ensuring secure multi-user agent deployments.

print(“\n✅ Memory isolation confirmed — users cannot see each other’s data.”)

console.rule(“[bold cyan]MODULE 7: Custom Configuration[/bold cyan]”)

custom_config = {
“llm”: {
“provider”: “openai”,
“config”: {
“model”: “gpt-4.1-nano-2025-04-14”,
“temperature”: 0.1,
“max_tokens”: 2000,
}
},
“embedder”: {
“provider”: “openai”,
“config”: {
“model”: “text-embedding-3-small”,
}
},
“vector_store”: {
“provider”: “chroma”,
“config”: {
“collection_name”: “advanced_tutorial_v2”,
“path”: “/tmp/chroma_advanced”,
}
},
“version”: “v1.1”
}

custom_memory = Memory.from_config(custom_config)

print(Panel(
“[green]✓ Custom memory instance created[/green]\n”
” • LLM: gpt-4.1-nano with temperature=0.1\n”
” • Embedder: text-embedding-3-small\n”
” • Vector Store: ChromaDB at /tmp/chroma_advanced\n”
” • Collection: advanced_tutorial_v2″,
title=”Custom Config Applied”, border_style=”magenta”
))

custom_memory.add(
[{“role”: “user”, “content”: “I’m a researcher studying neural plasticity and brain-computer interfaces.”}],
user_id=”researcher_01″
)

result = custom_memory.search(“What field does this person work in?”, user_id=”researcher_01″, limit=2)
print(“\n🔍 Custom memory search result:”)
for r in result.get(“results”, []):
print(f” • {r[‘memory’]}”)

console.rule(“[bold cyan]MODULE 8: Memory History[/bold cyan]”)

all_alice = memory.get_all(user_id=USER_ID)
alice_memories = all_alice.get(“results”, [])

table = Table(title=f”📋 Full Memory Profile: {USER_ID}”, show_lines=True, width=90)
table.add_column(“#”, style=”dim”, width=3)
table.add_column(“Memory ID”, style=”cyan”, width=12)
table.add_column(“Memory Content”, style=”white”)
table.add_column(“Created At”, style=”yellow”, width=12)

for i, mem in enumerate(alice_memories):
mem_id = mem[“id”][:8] + “…”
created = mem.get(“created_at”, “N/A”)
if created and created != “N/A”:
try:
created = datetime.fromisoformat(created.replace(“Z”, “+00:00”)).strftime(“%m/%d %H:%M”)
except:
created = str(created)[:10]
table.add_row(str(i+1), mem_id, mem[“memory”], created)

console.print(table)

console.rule(“[bold cyan]MODULE 9: Memory Deletion[/bold cyan]”)

all_mems = memory.get_all(user_id=USER_ID).get(“results”, [])
if all_mems:
last_mem = all_mems[-1]
print(f”\n🗑️ Deleting memory: ‘{last_mem[‘memory’]}'”)
memory.delete(memory_id=last_mem[“id”])

updated_count = len(memory.get_all(user_id=USER_ID).get(“results”, []))
print(f”✅ Deleted. Remaining memories for {USER_ID}: {updated_count}”)

console.rule(“[bold cyan]✅ TUTORIAL COMPLETE[/bold cyan]”)

summary = “””
# 🎓 Mem0 Advanced Tutorial Summary

## What You Learned:
1. **Basic Setup** — Instantiate Memory with default & custom configs
2. **Add Memories** — From conversations (auto-extracted by LLM)
3. **Semantic Search** — Retrieve relevant memories by natural language query
4. **CRUD Operations** — Get, Update, Delete individual memories
5. **Memory-Augmented Chat** — Full pipeline: retrieve → respond → store
6. **Multi-User Isolation** — Separate memory namespaces per user_id
7. **Custom Configuration** — Custom LLM, embedder, and vector store
8. **Memory History** — View full memory profiles with timestamps
9. **Cleanup** — Delete specific or all memories

## Key Concepts:
– `memory.add(messages, user_id=…)`
– `memory.search(query, user_id=…)`
– `memory.get_all(user_id=…)`
– `memory.update(memory_id, data)`
– `memory.delete(memory_id)`
– `Memory.from_config(config)`

## Next Steps:
– Swap ChromaDB for Qdrant, Pinecone, or Weaviate
– Use the hosted Mem0 Platform (app.mem0.ai) for production
– Integrate with LangChain, CrewAI, or LangGraph agents
– Add `agent_id` for agent-level memory scoping
“””

console.print(Markdown(summary))

We create a fully custom Mem0 configuration with explicit parameters for the LLM, embedder, and vector store. We test the custom memory instance and explore memory history, timestamps, and structured profiling. Finally, we demonstrate deletion and cleanup operations, completing the full lifecycle management of long-term agent memory.

In conclusion, we implemented a complete memory infrastructure for AI agents using Mem0 as a universal memory abstraction layer. We demonstrated how to add, retrieve, update, delete, isolate, and customize long-term memories while integrating them into a dynamic chat loop. We showed how semantic memory retrieval transforms generic assistants into context-aware systems capable of personalization and continuity across sessions. With this foundation in place, we are now equipped to extend the architecture into multi-agent systems, enterprise-grade deployments, alternative vector databases, and advanced agent frameworks, turning memory into a core capability rather than an afterthought.

Check out the Full Implementation Code and Notebook. Also, feel free to follow us on Twitter and don’t forget to join our 130k+ ML SubReddit and Subscribe to our Newsletter. Wait! are you on telegram? now you can join us on telegram as well.

Need to partner with us for promoting your GitHub Repo OR Hugging Face Page OR Product Release OR Webinar etc.? Connect with us

Source link

How to Build a Universal Long-Term Memory Layer for AI Agents Using Mem0 and OpenAI

OpenAI aligns safety practices with EU AI Act’s GPAI Code

DeepSeek Upgrades DeepSeek-V4-Flash-0731 with Major Agentic and Coding Gains

Daniela Rus receives Bavarian Minister-President’s High-Tech Prize | MIT News

Guardoc Health processes clinical documentation using Amazon Nova models

Corn Closes July with Weakness

OpenAI aligns safety practices with EU AI Act’s GPAI Code

Bitcoin ETFs just bled $265M in a brutal 24 hours, and Ethereum’s supposed rescue is another BlackRock illusion

Dice Rolls Keep Bitcoin Keys Offline, but Not Everyone Will Bother

Anthropic disclosed ‘unauthorized’ cybersecurity incident in the wake of OpenAI hack

Top Insights

Analyst Blasts Strategy After CEO Signals New Priority Beyond Bitcoin

DONT LEARN AI BEFORE WATCHING THIS

How to Build a Universal Long-Term Memory Layer for AI Agents Using Mem0 and OpenAI

Related Posts