What You’ll Learn: Memory gives your LangChain applications the ability to remember past interactions — previous messages in a conversation, learned user preferences, or accumulated context across sessions. Without memory, every request starts from zero. This article covers the memory spectrum: from simple buffer memory (store everything) to summary memory (compress old context) to persistent storage (survive restarts). Think of it like the difference between a goldfish (no memory) and a personal assistant who knows your history.
1. Buffer Memory
1.1 ConversationBufferMemory
from langchain_openai import ChatOpenAI
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain
model = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)
# Stores the full conversation history
memory = ConversationBufferMemory(return_messages=True)
chain = ConversationChain(llm=model, memory=memory, verbose=True)
# Multi-turn conversation
response1 = chain.invoke({"input": "Hi! I'm working on a RAG system."})
print(response1["response"])
response2 = chain.invoke({"input": "What embedding model would you recommend?"})
print(response2["response"])
# Memory contains full history
print(memory.load_memory_variables({}))
1.2 ConversationBufferWindowMemory
from langchain_openai import ChatOpenAI
from langchain.memory import ConversationBufferWindowMemory
from langchain.chains import ConversationChain
model = ChatOpenAI(model="gpt-4o-mini")
# Keep only the last k exchanges
memory = ConversationBufferWindowMemory(k=5, return_messages=True)
chain = ConversationChain(llm=model, memory=memory)
# Once more than k exchanges have been saved, the oldest ones are dropped
for i in range(8):
    chain.invoke({"input": f"Message number {i+1}"})
# Only the last 5 exchanges remain; with return_messages=True that is
# 10 message objects (5 human + 5 AI)
history = memory.load_memory_variables({})
print(f"Messages in memory: {len(history['history'])}")
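To make the windowing behavior concrete without calling a model, here is a minimal plain-Python sketch of the same idea. `WindowBuffer` is a hypothetical class written for illustration, not part of LangChain; it assumes one exchange means one human message plus one AI reply.

```python
from collections import deque


class WindowBuffer:
    """Hypothetical illustration: keep only the last k exchanges."""

    def __init__(self, k: int) -> None:
        # A deque with maxlen discards the oldest item once it is full
        self._exchanges = deque(maxlen=k)

    def save(self, user_text: str, ai_text: str) -> None:
        self._exchanges.append((user_text, ai_text))

    def messages(self) -> list:
        """Flatten exchanges into (role, text) pairs, oldest first."""
        flat = []
        for user_text, ai_text in self._exchanges:
            flat.append(("human", user_text))
            flat.append(("ai", ai_text))
        return flat


window = WindowBuffer(k=5)
for i in range(8):
    window.save(f"Message number {i + 1}", f"Reply number {i + 1}")

remaining = window.messages()
print(len(remaining))  # 10: five exchanges, two messages each
print(remaining[0])    # ('human', 'Message number 4')
```

Exchanges 1 through 3 are gone for good, which is the trade-off of a window: bounded prompt size at the cost of forgetting anything older than k turns.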
Real-World Application
Personalized Shopping Assistant
An e-commerce platform uses LangChain memory to create personalized shopping experiences: the assistant remembers past purchases, size preferences, style tastes, and budget constraints across sessions. After 3 interactions, recommendations become significantly more relevant. Result: 35% increase in conversion rate for returning users.
Persistent Memory, Personalization, E-Commerce
2. Summary Memory
2.1 ConversationSummaryMemory
from langchain_openai import ChatOpenAI
from langchain.memory import ConversationSummaryMemory
from langchain.chains import ConversationChain
model = ChatOpenAI(model="gpt-4o-mini")
# Maintains a running summary instead of raw messages
memory = ConversationSummaryMemory(llm=model, return_messages=True)
chain = ConversationChain(llm=model, memory=memory)
chain.invoke({"input": "I'm building a customer support chatbot for an e-commerce platform."})
chain.invoke({"input": "We handle about 10,000 queries per day across 3 languages."})
chain.invoke({"input": "Main issues are order tracking, returns, and product questions."})
# Memory contains a summary, not raw messages
summary = memory.load_memory_variables({})
print("Summary:", summary)
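The running-summary idea can be sketched independently of any model. In the sketch below, `RunningSummary` and `stub_summarizer` are hypothetical names; the stub stands in for the LLM call and simply keeps the first six words of each human turn, so it shows the data flow rather than real summarization quality.

```python
from typing import Callable

# A summarizer maps (previous_summary, new_lines) to an updated summary
Summarizer = Callable[[str, str], str]


class RunningSummary:
    """Hypothetical illustration: state is a single summary string."""

    def __init__(self, summarize: Summarizer) -> None:
        self._summarize = summarize
        self.summary = ""

    def save(self, user_text: str, ai_text: str) -> None:
        new_lines = f"Human: {user_text}\nAI: {ai_text}"
        # Each saved exchange triggers one summarizer call
        self.summary = self._summarize(self.summary, new_lines)


def stub_summarizer(previous: str, new_lines: str) -> str:
    """Stand-in for an LLM call: keep the first 6 words of the human turn."""
    human_line = new_lines.splitlines()[0].removeprefix("Human: ")
    gist = " ".join(human_line.split()[:6])
    return f"{previous} | {gist}".strip(" |")


summary_memory = RunningSummary(stub_summarizer)
summary_memory.save(
    "I'm building a customer support chatbot for an e-commerce platform.",
    "Sounds good.",
)
summary_memory.save(
    "We handle about 10,000 queries per day across 3 languages.",
    "Noted.",
)
print(summary_memory.summary)
```

The point to notice is that the stored state is one string whose size depends on the summarizer, not on the number of turns, and that every saved exchange costs an extra summarization call.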
2.2 ConversationSummaryBufferMemory
from langchain_openai import ChatOpenAI
from langchain.memory import ConversationSummaryBufferMemory
from langchain.chains import ConversationChain
model = ChatOpenAI(model="gpt-4o-mini")
# Hybrid: keeps recent messages raw, summarizes older ones
memory = ConversationSummaryBufferMemory(
    llm=model,
    max_token_limit=650,  # Summarize once the raw buffer exceeds this many tokens
    return_messages=True,
)
chain = ConversationChain(llm=model, memory=memory)
# Recent messages stay verbatim; older ones are folded into the summary
# once the buffer exceeds max_token_limit
for msg in ["Hi, I need help with deployment", "We use AWS ECS", "Running 3 services", "Need auto-scaling", "Budget is $500/month"]:
    chain.invoke({"input": msg})
print(memory.load_memory_variables({}))
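The hybrid policy can be sketched in plain Python to show when messages move from the raw buffer into the summary. Everything here is an assumption made for illustration: `SummaryBuffer` is a hypothetical class, `count_tokens` is a crude whitespace count rather than a real tokenizer, and `stub_summarizer` stands in for the LLM call by joining the pruned messages.

```python
from typing import Callable


def count_tokens(text: str) -> int:
    """Crude token estimate (whitespace split); not a real tokenizer."""
    return len(text.split())


class SummaryBuffer:
    """Hypothetical illustration: raw recent messages plus a summary of older ones."""

    def __init__(self, summarize: Callable[[str, list], str], max_token_limit: int) -> None:
        self._summarize = summarize
        self.max_token_limit = max_token_limit
        self.summary = ""
        self.buffer = []

    def save(self, message: str) -> None:
        self.buffer.append(message)
        pruned = []
        # Pop the oldest messages until the raw buffer fits the budget
        while self.buffer and sum(count_tokens(m) for m in self.buffer) > self.max_token_limit:
            pruned.append(self.buffer.pop(0))
        if pruned:
            self.summary = self._summarize(self.summary, pruned)


def stub_summarizer(previous: str, pruned: list) -> str:
    """Stand-in for an LLM summarization call: concatenate what was pruned."""
    return "; ".join(([previous] if previous else []) + pruned)


hybrid = SummaryBuffer(stub_summarizer, max_token_limit=10)
for msg in ["Hi, I need help with deployment", "We use AWS ECS",
            "Running 3 services", "Need auto-scaling", "Budget is $500/month"]:
    hybrid.save(msg)

print(hybrid.summary)  # Hi, I need help with deployment; We use AWS ECS
print(hybrid.buffer)   # ['Running 3 services', 'Need auto-scaling', 'Budget is $500/month']
```

With a budget of 10 whitespace tokens, the two oldest messages are pruned into the summary while the three most recent stay verbatim. A small budget is used here only so the pruning is visible with five short messages.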
Summary & Next Steps
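This completes the LangChain SDK implementation of the concepts covered in Part 6: Memory & Context Engineering. Buffer memory keeps the full history, window memory keeps only the last k exchanges, summary memory keeps a running summary, and summary-buffer memory keeps recent messages verbatim while summarizing older ones.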
Try It Yourself: Build a ‘personal journaling assistant’ with 3 memory types: (1) ConversationBufferMemory for the current session, (2) ConversationSummaryMemory that compresses after 10 messages, (3) persistent memory using Redis that survives restarts. Have a 15-message conversation, then restart the app and verify it remembers key facts from the previous session.
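As a starting point for the persistence step, the sketch below shows the restart-survival idea with a JSON file instead of Redis, so it runs with the standard library alone. `FileBackedHistory` and the file name are hypothetical; a Redis-backed version would replace the file reads and writes with calls to a Redis client and would need a running server.

```python
import json
import tempfile
from pathlib import Path


class FileBackedHistory:
    """Hypothetical file-backed message store; a stand-in for Redis in this sketch."""

    def __init__(self, path: Path) -> None:
        self.path = path
        # Reload whatever an earlier process wrote, if anything
        if path.exists():
            self.messages = json.loads(path.read_text(encoding="utf-8"))
        else:
            self.messages = []

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})
        # Write through on every message so a restart sees all saved turns
        self.path.write_text(json.dumps(self.messages), encoding="utf-8")


store_path = Path(tempfile.mkdtemp()) / "journal_session.json"

first_run = FileBackedHistory(store_path)
first_run.add("human", "Today I finished the deployment checklist.")
first_run.add("ai", "Nice work. Anything left for tomorrow?")

# Simulate a restart: discard the object and rebuild it from the same path
del first_run
second_run = FileBackedHistory(store_path)
print(len(second_run.messages))
print(second_run.messages[0]["content"])
```

Writing on every message is the simplest policy and is fine for a journaling exercise; a higher-traffic application would more likely batch writes or rely on the store's own append operations.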
Related Articles
Foundation: Part 6: Memory & Context Engineering
The framework-agnostic concepts behind this article.
LC Part 2: RAG & Retrievers
Previous article in the LangChain SDK Track.