Stateful Chatbot Architecture with LangChain
This article is best read alongside the associated GitHub project.
OVERVIEW
Date: March 2026
In this project, my aim is to explore how to build a stateful chatbot using LangChain while keeping the implementation structured and easy to study. Most chatbot demos are stateless: they respond to a message and immediately forget the conversation. In contrast, this project focuses on memory-aware conversational systems.
The implementation evolves step by step. It begins with basic session-scoped chat history, then adds bounded memory to control context growth, followed by LRU-based session management to keep the number of in-memory sessions bounded. Finally, it introduces summary memory so that older parts of a conversation can be compressed into long-term context while preserving recent exchanges.
Please use the GitHub repository linked above to explore the implementation in detail.
BASICS
As mentioned previously, the project is designed as a learning-oriented architecture walkthrough rather than a production deployment.
The core ideas implemented are:
- Session memory
- Each conversation is tied to a session_id
- Messages are stored and retrieved per session
- Bounded memory
- The recent conversation is constrained using a message window
- Token trimming is used to stay within model context limits
- Session management
- Multiple sessions are handled in memory
- An LRU eviction strategy removes the least recently used sessions when capacity is reached
- Summary memory
- Older messages are compressed into a summary
- Recent messages are kept separately for short-term conversational continuity
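As a rough illustration of the first two ideas, session-scoped history with a bounded message window can be sketched in plain Python. The names here (`SessionMemory`, `get_session`) are hypothetical, not LangChain APIs; in the actual project this role is played by LangChain's chat-history utilities:

```python
from collections import deque

# In-memory store keyed by session_id keeps conversations isolated.
_store = {}

class SessionMemory:
    """Per-session history with a bounded recent window (illustrative sketch)."""

    def __init__(self, max_messages=6):
        # deque(maxlen=...) drops the oldest entry automatically,
        # which is the simplest form of a message window.
        self.messages = deque(maxlen=max_messages)

    def add(self, role, text):
        self.messages.append((role, text))

    def window(self):
        return list(self.messages)

def get_session(session_id):
    """Fetch or lazily create the memory for one session."""
    if session_id not in _store:
        _store[session_id] = SessionMemory()
    return _store[session_id]

s = get_session("user-42")
for i in range(10):
    s.add("user", f"message {i}")
print(len(s.window()))  # → 6: only the most recent window survives
```

A message-count window is the crudest bound; real token trimming would instead count tokens with the model's tokenizer and cut the window to fit the context limit.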
BUILDING THE PIPELINE STEP BY STEP
WORKING
What kind of architecture do we have?
The chatbot combines short-term memory and long-term memory.
- Recent messages
- A bounded recent window keeps the latest exchanges available to the model
- This helps with immediate conversational continuity
- Conversation summary
- Older context is compressed into a summary
- This allows the chatbot to retain durable facts, goals, and preferences without carrying the full transcript forever
- Session state
- Each session maintains its own memory state
- Sessions are isolated from one another
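The two memory layers above can be captured in one small state object per session. This is a hypothetical sketch (the names are my own, not the project's), showing how summary and recent window are flattened into model context:

```python
from dataclasses import dataclass, field

@dataclass
class SessionState:
    """Memory for one session: long-term summary plus short-term window."""
    summary: str = ""                           # compressed older context
    recent: list = field(default_factory=list)  # latest (role, text) pairs

    def context(self):
        """Flatten both layers into the text handed to the model."""
        parts = []
        if self.summary:
            parts.append(f"Summary: {self.summary}")
        parts.extend(f"{role}: {text}" for role, text in self.recent)
        return "\n".join(parts)

# Each session owns its own state object, so sessions stay isolated.
st = SessionState(summary="User's name is Ada; she prefers concise answers.",
                  recent=[("user", "hi"), ("assistant", "Hello, Ada.")])
print(st.context())
```

The key property is that `summary` grows slowly (it is rewritten, not appended to), while `recent` stays small, so total context size stays roughly constant over a long conversation.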
Proposed pipeline
- A user message enters the system
- The chatbot retrieves the relevant session state
- The prompt is constructed using:
- system instructions
- conversation summary
- recent messages
- the latest user input
- The model generates a response
- The recent history is updated
- Periodically, the summary is refreshed using the newer conversation state
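The turn loop above can be sketched end to end. The `llm` and `summarize` callables below are stand-ins for the real model and summarization calls (in the project these would be wired to LangChain); everything else is plain Python:

```python
def build_prompt(system, summary, recent, user_input):
    """Assemble the prompt in the order described above (sketch)."""
    parts = [f"[system] {system}"]
    if summary:
        parts.append(f"[summary] {summary}")
    parts += [f"[{role}] {text}" for role, text in recent]
    parts.append(f"[user] {user_input}")
    return "\n".join(parts)

def chat_turn(state, user_input, llm, summarize, window=4):
    """One pipeline pass: build prompt, call model, update memory."""
    prompt = build_prompt("You are a helpful assistant.",
                          state["summary"], state["recent"], user_input)
    reply = llm(prompt)  # hypothetical model call
    state["recent"] += [("user", user_input), ("assistant", reply)]
    # Periodic refresh: fold messages that overflow the window into the summary.
    if len(state["recent"]) > window:
        overflow, state["recent"] = state["recent"][:-window], state["recent"][-window:]
        state["summary"] = summarize(state["summary"], overflow)
    return reply

# Stand-in callables so the sketch runs without a real model.
fake_llm = lambda prompt: "ok"
fake_summarize = lambda old, msgs: (old + " " + "; ".join(t for _, t in msgs)).strip()

state = {"summary": "", "recent": []}
for i in range(4):
    chat_turn(state, f"msg {i}", fake_llm, fake_summarize)
print(len(state["recent"]), bool(state["summary"]))  # → 4 True
```

After enough turns the recent window holds exactly `window` messages and everything older lives only in the summary, which is the behavior the pipeline above describes.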
Why this matters
This project shows that building a chatbot is not only about calling an LLM. It quickly becomes a systems design problem involving:
- state management
- memory policies
- token efficiency
- multi-session handling
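As one concrete instance of the multi-session concern, LRU eviction can be sketched with `collections.OrderedDict`, whose ordering doubles as a recency list. The class name is hypothetical:

```python
from collections import OrderedDict

class LRUSessionStore:
    """Keeps at most `capacity` sessions; evicts the least recently used one."""

    def __init__(self, capacity=3):
        self.capacity = capacity
        self.sessions = OrderedDict()  # insertion order doubles as recency order

    def get(self, session_id):
        if session_id in self.sessions:
            self.sessions.move_to_end(session_id)  # touched → most recent
        else:
            if len(self.sessions) >= self.capacity:
                self.sessions.popitem(last=False)  # drop least recently used
            self.sessions[session_id] = {"messages": []}
        return self.sessions[session_id]

store = LRUSessionStore(capacity=2)
store.get("a"); store.get("b")
store.get("a")               # "a" is now most recently used
store.get("c")               # capacity reached → "b" is evicted
print(list(store.sessions))  # → ['a', 'c']
```

Note the trade-off: an evicted session's history is simply lost, which is why the improvements listed below include persistent storage with SQLite or Redis.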
SUMMARY
- This project explores the architecture of a stateful chatbot built with LangChain
- It starts from simple session-based memory and progressively adds:
- bounded context
- token-aware trimming
- LRU session eviction
- summary-based long-term memory
- The implementation is intended to be studied alongside the GitHub repository
- The next improvements would include:
- persistent storage with SQLite or Redis
- multi-process safe session handling
- logging and observability
- retrieval-based long-term memory