Stateful Chatbot Architecture with LangChain
This article is best read alongside the associated GitHub project.
OVERVIEW
Date: March 2026
In this project, my aim is to explore how to build a stateful chatbot using LangChain while keeping the implementation structured and easy to study. Most chatbot demos are stateless: they respond to a message and immediately forget the conversation. In contrast, this project focuses on memory-aware conversational systems.
The implementation evolves step by step. It begins with basic session-scoped chat history, then adds bounded memory to control context growth, followed by LRU-based session management to keep the number of in-memory sessions bounded. Finally, it introduces summary memory so that older parts of a conversation can be compressed into long-term context while preserving recent exchanges.
Please use the GitHub repository linked above to explore the implementation in detail.
BASICS
As mentioned previously, the project is designed as a learning-oriented architecture walkthrough rather than a production deployment.
The core ideas implemented are:
- Session memory
- Each conversation is tied to a session_id
- Messages are stored and retrieved per session
- Bounded memory
- The recent conversation is constrained using a message window
- Token trimming is used to stay within model context limits
- Session management
- Multiple sessions are handled in memory
- An LRU eviction strategy removes the least recently used sessions when capacity is reached
- Summary memory
- Older messages are compressed into a summary
- Recent messages are kept separately for short-term conversational continuity
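As a rough illustration of the first two ideas, session-scoped history with a bounded message window can be sketched in plain Python. The names here (`SessionMemory`, `get_session`) are hypothetical, not LangChain APIs; in the actual project this role is played by LangChain's chat-history utilities:

```python
from collections import deque

# In-memory store keyed by session_id keeps conversations isolated.
_store = {}

class SessionMemory:
    """Per-session history with a bounded recent window (illustrative sketch)."""

    def __init__(self, max_messages=6):
        # deque(maxlen=...) drops the oldest entry automatically,
        # which is the simplest form of a message window.
        self.messages = deque(maxlen=max_messages)

    def add(self, role, text):
        self.messages.append((role, text))

    def window(self):
        return list(self.messages)

def get_session(session_id):
    """Fetch or lazily create the memory for one session."""
    if session_id not in _store:
        _store[session_id] = SessionMemory()
    return _store[session_id]

s = get_session("user-42")
for i in range(10):
    s.add("user", f"message {i}")
print(len(s.window()))  # → 6: only the most recent window survives
```

A message-count window is the crudest bound; real token trimming would instead count tokens with the model's tokenizer and cut the window to fit the context limit.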
BUILDING THE PIPELINE STEP BY STEP
WORKING
What kind of architecture do we have?
The chatbot combines short-term memory and long-term memory.
- Recent messages
- A bounded recent window keeps the latest exchanges available to the model
- This helps with immediate conversational continuity
- Conversation summary
- Older context is compressed into a summary
- This allows the chatbot to retain durable facts, goals, and preferences without carrying the full transcript forever
- Session state
- Each session maintains its own memory state
- Sessions are isolated from one another
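The two memory layers above can be captured in one small state object per session. This is a hypothetical sketch (the names are my own, not the project's), showing how summary and recent window are flattened into model context:

```python
from dataclasses import dataclass, field

@dataclass
class SessionState:
    """Memory for one session: long-term summary plus short-term window."""
    summary: str = ""                           # compressed older context
    recent: list = field(default_factory=list)  # latest (role, text) pairs

    def context(self):
        """Flatten both layers into the text handed to the model."""
        parts = []
        if self.summary:
            parts.append(f"Summary: {self.summary}")
        parts.extend(f"{role}: {text}" for role, text in self.recent)
        return "\n".join(parts)

# Each session owns its own state object, so sessions stay isolated.
st = SessionState(summary="User's name is Ada; she prefers concise answers.",
                  recent=[("user", "hi"), ("assistant", "Hello, Ada.")])
print(st.context())
```

The key property is that `summary` grows slowly (it is rewritten, not appended to), while `recent` stays small, so total context size stays roughly constant over a long conversation.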
Proposed pipeline
- A user message enters the system
- The chatbot retrieves the relevant session state
- The prompt is constructed using:
- system instructions
- conversation summary
- recent messages
- the latest user input
- The model generates a response
- The recent history is updated
- Periodically, the summary is refreshed using the newer conversation state
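The turn loop above can be sketched end to end. The `llm` and `summarize` callables below are stand-ins for the real model and summarization calls (in the project these would be wired to LangChain); everything else is plain Python:

```python
def build_prompt(system, summary, recent, user_input):
    """Assemble the prompt in the order described above (sketch)."""
    parts = [f"[system] {system}"]
    if summary:
        parts.append(f"[summary] {summary}")
    parts += [f"[{role}] {text}" for role, text in recent]
    parts.append(f"[user] {user_input}")
    return "\n".join(parts)

def chat_turn(state, user_input, llm, summarize, window=4):
    """One pipeline pass: build prompt, call model, update memory."""
    prompt = build_prompt("You are a helpful assistant.",
                          state["summary"], state["recent"], user_input)
    reply = llm(prompt)  # hypothetical model call
    state["recent"] += [("user", user_input), ("assistant", reply)]
    # Periodic refresh: fold messages that overflow the window into the summary.
    if len(state["recent"]) > window:
        overflow, state["recent"] = state["recent"][:-window], state["recent"][-window:]
        state["summary"] = summarize(state["summary"], overflow)
    return reply

# Stand-in callables so the sketch runs without a real model.
fake_llm = lambda prompt: "ok"
fake_summarize = lambda old, msgs: (old + " " + "; ".join(t for _, t in msgs)).strip()

state = {"summary": "", "recent": []}
for i in range(4):
    chat_turn(state, f"msg {i}", fake_llm, fake_summarize)
print(len(state["recent"]), bool(state["summary"]))  # → 4 True
```

After enough turns the recent window holds exactly `window` messages and everything older lives only in the summary, which is the behavior the pipeline above describes.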
Why this matters
This project shows that building a chatbot is not only about calling an LLM. It quickly becomes a systems design problem involving:
- state management
- memory policies
- token efficiency
- multi-session handling
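As one concrete instance of the multi-session concern, LRU eviction can be sketched with `collections.OrderedDict`, whose ordering doubles as a recency list. The class name is hypothetical:

```python
from collections import OrderedDict

class LRUSessionStore:
    """Keeps at most `capacity` sessions; evicts the least recently used one."""

    def __init__(self, capacity=3):
        self.capacity = capacity
        self.sessions = OrderedDict()  # insertion order doubles as recency order

    def get(self, session_id):
        if session_id in self.sessions:
            self.sessions.move_to_end(session_id)  # touched → most recent
        else:
            if len(self.sessions) >= self.capacity:
                self.sessions.popitem(last=False)  # drop least recently used
            self.sessions[session_id] = {"messages": []}
        return self.sessions[session_id]

store = LRUSessionStore(capacity=2)
store.get("a"); store.get("b")
store.get("a")               # "a" is now most recently used
store.get("c")               # capacity reached → "b" is evicted
print(list(store.sessions))  # → ['a', 'c']
```

Note the trade-off: an evicted session's history is simply lost, which is why the improvements listed below include persistent storage with SQLite or Redis.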
SUMMARY
- This project explores the architecture of a stateful chatbot built with LangChain
- It starts from simple session-based memory and progressively adds:
- bounded context
- token-aware trimming
- LRU session eviction
- summary-based long-term memory
- The implementation is intended to be studied alongside the GitHub repository
- The next improvements would include:
- persistent storage with SQLite or Redis
- multi-process safe session handling
- logging and observability
- retrieval-based long-term memory