Stateful Chatbot Architecture with LangChain

Type: Tech Corner Feature
Published: March 6, 2026
Status: Live
Tags: LangChain, Python, Chatbot, memory-management, ai-agents, conversational-ai, groq, llm, langchain-memory, stateful-chatbot
Note

This article is best read alongside the associated Git project.

OVERVIEW

Date: March 2026

In this project, my aim is to explore how to build a stateful chatbot using LangChain while keeping the implementation structured and easy to study. Most chatbot demos are stateless: they respond to a message and immediately forget the conversation. In contrast, this project focuses on memory-aware conversational systems.

The implementation evolves step by step. It begins with basic session-scoped chat history, then adds bounded memory to control context growth, followed by LRU-based session management to keep the number of in-memory sessions bounded. Finally, it introduces summary memory so that older parts of a conversation can be compressed into long-term context while preserving recent exchanges.

Please use the GitHub repository linked above to explore the implementation in detail.


BASICS

As mentioned previously, the project is designed as a learning-oriented architecture walkthrough rather than a production deployment.

The core ideas implemented are:

  • Session memory
    • Each conversation is tied to a session_id
    • Messages are stored and retrieved per session
  • Bounded memory
    • The recent conversation is constrained using a message window
    • Token trimming is used to stay within model context limits
  • Session management
    • Multiple sessions are handled in memory
    • An LRU eviction strategy removes the least recently used sessions when capacity is reached
  • Summary memory
    • Older messages are compressed into a summary
    • Recent messages are kept separately for short-term conversational continuity
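The bounded-memory idea above can be sketched without any framework: keep only the most recent messages and enforce a rough size budget. The names here (`trim_history`, `max_messages`, `max_chars`) are illustrative, not taken from the repository; real token-aware trimming would count tokens with the model's tokenizer (LangChain's `trim_messages` plays a similar role), while characters serve as a crude proxy in this sketch.

```python
def trim_history(messages, max_messages=8, max_chars=2000):
    """Keep a bounded recent window of (role, text) messages.

    Two bounds are applied: a message-count window and a rough
    character budget standing in for a token limit.
    """
    recent = messages[-max_messages:]      # message-window bound
    total = 0
    kept = []
    for role, text in reversed(recent):    # walk newest-first
        total += len(text)
        if total > max_chars:
            break                          # budget exceeded: drop older messages
        kept.append((role, text))
    return list(reversed(kept))            # restore chronological order


history = [("user", f"message {i}") for i in range(20)]
window = trim_history(history, max_messages=5)
# window now holds at most the 5 most recent messages, oldest first
```

The same two levers appear in the project: a window bounds how many exchanges are kept, and trimming keeps the prompt inside the model's context limit.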

BUILDING PIPELINE STEP BY STEP

WORKING

What kind of architecture do we have?

The chatbot combines short-term memory and long-term memory.

  • Recent messages
    • A bounded recent window keeps the latest exchanges available to the model
    • This helps with immediate conversational continuity
  • Conversation summary
    • Older context is compressed into a summary
    • This allows the chatbot to retain durable facts, goals, and preferences without carrying the full transcript forever
  • Session state
    • Each session maintains its own memory state
    • Sessions are isolated from one another
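The session isolation and LRU eviction described above can be sketched with Python's `OrderedDict`; the class and method names are illustrative assumptions, not the repository's API.

```python
from collections import OrderedDict


class LRUSessionStore:
    """Per-session message state with least-recently-used eviction."""

    def __init__(self, capacity=3):
        self.capacity = capacity
        self._sessions = OrderedDict()  # session_id -> list of messages

    def get(self, session_id):
        # Create the session on first access, evicting the least
        # recently used one if capacity is reached.
        if session_id not in self._sessions:
            if len(self._sessions) >= self.capacity:
                self._sessions.popitem(last=False)  # drop LRU session
            self._sessions[session_id] = []
        self._sessions.move_to_end(session_id)  # mark as most recently used
        return self._sessions[session_id]


store = LRUSessionStore(capacity=2)
store.get("a").append(("user", "hi"))
store.get("b").append(("user", "hello"))
store.get("a")   # touch "a", so "b" becomes least recently used
store.get("c")   # capacity reached: "b" is evicted
```

Because each `session_id` maps to its own message list, sessions stay isolated while the store as a whole stays bounded.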

Proposed pipeline

  • A user message enters the system
  • The chatbot retrieves the relevant session state
  • The prompt is constructed using:
    • system instructions
    • conversation summary
    • recent messages
    • the latest user input
  • The model generates a response
  • The recent history is updated
  • Periodically, the summary is refreshed using the newer conversation state
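The prompt-construction step in the pipeline above can be sketched as a plain function that concatenates the four context sources. The function name and section labels are assumptions for illustration; the actual project would build these as structured chat messages rather than one string.

```python
def build_prompt(system, summary, recent, user_input):
    """Assemble model input from system instructions, the conversation
    summary, the recent message window, and the latest user input."""
    parts = [f"System: {system}"]
    if summary:  # older context, compressed into long-term memory
        parts.append(f"Conversation summary: {summary}")
    for role, text in recent:  # bounded short-term window
        parts.append(f"{role.capitalize()}: {text}")
    parts.append(f"User: {user_input}")
    return "\n".join(parts)


prompt = build_prompt(
    system="You are a helpful assistant.",
    summary="The user is planning a trip to Kyoto.",
    recent=[("user", "Any temple suggestions?"),
            ("assistant", "Kinkaku-ji is popular.")],
    user_input="How do I get there?",
)
```

The ordering matters: the summary supplies durable context first, and the recent window plus the new message keep the immediate exchange in front of the model.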

Why this matters

This project shows that building a chatbot is not only about calling an LLM. It quickly becomes a systems design problem involving:

  • state management
  • memory policies
  • token efficiency
  • multi-session handling

SUMMARY

  • This project explores the architecture of a stateful chatbot built with LangChain
  • It starts from simple session-based memory and progressively adds:
    • bounded context
    • token-aware trimming
    • LRU session eviction
    • summary-based long-term memory
  • The implementation is intended to be studied alongside the GitHub repository
  • The next improvements would include:
    • persistent storage with SQLite or Redis
    • multi-process safe session handling
    • logging and observability
    • retrieval-based long-term memory