Moving Window Context Strategy - Final Implementation
Overview
Implemented a moving window strategy with:
- Recent 10 interactions: Full Q&A pairs (no truncation)
- All remaining history: LLM-generated third-person narrative summary
- NO fallbacks: LLM only
Key Changes
1. Window Size Updated: 8 → 10
Before:
- Recent 8 interactions → full detail
- Older 12 interactions → summarized
After:
- Recent 10 interactions → full detail
- ALL remaining history → LLM summarized
2. No Fixed Limit on Older Interactions
Before:
```python
recent_interactions = context.get('interactions', [])[:20]  # Only last 20
oldest_interactions = recent_interactions[8:]  # Only 12 older
```
After:
```python
recent_interactions = context.get('interactions', [])[:40]  # Last 40 from buffer
oldest_interactions = recent_interactions[10:]  # ALL older (no limit)
```
3. Removed Fallback Logic
Before:
- LLM summarization first
- Fallback to Q&A truncation if LLM fails
After:
- LLM summarization ONLY
- No fallback (minimal placeholder if LLM completely fails)
Moving Window Flow
Example: 40 interactions total
Turn 1-25: → Database only (older than the buffer)
Turn 26-40: → Memory buffer (buffer capacity: 40)
For current request:
- Turn 26-30: LLM summary (third-person narrative)
- Turn 31-40: Full Q&A pairs (last 10)
- Turn 41 (current): Being processed
Next request:
- Turn 26-31: LLM summary (window moved)
- Turn 32-41: Full Q&A pairs (window moved)
- Turn 42 (current): Being processed
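The split above can be sketched as a small helper. This is illustrative only: `split_window` is not a name from the codebase, and the interaction list is assumed to be ordered newest-first, matching the slicing shown later for `synthesis_agent.py`.

```python
def split_window(interactions, window_size=10):
    """Split a newest-first interaction list into (older, recent).

    `recent` holds the last `window_size` turns (kept as full Q&A pairs);
    `older` holds everything else (sent to the LLM for summarization).
    """
    recent = interactions[:window_size]   # most recent 10 turns, full detail
    older = interactions[window_size:]    # all remaining turns, summarized
    return older, recent

# Buffer holding turns 26-40, newest first; turn 41 is being processed,
# so turns 31-40 stay in full and turns 26-30 get summarized.
buffer = list(range(40, 25, -1))          # turns 40..26
older, recent = split_window(buffer)
```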
Technical Implementation
Code Changes
File: src/agents/synthesis_agent.py
Old:
```python
if len(recent_interactions) > 8:
    oldest_interactions = recent_interactions[8:]   # Only 12
    newest_interactions = recent_interactions[:8]   # Only 8
```
New:
```python
if len(recent_interactions) > 10:
    oldest_interactions = recent_interactions[10:]  # ALL older
    newest_interactions = recent_interactions[:10]  # Last 10
```
Old:
```python
# Try LLM first, fallback to Q&A truncation
try:
    llm_summary = await self._generate_narrative_summary(interactions)
    if llm_summary:
        return f"Earlier conversation summary:\n{llm_summary}"
except Exception as e:
    # Fallback logic with Q&A pairs...
    ...
```
New:
```python
# LLM ONLY, no fallback
llm_summary = await self._generate_narrative_summary(interactions)
if llm_summary and len(llm_summary.strip()) > 20:
    return llm_summary
else:
    # Minimal placeholder if LLM fails
    return f"Earlier conversation included {len(interactions)} interactions covering various topics."
```
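Putting the two changes together, the assembled context might look like the sketch below. The function name `build_context`, the `question`/`answer` dict keys, and the injected `summarize` callable (standing in for `_generate_narrative_summary`) are all illustrative, not names confirmed from the codebase.

```python
import asyncio

async def build_context(interactions, summarize, window_size=10):
    """Assemble prompt context: LLM summary of older turns plus the
    most recent `window_size` Q&A pairs in full.

    `interactions` is a newest-first list of {'question', 'answer'} dicts;
    `summarize` is an async callable returning a narrative summary string.
    """
    recent = interactions[:window_size]
    older = interactions[window_size:]
    parts = []
    if older:
        summary = await summarize(older)
        if summary and len(summary.strip()) > 20:
            parts.append(summary)
        else:
            # Minimal placeholder on LLM failure -- no Q&A truncation fallback
            parts.append(
                f"Earlier conversation included {len(older)} interactions "
                "covering various topics."
            )
    for turn in reversed(recent):  # chronological order within the window
        parts.append(f"User: {turn['question']}\nAssistant: {turn['answer']}")
    return "\n\n".join(parts)
```

Note that the placeholder is only emitted when the summary is missing or shorter than 20 characters, mirroring the length check in the new code.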
Benefits
1. Comprehensive Context
- All history is accessible (up to 40 interactions in buffer)
- Not limited to just 20 interactions anymore
- Full conversation continuity
2. Efficient Summarization
- Recent 10: Full details (precise context)
- All older: LLM summary (broader context, token-efficient)
- Moving window: Always maintains 10 most recent + summary of rest
3. Better Memory
- Can handle 40+ interaction conversations
- LLM summary captures entire conversation flow
- No information loss from arbitrary truncation
4. Cleaner Code
- No fallback complexity
- LLM-only approach
- Simpler logic
Example: Moving Window in Action
Request 1 (15 interactions):
- I1-I5: LLM summary
- I6-I15: Full Q&A pairs
- I16 (new): Being generated
Request 5 (20 interactions):
- I1-I10: LLM summary (window moved; older turns re-summarized)
- I11-I20: Full Q&A pairs (window moved)
- I21 (new): Being generated
Request 30 (40 interactions):
- I1-I30: LLM summary (entire history summarized)
- I31-I40: Full Q&A pairs (last 10)
- I41 (new): Being generated
Context Window Distribution
```
┌───────────────────────────────────────┐
│ Database (Unlimited)                  │
│ All interactions permanently          │
└───────────────────────────────────────┘
                    ↓
┌───────────────────────────────────────┐
│ Memory Buffer (40 interactions)       │
│ Last 40 for fast retrieval            │
└───────────────────────────────────────┘
                    ↓
┌───────────────────────────────────────┐
│ Context Window (10 + Summary)         │
│                                       │
│ Recent 10: Full Q&A pairs             │
│ All older: LLM third-person summary   │
│                                       │
│         <-- MOVING WINDOW -->         │
└───────────────────────────────────────┘
```
LLM Summary Format
Example for 15 older interactions:
The user started by inquiring about key components of AI chatbot assistants and
asked which top AI assistants exist in the market. The AI assistant responded with
information about Alexa, Google Assistant, Siri, and others. The user then noted
that ChatGPT, Gemini, and Claude were missing, asking why they weren't mentioned.
The AI assistant explained its limitations. The conversation progressed with the
user requesting objective KPI comparisons between these models. The AI assistant
provided detailed metrics and comparisons. The user continued requesting more
specific information about various aspects of these AI systems.
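A prompt along the following lines produces that style of summary. This is a sketch only: the actual prompt inside `_generate_narrative_summary` may differ, and the function name and dict keys here are illustrative.

```python
def narrative_summary_prompt(interactions):
    """Build a third-person summarization prompt for the older turns.

    `interactions` is a newest-first list of {'question', 'answer'} dicts;
    the transcript is rendered oldest-first so the narrative reads in order.
    """
    transcript = "\n".join(
        f"User: {t['question']}\nAssistant: {t['answer']}"
        for t in reversed(interactions)
    )
    return (
        "Summarize the following conversation as a single third-person "
        "narrative paragraph. Refer to the participants as 'the user' and "
        "'the AI assistant'. Preserve the order of topics and any specific "
        "names or requests. Do not use bullet points.\n\n"
        f"{transcript}"
    )
```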
Files Modified
✅ src/agents/synthesis_agent.py
- Updated window to 10 recent + all older
- Removed fallback logic
- Changed to 40-interaction buffer
✅ Research_AI_Assistant/src/agents/synthesis_agent.py
- Same changes applied
Testing Recommendations
Test Scenarios
Short conversation (≤10 interactions):
- All shown in full detail ✓
- No summarization needed
Medium conversation (15 interactions):
- Last 10: Full Q&A pairs ✓
- First 5: LLM summary ✓
Long conversation (40 interactions):
- Last 10: Full Q&A pairs ✓
- First 30: LLM summary ✓
- Full history accessible
Very long conversation (100+ interactions):
- Last 10: Full Q&A pairs ✓
- Previous 30 (from buffer): LLM summary ✓
- Older interactions in database
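These scenarios can be checked with a few assertions over the split logic. The split is re-implemented inline here so the snippet is self-contained; `split` is an illustrative name, not one from the codebase.

```python
def split(interactions, window_size=10):
    # Newest-first list -> (older turns to summarize, recent turns kept full)
    return interactions[window_size:], interactions[:window_size]

# Short conversation: everything fits in the window, nothing to summarize
older, recent = split(list(range(8)))
assert older == [] and len(recent) == 8

# Medium conversation (15): last 10 in full, first 5 summarized
older, recent = split(list(range(15)))
assert len(recent) == 10 and len(older) == 5

# Long conversation (40): last 10 in full, 30 summarized
older, recent = split(list(range(40)))
assert len(recent) == 10 and len(older) == 30
```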
Impact
Before (8/12 fixed, limited history):
- Only 20 interactions accessible
- Lost context for longer conversations
- Arbitrary limit
After (10/all, moving window):
- ✅ 40 interactions accessible from buffer
- ✅ Full conversation history via LLM summary
- ✅ Moving window ensures recent context
- ✅ No arbitrary limits on history
Summary
The moving window strategy now:
- Recent 10: Full Q&A pairs (precision)
- All older: LLM summary (breadth)
- Moving window: Always up-to-date
- Efficient: Token-optimized
- Comprehensive: Full history accessible
Result: True moving window with comprehensive LLM-based summarization!