Research_AI_Assistant / INTERACTION_CONTEXT_FAILURE_ANALYSIS.md
cache key error when user id changes -fixed task 1 31_10_2025 v6
93f44e2

Interaction Context Retrieval Failure - Root Cause Analysis

Executive Summary

Interaction contexts are being stored correctly in the database, but are not being retrieved on subsequent requests due to a cache invalidation failure. The system returns stale cached context that doesn't include newly generated interaction contexts.

Problem Statement

When a user submits a request referencing previous context (e.g., "based on above inputs"), the system reports Context retrieved: 0 interaction contexts, causing:

  • Loss of conversation continuity
  • Responses generated for wrong topics
  • Previous interaction context unavailable to agents

Root Cause Analysis

The Caching Flow

The system uses a two-tier caching mechanism:

  1. Context Manager Cache (src/context_manager.py):

    • Key: session_{session_id}
    • Storage: self.session_cache dictionary
    • Purpose: Cache session context to avoid database queries
  2. Orchestrator Cache (src/orchestrator_engine.py):

    • Key: context_{session_id}
    • Storage: self._context_cache dictionary
    • TTL: 5 seconds
    • Purpose: Prevent repeated context retrieval while the same request is being processed
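The lookup order of the two tiers can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: the class names `ContextManagerSketch` and `OrchestratorSketch`, the `load_from_db` callable, and the injectable `clock` are hypothetical. Only the key formats, the two dictionaries, and the 5-second TTL are taken from the description above.

```python
import time
from typing import Callable, Dict, Tuple

CONTEXT_TTL_SECONDS = 5  # orchestrator cache TTL stated in the analysis


class ContextManagerSketch:
    """Tier 1: session cache in front of the database (hypothetical class)."""

    def __init__(self, load_from_db: Callable[[str], dict]):
        self.session_cache: Dict[str, dict] = {}
        self._load_from_db = load_from_db  # stand-in for _retrieve_from_db()

    def manage_context(self, session_id: str) -> dict:
        key = f"session_{session_id}"
        # No expiry here: an entry stays until something removes it.
        if key not in self.session_cache:
            self.session_cache[key] = self._load_from_db(session_id)
        return self.session_cache[key]


class OrchestratorSketch:
    """Tier 2: short-lived cache in front of the context manager (hypothetical)."""

    def __init__(self, manager: ContextManagerSketch,
                 clock: Callable[[], float] = time.monotonic):
        self._context_cache: Dict[str, Tuple[float, dict]] = {}
        self._manager = manager
        self._clock = clock

    def get_or_create_context(self, session_id: str) -> dict:
        key = f"context_{session_id}"
        entry = self._context_cache.get(key)
        now = self._clock()
        if entry is not None and now - entry[0] < CONTEXT_TTL_SECONDS:
            return entry[1]  # fresh enough: the context manager is not consulted
        context = self._manager.manage_context(session_id)
        self._context_cache[key] = (now, context)
        return context
```

Note that in this sketch only the orchestrator tier expires on its own. Once the TTL has passed, the request still falls through to the session cache, which answers without touching the database.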

The Failure Sequence

First Request (Working - Context Storage):

1. User: "Tell me about Excel handling"
2. orchestrator.process_request() called
3. _get_or_create_context() checks orchestrator cache → MISS (empty)
4. Calls context_manager.manage_context()
5. manage_context() checks session_cache → MISS (empty)
6. Calls _retrieve_from_db()
7. Database query: SELECT interaction_summary FROM interaction_contexts WHERE session_id = ?
   → Returns 0 rows (new session)
8. Returns context: { interaction_contexts: [] }
9. Caches in session_cache: session_cache["session_cca279a4"] = { interaction_contexts: [] }
10. Response generated about Excel handling
11. generate_interaction_context() called
12. LLM generates 50-token summary
13. Database INSERT: INSERT INTO interaction_contexts (interaction_id, session_id, ...)
    → ✅ SUCCESS: Interaction context stored in database
14. **CRITICAL MISSING STEP**: Cache NOT invalidated

Second Request (Broken - Context Retrieval):

1. User: "Based on above inputs, create a prototype"
2. orchestrator.process_request() called
3. _get_or_create_context() checks orchestrator cache:
   - If the entry is < 5 seconds old → Returns cached context (from the first request)
   - Otherwise continues to step 4
4. Calls context_manager.manage_context()
5. manage_context() checks session_cache:
   session_cache.get("session_cca279a4")
   → ✅ CACHE HIT: Returns cached context from first request
   → Contains: { interaction_contexts: [] }
6. **NEVER queries database** because cache hit
7. Context returned with 0 interaction contexts
8. Logs show: "Context retrieved: 0 interaction contexts"
9. Intent agent receives empty context
10. Skills agent analyzes wrong topic
11. Response generated for wrong context (story generation, not Excel)
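The two request sequences above can be reproduced in isolation. The sketch below is a simplified stand-in, not the project's code: it uses an in-memory SQLite table with only two of the columns, `manage_context` and `store_interaction_context` are hypothetical free functions named after the methods in this analysis, and the summary string is sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE interaction_contexts (session_id TEXT, interaction_summary TEXT)"
)
session_cache = {}  # plays the role of the context manager's session_cache


def manage_context(session_id):
    """Return the cached context if present; otherwise load from the DB and cache it."""
    key = f"session_{session_id}"
    if key in session_cache:
        return session_cache[key]  # cache hit: the database is not queried
    rows = conn.execute(
        "SELECT interaction_summary FROM interaction_contexts WHERE session_id = ?",
        (session_id,),
    ).fetchall()
    context = {"interaction_contexts": [r[0] for r in rows]}
    session_cache[key] = context
    return context


def store_interaction_context(session_id, summary):
    """Persist a summary WITHOUT touching the cache, mirroring the bug."""
    conn.execute(
        "INSERT INTO interaction_contexts (session_id, interaction_summary) VALUES (?, ?)",
        (session_id, summary),
    )
    conn.commit()


first = manage_context("cca279a4")    # request 1: miss, caches an empty context
store_interaction_context("cca279a4", "User asked about Excel handling")
second = manage_context("cca279a4")   # request 2: hit, returns the stale entry

stored = conn.execute("SELECT COUNT(*) FROM interaction_contexts").fetchone()[0]
print(len(second["interaction_contexts"]), stored)  # prints: 0 1
```

The row exists in the table, yet the second lookup still reports zero interaction contexts, which matches the logged symptom.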

Root Cause Identified

PRIMARY ISSUE: Cache Invalidation Failure

After generate_interaction_context() successfully stores an interaction context in the database, the cache is never invalidated. This causes:

  1. First Request: Context cached with interaction_contexts = []
  2. Interaction Context Generated: Stored in database ✅
  3. Cache Not Cleared: session_cache["session_{session_id}"] still contains old context
  4. Second Request: Cache hit returns stale context with 0 interaction contexts
  5. Database Never Queried: Cache check happens before database query

Location of Issue:

  • File: src/orchestrator_engine.py
  • Method: process_request()
  • Lines: 442-450 (after generate_interaction_context() call)
  • Missing: Cache invalidation after interaction context generation

Secondary Issues

Issue 2: Orchestrator-Level Cache Also Not Cleared

The orchestrator maintains its own cache (_context_cache) with a 5-second TTL. If requests come within 5 seconds:

  • Orchestrator cache hit: Returns cached context immediately
  • Context manager never called: Never checks session_cache or database
  • Result: Even if session_cache were cleared, orchestrator cache would still return stale data

Location:

  • File: src/orchestrator_engine.py
  • Method: _get_or_create_context()
  • Lines: 89-93

Issue 3: No Detection of Context Reference Mismatches

When a user explicitly references previous context (e.g., "based on above inputs"), but the system has 0 interaction contexts, there's no mechanism to:

  1. Detect the mismatch
  2. Force cache invalidation
  3. Re-query the database
  4. Warn about potential context loss

Location:

  • File: src/orchestrator_engine.py
  • Method: process_request()
  • Lines: 172-174 (context retrieval happens, but no validation)
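One way such a validation step could look is sketched below. Everything in it is an assumption made for illustration: the phrase list, the function names, and the `reload` callable (which stands for whatever bypasses both caches and re-queries the database) do not come from the codebase.

```python
import logging
import re
from typing import Callable

logger = logging.getLogger(__name__)

# Hypothetical phrase list; a real deployment would need to tune this.
_REFERENCE_PATTERN = re.compile(
    r"\b(based on (the )?above"
    r"|as (mentioned|discussed) (above|earlier)"
    r"|previous (response|answer))\b",
    re.IGNORECASE,
)


def references_previous_context(user_input: str) -> bool:
    """True if the input appears to refer back to earlier turns."""
    return _REFERENCE_PATTERN.search(user_input) is not None


def validate_context(user_input: str, context: dict,
                     reload: Callable[[], dict]) -> dict:
    """If the user refers back but the context is empty, reload it once."""
    if references_previous_context(user_input) and not context.get("interaction_contexts"):
        logger.warning(
            "User referenced prior context but 0 interaction contexts were loaded; reloading"
        )
        fresh = reload()
        if not fresh.get("interaction_contexts"):
            logger.warning("Reload still returned 0 interaction contexts; possible context loss")
        return fresh
    return context
```

A check like this is a safety net rather than a fix: it only triggers when the wording matches, so it would complement correct cache invalidation, not replace it.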

Code Flow Analysis

Storage Flow (Working)

orchestrator.process_request()
  └─> generate_interaction_context()
       └─> llm_router.route_inference() → Generate summary
       └─> Database INSERT → Store in interaction_contexts table
       └─> ✅ SUCCESS: Stored in database
       └─> ❌ MISSING: Cache invalidation

Retrieval Flow (Broken)

orchestrator.process_request()
  └─> _get_or_create_context()
       ├─> Check orchestrator cache (5s TTL)
       │   └─> If hit: Return cached (may be stale)
       └─> manage_context()
           ├─> Check session_cache
           │   └─> If hit: Return cached (STALE - has 0 contexts)
           └─> _retrieve_from_db() (NEVER REACHED if cache hit)
               └─> Query: SELECT FROM interaction_contexts WHERE session_id = ?
                   └─> Would return stored contexts, but never called

Database Verification

The interaction context IS being stored correctly. Evidence:

  1. Log Entry:

    2025-10-31 06:55:55,481 - src.context_manager - INFO - ✓ Generated interaction context for 64d4ace2_15ca4dec_1761890055
    
  2. Storage Code (src/context_manager.py:426-438):

    cursor.execute("""
        INSERT OR REPLACE INTO interaction_contexts 
        (interaction_id, session_id, user_input, system_response, interaction_summary, created_at)
        VALUES (?, ?, ?, ?, ?, ?)
    """, (interaction_id, session_id, user_input[:500], system_response[:1000], summary.strip(), datetime.now().isoformat()))
    conn.commit()
    conn.close()
    

    ✅ This executes successfully and commits

  3. Retrieval Code (src/context_manager.py:656-671):

    cursor.execute("""
        SELECT interaction_summary, created_at, needs_refresh
        FROM interaction_contexts
        WHERE session_id = ? AND (needs_refresh IS NULL OR needs_refresh = 0)
        ORDER BY created_at DESC
        LIMIT 20
    """, (session_id,))
    

    ✅ This query would work, but is never executed due to cache hit
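To confirm independently of either cache that rows exist, the table can be queried directly. The sketch below assumes SQLite (suggested by the `INSERT OR REPLACE` syntax and `?` placeholders), uses a hypothetical helper name, and runs against an in-memory database with a sample row so that it is self-contained. The column types and sample values are assumptions, and the real database path is not given in this analysis.

```python
import sqlite3
from datetime import datetime


def count_retrievable_contexts(conn: sqlite3.Connection, session_id: str) -> int:
    """Count rows the retrieval query would see for this session, bypassing all caches."""
    row = conn.execute(
        """
        SELECT COUNT(*)
        FROM interaction_contexts
        WHERE session_id = ? AND (needs_refresh IS NULL OR needs_refresh = 0)
        """,
        (session_id,),
    ).fetchone()
    return row[0]


# Demo setup: column names follow the queries above; the types are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE interaction_contexts (
        interaction_id TEXT PRIMARY KEY,
        session_id TEXT,
        user_input TEXT,
        system_response TEXT,
        interaction_summary TEXT,
        created_at TEXT,
        needs_refresh INTEGER
    )
    """
)
conn.execute(
    """
    INSERT OR REPLACE INTO interaction_contexts
    (interaction_id, session_id, user_input, system_response, interaction_summary, created_at)
    VALUES (?, ?, ?, ?, ?, ?)
    """,
    ("demo_interaction", "demo_session", "sample input", "sample response",
     "sample summary", datetime.now().isoformat()),
)
conn.commit()

print(count_retrievable_contexts(conn, "demo_session"))  # prints: 1
```

A non-zero count for a session whose logs show "Context retrieved: 0 interaction contexts" points at the caching layer rather than at storage.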

Cache Invalidation Points

Current cache invalidation only happens in these scenarios:

  1. Session End: end_session() clears the cache (lines 534-536)
  2. User Change: User mismatch detection clears the cache (lines 254-255)
  3. Never: After generating an interaction context ❌

Expected vs Actual Behavior

Expected Behavior:

Request 1 → Generate context → Store in DB → Clear cache
Request 2 → Cache miss → Query DB → Find stored context → Use it

Actual Behavior:

Request 1 → Generate context → Store in DB → Keep cache (stale)
Request 2 → Cache hit → Return stale cache (0 contexts) → Never query DB

Evidence from Logs

# First Request - Context Generation
2025-10-31 06:55:55,481 - src.context_manager - INFO - ✓ Generated interaction context for 64d4ace2_15ca4dec_1761890055

# Second Request - Cache Hit (No DB Query)
2025-10-31 07:02:55,911 - src.context_manager - INFO - Context retrieved: 0 interaction contexts

Time Gap: 7 minutes between requests (well beyond the 5-second orchestrator cache TTL)

Result: Still 0 contexts → Session cache hit, database never queried
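The gap can be checked directly from the two log timestamps. This small sketch assumes the timestamps use Python's default `logging` format, with milliseconds after a comma; the constant names are illustrative.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
ORCHESTRATOR_TTL_SECONDS = 5  # TTL stated for the orchestrator cache

generated_at = datetime.strptime("2025-10-31 06:55:55,481", LOG_TIME_FORMAT)
retrieved_at = datetime.strptime("2025-10-31 07:02:55,911", LOG_TIME_FORMAT)

gap_seconds = (retrieved_at - generated_at).total_seconds()
print(gap_seconds)                             # roughly 420 seconds, i.e. 7 minutes
print(gap_seconds > ORCHESTRATOR_TTL_SECONDS)  # True: the orchestrator entry had expired
```

Because the orchestrator entry had long expired by the second request, the stale result can only have come from the session cache.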

Impact Assessment

Functional Impact:

  • HIGH: Conversation continuity completely broken
  • Users cannot reference previous responses
  • Each request treated as isolated, losing all context

User Experience Impact:

  • HIGH: Responses generated for wrong topics
  • Frustration when "based on above inputs" is ignored
  • Loss of trust in system reliability

Performance Impact:

  • LOW: Cache is working (too well - preventing fresh data retrieval)
  • Database queries being avoided (but should happen after context generation)

Conclusion

The interaction context system is architecturally sound but has a critical cache invalidation bug:

  1. ✅ Interaction contexts are correctly generated
  2. ✅ Interaction contexts are correctly stored in database
  3. ✅ Database retrieval query is correctly implemented
  4. ❌ Cache is never invalidated after interaction context generation
  5. ❌ Cache hit prevents database query from executing
  6. ❌ Stale cached context (with 0 interaction contexts) is returned

The fix requires invalidating both:

  • Context Manager's session_cache after generate_interaction_context()
  • Orchestrator's _context_cache after generate_interaction_context()

This will force fresh database queries on subsequent requests, allowing stored interaction contexts to be retrieved and used.
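A minimal sketch of that invalidation, assuming the key formats listed under "The Caching Flow". The helper name `invalidate_context_caches` and the `SimpleNamespace` stand-ins are hypothetical; the real classes carry more state, and the shape of the cached values is not specified in this analysis.

```python
from types import SimpleNamespace


def invalidate_context_caches(orchestrator, context_manager, session_id: str) -> None:
    """Drop the cached context for one session from both tiers (hypothetical helper)."""
    # pop(..., None) keeps the call safe when the key is already absent.
    context_manager.session_cache.pop(f"session_{session_id}", None)
    orchestrator._context_cache.pop(f"context_{session_id}", None)


# Stand-ins exposing only the attributes named in this analysis.
context_manager = SimpleNamespace(
    session_cache={"session_cca279a4": {"interaction_contexts": []}}
)
orchestrator = SimpleNamespace(
    _context_cache={"context_cca279a4": {"interaction_contexts": []}}
)

# Intended call site: right after generate_interaction_context() has committed.
invalidate_context_caches(orchestrator, context_manager, "cca279a4")
```

Removing only the affected session's keys leaves other sessions' cached contexts intact, so the performance benefit of both caches is preserved.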

Files Involved

  1. src/orchestrator_engine.py - Lines 442-450 (missing cache invalidation)
  2. src/orchestrator_engine.py - Lines 83-113 (orchestrator cache)
  3. src/context_manager.py - Lines 235-289 (session cache management)
  4. src/context_manager.py - Lines 396-451 (interaction context generation)

Additional Notes

  • The cache mechanism itself is working as designed (performance optimization)
  • The bug is in the cache lifecycle management (invalidation timing)
  • Database operations are functioning correctly
  • The issue is purely in the caching layer, not the persistence layer