# Interaction Context Retrieval Failure - Root Cause Analysis
## Executive Summary
Interaction contexts are being **stored correctly** in the database, but are **not being retrieved** on subsequent requests due to a cache invalidation failure. The system returns stale cached context that doesn't include newly generated interaction contexts.
## Problem Statement
When a user submits a request referencing previous context (e.g., "based on above inputs"), the system reports `Context retrieved: 0 interaction contexts`, causing:
- Loss of conversation continuity
- Responses generated for wrong topics
- Previous interaction context unavailable to agents
## Root Cause Analysis
### The Caching Flow
The system uses a two-tier caching mechanism:
1. **Context Manager Cache** (`src/context_manager.py`):
   - Key: `session_{session_id}`
   - Storage: `self.session_cache` dictionary
   - Purpose: Cache session context to avoid database queries
2. **Orchestrator Cache** (`src/orchestrator_engine.py`):
   - Key: `context_{session_id}`
   - Storage: `self._context_cache` dictionary
   - TTL: 5 seconds
   - Purpose: Prevent rapid repeated context retrieval within same request processing
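The lookup order implied by these two tiers can be sketched as follows. This is a minimal illustration, not the project's code: the class `TwoTierContextLookup`, the `load_from_db` callable, and the injectable `clock` are hypothetical. Only the key formats, the two dictionary names, and the 5-second TTL come from the description above.

```python
import time
from typing import Callable, Dict, Tuple

ORCHESTRATOR_TTL_SECONDS = 5.0  # TTL stated in this analysis


class TwoTierContextLookup:
    """Hypothetical model of the two cache tiers (not the real classes)."""

    def __init__(self, load_from_db: Callable[[str], dict],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load_from_db = load_from_db  # assumed stand-in for _retrieve_from_db()
        self._clock = clock
        self.session_cache: Dict[str, dict] = {}                 # context manager tier
        self._context_cache: Dict[str, Tuple[float, dict]] = {}  # orchestrator tier

    def get_context(self, session_id: str) -> dict:
        now = self._clock()

        # Tier 1: orchestrator cache, valid only while younger than the TTL.
        entry = self._context_cache.get(f"context_{session_id}")
        if entry is not None and now - entry[0] < ORCHESTRATOR_TTL_SECONDS:
            return entry[1]

        # Tier 2: session cache, which has no expiry in this model.
        session_key = f"session_{session_id}"
        context = self.session_cache.get(session_key)
        if context is None:
            # Only a miss in both tiers reaches the database.
            context = self._load_from_db(session_id)
            self.session_cache[session_key] = context

        self._context_cache[f"context_{session_id}"] = (now, context)
        return context
```

In this model the database loader runs only when both tiers miss, so an entry that never leaves `session_cache` shields the database from every later read.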
### The Failure Sequence
#### **First Request (Working - Context Storage)**:
```
1. User: "Tell me about Excel handling"
2. orchestrator.process_request() called
3. _get_or_create_context() checks orchestrator cache → MISS (empty)
4. Calls context_manager.manage_context()
5. manage_context() checks session_cache → MISS (empty)
6. Calls _retrieve_from_db()
7. Database query: SELECT interaction_summary FROM interaction_contexts WHERE session_id = ?
   → Returns 0 rows (new session)
8. Returns context: { interaction_contexts: [] }
9. Caches in session_cache: session_cache["session_cca279a4"] = { interaction_contexts: [] }
10. Response generated about Excel handling
11. generate_interaction_context() called
12. LLM generates 50-token summary
13. Database INSERT: INSERT INTO interaction_contexts (interaction_id, session_id, ...)
    → ✅ SUCCESS: Interaction context stored in database
14. **CRITICAL MISSING STEP**: Cache NOT invalidated
```
#### **Second Request (Broken - Context Retrieval)**:
```
1. User: "Based on above inputs, create a prototype"
2. orchestrator.process_request() called
3. _get_or_create_context() checks orchestrator cache:
   - If < 5 seconds old → Returns cached context (from the first request)
   - OR continues to step 4
4. Calls context_manager.manage_context()
5. manage_context() checks session_cache:
   session_cache.get("session_cca279a4")
   → ✅ CACHE HIT: Returns cached context from first request
   → Contains: { interaction_contexts: [] }
6. **NEVER queries database** because cache hit
7. Context returned with 0 interaction contexts
8. Logs show: "Context retrieved: 0 interaction contexts"
9. Intent agent receives empty context
10. Skills agent analyzes wrong topic
11. Response generated for wrong context (story generation, not Excel)
```
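The two request sequences above can be reproduced in a few lines. The sketch below is a simplified stand-in, not the real implementation: a plain dictionary replaces the `interaction_contexts` table, and the function bodies are assumptions that keep only the cache-then-database ordering described in the steps.

```python
from typing import Dict, List

database: Dict[str, List[str]] = {}  # stand-in for the interaction_contexts table
session_cache: Dict[str, dict] = {}  # stand-in for ContextManager.session_cache


def manage_context(session_id: str) -> dict:
    """Return the cached context if present, otherwise load it from the 'database'."""
    key = f"session_{session_id}"
    if key in session_cache:
        return session_cache[key]  # cache hit: the database is never consulted
    context = {"interaction_contexts": list(database.get(session_id, []))}
    session_cache[key] = context
    return context


def generate_interaction_context(session_id: str, summary: str) -> None:
    """Store a summary, reproducing the bug: the cache entry is left untouched."""
    database.setdefault(session_id, []).append(summary)


first = manage_context("cca279a4")   # request 1: miss, caches an empty list
generate_interaction_context("cca279a4", "User asked about Excel handling")
second = manage_context("cca279a4")  # request 2: hit, returns the stale entry

print(len(database["cca279a4"]))            # 1
print(len(second["interaction_contexts"]))  # 0
```

The stored summary exists in the backing store, yet the second read still reports zero interaction contexts because the cached entry from the first request is returned unchanged.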
### Root Cause Identified
**PRIMARY ISSUE**: Cache Invalidation Failure
After `generate_interaction_context()` successfully stores an interaction context in the database, **the cache is never invalidated**. This causes:
1. **First Request**: Context cached with `interaction_contexts = []`
2. **Interaction Context Generated**: Stored in database ✅
3. **Cache Not Cleared**: `session_cache["session_{session_id}"]` still contains old context
4. **Second Request**: Cache hit returns stale context with 0 interaction contexts
5. **Database Never Queried**: Cache check happens before database query
**Location of Issue**:
- File: `src/orchestrator_engine.py`
- Method: `process_request()`
- Lines: 442-450 (after `generate_interaction_context()` call)
- **Missing**: Cache invalidation after interaction context generation
### Secondary Issues
#### Issue 2: Orchestrator-Level Cache Also Not Cleared
The orchestrator maintains its own cache (`_context_cache`) with a 5-second TTL. If requests come within 5 seconds:
- **Orchestrator cache hit**: Returns cached context immediately
- **Context manager never called**: Never checks session_cache or database
- **Result**: Even if session_cache were cleared, orchestrator cache would still return stale data
**Location**:
- File: `src/orchestrator_engine.py`
- Method: `_get_or_create_context()`
- Lines: 89-93
#### Issue 3: No Detection of Context Reference Mismatches
When a user explicitly references previous context (e.g., "based on above inputs"), but the system has 0 interaction contexts, there's no mechanism to:
1. Detect the mismatch
2. Force cache invalidation
3. Re-query the database
4. Warn about potential context loss
**Location**:
- File: `src/orchestrator_engine.py`
- Method: `process_request()`
- Lines: 172-174 (context retrieval happens, but no validation)
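One way such a mismatch check could look is sketched below. Everything here is hypothetical: the phrase list, the function names, and the idea of returning a boolean that the caller uses to bypass the caches. The source gives only "based on above inputs" as an example phrase.

```python
import re

# Hypothetical phrase list; extend it to match how users actually refer back.
_REFERENCE_PATTERN = re.compile(
    r"\b(based on (the )?above"
    r"|as (mentioned|discussed) (above|earlier)"
    r"|previous (response|answer))\b",
    re.IGNORECASE,
)


def references_previous_context(user_input: str) -> bool:
    """True if the input appears to refer back to earlier conversation turns."""
    return _REFERENCE_PATTERN.search(user_input) is not None


def needs_forced_refresh(user_input: str, context: dict) -> bool:
    """True if the user refers to earlier context but none was retrieved."""
    has_contexts = bool(context.get("interaction_contexts"))
    return references_previous_context(user_input) and not has_contexts
```

A caller could log a warning and re-query the database whenever `needs_forced_refresh` returns `True`. A phrase-based heuristic like this will miss some references and flag others incorrectly, so it is a safety net rather than a substitute for correct invalidation.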
## Code Flow Analysis
### Storage Flow (Working)
```
orchestrator.process_request()
  └─> generate_interaction_context()
      └─> llm_router.route_inference() → Generate summary
      └─> Database INSERT → Store in interaction_contexts table
      └─> ✅ SUCCESS: Stored in database
      └─> ❌ MISSING: Cache invalidation
```
### Retrieval Flow (Broken)
```
orchestrator.process_request()
  └─> _get_or_create_context()
      ├─> Check orchestrator cache (5s TTL)
      │   └─> If hit: Return cached (may be stale)
      └─> manage_context()
          ├─> Check session_cache
          │   └─> If hit: Return cached (STALE - has 0 contexts)
          └─> _retrieve_from_db() (NEVER REACHED if cache hit)
              └─> Query: SELECT FROM interaction_contexts WHERE session_id = ?
              └─> Would return stored contexts, but never called
```
## Database Verification
The interaction context **IS being stored** correctly. Evidence:
1. **Log Entry**:
```
2025-10-31 06:55:55,481 - src.context_manager - INFO - ✓ Generated interaction context for 64d4ace2_15ca4dec_1761890055
```
2. **Storage Code** (src/context_manager.py:426-438):
```python
cursor.execute("""
INSERT OR REPLACE INTO interaction_contexts
(interaction_id, session_id, user_input, system_response, interaction_summary, created_at)
VALUES (?, ?, ?, ?, ?, ?)
""", (interaction_id, session_id, user_input[:500], system_response[:1000], summary.strip(), datetime.now().isoformat()))
conn.commit()
conn.close()
```
✅ This executes successfully and commits
3. **Retrieval Code** (src/context_manager.py:656-671):
```python
cursor.execute("""
SELECT interaction_summary, created_at, needs_refresh
FROM interaction_contexts
WHERE session_id = ? AND (needs_refresh IS NULL OR needs_refresh = 0)
ORDER BY created_at DESC
LIMIT 20
""", (session_id,))
```
✅ This query would work, but is never executed due to cache hit
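The claim that the two statements work together can be checked in isolation against an in-memory SQLite database. In the sketch below the table schema is an assumption: only the column names are taken from the queries above, and the column types, primary key, and sample values are placeholders.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# Assumed schema: column names come from the queries, types are guesses.
cursor.execute("""
    CREATE TABLE interaction_contexts (
        interaction_id TEXT PRIMARY KEY,
        session_id TEXT,
        user_input TEXT,
        system_response TEXT,
        interaction_summary TEXT,
        created_at TEXT,
        needs_refresh INTEGER
    )
""")

# Same INSERT statement as the storage code, with placeholder values.
cursor.execute("""
    INSERT OR REPLACE INTO interaction_contexts
    (interaction_id, session_id, user_input, system_response, interaction_summary, created_at)
    VALUES (?, ?, ?, ?, ?, ?)
""", ("demo_interaction", "demo_session", "Tell me about Excel handling",
      "placeholder response", "placeholder summary", datetime.now().isoformat()))
conn.commit()

# Same SELECT statement as the retrieval code.
cursor.execute("""
    SELECT interaction_summary, created_at, needs_refresh
    FROM interaction_contexts
    WHERE session_id = ? AND (needs_refresh IS NULL OR needs_refresh = 0)
    ORDER BY created_at DESC
    LIMIT 20
""", ("demo_session",))
rows = cursor.fetchall()
conn.close()

print(len(rows))  # 1
```

Because `needs_refresh` is not set by the INSERT, it is stored as NULL and the row passes the SELECT filter, which is consistent with the persistence layer being sound.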
## Cache Invalidation Points
Current cache invalidation only happens in these scenarios:
1. **Session End**: `end_session()` clears cache (lines 534-536)
2. **User Change**: User mismatch detection clears cache (lines 254-255)
3. **Never**: After generating interaction context ❌
## Expected vs Actual Behavior
### Expected Behavior:
```
Request 1 → Generate context → Store in DB → Clear cache
Request 2 → Cache miss → Query DB → Find stored context → Use it
```
### Actual Behavior:
```
Request 1 → Generate context → Store in DB → Keep cache (stale)
Request 2 → Cache hit → Return stale cache (0 contexts) → Never query DB
```
## Evidence from Logs
```
# First Request - Context Generation
2025-10-31 06:55:55,481 - src.context_manager - INFO - ✓ Generated interaction context for 64d4ace2_15ca4dec_1761890055
# Second Request - Cache Hit (No DB Query)
2025-10-31 07:02:55,911 - src.context_manager - INFO - Context retrieved: 0 interaction contexts
```
**Time Gap**: 7 minutes between requests (well beyond 5-second orchestrator cache TTL)
**Result**: Still 0 contexts → Session cache hit, database never queried
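The gap can be computed directly from the two log timestamps. The format string below assumes Python's default `logging` timestamp layout, where a comma separates seconds from milliseconds.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"  # assumed default logging layout

stored_at = datetime.strptime("2025-10-31 06:55:55,481", LOG_TIME_FORMAT)
retrieved_at = datetime.strptime("2025-10-31 07:02:55,911", LOG_TIME_FORMAT)

gap_seconds = (retrieved_at - stored_at).total_seconds()
print(gap_seconds)        # 420.43
print(gap_seconds > 5.0)  # True: the 5-second orchestrator TTL had expired
```

A gap of roughly 420 seconds rules out the orchestrator cache as the source of the stale read in this instance, which leaves the session cache.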
## Impact Assessment
### Functional Impact:
- **HIGH**: Conversation continuity completely broken
- Users cannot reference previous responses
- Each request treated as isolated, losing all context
### User Experience Impact:
- **HIGH**: Responses generated for wrong topics
- Frustration when "based on above inputs" is ignored
- Loss of trust in system reliability
### Performance Impact:
- **LOW**: Cache is working (too well - preventing fresh data retrieval)
- Database queries being avoided (but should happen after context generation)
## Conclusion
The interaction context system is **architecturally sound** but has a **critical cache invalidation bug**:
1. βœ… Interaction contexts are correctly generated
2. βœ… Interaction contexts are correctly stored in database
3. βœ… Database retrieval query is correctly implemented
4. ❌ Cache is never invalidated after interaction context generation
5. ❌ Cache hit prevents database query from executing
6. ❌ Stale cached context (with 0 interaction contexts) is returned
**The fix requires** invalidating both:
- Context Manager's `session_cache` after `generate_interaction_context()`
- Orchestrator's `_context_cache` after `generate_interaction_context()`
This will force fresh database queries on subsequent requests, allowing stored interaction contexts to be retrieved and used.
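A minimal sketch of that invalidation step is shown below, assuming both caches are plain dictionaries keyed as described earlier. The helper name `invalidate_context_caches` and its signature are hypothetical. In the real code the equivalent calls would sit after `generate_interaction_context()` and operate on `self.session_cache` and `self._context_cache`.

```python
from typing import Dict


def invalidate_context_caches(session_id: str,
                              session_cache: Dict[str, dict],
                              context_cache: Dict[str, object]) -> None:
    """Drop one session's cached context from both tiers (hypothetical helper)."""
    # pop() with a default is a no-op when the key is absent, so this is safe
    # to call even if a tier was never populated for the session.
    session_cache.pop(f"session_{session_id}", None)
    context_cache.pop(f"context_{session_id}", None)


# Usage: both tiers hold a stale, empty context before invalidation.
session_cache = {"session_cca279a4": {"interaction_contexts": []}}
context_cache = {"context_cca279a4": {"interaction_contexts": []}}
invalidate_context_caches("cca279a4", session_cache, context_cache)
print(session_cache, context_cache)  # {} {}
```

Removing the entries outright, rather than patching the new summary into the cached object, keeps the database as the single source of truth at the cost of one extra query on the next request.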
## Files Involved
1. `src/orchestrator_engine.py` - Lines 442-450 (missing cache invalidation)
2. `src/orchestrator_engine.py` - Lines 83-113 (orchestrator cache)
3. `src/context_manager.py` - Lines 235-289 (session cache management)
4. `src/context_manager.py` - Lines 396-451 (interaction context generation)
## Additional Notes
- The cache mechanism itself is working as designed (performance optimization)
- The bug is in the **cache lifecycle management** (invalidation timing)
- Database operations are functioning correctly
- The issue is purely in the caching layer, not the persistence layer