Research_AI_Assistant / INTERACTION_CONTEXT_FAILURE_ANALYSIS.md

JatsTheAIGen

cache key error when user id changes -fixed task 1 31_10_2025 v6

93f44e2 about 1 month ago


	# Interaction Context Retrieval Failure - Root Cause Analysis

	## Executive Summary

	Interaction contexts are being stored correctly in the database, but are not being retrieved on subsequent requests due to a cache invalidation failure. The system returns stale cached context that doesn't include newly generated interaction contexts.

	## Problem Statement

	When a user submits a request referencing previous context (e.g., "based on above inputs"), the system reports `Context retrieved: 0 interaction contexts`, causing:
	- Loss of conversation continuity
	- Responses generated for wrong topics
	- Previous interaction context unavailable to agents

	## Root Cause Analysis

	### The Caching Flow

	The system uses a two-tier caching mechanism:

	1. Context Manager Cache (`src/context_manager.py`):
	- Key: `session_{session_id}`
	- Storage: `self.session_cache` dictionary
	- Purpose: Cache session context to avoid database queries

	2. Orchestrator Cache (`src/orchestrator_engine.py`):
	- Key: `context_{session_id}`
	- Storage: `self._context_cache` dictionary
	- TTL: 5 seconds
	- Purpose: Prevent rapid repeated context retrieval within same request processing
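The two-tier lookup described above can be sketched as follows. This is a minimal illustration, not the project's code: the module-level dictionaries stand in for `self._context_cache` and `self.session_cache`, and `get_context`, `load_from_db`, and the `now` parameter are hypothetical names introduced for the example.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

ORCHESTRATOR_TTL_SECONDS = 5.0  # TTL stated in the analysis

# Hypothetical stand-ins for self._context_cache and self.session_cache
orchestrator_cache: Dict[str, Tuple[float, Any]] = {}
session_cache: Dict[str, Any] = {}


def get_context(session_id: str,
                load_from_db: Callable[[str], Any],
                now: Optional[float] = None) -> Any:
    """Look up context through both cache tiers before touching the database."""
    now = time.monotonic() if now is None else now

    # Tier 1: orchestrator cache, keyed "context_{session_id}", 5-second TTL
    outer_key = f"context_{session_id}"
    entry = orchestrator_cache.get(outer_key)
    if entry is not None and now - entry[0] < ORCHESTRATOR_TTL_SECONDS:
        return entry[1]

    # Tier 2: context manager cache, keyed "session_{session_id}", no expiry
    inner_key = f"session_{session_id}"
    if inner_key in session_cache:
        context = session_cache[inner_key]
    else:
        context = load_from_db(session_id)  # only reached on a double miss
        session_cache[inner_key] = context

    orchestrator_cache[outer_key] = (now, context)
    return context
```

Note that the second tier has no expiry in this sketch, so once a session is cached the loader is not called again unless the entry is removed explicitly.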

	### The Failure Sequence

	#### First Request (Working - Context Storage):
	```
	1. User: "Tell me about Excel handling"
	2. orchestrator.process_request() called
	3. _get_or_create_context() checks orchestrator cache → MISS (empty)
	4. Calls context_manager.manage_context()
	5. manage_context() checks session_cache → MISS (empty)
	6. Calls _retrieve_from_db()
	7. Database query: SELECT interaction_summary FROM interaction_contexts WHERE session_id = ?
	→ Returns 0 rows (new session)
	8. Returns context: { interaction_contexts: [] }
	9. Caches in session_cache: session_cache["session_cca279a4"] = { interaction_contexts: [] }
	10. Response generated about Excel handling
	11. generate_interaction_context() called
	12. LLM generates 50-token summary
	13. Database INSERT: INSERT INTO interaction_contexts (interaction_id, session_id, ...)
	→ ✅ SUCCESS: Interaction context stored in database
	14. CRITICAL MISSING STEP: Cache NOT invalidated
	```

	#### Second Request (Broken - Context Retrieval):
	```
	1. User: "Based on above inputs, create a prototype"
	2. orchestrator.process_request() called
	3. _get_or_create_context() checks orchestrator cache:
	- If entry is < 5 seconds old → Returns cached context (from the first request)
	- Otherwise → continues to step 4
	4. Calls context_manager.manage_context()
	5. manage_context() checks session_cache:
	session_cache.get("session_cca279a4")
	→ ✅ CACHE HIT: Returns cached context from first request
	→ Contains: { interaction_contexts: [] }
	6. NEVER queries database because cache hit
	7. Context returned with 0 interaction contexts
	8. Logs show: "Context retrieved: 0 interaction contexts"
	9. Intent agent receives empty context
	10. Skills agent analyzes wrong topic
	11. Response generated for wrong context (story generation, not Excel)
	```

	### Root Cause Identified

	PRIMARY ISSUE: Cache Invalidation Failure

	After `generate_interaction_context()` successfully stores an interaction context in the database, the cache is never invalidated. This causes:

	1. First Request: Context cached with `interaction_contexts = []`
	2. Interaction Context Generated: Stored in database ✅
	3. Cache Not Cleared: `session_cache["session_{session_id}"]` still contains old context
	4. Second Request: Cache hit returns stale context with 0 interaction contexts
	5. Database Never Queried: Cache check happens before database query

	Location of Issue:
	- File: `src/orchestrator_engine.py`
	- Method: `process_request()`
	- Lines: 442-450 (after `generate_interaction_context()` call)
	- Missing: Cache invalidation after interaction context generation
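The stale read can be reproduced in isolation. The sketch below is an assumption-laden miniature, not the project's code: it uses an in-memory SQLite database with a trimmed, assumed table schema, and the functions only borrow the names `manage_context`, `generate_interaction_context`, and `_retrieve_from_db` from this analysis.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed, assumed schema: only the columns needed for the reproduction
conn.execute(
    "CREATE TABLE interaction_contexts ("
    "interaction_id TEXT PRIMARY KEY, session_id TEXT, interaction_summary TEXT)"
)

session_cache = {}  # stand-in for the context manager's self.session_cache


def retrieve_from_db(session_id):
    rows = conn.execute(
        "SELECT interaction_summary FROM interaction_contexts WHERE session_id = ?",
        (session_id,),
    ).fetchall()
    return {"interaction_contexts": [row[0] for row in rows]}


def manage_context(session_id):
    key = f"session_{session_id}"
    if key not in session_cache:  # the cache check happens before the DB query
        session_cache[key] = retrieve_from_db(session_id)
    return session_cache[key]


def generate_interaction_context(session_id, interaction_id, summary):
    conn.execute(
        "INSERT OR REPLACE INTO interaction_contexts VALUES (?, ?, ?)",
        (interaction_id, session_id, summary),
    )
    conn.commit()
    # No cache invalidation here: this is the bug being reproduced.


first = manage_context("cca279a4")    # request 1: caches an empty list
generate_interaction_context("cca279a4", "i1", "User asked about Excel handling")
second = manage_context("cca279a4")   # request 2: cache hit, stale
fresh = retrieve_from_db("cca279a4")  # what the database actually holds

print(len(second["interaction_contexts"]), len(fresh["interaction_contexts"]))  # prints "0 1"
```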

	### Secondary Issues

	#### Issue 2: Orchestrator-Level Cache Also Not Cleared

	The orchestrator maintains its own cache (`_context_cache`) with a 5-second TTL. If a follow-up request arrives within 5 seconds of the entry being cached:

	- Orchestrator cache hit: Returns cached context immediately
	- Context manager never called: Never checks session_cache or database
	- Result: Even if session_cache were cleared, orchestrator cache would still return stale data

	Location:
	- File: `src/orchestrator_engine.py`
	- Method: `_get_or_create_context()`
	- Lines: 89-93
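To illustrate why the TTL alone does not protect against stale reads inside the 5-second window, here is a small TTL cache sketch. The `TTLCache` class and its injectable `clock` are hypothetical constructs for this example; the real orchestrator stores its entries in a plain `_context_cache` dictionary.

```python
import time


class TTLCache:
    """Minimal time-to-live cache with an injectable clock for testing."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (stored_at, value)

    def set(self, key, value):
        self._entries[key] = (self.clock(), value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at >= self.ttl:
            del self._entries[key]  # expired: drop and report a miss
            return None
        return value

    def invalidate(self, key):
        self._entries.pop(key, None)


fake_now = [0.0]  # mutable fake clock so the example needs no sleeping
cache = TTLCache(5.0, clock=lambda: fake_now[0])
cache.set("context_abc", {"interaction_contexts": []})

fake_now[0] = 3.0
within_ttl = cache.get("context_abc")  # still served, even if the DB has changed

cache.invalidate("context_abc")
after_invalidate = cache.get("context_abc")  # explicit invalidation forces a miss
```

Within the TTL, only an explicit `invalidate` call removes the entry; waiting for expiry would leave a window in which stale context is still returned.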

	#### Issue 3: No Detection of Context Reference Mismatches

	When a user explicitly references previous context (e.g., "based on above inputs"), but the system has 0 interaction contexts, there's no mechanism to:

	1. Detect the mismatch
	2. Force cache invalidation
	3. Re-query the database
	4. Warn about potential context loss

	Location:
	- File: `src/orchestrator_engine.py`
	- Method: `process_request()`
	- Lines: 172-174 (context retrieval happens, but no validation)
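A mismatch check of this kind could look roughly like the sketch below. The phrase list, `REFERENCE_PATTERN`, and both function names are illustrative assumptions; nothing like this exists in the codebase according to this analysis, and a real phrase list would need tuning.

```python
import re

# Hypothetical phrase list for spotting references to earlier turns
REFERENCE_PATTERN = re.compile(
    r"\b(based on (the )?above"
    r"|as (mentioned|discussed) (above|earlier|before)"
    r"|previous (response|answer|inputs?)"
    r"|from earlier)\b",
    re.IGNORECASE,
)


def references_prior_context(user_input):
    """Return True when the input appears to point at earlier conversation."""
    return bool(REFERENCE_PATTERN.search(user_input))


def needs_context_refresh(user_input, context):
    """Flag the mismatch: the user refers back, but no contexts were retrieved."""
    has_contexts = bool(context.get("interaction_contexts"))
    return references_prior_context(user_input) and not has_contexts
```

When `needs_context_refresh` returns True, a caller could clear both caches, re-query the database, and log a warning if the result is still empty.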

	## Code Flow Analysis

	### Storage Flow (Working)

	```
	orchestrator.process_request()
	└─> generate_interaction_context()
	└─> llm_router.route_inference() → Generate summary
	└─> Database INSERT → Store in interaction_contexts table
	└─> ✅ SUCCESS: Stored in database
	└─> ❌ MISSING: Cache invalidation
	```

	### Retrieval Flow (Broken)

	```
	orchestrator.process_request()
	└─> _get_or_create_context()
	├─> Check orchestrator cache (5s TTL)
	│ └─> If hit: Return cached (may be stale)
	└─> manage_context()
	├─> Check session_cache
	│ └─> If hit: Return cached (STALE - has 0 contexts)
	└─> _retrieve_from_db() (NEVER REACHED if cache hit)
	└─> Query: SELECT FROM interaction_contexts WHERE session_id = ?
	└─> Would return stored contexts, but never called
	```

	## Database Verification

	The interaction context IS being stored correctly. Evidence:

	1. Log Entry:
	```
	2025-10-31 06:55:55,481 - src.context_manager - INFO - ✓ Generated interaction context for 64d4ace2_15ca4dec_1761890055
	```

	2. Storage Code (src/context_manager.py:426-438):
	```python
	# Excerpt: persist the generated summary, truncating long inputs
	cursor.execute("""
	    INSERT OR REPLACE INTO interaction_contexts
	    (interaction_id, session_id, user_input, system_response, interaction_summary, created_at)
	    VALUES (?, ?, ?, ?, ?, ?)
	""", (
	    interaction_id, session_id, user_input[:500], system_response[:1000],
	    summary.strip(), datetime.now().isoformat(),
	))
	conn.commit()
	conn.close()
	```
	✅ This executes successfully and commits

	3. Retrieval Code (src/context_manager.py:656-671):
	```python
	# Excerpt: fetch the 20 most recent summaries that are not flagged for refresh
	cursor.execute("""
	    SELECT interaction_summary, created_at, needs_refresh
	    FROM interaction_contexts
	    WHERE session_id = ? AND (needs_refresh IS NULL OR needs_refresh = 0)
	    ORDER BY created_at DESC
	    LIMIT 20
	""", (session_id,))
	```
	✅ This query would work, but is never executed due to cache hit
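To check what the retrieval query returns when it does run, it can be exercised against an in-memory SQLite database. The table definition and sample rows below are assumptions made for the example (the real schema has more columns); only the `SELECT` statement is taken from the excerpt above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed, trimmed schema for illustration only
conn.execute("""
    CREATE TABLE interaction_contexts (
        interaction_id TEXT PRIMARY KEY,
        session_id TEXT,
        interaction_summary TEXT,
        created_at TEXT,
        needs_refresh INTEGER
    )
""")
conn.executemany(
    "INSERT INTO interaction_contexts VALUES (?, ?, ?, ?, ?)",
    [
        ("i1", "s1", "Excel handling overview", "2025-10-31T06:55:55", None),
        ("i2", "s1", "Flagged for refresh", "2025-10-31T06:57:00", 1),
        ("i3", "s1", "Prototype request", "2025-10-31T07:02:55", 0),
        ("i4", "s2", "Different session", "2025-10-31T07:00:00", None),
    ],
)

rows = conn.execute("""
    SELECT interaction_summary, created_at, needs_refresh
    FROM interaction_contexts
    WHERE session_id = ? AND (needs_refresh IS NULL OR needs_refresh = 0)
    ORDER BY created_at DESC
    LIMIT 20
""", ("s1",)).fetchall()

summaries = [row[0] for row in rows]
print(summaries)  # prints ['Prototype request', 'Excel handling overview']
```

ISO-8601 timestamps stored as text sort chronologically, which is why `ORDER BY created_at DESC` returns the newest summary first here.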

	## Cache Invalidation Points

	Current cache invalidation only happens in these scenarios:

	1. Session End: `end_session()` clears cache (lines 534-536)
	2. User Change: User mismatch detection clears cache (lines 254-255)

	The cache is never invalidated after an interaction context is generated ❌

	## Expected vs Actual Behavior

	### Expected Behavior:
	```
	Request 1 → Generate context → Store in DB → Clear cache
	Request 2 → Cache miss → Query DB → Find stored context → Use it
	```

	### Actual Behavior:
	```
	Request 1 → Generate context → Store in DB → Keep cache (stale)
	Request 2 → Cache hit → Return stale cache (0 contexts) → Never query DB
	```

	## Evidence from Logs

	```
	# First Request - Context Generation
	2025-10-31 06:55:55,481 - src.context_manager - INFO - ✓ Generated interaction context for 64d4ace2_15ca4dec_1761890055

	# Second Request - Cache Hit (No DB Query)
	2025-10-31 07:02:55,911 - src.context_manager - INFO - Context retrieved: 0 interaction contexts
	```

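	- Time gap: about 7 minutes between the two log entries (06:55:55 → 07:02:55), well beyond the 5-second orchestrator cache TTL, so the orchestrator cache entry had already expired
	- Result: still 0 contexts, which points to a session cache hit with the database never queried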
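The gap between the two log entries can be checked directly from their timestamps. This is a small arithmetic sketch; the format string assumes the default Python `logging` timestamp layout seen in the log lines.

```python
from datetime import datetime, timedelta

LOG_FORMAT = "%Y-%m-%d %H:%M:%S,%f"  # matches "2025-10-31 06:55:55,481"
ORCHESTRATOR_TTL = timedelta(seconds=5)

stored = datetime.strptime("2025-10-31 06:55:55,481", LOG_FORMAT)
retrieved = datetime.strptime("2025-10-31 07:02:55,911", LOG_FORMAT)

gap = retrieved - stored
print(gap.total_seconds())     # prints 420.43
print(gap > ORCHESTRATOR_TTL)  # prints True
```

Because the gap exceeds the TTL by a wide margin, the orchestrator cache cannot explain the empty result for this pair of requests; the non-expiring session cache can.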

	## Impact Assessment

	### Functional Impact:
	- HIGH: Conversation continuity completely broken
	- Users cannot reference previous responses
	- Each request treated as isolated, losing all context

	### User Experience Impact:
	- HIGH: Responses generated for wrong topics
	- Frustration when "based on above inputs" is ignored
	- Loss of trust in system reliability

	### Performance Impact:
	- LOW: The cache itself performs as intended; the problem is that it keeps serving data that is no longer fresh
	- Database queries are being avoided, including the one that should run after context generation

	## Conclusion

	The interaction context system is architecturally sound but has a critical cache invalidation bug:

	1. ✅ Interaction contexts are correctly generated
	2. ✅ Interaction contexts are correctly stored in database
	3. ✅ Database retrieval query is correctly implemented
	4. ❌ Cache is never invalidated after interaction context generation
	5. ❌ Cache hit prevents database query from executing
	6. ❌ Stale cached context (with 0 interaction contexts) is returned

	The fix requires invalidating both:
	- Context Manager's `session_cache` after `generate_interaction_context()`
	- Orchestrator's `_context_cache` after `generate_interaction_context()`

	This will force fresh database queries on subsequent requests, allowing stored interaction contexts to be retrieved and used.
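One possible shape for the fix is sketched below, assuming the key formats described earlier (`session_{session_id}` and `context_{session_id}`). The classes are stripped-down stand-ins, and `invalidate_session_cache` and `invalidate_context_caches` are hypothetical method names, not existing APIs in the codebase.

```python
class ContextManager:
    """Stand-in holding only the session cache."""

    def __init__(self):
        self.session_cache = {}

    def invalidate_session_cache(self, session_id):
        # pop with a default so a missing key is not an error
        self.session_cache.pop(f"session_{session_id}", None)


class Orchestrator:
    """Stand-in holding only the orchestrator-level cache."""

    def __init__(self, context_manager):
        self.context_manager = context_manager
        self._context_cache = {}

    def invalidate_context_caches(self, session_id):
        """Clear both tiers so the next request re-queries the database."""
        self._context_cache.pop(f"context_{session_id}", None)
        self.context_manager.invalidate_session_cache(session_id)


manager = ContextManager()
orchestrator = Orchestrator(manager)

# Simulate the stale state left behind by the first request
manager.session_cache["session_cca279a4"] = {"interaction_contexts": []}
orchestrator._context_cache["context_cca279a4"] = {"interaction_contexts": []}

# Intended call site: right after generate_interaction_context() succeeds
orchestrator.invalidate_context_caches("cca279a4")
```

Invalidating per session, rather than clearing the whole dictionaries, keeps cached context for other active sessions intact.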

	## Files Involved

	1. `src/orchestrator_engine.py` - Lines 442-450 (missing cache invalidation)
	2. `src/orchestrator_engine.py` - Lines 83-113 (orchestrator cache)
	3. `src/context_manager.py` - Lines 235-289 (session cache management)
	4. `src/context_manager.py` - Lines 396-451 (interaction context generation)

	## Additional Notes

	- The cache mechanism itself is working as designed (performance optimization)
	- The bug is in the cache lifecycle management (invalidation timing)
	- Database operations are functioning correctly
	- The issue is purely in the caching layer, not the persistence layer