JatsTheAIGen committed
Commit 93f44e2 · 1 Parent(s): f89bd21

cache key error when user id changes -fixed task 1 31_10_2025 v6

CACHE_INVALIDATION_FIX_IMPLEMENTED.md ADDED
@@ -0,0 +1,201 @@
# Cache Invalidation Fix - Implementation Summary

## Overview

Implemented targeted fixes for interaction context retrieval failures by adding proper cache invalidation. All changes **only affect cache management** and do not modify application functionality.

## Issues Fixed

### Issue 1: Primary - Cache Not Invalidated After Interaction Context Generation ✅

**Location**: `src/orchestrator_engine.py` (lines 449-457)

**Fix**: After successfully generating and storing an interaction context, invalidate both caches to force fresh retrieval on the next request.

**Changes**:
- Call `invalidate_session_cache()` after `generate_interaction_context()`
- Also clear the orchestrator-level `_context_cache`
- Added logging for cache invalidation

**Impact**: The next request will query the database instead of using the stale cache, retrieving newly generated interaction contexts.

### Issue 2: Secondary - Orchestrator Cache Not Cleared ✅

**Location**: `src/orchestrator_engine.py` (lines 453-457)

**Fix**: When invalidating the session cache, also clear the orchestrator-level cache (`_context_cache`).

**Changes**:
- Check whether `_context_cache` exists and contains an entry for the session
- Delete the orchestrator cache entry if present
- Added debug logging

**Impact**: Prevents the orchestrator cache from returning stale data even if the session cache is cleared.

### Issue 3: Tertiary - Context Reference Mismatch Detection ✅

**Location**: `src/orchestrator_engine.py` (lines 174-195)

**Fix**: Detect when users reference previous context but the cache shows 0 contexts, then force a cache refresh.

**Changes**:
- Detect phrases like "based on above inputs", "previous response", etc.
- If the user references previous context but has 0 interaction contexts, invalidate both caches
- Force re-retrieval of context from the database
- Log a warning when a mismatch is detected and an info message when refreshed

**Impact**: When users explicitly reference previous context, the system will refresh the cache and retrieve stored interaction contexts.

### Issue 4: Cache Invalidation Method Added ✅

**Location**: `src/context_manager.py` (lines 1045-1053)

**Fix**: Added a dedicated method for cache invalidation.

**Changes**:
- Added `invalidate_session_cache(session_id)` method
- Method safely checks whether the cache entry exists before deletion
- Added logging for cache invalidation

**Impact**: Provides a clean API for cache invalidation that can be reused throughout the codebase.

## Code Changes Summary

### File: `src/context_manager.py`

**Added Method** (lines 1045-1053):
```python
def invalidate_session_cache(self, session_id: str):
    """
    Invalidate cached context for a session to force fresh retrieval
    Only affects cache management - does not change application functionality
    """
    session_cache_key = f"session_{session_id}"
    if session_cache_key in self.session_cache:
        del self.session_cache[session_cache_key]
        logger.info(f"Cache invalidated for session {session_id} to ensure fresh context retrieval")
```

### File: `src/orchestrator_engine.py`

**Change 1** - Cache invalidation after interaction context generation (lines 449-457):
```python
# After generate_interaction_context()
# Invalidate caches to ensure fresh context retrieval on next request
# Only affects cache management - does not change application functionality
self.context_manager.invalidate_session_cache(session_id)
# Also clear orchestrator-level cache
if hasattr(self, '_context_cache'):
    orchestrator_cache_key = f"context_{session_id}"
    if orchestrator_cache_key in self._context_cache:
        del self._context_cache[orchestrator_cache_key]
        logger.debug(f"Orchestrator cache invalidated for session {session_id}")
```

**Change 2** - Context reference mismatch detection (lines 174-195):
```python
# Detect context reference mismatches and force cache refresh if needed
# Only affects cache management - does not change application functionality
user_input_lower = user_input.lower()
references_previous = any(phrase in user_input_lower for phrase in [
    'based on above', 'based on previous', 'above inputs', 'previous response',
    'last response', 'earlier', 'before', 'mentioned above', 'as discussed',
    'from above', 'from previous', 'the above', 'the previous'
])

interaction_contexts_count = len(context.get('interaction_contexts', []))
if references_previous and interaction_contexts_count == 0:
    logger.warning(f"User references previous context but cache shows 0 contexts - forcing cache refresh")
    # Invalidate both caches and re-retrieve
    self.context_manager.invalidate_session_cache(session_id)
    if hasattr(self, '_context_cache'):
        orchestrator_cache_key = f"context_{session_id}"
        if orchestrator_cache_key in self._context_cache:
            del self._context_cache[orchestrator_cache_key]
    # Force fresh context retrieval
    context = await self._get_or_create_context(session_id, user_input, user_id)
    interaction_contexts_count = len(context.get('interaction_contexts', []))
    logger.info(f"Context refreshed after cache invalidation: {interaction_contexts_count} interaction contexts")
```

## Impact Assessment

### Application Functionality
- **NO CHANGES**: All existing functionality preserved
- Only cache management logic modified
- Database operations unchanged
- Agent execution unchanged
- Response generation unchanged

### Cache Behavior
- **IMPROVED**: Cache now invalidated at appropriate times
- Fresh data retrieved when needed
- Stale cache no longer prevents context retrieval

### Performance
- **MINIMAL IMPACT**: Cache invalidation is an O(1) operation
- May result in one extra database query per request (only when the cache was invalidated)
- Trade-off: better accuracy at minimal performance cost

## Expected Behavior After Fix

### Scenario 1: Normal Request Flow
```
Request 1: "Tell me about Excel handling"
  → Generate response
  → Store interaction context in DB
  → Invalidate cache ✅
Request 2: "Create a prototype"
  → Cache miss (invalidated)
  → Query database
  → Retrieve interaction context from DB ✅
  → Use context for response generation
```

### Scenario 2: Context Reference Detection
```
Request 1: "Tell me about Excel handling"
  → Generate response
  → Store interaction context in DB
  → Invalidate cache ✅
Request 2: "Based on above inputs, create prototype"
  → Cache miss (invalidated) OR cache hit with 0 contexts
  → Detect "based on above" reference ✅
  → Force cache invalidation ✅
  → Query database
  → Retrieve interaction context from DB ✅
  → Use context for response generation
```

## Testing Recommendations

1. **Test Normal Flow** (a unit-test sketch follows this list):
   - Send request 1 → Verify cache invalidated in logs
   - Send request 2 → Verify interaction context retrieved (> 0)
   - Verify response correctly references previous discussion

2. **Test Context Reference Detection**:
   - Send request 1
   - Send request 2 with "based on above inputs"
   - Verify log: "forcing cache refresh"
   - Verify log: "Context refreshed after cache invalidation: X interaction contexts"
   - Verify response correctly references previous discussion

3. **Verify No Functionality Regression**:
   - All existing features work as before
   - Response quality unchanged
   - Performance acceptable (may have 1 extra DB query)

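The invalidation method itself can also be covered directly. A minimal pytest sketch, assuming `ContextManager` can be constructed without arguments and exposes `session_cache` as a plain dict (both assumptions; adjust to the real constructor):

```python
# Unit-test sketch for invalidate_session_cache(). The import path, no-arg
# constructor, and plain-dict session_cache are assumptions for illustration.
from src.context_manager import ContextManager

def test_invalidate_session_cache_removes_entry():
    cm = ContextManager()
    cm.session_cache["session_abc123"] = {"interaction_contexts": []}  # seed a stale entry
    cm.invalidate_session_cache("abc123")
    assert "session_abc123" not in cm.session_cache

def test_invalidate_session_cache_is_noop_for_unknown_session():
    cm = ContextManager()
    cm.invalidate_session_cache("missing")  # must not raise
```
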
## Logging Changes

New log entries to monitor:
- `"Cache invalidated for session {session_id} to ensure fresh context retrieval"`
- `"Orchestrator cache invalidated for session {session_id}"` (debug level)
- `"User references previous context but cache shows 0 contexts - forcing cache refresh"` (warning)
- `"Context refreshed after cache invalidation: X interaction contexts"`

## Related Documentation

- See `INTERACTION_CONTEXT_FAILURE_ANALYSIS.md` for detailed root cause analysis
- Changes only affect cache management, as identified in the analysis

CONTEXT_CACHE_USAGE_REPORT.md ADDED
@@ -0,0 +1,218 @@
# Context Cache Usage Report

## Executive Summary

This report analyzes how agents and components access interaction, session, and user contexts. The analysis confirms that agents use cache-only context, but identifies one area where session context generation queries the database directly (which is acceptable since it only runs at session end).

## Context Access Flow

### 1. Context Retrieval Pattern

**Orchestrator → Context Manager → Cache → Agents**

```
orchestrator.process_request()
└─> _get_or_create_context()
    └─> context_manager.manage_context()
        ├─> Check session_cache (cache-first)
        ├─> Check user_cache (cache-first)
        ├─> Only queries DB if cache miss
        └─> Returns cached context to orchestrator
            └─> Passes to agents
```

### 2. Cache-Only Access Verification

All agents receive context from the orchestrator, which gets it from cache:

#### Intent Agent (`src/agents/intent_agent.py`)
- **Context Source**: From orchestrator (lines 201-204)
- **Context Access**:
  - Uses `combined_context` (pre-formatted from cache)
  - Falls back to `interaction_contexts` and `user_context` from cache
- **Status**: ✅ Cache-only

#### Skills Identification Agent (`src/agents/skills_identification_agent.py`)
- **Context Source**: From orchestrator (lines 218-221)
- **Context Access**:
  - Uses `user_context` from cache (line 230)
  - Uses `interaction_contexts` from cache (line 231)
- **Status**: ✅ Cache-only

#### Synthesis Agent (`src/agents/synthesis_agent.py`)
- **Context Source**: From orchestrator (lines 283-287)
- **Context Access**:
  - Uses `interaction_contexts` from cache (lines 299, 358, 550)
  - Uses `user_context` from cache (implied in context dict)
- **Status**: ✅ Cache-only

#### Safety Agent (`src/agents/safety_agent.py`)
- **Context Source**: From orchestrator (lines 314-317)
- **Context Access**:
  - Uses `user_context` from cache (line 158)
  - Uses `interaction_contexts` from cache (line 163)
- **Status**: ✅ Cache-only

## Context Manager Cache Behavior

### Session Context (Interaction Contexts)

**Location**: `src/context_manager.py` - `manage_context()` (lines 235-289)

**Cache Strategy**:
1. **Check cache first** (line 247): `session_context = self._get_from_memory_cache(session_cache_key)`
2. **Query database only on cache miss** (lines 260-262): only when `not session_context`
3. **Cache immediately after DB query** (line 265): warm the cache with fresh data
4. **Update cache synchronously** (line 444): when an interaction context is generated, the cache is updated immediately via `_update_cache_with_interaction_context()`

**Result**: ✅ **Cache-first, DB only on miss**

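Stripped of the surrounding plumbing, this pattern reduces to the following sketch; the plain-dict cache and injected query function are simplifications for illustration, not the actual signatures:

```python
from typing import Any, Awaitable, Callable, Dict

# Standalone sketch of the cache-first retrieval described above.
async def get_session_context(
    session_id: str,
    cache: Dict[str, Any],
    query_db: Callable[[str], Awaitable[dict]],
) -> dict:
    key = f"session_{session_id}"
    if key in cache:
        return cache[key]                  # 1. cache hit: no DB access
    context = await query_db(session_id)   # 2. query DB only on cache miss
    cache[key] = context                   # 3. warm the cache immediately
    return context
```
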
### User Context

**Location**: `src/context_manager.py` - `manage_context()` (lines 267-281)

**Cache Strategy**:
1. **Check cache first** (line 258): `user_context = self._get_from_memory_cache(user_cache_key)`
2. **Query database only if not cached** (line 269): `if not user_context or not user_context.get("user_context_loaded")`
3. **Cache after first load** (line 277): `self._warm_memory_cache(user_cache_key, user_context)`
4. **Never queries the DB again** (lines 279-281): uses the cached user context for all subsequent requests

**Result**: ✅ **Load once, cache forever (no DB queries after initial load)**

### Interaction Contexts

**Location**: `src/context_manager.py` - `_update_cache_with_interaction_context()` (lines 770-809)

**Cache Strategy**:
1. **Updated synchronously with the database** (line 444): called immediately after the DB insert
2. **Adds to the cached list** (line 788): updates the in-memory cache without a DB query
3. **Maintains the most recent 20** (line 789): matches the database query limit

**Result**: ✅ **Cache updated when DB updated, no DB query needed**

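A minimal sketch of this write-time update, assuming the cache entry is a dict holding an `interaction_contexts` list (newest first, mirroring the `ORDER BY created_at DESC LIMIT 20` query):

```python
from typing import Any, Dict

# Sketch of the synchronous cache update described above; the cache layout
# is an assumption for illustration.
def update_cached_interaction_contexts(
    cache: Dict[str, Any], session_id: str, summary: str
) -> None:
    entry = cache.setdefault(f"session_{session_id}", {})
    contexts = entry.setdefault("interaction_contexts", [])
    contexts.insert(0, summary)  # newest first
    del contexts[20:]            # cap at the 20 most recent, matching the DB limit
```
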
## Session Context Analysis

### Session Context Generation

**Location**: `src/context_manager.py` - `generate_session_context()` (lines 463-532)

**Purpose**: Generate a 100-token summary of the entire session for long-term user persona building

**When Called**: Only at session end via `end_session()` (line 540)

**Database Access**:
- ✅ **Queries database directly** (lines 469-479) to get all interaction contexts for the session
- **Why This Is Acceptable**:
  1. Only called once per session (at end)
  2. Not used by agents during conversation
  3. Used for generating the user persona summary (long-term context)
  4. Cache doesn't need to maintain a full session summary during conversation

**Result**: ✅ **Acceptable - only runs at session end, not during conversation**

### Session Context Usage

**Finding**: Session context is **NOT used by agents during conversation**.

**Evidence**:
- No agent accesses `session_context` or `session_summary` fields
- Session context is only used for generating the user persona (`get_user_context()`)
- The user persona is generated from session contexts across all sessions (line 326)

**Current Usage**:
1. Generated at session end (`generate_session_context()`)
2. Stored in the `session_contexts` table
3. Retrieved when generating the user persona (`get_user_context()` line 326)
4. The user persona is cached after first load (lines 277-281)

**Recommendation**: ✅ **Current usage is correct - session context should not be accessed during conversation**

## Cache Update Synchronization

### Interaction Context Updates

**Location**: `src/context_manager.py` - `generate_interaction_context()` (lines 422-447)

**Flow**:
1. Generate the interaction summary via LLM
2. Store it in the database (lines 427-438)
3. **Immediately update the cache** (line 444): `_update_cache_with_interaction_context()`
4. The cache contains the new interaction context before the next request

**Result**: ✅ **Cache and database synchronized at write time**

## Verification Summary

### ✅ All Agents Use Cache-Only Context

| Agent | Context Source | Access Method | Cache-Only? |
|-------|---------------|---------------|-------------|
| Intent Agent | Orchestrator | `combined_context`, `interaction_contexts`, `user_context` | ✅ Yes |
| Skills Agent | Orchestrator | `user_context`, `interaction_contexts` | ✅ Yes |
| Synthesis Agent | Orchestrator | `interaction_contexts`, `user_context` | ✅ Yes |
| Safety Agent | Orchestrator | `user_context`, `interaction_contexts` | ✅ Yes |

### ✅ Context Manager Cache Strategy

| Context Type | Cache Strategy | DB Queries |
|-------------|----------------|------------|
| Interaction Contexts | Cache-first, updated on write | Only on cache miss |
| User Context | Load once, cache forever | Only on first load |
| Session Context | Generated at session end | Only at session end (acceptable) |

### ✅ Cache Synchronization

| Operation | Cache Update | Timing |
|-----------|--------------|--------|
| Interaction Context Generated | ✅ Immediate | Same time as DB write |
| Session Context Generated | ✅ Not needed | Only at session end |
| User Context Loaded | ✅ Immediate | After first DB query |

## Findings

### ✅ Correct Behaviors

1. **Agents receive context from cache**: All agents get context from the orchestrator, which retrieves it from cache
2. **Cache-first retrieval**: The context manager checks the cache before querying the database
3. **User context cached forever**: Loaded once, never queries the database again
4. **Interaction contexts updated synchronously**: Cache updated when the database is updated
5. **Session context properly scoped**: Only generated at session end, not used during conversation

### ⚠️ Session Context Notes

1. **Session context generation queries the database**: `generate_session_context()` directly queries the `interaction_contexts` table (lines 473-479)
   - **Status**: ✅ Acceptable - only called at session end
   - **Impact**: None - not used during conversation flow
   - **Recommendation**: No change needed

2. **Session context not in cache during conversation**: Session summaries are not cached during conversation
   - **Status**: ✅ Correct behavior
   - **Reason**: Session summaries are 100-token summaries of the entire session, not needed during conversation
   - **Usage**: Only used for generating the user persona (long-term context)

## Recommendations

### ✅ No Changes Needed

All agents and components correctly use cache-only context during conversation. The only database queries during conversation flow are:
- The initial cache miss (acceptable - only happens once per session or on cache expiration)
- The user context first load (acceptable - only happens once per user)

Session context generation queries the database, but this is acceptable because it:
1. Only runs at session end
2. Is not used by agents during conversation
3. Is used for long-term user persona building

## Conclusion

**Status**: ✅ **All agents and components correctly use cache-only context**

The system follows a cache-first strategy where:
- Context is retrieved from cache
- The database is only queried on a cache miss (first request or cache expiration)
- The cache is updated immediately when the database is updated
- User context is loaded once and cached forever
- Session context generation is properly scoped (session end only)

No changes required - the system is working as designed.

INTERACTION_CONTEXT_FAILURE_ANALYSIS.md ADDED
@@ -0,0 +1,262 @@
# Interaction Context Retrieval Failure - Root Cause Analysis

## Executive Summary

Interaction contexts are being **stored correctly** in the database, but are **not being retrieved** on subsequent requests due to a cache invalidation failure. The system returns stale cached context that doesn't include newly generated interaction contexts.

## Problem Statement

When a user submits a request referencing previous context (e.g., "based on above inputs"), the system reports `Context retrieved: 0 interaction contexts`, causing:
- Loss of conversation continuity
- Responses generated for the wrong topics
- Previous interaction context unavailable to agents

## Root Cause Analysis

### The Caching Flow

The system uses a two-tier caching mechanism:

1. **Context Manager Cache** (`src/context_manager.py`):
   - Key: `session_{session_id}`
   - Storage: `self.session_cache` dictionary
   - Purpose: Cache session context to avoid database queries

2. **Orchestrator Cache** (`src/orchestrator_engine.py`):
   - Key: `context_{session_id}`
   - Storage: `self._context_cache` dictionary
   - TTL: 5 seconds (a sketch of the TTL check follows this list)
   - Purpose: Prevent rapid repeated context retrieval within the same request processing

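For illustration, the orchestrator's TTL check plausibly works as in the sketch below; the `(context, stored_at)` tuple layout and function name are assumptions, not the actual implementation:

```python
import time

# Sketch of a 5-second TTL lookup matching the description above.
TTL_SECONDS = 5.0
_context_cache: dict = {}

def get_orchestrator_cached_context(session_id: str):
    key = f"context_{session_id}"
    entry = _context_cache.get(key)
    if entry is None:
        return None                    # miss: caller falls through to manage_context()
    context, stored_at = entry
    if time.monotonic() - stored_at > TTL_SECONDS:
        del _context_cache[key]        # expired: evict and treat as a miss
        return None
    return context                     # hit: may be stale if never invalidated
```
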
### The Failure Sequence

#### **First Request (Working - Context Storage)**:
```
1. User: "Tell me about Excel handling"
2. orchestrator.process_request() called
3. _get_or_create_context() checks orchestrator cache → MISS (empty)
4. Calls context_manager.manage_context()
5. manage_context() checks session_cache → MISS (empty)
6. Calls _retrieve_from_db()
7. Database query: SELECT interaction_summary FROM interaction_contexts WHERE session_id = ?
   → Returns 0 rows (new session)
8. Returns context: { interaction_contexts: [] }
9. Caches in session_cache: session_cache["session_cca279a4"] = { interaction_contexts: [] }
10. Response generated about Excel handling
11. generate_interaction_context() called
12. LLM generates 50-token summary
13. Database INSERT: INSERT INTO interaction_contexts (interaction_id, session_id, ...)
    → ✅ SUCCESS: Interaction context stored in database
14. **CRITICAL MISSING STEP**: Cache NOT invalidated
```

#### **Second Request (Broken - Context Retrieval)**:
```
1. User: "Based on above inputs, create a prototype"
2. orchestrator.process_request() called
3. _get_or_create_context() checks orchestrator cache:
   - If < 5 seconds old → Returns cached context (from step 1)
   - OR continues to step 4
4. Calls context_manager.manage_context()
5. manage_context() checks session_cache:
   session_cache.get("session_cca279a4")
   → ✅ CACHE HIT: Returns cached context from first request
   → Contains: { interaction_contexts: [] }
6. **NEVER queries database** because cache hit
7. Context returned with 0 interaction contexts
8. Logs show: "Context retrieved: 0 interaction contexts"
9. Intent agent receives empty context
10. Skills agent analyzes wrong topic
11. Response generated for wrong context (story generation, not Excel)
```

### Root Cause Identified

**PRIMARY ISSUE**: Cache Invalidation Failure

After `generate_interaction_context()` successfully stores an interaction context in the database, **the cache is never invalidated**. This causes:

1. **First Request**: Context cached with `interaction_contexts = []`
2. **Interaction Context Generated**: Stored in database ✅
3. **Cache Not Cleared**: `session_cache["session_{session_id}"]` still contains the old context
4. **Second Request**: Cache hit returns stale context with 0 interaction contexts
5. **Database Never Queried**: The cache check happens before the database query

**Location of Issue**:
- File: `src/orchestrator_engine.py`
- Method: `process_request()`
- Lines: 442-450 (after `generate_interaction_context()` call)
- **Missing**: Cache invalidation after interaction context generation

### Secondary Issues

#### Issue 2: Orchestrator-Level Cache Also Not Cleared

The orchestrator maintains its own cache (`_context_cache`) with a 5-second TTL. If requests arrive within 5 seconds:

- **Orchestrator cache hit**: Returns cached context immediately
- **Context manager never called**: Never checks session_cache or the database
- **Result**: Even if session_cache were cleared, the orchestrator cache would still return stale data

**Location**:
- File: `src/orchestrator_engine.py`
- Method: `_get_or_create_context()`
- Lines: 89-93

#### Issue 3: No Detection of Context Reference Mismatches

When a user explicitly references previous context (e.g., "based on above inputs") but the system has 0 interaction contexts, there is no mechanism to:

1. Detect the mismatch
2. Force cache invalidation
3. Re-query the database
4. Warn about potential context loss

**Location**:
- File: `src/orchestrator_engine.py`
- Method: `process_request()`
- Lines: 172-174 (context retrieval happens, but no validation)

## Code Flow Analysis

### Storage Flow (Working)

```
orchestrator.process_request()
└─> generate_interaction_context()
    └─> llm_router.route_inference() → Generate summary
        └─> Database INSERT → Store in interaction_contexts table
            ├─> ✅ SUCCESS: Stored in database
            └─> ❌ MISSING: Cache invalidation
```

### Retrieval Flow (Broken)

```
orchestrator.process_request()
└─> _get_or_create_context()
    ├─> Check orchestrator cache (5s TTL)
    │   └─> If hit: Return cached (may be stale)
    └─> manage_context()
        ├─> Check session_cache
        │   └─> If hit: Return cached (STALE - has 0 contexts)
        └─> _retrieve_from_db() (NEVER REACHED if cache hit)
            ├─> Query: SELECT FROM interaction_contexts WHERE session_id = ?
            └─> Would return stored contexts, but never called
```

## Database Verification

The interaction context **IS being stored** correctly. Evidence:

1. **Log Entry**:
   ```
   2025-10-31 06:55:55,481 - src.context_manager - INFO - ✓ Generated interaction context for 64d4ace2_15ca4dec_1761890055
   ```

2. **Storage Code** (src/context_manager.py:426-438):
   ```python
   cursor.execute("""
       INSERT OR REPLACE INTO interaction_contexts
       (interaction_id, session_id, user_input, system_response, interaction_summary, created_at)
       VALUES (?, ?, ?, ?, ?, ?)
   """, (interaction_id, session_id, user_input[:500], system_response[:1000], summary.strip(), datetime.now().isoformat()))
   conn.commit()
   conn.close()
   ```
   ✅ This executes successfully and commits

3. **Retrieval Code** (src/context_manager.py:656-671):
   ```python
   cursor.execute("""
       SELECT interaction_summary, created_at, needs_refresh
       FROM interaction_contexts
       WHERE session_id = ? AND (needs_refresh IS NULL OR needs_refresh = 0)
       ORDER BY created_at DESC
       LIMIT 20
   """, (session_id,))
   ```
   ✅ This query would work, but is never executed due to the cache hit

## Cache Invalidation Points

Cache invalidation currently happens only in these scenarios:

1. **Session End**: `end_session()` clears the cache (lines 534-536)
2. **User Change**: User mismatch detection clears the cache (lines 254-255)
3. **Never**: After generating an interaction context ❌

## Expected vs Actual Behavior

### Expected Behavior:
```
Request 1 → Generate context → Store in DB → Clear cache
Request 2 → Cache miss → Query DB → Find stored context → Use it
```

### Actual Behavior:
```
Request 1 → Generate context → Store in DB → Keep cache (stale)
Request 2 → Cache hit → Return stale cache (0 contexts) → Never query DB
```

## Evidence from Logs

```
# First Request - Context Generation
2025-10-31 06:55:55,481 - src.context_manager - INFO - ✓ Generated interaction context for 64d4ace2_15ca4dec_1761890055

# Second Request - Cache Hit (No DB Query)
2025-10-31 07:02:55,911 - src.context_manager - INFO - Context retrieved: 0 interaction contexts
```

**Time Gap**: 7 minutes between requests (well beyond the 5-second orchestrator cache TTL)
**Result**: Still 0 contexts → session cache hit, database never queried

## Impact Assessment

### Functional Impact:
- **HIGH**: Conversation continuity completely broken
- Users cannot reference previous responses
- Each request is treated as isolated, losing all context

### User Experience Impact:
- **HIGH**: Responses generated for the wrong topics
- Frustration when "based on above inputs" is ignored
- Loss of trust in system reliability

### Performance Impact:
- **LOW**: The cache is working (too well - it prevents fresh data retrieval)
- Database queries are being avoided (but should happen after context generation)

## Conclusion

The interaction context system is **architecturally sound** but has a **critical cache invalidation bug**:

1. ✅ Interaction contexts are correctly generated
2. ✅ Interaction contexts are correctly stored in the database
3. ✅ The database retrieval query is correctly implemented
4. ❌ The cache is never invalidated after interaction context generation
5. ❌ The cache hit prevents the database query from executing
6. ❌ Stale cached context (with 0 interaction contexts) is returned

**The fix requires** invalidating both:
- The Context Manager's `session_cache` after `generate_interaction_context()`
- The Orchestrator's `_context_cache` after `generate_interaction_context()`

This will force fresh database queries on subsequent requests, allowing stored interaction contexts to be retrieved and used.

## Files Involved

1. `src/orchestrator_engine.py` - Lines 442-450 (missing cache invalidation)
2. `src/orchestrator_engine.py` - Lines 83-113 (orchestrator cache)
3. `src/context_manager.py` - Lines 235-289 (session cache management)
4. `src/context_manager.py` - Lines 396-451 (interaction context generation)

## Additional Notes

- The cache mechanism itself is working as designed (performance optimization)
- The bug is in the **cache lifecycle management** (invalidation timing)
- Database operations are functioning correctly
- The issue is purely in the caching layer, not the persistence layer

LLM_BASED_TOPIC_EXTRACTION_IMPLEMENTATION.md ADDED
@@ -0,0 +1,268 @@
# LLM-Based Topic Extraction Implementation (Option 2)

## Summary

Successfully implemented Option 2: LLM-based zero-shot classification for topic extraction and continuity analysis, replacing hardcoded pattern matching.

## Changes Implemented

### 1. Topic Cache Infrastructure

**Location**: `src/orchestrator_engine.py` - `__init__()` (lines 34-36)

**Added**:
```python
# Cache for topic extraction to reduce API calls
self._topic_cache = {}
self._topic_cache_max_size = 100  # Limit cache size
```

**Purpose**: Cache topic extraction results to minimize LLM API calls for identical queries.

---

### 2. LLM-Based Topic Extraction

**Location**: `src/orchestrator_engine.py` - `_extract_main_topic()` (lines 1276-1343)

**Changes**:
- **Method signature**: Changed to `async def _extract_main_topic(self, user_input: str, context: dict = None) -> str`
- **Implementation**: Uses LLM zero-shot classification instead of hardcoded keywords
- **Context-aware**: Uses session_context and interaction_contexts from cache when available
- **Caching**: Implements a cache with FIFO eviction (max 100 entries)
- **Fallback**: Falls back to simple word extraction if the LLM is unavailable (a combined sketch of cache, LLM call, and fallback follows this section)

**LLM Prompt**:
```
Classify the main topic of this query in 2-5 words. Be specific and concise.

Query: "{user_input}"
[Session context if available]

Respond with ONLY the topic name (e.g., "Machine Learning", "Healthcare Analytics").
```

**Temperature**: 0.3 (for consistency)

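Putting the pieces of this section together, the method plausibly looks like the sketch below. The `route_inference` signature follows the examples in `PATTERN_BASED_TOPIC_ANALYSIS_REVIEW.md`; everything else is an assumption, not the exact implementation:

```python
import hashlib

# Sketch assembling the behaviors listed above: MD5-keyed cache, FIFO
# eviction, LLM zero-shot classification, and the first-four-words fallback.
async def _extract_main_topic(self, user_input: str, context: dict = None) -> str:
    cache_key = hashlib.md5(user_input.encode()).hexdigest()
    if cache_key in self._topic_cache:
        return self._topic_cache[cache_key]           # cache hit: no LLM call
    try:
        topic = await self.llm_router.route_inference(
            task_type="classification",
            prompt=(
                "Classify the main topic of this query in 2-5 words. "
                f'Be specific and concise.\n\nQuery: "{user_input}"\n\n'
                "Respond with ONLY the topic name."
            ),
            max_tokens=20,
            temperature=0.3,
        )
        topic = (topic or "").strip()
    except Exception:
        topic = ""
    if not topic:                                      # fallback: first 4 words
        topic = " ".join(user_input.split()[:4]) or "General inquiry"
    if len(self._topic_cache) >= self._topic_cache_max_size:
        del self._topic_cache[next(iter(self._topic_cache))]  # FIFO eviction
    self._topic_cache[cache_key] = topic
    return topic
```
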
---

### 3. LLM-Based Topic Continuity Analysis

**Location**: `src/orchestrator_engine.py` - `_analyze_topic_continuity()` (lines 1029-1094)

**Changes**:
- **Method signature**: Changed to `async def _analyze_topic_continuity(self, context: dict, user_input: str) -> str`
- **Implementation**: Uses the LLM to determine whether the query continues the previous topic or introduces a new one
- **Context-aware**: Uses session_context and interaction_contexts from cache
- **Format validation**: Validates the LLM response format ("Continuing X" or "New topic: X")
- **Fallback**: Returns a descriptive message if the LLM is unavailable

**LLM Prompt**:
```
Determine if the current query continues the previous conversation topic or introduces a new topic.

Session Summary: {session_summary}
Recent Interactions: {recent_interactions}

Current Query: "{user_input}"

Respond with EXACTLY one of:
- "Continuing [topic name] discussion" if same topic
- "New topic: [topic name]" if different topic
```

**Temperature**: 0.3 (for consistency)

---

### 4. Keyword Extraction Update

**Location**: `src/orchestrator_engine.py` - `_extract_keywords()` (lines 1345-1361)

**Changes**:
- **Method signature**: Changed to `async def _extract_keywords(self, user_input: str) -> str`
- **Implementation**: Simple regex-based extraction, not LLM-based, for performance (sketched below)
- **Stop word filtering**: Filters common stop words
- **Note**: Can be enhanced with an LLM if needed, but kept simple for performance

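A minimal sketch of this kind of extraction; the stop word set here is a small illustrative subset, not the project's actual list:

```python
import re

# Regex-plus-stop-word keyword extraction matching the description above.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
              "for", "on", "with", "this", "that"}

def extract_keywords(user_input: str) -> str:
    words = re.findall(r"[a-zA-Z]{3,}", user_input.lower())
    keywords = [w for w in words if w not in STOP_WORDS]
    # dict.fromkeys dedupes while preserving first-seen order
    return ", ".join(dict.fromkeys(keywords)) or "General terms"
```
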
---

### 5. Updated All Usage Sites

**Location**: `src/orchestrator_engine.py` - `process_request()` (lines 184-200)

**Changes**:
- **Extract topic once**: `main_topic = await self._extract_main_topic(user_input, context)`
- **Extract continuity**: `topic_continuity = await self._analyze_topic_continuity(context, user_input)`
- **Extract keywords**: `query_keywords = await self._extract_keywords(user_input)`
- **Reuse main_topic**: All 18+ usage sites now use the `main_topic` variable instead of calling the method repeatedly

**Updated Reasoning Chain Steps**:
- Step 1: Uses `main_topic` (line 190)
- Step 2: Uses `main_topic` (lines 251, 259)
- Step 3: Uses `main_topic` (lines 268, 276)
- Step 4: Uses `main_topic` (lines 304, 312)
- Step 5: Uses `main_topic` (lines 384, 392)
- Alternative paths: Uses `main_topic` (lines 403, 1146-1166)

**Error Recovery**: Simplified to avoid async complexity (line 1733)

---

### 6. Alternative Paths Method Update

**Location**: `src/orchestrator_engine.py` - `_generate_alternative_paths()` (lines 1136-1169)

**Changes**:
- **Method signature**: Added a `main_topic` parameter
- **Before**: `def _generate_alternative_paths(self, intent_result: dict, user_input: str) -> list:`
- **After**: `def _generate_alternative_paths(self, intent_result: dict, user_input: str, main_topic: str) -> list:`
- **Updated call site**: Line 403 passes `main_topic` as the third parameter

---

## Performance Characteristics

### Latency Impact

**Per Request**:
- 2 LLM calls per request (topic extraction + continuity analysis)
- Estimated latency: ~200-500ms total (depending on the LLM router)
- Caching reduces repeat calls: a cache hit adds no LLM latency

**Mitigation**:
- Topic extraction cached per unique query (MD5 hash)
- Cache size limited to 100 entries (FIFO eviction)
- Keyword extraction kept simple (no LLM, minimal latency)

### API Costs

**Per Request**:
- Topic extraction: ~50-100 tokens
- Topic continuity: ~100-150 tokens
- Total: ~150-250 tokens per request (first time)
- Cached requests: 0 tokens

**Monthly Estimate** (assuming 1000 unique queries/day):
- First requests: ~150k-250k tokens/day ≈ 4.5M-7.5M tokens/month
- Subsequent requests: cached, 0 tokens
- Actual usage depends on the cache hit rate

---

## Error Handling

### Fallback Mechanisms

1. **Topic Extraction**:
   - If LLM unavailable: falls back to the first 4 words of the query
   - If LLM error: logs the error, returns the fallback
   - Cache miss handling: generates and caches

2. **Topic Continuity**:
   - If LLM unavailable: returns "Topic continuity analysis unavailable"
   - If no context: returns "No previous context"
   - If LLM error: logs the error, returns "Topic continuity analysis failed"

3. **Keywords**:
   - Simple extraction, no LLM dependency
   - Error handling: returns "General terms" on exception

---

## Testing Recommendations

### Unit Tests

1. **Topic Extraction**:
   - Test LLM-based extraction with various queries
   - Test caching behavior (cache hit/miss)
   - Test fallback behavior when the LLM is unavailable
   - Test context-aware extraction

2. **Topic Continuity**:
   - Test continuation detection
   - Test new topic detection
   - Test with empty context
   - Test format validation

3. **Integration Tests**:
   - Test the full request flow with LLM calls
   - Test cache persistence across requests
   - Test error recovery with LLM failures

### Performance Tests

1. **Latency Measurement**:
   - Measure average latency with LLM calls
   - Measure latency with cache hits
   - Compare to the previous pattern-based approach

2. **Cache Effectiveness**:
   - Measure the cache hit rate
   - Test cache eviction behavior

---

## Migration Notes

### Breaking Changes

**None**: All changes are internal to the orchestrator. The external API is unchanged.

### Compatibility

- **LLM Router Required**: The system requires `llm_router` to be available
- **Graceful Degradation**: Falls back to simple extraction if the LLM is unavailable
- **Backward Compatible**: Old pattern-based code removed, but fallbacks maintain functionality

---

## Benefits Realized

✅ **Accurate Topic Classification**: The LLM understands context, synonyms, nuances
✅ **Domain Adaptive**: Works for any domain without code changes
✅ **Context-Aware**: Uses session_context and interaction_contexts
✅ **Human-Readable**: Maintains descriptive reasoning chain strings
✅ **Scalable**: No manual keyword list maintenance
✅ **Cached**: Reduces API calls for repeated queries

---

## Trade-offs

⚠️ **Latency**: Adds ~200-500ms per request (first time, cached after)
⚠️ **API Costs**: ~150-250 tokens per request (first time)
⚠️ **LLM Dependency**: Requires the LLM router to be functional
⚠️ **Complexity**: More code to maintain (async handling, caching, error handling)
⚠️ **Inconsistency Risk**: LLM responses may vary slightly (mitigated by temperature=0.3)

---

## Files Modified

1. `src/orchestrator_engine.py`:
   - Added topic cache infrastructure
   - Rewrote `_extract_main_topic()` to use the LLM
   - Rewrote `_analyze_topic_continuity()` to use the LLM
   - Updated `_extract_keywords()` to async
   - Updated all 18+ usage sites to use the cached `main_topic`
   - Updated the `_generate_alternative_paths()` signature

---

## Next Steps

1. **Monitor Performance**: Track latency and cache hit rates
2. **Tune Caching**: Adjust cache size based on usage patterns
3. **Optional Enhancements**:
   - Consider LLM-based keyword extraction if needed
   - Add topic extraction metrics/logging
   - Implement cache persistence across restarts

---

## Conclusion

Option 2 implementation is complete. The system now uses LLM-based zero-shot classification for topic extraction and continuity analysis, providing accurate, context-aware topic classification without hardcoded patterns. Caching minimizes latency and API costs for repeated queries.

PATTERN_BASED_TOPIC_ANALYSIS_REVIEW.md ADDED
@@ -0,0 +1,374 @@
# Pattern-Based Topic Analysis Review and Options

## Executive Summary

The orchestrator uses hardcoded pattern matching for topic extraction and continuity analysis in three methods:
1. `_analyze_topic_continuity()` - Hardcoded keyword matching (ML, AI, Data Science only)
2. `_extract_main_topic()` - Hardcoded keyword matching (10+ topic categories)
3. `_extract_keywords()` - Hardcoded important-terms list

These methods are used extensively throughout the workflow, affecting reasoning chains, hypothesis generation, and agent execution tracking.

## Current Implementation Analysis

### 1. `_analyze_topic_continuity()` (Lines 1026-1069)

**Current Approach:**
- Pattern matching against 3 hardcoded topics: "machine learning", "artificial intelligence", "data science"
- Checks the session context summary and interaction context summaries for keywords
- Returns: "Continuing {topic} discussion" or "New topic: {topic}"

**Limitations:**
- Only recognizes 3 topics
- Misses domain-specific topics (e.g., healthcare, finance, legal)
- Misses nuanced topics (e.g., "transformer architectures" → classified as "general")
- Brittle: fails on synonyms, typos, or domain-specific terminology
- Not learning-enabled: requires manual updates for new domains (the sketch after this list illustrates the failure mode)

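For illustration, the brittleness described above comes from logic of roughly this shape (a reconstruction, not the actual method):

```python
# Reconstruction of the hardcoded matching under review; the real method
# differs in detail, this only shows the documented failure mode.
HARDCODED_TOPICS = ("machine learning", "artificial intelligence", "data science")

def analyze_topic_continuity_pattern(previous_summary: str, user_input: str) -> str:
    text = user_input.lower()
    for topic in HARDCODED_TOPICS:
        if topic in text and topic in previous_summary.lower():
            return f"Continuing {topic} discussion"
    # "transformer architectures", "healthcare analytics", synonyms, and
    # typos all fall through to the generic bucket:
    return "New topic: general"
```
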
**Usage:**
- Reasoning chain step_1 evidence (line 187)
- Used once per request for context analysis

### 2. `_extract_main_topic()` (Lines 1251-1279)

**Current Approach:**
- Pattern matching against 10+ topic categories:
  - AI chatbot course curriculum
  - Programming course curriculum
  - Educational course design
  - Machine learning concepts
  - Artificial intelligence and chatbots
  - Data science and analysis
  - Software development and programming
  - General inquiry (fallback)

**Limitations:**
- Hardcoded keyword lists
- Hierarchical but limited (e.g., curriculum → AI vs Programming)
- Fallback to the first 4 words if no match
- Same brittleness as topic continuity

**Usage:**
- **Extensively used (18 times):**
  - Reasoning chain step_1 hypothesis (line 182)
  - Reasoning chain step_1 reasoning (line 191)
  - Reasoning chain step_2 reasoning (skills) (line 238)
  - Reasoning chain step_3 hypothesis (line 243)
  - Reasoning chain step_3 reasoning (line 251)
  - Reasoning chain step_4 hypothesis (line 260)
  - Reasoning chain step_4 reasoning (line 268)
  - Reasoning chain step_5 hypothesis (line 296)
  - Reasoning chain step_5 reasoning (line 304)
  - Reasoning chain step_6 hypothesis (line 376)
  - Reasoning chain step_6 reasoning (line 384)
  - Alternative reasoning paths (line 1110)
  - Error recovery (line 1665)

### 3. `_extract_keywords()` (Lines 1281-1295)

**Current Approach:**
- Extracts keywords from a hardcoded important-terms list
- Returns a comma-separated string of matched keywords

**Limitations:**
- Static list requires manual updates
- May miss domain-specific terminology

**Usage:**
- Reasoning chain step_1 evidence (line 188)
- Used once per request

## Current Workflow Impact

### Pattern Matching Usage Flow:

```
Request → Context Retrieval
  ↓
Reasoning Chain Step 1:
  - Hypothesis: Uses _extract_main_topic() → "User is asking about: '{topic}'"
  - Evidence: Uses _analyze_topic_continuity() → "Topic continuity: ..."
  - Evidence: Uses _extract_keywords() → "Query keywords: ..."
  - Reasoning: Uses _extract_main_topic() → "...focused on {topic}..."
  ↓
Intent Recognition (Agent executes independently)
  ↓
Reasoning Chain Step 2-6:
  - All hypothesis/reasoning strings use _extract_main_topic()
  - Topic appears in 12+ reasoning chain fields
  ↓
Alternative Reasoning Paths:
  - Uses _extract_main_topic() for path generation
  ↓
Error Recovery:
  - Uses _extract_main_topic() for error context
```

### Impact Points:

1. **Reasoning Chain Documentation**: All reasoning chain steps include topic strings
2. **Agent Execution Tracking**: Topic appears in hypothesis and reasoning fields
3. **Error Recovery**: Uses the topic for context in error scenarios
4. **Logging/Debugging**: Topic strings appear in logs and execution traces

**Important Note:** Pattern matching does NOT affect agent execution logic. Agents (Intent, Skills, Synthesis, Safety) execute independently using LLM inference. Pattern matching only affects:
- Reasoning chain metadata (for debugging/analysis)
- Logging messages
- Hypothesis/reasoning strings in execution traces

## Options for Resolution

### Option 1: Remove Pattern Matching, Make Context Independent

**Approach:**
- Remove `_analyze_topic_continuity()`, `_extract_main_topic()`, `_extract_keywords()`
- Replace with generic placeholders or remove from reasoning chains
- Use actual context data (session_context, interaction_contexts, user_context) directly

**Implementation Changes:**

1. **Replace topic extraction with context-based strings:**

```python
# Before:
hypothesis = f"User is asking about: '{self._extract_main_topic(user_input)}'"

# After:
hypothesis = f"User query analyzed with {len(interaction_contexts)} previous contexts"
```

2. **Replace topic continuity with context-based analysis:**

```python
# Before:
f"Topic continuity: {self._analyze_topic_continuity(context, user_input)}"

# After:
f"Session context available: {bool(session_context)}"
f"Interaction contexts: {len(interaction_contexts)}"
```

3. **Replace keywords with a user input excerpt:**

```python
# Before:
f"Query keywords: {self._extract_keywords(user_input)}"

# After:
f"Query: {user_input[:100]}..."
```

**Impact Analysis:**

✅ **Benefits:**
- **No hardcoded patterns**: Context independent of pattern learning
- **Simpler code**: Removes 100+ lines of pattern matching logic
- **More accurate**: Uses actual context data instead of brittle keyword matching
- **Domain agnostic**: Works for any topic/domain without updates
- **Maintainability**: No need to update keyword lists for new domains
- **Performance**: No pattern matching overhead (minimal, but measurable)

❌ **Drawbacks:**
- **Less descriptive reasoning chains**: Hypothesis strings are less specific (e.g., "User query analyzed" vs "User is asking about: Machine learning concepts")
- **Reduced human readability**: Reasoning chain traces are less informative for debugging
- **Lost topic continuity insight**: No explicit "continuing topic X" vs "new topic Y" distinction

**Workflow Impact:**
- **No impact on agent execution**: Agents already use LLM inference, not pattern matching
- **Reasoning chains less informative**: But still functional for debugging
- **Logging less specific**: But still captures context availability
- **No breaking changes**: All downstream components work with generic strings

**Files Modified:**
- `src/orchestrator_engine.py`: Remove 3 methods, update 18+ usage sites

**Estimated Effort:** Low (1-2 hours)
**Risk Level:** Low (only affects metadata, not logic)

---

### Option 2: Use LLM API for Zero-Shot Classification

**Approach:**
- Replace pattern matching with LLM-based zero-shot topic classification
- Use the LLM router to classify topics dynamically
- Cache results to minimize API calls

**Implementation Changes:**

1. **Create LLM-based topic extraction:**

```python
async def _extract_main_topic_llm(self, user_input: str, context: dict) -> str:
    """Extract topic using LLM zero-shot classification"""
    prompt = f"""Classify the main topic of this query in 2-5 words:

Query: "{user_input}"

Available context:
- Session summary: {context.get('session_context', {}).get('summary', 'N/A')[:200]}
- Recent interactions: {len(context.get('interaction_contexts', []))}

Respond with ONLY the topic name (e.g., "Machine Learning", "Healthcare Analytics", "Financial Modeling")."""

    topic = await self.llm_router.route_inference(
        task_type="classification",
        prompt=prompt,
        max_tokens=20,
        temperature=0.3
    )
    return topic.strip() if topic else "General inquiry"
```

2. **Create LLM-based topic continuity:**

```python
async def _analyze_topic_continuity_llm(self, context: dict, user_input: str) -> str:
    """Analyze topic continuity using LLM"""
    session_context = context.get('session_context', {}).get('summary', '')
    recent_interactions = context.get('interaction_contexts', [])[:3]

    prompt = f"""Determine if the current query continues the previous conversation topic or introduces a new topic.

Session Summary: {session_context[:300]}
Recent Interactions:
{chr(10).join([ic.get('summary', '') for ic in recent_interactions])}

Current Query: "{user_input}"

Respond with one of:
- "Continuing [topic] discussion" if same topic
- "New topic: [topic]" if different topic

Keep response under 50 words."""

    continuity = await self.llm_router.route_inference(
        task_type="general_reasoning",
        prompt=prompt,
        max_tokens=50,
        temperature=0.5
    )
    return continuity.strip() if continuity else "No previous context"
```

3. **Update method signatures to async:**
- `_extract_main_topic()` → `async def _extract_main_topic_llm()`
- `_analyze_topic_continuity()` → `async def _analyze_topic_continuity_llm()`
- `_extract_keywords()` → Keep pattern-based or remove (keywords less critical)

4. **Add caching:**

```python
import hashlib

# Cache topic extraction per user_input (hash)
_topic_cache: dict = {}

async def _extract_main_topic_cached(self, user_input: str, context: dict) -> str:
    input_hash = hashlib.md5(user_input.encode()).hexdigest()
    if input_hash in _topic_cache:
        return _topic_cache[input_hash]
    topic = await self._extract_main_topic_llm(user_input, context)
    _topic_cache[input_hash] = topic
    return topic
```
265
+
266
+ **Impact Analysis:**
267
+
268
+ ✅ **Benefits:**
269
+ - **Accurate topic classification**: LLM understands context, synonyms, nuances
270
+ - **Domain adaptive**: Works for any domain without code changes
271
+ - **Context-aware**: Uses session_context and interaction_contexts for continuity
272
+ - **Human-readable**: Maintains descriptive reasoning chain strings
273
+ - **Scalable**: No manual keyword list maintenance
274
+
275
+ ❌ **Drawbacks:**
276
+ - **API latency**: Adds 2-3 LLM calls per request (~200-500ms total)
277
+ - **API costs**: Additional tokens consumed per request
278
+ - **Dependency on LLM availability**: Requires LLM router to be functional
279
+ - **Complexity**: More code to maintain (async handling, caching, error handling)
280
+ - **Inconsistency risk**: LLM responses may vary slightly between calls (though temperature=0.3 mitigates)
281
+
282
+ **Workflow Impact:**
283
+
284
+ **Positive:**
285
+ - **More accurate reasoning chains**: Topic classification more reliable
286
+ - **Better debugging**: More informative hypothesis/reasoning strings
287
+ - **Context-aware continuity**: Uses actual session/interaction contexts
288
+
289
+ **Negative:**
290
+ - **Latency increase**: +200-500ms per request (2-3 LLM calls)
291
+ - **Error handling complexity**: Need fallbacks if LLM calls fail
292
+ - **Async complexity**: All 18+ usage sites need await statements
293
+
294
+ **Implementation Complexity:**
295
+ - **Method conversion**: 3 methods → async LLM calls
296
+ - **Usage site updates**: 18+ sites need await/async propagation
297
+ - **Caching infrastructure**: Add cache layer to reduce API calls
298
+ - **Error handling**: Fallbacks if LLM unavailable
299
+ - **Testing**: Verify LLM responses are reasonable
300
+
301
+ **Files Modified:**
302
+ - `src/orchestrator_engine.py`: Rewrite 3 methods, update 18+ usage sites with async/await
303
+ - May need `process_request()` refactoring for async topic extraction
304
+
305
+ **Estimated Effort:** Medium-High (4-6 hours)
306
+ **Risk Level:** Medium (adds latency and LLM dependency)
307
+
308
+ ---
309
+
310
+ ## Recommendation
311
+
312
+ ### Recommended: **Option 1 (Remove Pattern Matching)**
313
+
314
+ **Rationale:**
315
+ 1. **No impact on core functionality**: Pattern matching only affects metadata strings, not agent execution
316
+ 2. **Simpler implementation**: Low risk, fast to implement
317
+ 3. **No performance penalty**: Removes overhead instead of adding LLM calls
318
+ 4. **Maintainability**: Less code to maintain
319
+ 5. **Context independence**: Aligns with user requirement for pattern-independent context
320
+
321
+ **If descriptive reasoning chains are critical:**
322
+ - **Hybrid Approach**: Use Option 1 for production, but add optional LLM-based topic extraction as a debug/logging enhancement (non-blocking, optional)
323
+
324
+ ### Alternative: **Option 2 (LLM Classification) if reasoning chain quality is critical**
325
+
326
+ **Use Case:**
327
+ - If reasoning chain metadata is used for:
328
+ - User-facing explanations
329
+ - Advanced debugging/analysis tools
330
+ - External integrations requiring topic metadata
331
+ - Then the latency/API cost may be justified
332
+
333
+ ## Migration Path
334
+
335
+ ### Option 1 Implementation Steps:
336
+
337
+ 1. **Remove methods:**
+    - Delete `_analyze_topic_continuity()` (lines 1026-1069)
+    - Delete `_extract_main_topic()` (lines 1251-1279)
+    - Delete `_extract_keywords()` (lines 1281-1295)
+
+ 2. **Replace usages** (see the sketch after this list):
+    - Line 182: `hypothesis` → Use generic: "User query analysis"
+    - Line 187: `Topic continuity` → Use: "Session context available: {bool(session_context)}"
+    - Line 188: `Query keywords` → Use: "Query: {user_input[:100]}"
+    - Line 191: `reasoning` → Remove topic references
+    - Lines 238, 243, 251, 260, 268, 296, 304, 376, 384: Remove topic from reasoning strings
+    - Line 1110: Remove topic from alternative paths
+    - Line 1665: Use generic error context
+
+ 3. **Test:**
+    - Verify reasoning chains still populate correctly
+    - Verify no syntax errors
+    - Verify agents execute normally
355
+
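+ In practice, step 2 reduces step_1 of the reasoning chain to topic-free strings; a sketch using the generic values listed above:
+
+ ```python
+ # Sketch of the topic-free reasoning step (strings follow the usages above)
+ reasoning_chain["chain_of_thought"]["step_1"] = {
+     "hypothesis": "User query analysis",
+     "evidence": [
+         f"Previous interaction contexts: {interaction_contexts_count}",
+         f"Session context available: {bool(session_context)}",
+         f"Query: {user_input[:100]}",
+     ],
+     "confidence": 0.85,
+     "reasoning": "Context analysis from session state; no topic extraction",
+ }
+ ```
+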
356
+ ### Option 2 Implementation Steps:
357
+
358
+ 1. **Create async LLM methods** (as shown above)
359
+ 2. **Add caching layer**
360
+ 3. **Update `process_request()` to await topic extraction before reasoning chain**
361
+ 4. **Add error handling with fallbacks**
362
+ 5. **Test latency impact**
363
+ 6. **Monitor API usage**
364
+
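+ A minimal sketch of steps 1-4 combined, assuming the `llm_router.route_inference(...)` interface used elsewhere in this codebase and a `self._topic_cache` dict initialized in `__init__`; the fallback mirrors the existing first-few-words default:
+
+ ```python
+ import hashlib
+
+ async def _extract_main_topic(self, user_input: str, context: dict) -> str:
+     # Step 2: cache lookup avoids repeated LLM calls for identical inputs
+     key = hashlib.md5(user_input.encode()).hexdigest()
+     if key in self._topic_cache:
+         return self._topic_cache[key]
+     topic = ""
+     try:
+         # Step 1: zero-shot topic classification via the LLM router
+         result = await self.llm_router.route_inference(
+             task_type="general_reasoning",
+             prompt=f'Name the main topic of this query in 2-5 words: "{user_input}"',
+             max_tokens=20,
+             temperature=0.3
+         )
+         if isinstance(result, str):
+             topic = result.strip()
+     except Exception:
+         pass  # Step 4: fall through to the non-LLM fallback
+     if not topic:
+         topic = " ".join(user_input.split()[:4]) or "General inquiry"
+     self._topic_cache[key] = topic
+     return topic
+ ```
+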
365
+ ## Conclusion
366
+
367
+ **Option 1** is recommended for immediate implementation due to:
368
+ - Low risk and complexity
369
+ - No performance penalty
370
+ - Aligns with context independence requirement
371
+ - Pattern matching doesn't affect agent execution
372
+
373
+ **Option 2** should be considered only if reasoning chain metadata quality is critical for user-facing features or advanced debugging.
374
+
SESSION_CONTEXT_EVERY_TURN_IMPLEMENTATION.md ADDED
@@ -0,0 +1,236 @@
1
+ # Session Context Every Turn - Implementation Summary
2
+
3
+ ## Overview
4
+
5
+ Modified the system to generate and use session context at every conversation turn, following the same pattern as interaction contexts. Session context is now available from cache for all agents and components.
6
+
7
+ ## Changes Implemented
8
+
9
+ ### 1. Session Context Generation at Every Turn
10
+
11
+ **Location**: `src/orchestrator_engine.py` (lines 451-457)
12
+
13
+ **Change**: Added session context generation after interaction context generation, at every turn.
14
+
15
+ ```python
16
+ # STEP 3: Generate Session Context after each response (100 tokens)
17
+ # Uses cached interaction contexts, updates database and cache
18
+ try:
19
+ await self.context_manager.generate_session_context(session_id, user_id)
20
+ # Cache is automatically updated by generate_session_context()
21
+ except Exception as e:
22
+ logger.error(f"Error generating session context: {e}", exc_info=True)
23
+ ```
24
+
25
+ **Impact**: Session context is now generated and available for every turn, not just at session end.
26
+
27
+ ### 2. Session Context Uses Cache (Not Database)
28
+
29
+ **Location**: `src/context_manager.py` - `generate_session_context()` (lines 463-542)
30
+
31
+ **Change**: Modified to use cached interaction contexts instead of querying database.
32
+
33
+ **Before**: Queried database for all interaction contexts
34
+ ```python
35
+ cursor.execute("SELECT interaction_summary FROM interaction_contexts WHERE session_id = ?")
36
+ ```
37
+
38
+ **After**: Uses cached interaction contexts
39
+ ```python
40
+ # Get interaction contexts from cache (no database query)
+ session_cache_key = f"session_{session_id}"
+ cached_context = self.session_cache.get(session_cache_key)
+ if not cached_context:
+     return ""  # no cached context yet (guard as in the full diff below)
+ interaction_contexts = cached_context.get('interaction_contexts', [])
44
+ ```
45
+
46
+ **Impact**: Faster session context generation, no database queries during conversation.
47
+
48
+ ### 3. Session Context Cache Update
49
+
50
+ **Location**: `src/context_manager.py` - `_update_cache_with_session_context()` (lines 826-859)
51
+
52
+ **Change**: Added method to update cache immediately after database write, same pattern as interaction contexts.
53
+
54
+ ```python
55
+ def _update_cache_with_session_context(self, session_id: str, session_summary: str, created_at: str):
56
+ """Update cache with new session context immediately after database update"""
57
+ cached_context['session_context'] = {
58
+ "summary": session_summary,
59
+ "timestamp": created_at
60
+ }
61
+ self.session_cache[session_cache_key] = cached_context
62
+ ```
63
+
64
+ **Impact**: Cache stays synchronized with database, no queries needed for subsequent requests.
65
+
66
+ ### 4. Session Context Included in Context Structure
67
+
68
+ **Location**: `src/context_manager.py` - `_optimize_context()` (lines 570-604)
69
+
70
+ **Change**: Added session context to optimized context structure and `combined_context` string.
71
+
72
+ **Before**:
73
+ ```python
74
+ combined_context = "[User Context]...\n[Interaction Contexts]..."
75
+ ```
76
+
77
+ **After**:
78
+ ```python
79
+ combined_context = "[Session Context]...\n[User Context]...\n[Interaction Contexts]..."
80
+ ```
81
+
82
+ **Impact**: Session context now available in `combined_context` for all agents.
83
+
84
+ ### 5. Session Context Retrieved from Database on Cache Miss
85
+
86
+ **Location**: `src/context_manager.py` - `_retrieve_from_db()` (lines 707-725)
87
+
88
+ **Change**: Added query to retrieve session context when loading from database (cache miss only).
89
+
90
+ ```python
91
+ # Get session context from database
92
+ cursor.execute("""
93
+ SELECT session_summary, created_at
94
+ FROM session_contexts
95
+ WHERE session_id = ?
96
+ ORDER BY created_at DESC
97
+ LIMIT 1
98
+ """, (session_id,))
99
+ ```
100
+
101
+ **Impact**: Session context available even on cache miss (first request or cache expiration).
102
+
103
+ ### 6. All Agents Updated to Use Session Context
104
+
105
+ #### Intent Agent (`src/agents/intent_agent.py`)
106
+ - **Lines 115-122**: Added session context extraction from cache
107
+ - **Usage**: Includes session context in intent recognition prompt
108
+
109
+ #### Skills Identification Agent (`src/agents/skills_identification_agent.py`)
110
+ - **Lines 230-236**: Added session context extraction from cache
111
+ - **Usage**: Includes session context in market analysis prompt
112
+
113
+ #### Safety Agent (`src/agents/safety_agent.py`)
114
+ - **Lines 158-164**: Added session context extraction from cache
115
+ - **Usage**: Includes session context in safety analysis prompt
116
+
117
+ #### Synthesis Agent (`src/agents/synthesis_agent.py`)
118
+ - **Lines 552-561**: Added session context extraction from cache
119
+ - **Usage**: Includes session context in response synthesis prompt
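+
+ The extraction pattern is the same in every agent; a condensed sketch of it (variable names as in the intent-agent diff below):
+
+ ```python
+ # Shared pattern: read session context from the cached context dict
+ session_context = context.get('session_context', {})
+ session_summary = session_context.get('summary', '') if isinstance(session_context, dict) else ""
+ if session_summary:
+     context_parts.append(f"Session Context: {session_summary[:300]}...")
+ ```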
120
+
121
+ ### 7. Orchestrator Context Summary Updated
122
+
123
+ **Location**: `src/orchestrator_engine.py` - `_build_context_summary()` (lines 651-673)
124
+
125
+ **Change**: Added session context to context summary used for task execution.
126
+
127
+ ```python
128
+ # Extract session context (from cache)
129
+ session_context = context.get('session_context', {})
130
+ session_summary = session_context.get('summary', '') if isinstance(session_context, dict) else ""
131
+ if session_summary:
132
+ summary_parts.append(f"Session summary: {session_summary[:150]}")
133
+ ```
134
+
135
+ **Impact**: Task execution prompts now include session context.
136
+
137
+ ## Context Structure
138
+
139
+ Context now includes session context:
140
+
141
+ ```python
142
+ context = {
143
+ "session_id": str,
144
+ "user_id": str,
145
+ "user_context": str, # 500-token user persona summary (from cache)
146
+ "session_context": { # 100-token session summary (from cache)
147
+ "summary": str,
148
+ "timestamp": str
149
+ },
150
+ "interaction_contexts": [ # List of interaction summaries (from cache)
151
+ {
152
+ "summary": str, # 50-token interaction summary
153
+ "timestamp": str
154
+ },
155
+ ...
156
+ ],
157
+ "combined_context": str, # Pre-formatted: "[Session Context]\n[User Context]\n[Interaction Contexts]"
158
+ "preferences": dict,
159
+ "active_tasks": list,
160
+ "last_activity": str
161
+ }
162
+ ```
163
+
164
+ ## Flow Diagram
165
+
166
+ ### Every Conversation Turn:
167
+
168
+ ```
169
+ 1. User Request
170
+
171
+ 2. Context Retrieved (from cache)
172
+ - Session Context: ✅ Available from cache
173
+ - User Context: ✅ Available from cache
174
+ - Interaction Contexts: ✅ Available from cache
175
+
176
+ 3. Agents Execute
177
+ - Intent Agent: Uses session context from cache ✅
178
+ - Skills Agent: Uses session context from cache ✅
179
+ - Synthesis Agent: Uses session context from cache ✅
180
+ - Safety Agent: Uses session context from cache ✅
181
+
182
+ 4. Response Generated
183
+
184
+ 5. Interaction Context Generated
185
+ - Store in DB
186
+ - Update cache immediately ✅
187
+
188
+ 6. Session Context Generated (NEW - every turn)
189
+ - Uses cached interaction contexts (no DB query) ✅
190
+ - Store in DB
191
+ - Update cache immediately ✅
192
+
193
+ 7. Next Request: All contexts available from cache ✅
194
+ ```
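+
+ A hypothetical two-turn smoke test of this flow; `process_request()` exists in the orchestrator, but its exact signature is assumed here, so adjust to the real one:
+
+ ```python
+ async def smoke_test(orchestrator, cm, sid="s1", uid="u1"):
+     # Two turns: each should generate an interaction context and refresh
+     # the session context in both the database and the in-memory cache
+     await orchestrator.process_request("Tell me about Python", sid, uid)  # assumed signature
+     await orchestrator.process_request("And its typing system?", sid, uid)
+     cached = cm.session_cache.get(f"session_{sid}") or {}
+     assert cached.get("interaction_contexts"), "interaction contexts should be cached"
+     assert (cached.get("session_context") or {}).get("summary"), "session summary should be cached"
+ ```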
195
+
196
+ ## Benefits
197
+
198
+ 1. **Session Context Available Every Turn**: Agents can use session summary for better context awareness
199
+ 2. **No Database Queries**: Session context generation uses cached interaction contexts
200
+ 3. **Cache Synchronization**: Cache updated immediately when database is updated
201
+ 4. **Better Context Awareness**: Agents have access to session-level summary in addition to interaction-level summaries
202
+ 5. **Consistent Pattern**: Session context follows same pattern as interaction context (generate → store → cache)
203
+
204
+ ## Files Modified
205
+
206
+ 1. `src/orchestrator_engine.py`:
207
+ - Added session context generation at every turn (lines 451-457)
208
+ - Updated `_build_context_summary()` to include session context (lines 655-659)
209
+
210
+ 2. `src/context_manager.py`:
211
+ - Modified `generate_session_context()` to use cache (lines 470-485)
212
+ - Added `_update_cache_with_session_context()` method (lines 826-859)
213
+ - Updated `_optimize_context()` to include session context (lines 577-592)
214
+ - Updated `_retrieve_from_db()` to load session context (lines 707-725)
215
+ - Updated all return statements to include session_context field
216
+
217
+ 3. `src/agents/intent_agent.py`:
218
+ - Added session context extraction and usage (lines 115-122)
219
+
220
+ 4. `src/agents/skills_identification_agent.py`:
221
+ - Added session context extraction and usage (lines 230-236)
222
+
223
+ 5. `src/agents/safety_agent.py`:
224
+ - Added session context extraction and usage (lines 158-164)
225
+
226
+ 6. `src/agents/synthesis_agent.py`:
227
+ - Added session context extraction and usage (lines 552-561)
228
+
229
+ ## Verification
230
+
231
+ All agents now:
232
+ - ✅ Receive context from orchestrator (cache-only)
233
+ - ✅ Have access to session_context from cache
234
+ - ✅ Include session context in their prompts/tasks
235
+ - ✅ Never query database directly (all context from cache)
236
+
src/agents/intent_agent.py CHANGED
@@ -102,19 +102,24 @@ class IntentRecognitionAgent:
102
  """Build Chain of Thought prompt for intent recognition"""
103
 
104
  # Extract context information from Context Manager structure
 
105
  context_info = ""
106
  if context:
107
- # Use combined_context if available (pre-formatted by Context Manager)
108
  combined_context = context.get('combined_context', '')
109
  if combined_context:
110
- # Use the pre-formatted context from Context Manager
111
  context_info = f"\n\nAvailable Context:\n{combined_context[:1000]}..." # Truncate if too long
112
  else:
113
- # Fallback: Build from interaction_contexts if combined_context not available
 
 
114
  interaction_contexts = context.get('interaction_contexts', [])
115
  user_context = context.get('user_context', '')
116
 
117
  context_parts = []
 
 
118
  if user_context:
119
  context_parts.append(f"User Context: {user_context[:300]}...")
120
 
 
102
  """Build Chain of Thought prompt for intent recognition"""
103
 
104
  # Extract context information from Context Manager structure
105
+ # Session context, user context, and interaction contexts are all from cache
106
  context_info = ""
107
  if context:
108
+ # Use combined_context if available (pre-formatted by Context Manager, includes session context)
109
  combined_context = context.get('combined_context', '')
110
  if combined_context:
111
+ # Use the pre-formatted context from Context Manager (includes session context)
112
  context_info = f"\n\nAvailable Context:\n{combined_context[:1000]}..." # Truncate if too long
113
  else:
114
+ # Fallback: Build from session_context, user_context, and interaction_contexts (all from cache)
115
+ session_context = context.get('session_context', {})
116
+ session_summary = session_context.get('summary', '') if isinstance(session_context, dict) else ""
117
  interaction_contexts = context.get('interaction_contexts', [])
118
  user_context = context.get('user_context', '')
119
 
120
  context_parts = []
121
+ if session_summary:
122
+ context_parts.append(f"Session Context: {session_summary[:300]}...")
123
  if user_context:
124
  context_parts.append(f"User Context: {user_context[:300]}...")
125
 
src/agents/safety_agent.py CHANGED
@@ -154,13 +154,18 @@ class SafetyCheckAgent:
154
  # Extract relevant context information for safety analysis
155
  context_info = ""
156
  if context:
157
- # Get user context if available (might indicate user's background/preferences)
 
 
158
  user_context = context.get('user_context', '')
 
 
 
 
159
  if user_context:
160
- context_info = f"\n\nUser Context (for safety context): {user_context[:200]}..."
161
 
162
  # Optionally include recent interaction context to understand conversation flow
163
- interaction_contexts = context.get('interaction_contexts', [])
164
  if interaction_contexts:
165
  recent_context = interaction_contexts[-1].get('summary', '') if interaction_contexts else ''
166
  if recent_context:
 
154
  # Extract relevant context information for safety analysis
155
  context_info = ""
156
  if context:
157
+ # Get session context, user context, and interaction contexts (all from cache)
158
+ session_context = context.get('session_context', {})
159
+ session_summary = session_context.get('summary', '') if isinstance(session_context, dict) else ""
160
  user_context = context.get('user_context', '')
161
+ interaction_contexts = context.get('interaction_contexts', [])
162
+
163
+ if session_summary:
164
+ context_info = f"\n\nSession Context (for safety context): {session_summary[:200]}..."
165
  if user_context:
166
+ context_info += f"\n\nUser Context (for safety context): {user_context[:200]}..."
167
 
168
  # Optionally include recent interaction context to understand conversation flow
 
169
  if interaction_contexts:
170
  recent_context = interaction_contexts[-1].get('summary', '') if interaction_contexts else ''
171
  if recent_context:
src/agents/skills_identification_agent.py CHANGED
@@ -224,14 +224,18 @@ class SkillsIdentificationAgent:
224
  for category, data in self.market_categories.items()
225
  ])
226
 
227
- # Add context information if available
228
  context_info = ""
229
  if context:
 
 
230
  user_context = context.get('user_context', '')
231
  interaction_contexts = context.get('interaction_contexts', [])
232
 
 
 
233
  if user_context:
234
- context_info = f"\n\nUser Context (persona summary): {user_context[:300]}..."
235
 
236
  if interaction_contexts:
237
  # Include recent interaction context to understand topic continuity
 
224
  for category, data in self.market_categories.items()
225
  ])
226
 
227
+ # Add context information if available (all from cache)
228
  context_info = ""
229
  if context:
230
+ session_context = context.get('session_context', {})
231
+ session_summary = session_context.get('summary', '') if isinstance(session_context, dict) else ""
232
  user_context = context.get('user_context', '')
233
  interaction_contexts = context.get('interaction_contexts', [])
234
 
235
+ if session_summary:
236
+ context_info = f"\n\nSession Context (session summary): {session_summary[:300]}..."
237
  if user_context:
238
+ context_info += f"\n\nUser Context (persona summary): {user_context[:300]}..."
239
 
240
  if interaction_contexts:
241
  # Include recent interaction context to understand topic continuity
src/agents/synthesis_agent.py CHANGED
@@ -540,18 +540,25 @@ Response:"""
540
  return ""
541
 
542
  # Prefer combined_context if available (pre-formatted by Context Manager)
 
543
  combined_context = context.get('combined_context', '')
544
  if combined_context:
545
  # Use the pre-formatted context from Context Manager
546
- # It already includes User Context and Interaction Contexts formatted
547
  return f"\n\nConversation Context:\n{combined_context}"
548
 
549
  # Fallback: Build from individual components if combined_context not available
 
 
 
550
  interaction_contexts = context.get('interaction_contexts', [])
551
  user_context = context.get('user_context', '')
552
 
553
  context_section = ""
554
 
 
 
 
555
  # Add user context if available
556
  if user_context:
557
  context_section += f"\n\nUser Context (Persona Summary):\n{user_context[:500]}...\n"
 
540
  return ""
541
 
542
  # Prefer combined_context if available (pre-formatted by Context Manager)
543
+ # combined_context includes Session Context, User Context, and Interaction Contexts
544
  combined_context = context.get('combined_context', '')
545
  if combined_context:
546
  # Use the pre-formatted context from Context Manager
547
+ # It already includes Session Context, User Context, and Interaction Contexts formatted
548
  return f"\n\nConversation Context:\n{combined_context}"
549
 
550
  # Fallback: Build from individual components if combined_context not available
551
+ # All components are from cache
552
+ session_context = context.get('session_context', {})
553
+ session_summary = session_context.get('summary', '') if isinstance(session_context, dict) else ""
554
  interaction_contexts = context.get('interaction_contexts', [])
555
  user_context = context.get('user_context', '')
556
 
557
  context_section = ""
558
 
559
+ # Add session context if available (from cache)
560
+ if session_summary:
561
+ context_section += f"\n\nSession Context (Session Summary):\n{session_summary[:500]}...\n"
562
  # Add user context if available
563
  if user_context:
564
  context_section += f"\n\nUser Context (Persona Summary):\n{user_context[:500]}...\n"
src/context_manager.py CHANGED
@@ -264,7 +264,8 @@ class EfficientContextManager:
264
  # Cache session context (cache invalidation for user changes is handled in _retrieve_from_db)
265
  self._warm_memory_cache(session_cache_key, session_context)
266
 
267
- # Handle user context separately to prevent loops
 
268
  if not user_context or not user_context.get("user_context_loaded"):
269
  user_context_data = await self.get_user_context(user_id)
270
  user_context = {
@@ -272,8 +273,12 @@ class EfficientContextManager:
272
  "user_context_loaded": True,
273
  "user_id": user_id
274
  }
275
- # Cache user context separately
276
  self._warm_memory_cache(user_cache_key, user_context)
 
 
 
 
277
 
278
  # Merge contexts without duplication
279
  merged_context = {
@@ -423,6 +428,7 @@ Provide a brief summary capturing the key exchange."""
423
  # Store in database
424
  conn = sqlite3.connect(self.db_path)
425
  cursor = conn.cursor()
 
426
  cursor.execute("""
427
  INSERT OR REPLACE INTO interaction_contexts
428
  (interaction_id, session_id, user_input, system_response, interaction_summary, created_at)
@@ -433,12 +439,16 @@ Provide a brief summary capturing the key exchange."""
433
  user_input[:500],
434
  system_response[:1000],
435
  summary.strip(),
436
- datetime.now().isoformat()
437
  ))
438
  conn.commit()
439
  conn.close()
440
 
441
- logger.info(f"✓ Generated interaction context for {interaction_id}")
 
 
 
 
442
  return summary.strip()
443
  except Exception as e:
444
  logger.error(f"Error generating interaction context: {e}", exc_info=True)
@@ -452,25 +462,30 @@ Provide a brief summary capturing the key exchange."""
452
 
453
  async def generate_session_context(self, session_id: str, user_id: str = "Test_Any") -> str:
454
  """
455
- FINAL STEP: Generate Session Context (100-token summary)
456
- Called at session end
 
457
  """
458
  try:
459
- conn = sqlite3.connect(self.db_path)
460
- cursor = conn.cursor()
 
461
 
462
- # Get all interaction contexts for this session
463
- cursor.execute("""
464
- SELECT interaction_summary FROM interaction_contexts
465
- WHERE session_id = ?
466
- ORDER BY created_at ASC
467
- """, (session_id,))
468
 
469
- interaction_summaries = [row[0] for row in cursor.fetchall() if row[0]]
470
- conn.close()
 
 
 
 
 
 
471
 
472
  if not interaction_summaries:
473
- logger.info(f"No interactions to summarize for session {session_id}")
474
  return ""
475
 
476
  # Generate session summary using LLM (100 tokens)
@@ -499,17 +514,22 @@ Keep the summary concise (approximately 100 tokens)."""
499
 
500
  if session_summary and isinstance(session_summary, str) and session_summary.strip():
501
  # Store in database
 
502
  conn = sqlite3.connect(self.db_path)
503
  cursor = conn.cursor()
504
  cursor.execute("""
505
  INSERT OR REPLACE INTO session_contexts
506
  (session_id, user_id, session_summary, created_at)
507
  VALUES (?, ?, ?, ?)
508
- """, (session_id, user_id, session_summary.strip(), datetime.now().isoformat()))
509
  conn.commit()
510
  conn.close()
511
 
512
- logger.info(f"✓ Generated session context for {session_id}")
 
 
 
 
513
  return session_summary.strip()
514
  except Exception as e:
515
  logger.error(f"Error generating session context: {e}", exc_info=True)
@@ -523,12 +543,11 @@ Keep the summary concise (approximately 100 tokens)."""
523
 
524
  async def end_session(self, session_id: str, user_id: str = "Test_Any"):
525
  """
526
- FINAL STEP: Generate Session Context and clear cache
 
527
  """
528
  try:
529
- # Generate session context
530
- await self.generate_session_context(session_id, user_id)
531
-
532
  # Clear in-memory cache for this session (session-only key)
533
  session_cache_key = f"session_{session_id}"
534
  if session_cache_key in self.session_cache:
@@ -550,18 +569,22 @@ Keep the summary concise (approximately 100 tokens)."""
550
  def _optimize_context(self, context: dict) -> dict:
551
  """
552
  Optimize context for LLM consumption
553
- Format: [Interaction Context #N, #N-1, ...] + User Context
554
  """
555
  user_context = context.get("user_context", "")
556
  interaction_contexts = context.get("interaction_contexts", [])
 
 
557
 
558
  # Format interaction contexts as requested
559
  formatted_interactions = []
560
  for idx, ic in enumerate(interaction_contexts[:10]): # Last 10 interactions
561
  formatted_interactions.append(f"[Interaction Context #{len(interaction_contexts) - idx}]\n{ic.get('summary', '')}")
562
 
563
- # Combine User Context + Interaction Contexts
564
  combined_context = ""
 
 
565
  if user_context:
566
  combined_context += f"[User Context]\n{user_context}\n\n"
567
  if formatted_interactions:
@@ -571,6 +594,7 @@ Keep the summary concise (approximately 100 tokens)."""
571
  "session_id": context.get("session_id"),
572
  "user_id": context.get("user_id", "Test_Any"),
573
  "user_context": user_context,
 
574
  "interaction_contexts": interaction_contexts,
575
  "combined_context": combined_context, # For direct use in prompts
576
  "preferences": context.get("preferences", {}),
@@ -684,10 +708,31 @@ Keep the summary concise (approximately 100 tokens)."""
684
  "timestamp": timestamp
685
  })
686
 
 
 
 
 
687
  context = {
688
  "session_id": session_id,
689
  "user_id": user_id,
690
  "interaction_contexts": interaction_contexts,
 
691
  "preferences": user_metadata.get("preferences", {}),
692
  "active_tasks": user_metadata.get("active_tasks", []),
693
  "last_activity": last_activity,
@@ -711,6 +756,7 @@ Keep the summary concise (approximately 100 tokens)."""
711
  "session_id": session_id,
712
  "user_id": user_id,
713
  "interaction_contexts": [],
 
714
  "preferences": {},
715
  "active_tasks": [],
716
  "user_context_loaded": False,
@@ -730,6 +776,7 @@ Keep the summary concise (approximately 100 tokens)."""
730
  "session_id": session_id,
731
  "user_id": user_id,
732
  "interaction_contexts": [],
 
733
  "preferences": {},
734
  "active_tasks": [],
735
  "user_context_loaded": False,
@@ -749,6 +796,7 @@ Keep the summary concise (approximately 100 tokens)."""
749
  "session_id": session_id,
750
  "user_id": user_id,
751
  "interaction_contexts": [],
 
752
  "preferences": {},
753
  "active_tasks": [],
754
  "user_context_loaded": False,
@@ -762,6 +810,82 @@ Keep the summary concise (approximately 100 tokens)."""
762
  """
763
  self.session_cache[cache_key] = context
764
 
 
 
 
 
 
 
 
765
  def _update_context(self, context: dict, user_input: str, response: str = None, user_id: str = "Test_Any") -> dict:
766
  """
767
  Update context with deduplication and idempotency checks
@@ -1042,6 +1166,16 @@ Keep the summary concise (approximately 100 tokens)."""
1042
  except:
1043
  return False
1044
 
 
 
 
 
1045
  def optimize_database_indexes(self):
1046
  """Create database indexes for better query performance"""
1047
  try:
 
264
  # Cache session context (cache invalidation for user changes is handled in _retrieve_from_db)
265
  self._warm_memory_cache(session_cache_key, session_context)
266
 
267
+ # Handle user context separately - load only once and cache thereafter
268
+ # Cache does not refer to database after initial load
269
  if not user_context or not user_context.get("user_context_loaded"):
270
  user_context_data = await self.get_user_context(user_id)
271
  user_context = {
 
273
  "user_context_loaded": True,
274
  "user_id": user_id
275
  }
276
+ # Cache user context separately - this is the only database query for user context
277
  self._warm_memory_cache(user_cache_key, user_context)
278
+ logger.debug(f"User context loaded once for {user_id} and cached")
279
+ else:
280
+ # User context already cached, use it without database query
281
+ logger.debug(f"Using cached user context for {user_id}")
282
 
283
  # Merge contexts without duplication
284
  merged_context = {
 
428
  # Store in database
429
  conn = sqlite3.connect(self.db_path)
430
  cursor = conn.cursor()
431
+ created_at = datetime.now().isoformat()
432
  cursor.execute("""
433
  INSERT OR REPLACE INTO interaction_contexts
434
  (interaction_id, session_id, user_input, system_response, interaction_summary, created_at)
 
439
  user_input[:500],
440
  system_response[:1000],
441
  summary.strip(),
442
+ created_at
443
  ))
444
  conn.commit()
445
  conn.close()
446
 
447
+ # Update cache immediately with new interaction context
448
+ # This ensures cache is synchronized with database at the same time
449
+ self._update_cache_with_interaction_context(session_id, summary.strip(), created_at)
450
+
451
+ logger.info(f"✓ Generated interaction context for {interaction_id} and updated cache")
452
  return summary.strip()
453
  except Exception as e:
454
  logger.error(f"Error generating interaction context: {e}", exc_info=True)
 
462
 
463
  async def generate_session_context(self, session_id: str, user_id: str = "Test_Any") -> str:
464
  """
465
+ Generate Session Context (100-token summary) at every turn
466
+ Uses cached interaction contexts instead of querying database
467
+ Updates both database and cache immediately
468
  """
469
  try:
470
+ # Get interaction contexts from cache (no database query)
471
+ session_cache_key = f"session_{session_id}"
472
+ cached_context = self.session_cache.get(session_cache_key)
473
 
474
+ if not cached_context:
475
+ logger.warning(f"No cached context found for session {session_id}, cannot generate session context")
476
+ return ""
 
 
 
477
 
478
+ interaction_contexts = cached_context.get('interaction_contexts', [])
479
+
480
+ if not interaction_contexts:
481
+ logger.info(f"No interaction contexts available for session {session_id} to summarize")
482
+ return ""
483
+
484
+ # Use cached interaction contexts (from cache, not database)
485
+ interaction_summaries = [ic.get('summary', '') for ic in interaction_contexts if ic.get('summary')]
486
 
487
  if not interaction_summaries:
488
+ logger.info(f"No interaction summaries available for session {session_id}")
489
  return ""
490
 
491
  # Generate session summary using LLM (100 tokens)
 
514
 
515
  if session_summary and isinstance(session_summary, str) and session_summary.strip():
516
  # Store in database
517
+ created_at = datetime.now().isoformat()
518
  conn = sqlite3.connect(self.db_path)
519
  cursor = conn.cursor()
520
  cursor.execute("""
521
  INSERT OR REPLACE INTO session_contexts
522
  (session_id, user_id, session_summary, created_at)
523
  VALUES (?, ?, ?, ?)
524
+ """, (session_id, user_id, session_summary.strip(), created_at))
525
  conn.commit()
526
  conn.close()
527
 
528
+ # Update cache immediately with new session context
529
+ # This ensures cache is synchronized with database at the same time
530
+ self._update_cache_with_session_context(session_id, session_summary.strip(), created_at)
531
+
532
+ logger.info(f"✓ Generated session context for {session_id} and updated cache")
533
  return session_summary.strip()
534
  except Exception as e:
535
  logger.error(f"Error generating session context: {e}", exc_info=True)
 
543
 
544
  async def end_session(self, session_id: str, user_id: str = "Test_Any"):
545
  """
546
+ End session and clear cache
547
+ Note: Session context is already generated at every turn, so this just clears cache
548
  """
549
  try:
550
+ # Session context is already generated at every turn (no need to regenerate)
 
 
551
  # Clear in-memory cache for this session (session-only key)
552
  session_cache_key = f"session_{session_id}"
553
  if session_cache_key in self.session_cache:
 
569
  def _optimize_context(self, context: dict) -> dict:
570
  """
571
  Optimize context for LLM consumption
572
+ Format: [Session Context] + [User Context] + [Interaction Context #N, #N-1, ...]
573
  """
574
  user_context = context.get("user_context", "")
575
  interaction_contexts = context.get("interaction_contexts", [])
576
+ session_context = context.get("session_context", {})
577
+ session_summary = session_context.get("summary", "") if isinstance(session_context, dict) else ""
578
 
579
  # Format interaction contexts as requested
580
  formatted_interactions = []
581
  for idx, ic in enumerate(interaction_contexts[:10]): # Last 10 interactions
582
  formatted_interactions.append(f"[Interaction Context #{len(interaction_contexts) - idx}]\n{ic.get('summary', '')}")
583
 
584
+ # Combine Session Context + User Context + Interaction Contexts
585
  combined_context = ""
586
+ if session_summary:
587
+ combined_context += f"[Session Context]\n{session_summary}\n\n"
588
  if user_context:
589
  combined_context += f"[User Context]\n{user_context}\n\n"
590
  if formatted_interactions:
 
594
  "session_id": context.get("session_id"),
595
  "user_id": context.get("user_id", "Test_Any"),
596
  "user_context": user_context,
597
+ "session_context": session_context,
598
  "interaction_contexts": interaction_contexts,
599
  "combined_context": combined_context, # For direct use in prompts
600
  "preferences": context.get("preferences", {}),
 
708
  "timestamp": timestamp
709
  })
710
 
711
+ # Get session context from database
712
+ session_context_data = None
713
+ try:
714
+ cursor.execute("""
715
+ SELECT session_summary, created_at
716
+ FROM session_contexts
717
+ WHERE session_id = ?
718
+ ORDER BY created_at DESC
719
+ LIMIT 1
720
+ """, (session_id,))
721
+ sc_row = cursor.fetchone()
722
+ if sc_row and sc_row[0]:
723
+ session_context_data = {
724
+ "summary": sc_row[0],
725
+ "timestamp": sc_row[1]
726
+ }
727
+ except sqlite3.OperationalError:
728
+ # Table might not exist yet
729
+ pass
730
+
731
  context = {
732
  "session_id": session_id,
733
  "user_id": user_id,
734
  "interaction_contexts": interaction_contexts,
735
+ "session_context": session_context_data,
736
  "preferences": user_metadata.get("preferences", {}),
737
  "active_tasks": user_metadata.get("active_tasks", []),
738
  "last_activity": last_activity,
 
756
  "session_id": session_id,
757
  "user_id": user_id,
758
  "interaction_contexts": [],
759
+ "session_context": None,
760
  "preferences": {},
761
  "active_tasks": [],
762
  "user_context_loaded": False,
 
776
  "session_id": session_id,
777
  "user_id": user_id,
778
  "interaction_contexts": [],
779
+ "session_context": None,
780
  "preferences": {},
781
  "active_tasks": [],
782
  "user_context_loaded": False,
 
796
  "session_id": session_id,
797
  "user_id": user_id,
798
  "interaction_contexts": [],
799
+ "session_context": None,
800
  "preferences": {},
801
  "active_tasks": [],
802
  "user_context_loaded": False,
 
810
  """
811
  self.session_cache[cache_key] = context
812
 
813
+ def _update_cache_with_interaction_context(self, session_id: str, interaction_summary: str, created_at: str):
814
+ """
815
+ Update cache with new interaction context immediately after database update
816
+ This keeps cache synchronized with database without requiring database queries
817
+ """
818
+ session_cache_key = f"session_{session_id}"
819
+
820
+ # Get current cached context if it exists
821
+ cached_context = self.session_cache.get(session_cache_key)
822
+
823
+ if cached_context:
824
+ # Add new interaction context to the beginning of the list (most recent first)
825
+ interaction_contexts = cached_context.get('interaction_contexts', [])
826
+ new_interaction = {
827
+ "summary": interaction_summary,
828
+ "timestamp": created_at
829
+ }
830
+ # Insert at beginning and keep only last 20 (matches DB query limit)
831
+ interaction_contexts.insert(0, new_interaction)
832
+ interaction_contexts = interaction_contexts[:20]
833
+
834
+ # Update cached context with new interaction contexts
835
+ cached_context['interaction_contexts'] = interaction_contexts
836
+ self.session_cache[session_cache_key] = cached_context
837
+
838
+ logger.debug(f"Cache updated with new interaction context for session {session_id} (total: {len(interaction_contexts)})")
839
+ else:
840
+ # If cache doesn't exist, create new entry
841
+ new_context = {
842
+ "session_id": session_id,
843
+ "interaction_contexts": [{
844
+ "summary": interaction_summary,
845
+ "timestamp": created_at
846
+ }],
847
+ "preferences": {},
848
+ "active_tasks": [],
849
+ "user_context_loaded": False
850
+ }
851
+ self.session_cache[session_cache_key] = new_context
852
+ logger.debug(f"Created new cache entry with interaction context for session {session_id}")
853
+
854
+ def _update_cache_with_session_context(self, session_id: str, session_summary: str, created_at: str):
855
+ """
856
+ Update cache with new session context immediately after database update
857
+ This keeps cache synchronized with database without requiring database queries
858
+ """
859
+ session_cache_key = f"session_{session_id}"
860
+
861
+ # Get current cached context if it exists
862
+ cached_context = self.session_cache.get(session_cache_key)
863
+
864
+ if cached_context:
865
+ # Update session context in cache
866
+ cached_context['session_context'] = {
867
+ "summary": session_summary,
868
+ "timestamp": created_at
869
+ }
870
+ self.session_cache[session_cache_key] = cached_context
871
+
872
+ logger.debug(f"Cache updated with new session context for session {session_id}")
873
+ else:
874
+ # If cache doesn't exist, create new entry
875
+ new_context = {
876
+ "session_id": session_id,
877
+ "session_context": {
878
+ "summary": session_summary,
879
+ "timestamp": created_at
880
+ },
881
+ "interaction_contexts": [],
882
+ "preferences": {},
883
+ "active_tasks": [],
884
+ "user_context_loaded": False
885
+ }
886
+ self.session_cache[session_cache_key] = new_context
887
+ logger.debug(f"Created new cache entry with session context for session {session_id}")
888
+
889
  def _update_context(self, context: dict, user_input: str, response: str = None, user_id: str = "Test_Any") -> dict:
890
  """
891
  Update context with deduplication and idempotency checks
 
1166
  except:
1167
  return False
1168
 
1169
+ def invalidate_session_cache(self, session_id: str):
1170
+ """
1171
+ Invalidate cached context for a session to force fresh retrieval
1172
+ Only affects cache management - does not change application functionality
1173
+ """
1174
+ session_cache_key = f"session_{session_id}"
1175
+ if session_cache_key in self.session_cache:
1176
+ del self.session_cache[session_cache_key]
1177
+ logger.info(f"Cache invalidated for session {session_id} to ensure fresh context retrieval")
1178
+
1179
  def optimize_database_indexes(self):
1180
  """Create database indexes for better query performance"""
1181
  try:
src/orchestrator_engine.py CHANGED
@@ -31,6 +31,9 @@ class MVPOrchestrator:
31
  self.context_manager = context_manager
32
  self.agents = agents
33
  self.execution_trace = []
 
 
 
34
 
35
  # Safety revision thresholds
36
  self.safety_thresholds = {
@@ -171,24 +174,29 @@ class MVPOrchestrator:
171
  # Use context with deduplication check
172
  context = await self._get_or_create_context(session_id, user_input, user_id)
173
 
174
- logger.info(f"Context retrieved: {len(context.get('interaction_contexts', []))} interaction contexts")
175
-
176
- # Add context analysis to reasoning chain
177
  interaction_contexts_count = len(context.get('interaction_contexts', []))
 
 
 
178
  user_context = context.get('user_context', '')
179
  has_user_context = bool(user_context)
180
 
 
 
 
 
 
181
  reasoning_chain["chain_of_thought"]["step_1"] = {
182
- "hypothesis": f"User is asking about: '{self._extract_main_topic(user_input)}'",
183
  "evidence": [
184
  f"Previous interaction contexts: {interaction_contexts_count}",
185
  f"User context available: {has_user_context}",
186
  f"Session duration: {self._calculate_session_duration(context)}",
187
- f"Topic continuity: {self._analyze_topic_continuity(context, user_input)}",
188
- f"Query keywords: {self._extract_keywords(user_input)}"
189
  ],
190
  "confidence": 0.85,
191
- "reasoning": f"Context analysis shows user is focused on {self._extract_main_topic(user_input)} with {interaction_contexts_count} previous interaction contexts and {'existing' if has_user_context else 'new'} user context"
192
  }
193
 
194
  # Step 3: Intent recognition with enhanced CoT
@@ -235,12 +243,12 @@ class MVPOrchestrator:
235
  f"Confidence score: {skills_result.get('confidence_score', 0.5)}"
236
  ],
237
  "confidence": skills_result.get('confidence_score', 0.5),
238
- "reasoning": f"Skills identification completed for topic '{self._extract_main_topic(user_input)}' with {len(skills_result.get('identified_skills', []))} relevant skills"
239
  }
240
 
241
  # Add intent reasoning to chain
242
  reasoning_chain["chain_of_thought"]["step_2"] = {
243
- "hypothesis": f"User intent is '{intent_result.get('primary_intent', 'unknown')}' for topic '{self._extract_main_topic(user_input)}'",
244
  "evidence": [
245
  f"Pattern analysis: {self._extract_pattern_evidence(user_input)}",
246
  f"Confidence scores: {intent_result.get('confidence_scores', {})}",
@@ -248,7 +256,7 @@ class MVPOrchestrator:
248
  f"Query complexity: {self._assess_query_complexity(user_input)}"
249
  ],
250
  "confidence": intent_result.get('confidence_scores', {}).get(intent_result.get('primary_intent', 'unknown'), 0.7),
251
- "reasoning": f"Intent '{intent_result.get('primary_intent', 'unknown')}' detected for {self._extract_main_topic(user_input)} based on linguistic patterns and context"
252
  }
253
 
254
  # Step 4: Agent execution planning with reasoning
@@ -257,7 +265,7 @@ class MVPOrchestrator:
257
 
258
  # Add execution planning reasoning
259
  reasoning_chain["chain_of_thought"]["step_3"] = {
260
- "hypothesis": f"Optimal approach for '{intent_result.get('primary_intent', 'unknown')}' intent on '{self._extract_main_topic(user_input)}'",
261
  "evidence": [
262
  f"Intent complexity: {self._assess_intent_complexity(intent_result)}",
263
  f"Required agents: {execution_plan.get('agents_to_execute', [])}",
@@ -265,7 +273,7 @@ class MVPOrchestrator:
265
  f"Response scope: {self._determine_response_scope(user_input)}"
266
  ],
267
  "confidence": 0.80,
268
- "reasoning": f"Agent selection optimized for {intent_result.get('primary_intent', 'unknown')} intent regarding {self._extract_main_topic(user_input)}"
269
  }
270
 
271
  # Step 5: Parallel agent execution
@@ -293,7 +301,7 @@ class MVPOrchestrator:
293
 
294
  # Add synthesis reasoning
295
  reasoning_chain["chain_of_thought"]["step_4"] = {
296
- "hypothesis": f"Response synthesis for '{self._extract_main_topic(user_input)}' using '{final_response.get('synthesis_method', 'unknown')}' method",
297
  "evidence": [
298
  f"Synthesis quality: {final_response.get('coherence_score', 0.7)}",
299
  f"Source integration: {len(final_response.get('source_references', []))} sources",
@@ -301,7 +309,7 @@ class MVPOrchestrator:
301
  f"Content relevance: {self._assess_content_relevance(user_input, final_response)}"
302
  ],
303
  "confidence": final_response.get('coherence_score', 0.7),
304
- "reasoning": f"Multi-source synthesis for {self._extract_main_topic(user_input)} using {final_response.get('synthesis_method', 'unknown')} approach"
305
  }
306
 
307
  # Step 7: Safety and bias check with reasoning
@@ -373,7 +381,7 @@ This response has been flagged for potential safety concerns:
373
 
374
  # Add safety reasoning
375
  reasoning_chain["chain_of_thought"]["step_5"] = {
376
- "hypothesis": f"Safety validation for response about '{self._extract_main_topic(user_input)}'",
377
  "evidence": [
378
  f"Safety score: {safety_checked.get('safety_analysis', {}).get('overall_safety_score', 0.8)}",
379
  f"Warnings generated: {len(safety_checked.get('warnings', []))}",
@@ -381,7 +389,7 @@ This response has been flagged for potential safety concerns:
381
  f"Content appropriateness: {self._assess_content_appropriateness(user_input, safety_checked)}"
382
  ],
383
  "confidence": safety_checked.get('safety_analysis', {}).get('overall_safety_score', 0.8),
384
- "reasoning": f"Safety analysis for {self._extract_main_topic(user_input)} content with non-blocking warning system"
385
  }
386
 
387
  # Update final_response to use the response_content (which may have warnings appended)
@@ -392,7 +400,7 @@ This response has been flagged for potential safety concerns:
392
  final_response['response'] = response_content
393
 
394
  # Generate alternative paths and uncertainty analysis
395
- reasoning_chain["alternative_paths"] = self._generate_alternative_paths(intent_result, user_input)
396
  reasoning_chain["uncertainty_areas"] = self._identify_uncertainty_areas(intent_result, final_response, safety_checked)
397
  reasoning_chain["evidence_sources"] = self._extract_evidence_sources(intent_result, final_response, context)
398
  reasoning_chain["confidence_calibration"] = self._calibrate_confidence_scores(reasoning_chain)
@@ -446,6 +454,22 @@ This response has been flagged for potential safety concerns:
446
  system_response=response_text,
447
  user_id=user_id
448
  )
 
 
 
 
449
  except Exception as e:
450
  logger.error(f"Error generating interaction context: {e}", exc_info=True)
451
 
@@ -633,17 +657,23 @@ This response has been flagged for potential safety concerns:
633
  return results
634
 
635
  def _build_context_summary(self, context: dict) -> str:
636
- """Build a concise summary of context for task execution"""
637
  summary_parts = []
638
 
639
- # Extract interaction contexts
 
 
 
 
 
 
640
  interaction_contexts = context.get('interaction_contexts', [])
641
  if interaction_contexts:
642
  recent_summaries = [ic.get('summary', '') for ic in interaction_contexts[-3:]]
643
  if recent_summaries:
644
  summary_parts.append(f"Recent conversation topics: {', '.join(recent_summaries)}")
645
 
646
- # Extract user context
647
  user_context = context.get('user_context', '')
648
  if user_context:
649
  summary_parts.append(f"User background: {user_context[:200]}")
@@ -1001,37 +1031,72 @@ Please revise the response to address these concerns while maintaining helpfulne
1001
  else:
1002
  return "Long session (> 20 interactions)"
1003
 
1004
- def _analyze_topic_continuity(self, context: dict, user_input: str) -> str:
1005
- """Analyze topic continuity for reasoning context"""
1006
- interaction_contexts = context.get('interaction_contexts', [])
1007
- if not interaction_contexts:
1008
- return "No previous context"
1009
-
1010
- # Analyze topics from interaction context summaries
1011
- recent_topics = []
1012
- for ic in interaction_contexts[:3]: # Last 3 interactions
1013
- summary = ic.get('summary', '').lower()
1014
- if 'machine learning' in summary or 'ml' in summary:
1015
- recent_topics.append('machine learning')
1016
- elif 'ai' in summary or 'artificial intelligence' in summary:
1017
- recent_topics.append('artificial intelligence')
1018
- elif 'data' in summary:
1019
- recent_topics.append('data science')
1020
-
1021
- current_input_lower = user_input.lower()
1022
- if 'machine learning' in current_input_lower or 'ml' in current_input_lower:
1023
- current_topic = 'machine learning'
1024
- elif 'ai' in current_input_lower or 'artificial intelligence' in current_input_lower:
1025
- current_topic = 'artificial intelligence'
1026
- elif 'data' in current_input_lower:
1027
- current_topic = 'data science'
1028
- else:
1029
- current_topic = 'general'
1030
-
1031
- if current_topic in recent_topics:
1032
- return f"Continuing {current_topic} discussion"
1033
- else:
1034
- return f"New topic: {current_topic}"
 
 
 
 
 
 
 
 
 
 
1035
 
1036
  def _extract_pattern_evidence(self, user_input: str) -> str:
1037
  """Extract pattern evidence for intent reasoning"""
@@ -1068,11 +1133,10 @@ Please revise the response to address these concerns while maintaining helpfulne
1068
  else:
1069
  return "Complex, multi-faceted intent"
1070
 
1071
- def _generate_alternative_paths(self, intent_result: dict, user_input: str) -> list:
1072
  """Generate alternative reasoning paths based on actual content"""
1073
  primary_intent = intent_result.get('primary_intent', 'unknown')
1074
  secondary_intents = intent_result.get('secondary_intents', [])
1075
- main_topic = self._extract_main_topic(user_input)
1076
 
1077
  alternative_paths = []
1078
 
@@ -1213,55 +1277,92 @@ Please revise the response to address these concerns while maintaining helpfulne
1213
  "calibration_method": "Weighted average of step confidences"
1214
  }
1215
 
1216
- def _extract_main_topic(self, user_input: str) -> str:
1217
- """Extract the main topic from user input for context-aware reasoning"""
1218
- input_lower = user_input.lower()
1219
-
1220
- # Topic extraction based on keywords
1221
- if any(word in input_lower for word in ['curriculum', 'course', 'teach', 'learning', 'education']):
1222
- if 'ai' in input_lower or 'chatbot' in input_lower or 'assistant' in input_lower:
1223
- return "AI chatbot course curriculum"
1224
- elif 'programming' in input_lower or 'python' in input_lower:
1225
- return "Programming course curriculum"
1226
- else:
1227
- return "Educational course design"
1228
-
1229
- elif any(word in input_lower for word in ['machine learning', 'ml', 'neural network', 'deep learning']):
1230
- return "Machine learning concepts"
1231
-
1232
- elif any(word in input_lower for word in ['ai', 'artificial intelligence', 'chatbot', 'assistant']):
1233
- return "Artificial intelligence and chatbots"
1234
-
1235
- elif any(word in input_lower for word in ['data science', 'data analysis', 'analytics']):
1236
- return "Data science and analysis"
1237
-
1238
- elif any(word in input_lower for word in ['programming', 'coding', 'development', 'software']):
1239
- return "Software development and programming"
1240
-
1241
- else:
1242
- # Extract first few words as topic
 
 
 
 
 
 
 
 
 
 
 
 
1243
  words = user_input.split()[:4]
1244
  return " ".join(words) if words else "General inquiry"
1245
 
1246
- def _extract_keywords(self, user_input: str) -> str:
1247
- """Extract key terms from user input"""
1248
- input_lower = user_input.lower()
1249
- keywords = []
1250
-
1251
- # Extract important terms
1252
- important_terms = [
1253
- 'curriculum', 'course', 'teach', 'learning', 'education',
1254
- 'ai', 'artificial intelligence', 'chatbot', 'assistant',
1255
- 'machine learning', 'ml', 'neural network', 'deep learning',
1256
- 'programming', 'python', 'development', 'software',
1257
- 'data science', 'analytics', 'analysis'
1258
- ]
1259
-
1260
- for term in important_terms:
1261
- if term in input_lower:
1262
- keywords.append(term)
1263
-
1264
- return ", ".join(keywords[:5]) if keywords else "General terms"
1265
 
1266
  def _assess_query_complexity(self, user_input: str) -> str:
1267
  """Assess the complexity of the user query"""
@@ -1627,7 +1728,9 @@ Revised Response:"""
1627
  "complex_refinement": "add clarifying details to your existing question"
1628
  })
1629
 
1630
- topic = self._extract_main_topic(original_user_input)
 
 
1631
 
1632
  # Adaptive guidance based on input complexity
1633
  if input_complexity["is_complex"]:
 
31
  self.context_manager = context_manager
32
  self.agents = agents
33
  self.execution_trace = []
34
+ # Cache for topic extraction to reduce API calls
35
+ self._topic_cache = {}
36
+ self._topic_cache_max_size = 100 # Limit cache size
37
 
38
  # Safety revision thresholds
39
  self.safety_thresholds = {
 
174
  # Use context with deduplication check
175
  context = await self._get_or_create_context(session_id, user_input, user_id)
176
 
 
 
 
177
  interaction_contexts_count = len(context.get('interaction_contexts', []))
178
+ logger.info(f"Context retrieved: {interaction_contexts_count} interaction contexts")
179
+
180
+ # Add context analysis to reasoning chain (using LLM-based topic extraction)
181
  user_context = context.get('user_context', '')
182
  has_user_context = bool(user_context)
183
 
184
+ # Extract topic and keywords using LLM (async)
185
+ main_topic = await self._extract_main_topic(user_input, context)
186
+ topic_continuity = await self._analyze_topic_continuity(context, user_input)
187
+ query_keywords = await self._extract_keywords(user_input)
188
+
189
  reasoning_chain["chain_of_thought"]["step_1"] = {
190
+ "hypothesis": f"User is asking about: '{main_topic}'",
191
  "evidence": [
192
  f"Previous interaction contexts: {interaction_contexts_count}",
193
  f"User context available: {has_user_context}",
194
  f"Session duration: {self._calculate_session_duration(context)}",
195
+ f"Topic continuity: {topic_continuity}",
196
+ f"Query keywords: {query_keywords}"
197
  ],
198
  "confidence": 0.85,
199
+ "reasoning": f"Context analysis shows user is focused on {main_topic} with {interaction_contexts_count} previous interaction contexts and {'existing' if has_user_context else 'new'} user context"
200
  }
201
 
202
  # Step 3: Intent recognition with enhanced CoT
 
243
  f"Confidence score: {skills_result.get('confidence_score', 0.5)}"
244
  ],
245
  "confidence": skills_result.get('confidence_score', 0.5),
246
+ "reasoning": f"Skills identification completed for topic '{main_topic}' with {len(skills_result.get('identified_skills', []))} relevant skills"
247
  }
248
 
249
  # Add intent reasoning to chain
250
  reasoning_chain["chain_of_thought"]["step_2"] = {
251
+ "hypothesis": f"User intent is '{intent_result.get('primary_intent', 'unknown')}' for topic '{main_topic}'",
252
  "evidence": [
253
  f"Pattern analysis: {self._extract_pattern_evidence(user_input)}",
254
  f"Confidence scores: {intent_result.get('confidence_scores', {})}",
 
256
  f"Query complexity: {self._assess_query_complexity(user_input)}"
257
  ],
258
  "confidence": intent_result.get('confidence_scores', {}).get(intent_result.get('primary_intent', 'unknown'), 0.7),
259
+ "reasoning": f"Intent '{intent_result.get('primary_intent', 'unknown')}' detected for {main_topic} based on linguistic patterns and context"
260
  }
261
 
262
  # Step 4: Agent execution planning with reasoning
 
265
 
266
  # Add execution planning reasoning
267
  reasoning_chain["chain_of_thought"]["step_3"] = {
268
+ "hypothesis": f"Optimal approach for '{intent_result.get('primary_intent', 'unknown')}' intent on '{main_topic}'",
269
  "evidence": [
270
  f"Intent complexity: {self._assess_intent_complexity(intent_result)}",
271
  f"Required agents: {execution_plan.get('agents_to_execute', [])}",
 
273
  f"Response scope: {self._determine_response_scope(user_input)}"
274
  ],
275
  "confidence": 0.80,
276
+ "reasoning": f"Agent selection optimized for {intent_result.get('primary_intent', 'unknown')} intent regarding {main_topic}"
277
  }
278
 
279
  # Step 5: Parallel agent execution
 
301
 
302
  # Add synthesis reasoning
303
  reasoning_chain["chain_of_thought"]["step_4"] = {
304
+ "hypothesis": f"Response synthesis for '{main_topic}' using '{final_response.get('synthesis_method', 'unknown')}' method",
305
  "evidence": [
306
  f"Synthesis quality: {final_response.get('coherence_score', 0.7)}",
307
  f"Source integration: {len(final_response.get('source_references', []))} sources",
 
309
  f"Content relevance: {self._assess_content_relevance(user_input, final_response)}"
310
  ],
311
  "confidence": final_response.get('coherence_score', 0.7),
312
+ "reasoning": f"Multi-source synthesis for {main_topic} using {final_response.get('synthesis_method', 'unknown')} approach"
313
  }
314
 
315
  # Step 7: Safety and bias check with reasoning
 
381
 
382
  # Add safety reasoning
383
  reasoning_chain["chain_of_thought"]["step_5"] = {
384
+ "hypothesis": f"Safety validation for response about '{main_topic}'",
385
  "evidence": [
386
  f"Safety score: {safety_checked.get('safety_analysis', {}).get('overall_safety_score', 0.8)}",
387
  f"Warnings generated: {len(safety_checked.get('warnings', []))}",
 
389
  f"Content appropriateness: {self._assess_content_appropriateness(user_input, safety_checked)}"
390
  ],
391
  "confidence": safety_checked.get('safety_analysis', {}).get('overall_safety_score', 0.8),
392
+ "reasoning": f"Safety analysis for {main_topic} content with non-blocking warning system"
393
  }
394
 
395
  # Update final_response to use the response_content (which may have warnings appended)
 
400
  final_response['response'] = response_content
401
 
402
  # Generate alternative paths and uncertainty analysis
403
+ reasoning_chain["alternative_paths"] = self._generate_alternative_paths(intent_result, user_input, main_topic)
404
  reasoning_chain["uncertainty_areas"] = self._identify_uncertainty_areas(intent_result, final_response, safety_checked)
405
  reasoning_chain["evidence_sources"] = self._extract_evidence_sources(intent_result, final_response, context)
406
  reasoning_chain["confidence_calibration"] = self._calibrate_confidence_scores(reasoning_chain)
 
  system_response=response_text,
  user_id=user_id
  )
+ # Cache is automatically updated by generate_interaction_context()
+
+ # STEP 3: Generate Session Context after each response (100 tokens)
+ # Uses cached interaction contexts, updates database and cache
+ try:
+ await self.context_manager.generate_session_context(session_id, user_id)
+ # Cache is automatically updated by generate_session_context()
+ except Exception as e:
+ logger.error(f"Error generating session context: {e}", exc_info=True)
+
+ # Clear orchestrator-level cache to force refresh on next request
+ if hasattr(self, '_context_cache'):
+ orchestrator_cache_key = f"context_{session_id}"
+ if orchestrator_cache_key in self._context_cache:
+ del self._context_cache[orchestrator_cache_key]
+ logger.debug(f"Orchestrator cache cleared for session {session_id} to refresh with updated contexts")
  except Exception as e:
  logger.error(f"Error generating interaction context: {e}", exc_info=True)
 
 
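The hunk above is the heart of the fix: after the interaction context is stored, the session context is regenerated and the orchestrator-level cache entry is dropped. A condensed sketch of that ordering, with a hypothetical `after_response` wrapper (names mirror the diff; the context manager is assumed to refresh its own caches, as the inline comments state):

```python
# Hypothetical wrapper showing the post-response ordering from the diff.
async def after_response(orchestrator, session_id: str, user_id: str) -> None:
    # 1. Interaction context was already generated and stored (see above),
    #    which also refreshes the session-level cache.
    # 2. Regenerate the session context from the cached interaction contexts.
    await orchestrator.context_manager.generate_session_context(session_id, user_id)
    # 3. Drop the orchestrator-level entry so the next request rebuilds its
    #    view from the freshly updated caches instead of stale data.
    orchestrator._context_cache.pop(f"context_{session_id}", None)
```

Using `dict.pop(key, None)` keeps the sketch tolerant of a missing entry, matching the `if orchestrator_cache_key in self._context_cache` guard in the diff.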
  return results

  def _build_context_summary(self, context: dict) -> str:
+ """Build a concise summary of context for task execution (all from cache)"""
  summary_parts = []

+ # Extract session context (from cache)
+ session_context = context.get('session_context', {})
+ session_summary = session_context.get('summary', '') if isinstance(session_context, dict) else ""
+ if session_summary:
+ summary_parts.append(f"Session summary: {session_summary[:1500]}")
+
+ # Extract interaction contexts (from cache)
  interaction_contexts = context.get('interaction_contexts', [])
  if interaction_contexts:
  recent_summaries = [ic.get('summary', '') for ic in interaction_contexts[-3:]]
  if recent_summaries:
  summary_parts.append(f"Recent conversation topics: {', '.join(recent_summaries)}")

+ # Extract user context (from cache)
  user_context = context.get('user_context', '')
  if user_context:
  summary_parts.append(f"User background: {user_context[:200]}")
 
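For reference, `_build_context_summary` reads only three keys from the cached context dict. A hypothetical input and the summary parts it would yield (the joining of `summary_parts` happens outside the hunk shown):

```python
# Hypothetical cached-context shape; real cache entries may carry more fields.
context = {
    "session_context": {"summary": "User is debugging stale context retrieval."},
    "interaction_contexts": [
        {"summary": "stale session cache"},
        {"summary": "orchestrator cache keys"},
    ],
    "user_context": "Backend developer working on an agent orchestrator.",
}
# summary_parts would then contain:
#   "Session summary: User is debugging stale context retrieval."
#   "Recent conversation topics: stale session cache, orchestrator cache keys"
#   "User background: Backend developer working on an agent orchestrator."
```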
  else:
  return "Long session (> 20 interactions)"

+ async def _analyze_topic_continuity(self, context: dict, user_input: str) -> str:
+ """Analyze topic continuity using LLM zero-shot classification (uses session context and interaction contexts from cache)"""
+ try:
+ # Check session context first (from cache)
+ session_context = context.get('session_context', {})
+ session_summary = session_context.get('summary', '') if isinstance(session_context, dict) else ""
+
+ interaction_contexts = context.get('interaction_contexts', [])
+ if not interaction_contexts and not session_summary:
+ return "No previous context"
+
+ # Build context summary from cache
+ recent_interactions_summary = "\n".join([
+ f"- {ic.get('summary', '')}"
+ for ic in interaction_contexts[:3]
+ if ic.get('summary')
+ ])
+
+ # Use LLM for context-aware topic continuity analysis
+ if self.llm_router:
+ prompt = f"""Determine if the current query continues the previous conversation topic or introduces a new topic.
+
+ Session Summary: {session_summary[:300] if session_summary else 'No session summary available'}
+
+ Recent Interactions:
+ {recent_interactions_summary if recent_interactions_summary else 'No recent interactions'}
+
+ Current Query: "{user_input}"
+
+ Analyze whether the current query:
+ 1. Continues the same topic from previous interactions
+ 2. Introduces a new topic
+
+ Respond with EXACTLY one of these formats:
+ - "Continuing [topic name] discussion" if same topic
+ - "New topic: [topic name]" if different topic
+
+ Keep topic name to 2-5 words. Example responses:
+ - "Continuing machine learning discussion"
+ - "New topic: financial analysis"
+ - "Continuing software development discussion"
+ """
+
+ continuity_result = await self.llm_router.route_inference(
+ task_type="general_reasoning",
+ prompt=prompt,
+ max_tokens=50,
+ temperature=0.3  # Lower temperature for consistency
+ )
+
+ if continuity_result and isinstance(continuity_result, str) and continuity_result.strip():
+ result = continuity_result.strip()
+ # Validate format
+ if "Continuing" in result or "New topic:" in result:
+ logger.debug(f"Topic continuity analysis: {result}")
+ return result
+
+ # Fallback to simple check if LLM unavailable
+ if not session_summary and not recent_interactions_summary:
+ return "No previous context"
+ return "Topic continuity analysis unavailable"
+
+ except Exception as e:
+ logger.error(f"Error in LLM-based topic continuity analysis: {e}", exc_info=True)
+ # Fallback
+ return "Topic continuity analysis failed"
 
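A sketch of how a caller might branch on the returned label; the two prefixes are exactly the formats the prompt instructs the LLM to emit, though the validation above accepts any response containing them:

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical caller sketch; in the real code this would be a method on
# the orchestrator, which already has a module-level logger.
async def route_by_continuity(self, context: dict, user_input: str) -> None:
    continuity = await self._analyze_topic_continuity(context, user_input)
    if continuity.startswith("Continuing"):
        return  # same subject: cached topic and session summary remain usable
    if continuity.startswith("New topic:"):
        new_topic = continuity.split(":", 1)[1].strip()  # 2-5 word topic name
        logger.debug(f"Switching to new topic: {new_topic}")
```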
  def _extract_pattern_evidence(self, user_input: str) -> str:
  """Extract pattern evidence for intent reasoning"""
 
  else:
  return "Complex, multi-faceted intent"

+ def _generate_alternative_paths(self, intent_result: dict, user_input: str, main_topic: str) -> list:
  """Generate alternative reasoning paths based on actual content"""
  primary_intent = intent_result.get('primary_intent', 'unknown')
  secondary_intents = intent_result.get('secondary_intents', [])

  alternative_paths = []
 
 
  "calibration_method": "Weighted average of step confidences"
  }
 
+ async def _extract_main_topic(self, user_input: str, context: dict = None) -> str:
+ """Extract the main topic using LLM zero-shot classification with caching"""
+ try:
+ # Check cache first
+ import hashlib
+ cache_key = hashlib.md5(user_input.encode()).hexdigest()
+ if cache_key in self._topic_cache:
+ logger.debug(f"Topic cache hit for: {user_input[:50]}...")
+ return self._topic_cache[cache_key]
+
+ # Use LLM for accurate topic extraction
+ if self.llm_router:
+ # Build context summary if available
+ context_info = ""
+ if context:
+ session_context = context.get('session_context', {})
+ session_summary = session_context.get('summary', '') if isinstance(session_context, dict) else ""
+ interaction_count = len(context.get('interaction_contexts', []))
+
+ if session_summary:
+ context_info = f"\n\nSession context: {session_summary[:200]}"
+ if interaction_count > 0:
+ context_info += f"\nPrevious interactions in session: {interaction_count}"
+
+ prompt = f"""Classify the main topic of this query in 2-5 words. Be specific and concise.
+
+ Query: "{user_input}"{context_info}
+
+ Respond with ONLY the topic name (e.g., "Machine Learning", "Healthcare Analytics", "Financial Modeling", "Software Development", "Educational Curriculum").
+
+ Do not include explanations, just the topic name. Maximum 5 words."""
+
+ topic_result = await self.llm_router.route_inference(
+ task_type="classification",
+ prompt=prompt,
+ max_tokens=20,
+ temperature=0.3  # Lower temperature for consistency
+ )
+
+ if topic_result and isinstance(topic_result, str) and topic_result.strip():
+ topic = topic_result.strip()
+ # Clean up any extra text (LLM might add explanations)
+ # Take first line and first 5 words max
+ topic = topic.split('\n')[0].strip()
+ words = topic.split()[:5]
+ topic = " ".join(words)
+
+ # Cache the result
+ if len(self._topic_cache) >= self._topic_cache_max_size:
+ # Remove oldest entry (simple FIFO)
+ oldest_key = next(iter(self._topic_cache))
+ del self._topic_cache[oldest_key]
+
+ self._topic_cache[cache_key] = topic
+ logger.debug(f"Topic extracted: {topic}")
+ return topic
+
+ # Fallback to simple extraction if LLM unavailable
+ words = user_input.split()[:4]
+ fallback_topic = " ".join(words) if words else "General inquiry"
+ logger.warning(f"Using fallback topic extraction: {fallback_topic}")
+ return fallback_topic
+
+ except Exception as e:
+ logger.error(f"Error in LLM-based topic extraction: {e}", exc_info=True)
+ # Fallback
  words = user_input.split()[:4]
  return " ".join(words) if words else "General inquiry"
 
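The eviction above is simple FIFO: `next(iter(self._topic_cache))` returns the oldest key because plain dicts preserve insertion order in Python 3.7+. A standalone sketch of the same policy (the `max_size` default here is an assumption; `_topic_cache_max_size` is initialized elsewhere in the file):

```python
from collections import OrderedDict

class TopicCache:
    """FIFO cache sketch mirroring _topic_cache; OrderedDict makes the
    eviction order explicit and would allow an LRU upgrade via move_to_end()."""

    def __init__(self, max_size: int = 100):  # assumed default size
        self._cache: "OrderedDict[str, str]" = OrderedDict()
        self._max_size = max_size

    def get(self, key: str):
        return self._cache.get(key)

    def put(self, key: str, value: str) -> None:
        if len(self._cache) >= self._max_size:
            self._cache.popitem(last=False)  # evict oldest entry (FIFO)
        self._cache[key] = value
```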
+ async def _extract_keywords(self, user_input: str) -> str:
+ """Extract key terms using LLM or simple extraction"""
+ try:
+ # Simple extraction for performance (keywords less critical than topic)
+ # Can be enhanced with LLM if needed
+ import re
+ # Extract meaningful words (3+ characters, not common stop words)
+ stop_words = {'the', 'and', 'for', 'are', 'but', 'not', 'you', 'all', 'can', 'her', 'was', 'one', 'our', 'out', 'day', 'get', 'has', 'him', 'his', 'how', 'its', 'may', 'new', 'now', 'old', 'see', 'two', 'way', 'who', 'boy', 'did', 'she', 'use', 'many', 'some', 'time', 'very', 'when', 'come', 'here', 'just', 'like', 'long', 'make', 'over', 'such', 'take', 'than', 'them', 'well', 'were'}
+
+ words = re.findall(r'\b[a-zA-Z]{3,}\b', user_input.lower())
+ keywords = [w for w in words if w not in stop_words][:5]
+
+ return ", ".join(keywords) if keywords else "General terms"
+
+ except Exception as e:
+ logger.error(f"Error in keyword extraction: {e}", exc_info=True)
+ return "General terms"
 
 
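A self-contained replica of the extraction logic for quick testing outside the orchestrator; the stop-word set is abbreviated here, so results can differ slightly from the full set above:

```python
import re

# Abbreviated stop-word set; the method above uses a much larger one.
STOP_WORDS = {"the", "and", "for", "how", "can", "you", "are"}

def extract_keywords(user_input: str) -> str:
    """Return up to five comma-separated keywords (3+ letters, non-stop-words)."""
    words = re.findall(r"\b[a-zA-Z]{3,}\b", user_input.lower())
    keywords = [w for w in words if w not in STOP_WORDS][:5]
    return ", ".join(keywords) if keywords else "General terms"

print(extract_keywords("How can I speed up cache invalidation?"))
# -> speed, cache, invalidation
```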

  def _assess_query_complexity(self, user_input: str) -> str:
  """Assess the complexity of the user query"""
 
  "complex_refinement": "add clarifying details to your existing question"
  })

+ # Topic extraction removed from error recovery to avoid async complexity
+ # Error recovery uses simplified context
+ topic = "Error recovery context"

  # Adaptive guidance based on input complexity
  if input_complexity["is_complex"]: