# Performance Metrics Implementation Summary

## ✅ Implementation Complete
## Problem Identified

Performance metrics were showing all zeros in Flask API responses because:

- `track_response_metrics()` was calculating metrics but not adding them to the response dictionary
- The Flask API expected `result.get('performance', {})`, but the orchestrator didn't include a `performance` key
- Token counting was approximate and potentially inaccurate
- Agent contributions weren't being tracked
## Solutions Implemented

### 1. Enhanced `track_response_metrics()` Method

**File:** `src/orchestrator_engine.py`

**Changes:**
- ✅ Now returns the response dictionary with performance metrics added
- ✅ Improved token counting with a more accurate estimation (`words * 1.3` or `chars / 4`)
- ✅ Extracts confidence scores from intent results
- ✅ Tracks agent contributions with percentage calculations
- ✅ Adds metrics to both the `performance` and `metadata` keys for backward compatibility
- ✅ Memory optimized with configurable history limits
**Key Features:**

- Calculates `processing_time` in milliseconds
- Estimates `tokens_used` more accurately (see the sketch after this list)
- Tracks the `agents_used` count
- Calculates `confidence_score` from intent recognition
- Builds the `agent_contributions` array with percentages
- Extracts `safety_score` from safety analysis
- Includes `latency_seconds` for debugging
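
A minimal sketch of the token estimation heuristic, assuming the larger of the two estimates is used (the summary does not say how the heuristics are combined):

```python
def estimate_tokens(text: str) -> int:
    """Hypothetical helper illustrating the heuristic above."""
    word_estimate = int(len(text.split()) * 1.3)  # ~1.3 tokens per word
    char_estimate = len(text) // 4                # ~4 characters per token
    # Assumption: take the larger estimate; the real method may differ.
    return max(word_estimate, char_estimate)
```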
### 2. Updated `process_request()` Method

**File:** `src/orchestrator_engine.py`

**Changes:**
- ✅ Captures the return value from `track_response_metrics()`
- ✅ Ensures the `performance` key exists even if tracking fails
- ✅ Provides a default metrics structure on error (sketched below)
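
A sketch of the capture-and-fallback flow; the `track_response_metrics(response, start_time)` signature and the default keys are assumptions for illustration:

```python
import logging

logger = logging.getLogger(__name__)

DEFAULT_METRICS = {
    'processing_time': 0.0,
    'tokens_used': 0,
    'agents_used': 0,
    'confidence_score': 0.0,
    'agent_contributions': [],
    'safety_score': 0.0,
}

def track_with_fallback(orchestrator, response: dict, start_time: float) -> dict:
    """Hypothetical wrapper showing the logic process_request() now applies."""
    try:
        # Signature assumed for illustration.
        response = orchestrator.track_response_metrics(response, start_time)
    except Exception as exc:
        logger.warning("Metric tracking failed: %s", exc)
    # Guarantee the key exists so the Flask API never sees it missing.
    response.setdefault('performance', dict(DEFAULT_METRICS))
    return response
```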
### 3. Enhanced Agent Tracking

**File:** `src/orchestrator_engine.py`

**Changes:**
- ✅ Added `agent_call_history` for tracking recent agent calls
- ✅ Memory optimized with a `max_agent_history` limit (50) — see the `deque` sketch below
- ✅ Tracks which agents were called in `process_request_parallel()`
- ✅ Returns `agents_called` in parallel processing results
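
One way to implement the bounded history is a `deque` with `maxlen`; the attribute names come from this summary, but the helper method is hypothetical:

```python
from collections import deque
from datetime import datetime

class OrchestratorEngine:
    def __init__(self, max_agent_history: int = 50):
        # deque(maxlen=...) drops the oldest entry automatically,
        # so the history can never grow past the limit.
        self.max_agent_history = max_agent_history
        self.agent_call_history = deque(maxlen=max_agent_history)

    def _record_agent_call(self, agent_name: str) -> None:
        self.agent_call_history.append({
            'agent': agent_name,
            'timestamp': datetime.now().isoformat(),
        })
```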
### 4. Improved Flask API Logging

**File:** `flask_api_standalone.py`

**Changes:**
- ✅ Enhanced logging for performance metrics with formatted output
- ✅ Falls back to extracting metrics from `metadata` if the `performance` key is missing (sketched after this list)
- ✅ Detailed debug logging when metrics are missing
- ✅ Logs all performance metrics, including agent contributions
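
A sketch of the fallback extraction, assuming the metadata key is `performance_metrics` as described under Backward Compatibility below:

```python
import logging

def extract_performance(result: dict) -> dict:
    """Hypothetical helper mirroring the route's fallback logic."""
    # Prefer the top-level key; fall back to the metadata copy.
    performance = (result.get('performance')
                   or result.get('metadata', {}).get('performance_metrics', {}))
    if not performance:
        logging.debug("No performance metrics found; result keys: %s",
                      list(result.keys()))
    return performance
```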
### 5. Added Safety Result to Metadata

**File:** `src/orchestrator_engine.py`

**Changes:**
- ✅ Added `safety_result` to the metadata passed to `_format_final_output()`
- ✅ Ensures safety metrics can be properly extracted
### 6. Added Performance Summary Method

**File:** `src/orchestrator_engine.py`

**New Method:** `get_performance_summary()`
- Returns a summary of recent performance metrics
- Useful for monitoring and debugging
- Includes averages and recent history (a possible shape is sketched below)
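
The document doesn't show the method's return shape, so the following is only a plausible sketch, assuming `response_metrics_history` holds per-request metric dicts:

```python
def get_performance_summary(self) -> dict:
    """Sketch only: averages plus a slice of recent history."""
    history = list(self.response_metrics_history)
    if not history:
        return {'count': 0, 'averages': {}, 'recent': []}
    return {
        'count': len(history),
        'averages': {
            'processing_time_ms': sum(m.get('processing_time', 0.0)
                                      for m in history) / len(history),
            'tokens_used': sum(m.get('tokens_used', 0)
                               for m in history) / len(history),
        },
        'recent': history[-10:],  # last few entries for quick inspection
    }
```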
## Expected Response Format

After implementation, the Flask API will return:

```json
{
  "success": true,
  "message": "AI response text",
  "history": [...],
  "reasoning": {...},
  "performance": {
    "processing_time": 1230.5,      // milliseconds
    "tokens_used": 456,
    "agents_used": 4,
    "confidence_score": 85.2,       // percentage
    "agent_contributions": [
      {"agent": "Intent", "percentage": 25.0},
      {"agent": "Synthesis", "percentage": 40.0},
      {"agent": "Safety", "percentage": 15.0},
      {"agent": "Skills", "percentage": 20.0}
    ],
    "safety_score": 85.0,           // percentage
    "latency_seconds": 1.230,
    "timestamp": "2024-01-15T10:30:45.123456"
  }
}
```
## Memory Optimization

**Implemented:**

- ✅ `agent_call_history` limited to 50 entries
- ✅ `response_metrics_history` limited to 100 entries (configurable)
- ✅ Automatic cleanup of old history entries
- ✅ Efficient data structures for tracking
## Backward Compatibility

**Maintained:**

- ✅ Metrics available in both the `performance` key and `metadata.performance_metrics`
- ✅ All existing code continues to work
- ✅ Default metrics provided on error
- ✅ Graceful fallback if tracking fails
## Testing

To verify the implementation:

1. Start the Flask API:

   ```bash
   python flask_api_standalone.py
   ```

2. Send a test request:

   ```python
   import requests

   response = requests.post("http://localhost:5000/api/chat", json={
       "message": "What is machine learning?",
       "session_id": "test-session",
       "user_id": "test-user"
   })
   data = response.json()
   print("Performance Metrics:", data.get('performance', {}))
   ```
3. Check the logs. The Flask API now logs detailed performance metrics:

   ```
   ============================================================
   PERFORMANCE METRICS
   ============================================================
   Processing Time: 1230.5ms
   Tokens Used: 456
   Agents Used: 4
   Confidence Score: 85.2%
   Agent Contributions:
     - Intent: 25.0%
     - Synthesis: 40.0%
     - Safety: 15.0%
     - Skills: 20.0%
   Safety Score: 85.0%
   ============================================================
   ```
## Files Modified

✅ `src/orchestrator_engine.py`

- Enhanced `track_response_metrics()` method
- Updated `process_request()` method
- Enhanced `process_request_parallel()` method
- Added `get_performance_summary()` method
- Added memory optimization for tracking
- Added `safety_result` to metadata

✅ `flask_api_standalone.py`

- Enhanced logging for performance metrics
- Added fallback extraction from metadata
- Improved error handling
## Next Steps

- ✅ Implementation complete
- ⏳ Test with actual API calls
- ⏳ Monitor performance metrics in production
- ⏳ Adjust agent contribution percentages if needed
- ⏳ Fine-tune token counting accuracy if needed
## Notes

- Token counting uses estimation (`words * 1.3` or `chars / 4`); consider using an actual tokenizer in production if exact counts are needed
- Agent contributions are calculated from agent importance (Synthesis > Intent > Others); a normalization sketch follows this list
- Percentages are normalized to sum to 100%
- All metrics include timestamps for tracking
- Memory usage is optimized with configurable limits
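
As referenced above, a minimal sketch of the normalization step; the importance weights are illustrative only (the summary states just the ordering Synthesis > Intent > Others):

```python
def normalize_contributions(weights: dict) -> list:
    """Scale raw importance weights so the percentages sum to 100."""
    total = sum(weights.values())
    return [
        {'agent': name, 'percentage': round(100.0 * w / total, 1)}
        for name, w in weights.items()
    ]

# Example (illustrative weights that reproduce the response above):
# normalize_contributions({'Synthesis': 4.0, 'Intent': 2.5,
#                          'Safety': 1.5, 'Skills': 2.0})
# -> percentages 40.0, 25.0, 15.0, 20.0
```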