Research_AI_Assistant / LOGGING_GUIDE.md
JatsTheAIGen's picture
workflow errors debugging V3
ae20ff2
|
raw
history blame
6.44 kB

πŸ“‹ Logging Guide

Overview

The application now includes comprehensive logging throughout all components for debugging and monitoring in container environments.

Log Levels

  • INFO: General application flow and important events
  • WARNING: Non-critical issues (e.g., missing tokens, fallback modes)
  • ERROR: Errors that are caught and handled
  • DEBUG: Detailed debugging information

Log Output Locations

  1. Console/Stdout - Real-time logs visible in container output
  2. File - app.log in the working directory

What Gets Logged

App Initialization (app.py)

# Component import attempts
logger.info("Attempting to import orchestration components...")
logger.info("βœ“ Successfully imported orchestration components")

# System initialization
logger.info("INITIALIZING ORCHESTRATION SYSTEM")
logger.info("Step 1/6: Initializing LLM Router...")
logger.info("βœ“ LLM Router initialized")
logger.info("Step 2/6: Initializing Agents...")
logger.info("Step 3/6: Initializing Context Manager...")
logger.info("Step 4/6: Initializing Orchestrator...")
logger.info("ORCHESTRATION SYSTEM READY")

Request Processing (app.py)

logger.info(f"Processing message: {message[:100]}")
logger.info(f"Session ID: {session_id}")
logger.info("Attempting full orchestration...")
logger.info("Orchestrator returned response: {response[:100]}")
logger.info("Message processing complete")

Orchestrator (orchestrator_engine.py)

logger.info(f"Processing request for session {session_id}")
logger.info(f"User input: {user_input[:100]}")
logger.info(f"Generated interaction ID: {interaction_id}")
logger.info("Step 2: Managing context...")
logger.info(f"Context retrieved: {len(context.get('interactions', []))} interactions")
logger.info("Step 3: Recognizing intent...")
logger.info(f"Intent detected: {intent_result.get('primary_intent', 'unknown')}")
logger.info("Step 4: Creating execution plan...")
logger.info("Step 5: Executing agents...")
logger.info(f"Agent execution complete: {len(agent_results)} results")
logger.info("Step 6: Synthesizing response...")
logger.info("Step 7: Safety check...")
logger.info(f"Request processing complete. Response length: ...")

Context Manager (context_manager.py)

logger.info(f"Initializing ContextManager with DB path: {db_path}")
logger.info("Initializing database...")
logger.info("βœ“ Sessions table ready")
logger.info("βœ“ Interactions table ready")
logger.info("Database initialization complete")

LLM Router (llm_router.py)

logger.info("LLMRouter initialized")
logger.info("HF token available")
logger.info(f"Routing inference for task: {task_type}")
logger.info(f"Selected model: {model_config['model_id']}")
logger.info(f"Calling HF API for model: {model_id}")
logger.info(f"HF API returned response (length: ...)")
logger.info(f"Inference complete for {task_type}")

Viewing Logs

In Development

# View real-time logs
python app.py

# View log file
tail -f app.log

# Search logs
grep "ERROR" app.log
grep "Processing request" app.log

In Container/Docker

# View container logs
docker logs <container_id>

# Follow logs in real-time
docker logs -f <container_id>

# View recent logs
docker logs --tail 100 <container_id>

In Kubernetes

# View pod logs
kubectl logs <pod-name>

# Follow logs
kubectl logs -f <pod-name>

# View logs from specific container
kubectl logs <pod-name> -c <container-name>

Debugging with Logs

Example Log Flow for a Request

2024-01-15 10:30:00 - app - INFO - Processing message: Hello, I need help with...
2024-01-15 10:30:00 - app - INFO - Session ID: abc12345
2024-01-15 10:30:00 - app - INFO - Attempting full orchestration...
2024-01-15 10:30:00 - orchestrator_engine - INFO - Processing request for session abc12345
2024-01-15 10:30:00 - orchestrator_engine - INFO - User input: Hello, I need help with...
2024-01-15 10:30:00 - orchestrator_engine - INFO - Generated interaction ID: ...
2024-01-15 10:30:01 - orchestrator_engine - INFO - Step 2: Managing context...
2024-01-15 10:30:01 - context_manager - INFO - Retrieving context for session abc12345
2024-01-15 10:30:01 - orchestrator_engine - INFO - Context retrieved: 5 interactions
2024-01-15 10:30:01 - orchestrator_engine - INFO - Step 3: Recognizing intent...
2024-01-15 10:30:02 - intent_agent - INFO - INTENT_REC_001 processing user input: Hello...
2024-01-15 10:30:02 - orchestrator_engine - INFO - Intent detected: information_request
...

Common Error Patterns

Missing Dependencies

ERROR - Could not import orchestration components: No module named 'src.agents'

Solution: Check imports and PYTHONPATH

Database Issues

ERROR - Database initialization error: unable to open database file

Solution: Check file permissions and disk space

API Errors

ERROR - HF API error: 429 - Rate limit exceeded

Solution: Check rate limits and API token

Orchestrator Errors

ERROR - Orchestrator error: 'NoneType' object has no attribute 'execute'

Solution: Check agent initialization

Log Management

Log Rotation

For production, consider adding log rotation:

from logging.handlers import RotatingFileHandler

handler = RotatingFileHandler(
    'app.log', 
    maxBytes=10*1024*1024,  # 10MB
    backupCount=5
)

Structured Logging

For JSON logging (useful in containers):

import structlog

logger = structlog.get_logger()
logger.info("event", session_id=session_id, status="success")

Performance Considerations

  • Logging to file has minimal overhead
  • Console logging is fast in containers
  • DEBUG level can be verbose - use only when needed
  • Consider async logging for high-throughput scenarios

Best Practices

  1. βœ… Log entry and exit of critical functions
  2. βœ… Include context (session_id, request_id, etc.)
  3. βœ… Log errors with full stack traces (exc_info=True)
  4. βœ… Use appropriate log levels
  5. ❌ Don't log sensitive data (tokens, passwords)
  6. ❌ Don't over-log in hot paths

Quick Commands

# Find all errors
grep "ERROR" app.log

# Count messages processed
grep "Message processing complete" app.log | wc -l

# Find slow requests
grep -E "processing|complete" app.log | grep -E "[0-9]{4,}"

# Watch real-time
tail -f app.log | grep -E "INFO|ERROR|WARNING"