Spaces:

JatinAutonomousLabs
/

Research_AI_Assistant

Sleeping

App Files Files Community

Research_AI_Assistant / LOGGING_GUIDE.md

JatsTheAIGen

workflow errors debugging V3

ae20ff2 about 2 months ago

preview code

raw

history blame

6.44 kB

	# 📋 Logging Guide

	## Overview

	The application now includes comprehensive logging throughout all components for debugging and monitoring in container environments.

	## Log Levels

	- INFO: General application flow and important events
	- WARNING: Non-critical issues (e.g., missing tokens, fallback modes)
	- ERROR: Errors that are caught and handled
	- DEBUG: Detailed debugging information

	## Log Output Locations

	1. Console/Stdout - Real-time logs visible in container output
	2. File - `app.log` in the working directory

	## What Gets Logged

	### App Initialization (`app.py`)

	```python
	# Component import attempts
	logger.info("Attempting to import orchestration components...")
	logger.info("✓ Successfully imported orchestration components")

	# System initialization
	logger.info("INITIALIZING ORCHESTRATION SYSTEM")
	logger.info("Step 1/6: Initializing LLM Router...")
	logger.info("✓ LLM Router initialized")
	logger.info("Step 2/6: Initializing Agents...")
	logger.info("Step 3/6: Initializing Context Manager...")
	logger.info("Step 4/6: Initializing Orchestrator...")
	logger.info("ORCHESTRATION SYSTEM READY")
	```

	### Request Processing (`app.py`)

	```python
	logger.info(f"Processing message: {message[:100]}")
	logger.info(f"Session ID: {session_id}")
	logger.info("Attempting full orchestration...")
	logger.info("Orchestrator returned response: {response[:100]}")
	logger.info("Message processing complete")
	```

	### Orchestrator (`orchestrator_engine.py`)

	```python
	logger.info(f"Processing request for session {session_id}")
	logger.info(f"User input: {user_input[:100]}")
	logger.info(f"Generated interaction ID: {interaction_id}")
	logger.info("Step 2: Managing context...")
	logger.info(f"Context retrieved: {len(context.get('interactions', []))} interactions")
	logger.info("Step 3: Recognizing intent...")
	logger.info(f"Intent detected: {intent_result.get('primary_intent', 'unknown')}")
	logger.info("Step 4: Creating execution plan...")
	logger.info("Step 5: Executing agents...")
	logger.info(f"Agent execution complete: {len(agent_results)} results")
	logger.info("Step 6: Synthesizing response...")
	logger.info("Step 7: Safety check...")
	logger.info(f"Request processing complete. Response length: ...")
	```

	### Context Manager (`context_manager.py`)

	```python
	logger.info(f"Initializing ContextManager with DB path: {db_path}")
	logger.info("Initializing database...")
	logger.info("✓ Sessions table ready")
	logger.info("✓ Interactions table ready")
	logger.info("Database initialization complete")
	```

	### LLM Router (`llm_router.py`)

	```python
	logger.info("LLMRouter initialized")
	logger.info("HF token available")
	logger.info(f"Routing inference for task: {task_type}")
	logger.info(f"Selected model: {model_config['model_id']}")
	logger.info(f"Calling HF API for model: {model_id}")
	logger.info(f"HF API returned response (length: ...)")
	logger.info(f"Inference complete for {task_type}")
	```

	## Viewing Logs

	### In Development

	```bash
	# View real-time logs
	python app.py

	# View log file
	tail -f app.log

	# Search logs
	grep "ERROR" app.log
	grep "Processing request" app.log
	```

	### In Container/Docker

	```bash
	# View container logs
	docker logs <container_id>

	# Follow logs in real-time
	docker logs -f <container_id>

	# View recent logs
	docker logs --tail 100 <container_id>
	```

	### In Kubernetes

	```bash
	# View pod logs
	kubectl logs <pod-name>

	# Follow logs
	kubectl logs -f <pod-name>

	# View logs from specific container
	kubectl logs <pod-name> -c <container-name>
	```

	## Debugging with Logs

	### Example Log Flow for a Request

	```
	2024-01-15 10:30:00 - app - INFO - Processing message: Hello, I need help with...
	2024-01-15 10:30:00 - app - INFO - Session ID: abc12345
	2024-01-15 10:30:00 - app - INFO - Attempting full orchestration...
	2024-01-15 10:30:00 - orchestrator_engine - INFO - Processing request for session abc12345
	2024-01-15 10:30:00 - orchestrator_engine - INFO - User input: Hello, I need help with...
	2024-01-15 10:30:00 - orchestrator_engine - INFO - Generated interaction ID: ...
	2024-01-15 10:30:01 - orchestrator_engine - INFO - Step 2: Managing context...
	2024-01-15 10:30:01 - context_manager - INFO - Retrieving context for session abc12345
	2024-01-15 10:30:01 - orchestrator_engine - INFO - Context retrieved: 5 interactions
	2024-01-15 10:30:01 - orchestrator_engine - INFO - Step 3: Recognizing intent...
	2024-01-15 10:30:02 - intent_agent - INFO - INTENT_REC_001 processing user input: Hello...
	2024-01-15 10:30:02 - orchestrator_engine - INFO - Intent detected: information_request
	...
	```

	## Common Error Patterns

	### Missing Dependencies
	```
	ERROR - Could not import orchestration components: No module named 'src.agents'
	```
	Solution: Check imports and PYTHONPATH

	### Database Issues
	```
	ERROR - Database initialization error: unable to open database file
	```
	Solution: Check file permissions and disk space

	### API Errors
	```
	ERROR - HF API error: 429 - Rate limit exceeded
	```
	Solution: Check rate limits and API token

	### Orchestrator Errors
	```
	ERROR - Orchestrator error: 'NoneType' object has no attribute 'execute'
	```
	Solution: Check agent initialization

	## Log Management

	### Log Rotation

	For production, consider adding log rotation:

	```python
	from logging.handlers import RotatingFileHandler

	handler = RotatingFileHandler(
	'app.log',
	maxBytes=1010241024, # 10MB
	backupCount=5
	)
	```

	### Structured Logging

	For JSON logging (useful in containers):

	```python
	import structlog

	logger = structlog.get_logger()
	logger.info("event", session_id=session_id, status="success")
	```

	## Performance Considerations

	- Logging to file has minimal overhead
	- Console logging is fast in containers
	- DEBUG level can be verbose - use only when needed
	- Consider async logging for high-throughput scenarios

	## Best Practices

	1. ✅ Log entry and exit of critical functions
	2. ✅ Include context (session_id, request_id, etc.)
	3. ✅ Log errors with full stack traces (`exc_info=True`)
	4. ✅ Use appropriate log levels
	5. ❌ Don't log sensitive data (tokens, passwords)
	6. ❌ Don't over-log in hot paths

	## Quick Commands

	```bash
	# Find all errors
	grep "ERROR" app.log

	# Count messages processed
	grep "Message processing complete" app.log \| wc -l

	# Find slow requests
	grep -E "processing\|complete" app.log \| grep -E "[0-9]{4,}"

	# Watch real-time
	tail -f app.log \| grep -E "INFO\|ERROR\|WARNING"
	```