| # Hugging Face Spaces Deployment Guide - HonestAI | |
| ## Deployment to HF Spaces | |
| This guide covers deploying the updated HonestAI application to [Hugging Face Spaces](https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI). | |
| ## Pre-Deployment Checklist | |
| ### Required Files | |
| - [x] `Dockerfile` - Container configuration | |
| - [x] `requirements.txt` - Python dependencies | |
| - [x] `flask_api_standalone.py` - Main application entry point | |
| - [x] `README.md` - Updated with HonestAI Space URL | |
| - [x] `src/` - All source code | |
| - [x] `.env.example` - Environment variable template | |
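The checklist above can be checked mechanically before pushing. The following is a minimal sketch, assuming the listed paths sit at the repository root; `missing_files` and `REQUIRED` are hypothetical names for illustration and are not part of the HonestAI code base.

```python
from pathlib import Path

# Paths taken from the checklist above; adjust if the repository layout differs.
REQUIRED = [
    "Dockerfile",
    "requirements.txt",
    "flask_api_standalone.py",
    "README.md",
    "src",
    ".env.example",
]


def missing_files(root, required=REQUIRED):
    """Return the required paths that do not exist under `root`."""
    base = Path(root)
    return [name for name in required if not (base / name).exists()]
```

Calling `missing_files(".")` from the repository root should return an empty list when everything on the checklist is present.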
| ### Recent Updates Included | |
| - [x] Enhanced configuration management (`src/config.py`) | |
| - [x] Performance metrics tracking (`src/orchestrator_engine.py`) | |
| - [x] Updated model configurations (Llama 3.1 8B, e5-base-v2, Qwen 2.5 1.5B) | |
| - [x] 4-bit quantization support | |
| - [x] Cache directory management | |
| - [x] Memory optimizations | |
| ## Deployment Steps | |
| ### 1. Verify Space Configuration | |
| **Space URL**: https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI | |
| **Space Settings**: | |
| - **SDK**: Docker | |
| - **Hardware**: T4 GPU (16GB) | |
| - **Visibility**: Public | |
| - **Storage**: Persistent (for cache) | |
| ### 2. Set Environment Variables | |
| In Space Settings → Repository secrets, ensure the following are set: | |
| - `HF_TOKEN` - Your Hugging Face API token (required) | |
| - `MAX_WORKERS` - Optional (default: 4) | |
| - `LOG_LEVEL` - Optional (default: INFO) | |
| - `HF_HOME` - Optional (auto-configured) | |
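To illustrate how these variables and their defaults fit together, here is a hedged sketch of a settings loader. The actual logic lives in `src/config.py` and may differ; `load_settings` and the returned dictionary keys are assumptions made for this example.

```python
import os


def load_settings(env=None):
    """Read deployment settings, applying the defaults listed above."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN")
    if not token:
        # HF_TOKEN is the only required variable in the list above.
        raise RuntimeError("HF_TOKEN is not set; add it in the Space settings")
    return {
        "hf_token": token,
        "max_workers": int(env.get("MAX_WORKERS", "4")),
        "log_level": env.get("LOG_LEVEL", "INFO").upper(),
        # Left as None when unset so the platform default applies.
        "hf_home": env.get("HF_HOME"),
    }
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise locally without touching the real environment.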
| ### 3. Verify Dockerfile | |
| The `Dockerfile` is configured for: | |
| - Python 3.10 | |
| - Port 7860 (HF Spaces standard) | |
| - Health check endpoint | |
| - Flask API as entry point | |
| ### 4. Commit and Push Updates | |
| ```bash | |
| # Ensure all changes are committed | |
| git add . | |
| git commit -m "Update: Performance metrics, enhanced config, model optimizations" | |
| # Push to HF Spaces repository | |
| git push origin main | |
| ``` | |
| ### 5. Monitor Build | |
| - **Build Time**: 5-10 minutes (first build may take longer) | |
| - **Watch Logs**: Check Space logs for build progress | |
| - **Health Check**: `/api/health` endpoint should respond after build | |
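Rather than refreshing the logs by hand, the health endpoint can be polled until the build finishes. This is a sketch under assumptions: the host below is the same one used for the health check in the testing section, and `http_probe` and `wait_until_healthy` are hypothetical helper names. The default of 40 attempts at 15-second intervals covers roughly the 10-minute upper bound quoted above.

```python
import time
import urllib.request

# Assumed public host for the Space (same host as the curl health check).
HEALTH_URL = "https://jatinautonomouslabs-honestai.hf.space/api/health"


def http_probe(url=HEALTH_URL, timeout=10):
    """Return True if the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts all derive from OSError.
        return False


def wait_until_healthy(probe=http_probe, attempts=40, delay=15, sleep=time.sleep):
    """Call `probe` until it succeeds or `attempts` run out; return the outcome."""
    for attempt in range(1, attempts + 1):
        if probe():
            return True
        if attempt < attempts:
            sleep(delay)
    return False
```

The probe and the sleep function are injectable so the retry loop can be tested without network access or real waiting.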
| ## What's New in This Deployment | |
| ### 1. Performance Metrics | |
| Every API response now includes comprehensive performance data: | |
| ```json | |
| { | |
|   "performance": { | |
|     "processing_time": 1230.5, | |
|     "tokens_used": 456, | |
|     "agents_used": 4, | |
|     "confidence_score": 85.2, | |
|     "agent_contributions": [...], | |
|     "safety_score": 85.0 | |
|   } | |
| } | |
| ``` | |
| ### 2. Enhanced Configuration | |
| - Automatic cache directory management | |
| - Secure environment variable handling | |
| - Backward compatible settings | |
| - Validation and error handling | |
| ### 3. Model Optimizations | |
| - **Llama 3.1 8B** with 4-bit quantization (primary) | |
| - **e5-base-v2** for embeddings (768 dimensions) | |
| - **Qwen 2.5 1.5B** for fast classification | |
| - Model preloading for faster responses | |
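A rough back-of-the-envelope calculation shows why 4-bit quantization matters on a 16GB T4. The sketch below counts weight storage only and treats "8B" as exactly 8e9 parameters, which is an approximation; activations, the KV cache, and quantization metadata add to the real footprint. `weight_gib` is a hypothetical helper for this illustration.

```python
def weight_gib(num_params, bits_per_param):
    """Approximate size of the model weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


# "8B" is taken as 8e9 parameters; the exact count differs slightly.
fp16 = weight_gib(8e9, 16)
int4 = weight_gib(8e9, 4)
print(f"fp16: {fp16:.1f} GiB, 4-bit: {int4:.1f} GiB")
```

Under these assumptions the 16-bit weights come to about 14.9 GiB, which would nearly fill the card on their own, while the 4-bit weights come to about 3.7 GiB.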
| ### 4. Memory Management | |
| - Optimized history tracking (limited to 50-100 entries) | |
| - Efficient agent call tracking | |
| - Memory-aware caching | |
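One common way to cap history at a fixed number of entries is a bounded deque. The sketch below illustrates the idea only; it is not taken from the HonestAI source, and `BoundedHistory` and its default limit of 100 are assumptions.

```python
from collections import deque


class BoundedHistory:
    """Keep only the most recent `max_entries` items."""

    def __init__(self, max_entries=100):
        # A deque with maxlen drops the oldest item automatically on append.
        self._items = deque(maxlen=max_entries)

    def add(self, entry):
        self._items.append(entry)

    def recent(self):
        """Return the retained entries, oldest first."""
        return list(self._items)

    def __len__(self):
        return len(self._items)
```

Because eviction happens inside `append`, memory use stays bounded without any separate cleanup pass.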
| ## Testing After Deployment | |
| ### 1. Health Check | |
| ```bash | |
| curl https://jatinautonomouslabs-honestai.hf.space/api/health | |
| ``` | |
| ### 2. Test API Endpoint | |
| ```python | |
| import requests | |
| # Call the app host (*.hf.space), not the Space page on huggingface.co | |
| response = requests.post( | |
|     "https://jatinautonomouslabs-honestai.hf.space/api/chat", | |
|     json={ | |
|         "message": "Hello, what is machine learning?", | |
|         "session_id": "test-session", | |
|         "user_id": "test-user", | |
|     }, | |
|     timeout=120,  # generous limit; the first request may wait on model loading | |
| ) | |
| response.raise_for_status()  # fail loudly on HTTP errors | |
| data = response.json() | |
| print(f"Response: {data['message']}") | |
| print(f"Performance: {data.get('performance', {})}") | |
| ``` | |
| ### 3. Verify Performance Metrics | |
| Check that performance metrics are populated (not all zeros): | |
| - `processing_time` > 0 | |
| - `tokens_used` > 0 | |
| - `agents_used` > 0 | |
| - `agent_contributions` not empty | |
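The checks above can be automated against a parsed response. This is a minimal sketch that assumes the `performance` object has the field names shown in the sample JSON earlier; `metric_problems` is a hypothetical helper, not part of the API.

```python
def metric_problems(performance):
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = []
    for key in ("processing_time", "tokens_used", "agents_used"):
        value = performance.get(key, 0)
        if not isinstance(value, (int, float)) or value <= 0:
            problems.append(f"{key} should be > 0, got {value!r}")
    if not performance.get("agent_contributions"):
        problems.append("agent_contributions is empty or missing")
    return problems
```

With the `data` dictionary from the API test, `metric_problems(data.get("performance", {}))` should return an empty list on a healthy deployment.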
| ## Troubleshooting | |
| ### Build Fails | |
| - Check `requirements.txt` for conflicts | |
| - Verify Python version (3.10) | |
| - Check Dockerfile syntax | |
| ### Runtime Errors | |
| - Verify `HF_TOKEN` is set in Space secrets | |
| - Check logs for permission errors | |
| - Verify cache directory is writable | |
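A quick way to confirm the cache directory is writable is to try creating a temporary file in it. The sketch below assumes the cache lives at `HF_HOME`, falling back to `~/.cache/huggingface` when that variable is unset; `cache_dir_writable` is a hypothetical helper, and the application may resolve its cache path differently.

```python
import os
import tempfile
from pathlib import Path


def cache_dir_writable(path=None):
    """Return True if a temporary file can be created in the cache directory."""
    # Falls back to the Hugging Face default location when HF_HOME is unset.
    target = Path(path or os.environ.get("HF_HOME") or Path.home() / ".cache" / "huggingface")
    try:
        target.mkdir(parents=True, exist_ok=True)
        with tempfile.NamedTemporaryFile(dir=target):
            pass
        return True
    except OSError:
        # Covers permission errors, read-only filesystems, and invalid paths.
        return False
```

Running this inside the container, for example from a startup hook, surfaces permission problems before any model download is attempted.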
| ### Performance Issues | |
| - Check GPU memory usage | |
| - Monitor model loading times | |
| - Verify quantization is enabled | |
| ### API Not Responding | |
| - Check health endpoint: `/api/health` | |
| - Verify Flask app is running on port 7860 | |
| - Check Space logs for errors | |
| ## Post-Deployment | |
| ### 1. Update Documentation | |
| - ✅ README.md updated with HonestAI URL | |
| - ✅ HF_SPACES_URL_GUIDE.md updated | |
| - ✅ API_DOCUMENTATION.md includes performance metrics | |
| ### 2. Monitor Metrics | |
| - Track response times | |
| - Monitor error rates | |
| - Check performance metrics accuracy | |
| ### 3. User Communication | |
| - Announce new features (performance metrics) | |
| - Update API documentation | |
| - Share new Space URL | |
| ## Quick Links | |
| - **Space**: https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI | |
| - **API Documentation**: See `API_DOCUMENTATION.md` | |
| - **Configuration Guide**: See `.env.example` | |
| - **Performance Metrics**: See `PERFORMANCE_METRICS_IMPLEMENTATION.md` | |
| ## Success Criteria | |
| After deployment, verify: | |
| 1. ✅ Space builds successfully | |
| 2. ✅ Health endpoint responds | |
| 3. ✅ API chat endpoint works | |
| 4. ✅ Performance metrics are populated | |
| 5. ✅ Models load with 4-bit quantization | |
| 6. ✅ Cache directory is configured | |
| 7. ✅ Logs show no critical errors | |
| --- | |
| **Last Updated**: January 2024 | |
| **Space**: JatinAutonomousLabs/HonestAI | |
| **Status**: Ready for Deployment ✅ | |