# Hugging Face Spaces Deployment Guide - HonestAI

## Deployment to HF Spaces
This guide covers deploying the updated HonestAI application to Hugging Face Spaces.
## Pre-Deployment Checklist

### Required Files
- `Dockerfile` - Container configuration
- `requirements.txt` - Python dependencies
- `flask_api_standalone.py` - Main application entry point
- `README.md` - Updated with the HonestAI Space URL
- `src/` - All source code
- `.env.example` - Environment variable template
### Recent Updates Included

- Enhanced configuration management (`src/config.py`)
- Performance metrics tracking (`src/orchestrator_engine.py`)
- Updated model configurations (Llama 3.1 8B, e5-base-v2, Qwen 2.5 1.5B)
- 4-bit quantization support
- Cache directory management
- Memory optimizations
## Deployment Steps

### 1. Verify Space Configuration
Space URL: https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI
Space Settings:
- SDK: Docker
- Hardware: T4 GPU (16GB)
- Visibility: Public
- Storage: Persistent (for cache)
### 2. Set Environment Variables

In Space Settings → Repository secrets, ensure:

- `HF_TOKEN` - Your Hugging Face API token (required)
- `MAX_WORKERS` - Optional (default: 4)
- `LOG_LEVEL` - Optional (default: INFO)
- `HF_HOME` - Optional (auto-configured)
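Inside the application, these variables can be read with the documented defaults as fallbacks. A minimal sketch (the function name and the fallback cache path are assumptions, not the actual `src/config.py` code):

```python
import os

def load_settings(env=os.environ):
    """Read HonestAI settings from the environment with documented defaults."""
    token = env.get("HF_TOKEN")
    if not token:
        # Required secret: fail fast with a clear message.
        raise RuntimeError("HF_TOKEN must be set in the Space's repository secrets")
    return {
        "hf_token": token,
        "max_workers": int(env.get("MAX_WORKERS", "4")),      # default: 4
        "log_level": env.get("LOG_LEVEL", "INFO"),            # default: INFO
        "hf_home": env.get("HF_HOME", "/data/.huggingface"),  # assumed fallback path
    }
```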
### 3. Verify Dockerfile
The Dockerfile is configured for:
- Python 3.10
- Port 7860 (HF Spaces standard)
- Health check endpoint
- Flask API as entry point
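A Dockerfile covering those four points might look like the following sketch (the file layout and the exact `HEALTHCHECK` command are assumptions; the project's actual Dockerfile may differ):

```dockerfile
FROM python:3.10-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

# HF Spaces routes traffic to port 7860.
EXPOSE 7860

# Probe the health endpoint with the stdlib so no extra tools are needed.
HEALTHCHECK CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:7860/api/health')"

CMD ["python", "flask_api_standalone.py"]
```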
### 4. Commit and Push Updates

```bash
# Ensure all changes are committed
git add .
git commit -m "Update: Performance metrics, enhanced config, model optimizations"

# Push to the HF Spaces repository
git push origin main
```
### 5. Monitor Build

- Build time: 5-10 minutes (the first build may take longer)
- Watch logs: check the Space logs for build progress
- Health check: the `/api/health` endpoint should respond once the build completes
## What's New in This Deployment

### 1. Performance Metrics
Every API response now includes comprehensive performance data:

```json
{
  "performance": {
    "processing_time": 1230.5,
    "tokens_used": 456,
    "agents_used": 4,
    "confidence_score": 85.2,
    "agent_contributions": [...],
    "safety_score": 85.0
  }
}
```
### 2. Enhanced Configuration
- Automatic cache directory management
- Secure environment variable handling
- Backward compatible settings
- Validation and error handling
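Cache directory management can be as simple as pointing `HF_HOME` at a writable path and creating it up front. A minimal sketch (the function name and the `/data/.huggingface` default are assumptions; `/data` presumes persistent storage is enabled on the Space):

```python
import os
from pathlib import Path

def ensure_cache_dir(default="/data/.huggingface"):
    """Point the Hugging Face cache at a writable directory, creating it if needed."""
    cache = Path(os.environ.get("HF_HOME", default))
    cache.mkdir(parents=True, exist_ok=True)   # idempotent: safe on every startup
    os.environ["HF_HOME"] = str(cache)         # downstream libraries pick this up
    return cache
```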
### 3. Model Optimizations
- Llama 3.1 8B with 4-bit quantization (primary)
- e5-base-v2 for embeddings (768 dimensions)
- Qwen 2.5 1.5B for fast classification
- Model preloading for faster responses
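The model lineup above can be captured in one configuration table that the loader iterates over at startup. A sketch (the dictionary shape and key names are illustrative assumptions, not the project's actual registry):

```python
# Hypothetical model registry mirroring the lineup above.
MODELS = {
    "primary": {
        "name": "Llama 3.1 8B",
        "load_in_4bit": True,   # 4-bit quantization to fit the T4's 16GB
        "preload": True,        # load at startup for faster first response
    },
    "embeddings": {
        "name": "e5-base-v2",
        "dimensions": 768,      # embedding vector size
    },
    "classifier": {
        "name": "Qwen 2.5 1.5B",  # small model for fast classification
        "preload": True,
    },
}
```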
### 4. Memory Management
- Optimized history tracking (limited to 50-100 entries)
- Efficient agent call tracking
- Memory-aware caching
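Bounded history tracking can be implemented with a fixed-length `collections.deque`, which discards the oldest entries automatically. A minimal sketch (the class name is an assumption; the 100-entry cap follows the range noted above):

```python
from collections import deque

class HistoryTracker:
    """Keeps only the most recent entries to bound memory use."""

    def __init__(self, max_entries=100):
        self._entries = deque(maxlen=max_entries)

    def add(self, entry):
        self._entries.append(entry)  # oldest entry drops off when full

    def recent(self, n=10):
        return list(self._entries)[-n:]
```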
## Testing After Deployment

### 1. Health Check
```bash
curl https://jatinautonomouslabs-honestai.hf.space/api/health
```
### 2. Test API Endpoint

```python
import requests

response = requests.post(
    "https://jatinautonomouslabs-honestai.hf.space/api/chat",
    json={
        "message": "Hello, what is machine learning?",
        "session_id": "test-session",
        "user_id": "test-user"
    }
)
data = response.json()
print(f"Response: {data['message']}")
print(f"Performance: {data.get('performance', {})}")
```
### 3. Verify Performance Metrics

Check that the performance metrics are populated (not all zeros):

- `processing_time` > 0
- `tokens_used` > 0
- `agents_used` > 0
- `agent_contributions` not empty
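Those checks can be automated against a parsed response. A minimal sketch (the function name is an assumption; pass it the `performance` object from the API response):

```python
def check_performance(perf):
    """Raise AssertionError if any performance metric looks unpopulated."""
    assert perf.get("processing_time", 0) > 0, "processing_time should be > 0"
    assert perf.get("tokens_used", 0) > 0, "tokens_used should be > 0"
    assert perf.get("agents_used", 0) > 0, "agents_used should be > 0"
    assert perf.get("agent_contributions"), "agent_contributions should not be empty"
```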
## Troubleshooting

### Build Fails

- Check `requirements.txt` for dependency conflicts
- Verify the Python version (3.10)
- Check the Dockerfile syntax

### Runtime Errors

- Verify `HF_TOKEN` is set in the Space secrets
- Check the logs for permission errors
- Verify the cache directory is writable

### Performance Issues

- Check GPU memory usage
- Monitor model loading times
- Verify quantization is enabled

### API Not Responding

- Check the health endpoint: `/api/health`
- Verify the Flask app is running on port 7860
- Check the Space logs for errors
## Post-Deployment

### 1. Update Documentation

- README.md updated with the HonestAI URL
- HF_SPACES_URL_GUIDE.md updated
- API_DOCUMENTATION.md includes the performance metrics

### 2. Monitor Metrics

- Track response times
- Monitor error rates
- Check performance metrics accuracy

### 3. User Communication

- Announce new features (performance metrics)
- Update the API documentation
- Share the new Space URL
## Quick Links

- Space: https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI
- API documentation: see `API_DOCUMENTATION.md`
- Configuration guide: see `.env.example`
- Performance metrics: see `PERFORMANCE_METRICS_IMPLEMENTATION.md`
## Success Criteria

After deployment, verify that:

- [ ] The Space builds successfully
- [ ] The health endpoint responds
- [ ] The API chat endpoint works
- [ ] The performance metrics are populated
- [ ] Models load with 4-bit quantization
- [ ] The cache directory is configured
- [ ] The logs show no critical errors

---

**Last Updated**: January 2024
**Space**: JatinAutonomousLabs/HonestAI
**Status**: Ready for Deployment