JatsTheAIGen

Security Enhancements: Production WSGI, Rate Limiting, Security Headers, Secure Logging

79ea999 about 1 month ago

	# Hugging Face Spaces Deployment Guide - HonestAI

	## 🚀 Deployment to HF Spaces

	This guide covers deploying the updated HonestAI application to [Hugging Face Spaces](https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI).

	## 📋 Pre-Deployment Checklist

	### ✅ Required Files
	- [x] `Dockerfile` - Container configuration
	- [x] `requirements.txt` - Python dependencies
	- [x] `flask_api_standalone.py` - Main application entry point
	- [x] `README.md` - Updated with HonestAI Space URL
	- [x] `src/` - All source code
	- [x] `.env.example` - Environment variable template

	### ✅ Recent Updates Included
	- [x] Enhanced configuration management (`src/config.py`)
	- [x] Performance metrics tracking (`src/orchestrator_engine.py`)
	- [x] Updated model configurations (Llama 3.1 8B, e5-base-v2, Qwen 2.5 1.5B)
	- [x] 4-bit quantization support
	- [x] Cache directory management
	- [x] Memory optimizations

	## 🔧 Deployment Steps

	### 1. Verify Space Configuration

	Space URL: https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI

	Space Settings:
	- SDK: Docker
	- Hardware: T4 GPU (16GB)
	- Visibility: Public
	- Storage: Persistent (for cache)

	### 2. Set Environment Variables

	In Space Settings → Repository secrets, ensure the following are set:
	- `HF_TOKEN` - Your Hugging Face API token (required)
	- `MAX_WORKERS` - Optional (default: 4)
	- `LOG_LEVEL` - Optional (default: INFO)
	- `HF_HOME` - Optional (auto-configured)
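
As a rough illustration of how the application side might consume these variables, the sketch below reads them with the defaults listed above and fails early when `HF_TOKEN` is missing. `DeployConfig` and `load_config` are hypothetical names used for illustration only; the project's actual handling lives in `src/config.py` and may differ.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DeployConfig:
    """Hypothetical container for the variables listed above."""
    hf_token: str
    max_workers: int
    log_level: str


def load_config(env=None) -> DeployConfig:
    """Read settings from an environment mapping, applying the documented defaults."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        # HF_TOKEN is the only required variable in the list above.
        raise RuntimeError("HF_TOKEN is not set; add it to the Space's secrets")
    return DeployConfig(
        hf_token=token,
        max_workers=int(env.get("MAX_WORKERS", "4")),   # documented default: 4
        log_level=env.get("LOG_LEVEL", "INFO").upper(),  # documented default: INFO
    )
```

Passing a plain dictionary, for example `load_config({"HF_TOKEN": "hf_example"})`, makes the defaults easy to check locally without touching the real environment.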

	### 3. Verify Dockerfile

	The `Dockerfile` is configured for:
	- Python 3.10
	- Port 7860 (HF Spaces standard)
	- Health check endpoint
	- Flask API as entry point

	### 4. Commit and Push Updates

	```bash
	# Ensure all changes are committed
	git add .
	git commit -m "Update: Performance metrics, enhanced config, model optimizations"

	# Push to HF Spaces repository
	git push origin main
	```

	### 5. Monitor Build

	- Build Time: 5-10 minutes (first build may take longer)
	- Watch Logs: Check Space logs for build progress
	- Health Check: `/api/health` endpoint should respond after build
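
Instead of refreshing the logs by hand, the health endpoint can be polled until the Space comes up. The following is a minimal sketch using only the standard library. The function names, the retry count, and the delay are assumptions chosen to cover roughly the 10-minute build window mentioned above, and it assumes the endpoint returns HTTP 200 when healthy.

```python
import time
import urllib.error
import urllib.request


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a GET request, or 0 if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 503 while building).
        return exc.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return 0


def wait_until_healthy(url, attempts=30, delay=20.0, fetch=http_status, sleep=time.sleep) -> bool:
    """Poll `url` until it returns HTTP 200 or the attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A typical call would be `wait_until_healthy("https://jatinautonomouslabs-honestai.hf.space/api/health")`. The `fetch` and `sleep` parameters exist only so the loop can be exercised without network access.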

	## 📊 What's New in This Deployment

	### 1. Performance Metrics
	Every API response now includes comprehensive performance data:
	```json
	{
	  "performance": {
	    "processing_time": 1230.5,
	    "tokens_used": 456,
	    "agents_used": 4,
	    "confidence_score": 85.2,
	    "agent_contributions": [...],
	    "safety_score": 85.0
	  }
	}
	```

	### 2. Enhanced Configuration
	- Automatic cache directory management
	- Secure environment variable handling
	- Backward compatible settings
	- Validation and error handling

	### 3. Model Optimizations
	- Llama 3.1 8B with 4-bit quantization (primary)
	- e5-base-v2 for embeddings (768 dimensions)
	- Qwen 2.5 1.5B for fast classification
	- Model preloading for faster responses
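
To see why 4-bit quantization matters on a 16 GB T4, a back-of-the-envelope estimate of weight memory helps. The sketch below assumes a nominal 8 billion parameters and counts weights only. It ignores activations, the KV cache, quantization metadata, and the other two models, so treat the numbers as rough lower bounds rather than measurements.

```python
GIB = 1024 ** 3


def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Memory needed for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / GIB


# Nominal size of an "8B" model (assumption; the exact parameter count differs slightly).
PARAMS = 8e9

fp16 = weight_memory_gib(PARAMS, 16)
int4 = weight_memory_gib(PARAMS, 4)

print(f"fp16 weights:  {fp16:.1f} GiB")   # fp16 weights:  14.9 GiB
print(f"4-bit weights: {int4:.1f} GiB")   # 4-bit weights: 3.7 GiB
```

Under these assumptions, half-precision weights alone would nearly fill the GPU, while the 4-bit weights leave headroom for the embedding and classification models.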

	### 4. Memory Management
	- Optimized history tracking (limited to 50-100 entries)
	- Efficient agent call tracking
	- Memory-aware caching
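
One common way to cap history size is a fixed-length `collections.deque`, which discards the oldest entry automatically once the limit is reached. The class below is an illustrative sketch with hypothetical names, not the implementation in `src/orchestrator_engine.py`. The default limit of 100 is taken from the upper end of the range above.

```python
from collections import deque


class BoundedHistory:
    """Keep only the most recent `max_entries` items; older ones are dropped."""

    def __init__(self, max_entries: int = 100):
        self._items = deque(maxlen=max_entries)

    def add(self, entry: dict) -> None:
        # Appending to a full deque silently evicts the oldest entry.
        self._items.append(entry)

    def recent(self, n: int) -> list:
        """Return up to `n` of the most recent entries, oldest first."""
        if n <= 0:
            return []
        return list(self._items)[-n:]

    def __len__(self) -> int:
        return len(self._items)
```

Because eviction happens inside `append`, memory use stays bounded without any separate cleanup pass.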

	## 🧪 Testing After Deployment

	### 1. Health Check
	```bash
	curl https://jatinautonomouslabs-honestai.hf.space/api/health
	```

	### 2. Test API Endpoint
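	```python
	import requests

	# Call the Space's direct URL (the same host as the health check above).
	# The huggingface.co/spaces/... address is the Space's web page, not the API.
	response = requests.post(
	    "https://jatinautonomouslabs-honestai.hf.space/api/chat",
	    json={
	        "message": "Hello, what is machine learning?",
	        "session_id": "test-session",
	        "user_id": "test-user",
	    },
	    timeout=120,  # assumed value; model responses can be slow
	)
	response.raise_for_status()  # fail loudly on HTTP errors

	data = response.json()
	print(f"Response: {data['message']}")
	print(f"Performance: {data.get('performance', {})}")
	```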

	### 3. Verify Performance Metrics
	Check that performance metrics are populated (not all zeros):
	- `processing_time` > 0
	- `tokens_used` > 0
	- `agents_used` > 0
	- `agent_contributions` not empty
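
The checks above can be automated with a small helper. This is a sketch that assumes the `performance` object has the shape shown earlier in this guide. The function name is hypothetical.

```python
def find_unpopulated_metrics(performance: dict) -> list:
    """Return the names of metrics that are missing, zero, or empty."""
    problems = []
    for key in ("processing_time", "tokens_used", "agents_used"):
        value = performance.get(key)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or value <= 0:
            problems.append(key)
    # An absent, null, or empty list of contributions counts as unpopulated.
    if not performance.get("agent_contributions"):
        problems.append("agent_contributions")
    return problems
```

With the response from the previous step, `find_unpopulated_metrics(data.get("performance", {}))` should return an empty list when all four checks pass.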

	## 🔍 Troubleshooting

	### Build Fails
	- Check `requirements.txt` for conflicts
	- Verify Python version (3.10)
	- Check Dockerfile syntax

	### Runtime Errors
	- Verify `HF_TOKEN` is set in Space secrets
	- Check logs for permission errors
	- Verify cache directory is writable
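
For the last item, a quick way to confirm that a directory is writable is to create a temporary file in it. The helper below is a minimal sketch with a hypothetical name. It creates the directory if it does not exist, which may or may not match how the application configures its cache.

```python
import os
import tempfile


def cache_dir_is_writable(path: str) -> bool:
    """Return True if a file can actually be created inside `path`."""
    try:
        os.makedirs(path, exist_ok=True)
        # The temporary file is deleted again when the block exits.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
        return True
    except OSError:
        # Covers permission errors, read-only file systems, and invalid paths.
        return False
```

It could be run from a debug shell or at startup against the cache location, for example `cache_dir_is_writable(os.environ.get("HF_HOME", os.path.expanduser("~/.cache/huggingface")))`.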

	### Performance Issues
	- Check GPU memory usage
	- Monitor model loading times
	- Verify quantization is enabled

	### API Not Responding
	- Check health endpoint: `/api/health`
	- Verify Flask app is running on port 7860
	- Check Space logs for errors

	## 📝 Post-Deployment

	### 1. Update Documentation
	- ✅ README.md updated with HonestAI URL
	- ✅ HF_SPACES_URL_GUIDE.md updated
	- ✅ API_DOCUMENTATION.md includes performance metrics

	### 2. Monitor Metrics
	- Track response times
	- Monitor error rates
	- Check performance metrics accuracy

	### 3. User Communication
	- Announce new features (performance metrics)
	- Update API documentation
	- Share new Space URL

	## 🔗 Quick Links

	- Space: https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI
	- API Documentation: See `API_DOCUMENTATION.md`
	- Configuration Guide: See `.env.example`
	- Performance Metrics: See `PERFORMANCE_METRICS_IMPLEMENTATION.md`

	## ✅ Success Criteria

	After deployment, verify:
	1. ✅ Space builds successfully
	2. ✅ Health endpoint responds
	3. ✅ API chat endpoint works
	4. ✅ Performance metrics are populated
	5. ✅ Models load with 4-bit quantization
	6. ✅ Cache directory is configured
	7. ✅ Logs show no critical errors

	---

	Last Updated: January 2024
	Space: JatinAutonomousLabs/HonestAI
	Status: Ready for Deployment ✅