# Hugging Face Spaces Deployment Guide - HonestAI

## Deployment to HF Spaces
This guide covers deploying the updated HonestAI application to Hugging Face Spaces.
## Pre-Deployment Checklist

### Required Files
- `Dockerfile` - Container configuration
- `requirements.txt` - Python dependencies
- `flask_api_standalone.py` - Main application entry point
- `README.md` - Updated with the HonestAI Space URL
- `src/` - All source code
- `.env.example` - Environment variable template
### Recent Updates Included

- Enhanced configuration management (`src/config.py`)
- Performance metrics tracking (`src/orchestrator_engine.py`)
- Updated model configurations (Llama 3.1 8B, e5-base-v2, Qwen 2.5 1.5B)
- 4-bit quantization support
- Cache directory management
- Memory optimizations
## Deployment Steps

### 1. Verify Space Configuration
Space URL: https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI
Space Settings:
- SDK: Docker
- Hardware: T4 GPU (16GB)
- Visibility: Public
- Storage: Persistent (for cache)
### 2. Set Environment Variables

In Space Settings → Repository secrets, ensure:

- `HF_TOKEN` - Your Hugging Face API token (required)
- `MAX_WORKERS` - Optional (default: 4)
- `LOG_LEVEL` - Optional (default: INFO)
- `HF_HOME` - Optional (auto-configured)
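Inside the application, these variables can be read with the documented defaults as fallbacks. A minimal sketch (the function name and the fallback cache path are assumptions, not the actual `src/config.py` code):

```python
import os

def load_settings(env=os.environ):
    """Read HonestAI settings from the environment with documented defaults."""
    token = env.get("HF_TOKEN")
    if not token:
        # Required secret: fail fast with a clear message.
        raise RuntimeError("HF_TOKEN must be set in the Space's repository secrets")
    return {
        "hf_token": token,
        "max_workers": int(env.get("MAX_WORKERS", "4")),      # default: 4
        "log_level": env.get("LOG_LEVEL", "INFO"),            # default: INFO
        "hf_home": env.get("HF_HOME", "/data/.huggingface"),  # assumed fallback path
    }
```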
### 3. Verify Dockerfile
The Dockerfile is configured for:
- Python 3.10
- Port 7860 (HF Spaces standard)
- Health check endpoint
- Flask API as entry point
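A Dockerfile covering those four points might look like the following sketch (the file layout and the exact `HEALTHCHECK` command are assumptions; the project's actual Dockerfile may differ):

```dockerfile
FROM python:3.10-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

# HF Spaces routes traffic to port 7860.
EXPOSE 7860

# Probe the health endpoint with the stdlib so no extra tools are needed.
HEALTHCHECK CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:7860/api/health')"

CMD ["python", "flask_api_standalone.py"]
```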
### 4. Commit and Push Updates

```bash
# Ensure all changes are committed
git add .
git commit -m "Update: Performance metrics, enhanced config, model optimizations"

# Push to the HF Spaces repository
git push origin main
```
### 5. Monitor Build

- Build time: 5-10 minutes (the first build may take longer)
- Watch logs: check the Space logs for build progress
- Health check: the `/api/health` endpoint should respond once the build completes
## What's New in This Deployment

### 1. Performance Metrics
Every API response now includes comprehensive performance data:

```json
{
  "performance": {
    "processing_time": 1230.5,
    "tokens_used": 456,
    "agents_used": 4,
    "confidence_score": 85.2,
    "agent_contributions": [...],
    "safety_score": 85.0
  }
}
```
### 2. Enhanced Configuration
- Automatic cache directory management
- Secure environment variable handling
- Backward compatible settings
- Validation and error handling
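Cache directory management can be as simple as pointing `HF_HOME` at a writable path and creating it up front. A minimal sketch (the function name and the `/data/.huggingface` default are assumptions; `/data` presumes persistent storage is enabled on the Space):

```python
import os
from pathlib import Path

def ensure_cache_dir(default="/data/.huggingface"):
    """Point the Hugging Face cache at a writable directory, creating it if needed."""
    cache = Path(os.environ.get("HF_HOME", default))
    cache.mkdir(parents=True, exist_ok=True)   # idempotent: safe on every startup
    os.environ["HF_HOME"] = str(cache)         # downstream libraries pick this up
    return cache
```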
### 3. Model Optimizations
- Llama 3.1 8B with 4-bit quantization (primary)
- e5-base-v2 for embeddings (768 dimensions)
- Qwen 2.5 1.5B for fast classification
- Model preloading for faster responses
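The model lineup above can be captured in one configuration table that the loader iterates over at startup. A sketch (the dictionary shape and key names are illustrative assumptions, not the project's actual registry):

```python
# Hypothetical model registry mirroring the lineup above.
MODELS = {
    "primary": {
        "name": "Llama 3.1 8B",
        "load_in_4bit": True,   # 4-bit quantization to fit the T4's 16GB
        "preload": True,        # load at startup for faster first response
    },
    "embeddings": {
        "name": "e5-base-v2",
        "dimensions": 768,      # embedding vector size
    },
    "classifier": {
        "name": "Qwen 2.5 1.5B",  # small model for fast classification
        "preload": True,
    },
}
```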
### 4. Memory Management
- Optimized history tracking (limited to 50-100 entries)
- Efficient agent call tracking
- Memory-aware caching
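Bounded history tracking can be implemented with a fixed-length `collections.deque`, which discards the oldest entries automatically. A minimal sketch (the class name is an assumption; the 100-entry cap follows the range noted above):

```python
from collections import deque

class HistoryTracker:
    """Keeps only the most recent entries to bound memory use."""

    def __init__(self, max_entries=100):
        self._entries = deque(maxlen=max_entries)

    def add(self, entry):
        self._entries.append(entry)  # oldest entry drops off when full

    def recent(self, n=10):
        return list(self._entries)[-n:]
```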
## Testing After Deployment

### 1. Health Check
```bash
curl https://jatinautonomouslabs-honestai.hf.space/api/health
```
### 2. Test API Endpoint

```python
import requests

response = requests.post(
    "https://jatinautonomouslabs-honestai.hf.space/api/chat",
    json={
        "message": "Hello, what is machine learning?",
        "session_id": "test-session",
        "user_id": "test-user"
    }
)
data = response.json()
print(f"Response: {data['message']}")
print(f"Performance: {data.get('performance', {})}")
```
### 3. Verify Performance Metrics

Check that the performance metrics are populated (not all zeros):

- `processing_time` > 0
- `tokens_used` > 0
- `agents_used` > 0
- `agent_contributions` not empty
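Those checks can be automated against a parsed response. A minimal sketch (the function name is an assumption; pass it the `performance` object from the API response):

```python
def check_performance(perf):
    """Raise AssertionError if any performance metric looks unpopulated."""
    assert perf.get("processing_time", 0) > 0, "processing_time should be > 0"
    assert perf.get("tokens_used", 0) > 0, "tokens_used should be > 0"
    assert perf.get("agents_used", 0) > 0, "agents_used should be > 0"
    assert perf.get("agent_contributions"), "agent_contributions should not be empty"
```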
## Troubleshooting

### Build Fails

- Check `requirements.txt` for dependency conflicts
- Verify the Python version (3.10)
- Check the Dockerfile syntax

### Runtime Errors

- Verify `HF_TOKEN` is set in the Space secrets
- Check the logs for permission errors
- Verify the cache directory is writable

### Performance Issues

- Check GPU memory usage
- Monitor model loading times
- Verify quantization is enabled

### API Not Responding

- Check the health endpoint: `/api/health`
- Verify the Flask app is running on port 7860
- Check the Space logs for errors
## Post-Deployment

### 1. Update Documentation

- README.md updated with the HonestAI URL
- HF_SPACES_URL_GUIDE.md updated
- API_DOCUMENTATION.md includes the performance metrics

### 2. Monitor Metrics

- Track response times
- Monitor error rates
- Check performance metrics accuracy

### 3. User Communication

- Announce new features (performance metrics)
- Update the API documentation
- Share the new Space URL
## Quick Links

- Space: https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI
- API documentation: see `API_DOCUMENTATION.md`
- Configuration guide: see `.env.example`
- Performance metrics: see `PERFORMANCE_METRICS_IMPLEMENTATION.md`
## Success Criteria

After deployment, verify that:

- [ ] The Space builds successfully
- [ ] The health endpoint responds
- [ ] The API chat endpoint works
- [ ] The performance metrics are populated
- [ ] Models load with 4-bit quantization
- [ ] The cache directory is configured
- [ ] The logs show no critical errors

---

**Last Updated**: January 2024
**Space**: JatinAutonomousLabs/HonestAI
**Status**: Ready for Deployment