Flask API Only - Required Files List

This document lists all files needed for a Flask API-only deployment (no Gradio UI).

📋 Essential Files (Required)

Core Application Files

Research_AI_Assistant/
├── flask_api_standalone.py          # Main Flask application (REQUIRED)
├── Dockerfile.flask                 # Dockerfile for Flask deployment (rename to Dockerfile)
├── README_FLASK_API.md              # README with HF Spaces frontmatter (rename to README.md)
└── requirements.txt                 # Python dependencies (REQUIRED)

Source Code Directory (src/)

Research_AI_Assistant/src/
├── __init__.py                      # Package initialization
├── config.py                        # Configuration settings
├── llm_router.py                    # LLM routing (local GPU models)
├── local_model_loader.py            # GPU model loader (NEW - for local inference)
├── orchestrator_engine.py           # Main orchestrator
├── context_manager.py               # Context management
├── models_config.py                 # Model configurations
├── agents/
│   ├── __init__.py
│   ├── intent_agent.py              # Intent recognition agent
│   ├── synthesis_agent.py           # Response synthesis agent
│   ├── safety_agent.py              # Safety checking agent
│   └── skills_identification_agent.py # Skills identification agent
└── database.py                      # Database management (if used)

Configuration Files (Optional but Recommended)

Research_AI_Assistant/
├── .env                             # Environment variables (optional, use HF Secrets instead)
└── .gitignore                       # Git ignore rules

📦 File Descriptions

1. flask_api_standalone.py ⭐ REQUIRED

  • Purpose: Main Flask application entry point
  • Contains: API endpoints, orchestrator initialization, request handling
  • Key Features:
    • Local GPU model loading
    • Async orchestrator support
    • Health checks
    • Error handling
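
As a rough illustration of the health-check and error-handling features listed above, a Flask app could expose something like the following. This is a minimal sketch and not the contents of flask_api_standalone.py: the `/health` route, the response fields, and the `handle_error` handler are assumptions made for illustration.

```python
from flask import Flask, jsonify
from werkzeug.exceptions import HTTPException

app = Flask(__name__)


@app.route("/health", methods=["GET"])
def health():
    # Hypothetical health endpoint; the real route name and payload may differ.
    return jsonify({"status": "ok"}), 200


@app.errorhandler(Exception)
def handle_error(exc):
    # Keep HTTP errors (404, 405, ...) at their own status code;
    # report anything else as a 500 without leaking internals.
    if isinstance(exc, HTTPException):
        return jsonify({"error": exc.name}), exc.code
    return jsonify({"error": "Internal Server Error"}), 500
```

Returning JSON from the error handler keeps every response machine-readable, which suits an API-only deployment with no UI.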

2. Dockerfile.flask → Dockerfile ⭐ REQUIRED

  • Purpose: Container configuration
  • Action: Copy or rename to Dockerfile when deploying
  • Includes: Python 3.10, system dependencies, health checks

3. README_FLASK_API.md → README.md ⭐ REQUIRED

  • Purpose: HF Spaces configuration and API documentation
  • Action: Copy or rename to README.md when deploying
  • Contains: Frontmatter with sdk: docker, API endpoints, usage examples
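
Since the Space depends on the `sdk: docker` entry in the README frontmatter, it can be worth checking for it before pushing. The helper below is a sketch under simple assumptions: `frontmatter_value` is a hypothetical function, it only handles flat `key: value` lines between the leading `---` markers, and the `sample` text (including its title) is a placeholder rather than the real README.

```python
def frontmatter_value(readme_text: str, key: str):
    """Return the value of a top-level `key: value` line in the leading frontmatter, or None."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # no frontmatter block at the top of the file
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        name, sep, value = line.partition(":")
        if sep and name.strip() == key:
            return value.strip()
    return None


# Placeholder README content, for illustration only.
sample = "---\ntitle: Example Space\nsdk: docker\n---\n# API docs\n"
print(frontmatter_value(sample, "sdk"))
```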

4. requirements.txt ⭐ REQUIRED

  • Purpose: Python package dependencies
  • Includes: Flask, transformers, torch (GPU), sentence-transformers, etc.

5. src/local_model_loader.py ⭐ REQUIRED (NEW)

  • Purpose: Loads models locally on GPU
  • Features: GPU detection, model caching, FP16 optimization
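
The three features named above (GPU detection, model caching, FP16) can be sketched roughly as follows. This is not the actual src/local_model_loader.py: `select_device`, `select_dtype_name`, `ModelCache`, and the injected `load_fn` are hypothetical names, and the only real library call used is `torch.cuda.is_available()`.

```python
from typing import Any, Callable, Dict, Tuple


def select_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, else "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch not installed: nothing to run on a GPU
    return "cuda" if torch.cuda.is_available() else "cpu"


def select_dtype_name(device: str) -> str:
    """Use half precision (FP16) on GPU and full precision on CPU."""
    return "float16" if device == "cuda" else "float32"


class ModelCache:
    """Load each model at most once per (name, device) pair."""

    def __init__(self, load_fn: Callable[[str, str, str], Any]) -> None:
        # Hypothetical loader signature: (name, device, dtype_name) -> model
        self._load_fn = load_fn
        self._models: Dict[Tuple[str, str], Any] = {}

    def get(self, name: str) -> Any:
        device = select_device()
        key = (name, device)
        if key not in self._models:
            self._models[key] = self._load_fn(name, device, select_dtype_name(device))
        return self._models[key]
```

Injecting the loader keeps the caching logic independent of any particular model library, so it can be exercised without downloading weights.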

6. src/llm_router.py ⭐ REQUIRED (UPDATED)

  • Purpose: Routes inference requests
  • Features: Tries local models first, falls back to HF API
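
The local-first, API-fallback behaviour described above might look like this in outline. It is a sketch with assumed names: `route_inference`, `broken_local`, and `fake_remote` are hypothetical, and the real router's interface may differ.

```python
from typing import Callable, Optional


def route_inference(
    prompt: str,
    local_generate: Optional[Callable[[str], str]],
    remote_generate: Callable[[str], str],
) -> str:
    """Try the local model first; fall back to the remote API on failure."""
    if local_generate is not None:
        try:
            return local_generate(prompt)
        except Exception as exc:  # e.g. out-of-memory or model not loaded
            print(f"Local inference failed ({exc}); falling back to remote API")
    return remote_generate(prompt)


def broken_local(prompt: str) -> str:
    # Stand-in for a local model that cannot run.
    raise RuntimeError("GPU unavailable")


def fake_remote(prompt: str) -> str:
    # Stand-in for a call to a hosted inference API.
    return f"remote:{prompt}"
```

Passing `None` for `local_generate` covers deployments where no local model was loaded at all, so the same call site works with or without a GPU.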

7. src/orchestrator_engine.py ⭐ REQUIRED

  • Purpose: Main AI orchestration engine
  • Contains: Agent coordination, request processing

8. src/context_manager.py ⭐ REQUIRED

  • Purpose: Manages conversation context
  • Features: Session management, context retrieval

9. src/agents/*.py ⭐ REQUIRED

  • Purpose: Individual AI agents
  • Agents: Intent, Synthesis, Safety, Skills Identification

10. src/config.py ⭐ REQUIRED

  • Purpose: Application configuration
  • Settings: MAX_WORKERS=4, model paths, etc.
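
One possible way to keep a setting such as MAX_WORKERS adjustable per deployment is to read it from the environment and fall back to the documented default of 4. Whether src/config.py actually supports an environment override is an assumption here, and `read_int_setting` is a hypothetical helper.

```python
import os


def read_int_setting(name: str, default: int) -> int:
    """Read a positive integer setting from the environment, or return the default."""
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    value = int(raw)  # raises ValueError on non-numeric input
    if value < 1:
        raise ValueError(f"{name} must be a positive integer, got {value}")
    return value


# The default of 4 matches the MAX_WORKERS value listed above.
MAX_WORKERS = read_int_setting("MAX_WORKERS", default=4)
```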

❌ Files NOT Needed (Gradio/UI Related)

These files can be excluded from Flask API deployment:

Research_AI_Assistant/
├── app.py                            # Gradio UI (NOT NEEDED)
├── main.py                           # Gradio + Flask launcher (NOT NEEDED)
├── flask_api.py                      # Flask API (use standalone instead)
├── Dockerfile                        # Main Dockerfile (use Dockerfile.flask)
├── Dockerfile.hf                     # Alternative Dockerfile (NOT NEEDED)
├── README.md                         # Main README (use README_FLASK_API.md)
└── All .md files except this one     # Documentation (optional)

🚀 Quick Deployment Checklist

Step 1: Prepare Files

# In your Flask API Space directory:
cp Dockerfile.flask Dockerfile
cp README_FLASK_API.md README.md

Step 2: Verify Structure

Your Space/
├── Dockerfile                        # ✅ Renamed from Dockerfile.flask
├── README.md                         # ✅ Renamed from README_FLASK_API.md
├── flask_api_standalone.py           # ✅ Main Flask app
├── requirements.txt                  # ✅ Dependencies
└── src/                              # ✅ All source files
    ├── __init__.py
    ├── config.py
    ├── llm_router.py
    ├── local_model_loader.py
    ├── orchestrator_engine.py
    ├── context_manager.py
    ├── models_config.py
    └── agents/
        ├── __init__.py
        ├── intent_agent.py
        ├── synthesis_agent.py
        ├── safety_agent.py
        └── skills_identification_agent.py
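
The structure above can also be checked programmatically before pushing. The following sketch uses only the standard library; `REQUIRED_PATHS` is copied from the listing above, and `missing_files` is a hypothetical helper name.

```python
from pathlib import Path
from typing import List

# Paths taken from the structure listed above.
REQUIRED_PATHS = [
    "Dockerfile",
    "README.md",
    "flask_api_standalone.py",
    "requirements.txt",
    "src/__init__.py",
    "src/config.py",
    "src/llm_router.py",
    "src/local_model_loader.py",
    "src/orchestrator_engine.py",
    "src/context_manager.py",
    "src/models_config.py",
    "src/agents/__init__.py",
    "src/agents/intent_agent.py",
    "src/agents/synthesis_agent.py",
    "src/agents/safety_agent.py",
    "src/agents/skills_identification_agent.py",
]


def missing_files(root: str = ".") -> List[str]:
    """Return the required paths that are not regular files under root."""
    base = Path(root)
    return [rel for rel in REQUIRED_PATHS if not (base / rel).is_file()]
```

Calling `missing_files()` from the Space directory returns an empty list when everything is in place, and otherwise the relative paths that still need to be added.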

Step 3: Set Environment Variables

In HF Spaces Settings → Secrets:

  • HF_TOKEN - Your Hugging Face token
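
Secrets configured in a Space are exposed to the container as environment variables, so the application can read the token at startup. The snippet below is a minimal sketch; `get_hf_token` is a hypothetical helper, and failing fast on a missing token is a design choice rather than something the source prescribes.

```python
import os


def get_hf_token() -> str:
    """Return the Hugging Face token from the environment or fail with a clear message."""
    token = os.environ.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; add it under Settings > Secrets in the Space.")
    return token
```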

Step 4: Deploy

  • Select NVIDIA T4 Medium GPU
  • Set SDK: docker
  • Deploy

📊 File Size Considerations

Minimal Deployment (Essential Only)

  • Core files: ~50 KB
  • Source code: ~500 KB
  • Total: ~550 KB of code

With Models (First Load)

  • Code: ~550 KB
  • Models (downloaded on first run): ~14-16 GB
  • Total: ~14-16 GB (after the first run downloads the models)

Subsequent Builds

  • Models cached by HF Spaces
  • Code only: ~550 KB

🔍 Verification

After deployment, verify these files exist:

# Check main files
ls -la Dockerfile README.md flask_api_standalone.py requirements.txt

# Check source directory
ls -la src/
ls -la src/agents/

# Verify key components
grep -r "local_model_loader" src/llm_router.py
grep -r "MAX_WORKERS" src/config.py

📝 Summary

Minimum Required Files:

  1. flask_api_standalone.py
  2. Dockerfile (from Dockerfile.flask)
  3. README.md (from README_FLASK_API.md)
  4. requirements.txt
  5. All files in src/ directory

Total: ~15-20 files (excluding documentation)


Note: This is a minimal deployment. All Gradio UI files, documentation, and test files are optional and can be excluded to reduce repository size.