# Flask API Only - Required Files List
This document lists all files needed for a **Flask API-only deployment** (no Gradio UI).
## 📋 Essential Files (Required)
### Core Application Files
```
Research_AI_Assistant/
├── flask_api_standalone.py    # Main Flask application (REQUIRED)
├── Dockerfile.flask           # Dockerfile for Flask deployment (rename to Dockerfile)
├── README_FLASK_API.md        # README with HF Spaces frontmatter (rename to README.md)
└── requirements.txt           # Python dependencies (REQUIRED)
```
### Source Code Directory (`src/`)
```
Research_AI_Assistant/src/
├── __init__.py                 # Package initialization
├── config.py                   # Configuration settings
├── llm_router.py               # LLM routing (local GPU models)
├── local_model_loader.py       # GPU model loader (NEW - for local inference)
├── orchestrator_engine.py      # Main orchestrator
├── context_manager.py          # Context management
├── models_config.py            # Model configurations
├── agents/
│   ├── __init__.py
│   ├── intent_agent.py         # Intent recognition agent
│   ├── synthesis_agent.py      # Response synthesis agent
│   ├── safety_agent.py         # Safety checking agent
│   └── skills_identification_agent.py  # Skills identification agent
└── database.py                 # Database management (if used)
```
### Configuration Files (Optional but Recommended)
```
Research_AI_Assistant/
├── .env                        # Environment variables (optional, use HF Secrets instead)
└── .gitignore                  # Git ignore rules
```
## 📦 File Descriptions
### 1. `flask_api_standalone.py` ⭐ REQUIRED
- **Purpose**: Main Flask application entry point
- **Contains**: API endpoints, orchestrator initialization, request handling
- **Key Features**:
  - Local GPU model loading
  - Async orchestrator support
  - Health checks
  - Error handling
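
The async orchestrator support and error handling listed above imply that a synchronous Flask view has to run a coroutine to completion and turn failures into HTTP errors. The following is a minimal sketch of that pattern, not the code in `flask_api_standalone.py`: `EchoOrchestrator`, its `process_request` method, the `message` and `session_id` payload keys, and `handle_chat` are all hypothetical names chosen for illustration. It uses only the standard library so the pattern can be tried without Flask installed.

```python
import asyncio


class EchoOrchestrator:
    """Hypothetical stand-in for the real orchestrator (assumed async API)."""

    async def process_request(self, session_id, message):
        await asyncio.sleep(0)  # placeholder for real asynchronous work
        return {"session_id": session_id, "response": f"echo: {message}"}


def handle_chat(orchestrator, payload):
    """Body of a synchronous request handler; returns (json_body, http_status)."""
    data = payload if isinstance(payload, dict) else {}
    message = str(data.get("message", "")).strip()
    if not message:
        return {"error": "message is required"}, 400
    session_id = data.get("session_id", "default")
    try:
        # asyncio.run creates a fresh event loop, runs the coroutine, then closes the loop.
        result = asyncio.run(orchestrator.process_request(session_id, message))
    except Exception as exc:  # report orchestrator failures as a 500 response
        return {"error": str(exc)}, 500
    return result, 200
```

Inside a Flask view, this could be called as `body, status = handle_chat(orchestrator, request.get_json(silent=True))` and returned with `jsonify(body), status`.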
### 2. `Dockerfile.flask` → `Dockerfile` ⭐ REQUIRED
- **Purpose**: Container configuration
- **Action**: Rename to `Dockerfile` when deploying
- **Includes**: Python 3.10, system dependencies, health checks
### 3. `README_FLASK_API.md` → `README.md` ⭐ REQUIRED
- **Purpose**: HF Spaces configuration and API documentation
- **Action**: Rename to `README.md` when deploying
- **Contains**: Frontmatter with `sdk: docker`, API endpoints, usage examples
### 4. `requirements.txt` ⭐ REQUIRED
- **Purpose**: Python package dependencies
- **Includes**: Flask, transformers, torch (GPU), sentence-transformers, etc.
### 5. `src/local_model_loader.py` ⭐ REQUIRED (NEW)
- **Purpose**: Loads models locally on GPU
- **Features**: GPU detection, model caching, FP16 optimization
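
The three features named above (GPU detection, model caching, FP16) can be sketched as follows. This is an illustration under assumptions, not the contents of `src/local_model_loader.py`: `detect_device`, `choose_dtype_name`, `get_model`, and the `loader` callable are hypothetical names, and the real loading call is injected so the sketch runs without downloading any model.

```python
_MODEL_CACHE = {}  # model_id -> loaded model, so each model is loaded once per process


def detect_device():
    """Return "cuda" when PyTorch is installed and sees a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def choose_dtype_name(device):
    """Use half precision on GPU to reduce memory; keep full precision on CPU."""
    return "float16" if device == "cuda" else "float32"


def get_model(model_id, loader):
    """Return a cached model, loading it on first use.

    `loader` is a hypothetical callable taking (model_id, device, dtype_name).
    """
    if model_id not in _MODEL_CACHE:
        device = detect_device()
        _MODEL_CACHE[model_id] = loader(model_id, device, choose_dtype_name(device))
    return _MODEL_CACHE[model_id]
```

Injecting the loader keeps the caching logic testable on a machine without a GPU; a real loader would wrap whatever model-loading call the project uses.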
### 6. `src/llm_router.py` ⭐ REQUIRED (UPDATED)
- **Purpose**: Routes inference requests
- **Features**: Tries local models first, falls back to HF API
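
The local-first routing described above might look like the sketch below. It assumes two hypothetical callables, `local_generate` and `remote_generate`, each taking a prompt and returning text; the function name `route_inference` and the `source` labels are also illustrative and are not taken from `src/llm_router.py`.

```python
def route_inference(prompt, local_generate, remote_generate):
    """Try the local model first; fall back to the remote API if it fails."""
    try:
        return {"source": "local", "text": local_generate(prompt)}
    except Exception as local_error:
        # Local inference failed (for example, no GPU or model not loaded).
        try:
            return {"source": "hf_api", "text": remote_generate(prompt)}
        except Exception as remote_error:
            raise RuntimeError(
                f"Both backends failed: local={local_error!r}, remote={remote_error!r}"
            ) from remote_error
```

Returning the `source` alongside the text makes it easy to log which backend served each request.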
### 7. `src/orchestrator_engine.py` ⭐ REQUIRED
- **Purpose**: Main AI orchestration engine
- **Contains**: Agent coordination, request processing
### 8. `src/context_manager.py` ⭐ REQUIRED
- **Purpose**: Manages conversation context
- **Features**: Session management, context retrieval
### 9. `src/agents/*.py` ⭐ REQUIRED
- **Purpose**: Individual AI agents
- **Agents**: Intent, Synthesis, Safety, Skills Identification
### 10. `src/config.py` ⭐ REQUIRED
- **Purpose**: Application configuration
- **Settings**: MAX_WORKERS=4, model paths, etc.
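
As an illustration of how a setting such as `MAX_WORKERS=4` could be defined with an environment override, here is a minimal sketch. Whether the real `src/config.py` reads this value from the environment is an assumption; the `Settings` class and `load_settings` function are hypothetical names.

```python
import os
from dataclasses import dataclass

DEFAULT_MAX_WORKERS = 4  # default value quoted in this document


@dataclass(frozen=True)
class Settings:
    max_workers: int = DEFAULT_MAX_WORKERS


def load_settings(env=None):
    """Build Settings from a mapping of environment variables (os.environ by default)."""
    env = os.environ if env is None else env
    raw = env.get("MAX_WORKERS", "")
    try:
        workers = int(raw)
    except ValueError:
        workers = DEFAULT_MAX_WORKERS  # unset or malformed value: use the default
    return Settings(max_workers=max(1, workers))  # never allow fewer than one worker
```

Passing the environment as a parameter keeps the function easy to test without modifying the process environment.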
## ❌ Files NOT Needed (Gradio/UI Related)
These files can be **excluded** from Flask API deployment:
```
Research_AI_Assistant/
├── app.py                        # Gradio UI (NOT NEEDED)
├── main.py                       # Gradio + Flask launcher (NOT NEEDED)
├── flask_api.py                  # Flask API (use standalone instead)
├── Dockerfile                    # Main Dockerfile (use Dockerfile.flask)
├── Dockerfile.hf                 # Alternative Dockerfile (NOT NEEDED)
├── README.md                     # Main README (use README_FLASK_API.md)
└── All .md files except this one # Documentation (optional)
```
## 🚀 Quick Deployment Checklist
### Step 1: Prepare Files
```bash
# In your Flask API Space directory:
cp Dockerfile.flask Dockerfile
cp README_FLASK_API.md README.md
```
### Step 2: Verify Structure
```
Your Space/
├── Dockerfile                  # ✅ Renamed from Dockerfile.flask
├── README.md                   # ✅ Renamed from README_FLASK_API.md
├── flask_api_standalone.py     # ✅ Main Flask app
├── requirements.txt            # ✅ Dependencies
└── src/                        # ✅ All source files
    ├── __init__.py
    ├── config.py
    ├── llm_router.py
    ├── local_model_loader.py
    ├── orchestrator_engine.py
    ├── context_manager.py
    ├── models_config.py
    └── agents/
        ├── __init__.py
        ├── intent_agent.py
        ├── synthesis_agent.py
        ├── safety_agent.py
        └── skills_identification_agent.py
```
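
The structure above can also be checked programmatically before pushing to the Space. The sketch below is one possible helper, with the file list copied from the tree above; the names `REQUIRED_FILES` and `missing_files` are illustrative.

```python
from pathlib import Path

# Paths copied from the expected Space structure shown above.
REQUIRED_FILES = [
    "Dockerfile",
    "README.md",
    "flask_api_standalone.py",
    "requirements.txt",
    "src/__init__.py",
    "src/config.py",
    "src/llm_router.py",
    "src/local_model_loader.py",
    "src/orchestrator_engine.py",
    "src/context_manager.py",
    "src/models_config.py",
    "src/agents/__init__.py",
    "src/agents/intent_agent.py",
    "src/agents/synthesis_agent.py",
    "src/agents/safety_agent.py",
    "src/agents/skills_identification_agent.py",
]


def missing_files(root="."):
    """Return the required paths that are not regular files under `root`."""
    base = Path(root)
    return [rel for rel in REQUIRED_FILES if not (base / rel).is_file()]
```

Calling `missing_files()` from the Space directory returns an empty list when everything is in place, and otherwise lists the paths still to be added.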
### Step 3: Set Environment Variables
In HF Spaces Settings → Secrets:
- `HF_TOKEN` - Your Hugging Face token
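
Secrets set in a Space are exposed to the running container as environment variables, so the application can read `HF_TOKEN` at startup. The following sketch fails fast with a clear message when the token is absent; the function name `get_hf_token` is hypothetical and how the real application consumes the token is not shown in this document.

```python
import os


def get_hf_token(env=None):
    """Return the Hugging Face token from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; add it under Settings -> Secrets.")
    return token
```

Checking once at startup avoids a less obvious authentication error later, when the first model download or API call is attempted.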
### Step 4: Deploy
- Select **NVIDIA T4 Medium** GPU
- Set **SDK: docker**
- Deploy
## 📊 File Size Considerations
### Minimal Deployment (Essential Only)
- Core files: ~50 KB
- Source code: ~500 KB
- **Total**: ~550 KB code
### With Models (First Load)
- Code: ~550 KB
- Models (downloaded on first run): ~14-16 GB
- **Total**: ~14-16 GB (first build)
### Subsequent Builds
- Models cached by HF Spaces
- Code only: ~550 KB
## 🔍 Verification
After deployment, verify these files exist:
```bash
# Check main files
ls -la Dockerfile README.md flask_api_standalone.py requirements.txt
# Check source directory
ls -la src/
ls -la src/agents/
# Verify key components
grep -n "local_model_loader" src/llm_router.py
grep -n "MAX_WORKERS" src/config.py
```
## 📝 Summary
**Minimum Required Files:**
1. `flask_api_standalone.py`
2. `Dockerfile` (from `Dockerfile.flask`)
3. `README.md` (from `README_FLASK_API.md`)
4. `requirements.txt`
5. All files in `src/` directory
**Total: ~15-20 files** (excluding documentation)
---
**Note**: This is a minimal deployment. All Gradio UI files, documentation, and test files are optional and can be excluded to reduce repository size.