# Flask API Only - Required Files List

This document lists all files needed for a **Flask API-only deployment** (no Gradio UI).

## 📋 Essential Files (Required)

### Core Application Files
```
Research_AI_Assistant/
├── flask_api_standalone.py          # Main Flask application (REQUIRED)
├── Dockerfile.flask                 # Dockerfile for Flask deployment (rename to Dockerfile)
├── README_FLASK_API.md              # README with HF Spaces frontmatter (rename to README.md)
└── requirements.txt                 # Python dependencies (REQUIRED)
```

### Source Code Directory (`src/`)
```
Research_AI_Assistant/src/
├── __init__.py                      # Package initialization
├── config.py                        # Configuration settings
├── llm_router.py                    # LLM routing (local GPU models)
├── local_model_loader.py            # GPU model loader (NEW - for local inference)
├── orchestrator_engine.py           # Main orchestrator
├── context_manager.py               # Context management
├── models_config.py                 # Model configurations
├── agents/
│   ├── __init__.py
│   ├── intent_agent.py              # Intent recognition agent
│   ├── synthesis_agent.py           # Response synthesis agent
│   ├── safety_agent.py              # Safety checking agent
│   └── skills_identification_agent.py # Skills identification agent
└── database.py                      # Database management (if used)
```

### Configuration Files (Optional but Recommended)
```
Research_AI_Assistant/
├── .env                             # Environment variables (optional, use HF Secrets instead)
└── .gitignore                       # Git ignore rules
```

## 📦 File Descriptions

### 1. `flask_api_standalone.py` ⭐ REQUIRED
- **Purpose**: Main Flask application entry point
- **Contains**: API endpoints, orchestrator initialization, request handling
- **Key Features**: 
  - Local GPU model loading
  - Async orchestrator support
  - Health checks
  - Error handling

### 2. `Dockerfile.flask` → `Dockerfile` ⭐ REQUIRED
- **Purpose**: Container configuration
- **Action**: Rename to `Dockerfile` when deploying
- **Includes**: Python 3.10, system dependencies, health checks

### 3. `README_FLASK_API.md` → `README.md` ⭐ REQUIRED
- **Purpose**: HF Spaces configuration and API documentation
- **Action**: Rename to `README.md` when deploying
- **Contains**: Frontmatter with `sdk: docker`, API endpoints, usage examples

### 4. `requirements.txt` ⭐ REQUIRED
- **Purpose**: Python package dependencies
- **Includes**: Flask, transformers, torch (GPU), sentence-transformers, etc.

### 5. `src/local_model_loader.py` ⭐ REQUIRED (NEW)
- **Purpose**: Loads models locally on GPU
- **Features**: GPU detection, model caching, FP16 optimization
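The three features above can be sketched roughly as follows. This is not the actual contents of `local_model_loader.py`: the names `select_device_and_dtype`, `get_model`, and the `load_fn` callback are hypothetical, and the sketch only assumes that PyTorch may or may not be importable.

```python
from typing import Any, Callable, Dict, Tuple


def select_device_and_dtype() -> Tuple[str, str]:
    """Return (device, dtype name); FP16 is chosen only when a CUDA GPU is visible."""
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return "cpu", "float32"
    if torch.cuda.is_available():
        return "cuda", "float16"
    return "cpu", "float32"


# Hypothetical in-process cache: one loaded model object per model ID.
_MODEL_CACHE: Dict[str, Any] = {}


def get_model(model_id: str, load_fn: Callable[[str, str, str], Any]) -> Any:
    """Load a model once per model_id and reuse the cached object afterwards."""
    if model_id not in _MODEL_CACHE:
        device, dtype = select_device_and_dtype()
        _MODEL_CACHE[model_id] = load_fn(model_id, device, dtype)
    return _MODEL_CACHE[model_id]
```

Passing the loader in as `load_fn` keeps the sketch independent of any particular model library; a real loader would call its framework's own loading function there.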

### 6. `src/llm_router.py` ⭐ REQUIRED (UPDATED)
- **Purpose**: Routes inference requests
- **Features**: Tries local models first, falls back to HF API
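A minimal sketch of the local-first, remote-fallback pattern described above. The function name `route_inference` and both callables are hypothetical stand-ins; the real router's interface is not shown in this document.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def route_inference(
    prompt: str,
    local_generate: Optional[Callable[[str], str]],
    remote_generate: Callable[[str], str],
) -> str:
    """Try the local model first; fall back to the remote API if it is missing or fails."""
    if local_generate is not None:
        try:
            return local_generate(prompt)
        except Exception as exc:  # any local failure triggers the fallback
            logger.warning("Local inference failed, falling back to remote API: %s", exc)
    return remote_generate(prompt)
```

Logging the local failure rather than swallowing it makes it visible when a Space is silently running on the fallback path.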

### 7. `src/orchestrator_engine.py` ⭐ REQUIRED
- **Purpose**: Main AI orchestration engine
- **Contains**: Agent coordination, request processing

### 8. `src/context_manager.py` ⭐ REQUIRED
- **Purpose**: Manages conversation context
- **Features**: Session management, context retrieval

### 9. `src/agents/*.py` ⭐ REQUIRED
- **Purpose**: Individual AI agents
- **Agents**: Intent, Synthesis, Safety, Skills Identification

### 10. `src/config.py` ⭐ REQUIRED
- **Purpose**: Application configuration
- **Settings**: MAX_WORKERS=4, model paths, etc.
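One way such settings could be organized is sketched below. Only `MAX_WORKERS=4` comes from this document; the `Settings` class, the `MODEL_CACHE_DIR` variable, and its default path are illustrative assumptions, not the actual `config.py`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    max_workers: int = 4                 # default taken from this document
    model_cache_dir: str = "./models"    # hypothetical setting and path


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, falling back to the defaults."""
    env = os.environ if env is None else env
    return Settings(
        max_workers=int(env.get("MAX_WORKERS", "4")),
        model_cache_dir=env.get("MODEL_CACHE_DIR", "./models"),
    )
```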

## ❌ Files NOT Needed (Gradio/UI Related)

These files can be **excluded** from Flask API deployment:

```
Research_AI_Assistant/
├── app.py                           # Gradio UI (NOT NEEDED)
├── main.py                          # Gradio + Flask launcher (NOT NEEDED)
├── flask_api.py                     # Flask API (use flask_api_standalone.py instead)
├── Dockerfile                       # Main Dockerfile (replaced by a copy of Dockerfile.flask)
├── Dockerfile.hf                    # Alternative Dockerfile (NOT NEEDED)
├── README.md                        # Main README (replaced by a copy of README_FLASK_API.md)
└── Other .md files                  # Documentation (optional; keep README_FLASK_API.md)
```

## 🚀 Quick Deployment Checklist

### Step 1: Prepare Files
```bash
# In your Flask API Space directory:
cp Dockerfile.flask Dockerfile
cp README_FLASK_API.md README.md
```

### Step 2: Verify Structure
```
Your Space/
├── Dockerfile                        # ✅ Renamed from Dockerfile.flask
├── README.md                         # ✅ Renamed from README_FLASK_API.md
├── flask_api_standalone.py           # ✅ Main Flask app
├── requirements.txt                  # ✅ Dependencies
└── src/                              # ✅ All source files
    ├── __init__.py
    ├── config.py
    ├── llm_router.py
    ├── local_model_loader.py
    ├── orchestrator_engine.py
    ├── context_manager.py
    ├── models_config.py
    └── agents/
        ├── __init__.py
        ├── intent_agent.py
        ├── synthesis_agent.py
        ├── safety_agent.py
        └── skills_identification_agent.py
```
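The structure above can also be checked programmatically. The following is a small sketch, not part of the repository: the helper name `missing_files` is hypothetical, and the file list simply mirrors the tree in this step.

```python
from pathlib import Path
from typing import List

# Paths copied from the Step 2 tree, relative to the Space root.
REQUIRED_FILES = [
    "Dockerfile",
    "README.md",
    "flask_api_standalone.py",
    "requirements.txt",
    "src/__init__.py",
    "src/config.py",
    "src/llm_router.py",
    "src/local_model_loader.py",
    "src/orchestrator_engine.py",
    "src/context_manager.py",
    "src/models_config.py",
    "src/agents/__init__.py",
    "src/agents/intent_agent.py",
    "src/agents/synthesis_agent.py",
    "src/agents/safety_agent.py",
    "src/agents/skills_identification_agent.py",
]


def missing_files(root: str = ".") -> List[str]:
    """Return the required paths that do not exist as files under root."""
    base = Path(root)
    return [rel for rel in REQUIRED_FILES if not (base / rel).is_file()]
```

Calling `missing_files(".")` from the Space root returns an empty list when every file in the tree is present.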

### Step 3: Set Environment Variables
In HF Spaces Settings → Secrets:
- `HF_TOKEN` - Your Hugging Face token
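Secrets set this way reach the running container as environment variables. A minimal sketch of reading the token with a clear failure message follows; the helper name `get_hf_token` is hypothetical and the application may read the variable differently.

```python
import os
from typing import Mapping, Optional


def get_hf_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return HF_TOKEN from the environment, or raise if it is missing or blank."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; add it under Settings > Secrets in the Space")
    return token
```

Failing at startup with an explicit message is usually easier to debug than an authentication error on the first model download.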

### Step 4: Deploy
- Select **NVIDIA T4 Medium** GPU
- Set **SDK: docker**
- Deploy

## 📊 File Size Considerations

### Minimal Deployment (Essential Only)
- Core files: ~50 KB
- Source code: ~500 KB
- **Total**: ~550 KB code

### With Models (First Load)
- Code: ~550 KB
- Models (downloaded on first run): ~14-16 GB
- **Total**: ~14-16 GB (after first run)

### Subsequent Builds
- Models cached by HF Spaces
- Code only: ~550 KB

## 🔍 Verification

After deployment, verify these files exist:

```bash
# Check main files
ls -la Dockerfile README.md flask_api_standalone.py requirements.txt

# Check source directory
ls -la src/
ls -la src/agents/

# Verify key components (each command should print at least one matching line)
grep -n "local_model_loader" src/llm_router.py
grep -n "MAX_WORKERS" src/config.py
```

## 📝 Summary

**Minimum Required Files:**
1. `flask_api_standalone.py`
2. `Dockerfile` (from Dockerfile.flask)
3. `README.md` (from README_FLASK_API.md)
4. `requirements.txt`
5. All files in `src/` directory

**Total: ~15-20 files** (excluding documentation)

---

**Note**: This is a minimal deployment. All Gradio UI files, documentation, and test files are optional and can be excluded to reduce repository size.