# Compatibility Notes
## Critical Version Constraints
### Python
- **Python 3.9-3.11**: the range HF Spaces reliably supports
- Avoid Python 3.12+ for maximum compatibility
### PyTorch
- **PyTorch 2.1.x**: stable release line with solid HF ecosystem support
- CPU-only builds for ZeroGPU deployments
### Transformers
- **Transformers 4.35.x**: recent feature set with proven stability
- Keeps compatibility with current HF model releases
### Gradio
- **Gradio 4.x**: Current major version with mobile optimizations
- Required for mobile-responsive interface
## HF Spaces Specific Considerations
### ZeroGPU Environment
- **Limited GPU memory**: CPU-optimized package builds are used instead
- All models run on CPU
- Use `faiss-cpu` instead of `faiss-gpu`
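Device selection can be written defensively so the same code runs on ZeroGPU (CPU) and on GPU hardware without changes. A minimal sketch, assuming `torch` is installed in the Space (the fallback branch only exists so the snippet degrades gracefully elsewhere):

```python
def pick_device() -> str:
    """Return "cuda" if a GPU is visible, else "cpu".

    On ZeroGPU Spaces this resolves to "cpu"; the lazy import keeps
    startup cheap and lets the function run even where torch is absent.
    """
    try:
        import torch
        return "cuda" if torch.cuda.is_available() else "cpu"
    except ImportError:
        return "cpu"  # no torch available: CPU is the only option
```

Models and tensors can then be created with `device=pick_device()` rather than hard-coding either backend.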
### Storage Limits
- **Limited persistent storage**: Efficient caching is crucial
- Session data must be optimized for minimal storage usage
- Implement aggressive cleanup policies
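One way to realize TTL-based cleanup is a small in-memory session store that evicts entries both on access and via a periodic sweep. This is a sketch only; the class name and the 30-minute default TTL are illustrative assumptions, not values from this project's config:

```python
import time


class SessionStore:
    """In-memory session store with TTL-based cleanup (illustrative sketch)."""

    def __init__(self, ttl_seconds: float = 1800.0):
        self.ttl = ttl_seconds
        self._data: dict[str, tuple[float, object]] = {}

    def put(self, session_id: str, value: object) -> None:
        self._data[session_id] = (time.monotonic(), value)

    def get(self, session_id: str):
        entry = self._data.get(session_id)
        if entry is None:
            return None
        created, value = entry
        if time.monotonic() - created > self.ttl:
            del self._data[session_id]  # aggressive cleanup on access
            return None
        return value

    def sweep(self) -> int:
        """Delete all expired sessions; return how many were evicted."""
        now = time.monotonic()
        expired = [k for k, (t, _) in self._data.items() if now - t > self.ttl]
        for k in expired:
            del self._data[k]
        return len(expired)
```

Calling `sweep()` from a background task (or at the start of each request) keeps storage bounded even for abandoned sessions.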
### Network Restrictions
- **Outbound API calls may be restricted** in the Spaces runtime
- All LLM calls must use Hugging Face Inference API
- Avoid external HTTP requests in production
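Routing all LLM traffic through the HF Inference API can be done with `huggingface_hub.InferenceClient`. A minimal sketch; the model id below is an illustrative assumption, not this project's configured model:

```python
def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Call an LLM via the Hugging Face Inference API (no other external HTTP)."""
    # Lazy import: huggingface_hub is assumed installed in the Space.
    from huggingface_hub import InferenceClient

    # Model id is a placeholder assumption for this sketch.
    client = InferenceClient(model="mistralai/Mistral-7B-Instruct-v0.2")
    return client.text_generation(prompt, max_new_tokens=max_new_tokens)
```

Keeping the client behind one function makes it easy to enforce "HF Inference API only" in code review.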
## Model Selection
### For ZeroGPU
- **Embedding model**: `sentence-transformers/all-MiniLM-L6-v2` (384d, fast)
- **Primary LLM**: Use HF Inference API endpoint calls
- **Avoid local model loading** for large models
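The one local model that is loaded (the small MiniLM embedder) can be loaded lazily and cached, so Space startup stays fast and memory is only spent on first use. A sketch, assuming `sentence-transformers` is installed; `encode` returns one 384-dimensional vector per input text for this model:

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_embedder():
    # Lazy import + cached load: the model enters memory only on first call.
    from sentence_transformers import SentenceTransformer  # assumed installed
    return SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")


def embed(texts: list[str]):
    """Embed a batch of texts into 384-d vectors (normalized for cosine/FAISS)."""
    return get_embedder().encode(texts, normalize_embeddings=True)
```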
### Memory Optimization
- Limit concurrent requests
- Use streaming responses
- Implement response compression
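The streaming and compression bullets can each be sketched with the standard library; the 64-byte chunk size is an illustrative value, not a project setting:

```python
import zlib


def stream_response(text: str, chunk_size: int = 64):
    """Yield a long response in small chunks instead of one large string.

    Streaming keeps peak memory low and lets the UI render partial output.
    """
    for i in range(0, len(text), chunk_size):
        yield text[i:i + chunk_size]


def compress_response(text: str) -> bytes:
    """Compress a response before caching or storing it."""
    return zlib.compress(text.encode("utf-8"))
```

A cached entry can be stored compressed and inflated with `zlib.decompress` on a cache hit.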
## Performance Considerations
### Cache Strategy
- In-memory caching for active sessions
- Aggressive cache eviction (LRU)
- TTL-based expiration
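LRU eviction and TTL expiration combine naturally in one structure built on `collections.OrderedDict`. A stdlib sketch; the default sizes are illustrative assumptions:

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """In-memory cache with LRU eviction and TTL expiration (sketch)."""

    def __init__(self, maxsize: int = 128, ttl: float = 300.0):
        self.maxsize, self.ttl = maxsize, ttl
        self._items: OrderedDict = OrderedDict()

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        ts, value = entry
        if time.monotonic() - ts > self.ttl:
            del self._items[key]          # TTL-based expiration
            return default
        self._items.move_to_end(key)      # mark as recently used
        return value

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = (time.monotonic(), value)
        while len(self._items) > self.maxsize:
            self._items.popitem(last=False)  # evict least recently used
```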
### Mobile Optimization
- Reduced max tokens for mobile (800 vs 2000)
- Shorter timeout (15s vs 30s)
- Lazy loading of UI components
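The token and timeout figures above can live in a small per-client config; the user-agent sniff below is a deliberately naive illustration (a real app would use a proper UA parser):

```python
# Limits per client type; the 800/2000-token and 15/30-second figures
# come from this document's mobile-optimization notes.
LIMITS = {
    "mobile":  {"max_tokens": 800,  "timeout_s": 15},
    "desktop": {"max_tokens": 2000, "timeout_s": 30},
}


def limits_for(user_agent: str) -> dict:
    # Naive sniff: treat common mobile markers as "mobile", else "desktop".
    is_mobile = any(tag in user_agent for tag in ("Mobile", "Android", "iPhone"))
    return LIMITS["mobile" if is_mobile else "desktop"]
```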
## Dependencies Compatibility Matrix
| Package | Version Range | Notes |
|---------|---------------|-------|
| Python | 3.9-3.11 | HF Spaces supported versions |
| PyTorch | 2.1.x | CPU version |
| Transformers | 4.35.x | Latest stable |
| Gradio | 4.x | Mobile support |
| FAISS | CPU-only | No GPU support |
| NumPy | 1.24.x | Pinned to 1.x; compiled deps predate NumPy 2.0 |
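The matrix above translates to a `requirements.txt` along these lines (a sketch: the exact pin syntax and the inclusion of `sentence-transformers` for the embedding model are assumptions, not the project's actual file):

```text
torch==2.1.*
transformers==4.35.*
gradio>=4.0,<5.0
faiss-cpu
numpy==1.24.*
sentence-transformers
```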
## Known Issues & Workarounds
### Issue: FAISS GPU Not Available
**Solution**: Use `faiss-cpu` in requirements.txt
### Issue: Model Loading Memory
**Solution**: Use HF Inference API instead of local loading
### Issue: Session Storage Limits
**Solution**: Implement data compression and TTL-based cleanup
### Issue: Concurrent Request Limits
**Solution**: Implement request queue with max_workers limit
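A bounded `ThreadPoolExecutor` doubles as the request queue: submissions beyond `max_workers` wait in the executor's internal queue instead of spawning unbounded concurrent work. A sketch with an illustrative worker count and a placeholder handler:

```python
from concurrent.futures import ThreadPoolExecutor

# max_workers=2 is an illustrative value; tune it to the Space's CPU budget.
executor = ThreadPoolExecutor(max_workers=2)


def handle_request(prompt: str) -> str:
    # Placeholder for the real inference call.
    return prompt.upper()


futures = [executor.submit(handle_request, p) for p in ("a", "b", "c", "d")]
results = [f.result() for f in futures]  # preserves submission order
```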
## Testing Recommendations
1. Test on ZeroGPU environment before production
2. Verify memory usage stays under 512MB
3. Test mobile responsiveness
4. Validate cache efficiency (target: >60% hit rate)
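The cache-efficiency check in step 4 can be automated with a tiny helper; the 60% threshold comes from this list, and the counts in the example assertion are hypothetical:

```python
def hit_rate(hits: int, misses: int) -> float:
    """Fraction of lookups served from cache; 0.0 when there were none."""
    total = hits + misses
    return hits / total if total else 0.0


# Hypothetical counters from a test run, checked against the >60% target.
assert hit_rate(130, 70) > 0.60
```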