# Compatibility Notes

## Critical Version Constraints

### Python

- Python 3.9-3.11: the versions HF Spaces typically supports
- Avoid Python 3.12+ for maximum compatibility

### PyTorch

- PyTorch 2.1.x: latest stable release with good HF ecosystem support
- Use CPU-only builds for ZeroGPU deployments (see the snippet below)
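A small startup guard that makes the CPU-only assumption explicit; a minimal sketch, and the thread cap is an illustrative value, not a project setting:

```python
import torch

# ZeroGPU Spaces run inference on CPU, so pin the device explicitly
# instead of auto-detecting CUDA.
DEVICE = torch.device("cpu")
torch.set_num_threads(2)  # illustrative cap for a small shared Space
```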

### Transformers

- Transformers 4.35.x: recent feature set with stability
- Ensures compatibility with current HF models

### Gradio

- Gradio 4.x: current major version with mobile optimizations
- Required for the mobile-responsive interface

## HF Spaces-Specific Considerations

### ZeroGPU Environment

- Limited GPU memory: CPU-optimized builds are used
- All models run on CPU
- Use faiss-cpu instead of faiss-gpu

### Storage Limits

- Limited persistent storage: efficient caching is crucial
- Session data must be optimized for minimal storage usage
- Implement aggressive cleanup policies

### Network Restrictions

- External API calls may be restricted
- All LLM calls must go through the Hugging Face Inference API (see the sketch below)
- Avoid other external HTTP requests in production
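A minimal sketch of routing generation through the Inference API with `huggingface_hub`; the model ID and parameter values are illustrative assumptions, not this project's configuration:

```python
from huggingface_hub import InferenceClient

# Hypothetical model choice; substitute whichever hosted model the Space targets.
client = InferenceClient(model="mistralai/Mistral-7B-Instruct-v0.2")

def generate(prompt: str, max_new_tokens: int = 800) -> str:
    # All generation happens on HF infrastructure, so no model
    # weights are ever loaded into the Space itself.
    return client.text_generation(prompt, max_new_tokens=max_new_tokens)
```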

## Model Selection

### For ZeroGPU

- Embedding model: sentence-transformers/all-MiniLM-L6-v2 (384-dim, fast); see the sketch below
- Primary LLM: use HF Inference API endpoint calls
- Avoid loading large models locally
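A sketch of the CPU-only embed-and-index path this implies, assuming `sentence-transformers` and `faiss-cpu` are installed; the corpus and query are placeholders:

```python
import faiss
from sentence_transformers import SentenceTransformer

# all-MiniLM-L6-v2 yields 384-dimensional embeddings and runs quickly on CPU.
embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

docs = ["first passage", "second passage"]  # placeholder corpus
vectors = embedder.encode(docs, normalize_embeddings=True)

# faiss-cpu flat inner-product index; with normalized vectors this is cosine similarity.
index = faiss.IndexFlatIP(vectors.shape[1])  # dimension = 384
index.add(vectors)

query = embedder.encode(["example query"], normalize_embeddings=True)
scores, ids = index.search(query, k=2)
```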

### Memory Optimization

- Limit concurrent requests
- Use streaming responses (see the sketch below)
- Implement response compression
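One way to stream tokens from the Inference API instead of buffering whole responses; the client setup repeats the hypothetical model above:

```python
from huggingface_hub import InferenceClient

client = InferenceClient(model="mistralai/Mistral-7B-Instruct-v0.2")  # hypothetical model

def stream_answer(prompt: str, max_new_tokens: int = 800):
    # Yield tokens as they arrive, keeping peak memory low and
    # improving perceived latency on slow mobile connections.
    for token in client.text_generation(prompt, max_new_tokens=max_new_tokens, stream=True):
        yield token
```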

## Performance Considerations

### Cache Strategy

- In-memory caching for active sessions
- Aggressive cache eviction (LRU)
- TTL-based expiration (a combined LRU + TTL sketch follows)
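A minimal in-memory cache combining LRU eviction with TTL expiry; the size and TTL defaults are illustrative, not the project's actual settings:

```python
import time
from collections import OrderedDict

class TTLLRUCache:
    def __init__(self, max_items: int = 256, ttl_seconds: float = 600.0):
        self.max_items = max_items
        self.ttl = ttl_seconds
        self._store = OrderedDict()  # key -> (expires_at, value)

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() > expires_at:
            del self._store[key]          # TTL expiration
            return None
        self._store.move_to_end(key)      # mark as recently used
        return value

    def put(self, key: str, value) -> None:
        self._store[key] = (time.monotonic() + self.ttl, value)
        self._store.move_to_end(key)
        while len(self._store) > self.max_items:
            self._store.popitem(last=False)  # evict least recently used
```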

### Mobile Optimization

- Reduced max tokens for mobile clients (800 vs. 2000)
- Shorter timeout (15 s vs. 30 s); a config sketch follows
- Lazy loading of UI components
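A small profile object capturing the mobile vs. desktop limits above; the selection helper is hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GenerationProfile:
    max_tokens: int
    timeout_seconds: int

# Limits taken from the list above.
MOBILE = GenerationProfile(max_tokens=800, timeout_seconds=15)
DESKTOP = GenerationProfile(max_tokens=2000, timeout_seconds=30)

def profile_for(is_mobile: bool) -> GenerationProfile:
    # Hypothetical helper; detecting mobile clients is out of scope here.
    return MOBILE if is_mobile else DESKTOP
```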

## Dependencies Compatibility Matrix

| Package      | Version Range | Notes                        |
|--------------|---------------|------------------------------|
| Python       | 3.9-3.11      | HF Spaces supported versions |
| PyTorch      | 2.1.x         | CPU version                  |
| Transformers | 4.35.x        | Latest stable                |
| Gradio       | 4.x           | Mobile support               |
| FAISS        | CPU-only      | No GPU support               |
| NumPy        | 1.24.x        | Compatibility layer          |
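A requirements.txt consistent with this matrix might look like the following; the exact pins are illustrative, not the project's actual file:

```text
torch==2.1.*
transformers==4.35.*
gradio>=4.0,<5.0
faiss-cpu
numpy==1.24.*
```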

## Known Issues & Workarounds

### Issue: FAISS GPU Not Available

**Solution:** Use faiss-cpu in requirements.txt.

### Issue: Model Loading Memory

**Solution:** Use the HF Inference API instead of loading models locally.

### Issue: Session Storage Limits

**Solution:** Implement data compression and TTL-based cleanup, as sketched below.
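A sketch of compressing session state before persisting it and dropping stale sessions on read; the field names and TTL default are illustrative:

```python
import json
import time
import zlib
from typing import Optional

def pack_session(state: dict) -> bytes:
    # Compress JSON-serialized session data to reduce persistent storage use.
    payload = json.dumps({"saved_at": time.time(), "state": state})
    return zlib.compress(payload.encode("utf-8"))

def unpack_session(blob: bytes, ttl_seconds: float = 3600.0) -> Optional[dict]:
    data = json.loads(zlib.decompress(blob).decode("utf-8"))
    if time.time() - data["saved_at"] > ttl_seconds:
        return None  # TTL-based cleanup: treat stale sessions as expired
    return data["state"]
```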

### Issue: Concurrent Request Limits

**Solution:** Implement a request queue with a max_workers limit, as sketched below.
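A bounded worker pool that queues excess requests instead of running them all at once; max_workers=2 is an illustrative value, not a measured optimum:

```python
from concurrent.futures import ThreadPoolExecutor

# Requests beyond max_workers wait in the executor's internal queue,
# capping concurrent memory use on ZeroGPU.
executor = ThreadPoolExecutor(max_workers=2)

def submit_request(handler, *args):
    return executor.submit(handler, *args)

# Usage: future = submit_request(generate, "some prompt")
#        result = future.result(timeout=30)
```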

## Testing Recommendations

1. Test in the ZeroGPU environment before going to production
2. Verify memory usage stays under 512 MB (a quick check follows)
3. Test mobile responsiveness
4. Validate cache efficiency (target: >60% hit rate)
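A quick way to check item 2 during testing, assuming `psutil` is available in the Space:

```python
import psutil

def rss_megabytes() -> float:
    # Resident set size of the current process, in MB.
    return psutil.Process().memory_info().rss / (1024 * 1024)

assert rss_megabytes() < 512, f"memory budget exceeded: {rss_megabytes():.0f} MB"
```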