# Compatibility Notes

## Critical Version Constraints

### Python

- Python 3.9-3.11: the versions HF Spaces typically supports
- Avoid Python 3.12+ for maximum compatibility

### PyTorch

- PyTorch 2.1.x: latest stable release with good HF ecosystem support
- Use CPU-only builds for ZeroGPU deployments (see the snippet below)
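A small startup guard that makes the CPU-only assumption explicit; a minimal sketch, and the thread cap is an illustrative value, not a project setting:

```python
import torch

# ZeroGPU Spaces run inference on CPU, so pin the device explicitly
# instead of auto-detecting CUDA.
DEVICE = torch.device("cpu")
torch.set_num_threads(2)  # illustrative cap for a small shared Space
```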

### Transformers

- Transformers 4.35.x: recent feature set with stability
- Ensures compatibility with current HF models

### Gradio

- Gradio 4.x: current major version with mobile optimizations
- Required for the mobile-responsive interface

## HF Spaces-Specific Considerations

### ZeroGPU Environment

- Limited GPU memory: CPU-optimized builds are used
- All models run on CPU
- Use faiss-cpu instead of faiss-gpu

### Storage Limits

- Limited persistent storage: efficient caching is crucial
- Session data must be optimized for minimal storage usage
- Implement aggressive cleanup policies

### Network Restrictions

- External API calls may be restricted
- All LLM calls must go through the Hugging Face Inference API (see the sketch below)
- Avoid other external HTTP requests in production
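A minimal sketch of routing generation through the Inference API with `huggingface_hub`; the model ID and parameter values are illustrative assumptions, not this project's configuration:

```python
from huggingface_hub import InferenceClient

# Hypothetical model choice; substitute whichever hosted model the Space targets.
client = InferenceClient(model="mistralai/Mistral-7B-Instruct-v0.2")

def generate(prompt: str, max_new_tokens: int = 800) -> str:
    # All generation happens on HF infrastructure, so no model
    # weights are ever loaded into the Space itself.
    return client.text_generation(prompt, max_new_tokens=max_new_tokens)
```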

## Model Selection

### For ZeroGPU

- Embedding model: sentence-transformers/all-MiniLM-L6-v2 (384-dim, fast); see the sketch below
- Primary LLM: use HF Inference API endpoint calls
- Avoid loading large models locally
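A sketch of the CPU-only embed-and-index path this implies, assuming `sentence-transformers` and `faiss-cpu` are installed; the corpus and query are placeholders:

```python
import faiss
from sentence_transformers import SentenceTransformer

# all-MiniLM-L6-v2 yields 384-dimensional embeddings and runs quickly on CPU.
embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

docs = ["first passage", "second passage"]  # placeholder corpus
vectors = embedder.encode(docs, normalize_embeddings=True)

# faiss-cpu flat inner-product index; with normalized vectors this is cosine similarity.
index = faiss.IndexFlatIP(vectors.shape[1])  # dimension = 384
index.add(vectors)

query = embedder.encode(["example query"], normalize_embeddings=True)
scores, ids = index.search(query, k=2)
```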

### Memory Optimization

- Limit concurrent requests
- Use streaming responses (see the sketch below)
- Implement response compression
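One way to stream tokens from the Inference API instead of buffering whole responses; the client setup repeats the hypothetical model above:

```python
from huggingface_hub import InferenceClient

client = InferenceClient(model="mistralai/Mistral-7B-Instruct-v0.2")  # hypothetical model

def stream_answer(prompt: str, max_new_tokens: int = 800):
    # Yield tokens as they arrive, keeping peak memory low and
    # improving perceived latency on slow mobile connections.
    for token in client.text_generation(prompt, max_new_tokens=max_new_tokens, stream=True):
        yield token
```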

## Performance Considerations

### Cache Strategy

- In-memory caching for active sessions
- Aggressive cache eviction (LRU)
- TTL-based expiration (a combined LRU + TTL sketch follows)
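A minimal in-memory cache combining LRU eviction with TTL expiry; the size and TTL defaults are illustrative, not the project's actual settings:

```python
import time
from collections import OrderedDict

class TTLLRUCache:
    def __init__(self, max_items: int = 256, ttl_seconds: float = 600.0):
        self.max_items = max_items
        self.ttl = ttl_seconds
        self._store = OrderedDict()  # key -> (expires_at, value)

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() > expires_at:
            del self._store[key]          # TTL expiration
            return None
        self._store.move_to_end(key)      # mark as recently used
        return value

    def put(self, key: str, value) -> None:
        self._store[key] = (time.monotonic() + self.ttl, value)
        self._store.move_to_end(key)
        while len(self._store) > self.max_items:
            self._store.popitem(last=False)  # evict least recently used
```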

### Mobile Optimization

- Reduced max tokens for mobile clients (800 vs. 2000)
- Shorter timeout (15 s vs. 30 s); a config sketch follows
- Lazy loading of UI components
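A small profile object capturing the mobile vs. desktop limits above; the selection helper is hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GenerationProfile:
    max_tokens: int
    timeout_seconds: int

# Limits taken from the list above.
MOBILE = GenerationProfile(max_tokens=800, timeout_seconds=15)
DESKTOP = GenerationProfile(max_tokens=2000, timeout_seconds=30)

def profile_for(is_mobile: bool) -> GenerationProfile:
    # Hypothetical helper; detecting mobile clients is out of scope here.
    return MOBILE if is_mobile else DESKTOP
```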

## Dependencies Compatibility Matrix

| Package      | Version Range | Notes                        |
|--------------|---------------|------------------------------|
| Python       | 3.9-3.11      | HF Spaces supported versions |
| PyTorch      | 2.1.x         | CPU version                  |
| Transformers | 4.35.x        | Latest stable                |
| Gradio       | 4.x           | Mobile support               |
| FAISS        | CPU-only      | No GPU support               |
| NumPy        | 1.24.x        | Compatibility layer          |
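A requirements.txt consistent with this matrix might look like the following; the exact pins are illustrative, not the project's actual file:

```text
torch==2.1.*
transformers==4.35.*
gradio>=4.0,<5.0
faiss-cpu
numpy==1.24.*
```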

## Known Issues & Workarounds

### Issue: FAISS GPU Not Available

**Solution:** Use faiss-cpu in requirements.txt.

### Issue: Model Loading Memory

**Solution:** Use the HF Inference API instead of loading models locally.

### Issue: Session Storage Limits

**Solution:** Implement data compression and TTL-based cleanup, as sketched below.
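A sketch of compressing session state before persisting it and dropping stale sessions on read; the field names and TTL default are illustrative:

```python
import json
import time
import zlib
from typing import Optional

def pack_session(state: dict) -> bytes:
    # Compress JSON-serialized session data to reduce persistent storage use.
    payload = json.dumps({"saved_at": time.time(), "state": state})
    return zlib.compress(payload.encode("utf-8"))

def unpack_session(blob: bytes, ttl_seconds: float = 3600.0) -> Optional[dict]:
    data = json.loads(zlib.decompress(blob).decode("utf-8"))
    if time.time() - data["saved_at"] > ttl_seconds:
        return None  # TTL-based cleanup: treat stale sessions as expired
    return data["state"]
```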

### Issue: Concurrent Request Limits

**Solution:** Implement a request queue with a max_workers limit, as sketched below.
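A bounded worker pool that queues excess requests instead of running them all at once; max_workers=2 is an illustrative value, not a measured optimum:

```python
from concurrent.futures import ThreadPoolExecutor

# Requests beyond max_workers wait in the executor's internal queue,
# capping concurrent memory use on ZeroGPU.
executor = ThreadPoolExecutor(max_workers=2)

def submit_request(handler, *args):
    return executor.submit(handler, *args)

# Usage: future = submit_request(generate, "some prompt")
#        result = future.result(timeout=30)
```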

## Testing Recommendations

1. Test in the ZeroGPU environment before going to production
2. Verify memory usage stays under 512 MB (a quick check follows)
3. Test mobile responsiveness
4. Validate cache efficiency (target: >60% hit rate)
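A quick way to check item 2 during testing, assuming `psutil` is available in the Space:

```python
import psutil

def rss_megabytes() -> float:
    # Resident set size of the current process, in MB.
    return psutil.Process().memory_info().rss / (1024 * 1024)

assert rss_megabytes() < 512, f"memory budget exceeded: {rss_megabytes():.0f} MB"
```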