# Compatibility Notes
## Critical Version Constraints
### Python
- **Python 3.9-3.11**: HF Spaces typically supports these versions
- Avoid Python 3.12+ for maximum compatibility
### PyTorch
- **PyTorch 2.1.x**: latest stable line with good HF ecosystem support
- Use CPU-only builds for ZeroGPU deployments
### Transformers
- **Transformers 4.35.x**: recent features with stability
- Ensures compatibility with current HF models
### Gradio
- **Gradio 4.x**: current major version with mobile optimizations
- Required for the mobile-responsive interface; a pinned requirements sketch follows below
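Taken together, the constraints above translate into a pinned requirements file. A minimal sketch with illustrative pins (the exact patch versions are assumptions; prefer the newest patch release inside each allowed range):
```text
# requirements.txt -- illustrative pins for a CPU-only HF Spaces deployment
torch==2.1.2             # prefer a CPU-only wheel (e.g. from the PyTorch CPU index) to cut image size
transformers==4.35.2
gradio==4.8.0            # any 4.x release with the mobile-responsive components
sentence-transformers==2.2.2
faiss-cpu==1.7.4         # never faiss-gpu on ZeroGPU
numpy==1.24.4
huggingface_hub>=0.19    # InferenceClient for remote LLM calls
```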
## HF Spaces Specific Considerations
### ZeroGPU Environment
- **Limited GPU memory**: CPU-optimized package versions are used
- All models run on CPU
- Use `faiss-cpu` instead of `faiss-gpu` (see the index sketch below)
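A minimal sketch of building a CPU-only FAISS index over 384-dimensional embeddings (the dimension matches `all-MiniLM-L6-v2`; the random vectors are placeholders for real embeddings):
```python
import faiss  # provided by the faiss-cpu package
import numpy as np

DIM = 384  # embedding size of all-MiniLM-L6-v2

# Placeholder data standing in for real sentence embeddings.
vectors = np.random.rand(1000, DIM).astype("float32")
faiss.normalize_L2(vectors)     # normalized vectors make inner product == cosine similarity

index = faiss.IndexFlatIP(DIM)  # exact inner-product search, CPU only
index.add(vectors)

scores, ids = index.search(vectors[:1], k=5)  # smoke test: query with the first vector
print(ids[0], scores[0])
```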
### Storage Limits
- **Limited persistent storage**: efficient caching is crucial
- Session data must be optimized for minimal storage usage
- Implement aggressive cleanup policies, as in the compression-plus-TTL sketch below
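One possible shape for that policy is zlib compression plus a TTL sweep. A minimal sketch, assuming an in-memory store and a 30-minute TTL (both are illustrative choices):
```python
import json
import time
import zlib
from typing import Optional

SESSION_TTL = 30 * 60  # assumed 30-minute time-to-live; tune per deployment

_sessions: dict = {}  # session_id -> (created_at, compressed JSON payload)

def save_session(session_id: str, data: dict) -> None:
    """Store session state compressed to keep the storage footprint small."""
    _sessions[session_id] = (time.time(), zlib.compress(json.dumps(data).encode()))

def load_session(session_id: str) -> Optional[dict]:
    entry = _sessions.get(session_id)
    if entry is None or time.time() - entry[0] > SESSION_TTL:
        _sessions.pop(session_id, None)  # expired or missing: clean up eagerly
        return None
    return json.loads(zlib.decompress(entry[1]))

def sweep_expired() -> None:
    """Aggressive cleanup pass; run periodically or on each request."""
    now = time.time()
    for sid in [s for s, (t, _) in _sessions.items() if now - t > SESSION_TTL]:
        del _sessions[sid]
```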
### Network Restrictions
- **May have restrictions on external API calls**
- All LLM calls must go through the Hugging Face Inference API
- Avoid other external HTTP requests in production (see the client sketch below)
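A minimal sketch of routing all generation through the HF Inference API via `huggingface_hub.InferenceClient` (the model ID is an illustrative assumption; the token comes from a Spaces secret):
```python
import os
from huggingface_hub import InferenceClient

client = InferenceClient(
    model="mistralai/Mistral-7B-Instruct-v0.2",  # assumed model; any hosted text-generation model works
    token=os.environ.get("HF_TOKEN"),            # Spaces secret, never hard-coded
)

def generate(prompt: str, max_new_tokens: int = 800) -> str:
    # One HTTPS call to HF infrastructure; no other outbound requests required.
    return client.text_generation(prompt, max_new_tokens=max_new_tokens)
```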
## Model Selection
### For ZeroGPU
- **Embedding model**: `sentence-transformers/all-MiniLM-L6-v2` (384-dim, fast; loading sketch below)
- **Primary LLM**: call an HF Inference API endpoint rather than hosting the model
- **Avoid local model loading** for large models
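The embedding model is small enough to load locally on CPU; a minimal loading sketch:
```python
from sentence_transformers import SentenceTransformer

# Small model, fast on CPU; downloaded once and cached by the HF Hub.
embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

texts = ["example query", "example document"]
embeddings = embedder.encode(texts, normalize_embeddings=True)
print(embeddings.shape)  # (2, 384)
```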
### Memory Optimization
- Limit concurrent requests
- Use streaming responses (see the sketch after this list)
- Implement response compression
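Streaming keeps only the partial response in memory and improves perceived latency. A minimal sketch built on the same InferenceClient (with `stream=True`, `text_generation` yields text chunks as they arrive):
```python
import os
from huggingface_hub import InferenceClient

client = InferenceClient(
    model="mistralai/Mistral-7B-Instruct-v0.2",  # assumed model, as above
    token=os.environ.get("HF_TOKEN"),
)

def generate_stream(prompt: str, max_new_tokens: int = 800):
    partial = ""
    # stream=True yields generated text chunk by chunk instead of one large string.
    for chunk in client.text_generation(prompt, max_new_tokens=max_new_tokens, stream=True):
        partial += chunk
        yield partial  # Gradio renders successive yields as progressive UI updates
```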
## Performance Considerations
### Cache Strategy
- In-memory caching for active sessions
- Aggressive cache eviction (LRU)
- TTL-based expiration; both behaviors are combined in the sketch below
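`cachetools.TTLCache` combines both policies: LRU eviction once `maxsize` entries are held, plus per-item expiry after `ttl` seconds. A minimal sketch (the size and TTL values are assumptions):
```python
from cachetools import TTLCache

# LRU eviction at 256 entries, 5-minute per-item expiry.
response_cache = TTLCache(maxsize=256, ttl=300)

def cached_answer(query: str, compute) -> str:
    if query in response_cache:
        return response_cache[query]  # hit: no model call needed
    result = compute(query)           # miss: compute and remember
    response_cache[query] = result
    return result
```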
### Mobile Optimization
- Reduced max tokens for mobile (800 vs 2000)
- Shorter timeout (15 s vs 30 s)
- Lazy loading of UI components; a device-aware config sketch follows
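One way to apply those limits is a per-device config chosen from the request's User-Agent. A minimal sketch using Gradio's `gr.Request` (the UA heuristic is a deliberately naive assumption):
```python
import gradio as gr

LIMITS = {
    "mobile":  {"max_tokens": 800,  "timeout_s": 15},
    "desktop": {"max_tokens": 2000, "timeout_s": 30},
}

def limits_for(request: gr.Request) -> dict:
    # Naive User-Agent sniff; errs toward the conservative mobile budget.
    ua = (request.headers.get("user-agent") or "").lower()
    is_mobile = any(k in ua for k in ("mobile", "android", "iphone"))
    return LIMITS["mobile" if is_mobile else "desktop"]
```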
## Dependencies Compatibility Matrix
| Package | Version Range | Notes |
|---------|---------------|-------|
| Python | 3.9-3.11 | HF Spaces supported versions |
| PyTorch | 2.1.x | CPU build |
| Transformers | 4.35.x | Latest stable |
| Gradio | 4.x | Mobile support |
| FAISS | CPU-only (`faiss-cpu`) | No GPU support |
| NumPy | 1.24.x | Compatibility layer |
## Known Issues & Workarounds
### Issue: FAISS GPU Not Available
**Solution**: Use `faiss-cpu` in requirements.txt
### Issue: Model Loading Memory
**Solution**: Use the HF Inference API instead of loading models locally
### Issue: Session Storage Limits
**Solution**: Implement data compression and TTL-based cleanup
### Issue: Concurrent Request Limits
**Solution**: Implement a request queue with a max_workers limit (see the queue sketch below)
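In Gradio 4.x the built-in queue provides this; a minimal sketch (the concurrency and backlog limits are assumptions to tune under load):
```python
import gradio as gr

def answer(message: str) -> str:
    return "..."  # placeholder handler

demo = gr.Interface(fn=answer, inputs="text", outputs="text")

# Queue requests instead of running them all at once:
# default_concurrency_limit caps concurrent workers, max_size bounds the waiting backlog.
demo.queue(default_concurrency_limit=2, max_size=16)
demo.launch()
```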
## Testing Recommendations
1. Test in the ZeroGPU environment before going to production
2. Verify memory usage stays under 512 MB (a check sketch follows the list)
3. Test mobile responsiveness
4. Validate cache efficiency (target: >60% hit rate)
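A minimal sketch of the memory check from item 2, using psutil (wire it into a test or a periodic log line):
```python
import psutil

MEMORY_BUDGET_MB = 512  # budget from recommendation 2 above

def check_memory_budget() -> float:
    rss_mb = psutil.Process().memory_info().rss / (1024 * 1024)
    assert rss_mb < MEMORY_BUDGET_MB, f"RSS {rss_mb:.0f} MB exceeds {MEMORY_BUDGET_MB} MB budget"
    return rss_mb
```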