Space: Depth Anything 3. Generate depth maps from images using GPU acceleration.
Post: deepseek-ai/DeepSeek-OCR is out! 🔥 my take ⤵️
> pretty insane it can parse and re-render charts in HTML
> it uses CLIP and SAM features concatenated, so better grounding
> very efficient per vision tokens/performance ratio
> covers 100 languages
Paper: Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation • 2411.19331 • Published Nov 28, 2024
Post: Just released a preview of Moondream 3! moondream/moondream3-preview
This is a 9B parameter, 2B active MoE VLM with state-of-the-art visual reasoning capabilities.
More details in the release blog post: https://moondream.ai/blog/moondream-3-preview
Article: AtlasOCR: Building the First Open-Source Darija OCR Model with Vision Language Models • Sep 16