Models: 2,385

Active filters: multimodal

jinaai/jina-vlm

Image-Text-to-Text • 2B • Updated 5 days ago • 1.29k • 63

microsoft/Fara-7B

Image-Text-to-Text • 8B • Updated 9 days ago • 33.1k • 432

stepfun-ai/GELab-Zero-4B-preview

Image-to-Text • 4B • Updated 9 days ago • 796 • 92

Qwen/Qwen2.5-VL-7B-Instruct

Image-Text-to-Text • 8B • Updated Apr 6 • 3.31M • 1.38k

Cognitive-Lab/NetraEmbed

Visual Document Retrieval • 4B • Updated 1 day ago • 334 • 15

Qwen/Qwen3-Omni-30B-A3B-Instruct

Any-to-Any • 35B • Updated Sep 22 • 284k • 745

OctoMed/OctoMed-7B

Image-Text-to-Text • 8B • Updated 3 days ago • 218 • 10

Qwen/Qwen2.5-VL-3B-Instruct

Image-Text-to-Text • 4B • Updated Apr 6 • 7.73M • 572

ByteDance-Seed/UI-TARS-1.5-7B

Image-Text-to-Text • 8B • Updated Apr 18 • 197k • 446

jinaai/jina-clip-v2

Feature Extraction • 0.9B • Updated Apr 28 • 190k • 297

omlab/VLM-FO1_Qwen2.5-VL-3B-v01

Object Detection • 4B • Updated 12 days ago • 1.92k • 12

xuemduan/reevaluate-clip

0.4B • Updated 5 days ago • 109 • 6

IDEA-Research/Rex-Omni

Image-Text-to-Text • 4B • Updated Oct 16 • 24.9k • 48

bytedance-research/Vidi-7B

9B • Updated 19 days ago • 470 • 8

cpatonn/Qwen3-Omni-30B-A3B-Instruct-AWQ-4bit

Any-to-Any • 10B • Updated Sep 28 • 24.9k • 33

ZJU-AI4H/Hulu-Med-7B

Image-Text-to-Text • 8B • Updated 13 days ago • 7.53k • 46

ByteDance/Dolphin-1.5

Image-Text-to-Text • 0.4B • Updated 28 days ago • 1.68k • 32

Qwen/Qwen2.5-Omni-7B

Any-to-Any • 11B • Updated Apr 30 • 143k • 1.83k

unsloth/Qwen2.5-Omni-7B-GGUF

Any-to-Any • 8B • Updated May 28 • 10.4k • 48

OpenGVLab/VideoChat-R1_5-7B

Video-Text-to-Text • 8B • Updated Oct 2 • 10.2k • 10

thesby/Qwen3-VL-8B-NSFW-Caption-V4.5

Image-to-Text • 9B • Updated Nov 7 • 16.4k • 46

Qwen/Qwen2-VL-2B-Instruct

Image-Text-to-Text • 2B • Updated Jan 12 • 1.99M • 472

OpenGVLab/InternVideo2_5_Chat_8B

Video-Text-to-Text • 8B • Updated Aug 4 • 45.7k • 87

Qwen/Qwen2.5-VL-72B-Instruct

Image-Text-to-Text • 73B • Updated Jun 6 • 115k • 569

stepfun-ai/Step1X-Edit

Image-to-Image • Updated Jul 9 • 141 • 326

ByteDance/Dolphin

Image-Text-to-Text • 0.4B • Updated Jul 16 • 3.75k • 508

imageomics/bioclip-2

Zero-Shot Image Classification • Updated Oct 16 • 16.3k • 23

mispeech/midashenglm-7b-0804-fp32

Audio-Text-to-Text • 8B • Updated Oct 31 • 34.1k • 76

Qwen/Qwen3-Omni-30B-A3B-Captioner

Any-to-Any • 32B • Updated Sep 22 • 22.2k • 177

Qwen/Qwen3-Omni-30B-A3B-Thinking

Any-to-Any • 32B • Updated Sep 22 • 50.1k • 230