Quentin
PRO
Qvelard
2 followers · 24 following
quentinvelard
qvelard
quentin-velard
AI & ML interests
AI for industry Quantum Machine Learning Robotics
Recent Activity
upvoted an article 8 days ago
Continuous batching from first principles
liked a Space 17 days ago
mlc-ai/MLC-Weight-Conversion
replied to their post 25 days ago
Hey! I'm working on a small-scale multi-drone control system and I'm looking for an open-source VLM that can run in real time on a Jetson Orin. If anyone knows a model or is personally interested in this kind of edge robotics problem, I'd love pointers.

What I'm trying to solve:

I have 4 simultaneous video streams coming from four drones (grayscale, 320×320). I can feed the model either:
• a 2×2 mosaic frame, or
• 4 separate frames as a batch.

Along with this, I provide a short text instruction describing the mission state.

What I need from the model:

A single structured JSON command representing the next action for the swarm controller. Something like (not decided yet):

```
{
  "action": "move_forward",
  "confidence": 0.87,
  "reason": "front corridor detected, no obstacles in drone_2 and drone_4 views"
}
```

So I need a VLM that can:
• handle multi-image or mosaic image input
• run efficiently on a Jetson Orin (ideally INT4/INT8 friendly, TensorRT-compatible)
• generate stable JSON outputs based on visual + textual context

I would really appreciate suggestions, or even just thoughts on what architectures make sense here. Models like https://huggingface.co/openbmb/MiniCPM-V-4_5, https://huggingface.co/dustnehowl/nanoVLM, and https://huggingface.co/Qwen/Qwen3-VL-8B-Instruct look promising, but I'm still exploring what's actually viable on-device.

Happy to share benchmarks or test anything people want to throw at this problem. The multi-drone video + action JSON setup is niche but potentially useful to others building edge-deployed agents.
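The two model-independent pieces of the setup described in the post (tiling four frames into a 2×2 mosaic, and checking that the model's reply is a usable JSON command) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: it assumes each frame arrives as a raw, row-major, 8-bit grayscale buffer of 320×320 bytes, and the names `make_mosaic`, `parse_command`, `SwarmCommand`, and `ALLOWED_ACTIONS` are hypothetical. Only `move_forward` appears in the post; the other actions are invented placeholders. It uses the standard library only; in practice an array library would likely replace the byte slicing.

```python
import json
from dataclasses import dataclass

FRAME_W = FRAME_H = 320  # frame size stated in the post

# Hypothetical action vocabulary: only "move_forward" appears in the post.
ALLOWED_ACTIONS = {"move_forward", "move_backward", "turn_left", "turn_right", "hover"}


def make_mosaic(frames: list[bytes]) -> bytes:
    """Tile four row-major 8-bit grayscale frames into one 640x640 mosaic.

    Layout: frames[0] top-left, frames[1] top-right,
            frames[2] bottom-left, frames[3] bottom-right.
    """
    if len(frames) != 4:
        raise ValueError("expected exactly 4 frames")
    for frame in frames:
        if len(frame) != FRAME_W * FRAME_H:
            raise ValueError("each frame must be 320x320 bytes")
    rows = []
    for left, right in ((0, 1), (2, 3)):
        for r in range(FRAME_H):
            row = slice(r * FRAME_W, (r + 1) * FRAME_W)
            # One mosaic row = the same row of the left and right frames.
            rows.append(frames[left][row] + frames[right][row])
    return b"".join(rows)


@dataclass(frozen=True)
class SwarmCommand:
    action: str
    confidence: float
    reason: str


def parse_command(raw: str) -> SwarmCommand:
    """Parse the model's text output; raise ValueError if it is unusable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("command must be a JSON object")

    action = data.get("action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")

    confidence = data.get("confidence")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")

    reason = data.get("reason")
    if not isinstance(reason, str):
        raise ValueError("reason must be a string")

    return SwarmCommand(action, float(confidence), reason)
```

Rejecting malformed output with an exception lets the swarm controller fall back to a safe default (for example, holding position) instead of acting on a partially parsed command; that fallback policy is a design suggestion, not something the post specifies.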
Organizations
Qvelard's Spaces (1)
Sleeping
First Agent Template
⚡
Create a psychological portrait based on recent X activity