RemiFabre committed
Commit 0ecbc6c · Parent: 10aa8a0

Improved README

Files changed (1):
  1. README.md +11 -17
README.md CHANGED
@@ -4,13 +4,9 @@ Conversational demo for the Reachy Mini robot combining OpenAI's realtime APIs,
 
 ## Overview
 - Real-time audio conversation loop powered by the OpenAI realtime API and `fastrtc` for low-latency streaming.
- - Motion control queue that blends scripted dances, recorded emotions, idle breathing, and speech-reactive head wobbling.
- - Optional camera worker with YOLO or MediaPipe-based head tracking and LLM-accessible scene capture.
-
- ## Features
- - Async tool dispatch integrates robot motion, camera capture, and optional facial recognition helpers.
- - Gradio web UI provides audio chat and transcript display.
- - Movement manager keeps real-time control in a dedicated thread with safeguards against abrupt pose changes.
+ - Layered motion system queues primary moves (dances, emotions, goto poses, breathing) while blending speech-reactive wobble and face-tracking.
+ - Camera capture can route to OpenAI multimodal vision or stay on-device with SmolVLM2 local analysis.
+ - Async tool dispatch integrates robot motion, camera capture, and optional facial-recognition helpers through a Gradio web UI with live transcripts.
 
 ## Installation
 
@@ -47,7 +43,7 @@ pip install -e .
 Install optional extras depending on the feature set you need:
 
 ```bash
- # Vision stacks (choose at least one if you plan to run head tracking)
+ # Vision stacks (choose at least one if you plan to run face tracking)
 pip install -e .[local_vision]
 pip install -e .[yolo_vision]
 pip install -e .[mediapipe_vision]
@@ -57,16 +53,16 @@ pip install -e .[all_vision] # installs every vision extra
 pip install -e .[dev]
 ```
 
- Some wheels (e.g. PyTorch) are large and require compatible CUDA or CPU builds. Expect the `local_vision` extra to take significantly more disk space than YOLO or MediaPipe.
+ Some wheels (e.g. PyTorch) are large and require compatible CUDA or CPU builds—make sure your platform matches the binaries pulled in by each extra.
 
 ## Optional dependency groups
 
 | Extra | Purpose | Notes |
 |-------|---------|-------|
- | `local_vision` | Run the local VLM (SmolVLM2) through PyTorch/Transformers. | GPU recommended; installs large packages (~2 GB). |
+ | `local_vision` | Run the local VLM (SmolVLM2) through PyTorch/Transformers. | GPU recommended; ensure compatible PyTorch builds for your platform. |
 | `yolo_vision` | YOLOv8 tracking via `ultralytics` and `supervision`. | CPU friendly; supports the `--head-tracker yolo` option. |
 | `mediapipe_vision` | Lightweight landmark tracking with MediaPipe. | Works on CPU; enables `--head-tracker mediapipe`. |
- | `all_vision` | Convenience alias installing every vision extra. | Only use if you need to experiment with all providers. |
+ | `all_vision` | Convenience alias installing every vision extra. | Install when you want the flexibility to experiment with every provider. |
 | `dev` | Developer tooling (`pytest`, `ruff`). | Add on top of either base or `all_vision` environments. |
 
 ## Configuration
@@ -89,19 +85,19 @@ Activate your virtual environment, ensure the Reachy Mini robot (or simulator) i
 reachy-mini-conversation-demo
 ```
 
- The app starts a Gradio UI served locally (http://127.0.0.1:7860/). When running on a headless host, use `--headless`.
+ The app starts a Gradio UI served locally (http://127.0.0.1:7860/). When running on a headless host, use `--headless`. With a camera attached, captured frames can be analysed remotely through OpenAI multimodal models or locally via the YOLO/MediaPipe pipelines, depending on the extras you installed.
 
 ### CLI options
 
 | Option | Default | Description |
 |--------|---------|-------------|
- | `--head-tracker {yolo,mediapipe}` | `None` | Select a head-tracking backend when a camera is available. Requires the matching optional extra. |
- | `--no-camera` | `False` | Run without camera capture or head tracking. |
+ | `--head-tracker {yolo,mediapipe}` | `None` | Select a face-tracking backend when a camera is available. Requires the matching optional extra. |
+ | `--no-camera` | `False` | Run without camera capture or face tracking. |
 | `--headless` | `False` | Suppress launching the Gradio UI (useful on remote machines). |
 | `--debug` | `False` | Enable verbose logging for troubleshooting. |
 
 ### Examples
- - Run on hardware with MediaPipe head tracking:
+ - Run on hardware with MediaPipe face tracking:
 
 ```bash
 reachy-mini-conversation-demo --head-tracker mediapipe
@@ -133,7 +129,5 @@ The app starts a Gradio UI served locally (http://127.0.0.1:7860/). When running
 - Execute the test suite: `pytest`.
 - When iterating on robot motions, keep the control loop responsive—offload blocking work using the helpers in `tools.py`.
 
-
 ## License
-
 Apache 2.0
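
The CLI flags documented in the diff above should compose as independent switches; assuming they do, a remote, audio-only troubleshooting session would look like:

```bash
# Headless host, no camera worker, verbose logs (all flags from the CLI options table)
reachy-mini-conversation-demo --headless --no-camera --debug
```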
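The "layered motion system" bullet in the new overview describes a primary move queue with additive secondary offsets blended on top. A minimal sketch of that pattern, with illustrative names only (`move_queue`, `wobble_offset`, and the printed pose are not the project's actual API):

```python
import math
import queue
import time

# Sketch of the layered-motion idea: one primary move at a time from a queue,
# with a small secondary offset (speech wobble) added on top of the pose.
move_queue: queue.Queue = queue.Queue()

def wobble_offset(t: float, speaking: bool) -> float:
    """Small 2 Hz sinusoidal pitch offset while the robot is speaking."""
    return 0.05 * math.sin(2.0 * math.pi * 2.0 * t) if speaking else 0.0

def control_loop(steps: int, step_s: float = 0.02) -> None:
    yaw, pitch = 0.0, 0.0  # last commanded primary pose
    t0 = time.monotonic()
    for _ in range(steps):
        try:
            yaw, pitch = move_queue.get_nowait()  # next primary move, if queued
        except queue.Empty:
            pass  # no new move: hold the current pose (idle breathing would go here)
        t = time.monotonic() - t0
        blended_pitch = pitch + wobble_offset(t, speaking=True)
        print(f"head pose -> yaw={yaw:+.3f} pitch={blended_pitch:+.3f}")  # stand-in for the real robot call
        time.sleep(step_s)

move_queue.put((0.3, -0.1))  # queue a primary "goto" pose
control_loop(steps=5)
```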
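Likewise, the async tool dispatch bullet amounts to mapping tool names coming back from the realtime API onto coroutines, pushing blocking work off the event loop so audio keeps streaming. A hedged sketch with hypothetical tool names, not the actual `tools.py` helpers:

```python
import asyncio
from typing import Any, Awaitable, Callable

def capture_frame() -> bytes:
    """Stand-in for a blocking camera read."""
    return b"jpeg-bytes"

async def tool_capture_image(args: dict) -> bytes:
    # Blocking I/O runs in a worker thread so the event loop stays responsive.
    return await asyncio.to_thread(capture_frame)

async def tool_play_emotion(args: dict) -> str:
    # Motion itself runs in the dedicated control thread; here we only enqueue it.
    return f"queued emotion: {args.get('name', 'unknown')}"

TOOLS: dict[str, Callable[[dict], Awaitable[Any]]] = {
    "capture_image": tool_capture_image,
    "play_emotion": tool_play_emotion,
}

async def dispatch(name: str, args: dict) -> Any:
    handler = TOOLS.get(name)
    if handler is None:
        raise KeyError(f"unknown tool: {name}")  # fail loudly rather than stalling the conversation
    return await handler(args)

print(asyncio.run(dispatch("play_emotion", {"name": "happy"})))
```

In the real demo, handlers would hand moves to the movement manager's queue rather than returning strings; the dispatch-table shape is the point here.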