Alina Lozovskaya committed
Commit 6261fdc · Parent(s): c51b4e7
Update readme and comments
README.md
CHANGED

````diff
@@ -1,15 +1,15 @@
-# Reachy Mini conversation
+# Reachy Mini conversation app
 
-Conversational
+Conversational app for the Reachy Mini robot combining OpenAI's realtime APIs, vision pipelines, and choreographed motion libraries.
 
 
 
 ## Architecture
 
-The
+The app follows a layered architecture connecting the user, AI services, and robot hardware:
 
 <p align="center">
-<img src="docs/assets/
+<img src="docs/assets/conversation_app_arch.svg" alt="Architecture Diagram" width="600"/>
 </p>
 
 ## Overview
@@ -96,7 +96,7 @@ Some wheels (e.g. PyTorch) are large and require compatible CUDA or CPU builds
 Activate your virtual environment, ensure the Reachy Mini robot (or simulator) is reachable, then launch:
 
 ```bash
-reachy-mini-conversation-
+reachy-mini-conversation-app
 ```
 
 By default, the app runs in console mode for direct audio interaction. Use the `--gradio` flag to launch a web UI served locally at http://127.0.0.1:7860/ (required when running in simulation mode). With a camera attached, vision is handled by the gpt-realtime model when the camera tool is used. For local vision processing, use the `--local-vision` flag to process frames periodically using the SmolVLM2 model. Additionally, you can enable face tracking via YOLO or MediaPipe pipelines depending on the extras you installed.
@@ -116,19 +116,19 @@ By default, the app runs in console mode for direct audio interaction. Use the `
 - Run on hardware with MediaPipe face tracking:
 
 ```bash
-reachy-mini-conversation-
+reachy-mini-conversation-app --head-tracker mediapipe
 ```
 
 - Run with local vision processing (requires `local_vision` extra):
 
 ```bash
-reachy-mini-conversation-
+reachy-mini-conversation-app --local-vision
 ```
 
 - Disable the camera pipeline (audio-only conversation):
 
 ```bash
-reachy-mini-conversation-
+reachy-mini-conversation-app --no-camera
 ```
 
 ## LLM tools exposed to the assistant
````
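The usage paragraph in this diff also references a `--gradio` flag with no companion example; a minimal sketch of that invocation, assuming the flag is used on its own as the paragraph implies:

```bash
# Launch the web UI at http://127.0.0.1:7860/ (required in simulation mode)
reachy-mini-conversation-app --gradio
```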
docs/assets/{conversation_demo_arch.svg → conversation_app_arch.svg}
RENAMED
File without changes
src/reachy_mini_conversation_app/config.py
CHANGED

```diff
@@ -26,7 +26,7 @@ logger.info("Configuration loaded from .env file")
 
 
 class Config:
-    """Configuration class for the conversation
+    """Configuration class for the conversation app."""
 
     # Required
     OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
```
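For context, the hunk header shows this class is populated from a `.env` file via `os.getenv`; a minimal `.env` sketch using the one key visible in the diff (the value is a placeholder, and any other required keys are not shown here):

```bash
# .env — read at startup ("Configuration loaded from .env file")
OPENAI_API_KEY=sk-your-key-here
```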
src/reachy_mini_conversation_app/main.py
CHANGED

```diff
@@ -1,4 +1,4 @@
-"""Entrypoint for the Reachy Mini conversation
+"""Entrypoint for the Reachy Mini conversation app."""
 
 import os
 import sys
@@ -28,11 +28,11 @@ def update_chatbot(chatbot: List[Dict[str, Any]], response: Dict[str, Any]) -> L
 
 
 def main() -> None:
-    """Entrypoint for the Reachy Mini conversation
+    """Entrypoint for the Reachy Mini conversation app."""
     args = parse_args()
 
     logger = setup_logger(args.debug)
-    logger.info("Starting Reachy Mini Conversation
+    logger.info("Starting Reachy Mini Conversation App")
 
     if args.no_camera and args.head_tracker is not None:
         logger.warning("Head tracking is not activated due to --no-camera.")
```
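The unchanged context lines at the end of this hunk encode a flag interaction worth noting: `--no-camera` wins over any `--head-tracker` choice. A hypothetical invocation that would trigger that code path:

```bash
# Logs "Head tracking is not activated due to --no-camera." and runs audio-only
reachy-mini-conversation-app --no-camera --head-tracker mediapipe
```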
src/reachy_mini_conversation_app/utils.py
CHANGED

```diff
@@ -9,7 +9,7 @@ from reachy_mini_conversation_app.camera_worker import CameraWorker
 
 def parse_args() -> argparse.Namespace:
     """Parse command line arguments."""
-    parser = argparse.ArgumentParser("Reachy Mini Conversation
+    parser = argparse.ArgumentParser("Reachy Mini Conversation App")
     parser.add_argument(
         "--head-tracker",
         choices=["yolo", "mediapipe", None],
```
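One subtlety in the context lines above: listing `None` in `choices` only matters for the default, since values parsed from the command line are always strings. A standalone sketch illustrating the behavior (the `default` and `help` values are assumptions, not taken from the file):

```python
import argparse

# Mirror of the parser setup visible in the diff
parser = argparse.ArgumentParser("Reachy Mini Conversation App")
parser.add_argument(
    "--head-tracker",
    choices=["yolo", "mediapipe", None],
    default=None,  # assumption: no tracker unless explicitly requested
    help="Face-tracking backend (hypothetical help text)",
)

print(parser.parse_args([]).head_tracker)                               # None
print(parser.parse_args(["--head-tracker", "mediapipe"]).head_tracker)  # "mediapipe"
```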