# XiaoZhi AI Chatbot

([中文](README.md) | English | [日本語](README_ja.md))

This is Terrence's first hardware project.

👉 [Build your AI chat companion with ESP32+SenseVoice+Qwen72B!【bilibili】](https://www.bilibili.com/video/BV11msTenEH3/)

👉 [Equipping XiaoZhi with DeepSeek's smart brain【bilibili】](https://www.bilibili.com/video/BV1GQP6eNEFG/)

👉 [Build your own AI companion, a beginner's guide【bilibili】](https://www.bilibili.com/video/BV1XnmFYLEJN/)

## Project Purpose

This is an open-source project released under the MIT license, so anyone can use it freely, including for commercial purposes.

Through this project, we aim to help more people get started with AI hardware development and learn how to bring rapidly evolving large language models to real hardware devices. Whether you're a student interested in AI or a developer exploring new technologies, this project offers a valuable learning experience.

Everyone is welcome to take part in the project's development and improvement. If you have any ideas or suggestions, feel free to open an issue or join the chat group.

Learning & Discussion QQ Group: 376893254

## Implemented Features
- Wi-Fi / ML307 Cat.1 4G
- BOOT button wake-up and interruption, supporting both click and long-press triggers
- Offline voice wake-up [ESP-SR](https://github.com/espressif/esp-sr)
- Streaming voice dialogue (WebSocket or UDP protocol)
- Support for 5 languages: Mandarin, Cantonese, English, Japanese, Korean [SenseVoice](https://github.com/FunAudioLLM/SenseVoice)
- Voiceprint recognition to identify who is calling the AI's name [3D Speaker](https://github.com/modelscope/3D-Speaker)
- Large model TTS (Volcano Engine or CosyVoice)
- Large Language Models (Qwen, DeepSeek, Doubao)
- Configurable prompts and voice tones (custom characters)
- Short-term memory, self-summarizing after each conversation round
- OLED / LCD display showing signal strength or conversation content
- Support for LCD image expressions
- Multi-language support (Chinese, English)
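
The click and long-press triggers listed above come down to measuring how long the BOOT button is held. The following is a minimal C++ sketch of that idea, not the firmware's actual code: `PressKind`, `ClassifyPress`, and the 700 ms threshold are hypothetical names and values chosen for illustration.

```cpp
#include <cstdint>

// Hypothetical threshold; the firmware's real value is not stated in this README.
constexpr std::uint32_t kLongPressThresholdMs = 700;

enum class PressKind { kClick, kLongPress };

// Classifies a completed press from the millisecond timestamps taken when the
// button went down and when it came back up.
constexpr PressKind ClassifyPress(
    std::uint32_t press_ms, std::uint32_t release_ms,
    std::uint32_t threshold_ms = kLongPressThresholdMs) {
  // Unsigned subtraction stays correct across one wrap of a 32-bit tick counter.
  return (release_ms - press_ms) >= threshold_ms ? PressKind::kLongPress
                                                 : PressKind::kClick;
}
```

A caller would record one timestamp on press and one on release, then branch on the returned `PressKind`.
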
## Hardware Section
### Breadboard DIY Practice

See the Feishu document tutorial:

👉 [XiaoZhi AI Chatbot Encyclopedia](https://ccnphfhqs21z.feishu.cn/wiki/F5krwD16viZoF0kKkvDcrZNYnhb?from=from_copylink)

Breadboard demonstration:

![Breadboard Demo](docs/wiring2.jpg)

### Supported Open Source Hardware
- <a href="https://oshwhub.com/li-chuang-kai-fa-ban/li-chuang-shi-zhan-pai-esp32-s3-kai-fa-ban" target="_blank" title="LiChuang ESP32-S3 Development Board">LiChuang ESP32-S3 Development Board</a>
- <a href="https://github.com/espressif/esp-box" target="_blank" title="Espressif ESP32-S3-BOX3">Espressif ESP32-S3-BOX3</a>
- <a href="https://docs.m5stack.com/zh_CN/core/CoreS3" target="_blank" title="M5Stack CoreS3">M5Stack CoreS3</a>
- <a href="https://docs.m5stack.com/en/atom/Atomic%20Echo%20Base" target="_blank" title="AtomS3R + Echo Base">AtomS3R + Echo Base</a>
- <a href="https://docs.m5stack.com/en/core/ATOM%20Matrix" target="_blank" title="AtomMatrix + Echo Base">AtomMatrix + Echo Base</a>
- <a href="https://gf.bilibili.com/item/detail/1108782064" target="_blank" title="MagiClick 2.4">MagiClick 2.4</a>
- <a href="https://www.waveshare.net/shop/ESP32-S3-Touch-AMOLED-1.8.htm" target="_blank" title="Waveshare ESP32-S3-Touch-AMOLED-1.8">Waveshare ESP32-S3-Touch-AMOLED-1.8</a>
- <a href="https://github.com/Xinyuan-LilyGO/T-Circle-S3" target="_blank" title="LILYGO T-Circle-S3">LILYGO T-Circle-S3</a>
- <a href="https://oshwhub.com/tenclass01/xmini_c3" target="_blank" title="XiaGe Mini C3">XiaGe Mini C3</a>
- <a href="https://oshwhub.com/movecall/moji-xiaozhi-ai-derivative-editi" target="_blank" title="Movecall Moji ESP32S3">Moji XiaoZhi AI Derivative Version</a>
- <a href="https://oshwhub.com/movecall/cuican-ai-pendant-lights-up-y" target="_blank" title="Movecall CuiCan ESP32S3">CuiCan AI pendant</a>
- <a href="https://www.seeedstudio.com/SenseCAP-Watcher-W1-A-p-5979.html" target="_blank" title="SenseCAP Watcher">SenseCAP Watcher</a>
<div style="display: flex; justify-content: space-between;">
<a href="docs/v1/lichuang-s3.jpg" target="_blank" title="LiChuang ESP32-S3 Development Board">
<img src="docs/v1/lichuang-s3.jpg" width="240" />
</a>
<a href="docs/v1/espbox3.jpg" target="_blank" title="Espressif ESP32-S3-BOX3">
<img src="docs/v1/espbox3.jpg" width="240" />
</a>
<a href="docs/v1/m5cores3.jpg" target="_blank" title="M5Stack CoreS3">
<img src="docs/v1/m5cores3.jpg" width="240" />
</a>
<a href="docs/v1/atoms3r.jpg" target="_blank" title="AtomS3R + Echo Base">
<img src="docs/v1/atoms3r.jpg" width="240" />
</a>
<a href="docs/AtomMatrix-echo-base.jpg" target="_blank" title="AtomMatrix + Echo Base">
<img src="docs/AtomMatrix-echo-base.jpg" width="240" />
</a>
<a href="docs/v1/magiclick.jpg" target="_blank" title="MagiClick 2.4">
<img src="docs/v1/magiclick.jpg" width="240" />
</a>
<a href="docs/v1/waveshare.jpg" target="_blank" title="Waveshare ESP32-S3-Touch-AMOLED-1.8">
<img src="docs/v1/waveshare.jpg" width="240" />
</a>
<a href="docs/lilygo-t-circle-s3.jpg" target="_blank" title="LILYGO T-Circle-S3">
<img src="docs/lilygo-t-circle-s3.jpg" width="240" />
</a>
<a href="docs/xmini-c3.jpg" target="_blank" title="Xmini C3">
<img src="docs/xmini-c3.jpg" width="240" />
</a>
<a href="docs/v1/movecall-moji-esp32s3.jpg" target="_blank" title="Moji">
<img src="docs/v1/movecall-moji-esp32s3.jpg" width="240" />
</a>
<a href="docs/v1/movecall-cuican-esp32s3.jpg" target="_blank" title="CuiCan">
<img src="docs/v1/movecall-cuican-esp32s3.jpg" width="240" />
</a>
<a href="docs/v1/sensecap_watcher.jpg" target="_blank" title="SenseCAP Watcher">
<img src="docs/v1/sensecap_watcher.jpg" width="240" />
</a>
</div>

## Firmware Section
### Flashing Without Development Environment

For beginners, we recommend starting with the prebuilt firmware, which can be flashed without setting up a development environment.

The firmware connects to the official [xiaozhi.me](https://xiaozhi.me) server by default. Currently, individual users can register an account to use the Qwen real-time model for free.

👉 [Flash Firmware Guide (No IDF Environment)](https://ccnphfhqs21z.feishu.cn/wiki/Zpz4wXBtdimBrLk25WdcXzxcnNS)

### Development Environment
- Cursor or VSCode
- Install the ESP-IDF plugin and select SDK version 5.3 or above
- Linux is preferred over Windows for faster compilation and fewer driver issues
- Follow the Google C++ code style and make sure submitted code complies with it
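
The SDK requirement above can also be checked at compile time. The sketch below assumes ESP-IDF's `esp_idf_version.h` header, which defines `ESP_IDF_VERSION_MAJOR` and `ESP_IDF_VERSION_MINOR`. The helper `IsSupportedIdf` and the host-side fallback macros are hypothetical additions for illustration, included so the snippet also compiles outside ESP-IDF. Naming follows Google C++ style.

```cpp
// Pull in the real version macros when building inside ESP-IDF.
#if defined(__has_include)
#if __has_include(<esp_idf_version.h>)
#include <esp_idf_version.h>
#endif
#endif

// Host-side stand-ins (assumption) so the sketch compiles without ESP-IDF.
#ifndef ESP_IDF_VERSION_MAJOR
#define ESP_IDF_VERSION_MAJOR 5
#define ESP_IDF_VERSION_MINOR 3
#endif

// Returns true when the given SDK version is 5.3 or newer.
constexpr bool IsSupportedIdf(int major, int minor) {
  return major > 5 || (major == 5 && minor >= 3);
}

// Fails the build early if an older SDK is selected.
static_assert(IsSupportedIdf(ESP_IDF_VERSION_MAJOR, ESP_IDF_VERSION_MINOR),
              "This project expects ESP-IDF 5.3 or above.");
```

With a guard like this, selecting an older SDK produces a clear build error instead of unrelated compile failures later on.
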
## AI Agent Configuration
If you already have a XiaoZhi AI chatbot device, you can configure it through the [xiaozhi.me](https://xiaozhi.me) console.

👉 [Backend Operation Tutorial (Old Interface)](https://www.bilibili.com/video/BV1jUCUY2EKM/)

## Technical Principles and Private Deployment

👉 [Detailed WebSocket Communication Protocol Documentation](docs/websocket.md)

To deploy the server on a personal computer, see [xiaozhi-esp32-server](https://github.com/xinnan-tech/xiaozhi-esp32-server), a separate MIT-licensed project.

## Star History
<a href="https://star-history.com/#78/xiaozhi-esp32&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=78/xiaozhi-esp32&type=Date&theme=dark" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=78/xiaozhi-esp32&type=Date" />
<img alt="Star History Chart" src="https://api.star-history.com/svg?repos=78/xiaozhi-esp32&type=Date" />
</picture>
</a>