M5Module-LLM Arduino Driver Library API Documentation.
M5ModuleLLM is used to initialize the LLM Module and provides members for quickly initializing each unit of the module, so applications can be built to suit individual needs.
class M5ModuleLLM {
public:
bool begin(Stream* targetPort);
bool checkConnection();
void update();
m5_module_llm::ApiSys sys;
m5_module_llm::ApiLlm llm;
m5_module_llm::ApiAudio audio;
m5_module_llm::ApiTts tts;
m5_module_llm::ApiKws kws;
m5_module_llm::ApiAsr asr;
m5_module_llm::ModuleMsg msg;
m5_module_llm::ModuleComm comm;
private:
};
Function Prototype:
bool begin(Stream* targetPort);
Description:
Parameters:
Return Value:
Function Prototype:
bool checkConnection();
Description:
Sends the sys.ping command to check the connection status of the LLM Module.
Parameters:
Return Value:
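Because checkConnection() returns a bool, a caller can poll it with an attempt limit instead of waiting indefinitely. The following is a minimal sketch of that idea, written so it compiles off-device: `waitForConnection` and `maxAttempts` are hypothetical names that are not part of the library, and the check is passed in as a callable.

```cpp
#include <functional>

// Hypothetical helper (not part of the M5Module-LLM library): polls a
// connection check up to maxAttempts times and reports whether it succeeded.
// On a device, the callable would wrap module_llm.checkConnection().
bool waitForConnection(const std::function<bool()>& check, int maxAttempts)
{
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        if (check()) {
            return true;  // the module answered
        }
    }
    return false;  // gave up after maxAttempts failed checks
}
```

In a sketch this might be called as `waitForConnection([&] { return module_llm.checkConnection(); }, 10)`, with the attempt count chosen to suit the application.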
Function Prototype:
void update();
Description:
Parameters:
Return Value:
The ApiSys sys member of M5ModuleLLM is used to control the SYS unit for operations such as system reset.
Function Prototype:
int ping();
Description:
Sends the sys.ping command to check the connection status of the LLM Module.
Parameters:
Return Value:
Function Prototype:
int reset(bool waitResetFinish = true);
Description:
Sends the sys.reset command to reset the software services.
Parameters:
Return Value:
Function Prototype:
int reboot();
Description:
Sends the sys.reboot command to reboot the system.
Parameters:
Return Value:
The ApiAudio audio member of M5ModuleLLM is used to control the initialization and configuration of the AUDIO unit.
Function Prototype:
String setup(ApiAudioSetupConfig_t config = ApiAudioSetupConfig_t(), String request_id = "audio_setup");
Description:
Parameters:
ApiAudioSetupConfig_t config:
struct ApiAudioSetupConfig_t {
int capcard = 0;
int capdevice = 0;
float capVolume = 0.5;
int playcard = 0;
int playdevice = 1;
float playVolume = 0.15;
};
Parameter | Description | Input Value |
---|---|---|
capcard | Microphone sound card index | Default system sound card: 0 |
capdevice | Microphone device index | Onboard silicon mic: 0 |
capVolume | Input volume | 0.0~10.0 (volume > 1 will amplify, default value is 0.5) |
playcard | Speaker sound card index | Default system sound card: 0 |
playdevice | Speaker device index | Onboard speaker: 1 |
playVolume | Output volume | 0.0~10.0 (volume > 1 will amplify, default value is 0.15) |
Return Value:
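The volume fields accept values in the range 0.0~10.0, so it can be useful to clamp user-supplied values before passing the config to setup(). The sketch below is one way to do that under stated assumptions: the struct is a local copy of the fields listed above so the code compiles without the library headers, and `clampVolumes` is a hypothetical helper, not a library function.

```cpp
#include <algorithm>

// Local copy of the ApiAudioSetupConfig_t fields listed above, reproduced
// here only so the sketch compiles without the library headers.
struct ApiAudioSetupConfig_t {
    int capcard      = 0;
    int capdevice    = 0;
    float capVolume  = 0.5f;
    int playcard     = 0;
    int playdevice   = 1;
    float playVolume = 0.15f;
};

// Hypothetical helper: keeps both volumes inside the documented 0.0~10.0 range.
ApiAudioSetupConfig_t clampVolumes(ApiAudioSetupConfig_t config)
{
    config.capVolume  = std::min(std::max(config.capVolume, 0.0f), 10.0f);
    config.playVolume = std::min(std::max(config.playVolume, 0.0f), 10.0f);
    return config;
}
```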
The ApiKws kws member of M5ModuleLLM is used to control the initialization and configuration of the KWS unit.
Function Prototype:
String setup(ApiKwsSetupConfig_t config = ApiKwsSetupConfig_t(), String request_id = "kws_setup");
Description:
Parameters:
ApiKwsSetupConfig_t config:
struct ApiKwsSetupConfig_t {
String kws = "HELLO";
String model = "sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01";
String response_format = "kws.bool";
String input = "sys.pcm";
bool enoutput = true;
};
Parameter | Description | Input Value |
---|---|---|
model | Conversion model | English model: "sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01" Chinese model: "sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01" |
kws | KWS wake word text setting | No mixing of Chinese and English allowed, English must be in uppercase |
enoutput | Enable UART output | Enable: true Disable: false |
Return Value:
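The wake word rules in the table (no mixing of Chinese and English, English in uppercase) can be checked before calling setup(). The following sketch assumes the wake word is UTF-8 text and treats any byte of 0x80 or above as part of a non-English character. Both helpers are hypothetical and use std::string so the code compiles off-device.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns true when a UTF-8 wake word mixes ASCII letters
// with non-ASCII (for example Chinese) characters, which the table disallows.
bool mixesAsciiAndNonAscii(const std::string& word)
{
    bool hasAsciiLetter = false;
    bool hasNonAscii    = false;
    for (unsigned char c : word) {
        if (c >= 0x80) {
            hasNonAscii = true;  // byte of a multi-byte UTF-8 character
        } else if (std::isalpha(c)) {
            hasAsciiLetter = true;
        }
    }
    return hasAsciiLetter && hasNonAscii;
}

// Hypothetical helper: upper-cases ASCII letters and leaves other bytes alone.
std::string toUpperAscii(std::string word)
{
    for (char& c : word) {
        unsigned char u = static_cast<unsigned char>(c);
        if (u < 0x80) {
            c = static_cast<char>(std::toupper(u));
        }
    }
    return word;
}
```

A caller might reject the word when `mixesAsciiAndNonAscii` returns true and otherwise assign `toUpperAscii(word)` to the kws field of the config.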
The ApiAsr asr member of M5ModuleLLM is used to control the initialization and configuration of the ASR unit.
Function Prototype:
String setup(ApiAsrSetupConfig_t config = ApiAsrSetupConfig_t(), String request_id = "asr_setup");
Description:
Parameters:
ApiAsrSetupConfig_t config:
struct ApiAsrSetupConfig_t {
String model = "sherpa-ncnn-streaming-zipformer-20M-2023-02-17";
String response_format = "asr.utf-8.stream";
String input = "sys.pcm";
bool enoutput = true;
bool enkws = true;
float rule1 = 2.4;
float rule2 = 1.2;
float rule3 = 30.0;
};
Parameter | Description | Input Value |
---|---|---|
model | Conversion model | English model: "sherpa-ncnn-streaming-zipformer-20M-2023-02-17" Chinese model: "sherpa-ncnn-streaming-zipformer-zh-14M-2023-02-23" |
response_format | Output format | Standard output: "asr.utf-8" Streaming output: "asr.utf-8.stream" |
input | Input | Audio input (struct default): "sys.pcm" |
enkws | Support wake by KWS | Allow KWS to trigger ASR: true Continuous ASR without KWS wake: false |
rule1 | Timeout between wake and no content recognition | Unit: seconds |
rule2 | Maximum interval time for recognition | Unit: seconds |
rule3 | Maximum timeout duration for recognition | Unit: seconds |
enoutput | Enable UART output | Enable: true Disable: false |
Return Value:
The ApiLlm llm member of M5ModuleLLM is used to control the initialization and configuration of the LLM unit.
Function Prototype:
String setup(ApiLlmSetupConfig_t config = ApiLlmSetupConfig_t(), String request_id = "llm_setup");
Description:
Parameters:
struct ApiLlmSetupConfig_t {
String prompt;
String model = "qwen2.5-0.5b";
String response_format = "llm.utf-8.stream";
String input = "llm.utf-8.stream";
bool enoutput = true;
bool enkws = true;
int max_token_len = 127;
};
Parameter | Description | Input Value |
---|---|---|
model | Conversion model | Predefined model "qwen2.5-0.5b" |
response_format | Output format | Standard output: "llm.utf-8" Streaming output: "llm.utf-8.stream" |
input | Input | ASR input: "asr.xxx"(work_id of the asr unit) UART input: "llm.utf-8" UART streaming input: "llm.utf-8.stream" |
enkws | Should KWS wake terminate the process | KWS interrupts the process: true KWS does not interrupt the process: false |
max_token_len | Configure maximum output token length | Max value: 1024, recommended: 127 |
prompt | Model initialization prompt | String |
enoutput | Enable UART output | Enable: true Disable: false |
Return Value:
Function Prototype:
int inference(String work_id, String input, String request_id = "llm_inference");
Description:
The inference result is cached in the responseMsgList container within M5ModuleLLM.msg.
Parameters:
Return Value:
Function Prototype:
int inferenceAndWaitResult(String work_id, String input, std::function<void(String&)> onResult, uint32_t timeout = 5000, String request_id = "llm_inference");
Description:
Parameters:
Return Value:
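The onResult parameter in the prototype above is a std::function<void(String&)>. The sketch below shows a callback of that shape collecting text into one reply string. It assumes the callback may be invoked once per result fragment. `String` is aliased to std::string so the code compiles off-device, and `fakeInference` is a hypothetical stand-in for the module, not a library call.

```cpp
#include <functional>
#include <string>
#include <vector>

using String = std::string;  // stand-in for the Arduino String type

// Hypothetical stand-in for the module: hands each fragment to onResult,
// mirroring the callback type in the inferenceAndWaitResult() prototype.
void fakeInference(const std::vector<String>& fragments,
                   const std::function<void(String&)>& onResult)
{
    for (String fragment : fragments) {
        onResult(fragment);
    }
}

// Collects every fragment delivered to the callback into one reply string.
String collectReply(const std::vector<String>& fragments)
{
    String reply;
    fakeInference(fragments, [&reply](String& fragment) { reply += fragment; });
    return reply;
}
```

On a device, the same lambda could be passed as the onResult argument, with the reply buffer declared as an Arduino String.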
The ApiTts tts member of M5ModuleLLM is used to control the initialization and configuration of the TTS unit.
Function Prototype:
String setup(ApiTtsSetupConfig_t config = ApiTtsSetupConfig_t(), String request_id = "tts_setup");
Description:
Parameters:
ApiTtsSetupConfig_t config:
struct ApiTtsSetupConfig_t {
String model = "single_speaker_english_fast";
String response_format = "tts.base64.wav";
String input = "tts.utf-8.stream";
bool enoutput = true;
bool enkws = true;
};
Parameter | Description | Input Value |
---|---|---|
model | Conversion model | English model: "single_speaker_english_fast" Chinese model: "single_speaker_fast" |
input | Input | LLM input: "llm.xxx"(work_id of the llm unit) UART input: "tts.utf-8" UART streaming input: "tts.utf-8.stream" |
enkws | Should KWS wake terminate the process | KWS interrupts the process: true KWS does not interrupt the process: false |
enoutput | Enable UART output | Enable: true Disable: false |
Return Value:
Function Prototype:
int inference(String work_id, String input, uint32_t timeout = 0, String request_id = "tts_inference");
Description:
Parameters:
Return Value:
The ModuleMsg msg member of M5ModuleLLM provides the responseMsgList container, which caches the messages returned from the LLM Module. The example below iterates over the container in the main loop to retrieve the response results.
void loop()
{
module_llm.update();
// Handle response msg
for (auto& msg : module_llm.msg.responseMsgList) {
// KWS msg
if (msg.work_id == kws_work_id) {
Serial.printf(">> Keyword detected\n");
}
// ASR msg
if (msg.work_id == asr_work_id) {
if (msg.object == "asr.utf-8.stream") {
// Parse and get ASR result
JsonDocument doc;
deserializeJson(doc, msg.raw_msg);
String asr_result = doc["data"]["delta"].as<String>();
Serial.printf(">> %s\n", asr_result.c_str());
}
}
}
module_llm.msg.responseMsgList.clear();
}
M5ModuleLLM_VoiceAssistant is used to quickly create an LLM voice assistant instance, making it easy to implement the pipeline KWS (wake-up keyword) -> ASR (speech-to-text) -> LLM (large model inference) -> TTS (text-to-speech).
Pass an M5ModuleLLM instance to the constructor and register callback functions for the respective events to complete the creation of the voice assistant.
/*
* SPDX-FileCopyrightText: 2024 M5Stack Technology CO LTD
*
* SPDX-License-Identifier: MIT
*/
#include <Arduino.h>
#include <M5Unified.h>
#include <M5ModuleLLM.h>
M5ModuleLLM module_llm;
M5ModuleLLM_VoiceAssistant voice_assistant(&module_llm);
/* On ASR data callback */
void on_asr_data_input(String data, bool isFinish, int index)
{
M5.Display.setTextColor(TFT_GREEN, TFT_BLACK);
M5.Display.printf(">> %s\n", data.c_str());
/* If ASR data is finished */
if (isFinish) {
M5.Display.setTextColor(TFT_YELLOW, TFT_BLACK);
M5.Display.print(">> ");
}
};
/* On LLM data callback */
void on_llm_data_input(String data, bool isFinish, int index)
{
M5.Display.print(data);
/* If LLM data is finished */
if (isFinish) {
M5.Display.print("\n");
}
};
void setup()
{
M5.begin();
M5.Display.setTextSize(2);
M5.Display.setTextScroll(true);
/* Initialize module serial port */
Serial2.begin(115200, SERIAL_8N1, 16, 17); // Basic
// Serial2.begin(115200, SERIAL_8N1, 13, 14); // Core2
// Serial2.begin(115200, SERIAL_8N1, 18, 17); // CoreS3
/* Initialize module */
module_llm.begin(&Serial2);
/* Ensure module is connected */
M5.Display.printf(">> Check ModuleLLM connection..\n");
while (1) {
if (module_llm.checkConnection()) {
break;
}
}
/* Begin voice assistant preset */
M5.Display.printf(">> Begin voice assistant..\n");
int ret = voice_assistant.begin("HELLO");
if (ret != MODULE_LLM_OK) {
while (1) {
M5.Display.setTextColor(TFT_RED);
M5.Display.printf(">> Begin voice assistant failed\n");
}
}
/* Register on ASR data callback function */
voice_assistant.onAsrDataInput(on_asr_data_input);
/* Register on LLM data callback function */
voice_assistant.onLlmDataInput(on_llm_data_input);
M5.Display.printf(">> Voice assistant ready\n");
}
void loop()
{
/* Keep voice assistant preset updated */
voice_assistant.update();
}
enum ModuleLLMErrorCode_t {
MODULE_LLM_OK = 0,
MODULE_LLM_RESET_WARN = -1,
MODULE_LLM_JSON_FORMAT_ERROR = -2,
MODULE_LLM_ACTION_MATCH_FAILED = -3,
MODULE_LLM_INFERENCE_DATA_PUSH_FAILED = -4,
MODULE_LLM_MODEL_LOADING_FAILED = -5,
MODULE_LLM_UNIT_NOT_EXIST = -6,
MODULE_LLM_UNKNOWN_OPERATION = -7,
MODULE_LLM_UNIT_RESOURCE_ALLOCATION_FAILED = -8,
MODULE_LLM_UNIT_CALL_FAILED = -9,
MODULE_LLM_MODEL_INIT_FAILED = -10,
MODULE_LLM_MODEL_RUN_FAILED = -11,
MODULE_LLM_MODULE_NOT_INITIALISED = -12,
MODULE_LLM_MODULE_ALREADY_WORKING = -13,
MODULE_LLM_MODULE_NOT_WORKING = -14,
MODULE_LLM_NO_UPDATEABLE_MODULES = -15,
MODULE_LLM_NO_MODULES_AVAILABLE_FOR_UPDATE = -16,
MODULE_LLM_FILE_OPEN_FAILED = -17,
MODULE_LLM_WAIT_RESPONSE_TIMEOUT = -97,
MODULE_LLM_RESPONSE_PARSE_FAILED = -98,
MODULE_LLM_ERROR_NONE = -99,
};
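Functions that return an int status can be compared against these codes, and a small lookup makes log output easier to read. The sketch below repeats only a subset of the enum so it compiles on its own. `errorLabel` is a hypothetical helper, and the label texts are informal readings of the enumerator names rather than official descriptions.

```cpp
#include <string>

// Subset of ModuleLLMErrorCode_t, copied from the enum above.
enum ModuleLLMErrorCode_t {
    MODULE_LLM_OK                    = 0,
    MODULE_LLM_UNIT_NOT_EXIST        = -6,
    MODULE_LLM_WAIT_RESPONSE_TIMEOUT = -97,
    MODULE_LLM_RESPONSE_PARSE_FAILED = -98,
};

// Hypothetical helper: maps a return code to a short label for log output.
// Codes without a label fall back to "error <code>".
std::string errorLabel(int code)
{
    switch (code) {
        case MODULE_LLM_OK:
            return "ok";
        case MODULE_LLM_UNIT_NOT_EXIST:
            return "unit does not exist";
        case MODULE_LLM_WAIT_RESPONSE_TIMEOUT:
            return "timed out waiting for a response";
        case MODULE_LLM_RESPONSE_PARSE_FAILED:
            return "response could not be parsed";
        default:
            return "error " + std::to_string(code);
    }
}
```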