Text-to-Speech

Convert input text into an output audio file through the API.

Preparation

The example program requires the corresponding model package to be installed on the device. Refer to Model List for the model package installation tutorial.

Complete the following preparations on the LLM device before running the example program:

Install the llm-model-melotts-en-us model package using the apt package management tool.

apt install llm-model-melotts-en-us

Install the ffmpeg tool.

apt install ffmpeg

After installation, restart the OpenAI API service so that the new model takes effect.

systemctl restart llm-openai-api
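
As an optional sanity check after the steps above, the following minimal sketch confirms that ffmpeg can be found on the PATH. It assumes Python 3 is available on the device; `missing_tools` is a hypothetical helper name, not part of any M5Stack tooling, and it does not verify the model package or the service state.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(["ffmpeg"])
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("ffmpeg found on PATH")
```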

Example

On the PC, use the OpenAI API to send text for text-to-speech conversion. Before running the example program, replace the IP address in base_url below with the actual IP address of the device.

from pathlib import Path
from openai import OpenAI

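client = OpenAI(
    api_key="sk-",
    # Replace the IP address with the actual IP address of the device.
    base_url="http://192.168.20.186:8000/v1",
)

# Save the generated audio next to this script.
speech_file_path = Path(__file__).parent / "speech.mp3"
with client.audio.speech.with_streaming_response.create(
    model="melotts-en-us",
    # Required by the client; voice selection is not currently supported (see Request Parameters).
    voice="alloy",
    input="The quick brown fox jumped over the lazy dog.",
) as response:
    response.stream_to_file(speech_file_path)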
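
For environments where the openai package is not installed, the same request can be sketched with the Python standard library alone. The client in the example above posts JSON to the `/audio/speech` path under `base_url`, so this sketch does the same. The helper names `build_speech_request` and `save_speech` are hypothetical, and the 60-second timeout is an arbitrary assumption.

```python
import json
import urllib.request


def build_speech_request(base_url, text, model="melotts-en-us", api_key="sk-"):
    """Build (but do not send) the POST request for the speech endpoint."""
    # Mirrors the fields sent by the example above; "voice" is kept only
    # because the original example passes it.
    payload = {"model": model, "voice": "alloy", "input": text}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def save_speech(request, out_path, timeout=60):
    """Send the request and write the returned audio bytes to out_path."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
```

A call such as `save_speech(build_speech_request("http://192.168.20.186:8000/v1", "Hello"), "speech.mp3")` would then write the audio file, with the IP address replaced by that of the device.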

Request Parameters

Parameter Name	Type	Required	Example Value	Description
input	string	yes	"Hello, welcome to the system"	The text content to generate audio for; maximum length is 1024 characters
model	string	yes	melotts-zh-cn	Available TTS models, including `melotts-zh-cn` and `melotts-en-us`
voice	–	no	–	Voice style selection (not currently supported)
response_format	string	no	mp3	Audio output format; supports `mp3`, `opus`, `aac`, `flac`, `wav`, `pcm`, etc.
speed	number	no	1.0	Speech generation speed; range 0.25–2.0, default is 1.0
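
The limits in the table can be checked on the PC before a request is sent. The sketch below validates only the two hard limits stated above (input length and speed range) and passes `model` and `response_format` through unchecked, because the table does not claim its lists are exhaustive. The helper names are hypothetical, and it assumes the 1024-character limit counts Python string characters rather than bytes.

```python
import textwrap

MAX_INPUT_CHARS = 1024        # from the "input" row of the table
MIN_SPEED, MAX_SPEED = 0.25, 2.0  # from the "speed" row of the table


def build_speech_params(text, model, response_format="mp3", speed=1.0):
    """Return request parameters, raising ValueError if a stated limit is broken."""
    if not text:
        raise ValueError("input must not be empty")
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input exceeds {MAX_INPUT_CHARS} characters")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
    return {
        "input": text,
        "model": model,
        "response_format": response_format,
        "speed": speed,
    }


def split_input(text, limit=MAX_INPUT_CHARS):
    """Split long text into chunks of at most `limit` characters.

    textwrap.wrap breaks at whitespace where possible and collapses
    newlines into spaces, so each chunk can be sent as its own request.
    """
    return textwrap.wrap(text, width=limit)
```

Each chunk returned by `split_input` produces a separate audio file; joining those files is outside the scope of this sketch.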

Response Example

The audio data returned by the API is saved to the speech_file_path specified in the example program.
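
To confirm that the saved file actually contains audio in the expected container, its first bytes can be inspected. The sketch below is a rough heuristic based on well-known file signatures, not a full parser. `sniff_audio_format` is a hypothetical helper name. Raw `pcm` output has no header, so it is reported as "unknown", and Opus audio is assumed to arrive in an Ogg container.

```python
def sniff_audio_format(header: bytes) -> str:
    """Guess the audio container from the first bytes of a file."""
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    if header[:4] == b"fLaC":
        return "flac"
    if header[:4] == b"OggS":
        return "ogg"
    if header[:3] == b"ID3":
        return "mp3"  # MP3 file starting with an ID3v2 tag
    if len(header) >= 2 and header[0] == 0xFF:
        if (header[1] & 0xF6) == 0xF0:
            return "aac"  # ADTS sync word with layer bits 00
        if (header[1] & 0xE0) == 0xE0:
            return "mp3"  # MPEG audio frame sync
    return "unknown"


def sniff_file(path) -> str:
    """Read the first 12 bytes of `path` and guess its format."""
    with open(path, "rb") as f:
        return sniff_audio_format(f.read(12))
```

Calling `sniff_file(speech_file_path)` after the example program has run should return "mp3" for the default output, provided the device returns MP3 data when no `response_format` is given.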
