Text to Speech using OpenAI API

Introduction

The OpenAI API provides a text to speech endpoint that converts written text into spoken audio. This tutorial covers how to use the text to speech endpoint effectively, including its parameters and examples in JavaScript and Python.

Endpoint Overview

The text to speech endpoint generates speech from text using OpenAI's TTS models, such as tts-1 and tts-1-hd. It can be used for applications such as virtual assistants, accessibility tools, and audio versions of written content.

Using the Text to Speech Endpoint

API Request

To convert text to speech, send a POST request to the endpoint URL with your API key in the Authorization header and a JSON body containing the model, the voice, and the text to be converted.

POST /v1/audio/speech HTTP/1.1
Host: api.openai.com
Content-Type: application/json
Authorization: Bearer YOUR_API_KEY

{
    "model": "tts-1",
    "voice": "alloy",
    "input": "Hello, this is a text to speech example."
}

In this example, replace YOUR_API_KEY with your actual API key.
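
The request described above can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper name build_speech_request is hypothetical, and it assumes the documented /v1/audio/speech path, the tts-1 model, and the alloy voice. It only builds the request, so it can be inspected without a key or network access.

```python
import json
import urllib.request

# Assumed endpoint URL for the speech API
API_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text, api_key, model="tts-1", voice="alloy"):
    """Build (but do not send) the POST request for the speech endpoint."""
    payload = {"model": model, "voice": voice, "input": text}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Sending the request needs a valid key and network access:
# with urllib.request.urlopen(build_speech_request("Hello", "YOUR_API_KEY")) as resp:
#     audio_bytes = resp.read()
```

Keeping request construction separate from sending makes it easy to check the headers and body before spending API credits.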

API Response

The API responds with the generated audio file.

HTTP/1.1 200 OK
Content-Type: audio/mpeg

[Binary audio data]
                    

The response body is the raw binary audio, MP3 by default, so write it to a file or pass it to a player as bytes rather than parsing it as JSON.
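
Because a successful response is binary audio while a failed request usually returns a JSON error body, it can help to check the Content-Type header before saving anything. The following is a minimal sketch under that assumption; write_audio is a hypothetical helper, and the bytes in the usage lines are placeholder data, not real audio.

```python
import io


def write_audio(content_type, body, out):
    """Write the response body to `out` only if it is labelled as audio."""
    if not content_type.lower().startswith("audio/"):
        # Assumption: non-audio responses carry a text or JSON error message
        preview = body[:200].decode("utf-8", errors="replace")
        raise ValueError(f"Expected audio, got {content_type}: {preview}")
    out.write(body)
    return len(body)


# Usage with placeholder bytes standing in for a real response body
buffer = io.BytesIO()
written = write_audio("audio/mpeg", b"\xff\xfb placeholder bytes", buffer)
```

In a real program, `out` would be a file opened in binary mode ('wb') and the content type and body would come from the HTTP response.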

Parameters

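Here are some common parameters you can use with the text to speech endpoint:

  • model: The model to use for generating speech. Example: "tts-1" or "tts-1-hd".
  • input: The text to be converted into speech, up to 4096 characters.
  • voice: The voice to use for the audio. Example: "alloy". Required.
  • response_format: The audio format of the output. Optional; defaults to "mp3".
  • speed: The speed of the generated audio, from 0.25 to 4.0. Optional; defaults to 1.0.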
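
Since the input parameter is length-limited, longer documents have to be split across several requests. The sketch below shows one way to do that, assuming a limit of 4096 characters per request (check the current API reference for the exact value). The function name split_for_tts is hypothetical, and the sentence splitting is a simple punctuation heuristic.

```python
import re

MAX_INPUT_CHARS = 4096  # assumed per-request limit for the input parameter


def split_for_tts(text, limit=MAX_INPUT_CHARS):
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-wrap any single sentence that is itself longer than the limit
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this sentence would overflow, so start a new chunk
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own request and the resulting audio files played or joined in order.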

Examples in JavaScript

Here's how you can use the text to speech endpoint in JavaScript:

const axios = require('axios');
const fs = require('fs');

const apiKey = 'YOUR_API_KEY';
const endpoint = 'https://api.openai.com/v1/audio/speech';

async function textToSpeech(text, outputFile) {
    try {
        const response = await axios.post(endpoint, {
            model: 'tts-1',
            voice: 'alloy',
            input: text
        }, {
            headers: {
                'Content-Type': 'application/json',
                'Authorization': `Bearer ${apiKey}`
            },
            // Receive the audio as raw bytes instead of a decoded string
            responseType: 'arraybuffer'
        });

        // Write the returned MP3 bytes to disk
        fs.writeFileSync(outputFile, Buffer.from(response.data));
        console.log('Audio file saved to', outputFile);
    } catch (error) {
        // Show the HTTP status if the API rejected the request
        console.error('Error:', error.response ? error.response.status : error.message);
    }
}

textToSpeech('Hello, this is a text to speech example.', 'output.mp3');

Examples in Python

Here's how you can use the text to speech endpoint in Python:

from openai import OpenAI

# Requires version 1.0 or later of the openai package
client = OpenAI(api_key='YOUR_API_KEY')

def text_to_speech(text, output_file):
    # Request the audio; the response wraps the binary file content
    response = client.audio.speech.create(
        model="tts-1",
        voice="alloy",
        input=text
    )

    # Write the returned MP3 bytes to disk
    with open(output_file, 'wb') as file:
        file.write(response.content)

text_to_speech('Hello, this is a text to speech example.', 'output.mp3')

Conclusion

The text to speech endpoint in the OpenAI API converts written text into spoken audio. With the request format, the parameters, and the JavaScript and Python examples above, you can add speech generation to your applications to support voice-driven interactions and accessibility features.