Enhance Your App with Resemble's API

Table of Contents

  1. Introduction
  2. Overview of Using the API to Build Dynamic Experiences
  3. Authentication
  4. Available Resources
    • Projects
    • Clips
    • Voices
  5. Creating Clips
    • Async Approach
    • Sync Approach
    • Streaming Approach
  6. Using SSML Markup Language
  7. Voice Creation
    • Uploading Audio
    • API Integration
  8. Documentation and Interactive API
  9. Updating and Deleting Clips
  10. Exporting Audio into Other Programs
  11. Summary
  12. Conclusion


Introduction

Welcome to this tutorial on using Resemble's API to build dynamic experiences. It covers authentication, the resources you can interact with, and the three ways to create clips: the asynchronous, synchronous, and streaming approaches. It also discusses SSML, voice creation options, and how to export audio into other programs. Whether you are new to the API or extending an existing integration, this tutorial gives you the information you need to get started.

Overview of Using the API to Build Dynamic Experiences

The API lets you authenticate requests and work with three resources: projects, clips, and voices. You can create clips using an asynchronous, synchronous, or streaming approach, and use SSML to add style elements to the generated audio. Voices can be created by uploading audio through the web interface or programmatically through the API. The documentation is interactive, so you can test requests and view the responses. You can also update and delete clips and export audio into other programs.


Authentication

To use the API, you need to authenticate your requests. Go to your account on the website and click the API button. There you will find an API token, which you must include in the headers of every request. Set the content type to "application/json" unless the documentation specifies otherwise.
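A minimal sketch of building the headers described above. The `Authorization` header name and the `Token token=` scheme are assumptions made for illustration; the text only says that the token goes in the request headers, so confirm the exact format in the documentation.

```python
def build_headers(api_token: str) -> dict:
    """Return the headers to attach to every API request.

    Assumption: the token is sent in an "Authorization" header using a
    "Token token=..." scheme. Check the documentation for the real format.
    """
    if not api_token:
        raise ValueError("An API token is required")
    return {
        "Authorization": f"Token token={api_token}",
        # Default content type, unless the documentation says otherwise.
        "Content-Type": "application/json",
    }
```

Reading the token from an environment variable, rather than hard-coding it, keeps it out of source control.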

Available Resources

There are three main resources that you will interact with when using the API: projects, clips, and voices.


Projects

A project is an organized collection of voice clips. Projects let you group and manage related clips together. You can get all projects, retrieve a single project, or create a new project. Projects can also be shared with specific individuals, which gives you control over collaboration.


Clips

Clips are the core element of creating dynamic experiences. They contain the audio content generated by the API. Clips can be created, fetched, updated, and deleted. They are organized within projects, providing a hierarchical structure. By associating clips with projects, you can easily manage and categorize your generated content.
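The project-to-clip hierarchy can be pictured as nested URL paths. The sketch below is illustrative only: the base URL and the `/projects/.../clips/...` layout are assumptions, not paths confirmed by this tutorial.

```python
from typing import Optional
from urllib.parse import quote

# Hypothetical base URL, used only for illustration.
BASE_URL = "https://api.example.com/v2"


def project_url(project_id: Optional[str] = None) -> str:
    """URL for the project collection, or for one project if an ID is given."""
    url = f"{BASE_URL}/projects"
    if project_id is not None:
        url += "/" + quote(project_id, safe="")
    return url


def clip_url(project_id: str, clip_id: Optional[str] = None) -> str:
    """URL for a project's clips, or for one clip if an ID is given."""
    url = project_url(project_id) + "/clips"
    if clip_id is not None:
        url += "/" + quote(clip_id, safe="")
    return url
```

Because every clip URL is built from its project's URL, a clip cannot be addressed without naming the project that contains it.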


Voices

Voices are what you create with the API. They represent the different audio characters or styles that you want to generate. You can create voices either by uploading audio files or by using the API programmatically. Because voices can be trained and updated, you can refine the generated audio to suit your specific needs.

Creating Clips

There are three main methods for creating clips: asynchronous, synchronous, and streaming.

Async Approach

The asynchronous approach is recommended for generating content that is longer than a tweet (280 characters). The request goes into a processing queue and the audio is generated over time. Processing typically takes two to ten seconds. This approach suits applications that do not require real-time generation, such as audiobooks or assets for games.

Sync Approach

The synchronous approach is suitable for generating content that is shorter than a tweet or that requires immediate results. With this approach, the processing time is typically between 600 milliseconds and three seconds. The processing time is sublinear: it grows more slowly than the length of the audio, so doubling the audio length does not double the processing time. This approach is ideal for applications that require real-time generation, like games or voice assistants.
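The rule of thumb for choosing between the asynchronous and synchronous approaches can be sketched as a small helper. The function name and the handling of text that is exactly 280 characters long (treated here as "not longer than a tweet") are assumptions for illustration.

```python
TWEET_LENGTH = 280  # character threshold mentioned in the text


def choose_approach(text: str, needs_immediate_result: bool = False) -> str:
    """Return "sync" or "async" following the rule of thumb above.

    Text longer than a tweet goes to the asynchronous queue unless the
    caller needs the result immediately.
    """
    if needs_immediate_result or len(text) <= TWEET_LENGTH:
        return "sync"
    return "async"
```

For example, a one-line voice assistant reply would be routed to the synchronous approach, while a multi-paragraph audiobook passage would be queued asynchronously.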

Streaming Approach

The streaming approach allows for generating and playing audio content in real time. It guarantees a time to first sound of 350 to 400 milliseconds, regardless of the length of the audio. Once the initial sound is received, the rest of the content is streamed continuously. This approach is useful for mobile apps or controlled environments where you want immediate audio playback. However, it may not be suitable for platforms like Alexa, which do not support streaming APIs.
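On the client side, streaming means handling audio chunk by chunk instead of waiting for the whole clip. The sketch below is not tied to any particular HTTP client: `chunks` is assumed to be any iterable of byte chunks (for example, a streamed response body) and `handle_chunk` is a hypothetical callback that plays or buffers each chunk. It also measures the delay before the first chunk arrives.

```python
import time
from typing import Callable, Iterable, Optional, Tuple


def consume_stream(
    chunks: Iterable[bytes],
    handle_chunk: Callable[[bytes], None],
) -> Tuple[Optional[float], int]:
    """Pass each chunk to handle_chunk as soon as it arrives.

    Returns (seconds until the first chunk, total bytes received).
    The delay is None if the stream produced no chunks.
    """
    start = time.monotonic()
    first_chunk_delay = None
    total_bytes = 0
    for chunk in chunks:
        if first_chunk_delay is None:
            first_chunk_delay = time.monotonic() - start
        handle_chunk(chunk)
        total_bytes += len(chunk)
    return first_chunk_delay, total_bytes
```

Playback can begin inside `handle_chunk` as soon as the first chunk arrives, which is what makes the time to first sound independent of the total length of the audio.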

Using SSML Markup Language

The API supports the Speech Synthesis Markup Language (SSML), which allows you to add style elements to the generated audio. SSML provides tags like phoneme, which changes the pronunciation of a word, and the Resemble style tag, which introduces additional styling options. By leveraging SSML, you can create more customized and expressive audio content.
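As a small illustration of the phoneme tag, the sketch below builds an SSML string with Python's standard XML library so that special characters are escaped correctly. The `alphabet` and `ph` attributes come from the W3C SSML specification; which alphabets and attributes this API accepts is an assumption to verify in the documentation. The Resemble style tag is not shown because its attributes are not described here.

```python
import xml.etree.ElementTree as ET


def speak_with_phoneme(before: str, word: str, ipa: str, after: str = "") -> str:
    """Wrap `word` in a <phoneme> tag inside a <speak> root element.

    `ipa` is the desired pronunciation in the International Phonetic Alphabet.
    """
    speak = ET.Element("speak")
    speak.text = before
    phoneme = ET.SubElement(speak, "phoneme", {"alphabet": "ipa", "ph": ipa})
    phoneme.text = word
    phoneme.tail = after  # text that follows the closing </phoneme> tag
    return ET.tostring(speak, encoding="unicode")
```

The returned string would be used as the text of a clip in place of plain text.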

Voice Creation

There are two methods for creating voices: uploading audio and using the API programmatically.

Uploading Audio

The first method involves uploading audio files through the web interface. You can record prompts or scripts using the provided tools and upload them to create voices. This method is suitable for users who prefer a user-friendly interface and do not require programmatic integration.

API Integration

The API allows for programmatically creating voices by providing audio data in supported formats. You can format the audio data and upload it via the API, enabling integration with other platforms and applications. This method offers more control and flexibility for users who want to build voices from their own platforms.

Documentation and Interactive API

The API documentation is available at app.resemble.ai/docs. The documentation is interactive, allowing you to test requests and view responses directly on the page. You can execute API commands and observe the network requests using tools like Chrome Developer Tools. The documentation provides detailed explanations of each resource, supported languages, and sample code in different programming languages.

Updating and Deleting Clips

In addition to creating and retrieving clips, the API supports updating and deleting them. You can update clips by appending additional audio data to the existing content, which is useful for refining the generated audio over time. You can also delete clips that are no longer needed, which keeps your generated content manageable.
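A sketch of how update and delete requests might be constructed with Python's standard library, without sending them. The base URL, the path layout, the `Token token=` header scheme, and the use of `PUT` for updates are all assumptions for illustration; the interactive documentation shows the actual methods and fields.

```python
import json
from typing import Optional
from urllib.request import Request

# Hypothetical base URL, used only for illustration.
BASE_URL = "https://api.example.com/v2"


def build_clip_request(
    method: str,
    project_id: str,
    clip_id: str,
    api_token: str,
    payload: Optional[dict] = None,
) -> Request:
    """Build (but do not send) a request that targets a single clip."""
    url = f"{BASE_URL}/projects/{project_id}/clips/{clip_id}"
    # Only requests that carry a payload get a JSON body.
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {
        "Authorization": f"Token token={api_token}",  # assumed scheme
        "Content-Type": "application/json",
    }
    return Request(url, data=data, headers=headers, method=method)
```

Passing the returned object to `urllib.request.urlopen` would send it; an update would carry the changed fields as `payload`, while a delete would carry no body.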

Exporting Audio into Other Programs

The API provides the option to export audio into other programs for further editing and processing. You can download the generated audio files from the web interface or access the S3-hosted audio via the API. This flexibility allows you to integrate the generated audio into your preferred audio editing software or use it in conjunction with other tools or applications.


Summary

In this tutorial, we covered various aspects of using the API to build dynamic experiences. We discussed authentication, the available resources, the three approaches to creating clips, SSML, voice creation options, and the documentation. We also explored updating and deleting clips and exporting audio into other programs.


Conclusion

Building dynamic experiences using the API provides endless possibilities for creativity and innovation. By leveraging the available resources, different clip creation methods, and voice customization options, developers can create unique and engaging audio content. The API's flexibility, accessibility, and user-friendly documentation make it a powerful tool for developers looking to enhance their applications with dynamic audio experiences.
