speaker_labels parameter in your request, and then find the results inside a field called utterances.
Get started
Before we begin, make sure you have an AssemblyAI account and an API key. You can sign up for a free account and get your API key from your dashboard. The complete source code for this guide can be viewed here. Here is an audio example for this guide:

Step-by-step instructions
- 1 Create a new file and import the necessary libraries for making an HTTP request.
- 2 Set up the API endpoint and headers. The headers should include your API key.
- 3 Upload your local file to the AssemblyAI API.
- 4 Use the upload_url returned by the AssemblyAI API to create a JSON payload containing the audio_url parameter and the speaker_labels parameter set to True.
- 5 Make a POST request to the AssemblyAI API endpoint with the payload and headers.
- 6 After making the request, you’ll receive an ID for the transcription. Use it to poll the API every few seconds to check the status of the transcript job. Once the status is completed, you can retrieve the transcript from the API response, using the utterances key to access the results.
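The steps above can be sketched in Python with the requests library. This is a minimal sketch, not the guide's exact source code: the filename audio.mp3, the placeholder API key, and the 3-second poll interval are assumptions for illustration.

```python
import time

import requests  # third-party: pip install requests

UPLOAD_ENDPOINT = "https://api.assemblyai.com/v2/upload"
TRANSCRIPT_ENDPOINT = "https://api.assemblyai.com/v2/transcript"


def build_payload(upload_url: str, speaker_labels: bool = True) -> dict:
    """Step 4: JSON payload with the audio_url and speaker_labels parameters."""
    return {"audio_url": upload_url, "speaker_labels": speaker_labels}


def transcribe_with_speakers(filename: str, api_key: str) -> dict:
    headers = {"authorization": api_key}

    # Steps 1-3: upload the local file and read back the upload_url.
    with open(filename, "rb") as f:
        upload_response = requests.post(UPLOAD_ENDPOINT, headers=headers, data=f)
    upload_url = upload_response.json()["upload_url"]

    # Step 5: POST the payload to request the transcription; the response
    # contains an id for the new transcript job.
    transcript_response = requests.post(
        TRANSCRIPT_ENDPOINT, json=build_payload(upload_url), headers=headers
    )
    transcript_id = transcript_response.json()["id"]

    # Step 6: poll every few seconds until the job completes (or errors).
    while True:
        result = requests.get(
            f"{TRANSCRIPT_ENDPOINT}/{transcript_id}", headers=headers
        ).json()
        if result["status"] == "completed":
            return result
        if result["status"] == "error":
            raise RuntimeError(result["error"])
        time.sleep(3)


if __name__ == "__main__":
    transcript = transcribe_with_speakers("audio.mp3", "YOUR_API_KEY")
    print(transcript["utterances"])
```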
Understanding the response
The speaker label information is included in the utterances key of the response. Each utterance object in the list includes a speaker field, which contains a string identifier for the speaker (e.g., “A”, “B”, etc.), and a text field containing the spoken text, along with confidence scores both for the utterance as a whole and for its individual words.
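As a sketch of working with that structure, the snippet below iterates over a list shaped like the utterances field and prints each line with its speaker label. The sample data here is illustrative, not real API output.

```python
# Illustrative utterances list shaped like the API response (not real output).
utterances = [
    {"speaker": "A", "text": "Hello, how are you?", "confidence": 0.97},
    {"speaker": "B", "text": "I'm doing well, thanks.", "confidence": 0.95},
]


def format_utterances(utterances: list) -> list:
    """Render each utterance object as 'Speaker X: text'."""
    return [f"Speaker {u['speaker']}: {u['text']}" for u in utterances]


for line in format_utterances(utterances):
    print(line)
```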
For more information, see the Speaker Diarization model documentation or see the API reference.
Specifying the number of speakers
You can provide the optional speakers_expected parameter to specify the expected number of speakers in an audio file.
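For example, the request payload from the steps above could be extended with speakers_expected; the value 2 here is an assumption for illustration.

```python
# Payload with the optional speakers_expected parameter added.
payload = {
    "audio_url": "https://example.com/audio.mp3",  # placeholder upload_url
    "speaker_labels": True,
    "speakers_expected": 2,  # hint that the file contains two speakers
}
print(payload)
```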
API/Model Reference