In this guide, we’ll show you how to use AssemblyAI’s LeMUR (Leveraging Large Language Models to Understand Recognized Speech) framework to process Speaker Labels from your transcript using the input_text parameter. The input_text option allows you to modify the transcripts before processing it using LeMUR, and in this example, format Speaker Labels in your LeMUR request.

Calling LeMUR using transcript_ids is preferred as default. Depending on your use case, you can alternatively use the input_text parameter to call LeMUR with custom formatted transcript data including edited transcripts, speaker-labelled transcripts and more.

Get Started

Before we begin, make sure you have an AssemblyAI account and an API key. You can sign up for an account and get your API key from your dashboard.

LeMUR features are currently only available to paid users, at two pricing tiers: LeMUR and LeMUR Basic. Refer to pricing for more details.

Step-by-Step Instructions

In this guide, we will include Speaker Labels to the conversation by using the input_text parameter. We will use the Custom Task endpoint to prompt LeMUR. You can use any LeMUR Endpoint and adjust the prompt parameters to suit your project’s needs.

  1. Install the SDK.
pip install-U assemblyai

Import the assemblyai and set your API key:

import assemblyai as aai
aai.settings.api_key = "API_KEY" 

Create a TranscriptionConfig with speaker_labels set to True.

config = aai.TranscriptionConfig(speaker_labels=True)

Use the Transcriber object’s transcribe method and parse the configuration and the audio file’s path as a parameter. The transcribe method will save the results of the transcription to the Transcriber object’s transcript attribute.

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://github.com/AssemblyAI-Examples/audio-examples/raw/main/20230607_me_canadian_wildfires.mp3", config=config)

Create an empty string variable text to store the output. Iterate through each utterance in the transcript to append the formatted Speaker Labels to be used as the input_text LeMUR parameter.

text = ""

for utt in transcript.utterances:
    text += f"Speaker {utt.speaker}:\n{utt.text}\n"

Run Lemur's task method on your transcript and parse the prompt and input_text parameters. The text variable contains the formatted Speaker Labels and Text response from your transcript.

The result is stored in response as a string.

result = aai.Lemur().task(
    '''
    This is a speaker-labeled transcript of a podcast episode between Michel Martin, NPR host, and Peter DeCarlo, a professor at Johns Hopkins University. 
    Please answer the following questions: 
    1) Who is Speaker A and Speaker B? 
    2) What questions did the podcast host ask? 
    3) What were the main concerns of the podcast guest?"
    ''',
    input_text=text,
    final_model=aai.LemurModel.claude3_5_sonnet,
)

print(result.response)