This workflow keeps Automatic Language Detection (ALD) inexpensive: it slices the first 60 seconds of audio and runs that clip through Nano ALD, which detects 99 languages. Language detection with this workflow costs $0.002 per transcript, not including the cost of the full transcription.
Performing ALD with this workflow has a few benefits:
- Cost-effective language detection
- Ability to detect 99 languages
- Ability to use Nano as a fallback if the language is not supported in Best
- Ability to enable Audio Intelligence models if the language is supported
- Ability to use LeMUR with LLM prompts in Spanish for Spanish audio
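To make the quoted detection cost concrete, here is a minimal sketch that multiplies the $0.002 per-transcript figure by a file count. The name `detection_cost` is hypothetical, and the result covers only the language detection step, not the transcription itself.

```python
from decimal import Decimal

# Per-transcript price of the language detection step, as quoted in this guide.
DETECTION_COST_PER_FILE = Decimal("0.002")


def detection_cost(num_files: int) -> Decimal:
    """Return the language detection cost in USD for num_files audio files."""
    if num_files < 0:
        raise ValueError("num_files must be non-negative")
    return DETECTION_COST_PER_FILE * num_files


# Detection cost for a batch of 1,000 files (transcription is billed separately).
print(detection_cost(1000))
```

`Decimal` is used here so that small per-file prices add up without floating-point rounding noise.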
Before you begin
To complete this tutorial, you need:
The entire source code of this guide can be viewed here.
Step-by-step instructions
Install the Python SDK:
pip install assemblyai
Import the assemblyai package and set your API key:
import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"
Create a set with all supported languages for Best. You can find them in our documentation here.
supported_languages_for_best = {
    "en",
    "es",
    "fr",
    "de",
    "it",
    "pt",
    "nl",
    "hi",
    "ja",
    "zh",
    "fi",
    "ko",
    "pl",
    "ru",
    "tr",
    "uk",
    "vi",
}
Define a Transcriber. Note that here we don’t pass in a global TranscriptionConfig, but later apply different ones during the transcribe() call.
transcriber = aai.Transcriber()
Define two helper functions:
- detect_language() performs language detection on the first 60 seconds of the audio using Nano and returns the language code.
- transcribe_file() performs the transcription using Best or Nano depending on the identified language.
def detect_language(audio_url):
    config = aai.TranscriptionConfig(
        audio_end_at=60000,  # first 60 seconds (in milliseconds)
        language_detection=True,
        speech_model=aai.SpeechModel.nano,
    )
    transcript = transcriber.transcribe(audio_url, config=config)
    return transcript.json_response["language_code"]


def transcribe_file(audio_url, language_code):
    config = aai.TranscriptionConfig(
        language_code=language_code,
        speech_model=(
            aai.SpeechModel.best
            if language_code in supported_languages_for_best
            else aai.SpeechModel.nano
        ),
    )
    transcript = transcriber.transcribe(audio_url, config=config)
    return transcript
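The model choice inside transcribe_file() can be isolated into a small pure function, which makes the fallback rule easy to check without calling the API. This is only a sketch: `choose_model_name` is a hypothetical helper, and it returns plain strings rather than `aai.SpeechModel` members so that it runs without the SDK installed.

```python
# Languages this guide lists as supported by Best.
SUPPORTED_LANGUAGES_FOR_BEST = {
    "en", "es", "fr", "de", "it", "pt", "nl", "hi", "ja",
    "zh", "fi", "ko", "pl", "ru", "tr", "uk", "vi",
}


def choose_model_name(language_code: str) -> str:
    """Return "best" when the language is supported by Best, otherwise "nano"."""
    return "best" if language_code in SUPPORTED_LANGUAGES_FOR_BEST else "nano"


# Spanish is in the Best set; Slovenian is not, so it falls back to Nano.
for code in ("es", "sl"):
    print(code, "->", choose_model_name(code))
```

Keeping the rule in one place means that updating the set of supported languages is the only change needed when Best adds a language.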
Test the code with different audio files. Apply both helper functions sequentially to each file to first identify the language and then transcribe the file.
audio_urls = [
    "https://storage.googleapis.com/aai-web-samples/public_benchmarking_portugese.mp3",
    "https://storage.googleapis.com/aai-web-samples/public_benchmarking_spanish.mp3",
    "https://storage.googleapis.com/aai-web-samples/slovenian_luka_doncic_interview.mp3",
    "https://storage.googleapis.com/aai-web-samples/5_common_sports_injuries.mp3",
]

for audio_url in audio_urls:
    language_code = detect_language(audio_url)
    print("Identified language:", language_code)
    transcript = transcribe_file(audio_url, language_code)
    print("Transcript:", transcript.text[:100], "...")
Output:
Identified language: pt
Transcript: e aí Olá pessoal, sejam bem-vindos a mais um vídeo e hoje eu vou ensinar-vos como fazer esta espada ...
Identified language: es
Transcript: Precisamente sobre este caso, el diario estadounidense New York Times reveló este sábado un conjunto ...
Identified language: sl
Transcript: Ni lepška, kaj videl tega otroka v mrekoj svojga okolja, da mu je uspil in to v takimi miri, da pač ...
Identified language: en
Transcript: Runner's knee runner's knee is a condition characterized by pain behind or around the kneecap. It is ...
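When the detect-then-transcribe loop runs over many files, a single failing request can stop the whole batch. The sketch below shows one possible way to keep going and collect failures separately. `process_all` and the two stand-in helpers are hypothetical and not part of the SDK. It assumes that failures surface as exceptions, which may not hold for every error case.

```python
from typing import Callable, Dict, List, Tuple


def process_all(
    audio_urls: List[str],
    detect: Callable[[str], str],
    transcribe: Callable[[str, str], object],
) -> Tuple[Dict[str, object], Dict[str, str]]:
    """Run detect, then transcribe, on each URL and collect failures separately."""
    results: Dict[str, object] = {}
    failures: Dict[str, str] = {}
    for url in audio_urls:
        try:
            language_code = detect(url)
            results[url] = transcribe(url, language_code)
        except Exception as exc:  # assumption: failures surface as exceptions
            failures[url] = str(exc)
    return results, failures


# Stand-in helpers so the sketch runs offline; swap in detect_language and
# transcribe_file from this guide for real use.
def fake_detect(url: str) -> str:
    if "broken" in url:
        raise RuntimeError("detection failed")
    return "en"


def fake_transcribe(url: str, language_code: str) -> str:
    return f"[{language_code}] transcript of {url}"


results, failures = process_all(["a.mp3", "broken.mp3"], fake_detect, fake_transcribe)
print(results)
print(failures)
```

Passing the helpers in as arguments keeps the batching logic independent of the API, so it can be tested with stand-ins like the ones shown.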