Quickstart

Enable Topic Detection by setting iab_categories to true in the transcription config.

import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"

# audio_file = "./local_file.mp3"
audio_file = "https://assembly.ai/wildfires.mp3"

config = aai.TranscriptionConfig(iab_categories=True)

transcript = aai.Transcriber().transcribe(audio_file, config)

# Get the parts of the transcript that were tagged with topics
for result in transcript.iab_categories.results:
    print(result.text)
    print(f"Timestamp: {result.timestamp.start} - {result.timestamp.end}")
    for label in result.labels:
        print(f"{label.label} ({label.relevance})")

# Get a summary of all topics in the transcript
for topic, relevance in transcript.iab_categories.summary.items():
    print(f"Audio is {relevance * 100}% relevant to {topic}")

Example output

Smoke from hundreds of wildfires in Canada is triggering air quality alerts throughout the US. Skylines...
Timestamp: 250 - 28920
Home&Garden>IndoorEnvironmentalQuality (0.9881)
NewsAndPolitics>Weather (0.5561)
MedicalHealth>DiseasesAndConditions>LungAndRespiratoryHealth (0.0042)
...
Audio is 100.0% relevant to NewsAndPolitics>Weather
Audio is 93.78% relevant to Home&Garden>IndoorEnvironmentalQuality
...
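
The summary is convenient to post-process once you have it. The following is a minimal sketch, assuming the summary can be treated as a plain dict of topic label to relevance score. The helper name top_topics and the 0.5 threshold are illustrative choices, not part of the SDK.

```python
def top_topics(summary, min_relevance=0.5):
    """Return (topic, relevance) pairs at or above min_relevance, most relevant first."""
    kept = [(topic, score) for topic, score in summary.items() if score >= min_relevance]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Sample values copied from the example output above; a real summary
# would come from transcript.iab_categories.summary.
sample_summary = {
    "Home&Garden>IndoorEnvironmentalQuality": 0.9378,
    "NewsAndPolitics>Weather": 1.0,
}

for topic, score in top_topics(sample_summary):
    # Format the score as a percentage with one decimal place.
    print(f"{topic}: {score:.1%}")
```

Sorting and rounding at print time keeps the raw scores untouched, so the same dict can still be used for further filtering.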

API reference

Request

curl https://api.assemblyai.com/v2/transcript \
--header "Authorization: YOUR_API_KEY" \
--header "Content-Type: application/json" \
--data '{
  "audio_url": "YOUR_AUDIO_URL",
  "iab_categories": true
}'

Key | Type | Description
iab_categories | boolean | Enable Topic Detection.
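
For readers who are not using curl, the same request can be assembled with the Python standard library. This is a sketch under stated assumptions: the function name build_transcript_request is hypothetical, the endpoint, headers, and body are copied from the curl example above, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_transcript_request(api_key, audio_url):
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {"audio_url": audio_url, "iab_categories": True}
    return urllib.request.Request(
        "https://api.assemblyai.com/v2/transcript",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_transcript_request("YOUR_API_KEY", "YOUR_AUDIO_URL")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually submit it, pass the object to urllib.request.urlopen(request).
```

Keeping construction separate from sending makes it easy to inspect the payload before spending an API call.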

Response

{
  "iab_categories": true,
  "iab_categories_result": {
    "status": "success",
    "results": [...],
    "summary": {...}
  }
}

Key | Type | Description
iab_categories_result | object | The result of the Topic Detection model.
iab_categories_result.status | string | Either success, or unavailable in the rare case that the Topic Detection model failed.
iab_categories_result.results | array | An array of the Topic Detection results.
iab_categories_result.results[i].text | string | The text in the transcript in which the i-th instance of a detected topic occurs.
iab_categories_result.results[i].labels[j].relevance | number | How relevant the j-th detected topic is in the i-th instance of a detected topic.
iab_categories_result.results[i].labels[j].label | string | The IAB taxonomical label for the j-th label of the i-th instance of a detected topic, where > denotes a supertopic/subtopic relationship.
iab_categories_result.results[i].timestamp.start | number | The starting time, in milliseconds, at which the i-th detected topic instance is discussed in the audio file.
iab_categories_result.results[i].timestamp.end | number | The ending time, in milliseconds, at which the i-th detected topic instance is discussed in the audio file.
iab_categories_result.summary | object | Summary where each property is a detected topic.
iab_categories_result.summary.topic | number | The overall relevance of topic to the entire audio file.

The response also includes the request parameters used to generate the transcript.
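
To make the response fields concrete, here is a minimal sketch that walks the structure described above and splits each label on > into its supertopic/subtopic path. The helper name describe_results is hypothetical, and sample_response is a hand-written stand-in built from the example output, not a real API response.

```python
def describe_results(response):
    """Flatten iab_categories_result into one row per detected label."""
    result = response.get("iab_categories_result") or {}
    # status is "unavailable" when the model failed, so return nothing then.
    if result.get("status") != "success":
        return []
    rows = []
    for item in result.get("results", []):
        start = item["timestamp"]["start"]
        end = item["timestamp"]["end"]
        for label in item["labels"]:
            rows.append({
                # ">" separates supertopic from subtopic in the IAB label.
                "path": label["label"].split(">"),
                "relevance": label["relevance"],
                "start": start,
                "end": end,
            })
    return rows


# Hand-written sample shaped like the response above (assumption, not real output).
sample_response = {
    "iab_categories": True,
    "iab_categories_result": {
        "status": "success",
        "results": [
            {
                "text": "Smoke from hundreds of wildfires in Canada...",
                "labels": [
                    {"label": "Home&Garden>IndoorEnvironmentalQuality", "relevance": 0.9881},
                    {"label": "NewsAndPolitics>Weather", "relevance": 0.5561},
                ],
                "timestamp": {"start": 250, "end": 28920},
            }
        ],
        "summary": {"NewsAndPolitics>Weather": 1.0},
    },
}

for row in describe_results(sample_response):
    print(" / ".join(row["path"]), row["relevance"], row["start"], row["end"])
```

Checking status before reading results avoids key errors when the model result is unavailable.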

Frequently asked questions