Haystack Integration for AssemblyAI

On top of audio transcription the AssemblyAITranscriber offers summarization and speaker diarization. This makes it possible to not only convert audio to text but also obtain concise summaries and identify speakers in a conversation.

Quickstart

Install the assemblyai-haystack package using pip. This package installs and uses the AssemblyAI Python SDK and Haystack 2.0. You can find more information about the SDK at the AssemblyAI Python SDK GitHub repository.

pip install assemblyai-haystack

Usage

Add an AssemblyAITranscriber component and initialize it by passing your AssemblyAI API key. Once the pipeline is ready to run, make sure to pass at least the file_path argument to the run function. The file_path can be a URL or a local file path. In the run function, you can also specify whether you want summarization and speaker diarization results.

import os

from assemblyai_haystack.transcriber import AssemblyAITranscriber
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack import Pipeline
from haystack.components.writers import DocumentWriter

ASSEMBLYAI_API_KEY = os.environ.get("ASSEMBLYAI_API_KEY")

document_store = InMemoryDocumentStore()
file_url = "https://assembly.ai/wildfires.mp3"

indexing = Pipeline()
indexing.add_component("transcriber", AssemblyAITranscriber(api_key=ASSEMBLYAI_API_KEY))
indexing.add_component("writer", DocumentWriter(document_store))
indexing.connect("transcriber.transcription", "writer.documents")
indexing.run(
    {
        "transcriber": {
            "file_path": file_url,
            "summarization": None,
            "speaker_labels": None,
        }
    }
)

print("Indexed Document Count:", document_store.count_documents())

Calling indexing.run() blocks until the transcription is finished.

The results of the transcription, summarization and speaker diarization are returned in separate document lists:

transcription
summarization
speaker_labels

When AssemblyAITranscriber is used in a Haystack pipeline, transcription happens by default. In the metadata of the transcription, you will also get the ID of the transcription and the URL of your audio file. A bullet point summary of what is being discussed will be returned if summarization is set to TRUE. The transcription divided into utterances of speakers will be returned if speaker_labels is set to TRUE. The output of the AssemblyAITranscriber is a Haystack document. When all features are turned on, the created document looks like this:

{
"transcription":
  [Document(
    id=bdf3eb20f6440cf4b15fa4fa3176eeb72bf0139a3ad4c76741724132907a5daa,
    content: "Smoke from hundreds of wildfires in Canada is triggering air quality alerts throughout the US. Skyli...",
    meta: {
      'transcript_id': '2335cc07-1fbf-48ba-9855-7db3eeeb80f4',
      'audio_url': "https://assembly.ai/wildfires.mp3"
      }
    )
  ],

"summarization":
  [Document(
    id=f88864d9229b30013d5248156e74d5bfd4435e73aadb0c0ce79040be10a4f308,
    content: "- Smoke from hundreds of wildfires in Canada is triggering air quality alerts...")],

"speaker_labels":
  [Document(
    id=a7e222bc6a965ab1032401a6fa22da2e774294ce049b9d228acbb8b100ea2ecf,
    content: "Smoke from hundreds of wildfires in Canada is triggering air quality...",
    meta: {
      'speaker': 'A'
      }
    ),

  Document(
    id=711a1888af58601e6392490a5e4ca4c10958f93a52d8f0734869c54573ea76f5,
    content: "Well, there's a couple of things. The season has been pretty dry already...",
    meta: {
      'speaker': 'B'
      }
    ),

  Document(
    id=8fc78631d420e2e6127b8bdff2830f693febb91ed1566b9a84527cf023023d9e,
    content: "So what is it in this haze that makes it harmful?",
    meta: {
      'speaker': 'A'
      }
    ),

    ...
]}

Additional resources

You can learn more about using Haystack with AssemblyAI in these resources:

Integrations

​Quickstart

​Usage

​Additional resources

Quickstart

Usage

Additional resources