Transcribe audio from a microphone in C#

Overview

By the end of this tutorial, you’ll be able to transcribe audio from your microphone in C#.

Streaming Speech-to-Text is only available for English. See Supported languages.

Before you begin

To complete this tutorial, you need:

.NET 8 (earlier versions will work too with minor adjustments)
An AssemblyAI account with credit card set up.

Here’s the full sample code for what you’ll build in this tutorial:

using System.Diagnostics;
using System.Runtime.InteropServices;
using AssemblyAI.Realtime;

// Set up the cancellation token, so we can stop the program with Ctrl+C
var cts = new CancellationTokenSource();
var ct = cts.Token;
Console.CancelKeyPress += (sender, e) => cts.Cancel();

// Set up the realtime transcriber
const int sampleRate = 16_000;
await using var transcriber = new RealtimeTranscriber(new RealtimeTranscriberOptions
{
    ApiKey = "YOUR_API_KEY",
    SampleRate = sampleRate
});

transcriber.SessionBegins.Subscribe(message => Console.WriteLine(
    $"Session begins: \n- Session ID: {message.SessionId}\n- Expires at: {message.ExpiresAt}"
));
transcriber.PartialTranscriptReceived.Subscribe(transcript =>
{
    // don't do anything if nothing was said
    if (string.IsNullOrEmpty(transcript.Text)) return;
    Console.WriteLine($"Partial: {transcript.Text}");
});
transcriber.FinalTranscriptReceived.Subscribe(transcript =>
{
    Console.WriteLine($"Final: {transcript.Text}");
});
transcriber.ErrorReceived.Subscribe(error => Console.WriteLine($"Real-time error: {error.Error}"));
transcriber.Closed.Subscribe(closeEvent =>
    Console.WriteLine("Real-time connection closed: {0} - {1}",
        closeEvent.Code,
        closeEvent.Reason
    )
);

Console.WriteLine("Connecting to real-time transcript service");

await transcriber.ConnectAsync().ConfigureAwait(false);

Console.WriteLine("Starting recording");

var soxArguments = string.Join(' ', [
    // --default-device doesn't work on Windows
    RuntimeInformation.IsOSPlatform(OSPlatform.Windows) ? "-t waveaudio default" : "--default-device",
    "--no-show-progress",
    $"--rate {sampleRate}",
    "--channels 1",
    "--encoding signed-integer",
    "--bits 16",
    "--type wav",
    "-" // pipe
]);
using var soxProcess = new Process
{
    StartInfo = new ProcessStartInfo
    {
        FileName = "sox",
        Arguments = soxArguments,
        RedirectStandardOutput = true,
        UseShellExecute = false,
        CreateNoWindow = true
    }
};

soxProcess.Start();
var soxOutputStream = soxProcess.StandardOutput.BaseStream;
var buffer = new byte[4096];
while (await soxOutputStream.ReadAsync(buffer, 0, buffer.Length, ct) > 0)
{
    if (ct.IsCancellationRequested) break;
    await transcriber.SendAudioAsync(buffer);
}

soxProcess.Kill();
await transcriber.CloseAsync();

Step 1: Set up a cancellation token

Set up a cancellation token so you can gracefully stop the application.

using System.Diagnostics;
using AssemblyAI.Realtime;

// Set up the cancellation token, so we can stop the program with Ctrl+C
var cts = new CancellationTokenSource();
var ct = cts.Token;
Console.CancelKeyPress += (sender, e) => cts.Cancel();

The cancellation token will be cancelled when you press Ctrl+C.

Step 2: Install the AssemblyAI C# .NET SDK

Add the AssemblyAI NuGet package to your project:

dotnet add package AssemblyAI

Step 3: Create a real-time transcriber

In this step, you’ll create a real-time transcriber and configure it to use your API key.

Browse to , and then click the text under Your API key to copy it.

Create a RealtimeTranscriber with your API key and a sample rate of 16 kHz. Replace YOUR_API_KEY with your copied API key.

// Set up the realtime transcriber
const int sampleRate = 16_000;
await using var transcriber = new RealtimeTranscriber(new RealtimeTranscriberOptions
{
 ApiKey = "YOUR_API_KEY",
 SampleRate = sampleRate
});

The sampleRate is the number of audio samples per second, measured in hertz (Hz). Higher sample rates result in higher quality audio, which may lead to better transcripts, but also more data being sent over the network.We recommend the following sample rates:

Minimum quality: 8_000 (8 kHz)
Medium quality: 16_000 (16 kHz)
Maximum quality: 48_000 (48 kHz)

If you don’t set a sample rate on the real-time transcriber, it defaults to 16 kHz.

Subscribe to the different transcriber events and log the event parameters.

transcriber.SessionBegins.Subscribe(message => Console.WriteLine(
 $"Session begins: \n- Session ID: {message.SessionId}\n- Expires at: {message.ExpiresAt}"
));
transcriber.PartialTranscriptReceived.Subscribe(transcript =>
{
 // don't do anything if nothing was said
 if (string.IsNullOrEmpty(transcript.Text)) return;
 Console.WriteLine($"Partial: {transcript.Text}");
});
transcriber.FinalTranscriptReceived.Subscribe(transcript =>
{
 Console.WriteLine($"Final: {transcript.Text}");
});
transcriber.ErrorReceived.Subscribe(error => Console.WriteLine($"Real-time error: {error.Error}"));
transcriber.Closed.Subscribe(closeEvent =>
 Console.WriteLine("Real-time connection closed: {0} - {1}",
     closeEvent.Code,
     closeEvent.Reason
 )
);

The real-time transcriber returns two types of transcripts: partial and final.

Partial transcripts are returned as the audio is being streamed to AssemblyAI.
Final transcripts are returned when the service detects a pause in speech.

You can configure the silence threshold for automatic utterance detection and programmatically force the end of an utterance to immediately get a Final transcript.

Step 4: Connect the streaming service

Streaming Speech-to-Text uses WebSockets to stream audio to AssemblyAI. This requires first establishing a connection to the API.

Console.WriteLine("Connecting to real-time transcript service");await transcriber.ConnectAsync();

Step 5: Record audio from microphone

In this step, you’ll use SoX, a cross-platform audio library, to record audio from your microphone.

Install SoX on your machine.

brew install sox

To run the SoX process and pipe the audio data to the process output, add the following code:

Console.WriteLine("Starting recording");

var soxArguments = string.Join(' ', [
 // --default-device doesn't work on Windows
 RuntimeInformation.IsOSPlatform(OSPlatform.Windows) ? "-t waveaudio default" : "--default-device",
 "--no-show-progress",
 $"--rate {sampleRate}",
 "--channels 1",
 "--encoding signed-integer",
 "--bits 16",
 "--type wav",
 "-" // pipe
]);
using var soxProcess = new Process
{
 StartInfo = new ProcessStartInfo
 {
     FileName = "sox",
     Arguments = soxArguments,
     RedirectStandardOutput = true,
     UseShellExecute = false,
     CreateNoWindow = true
 }
};

soxProcess.Start();

The SoX arguments configure the format of the audio output. The arguments configure the format to a single channel with 16-bit signed integer PCM encoding and 16 kHz sample rate.If you want to stream data from elsewhere, make sure that your audio data is in the following format:

Single channel
16-bit signed integer PCM or mu-law encoding

By default, the Streaming STT service expects PCM16-encoded audio. If you want to use mu-law encoding, see Specifying the encoding.

Read the audio data from the SoX process output and send it to the real-time transcriber.

var soxOutputStream = soxProcess.StandardOutput.BaseStream;
var buffer = new byte[4096];
while (await soxOutputStream.ReadAsync(buffer, 0, buffer.Length, ct) > 0)
{
 if (ct.IsCancellationRequested) break;
 await transcriber.SendAudioAsync(buffer);
}

Step 6: Disconnect the real-time service

When you are done, stop the recording, and disconnect the transcriber to close the connection.

soxProcess.Kill();await transcriber.CloseAsync();

Next steps

To learn more about Streaming Speech-to-Text, see the following resources:

Need some help?

If you get stuck, or have any other questions, we’d love to help you out. Ask our support team in our Discord server.

Introduction

Getting Started

Transcribe audio from a microphone in C#

Overview

Before you begin

Step 1: Set up a cancellation token

Step 2: Install the AssemblyAI C# .NET SDK

Step 3: Create a real-time transcriber

Step 4: Connect the streaming service

Step 5: Record audio from microphone

Step 6: Disconnect the real-time service

Next steps

Need some help?

Introduction

Getting Started

​Overview

​Before you begin

​Step 1: Set up a cancellation token

​Step 2: Install the AssemblyAI C# .NET SDK

​Step 3: Create a real-time transcriber

​Step 4: Connect the streaming service

​Step 5: Record audio from microphone

​Step 6: Disconnect the real-time service

​Next steps

​Need some help?

Overview

Before you begin

Step 1: Set up a cancellation token

Step 2: Install the AssemblyAI C# .NET SDK

Step 3: Create a real-time transcriber

Step 4: Connect the streaming service

Step 5: Record audio from microphone

Step 6: Disconnect the real-time service

Next steps

Need some help?