Overview
By the end of this tutorial, you’ll be able to transcribe audio from your microphone in C#.
Streaming Speech-to-Text is only available for English. See Supported languages.
Before you begin
To complete this tutorial, you need:
- .NET 8 (earlier versions also work with minor adjustments)
- An AssemblyAI account with a credit card set up.
Step 1: Set up a cancellation token
Set up a cancellation token so you can gracefully stop the application. The token is cancelled when you press Ctrl+C.
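The following is a minimal sketch of one way to wire this up using only standard .NET APIs. The variable names cts and ct are illustrative, not taken from the original sample.

```csharp
using System;
using System.Threading;

// Token source that is cancelled when the user presses Ctrl+C.
using var cts = new CancellationTokenSource();
var ct = cts.Token;

Console.CancelKeyPress += (sender, e) =>
{
    // Keep the process alive so the rest of the program can shut down cleanly.
    e.Cancel = true;
    cts.Cancel();
};

Console.WriteLine($"Cancellation requested so far: {ct.IsCancellationRequested}");
```

Passing ct to the later asynchronous calls lets each of them stop promptly once Ctrl+C is pressed.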
Step 2: Install the AssemblyAI C# .NET SDK
Add the AssemblyAI NuGet package to your project, for example by running dotnet add package AssemblyAI in the project directory.
Step 3: Create a real-time transcriber
In this step, you’ll create a real-time transcriber and configure it to use your API key.
1. In your AssemblyAI account, click the text under Your API key to copy it.
2. Create a RealtimeTranscriber with your API key and a sample rate of 16 kHz. Replace YOUR_API_KEY with your copied API key.
The sampleRate is the number of audio samples per second, measured in hertz (Hz). Higher sample rates result in higher quality audio, which may lead to better transcripts, but also more data being sent over the network.
We recommend the following sample rates:
- Minimum quality: 8_000 (8 kHz)
- Medium quality: 16_000 (16 kHz)
- Maximum quality: 48_000 (48 kHz)
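To make the network trade-off concrete, the sketch below computes the raw data rate for each recommended sample rate, assuming uncompressed single-channel 16-bit PCM. BytesPerSecond is a hypothetical helper written for this illustration, not part of the SDK.

```csharp
using System;

// Bytes per second of uncompressed PCM audio:
// samples per second * bytes per sample * channels.
int BytesPerSecond(int sampleRate, int bitsPerSample = 16, int channels = 1)
    => sampleRate * (bitsPerSample / 8) * channels;

foreach (var rate in new[] { 8_000, 16_000, 48_000 })
{
    Console.WriteLine($"{rate} Hz -> {BytesPerSecond(rate)} bytes per second");
}
```

Under these assumptions, 16 kHz audio needs twice the bandwidth of 8 kHz audio, and 48 kHz needs six times as much.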
3. Subscribe to the different transcriber events and log the event parameters.
The real-time transcriber returns two types of transcripts: partial and final.
- Partial transcripts are returned as the audio is being streamed to AssemblyAI.
- Final transcripts are returned when the service detects a pause in speech.
You can configure the silence threshold for automatic utterance detection and programmatically force the end of an utterance to immediately get a Final transcript.
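The original code samples for this step are not included here, so the following is only a sketch of what creating the transcriber and subscribing to its events might look like. It requires the AssemblyAI NuGet package. The option, event, and property names shown (RealtimeTranscriberOptions, SessionBegins, PartialTranscriptReceived, FinalTranscriptReceived, ErrorReceived, and the properties read from each event argument) are assumptions based on the SDK at the time of writing and may differ between versions, so check them against the SDK reference.

```csharp
using System;
using AssemblyAI.Realtime;

// Assumed API shape: verify the names against the SDK version you installed.
await using var transcriber = new RealtimeTranscriber(new RealtimeTranscriberOptions
{
    ApiKey = "YOUR_API_KEY",
    SampleRate = 16_000
});

transcriber.SessionBegins.Subscribe(message =>
    Console.WriteLine($"Session begins: {message.SessionId}"));

transcriber.PartialTranscriptReceived.Subscribe(transcript =>
{
    // Partial transcripts can be empty while nothing is being said.
    if (string.IsNullOrEmpty(transcript.Text)) return;
    Console.WriteLine($"Partial: {transcript.Text}");
});

transcriber.FinalTranscriptReceived.Subscribe(transcript =>
    Console.WriteLine($"Final: {transcript.Text}"));

transcriber.ErrorReceived.Subscribe(error =>
    Console.WriteLine($"Real-time error: {error.Error}"));
```

Logging partial and final transcripts with different prefixes makes it easy to see when the service has detected the end of an utterance.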
Step 4: Connect the streaming service
Streaming Speech-to-Text uses WebSockets to stream audio to AssemblyAI. This requires first establishing a connection to the API.
Step 5: Record audio from microphone
In this step, you’ll use SoX, a cross-platform command-line audio tool, to record audio from your microphone.
1. Install SoX on your machine.
2. To run the SoX process and pipe the audio data to the process output, add the following code:
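The original sample is not reproduced here, so the block below is a sketch under stated assumptions: the sox executable is on the PATH, the Windows input is selected with the waveaudio driver, and the output is written as headerless raw PCM. BuildSoxArguments and StartSox are hypothetical helper names chosen for this illustration.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

// Build the SoX command line: record from the default input device and
// write single-channel 16-bit signed PCM to standard output.
string BuildSoxArguments(int sampleRate) => string.Join(' ', new[]
{
    "--no-show-progress",
    // Assumption: on Windows the default device is selected via the waveaudio driver.
    RuntimeInformation.IsOSPlatform(OSPlatform.Windows) ? "-t waveaudio default" : "--default-device",
    $"--rate {sampleRate}",
    "--channels 1",
    "--encoding signed-integer",
    "--bits 16",
    "--type raw", // headerless PCM; an illustrative choice
    "-"           // write to standard output
});

// Start SoX with its standard output redirected so the audio can be read
// from the returned process' StandardOutput.BaseStream.
Process StartSox(int sampleRate)
{
    var startInfo = new ProcessStartInfo
    {
        FileName = "sox",
        Arguments = BuildSoxArguments(sampleRate),
        RedirectStandardOutput = true,
        UseShellExecute = false,
        CreateNoWindow = true
    };
    return Process.Start(startInfo)
        ?? throw new InvalidOperationException("Could not start SoX.");
}

// Print the command instead of starting a recording, so the sketch runs without SoX.
Console.WriteLine($"sox {BuildSoxArguments(16_000)}");
// To record for real:
// using var soxProcess = StartSox(16_000);
```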
The SoX arguments configure the format of the audio output: a single channel with 16-bit signed integer PCM encoding and a 16 kHz sample rate.
If you want to stream data from elsewhere, make sure that your audio data is in the following format:
- Single channel
- 16-bit signed integer PCM or mu-law encoding
3. Read the audio data from the SoX process output and send it to the real-time transcriber.
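The read-and-send loop can be sketched independently of the SDK. In the block below, PumpAudioAsync is a hypothetical helper, and sendAudio is a stand-in delegate for whatever send method your transcriber exposes. The 3,200-byte chunk size (100 ms of 16 kHz, 16-bit mono audio) is an illustrative choice, and an in-memory stream stands in for the SoX output so the sketch runs on its own.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Read fixed-size chunks from an audio stream and forward each one to a sink
// until the stream ends or cancellation is requested. Returns the bytes sent.
async Task<long> PumpAudioAsync(
    Stream audio,
    Func<ReadOnlyMemory<byte>, Task> sendAudio,
    int chunkSize,
    CancellationToken ct)
{
    var buffer = new byte[chunkSize];
    long totalBytes = 0;
    while (!ct.IsCancellationRequested)
    {
        int bytesRead = await audio.ReadAsync(buffer.AsMemory(0, chunkSize), ct);
        if (bytesRead == 0) break; // end of stream

        // The buffer is reused, so the sink must finish with the chunk before returning.
        await sendAudio(buffer.AsMemory(0, bytesRead));
        totalBytes += bytesRead;
    }
    return totalBytes;
}

// Demo: 10,000 bytes of silence stand in for the SoX process output.
using var fakeAudio = new MemoryStream(new byte[10_000]);
int chunks = 0;
long sent = await PumpAudioAsync(
    fakeAudio,
    chunk => { chunks++; return Task.CompletedTask; },
    3_200,
    CancellationToken.None);

Console.WriteLine($"Sent {sent} bytes in {chunks} chunks");
```

In the real application, you would pass the SoX process' StandardOutput.BaseStream as the stream, the transcriber's send method as the delegate, and the cancellation token from Step 1.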