Overview
By the end of this tutorial, you’ll be able to transcribe audio from your microphone in Java.
Streaming Speech-to-Text is only available for English. See Supported languages.
Before you begin
To complete this tutorial, you need:
- Java 8 or above.
- An AssemblyAI account with credit card set up.
Step 1: Install the SDK
Include the latest version of AssemblyAI’s Java SDK in your project dependencies.
Step 2: Create a real-time transcriber
In this step, you’ll create a real-time transcriber and configure it to use your API key.
1
Browse to Account, and then click the text under Your API key to copy it.
2
Use the builder to create a new real-time transcriber with your API key, a sample rate of 16 kHz, and lambdas to log the different events. Replace YOUR_API_KEY with your copied API key.
The real-time transcriber returns two types of transcripts: partial and final.
- Partial transcripts are returned as the audio is being streamed to AssemblyAI.
- Final transcripts are returned when the service detects a pause in speech.
You can configure the silence threshold for automatic utterance detection and programmatically force the end of an utterance to immediately get a Final transcript.
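One way a client might consume the two transcript types is sketched below. It is not part of the SDK: the class TranscriptBufferSketch and its methods onPartial, onFinal, and render are hypothetical names, and the sketch assumes that each partial transcript supersedes the previous partial for the same utterance, while a final transcript closes that utterance.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not an SDK class: keeps finished utterances plus the
// utterance that is still being spoken.
class TranscriptBufferSketch {
    private final List<String> finals = new ArrayList<>();
    private String partial = "";

    // Assumption: a newer partial replaces the older partial of the same utterance.
    void onPartial(String text) {
        if (!text.isEmpty()) {
            partial = text;
        }
    }

    // A final transcript closes the current utterance, so the partial is cleared.
    void onFinal(String text) {
        finals.add(text);
        partial = "";
    }

    // Finished utterances followed by whatever is currently in progress.
    String render() {
        String done = String.join(" ", finals);
        if (partial.isEmpty()) {
            return done;
        }
        return done.isEmpty() ? partial : done + " " + partial;
    }
}
```

The event lambdas passed to the transcriber's builder could forward the transcript text to methods like these instead of only logging it.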
The sample_rate is the number of audio samples per second, measured in hertz (Hz). Higher sample rates result in higher quality audio, which may lead to better transcripts, but also more data being sent over the network.
We recommend the following sample rates:
- Minimum quality: 8_000 (8 kHz)
- Medium quality: 16_000 (16 kHz)
- Maximum quality: 48_000 (48 kHz)
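To make the network cost concrete, the following sketch estimates the raw audio data rate for each recommended sample rate. It assumes single-channel, 16-bit PCM audio (2 bytes per sample); the class and method names are illustrative only, and a different encoding such as mu-law would change the bytes per sample.

```java
// Illustrative helper: raw data rate of uncompressed mono 16-bit PCM audio.
class SampleRateSketch {
    static final int BYTES_PER_SAMPLE = 2; // assumption: 16-bit PCM
    static final int CHANNELS = 1;         // assumption: single channel

    static int bytesPerSecond(int sampleRate) {
        return sampleRate * BYTES_PER_SAMPLE * CHANNELS;
    }

    public static void main(String[] args) {
        for (int rate : new int[] {8_000, 16_000, 48_000}) {
            System.out.println(rate + " Hz -> " + bytesPerSecond(rate) + " bytes per second");
        }
    }
}
```

Under these assumptions, moving from 16 kHz to 48 kHz triples the amount of audio data sent for every second of speech.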
Step 3: Connect the streaming service
Connect to the streaming service so you can send audio to it.
Step 4: Record audio from microphone
In this step, you’ll use Java’s built-in APIs for recording audio.
1
Create the audio format that the real-time service expects, which is single channel, pcm_s16le (PCM signed 16-bit little-endian) encoded, with a sample rate of 16_000. The sample rate needs to be the same value as you configured on the real-time transcriber.
By default, transcriptions expect PCM16-encoded audio. If you want to use mu-law encoding, see Specifying the encoding.
2
Get the microphone and open it.
3
Read the audio data into a byte array and send it to the real-time transcriber.
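The three steps above can be sketched with the standard javax.sound.sampled API. This is a minimal sketch rather than the tutorial's original listing: the class name MicrophoneSketch, the helpers pcm16Mono and forward, the 3,200-byte chunk size, and the five-second limit are assumptions made for illustration. The sink only prints chunk sizes; a real program would hand each chunk to the real-time transcriber at that point.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.function.Consumer;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.TargetDataLine;

class MicrophoneSketch {
    static final int SAMPLE_RATE = 16_000; // must match the transcriber's sample rate

    // Step 1: single channel, signed 16-bit, little-endian PCM.
    static AudioFormat pcm16Mono(int sampleRate) {
        return new AudioFormat(sampleRate, 16, 1, true, false);
    }

    // Step 3: read chunks from the source and pass each one to the sink until
    // maxBytes have been forwarded or the source ends. Returns the bytes forwarded.
    static long forward(InputStream source, int chunkSize, long maxBytes, Consumer<byte[]> sink) {
        byte[] buffer = new byte[chunkSize];
        long total = 0;
        try {
            while (total < maxBytes) {
                int wanted = (int) Math.min(buffer.length, maxBytes - total);
                int n = source.read(buffer, 0, wanted);
                if (n <= 0) {
                    break;
                }
                sink.accept(Arrays.copyOf(buffer, n)); // copy, since the buffer is reused
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        AudioFormat format = pcm16Mono(SAMPLE_RATE);
        // Step 2: get a microphone line that supports the format, then open it.
        try (TargetDataLine line = AudioSystem.getTargetDataLine(format)) {
            line.open(format);
            line.start();
            long limit = 5L * SAMPLE_RATE * format.getFrameSize(); // about five seconds
            long sent = forward(new AudioInputStream(line), 3_200, limit,
                    // Placeholder sink: send the chunk to the transcriber here instead.
                    chunk -> System.out.println("captured " + chunk.length + " bytes"));
            line.stop();
            System.out.println("forwarded " + sent + " bytes");
        }
    }
}
```

Keeping the read loop separate from the microphone setup means it can be exercised with any InputStream, for example a file of raw PCM data, when no microphone is available.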
The interrupted() method returns true when the current thread is interrupted. In this example, you will use it to stop the transcriber and recording when the user presses the ENTER key.
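The stop-on-ENTER pattern can be sketched as follows. The Recorder thread here is a stand-in that only counts loop iterations; in the tutorial's program the loop body would read from the microphone and send audio to the transcriber. The names StopOnEnterSketch, Recorder, and startThenInterrupt are assumptions made for this illustration.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;

class StopOnEnterSketch {

    // Stand-in for the recording thread: loops until it is interrupted.
    static final class Recorder extends Thread {
        final AtomicLong chunks = new AtomicLong();

        @Override
        public void run() {
            // interrupted() checks, and clears, the current thread's interrupt flag.
            while (!interrupted()) {
                chunks.incrementAndGet(); // placeholder for "read a chunk and send it"
                try {
                    Thread.sleep(10);     // placeholder for the blocking microphone read
                } catch (InterruptedException e) {
                    return;               // interrupted while blocked: stop as well
                }
            }
        }
    }

    // Starts a recorder, interrupts it after the delay, and reports whether it stopped.
    static boolean startThenInterrupt(long delayMillis) {
        Recorder recorder = new Recorder();
        recorder.start();
        try {
            Thread.sleep(delayMillis);
            recorder.interrupt();
            recorder.join(2_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !recorder.isAlive();
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        Recorder recorder = new Recorder();
        recorder.start();
        System.out.println("Recording. Press ENTER to stop.");
        System.in.read();      // blocks until the user presses ENTER
        recorder.interrupt();  // makes interrupted() return true inside the loop
        recorder.join();
        System.out.println("Stopped after " + recorder.chunks.get() + " iterations.");
    }
}
```

Because a blocking call such as Thread.sleep throws InterruptedException and clears the flag, the sketch exits from the catch block as well as from the loop condition.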