RSTT Stream

POST /workstations/:workstation_id/audio/rstt

The Real-Time Speech-to-Text (RSTT) endpoint transcribes voice audio from the Workstation's virtual speakers in real time, with a longer-lived timeout. This is useful for transcribing long-form audio. It functions similarly to the /audio/listen endpoint, but with a longer-lived timeout and fewer conversational features.

The common workflow is:

Call this endpoint to get a Server-Sent Events (SSE) streaming URL with a long-lived timeout
Call /audio/speak to speak
As speech is detected, transcriptions will be streamed to your SSE connection:
- 'partial' events contain in-progress transcriptions
- 'final' events contain completed utterances
Use the speech_started event for interruption detection if needed. To stop the speech, use an RST close packet, or send a new /speak request to interrupt it.
When you are done listening, close the SSE connection
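
The workflow above can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions, not the official client: the base URL, the Bearer authorization header, the `url` field in the JSON response, and the plain-text shape of each event's `data` payload are all hypothetical placeholders, since this page does not specify them. Only the endpoint path and the 'partial', 'final', and speech_started event names come from the description above.

```python
import json
import urllib.request
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from the text lines of an SSE stream."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; dispatch it if it has data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # SSE comment line (often used as a keep-alive)
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


def final_utterances(events: Iterable[Tuple[str, str]]) -> List[str]:
    """Keep completed utterances; 'partial' and other events are skipped."""
    return [data for name, data in events if name == "final"]


def get_stream_url(base_url: str, workstation_id: str, api_key: str) -> str:
    """POST to the RSTT endpoint and return the SSE stream URL."""
    request = urllib.request.Request(
        f"{base_url}/workstations/{workstation_id}/audio/rstt",
        method="POST",
        # ASSUMPTION: Bearer auth; check the API's authentication docs.
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["url"]  # ASSUMPTION: hypothetical response field name


def listen(stream_url: str) -> None:
    """Print transcription events until the server ends the stream."""
    with urllib.request.urlopen(stream_url) as response:
        lines = (raw.decode("utf-8") for raw in response)
        for name, data in parse_sse(lines):
            if name == "speech_started":
                print("speech started (possible interruption point)")
            elif name in ("partial", "final"):
                print(f"{name}: {data}")
    # Leaving the with-block closes the SSE connection.
```

In this sketch, a caller would pass the result of `get_stream_url(...)` to `listen(...)` and then trigger speech with /audio/speak from another request. Keeping `parse_sse` free of network code makes the event handling easy to check against recorded stream lines.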

Request

Responses

Successfully retrieved SSE stream URL
