RSTT Stream
POST /workstations/:workstation_id/audio/rstt
The Real-Time Speech-to-Text (RSTT) endpoint transcribes voice audio from the Workstation's virtual speakers in real time over a long-lived connection, which makes it useful for transcribing long-form audio. It functions similarly to the /audio/listen endpoint, but with a longer timeout and fewer conversational features.
The common workflow is:
- Call this endpoint to get a Server-Sent Events (SSE) streaming URL with a long-lived timeout.
- Call /audio/speak to speak.
- As speech is detected, transcriptions are streamed to your SSE connection:
  - 'partial' events contain in-progress transcriptions
  - 'final' events contain completed utterances
- Use the speech_started event for interruption detection if needed. Use an RST close packet to stop the speech, or send a new /speak request to interrupt it.
- When you are done listening, close the SSE connection.
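The event handling in the workflow above can be sketched in Python as follows. This is a minimal illustration, not the service's client library: it assumes that 'partial', 'final', and speech_started arrive as the SSE `event:` field and that the `data:` field carries the transcription text as an opaque string (the actual payload shape is not specified here and may well be JSON). The function names `parse_sse` and `summarize` are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs from already-decoded SSE lines.

    Follows the basic SSE framing rules: a blank line dispatches the
    buffered event, lines starting with ':' are comments, and several
    'data:' lines are joined with newlines.
    """
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:  # only dispatch events that carry data
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment / keep-alive line
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


def summarize(events: Iterable[Tuple[str, str]]) -> Tuple[List[str], str, int]:
    """Collect completed utterances, the latest in-progress text,
    and the number of speech_started events seen.

    Assumption: event names match those listed in the workflow above.
    """
    finals: List[str] = []
    latest_partial = ""
    speech_started = 0
    for event, data in events:
        if event == "partial":
            latest_partial = data  # overwrite: partials are in-progress text
        elif event == "final":
            finals.append(data)  # a completed utterance
            latest_partial = ""
        elif event == "speech_started":
            speech_started += 1  # hook for interruption detection
    return finals, latest_partial, speech_started
```

In practice the lines would come from a streaming HTTP response opened on the SSE URL returned by this endpoint; closing that response corresponds to the final step of the workflow.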
Request
Responses
- 200: Successfully retrieved SSE stream URL.
- 400: Invalid Request Format - check API documentation for proper syntax.
- 401: Unauthorized - missing or invalid API key.
- 402: Payment Required - you have run out of trial credits or your payment method has expired. Please add payment details to your account.
- 422: Unprocessable Entity - cannot find requested asset associated with your API key.
- 429: Too Many Requests - you have exceeded the rate limit for your account. Please wait before making additional requests.
- 500: Internal Server Error - please retry your request.
- 503: Service Unavailable - our servers have dropped the request due to high load - please retry.
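Since the 429, 500, and 503 responses ask the caller to wait or retry while the other error codes point to problems a retry will not fix, one possible client-side policy is sketched below. It is an illustration only: the backoff parameters, the `call_with_retry` helper, and the `send` callable (assumed to perform the POST and return a `(status, body)` pair) are all hypothetical and not part of the API.

```python
import random
import time
from typing import Any, Callable, Tuple

# Status codes whose descriptions above suggest waiting or retrying.
RETRYABLE = {429, 500, 503}


def should_retry(status: int) -> bool:
    """Return True for responses that are worth retrying."""
    return status in RETRYABLE


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter (assumed parameters):
    a random delay in [0, min(cap, base * 2**attempt)] seconds."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def call_with_retry(
    send: Callable[[], Tuple[int, Any]],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call send() until it returns 200, retrying only retryable statuses.

    `send` is a hypothetical callable that issues the request and
    returns (status_code, body).
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status == 200:
            return body
        # 400, 401, 402, 422 need a fix on the caller's side, so fail fast.
        if not should_retry(status) or attempt == max_attempts - 1:
            raise RuntimeError(f"request failed with HTTP {status}")
        sleep(backoff_delay(attempt))
    raise RuntimeError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.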