Use virtual speaker and microphone hardware, together with speech-to-text and text-to-speech models, to speak, listen, and respond to audio in an active Workstation.
📄️ Prompt
Send a system-level prompt to the AI agent in the Workstation to perform a task.
📄️ Browser Prompt
Send a browser-related prompt to the AI agent in the Workstation to perform a task.
📄️ Voice Speak
Play voice audio into the virtual microphone via a text-to-speech model. You can provide exact copy for the agent to speak, or instructions for an LLM to generate a response.
📄️ Voice Listen
Listen for audio from the virtual microphone and transcribe the audio into text.
📄️ Voice Question
Speak a question into the virtual microphone, listen for a user response, and then optionally respond.
📄️ Voice Transcript
Get the URL of a server-sent events (SSE) stream that carries the transcript from the Workstation.
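As a rough illustration of consuming the transcript stream, the sketch below parses standard server-sent events framing (`event:` and `data:` fields, with a blank line ending each event). It assumes the URL returned by Voice Transcript serves plain `text/event-stream` content. The event names and payload shape are not documented here, and `parse_sse` and `stream_transcript` are hypothetical helper names, not part of the product's API.

```python
from typing import Dict, Iterable, Iterator
from urllib.request import urlopen


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per SSE event from an iterable of text lines."""
    event_type = "message"  # default event type in SSE
    data = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the event; events with no data are dropped.
            if data:
                yield {"event": event_type, "data": "\n".join(data)}
            event_type, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # strip the single optional leading space
        if field == "event":
            event_type = value
        elif field == "data":
            data.append(value)


def stream_transcript(url: str) -> Iterator[Dict[str, str]]:
    """Open the (assumed) SSE URL and yield parsed events as they arrive."""
    with urlopen(url) as resp:
        yield from parse_sse(chunk.decode("utf-8") for chunk in resp)
```

Keeping the parser separate from the network call makes it possible to check the framing logic against recorded lines before pointing it at a live Workstation.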