AI Browser Operator
The AI Browser Operator is a set of actions that run in a Workstation and interact with its browser environment. Use these actions to navigate to URLs, open and close tabs, and switch between tabs.
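As a rough mental model of the tab-level actions listed above, the sketch below keeps a list of tabs and an active-tab index in memory. It is not the Workstation API: the `BrowserSession` class, its method names, and the example URLs are all hypothetical and exist only to illustrate how navigate, open, switch, and close relate to one another.

```python
from dataclasses import dataclass, field


@dataclass
class BrowserSession:
    """Toy in-memory stand-in for a browser (hypothetical, not the real API)."""

    tabs: list[str] = field(default_factory=lambda: ["about:blank"])
    active: int = 0

    def navigate(self, url: str) -> None:
        # Load a URL in the currently active tab.
        self.tabs[self.active] = url

    def open_tab(self, url: str = "about:blank") -> int:
        # Append a new tab, make it active, and return its index.
        self.tabs.append(url)
        self.active = len(self.tabs) - 1
        return self.active

    def switch_tab(self, index: int) -> None:
        # Make an existing tab the active one.
        if not 0 <= index < len(self.tabs):
            raise IndexError(f"no tab at index {index}")
        self.active = index

    def close_tab(self, index: int) -> None:
        # Remove a tab; the session always keeps at least one tab open.
        if not 0 <= index < len(self.tabs):
            raise IndexError(f"no tab at index {index}")
        if len(self.tabs) == 1:
            raise ValueError("cannot close the last tab")
        del self.tabs[index]
        # Keep the active pointer on the same tab where possible.
        if index < self.active or self.active >= len(self.tabs):
            self.active -= 1


session = BrowserSession()
session.navigate("https://example.com")            # placeholder URL
docs_tab = session.open_tab("https://example.org")  # new tab becomes active
session.switch_tab(0)                               # back to the first tab
session.close_tab(docs_tab)                         # close the second tab
```

In this sketch, tabs are addressed by list index, so closing a tab shifts the indices of the tabs after it. A real operator may identify tabs differently.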
📄️ Prompt
Send browser action instructions, or an 'act' prompt, to an AI browser operator in the Workstation to perform a task. We will determine the best open-source browser operator and model to use.
📄️ Act
Perform a series of browser actions using natural language descriptions. Midscene analyzes the current page context
📄️ Extract
Extract structured data from the current page using multi-modal AI inference. The AI can extract both explicit text
📄️ Observe
Observe the current page and provide a detailed description of the page content. This endpoint can be used to make