Hi everyone, Jerry and Wyatt here from Halluminate (https://halluminate.ai/). We help AI labs train computer use agents with high-quality data and RL environments.

Training AI agents to use computers, browsers, and software is one of the highest-potential opportunities for AI. To date, however, this capability is still unreliable. The emerging method to improve this is called Reinforcement Learning with Verifiable Rewards (RLVR). However, researchers are currently bottlenecked by a lack of high-quality simulators and tasks + verifiers.

To solve this problem, we’re building Westworld, a fully-simulated internet made up of synthetic versions of the most common consumer and enterprise apps. Agents use Westworld to learn how to do economically valuable tasks.

For example, AI agents can practice planning vacations on a simulated flight booking site (https://flights.halluminate.ai/), or learn how to reorganize outdated information in your sales platform, or train to do financial modeling directly in a spreadsheet.

Here’s a demo showing our flight booking simulation: https://www.loom.com/share/74a3b28067e24c1b886054ba90a90aa5.

How it works: AI agents access our environment and are given a task + verifier. A task is basically an objective for the agent to achieve, for example "Book me a flight from SF to NYC on this date with x, y, z filters." A verifier is a programmatic way to determine if the task was successfully completed. For example, in this case it might be a JSON spec that the final flight data is checked against. These signals can then be used to calculate a reward in RL.

The more simulators we build, the more AI labs can improve on capabilities that computer use agents are currently weak at. One of our customers saw a ~20% improvement in date-picking performance when training on our flight booking simulator.

Two things make this hard so far:

(1) The simulations have to be realistic.
You can’t get away with a vibe-coded “80% solution” because even small divergences ...
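To make the task + verifier idea from "How it works" concrete, here is a minimal Python sketch. It is not our actual implementation: the field names (`origin`, `destination`, `date`, `stops`), the airport codes, the example date, and the `verify` and `reward` helpers are all hypothetical, chosen only to illustrate checking final flight data against a JSON spec and turning the result into a reward.

```python
import json

# Hypothetical task spec. The schema, airport codes, and date are
# illustrative assumptions, not a real Westworld format.
TASK_JSON = """
{
  "prompt": "Book me a flight from SF to NYC on 2025-09-01, nonstop only.",
  "expected": {
    "origin": "SFO",
    "destination": "JFK",
    "date": "2025-09-01",
    "stops": 0
  }
}
"""


def verify(expected: dict, final_state: dict) -> dict:
    """Check each expected field against the final booking state.

    Returns a per-field pass/fail map. Fields that appear only in the
    final state (e.g. price) are ignored.
    """
    return {key: final_state.get(key) == value for key, value in expected.items()}


def reward(checks: dict) -> float:
    """Binary reward: 1.0 only if there is at least one check and all passed."""
    return 1.0 if checks and all(checks.values()) else 0.0


task = json.loads(TASK_JSON)

# Two hypothetical end states the agent might leave the simulator in.
good = {"origin": "SFO", "destination": "JFK", "date": "2025-09-01", "stops": 0, "price": 312}
bad = dict(good, date="2025-09-02")  # agent picked the wrong day

print(reward(verify(task["expected"], good)))
print(reward(verify(task["expected"], bad)))
```

A binary reward is the simplest choice here. The per-field map returned by `verify` could also feed partial credit, for instance to isolate date-picking errors from other mistakes.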
First seen: 2025-08-11 15:49
Last seen: 2025-08-12 02:52