Web Voyager

WebVoyager, by He et al., is a vision-enabled web-browsing agent capable of controlling the mouse and keyboard. On each turn it views an annotated browser screenshot, then chooses the next step to take. The agent architecture is a basic reasoning and action (ReAct) loop. The unique aspects of this agent are:

- Its use of Set-of-