Anthropic’s Project Fetch Study Reveals Impact of AI on Robotics Tasks
Anthropic recently conducted an internal experiment, known as Project Fetch, to see how its Claude model affects human performance on real-world robotics tasks. Over the course of one day, two teams of employees attempted to program a robotic dog to fetch a beach ball, with only one team permitted to use Claude.
The study was designed to measure the performance gap between a team with access to AI and one without. Both teams worked through increasingly complex tasks: connecting to the robot, reading its sensors, controlling it manually, and building early-stage autonomy.
According to Anthropic’s findings, there was a noticeable difference in performance. The team using Claude completed more tasks overall and performed them in about half the time compared to the team without AI assistance. Specifically, Team Claude finished seven out of eight tasks, while Team Claude-less managed to complete six.
Differences in Hardware Connectivity
The most significant gaps in performance were observed during hardware connections. Team Claude excelled at navigating through confusing online resources to connect to the robot and access data from its sensors. They effectively used Claude to troubleshoot and make quick decisions. In contrast, Team Claude-less faced challenges and had to pause to receive hints from organizers when they struggled to interpret instructions.
The task of accessing lidar data posed similar hurdles for both groups. While Team Claude-less eventually advanced using just video, they only made progress on lidar toward the end of the day. By the conclusion of the experiment, Team Claude had developed a system that could identify the beach ball and move toward it, though the robot still faced difficulties in reliably fetching the ball on its own.
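The article doesn’t show how such a system works, but a ball-chaser of this kind is commonly built from simple color segmentation over the camera feed. Below is a minimal sketch, assuming an orange ball, an OpenCV video capture, and a hypothetical send_velocity(forward, turn) command exposed by the robot’s SDK; the color thresholds and distance cutoffs are illustrative and would need tuning for real hardware.

```python
import cv2

def find_ball(frame):
    """Return (x, radius) of the largest orange blob in the frame, or None."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Orange hue band; thresholds are assumptions and depend on ball and lighting.
    mask = cv2.inRange(hsv, (5, 120, 120), (20, 255, 255))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    (x, _), radius = cv2.minEnclosingCircle(max(contours, key=cv2.contourArea))
    return (x, radius) if radius > 10 else None

def approach_ball(cap, send_velocity):
    """Steer toward the ball until it fills enough of the frame, then stop."""
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        ball = find_ball(frame)
        if ball is None:
            send_velocity(forward=0.0, turn=0.3)  # spin in place to search
            continue
        x, radius = ball
        error = (x / frame.shape[1]) - 0.5        # -0.5..0.5, 0 = ball centered
        if radius > 120:                          # apparent size ~ proximity
            send_velocity(forward=0.0, turn=0.0)
            break
        send_velocity(forward=0.4, turn=-1.5 * error)
```

As the article notes, getting from this kind of approach behavior to reliably picking up and returning the ball was the part neither team fully solved.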
AI Enhances Workflow, but There Are Trade-offs
Anthropic also noted that Team Claude generated about nine times more code than their counterparts. AI support encouraged them to explore multiple approaches in parallel, which at times proved a distraction. In a less competitive environment such exploration might be beneficial, but Anthropic cautioned that this tendency should be monitored in future studies.
In areas like manual control and localization, Team Claude-less sometimes outperformed their peers: after establishing a stable video feed, they developed a control program and a localization method more quickly. However, the Claude-assisted controller proved easier to use thanks to its continuous video support.
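Neither team’s controller is described in detail, but a manual control program of this sort often reduces to a keyboard-to-velocity loop over the live feed. A minimal sketch, reusing the hypothetical send_velocity command from the earlier example:

```python
import cv2

def teleop(cap, send_velocity):
    """WASD teleoperation over a live video window; 'q' quits."""
    keymap = {
        ord('w'): (0.4, 0.0),   # forward
        ord('s'): (-0.4, 0.0),  # backward
        ord('a'): (0.0, 0.5),   # turn left
        ord('d'): (0.0, -0.5),  # turn right
    }
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imshow("robot view", frame)
        key = cv2.waitKey(30) & 0xFF
        if key == ord('q'):
            break
        forward, turn = keymap.get(key, (0.0, 0.0))
        send_velocity(forward=forward, turn=turn)
    cv2.destroyAllWindows()
```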
Anthropic’s analysis also revealed a clear difference in team morale. Team Claude-less suffered early setbacks, including an incident in which the robot unexpectedly advanced toward their table, and they broke for lunch without having successfully connected to their own robot.
To further analyze team dynamics, Anthropic used Claude to review conversations from both groups, examining emotional tone and how often team members asked questions. The results showed that Team Claude-less had more negative exchanges and more frequent expressions of confusion, reflecting a heavier reliance on one another in the absence of AI assistance.
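Anthropic doesn’t publish the prompt it used for this analysis. A minimal sketch of the general approach with the Anthropic Python SDK might look like the following; the model name, labels, and transcript chunking are all illustrative assumptions, not Anthropic’s actual method.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def score_exchange(text: str) -> str:
    """Ask Claude to label one transcript excerpt; the prompt is illustrative."""
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",  # assumed model choice
        max_tokens=10,
        messages=[{
            "role": "user",
            "content": (
                "Classify the emotional tone of this team exchange as "
                "POSITIVE, NEUTRAL, or NEGATIVE, and add QUESTION if it "
                f"contains a question. Reply with the labels only.\n\n{text}"
            ),
        }],
    )
    return response.content[0].text.strip()

# Hypothetical usage over chunked transcripts from each team:
# labels = [score_exchange(chunk) for chunk in team_transcript_chunks]
```

Aggregating such labels per team would yield the kind of tone and question-frequency comparison the study reports.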
Implications for Future Robotics
While the study had limitations, such as its small scale and brief duration, Anthropic considers Project Fetch an early indication of how advanced AI might work with robotics as capabilities grow. The company ties the findings to its Responsible Scaling Policy, which tracks circumstances in which AI models could begin contributing autonomously to robotics and other physical-world tasks.
As Anthropic stated, the concept of intelligent AI systems using their capabilities to operate robots is not as far-fetched as it might seem.