The Deadly Tribunal specializes in reporting on news stories and events from practically every major category and subcategory you could imagine—including world news, entertainment, politics, business, style and several others.
Show HN: PhAIL – Real-robot benchmark for AI models
11 by vertix | 8 comments on Hacker News. I built this because I couldn't find honest numbers on how well VLA models [1] actually work on commercial tasks. I come from search ranking at Google where you measure everything, and in robotics nobody seemed to know. PhAIL runs four models (OpenPI/pi0.5, GR00T, ACT, SmolVLA) on bin-to-bin order picking – one of the most common warehouse operations. Same robot (Franka FR3), same objects, hundreds of blind runs. The operator doesn't know which model is running. Best model: 64 UPH. Human teleoperating the same robot: 330. Human by hand: 1,300+. Everything is public – every run with synced video and telemetry, the fine-tuning dataset, training scripts. The leaderboard is open for submissions. Happy to answer questions about methodology, the models, or what we observed. [1] Vision-Language-Action: https://ift.tt/WGDEdsa