Hacker News: Notable Comments and Translations

超级科技迷 · WeChat official account · 2024-12-22 10:15

Article preview

OpenAI O3 breakthrough high score on ARC-AGI-PUB
https://news.ycombinator.com/item?id=42473876

Efficiency is now key. Roughly $3,400 per task to match human performance on this benchmark is a lot. It also shows the bullets as "ARC-AGI-TUNED", which makes me think they did some undisclosed amount of fine-tuning (e.g. via the API they showed off last week), so even more compute went into this task.

We can compare this roughly to a human doing ARC-AGI puzzles, where a human will take (with high variance, in my subjective experience) between 5 seconds and 5 minutes to solve a task. So I'd argue a human costs 0.03 USD to 1.67 USD per puzzle at 20 USD/hr, and their document puts an average Mechanical Turker at 2 USD per task.

Going the other direction: I am interpreting this result as human-level reasoning now costing approximately $41k/hr to $2.5M/hr with current compute.

Super exciting that OpenAI pushed the compute out this far so we could ………………………………
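
The commenter's figures can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope reproduction: the function and constant names are made up for illustration, and the inputs (about $3,400 per task, 20 USD/hr, 5 seconds to 5 minutes per puzzle) are taken from the comment as stated, not from any official pricing.

```python
# Back-of-the-envelope check of the numbers quoted in the comment.
# All names here are illustrative; the inputs are the commenter's estimates.

HUMAN_WAGE_USD_PER_HOUR = 20.0    # wage assumed in the comment
O3_COST_USD_PER_TASK = 3400.0     # approximate per-task cost quoted in the comment
FAST_SOLVE_SECONDS = 5.0          # commenter's lower bound for a human solve
SLOW_SOLVE_SECONDS = 5.0 * 60.0   # commenter's upper bound (5 minutes)


def human_cost_per_puzzle(solve_seconds, wage_per_hour=HUMAN_WAGE_USD_PER_HOUR):
    """Cost in USD of a human spending `solve_seconds` on one puzzle."""
    return wage_per_hour * solve_seconds / 3600.0


def implied_hourly_rate(solve_seconds, cost_per_task=O3_COST_USD_PER_TASK):
    """Hourly rate implied if `cost_per_task` buys `solve_seconds` of human-level work."""
    return cost_per_task * 3600.0 / solve_seconds


if __name__ == "__main__":
    for seconds in (FAST_SOLVE_SECONDS, SLOW_SOLVE_SECONDS):
        print(
            f"{seconds:>5.0f} s per puzzle: "
            f"human cost {human_cost_per_puzzle(seconds):.2f} USD, "
            f"implied rate {implied_hourly_rate(seconds):,.0f} USD/hr"
        )
```

Under these assumptions the human cost comes out at about 0.03 USD and 1.67 USD per puzzle, and the implied rate at 40,800 USD/hr and 2,448,000 USD/hr, which matches the rounded "41k/hr to 2.5M/hr" in the comment.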
