Hacker News | randomwalker's submissions
1. Open-world evaluations for measuring frontier AI capabilities [pdf] (cruxevals.com)
2 points by randomwalker 1 day ago
2. Towards a science of AI agent reliability (normaltech.ai)
1 point by randomwalker 52 days ago
3. When AI Builds AI – Findings from a Workshop on Automation of AI R&D [pdf] (georgetown.edu)
1 point by randomwalker 80 days ago
4. The Longitudinal Expert AI Panel: Understanding Expert Views on AI [pdf] (static1.squarespace.com)
1 point by randomwalker 5 months ago
5. Holistic Agent Leaderboard: The Missing Infrastructure for AI Agent Evaluation (arxiv.org)
1 point by randomwalker 6 months ago
6. America's AI Action Plan [pdf] (whitehouse.gov)
11 points by randomwalker 8 months ago
7. Could AI slow science? Confronting the production-progress paradox (aisnakeoil.com)
2 points by randomwalker 9 months ago
8. AI as Normal Technology (knightcolumbia.org)
239 points by randomwalker on April 15, 2025 | 92 comments
9. Why an overreliance on AI-driven modelling is bad for science (nature.com)
1 point by randomwalker on April 9, 2025
10. Is AI progress slowing down? (aisnakeoil.com)
5 points by randomwalker on Dec 19, 2024 | 1 comment
11. We Looked at 78 Election Deepfakes. Political Misinformation Isn't an AI Problem (knightcolumbia.org)
5 points by randomwalker on Dec 13, 2024
12. Inference Scaling FLaws: The Limits of LLM Resampling with Imperfect Verifiers (arxiv.org)
3 points by randomwalker on Nov 27, 2024
13. Is the UK's liver transplant matching algorithm biased against younger patients? (aisnakeoil.com)
93 points by randomwalker on Nov 11, 2024 | 62 comments
14. Core-Bench: Computational Reproducibility Agent Benchmark (arxiv.org)
1 point by randomwalker on Sept 18, 2024
15. AI companies are pivoting from creating gods to building products (aisnakeoil.com)
133 points by randomwalker on Aug 19, 2024 | 195 comments
16. AI Agents That Matter (aisnakeoil.com)
35 points by randomwalker on July 3, 2024 | 10 comments
17. AI Agents That Matter (arxiv.org)
4 points by randomwalker on July 2, 2024
18. Scientists should use AI as a tool, not an oracle (aisnakeoil.com)
124 points by randomwalker on June 3, 2024 | 106 comments
19. AI safety is not a model property (aisnakeoil.com)
2 points by randomwalker on April 8, 2024
20. AI safety is not a model property (aisnakeoil.com)
3 points by randomwalker on March 13, 2024
21. On the Societal Impact of Open Foundation Models [pdf] (stanford.edu)
2 points by randomwalker on Feb 27, 2024
22. Will AI transform law? The hype is not supported by current evidence (aisnakeoil.com)
2 points by randomwalker on Jan 25, 2024
23. Generative AI's end-run around copyright won't be resolved by the courts (aisnakeoil.com)
4 points by randomwalker on Jan 22, 2024 | 2 comments
24. Model alignment protects against accidental harms, not intentional ones (aisnakeoil.com)
1 point by randomwalker on Dec 1, 2023
25. What the executive order means for openness in AI (aisnakeoil.com)
2 points by randomwalker on Oct 31, 2023
26. The Foundation Model Transparency Index (stanford.edu)
47 points by randomwalker on Oct 18, 2023 | 16 comments
27. Evaluating LLMs Is a Minefield (princeton.edu)
3 points by randomwalker on Oct 5, 2023
28. Does ChatGPT have a liberal bias? (aisnakeoil.com)
4 points by randomwalker on Aug 18, 2023 | 2 comments
29. The REFORMS checklist for ML-based science (aisnakeoil.com)
2 points by randomwalker on Aug 17, 2023
30. ML is useful for many things, but not for predicting scientific replicability (aisnakeoil.com)
116 points by randomwalker on Aug 11, 2023 | 37 comments