Hacker News | og_kalu's submissions
1.Quantifying the algorithmic improvement from reasoning models (epoch.ai)
1 point by og_kalu 82 days ago | past
2.Evidence of interrelated cognitive-like capabilities in large language models (sciencedirect.com)
1 point by og_kalu 4 months ago | past
3.Atlas: Learning to Optimally Memorize the Context at Test Time (arxiv.org)
43 points by og_kalu 5 months ago | past | 4 comments
4.Gemini Diffusion (deepmind.google)
61 points by og_kalu 5 months ago | past | 7 comments
5.Tails Tell Tales: Chapter-Wide Manga Transcriptions with Character Names (arxiv.org)
2 points by og_kalu 8 months ago | past | 1 comment
6.Over-Tokenized Transformer: Vocabulary Is Generally Worth Scaling (arxiv.org)
2 points by og_kalu 8 months ago | past
7.LLMs struggle with perception, not reasoning, in ARC-AGI (anokas.substack.com)
2 points by og_kalu 8 months ago | past
8.EvaByte: Efficient Byte-Level Language Models at Scale (hkunlp.github.io)
3 points by og_kalu 9 months ago | past
9.Tell me about yourself: LLMs are aware of their learned behaviors (arxiv.org)
2 points by og_kalu 9 months ago | past
10.Imagine While Reasoning in Space: Multimodal Visualization-of-Thought (arxiv.org)
2 points by og_kalu 9 months ago | past
11.LLMs struggle with perception, not reasoning, in ARC-AGI (anokas.substack.com)
1 point by og_kalu 9 months ago | past
12.Byte Latent Transformer: Patches Scale Better Than Tokens (meta.com)
6 points by og_kalu 10 months ago | past
13.Mastering Board Games by External and Internal Planning with Language Models (deepmind.google)
1 point by og_kalu 10 months ago | past
14.Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space (arxiv.org)
2 points by og_kalu 11 months ago | past
15.GameGen-X: Open-World Video Game Generation (gamegen-x.github.io)
4 points by og_kalu 11 months ago | past
16.TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters (arxiv.org)
174 points by og_kalu 12 months ago | past | 33 comments
17.Kurzgesagt: We Fell for the Oldest Lie on the Internet [video] (youtube.com)
1 point by og_kalu 12 months ago | past | 3 comments
18.Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-Wise LoRA (arxiv.org)
1 point by og_kalu 12 months ago | past
19.Solving Global Lyapunov functions: open problem in mathematics with transformers (arxiv.org)
3 points by og_kalu on Oct 27, 2024 | past
20.ChatGPT Topped 3B Visits in September (similarweb.com)
2 points by og_kalu on Oct 18, 2024 | past
21.Tx-LLM: Supporting therapeutic development with large language models (research.google)
2 points by og_kalu on Oct 14, 2024 | past
22.Tx-LLM: Supporting therapeutic development with large language models (research.google)
2 points by og_kalu on Oct 9, 2024 | past
23.Visual Autoregressive Modeling: Image Generation via Next-Resolution Prediction (arxiv.org)
1 point by og_kalu on Oct 5, 2024 | past | 1 comment
24.xAI's Colossus (100k H100 cluster) has begun training (twitter.com/elonmusk)
7 points by og_kalu on Sept 7, 2024 | past | 1 comment
25.Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon (arxiv.org)
1 point by og_kalu on June 28, 2024 | past
26.GPT-4o's image generation capabilities (twitter.com/gdb)
1 point by og_kalu on May 15, 2024 | past
27.LLMs for few-shot low level robot control by representing trajectories as tokens (twitter.com/ed__johns)
1 point by og_kalu on April 18, 2024 | past
28.Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics (robot-learning.uk)
1 point by og_kalu on April 17, 2024 | past
29.You can now edit DALL·E images in ChatGPT (twitter.com/openai)
4 points by og_kalu on April 3, 2024 | past
30.Microsoft and OpenAI Plot $100B AI Supercomputer Called "Stargate" (reuters.com)
6 points by og_kalu on April 3, 2024 | past | 1 comment
