| 1. | | Quantifying the algorithmic improvement from reasoning models (epoch.ai) |
| 1 point by og_kalu 82 days ago | past |
|
| 2. | | Evidence of interrelated cognitive-like capabilities in large language models (sciencedirect.com) |
| 1 point by og_kalu 4 months ago | past |
|
| 3. | | Atlas: Learning to Optimally Memorize the Context at Test Time (arxiv.org) |
| 43 points by og_kalu 5 months ago | past | 4 comments |
|
| 4. | | Gemini Diffusion (deepmind.google) |
| 61 points by og_kalu 5 months ago | past | 7 comments |
|
| 5. | | Tails Tell Tales: Chapter-Wide Manga Transcriptions with Character Names (arxiv.org) |
| 2 points by og_kalu 8 months ago | past | 1 comment |
|
| 6. | | Over-Tokenized Transformer: Vocabulary Is Generally Worth Scaling (arxiv.org) |
| 2 points by og_kalu 8 months ago | past |
|
| 7. | | LLMs struggle with perception, not reasoning, in ARC-AGI (anokas.substack.com) |
| 2 points by og_kalu 8 months ago | past |
|
| 8. | | EvaByte: Efficient Byte-Level Language Models at Scale (hkunlp.github.io) |
| 3 points by og_kalu 9 months ago | past |
|
| 9. | | Tell me about yourself: LLMs are aware of their learned behaviors (arxiv.org) |
| 2 points by og_kalu 9 months ago | past |
|
| 10. | | Imagine While Reasoning in Space: Multimodal Visualization-of-Thought (arxiv.org) |
| 2 points by og_kalu 9 months ago | past |
|
| 11. | | LLMs struggle with perception, not reasoning, in ARC-AGI (anokas.substack.com) |
| 1 point by og_kalu 9 months ago | past |
|
| 12. | | Byte Latent Transformer: Patches Scale Better Than Tokens (meta.com) |
| 6 points by og_kalu 10 months ago | past |
|
| 13. | | Mastering Board Games by External and Internal Planning with Language Models (deepmind.google) |
| 1 point by og_kalu 10 months ago | past |
|
| 14. | | Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space (arxiv.org) |
| 2 points by og_kalu 11 months ago | past |
|
| 15. | | GameGen-X: Open-World Video Game Generation (gamegen-x.github.io) |
| 4 points by og_kalu 11 months ago | past |
|
| 16. | | TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters (arxiv.org) |
| 174 points by og_kalu 12 months ago | past | 33 comments |
|
| 17. | | Kurzgesagt: We Fell for the Oldest Lie on the Internet [video] (youtube.com) |
| 1 point by og_kalu 12 months ago | past | 3 comments |
|
| 18. | | Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-Wise LoRA (arxiv.org) |
| 1 point by og_kalu 12 months ago | past |
|
| 19. | | Solving Global Lyapunov functions: open problem in mathematics with transformers (arxiv.org) |
| 3 points by og_kalu on Oct 27, 2024 | past |
|
| 20. | | ChatGPT Topped 3B Visits in September (similarweb.com) |
| 2 points by og_kalu on Oct 18, 2024 | past |
|
| 21. | | Tx-LLM: Supporting therapeutic development with large language models (research.google) |
| 2 points by og_kalu on Oct 14, 2024 | past |
|
| 22. | | Tx-LLM: Supporting therapeutic development with large language models (research.google) |
| 2 points by og_kalu on Oct 9, 2024 | past |
|
| 23. | | Visual Autoregressive Modeling: Image Generation via Next-Resolution Prediction (arxiv.org) |
| 1 point by og_kalu on Oct 5, 2024 | past | 1 comment |
|
| 24. | | xAI's Colossus (100k H100 cluster) has begun training (twitter.com/elonmusk) |
| 7 points by og_kalu on Sept 7, 2024 | past | 1 comment |
|
| 25. | | Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon (arxiv.org) |
| 1 point by og_kalu on June 28, 2024 | past |
|
| 26. | | GPT-4o's image generation capabilities (twitter.com/gdb) |
| 1 point by og_kalu on May 15, 2024 | past |
|
| 27. | | LLMs for few-shot low level robot control by representing trajectories as tokens (twitter.com/ed__johns) |
| 1 point by og_kalu on April 18, 2024 | past |
|
| 28. | | Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics (robot-learning.uk) |
| 1 point by og_kalu on April 17, 2024 | past |
|
| 29. | | You can now edit DALL·E images in ChatGPT (twitter.com/openai) |
| 4 points by og_kalu on April 3, 2024 | past |
|
| 30. | | Microsoft and OpenAI Plot $100B AI Supercomputer Called "Stargate" (reuters.com) |
| 6 points by og_kalu on April 3, 2024 | past | 1 comment |
|