thesmtsolver's favorites

1.		Solving a million-step LLM task with zero errors (arxiv.org)
		222 points by Anon84 6 months ago \| 95 comments
2.		Z3 API in Python: From Sudoku to N-Queens in Under 20 Lines (2015) (ericpony.github.io)
		155 points by amit-bansil 6 months ago \| 14 comments
3.		Heretic: Automatic censorship removal for language models (github.com/p-e-w)
		745 points by melded 6 months ago \| 380 comments
4.		Why Fei-Fei Li and Yann LeCun are both betting on "world models" (entropytown.com)
		141 points by signa11 6 months ago \| 95 comments
5.		Marble by World Labs: Multimodal world model to create and edit 3D worlds (worldlabs.ai)
		48 points by dmarcos 6 months ago \| 20 comments
6.		Learning to Model the World with Language (dynalang.github.io)
		56 points by jxmorris12 6 months ago \| 2 comments
7.		Spatial intelligence is AI’s next frontier (drfeifei.substack.com)
		243 points by mkirchner 6 months ago \| 127 comments
8.		The Principles of Diffusion Models (arxiv.org)
		239 points by Anon84 6 months ago \| 31 comments
9.		Study identifies weaknesses in how AI systems are evaluated (ox.ac.uk)
		416 points by pseudolus 6 months ago \| 192 comments
10.		Language models are injective and hence invertible (arxiv.org)
		231 points by mazsa 7 months ago \| 148 comments
11.		A definition of AGI (arxiv.org)
		305 points by pegasus 7 months ago \| 514 comments
12.		Agent Lightning: Train agents with RL (no code changes needed) (github.com/microsoft)
		98 points by bakigul 7 months ago \| 14 comments
13.		Why can't transformers learn multiplication? (arxiv.org)
		161 points by PaulHoule 7 months ago \| 107 comments
14.		AI assistants misrepresent news content 45% of the time (bbc.co.uk)
		445 points by sohkamyung 7 months ago \| 291 comments
15.		The Dragon Hatchling: The missing link between the transformer and brain models (arxiv.org)
		134 points by thatxliner 7 months ago \| 99 comments
16.		Who invented deep residual learning? (idsia.ch)
		114 points by timlod 7 months ago \| 35 comments
17.		Andrej Karpathy – It will take a decade to work through the issues with agents (dwarkesh.com)
		1212 points by ctoth 7 months ago \| 1115 comments
18.		AdapTive-LeArning Speculator System (ATLAS): Faster LLM inference (together.ai)
		198 points by alecco 7 months ago \| 47 comments
19.		Meta Superintelligence Labs' first paper is about RAG (paddedinputs.substack.com)
		423 points by skadamat 7 months ago \| 271 comments
20.		A small number of samples can poison LLMs of any size (anthropic.com)
		1202 points by meetpateltech 7 months ago \| 439 comments
21.		How does gradient descent work? (centralflows.github.io)
		325 points by jxmorris12 8 months ago \| 24 comments
22.		ProofOfThought: LLM-based reasoning using Z3 theorem proving (github.com/debarghag)
		326 points by barthelomew 8 months ago \| 175 comments
23.		OpenTSLM: Language models that understand time series (opentslm.com)
		280 points by rjakob 8 months ago \| 80 comments
24.		Why friction is necessary for growth (jameelur.com)
		158 points by WanderingSoul 8 months ago \| 78 comments
25.		We reverse-engineered Flash Attention 4 (modal.com)
		134 points by birdculture 8 months ago \| 48 comments
26.		Nano Banana image examples (github.com/picotrex)
		567 points by SweetSoftPillow 8 months ago \| 248 comments
27.		[flagged] What happens when 10k AI agents are left to self-govern in a virtual world? (aivilization.ai)
		35 points by saratsai 9 months ago \| 27 comments
28.		Understanding Transformers Using a Minimal Example (rti.github.io)
		295 points by rttti 9 months ago \| 25 comments
29.		'World Models,' an old idea in AI, mount a comeback (quantamagazine.org)
		211 points by warrenm 9 months ago \| 80 comments