Submissions from arxiv.org

		The Illusion of Readiness: Stress Testing Frontier Models on Medical Benchmarks (arxiv.org)
		6 points by mellosouls 15 days ago \| past
		Report on the 63rd Annual International Mathematical Olympiad (arxiv.org)
		1 point by bikenaga 15 days ago \| past
		A fast, strong, topologically meaningful and fun knot invariant (arxiv.org)
		52 points by bikenaga 15 days ago \| past \| 7 comments
		Quantized LLMs in Biomedical Natural Language Processing (arxiv.org)
		1 point by PaulHoule 15 days ago \| past
		Ransomware 3.0: Self-Composing and LLM-Orchestrated (arxiv.org)
		1 point by PaulHoule 15 days ago \| past
		Multi-Modal vs. Text-Based: Benchmarking LLM Strategies for Invoice Processing (arxiv.org)
		1 point by PaulHoule 15 days ago \| past
		LIMI: Less Is More for Agency (arxiv.org)
		1 point by pella 16 days ago \| past
		Design, analysis, and manufacturing of microstructured blade-like geometries (arxiv.org)
		2 points by PaulHoule 16 days ago \| past
		Fill probability estimates in institutional bond trading with quantum computers (arxiv.org)
		2 points by polrjoy 16 days ago \| past \| 2 comments
		Weak Memory Model Formalisms: Introduction and Survey (arxiv.org)
		2 points by matt_d 16 days ago \| past
		Why Language Models Hallucinate (arxiv.org)
		1 point by ummonk 16 days ago \| past
		GPU Implementation of Second-Order Linear and Nonlinear Programming Solvers (arxiv.org)
		1 point by adgjlsfhk1 16 days ago \| past \| 1 comment
		Bluffing in Scrabble (arxiv.org)
		8 points by fanf2 16 days ago \| past
		Opal: An Operator Algebra View of RLHF (arxiv.org)
		2 points by P_qRs 16 days ago \| past
		Effects of the entropy source on Monte Carlo simulations (arxiv.org)
		2 points by bob1029 16 days ago \| past
		Enabling an Ecosystem of Personalized and Interoperable Social Applications (arxiv.org)
		2 points by sportdeath 17 days ago \| past
		Space Mission Options for Reconnaissance and Mitigation of Asteroid 2024 YR4 [pdf] (arxiv.org)
		2 points by croes 17 days ago \| past
		Discrete Diffusion in Large Language and Multimodal Models: A Survey (arxiv.org)
		2 points by NeoInHacker 17 days ago \| past
		Personalised Pricing: The Demise of the Fixed Price? (arxiv.org)
		2 points by Hard_Space 17 days ago \| past
		OpenFake: An Open Dataset and Platform Toward Large-Scale Deepfake Detection (arxiv.org)
		4 points by pykello 17 days ago \| past
		Space Mission Options for Mitigation of Asteroid 2024 YR4 (arxiv.org)
		4 points by geox 17 days ago \| past
		DeepMind Paper on Virtual Agent Economies (arxiv.org)
		2 points by nanfinitum 17 days ago \| past
		Seeing Is Deceiving: Mirror-Based Lidar Spoofing for Autonomous Vehicle Deception (arxiv.org)
		1 point by bikenaga 17 days ago \| past
		The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs (arxiv.org)
		1 point by mathattack 17 days ago \| past
		Are elites meritocratic and efficiency-seeking? Evidence from MBA students (arxiv.org)
		103 points by bikenaga 17 days ago \| past \| 73 comments
		Pre-training under infinite compute (arxiv.org)
		3 points by jonbaer 17 days ago \| past
		Hyb Error: A Hybrid Metric Combining Absolute and Relative Errors (2024) (arxiv.org)
		19 points by ncruces 18 days ago \| past \| 2 comments
		The illusion of diminishing returns in LLM progress (arxiv.org)
		3 points by SCEtoAux 18 days ago \| past
		Learn Your Way: Towards an AI-Augmented Textbook, Google Research (arxiv.org)
		3 points by walterbell 18 days ago \| past
		Wan-Animate: Unified Character Animation, Replacement with Holistic Replication (arxiv.org)
		2 points by walterbell 18 days ago \| past