Submissions from arxiv.org

		TaxCalcBench: Evaluating Frontier Models on the Tax Calculation Task (arxiv.org)
		70 points by handfuloflight 8 days ago | 23 comments
		A Survey of Vibe Coding with Large Language Models (arxiv.org)
		1 point by Gigacore 8 days ago | discuss
		Towards Logic: The Language of AI (arxiv.org)
		3 points by cmogni1 9 days ago | discuss
		Holistic Agent Leaderboard: The Missing Infrastructure for AI Agent Evaluation (arxiv.org)
		1 point by adidoit 9 days ago | 1 comment
		Agentic Bug Reproduction for Effective Automated Program Repair at Google (arxiv.org)
		1 point by chw9e 9 days ago | discuss
		Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More? (2024) (arxiv.org)
		1 point by fzliu 9 days ago | discuss
		Holistic Agent Leaderboard: The Missing Infrastructure for AI Agent Evaluation (arxiv.org)
		1 point by randomwalker 9 days ago | discuss
		[flagged] Gravity Can Explain the Collapse of the Wavefunction (Sabine Hossenfelder) (arxiv.org)
		49 points by felineflock 9 days ago | 55 comments
		Certifying almost all quantum states with few single-qubit measurements (arxiv.org)
		1 point by rbanffy 9 days ago | discuss
		A Less Terrifying Universe? Mundanity as an Explanation for the Fermi Paradox (arxiv.org)
		5 points by cyberlimerence 9 days ago | 1 comment
		Robot Learning: A Tutorial (arxiv.org)
		2 points by Anon84 9 days ago | discuss
		Evaluating Argon2 adoption and effectiveness in real-world software (arxiv.org)
		32 points by pregnenolone 9 days ago | 32 comments
		Tensor Logic: The Language of AI (arxiv.org)
		3 points by max_ 9 days ago | discuss
		Subspace-Accelerated Coordinate Descent for Physics-Based Simulation (arxiv.org)
		1 point by E-Reverance 9 days ago | discuss
		Old Is Gold: Optimizing Single-Threaded Applications with Exgen-Malloc (arxiv.org)
		16 points by todsacerdoti 9 days ago | 7 comments
		Inferring User Actions from Screen Recordings to Recommend Better Workflows (arxiv.org)
		2 points by azhenley 9 days ago | discuss
		PEFT Evaluation for Safe Code Generation (arxiv.org)
		1 point by grac3 10 days ago | discuss
		Refrag: Rethinking RAG Based Decoding (arxiv.org)
		2 points by bbzjk7 10 days ago | discuss
		Dynamically relevant consciousness precludes artificial consciousness (2023) (arxiv.org)
		2 points by measurablefunc 10 days ago | 1 comment
		Reducing Pipeline Bubbles with Adaptive Parallelism on Heterogeneous Models (arxiv.org)
		2 points by PaulHoule 10 days ago | 1 comment
		Who Said Neural Networks Aren't Linear? (arxiv.org)
		2 points by ComplexSystems 10 days ago | discuss
		Are Foundation Models Ready for Industrial Defect Recognition? A Reality Check (arxiv.org)
		5 points by PaulHoule 10 days ago | discuss
		The Optimal Strategy for Playing Lucky 13 (arxiv.org)
		1 point by belter 10 days ago | discuss
		Gravity can explain the collapse of the wavefunction (arxiv.org)
		18 points by dboreham 10 days ago | 14 comments
		Agentic Context Engineering: Evolving Contexts for Self-Improving LLMs (arxiv.org)
		2 points by mooreds 10 days ago | discuss
		From Automation to Autonomy (arxiv.org)
		2 points by jruohonen 10 days ago | 1 comment
		Literate Tracing (arxiv.org)
		3 points by todsacerdoti 10 days ago | discuss
		AutoPR: Let's Automate Your Academic Promotion [pdf] (arxiv.org)
		2 points by SerCe 11 days ago | 1 comment
		StreamingVLM: Real-Time Understanding for Infinite Video Streams (arxiv.org)
		33 points by badmonster 11 days ago | discuss
		Mano: Multi-Modal Foundation Model and 3-Stage RL for SOTA GUI Automation (arxiv.org)
		2 points by jinqueeny 11 days ago | discuss