Submissions from arxiv.org

		Biases in the Blind Spot: Detecting What LLMs Fail to Mention (arxiv.org)
		1 point by mpweiher 3 hours ago \| past \| discuss
		A Framework for Time-Updating Probabilistic Forecasts (arxiv.org)
		5 points by Luc 10 hours ago \| past \| discuss
		Towards Autonomous Mathematics Research (Google DeepMind) (arxiv.org)
		1 point by u1hcw9nx 11 hours ago \| past \| discuss
		Remote Labor Index: Measuring AI Automation of Remote Work (arxiv.org)
		2 points by Leynos 1 day ago \| past \| discuss
		Generalized on-policy distillation with reward extrapolation (arxiv.org)
		3 points by fzliu 1 day ago \| past \| discuss
		OpenAI model proposes and proves Physics result (arxiv.org)
		1 point by KothuRoti 1 day ago \| past \| discuss
		An API for Biological Neural Networks (arxiv.org)
		1 point by bwjx 1 day ago \| past \| discuss
		Adversarial Patch: images that make classifiers ignore other items in a scene (arxiv.org)
		1 point by felineflock 1 day ago \| past \| discuss
		Maximum Agreement Linear Predictor (MALP) (arxiv.org)
		1 point by tesserato 1 day ago \| past \| 1 comment
		Standardized and In-Depth Benchmarking of Post-Moore Dataflow AI Accelerators (arxiv.org)
		1 point by PaulHoule 1 day ago \| past \| discuss
		Fine-Tuning GPT-5 for GPU Kernel Generation (arxiv.org)
		4 points by matt_d 1 day ago \| past \| discuss
		SWE-ContextBench: context learning benchmark in coding (arxiv.org)
		1 point by mustaphah 1 day ago \| past \| discuss
		LLMs exceed physicians on complex text-based differential diagnosis (arxiv.org)
		3 points by rippeltippel 1 day ago \| past \| 2 comments
		Horus: A Protocol For Trustless Verification Under Uncertainty (arxiv.org)
		1 point by optimalsolver 1 day ago \| past \| discuss
		Learning to Reason in 13 Parameters (arxiv.org)
		2 points by stared 1 day ago \| past \| discuss
		LLM Reasoning Failures (arxiv.org)
		1 point by gradus_ad 1 day ago \| past \| discuss
		Defining causal mechanism in dual process theory and 2 types of feedback control (arxiv.org)
		1 point by s6i 1 day ago \| past \| discuss
		Routing LLM queries using internal success predictions (70% cost reduction) (arxiv.org)
		1 point by stansApprentice 2 days ago \| past \| 2 comments
		SWE-AGI: benchmarking spec-driven software construction (arxiv.org)
		1 point by mustaphah 2 days ago \| past \| 1 comment
		Authenticated Workflows: A Systems Approach to Deterministic Agentic Controls (arxiv.org)
		3 points by mrajagopalan 2 days ago \| past \| 1 comment
		Formalization and Inevitability of the Pareto Principle (arxiv.org)
		3 points by bikenaga 2 days ago \| past \| 1 comment
		RL on GPT-5 to write better kernels (arxiv.org)
		4 points by atallahw 2 days ago \| past \| 1 comment
		Quantum observers can communicate across multiverse branches (arxiv.org)
		2 points by lisper 2 days ago \| past \| discuss
		Pushing Tensor Accelerators Beyond MatMul in a User-Schedulable Language (arxiv.org)
		1 point by matt_d 2 days ago \| past \| discuss
		HySparse: A Hybrid Sparse Attention Architecture (arxiv.org)
		5 points by readitalready 2 days ago \| past \| discuss
		Biases in the Blind Spot: Detecting What LLMs Fail to Mention (arxiv.org)
		1 point by jari_mustonen 2 days ago \| past \| discuss
		Evaluation of RAG Architectures for Policy Document Question Answering (arxiv.org)
		1 point by PaulHoule 2 days ago \| past \| discuss
		SoftMatcha 2: A Fast and Soft Pattern Matcher for Trillion-Scale Corpora (arxiv.org)
		3 points by salkahfi 2 days ago \| past \| discuss
		Opus: Towards Efficient and Principled Data Selection in LLM Pre-Training (arxiv.org)
		2 points by onurkanbkrc 2 days ago \| past \| discuss
		Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters (arxiv.org)
		1 point by onurkanbkrc 2 days ago \| past \| 1 comment