AI News

1
1
MUSE: A Unified Agentic Harness for MLLMs (arxiv.org)
2
1
COD10K-C: Benchmarking Robustness of Camouflaged Object Detection Under Natural Image Corruptions (arxiv.org)
3
1
q0: Primitives for Hyper-Epoch Pretraining (arxiv.org)
4
1
From Answers to States: Verifiable Process-Level Evaluation of Chemical Reasoning in Large Language Models (arxiv.org)
5
1
Diagnosing Knowledge Gaps in LLM Tool Use: An Agentic Benchmark for Novel API Acquisition (arxiv.org)
6
1
TiWeaver: Unified Temporal Dynamics Modeling via Contextual Patching (arxiv.org)
7
1
Position: Prioritize Identifying Structure, Not Complex Models, for Scientific Discovery (arxiv.org)
8
1
EvoDrive: Pareto Evolution for Safety-Critical Autonomous Driving via Self-Improving LLM Agents (arxiv.org)
9
1
Safety Measurements for Fine-tuned LLMs Should be Grounded in Capability (arxiv.org)
10
1
TurtleAI: Benchmarking Multimodal Models for Visual Programming in Turtle Graphics (arxiv.org)
11
1
DDOR: Delta Debugging for Explainable Overrefusal Testing and Repair (arxiv.org)
12
1
The Epi-LLM Framework: probing LLM behavioral priors through epidemiological agent-based models (arxiv.org)
13
1
CoEval: Ranking Language Models for Custom Tasks Without Labeled Data or Trustworthy Benchmarks (arxiv.org)
14
1
Black-box, Adaptive, Efficient, Transferable, Harmful, Applicable... Attacks Are All You Need to Break LLMs (arxiv.org)
15
1
Qwen-Image-Flash: Beyond Objective Design (arxiv.org)
16
1
Resource-Constrained Adaptive Inference for Sequential Pricing (arxiv.org)
17
1
Target Updates May Stabilize Linear Q-Learning: Periodic and Soft Dynamics (arxiv.org)
18
1
Training-Free Multi-Concept LoRA Composition with Prompt-Aware Weighting (arxiv.org)
19
1
AI Agents Enable Adaptive Computer Worms (arxiv.org)
20
1
Gender-Dependent Diagnostic Substitution in LLM Medical Triage: Same Symptoms, Unequal Urgency (arxiv.org)
21
1
State-Coupled Volatility in Latent Dynamical Systems: Recovery Under Partial Observation (arxiv.org)
22
1
Synthetic Hallucinations, Real Gains: Hard Negatives from Frontier Models for FIM Hallucination Mitigation (arxiv.org)
23
1
DiffUNet^2: Bidirectional Prediction, Probabilistic Generation and Collaborative Visual Discovery for Scientific Data (arxiv.org)
24
1
Merit or networks? What decides where research is published (arxiv.org)
25
1
SEAOTTER: Sensor Embedded Autoencoding with One-Time Transcode for Efficient Reconstruction (arxiv.org)
26
1
AugMask: Training Diffusion Models on Incomplete Tabular Data via Stochastic Augmentation and Masking (arxiv.org)
27
1
MLSkip: Data Skipping for ML Filters via Lightweight Metadata (arxiv.org)
28
1
Finding Needles in the Haystack: Transductive Active Labeling in Ecology (arxiv.org)
29
1
PINNfluence: Interpreting PINNs through Influence Functions (arxiv.org)
30
1
Denoise First, Orthogonalize Later: Understanding Momentum in Muon via Spectral Filtering (arxiv.org)
31
1
Building Trust in Black-box Optimization: A Comprehensive Framework for Explainability (arxiv.org)
32
1
Easy-to-Use Shielding for Reinforcement Learning (arxiv.org)
33
1
R2DN: Scalable Parameterization of Contracting and Lipschitz Recurrent Deep Networks (arxiv.org)
34
1
A Geometric Lens on Physics-Aligned Data Compression (arxiv.org)
35
1
PHASE: Physiology-Aware Hyperspectral Reconstruction via Object-to-Human Domain Adaptation (arxiv.org)
36
1
Towards Non-Monotonic Entailment in Propositional Defeasible Standpoint Logic (arxiv.org)
37
1
Code-on-Graph: Iterative Programmatic Reasoning via Large Language Models on Knowledge Graphs (arxiv.org)
38
1
HARVE: Hacking-Aware Reward-Head Vector Editing for Robust Reward Models (arxiv.org)
39
1
Multi-Modal Machine Learning for Breast Cancer Recurrence Prediction (arxiv.org)
40
1
Rethinking Neural Width for Alternating Current Optimal Power Flow Proxies (arxiv.org)
41
1
InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning (arxiv.org)
42
1
FGRPO: Federated GRPO with Adaptive Aggregation on Non-IID Data (arxiv.org)
43
1
Let There Be Light: Reflection, Refraction and Scattering for Neural Operators (arxiv.org)
44
1
Calibrating Urban Traffic Simulation from Sparse Road Observations via Genetic Optimization (arxiv.org)
45
1
Proof-Refactor: Refactoring Generated Formal Proofs into Modular Artifacts (arxiv.org)
46
1
EqGINO: Equivariant Geometry-Informed Fourier Neural Operators for 3D PDEs (arxiv.org)
47
1
Learning Temporal Causal Structure via Smooth Differentiable Optimization (arxiv.org)
48
1
Constitutional On-Policy Safe Distillation (arxiv.org)
49
1
SegTune: Structured and Fine-Grained Control for Song Generation (arxiv.org)
50
1
Calibration Data Trade-offs Across Capability Dimensions: Why Multi-Source Mixing Matters for High-Sparsity LLM Pruning (arxiv.org)