AI News

⚡ 15 minutes ago
1. Dr. DocBench: A Comprehensive Benchmark for Expert-Level and Difficult Document Parsing (arxiv.org)
2. Consistent and Distinctive: LLM Benchmark Efficiency via Maximum Independent Set Prompt Selection on Similarity Graphs (arxiv.org)
3. On the Evaluation of Spiking Neural Network Configurations for Network Intrusion Detection (arxiv.org)
4. Crazyflow: An Accurate, GPU-Accelerated, Differentiable Drone Simulator in JAX (arxiv.org)
5. LLM Consortium for Software Design Refinement: A Controlled Experiment on Multi-Agent Collaboration Topologies (arxiv.org)
6. ClawHub Security Signals: When VirusTotal, Static Analysis, and SkillSpector Disagree (arxiv.org)
7. Scalable Counterfactual Risk Estimation for Rare Events in Longitudinal Data (arxiv.org)
8. TLG: Temporal-Logic Grounding for Video Question Answering via Source-Annotation Reconstruction and Category-Targeted Reasoning (arxiv.org)
9. Learning Chaotic Dynamics through Second-Order Geometric Supervision (arxiv.org)
10. Don't Let a Few Network Failures Slow the Entire AllReduce (arxiv.org)
11. Off-the-Shelf LLMs as Process Scorers: Training-Free Alternative to PRMs for Mathematical Reasoning (arxiv.org)
12. TimeSage-MT: A Multi-Turn Benchmark for Evaluating Agentic Time Series Reasoning (arxiv.org)
13. Move the Query, Not the Cache: Characterizing Cross-Instance Latent Attention Redistribution Across GPU Fabrics (arxiv.org)
14. On the Limits of Token Reduction for Efficient Unified Vision Language Training (arxiv.org)
15. Agent Operating Systems (AOS): Integrating Agentic Control Planes into, and Beyond, Traditional Operating Systems (arxiv.org)
16. Self-Conditioned Positional HNSW for Overlap-Aware Retrieval in Chunked-Document RAG Systems: Method and Industrial Evidence-Quality Audit (arxiv.org)
17. Defenses & Enablers For Skill Injection Attacks on Terminal Based Agents (arxiv.org)
18. IstGPT: LLM-based Anomaly Detection for Spatial-Temporal Graph in Industrial Systems (arxiv.org)
19. Understanding Identity Continuity in Thermal Video through Scene-Level Consistency (arxiv.org)
20. KDH-CAD: Knowledge-data hybrid CAD learning under data scarcity (arxiv.org)
21. Density-Aware Translation of Spurious Correlations in Zero-Shot VLMs (arxiv.org)
22. Practical Aspects on Solving Differential Equations Using Deep Learning: A Primer (arxiv.org)
23. FlatVPR: Plug-and-play Geo-linear Residual Adapter for Geometric Rectification of Foundation Model Feature Manifolds (arxiv.org)
24. Sensitivity as a Double-Edged Sword: A Trade-off Between Discriminability and Adversarial Robustness (arxiv.org)
25. Accelerating Min-Max Optimization via Power-Law Stepsizes (arxiv.org)
26. An Algebraic View of the Expressivity of Recurrent Language Models (arxiv.org)
27. Multilinguality of Large Language Models From a Structural Perspective (arxiv.org)
28. Identifying High-Confidence Social Biases in LLMs for Trustworthy Conversational Tutoring Agents (arxiv.org)
29. TechGraphRAG: An Agentic Graph-Augmented RAG Framework for Technical Literature Reasoning (arxiv.org)
30. EvoPool: Evolutionary Programmatic Annotation for Label-Efficient Specialized Supervision (arxiv.org)
31. AlphaToken: Decoupling Adaptation and Stability for Path-Aware Response Token Valuation in LLM Post-Training (arxiv.org)
32. Easier to Mislead Than to Correct: Harmful and Beneficial Revision in LLM Conformity (arxiv.org)
33. ProbeScale: Probing Analysis to Optimize Neural Scaling Laws for Efficient Small Language Model Inference (arxiv.org)
34. "I've Seen How This Goes": Characterizing Diversity via Progressive Conditional Surprise (arxiv.org)
35. LayerRoute: Input-Conditioned Adaptive Layer Skipping via LoRA Fine-Tuning for Agentic Language Models (arxiv.org)
36. Observation, Not Prediction: Conversation-Level Disaggregated Scheduling for Agentic Serving (arxiv.org)
37. The Lie We Tell: Correcting the Euclidean Fallacy in Vision Language Action Policies via Score Matching on Tangent Space (arxiv.org)
38. Time-Aware Diffusion based on Preference Disentanglement for Generative Recommendation (arxiv.org)
39. HAIM: Human-AI Music Datasets for AI Music Production Tracking Benchmark (arxiv.org)
40. RPCASSM: Robust PCA State Space Model For Infrared Small Target Detection (arxiv.org)
41. JenBridge: Adaptive Long-Form Video Soundtracking across Scene Transitions (arxiv.org)
42. MidSurfNet: Learnable Face Pairing and Interference Implicit Fields for Generalized Mid-surface Abstraction (arxiv.org)
43. Resonant Context Anchoring: Decoupling Attention Routing and Signal Gain at Inference Time (arxiv.org)
44. Learning Action-Conditional and Object-Centric Gaussian Splatting World Models for Rigid Objects (arxiv.org)
45. Graph Edit Distance Formulation for the Vehicle Routing Problem: Theory and Analysis (arxiv.org)
46. A Structured Benchmark for Text-Guided Anomaly Detection: When Language Stops Conditioning the Decision (arxiv.org)
47. Echo: A Joint-Embedding Predictive Architecture for Speaker Diarization and Speech Recognition in a Shared Latent Space (arxiv.org)
48. MMG2Skill: Can Agents Distill In-the-Wild Guides into Self-Evolving Skills? (arxiv.org)
49. Machine Learning for Coding Retail Product Names to Consumer-Price Categories: A Rule-plus-Bag-of-Words Pipeline with Reliability-Weighted Human-in-the-Loop Labeling (arxiv.org)
50. Unveiling the Entropy Dynamics of Chain-of-Thought Reasoning (arxiv.org)