AI News

1
1
TianJi-Environ: An Autonomous AI Scientist for Atmospheric Environmental Research (arxiv.org)
2
1
MLingualFC: Evaluating Jailbreak Vulnerabilities in Multilingual Vision-Language Models (arxiv.org)
3
1
Cross-View Urban Traffic Dataset: Drone-Supervised Ground Truth for Monocular Bird's-Eye View Localization (arxiv.org)
4
1
Scaling Decision-Focused Learning to Large Problems with Lagrangian Decomposition (arxiv.org)
5
1
Active Flow Expansion for Out-of-Distribution Discovery: from Theory to Molecules (arxiv.org)
6
1
Knowledge Graphs and Reasoning LLMs for Finding Simple Yet Effective Transcriptomic Perturbation Predictors (arxiv.org)
7
1
Diffuse AI Control on Fuzzy Tasks (arxiv.org)
8
1
Cheap Reward Hacking Detection (arxiv.org)
9
1
Synthetic but Not Realistic: The Evaluation Challenge in Generative Modelling for Structured Electronic Medical Records (arxiv.org)
10
1
Quantum-Enhanced Similarity Measures for Polarimetric Materials Classification (arxiv.org)
11
1
Beyond Point Estimates: Benchmarking Uncertainty Quantification Methods on the AION-1 Astronomical Foundation Model (arxiv.org)
12
1
Memetic Capture: A Pluralistic Policy Framework for Governing AI-Driven Cultural Disempowerment (arxiv.org)
13
1
SLMJury: Can Small Language Models Judge as Well as Large Ones? (arxiv.org)
14
1
C$^3$ache: Accelerating World Action Models with Cross Inference Chunk Cache (arxiv.org)
15
1
The ACUTE Protocol: Operationalizing Language Model Activations for Better Calibration, Utility, and Trust (arxiv.org)
16
1
Beyond English benchmarks: clinical LLM evaluation in Brazilian Portuguese (arxiv.org)
17
1
Model Multiplicity for Adversarial Detection in Small Language Model Training on Edge Devices (arxiv.org)
18
1
The Last Visible Pixel: Probing Fine-Scale Perception in Vision-Language Models (arxiv.org)
19
1
The Cross-Architecture Substrate: A Domain-Transcendent, Calibration-Surviving Geometric Invariant of Modern Vision Encoders (arxiv.org)
20
1
Generalized Rank-based Evaluation for Knowledge Graph Completion: Perspectives, Framework, and Analyses (arxiv.org)
21
1
PROBE-Web: An Interactive System for Probing Evaluation Landscapes of Knowledge Graph Completion Models (arxiv.org)
22
1
From Hazard Functions to Language Space: Cox-Supervised Distillation of Survival Risk into a Large Language Model (arxiv.org)
23
1
Self-Consistent Generative Paths via Admissible Random Variational Transport (arxiv.org)
24
1
From inverse problems to neural operators: prediction, mechanism, and generalization of data-driven models (arxiv.org)
25
1
Online Learning with Recency: Algorithms for Sliding-window Streaming Multi-armed Bandits (arxiv.org)
26
1
LEAF: A Learning-Enabled ADMM Framework for Accelerated Convex Optimization (arxiv.org)
27
1
Structural Grid Descriptors Predict Within-Task Solver Success on ARC-AGI (arxiv.org)
28
1
TRIAGE: Dialectical Reasoning for Explainable Risk Prediction on Irregularly Sampled Medical Time Series with LLMs (arxiv.org)
29
1
DynaCF: Mitigating Shortcut Learning in Reward Models via Dynamic Counterfactual Sensitivity (arxiv.org)
30
1
Decoy-Calibrated Failure Audits for Language Models (arxiv.org)
31
1
Larch: Learned Query Optimization for Semantic Predicates (arxiv.org)
32
1
Decoupling Semantics and Logic: A Training-Free Coarse-to-Fine Pipeline for Video Retrieval-Augmented Generation (arxiv.org)
33
1
Illusions of the Gold Standard: A Large-scale Analysis of Human Evaluation Protocols for Long-form Text Generation (arxiv.org)
34
1
POISE: Position-Aware Undetectable Skill Injection on LLM Agents (arxiv.org)
35
1
From 'May' to 'Is': Certainty Distortion in Language Model Rewriting (arxiv.org)
36
1
RecurGuard: Runtime Monitoring for Reasoning-Token Consumption Attacks (arxiv.org)
37
1
Auditable Graph-Guided Root Cause Analysis for Kubernetes Incidents (arxiv.org)
38
1
HARBOR: A Harness Framework for Agentic Robot Reinforcement Learning (arxiv.org)
39
1
Beyond Convolution: Advancing Hypergraph Neural Networks with Hypergraph U-Nets (arxiv.org)
40
1
Stage-1 Controls the Entropy Regime, Not the Outcome (arxiv.org)
41
1
OnlyDense: Reduced-Order Modeling for Lagrangian simulation (arxiv.org)
42
1
A Unifying Lens on Reward Uncertainty in RLHF (arxiv.org)
43
1
SHIELD-IDS: Structurally Heterogeneous Ensemble with Integrated Layered Defense for Intrusion Detection Systems (arxiv.org)
44
1
Steer Where It Matters: Token-Level Visual-Sensitivity Steering for LVLMs Hallucination Mitigation (arxiv.org)
45
1
SC3: The Multi-Solvent Solubility Challenge and Benchmark (arxiv.org)
46
1
QDS-SNN: Energy-efficient Quantum Deeply-Supervised Spiking Neural Network Algorithm for Traffic Sign Recognition (arxiv.org)
47
1
What neurosurgeons need to see: synthetic intra-operative MRI from ultrasound for brain-shift compensation in brain tumour surgery (arxiv.org)
48
1
Reconstructing Synthetic SDO/AIA 193 Å EUV Images from He I 10830 Å Observations with Diffusion Model Translator (arxiv.org)
49
1
FiberTune: Preserving Action-Fiber Visual Residuals in Vision-Language-Action Fine-Tuning (arxiv.org)
50
1
Latent Diffusion Policy: Shaping Latent Spaces for Diffusion-Based Robotic Manipulation (arxiv.org)