AI News

⚡ 8 minutes ago
1
1
Hasse Diagrams for Attention: A Partial Order Framework for Designing Transformer Masks (arxiv.org)
2
1
Integrating Out, Twice: The Open-System Case That Neural-Network Ensemble Theory Is Missing (arxiv.org)
3
1
Culturally-Aware AI for Cross-Boundary Community Learning: Undergraduate Innovation at the Intersection of Computation and Design (arxiv.org)
4
1
A History-Aware Visually Grounded Critic for Computer Use Agents (arxiv.org)
5
1
CIAware-Bench: Benchmarking Control Intervention Awareness Across Frontier LLMs (arxiv.org)
6
1
What Fits (Into Few Tokens) Doesn't Overfit: Compression and Generalization in ML Research Agents (arxiv.org)
7
1
Workflow-GYM: Towards Long-Horizon Evaluation of Computer-use Agentic tasks in Real-World Professional Fields (arxiv.org)
8
1
Convergence of Monte Carlo Optimistic Policy Iteration: Beyond Uniform State-Action Updates (arxiv.org)
9
1
Superficial Beliefs in LLM Decision-Making (arxiv.org)
10
1
Structure from Reasoning, Numbers from Search: On-Premise Open LLMs as Structural Priors for Coupled MIMO Controller Tuning (arxiv.org)
11
1
Null-Space Constrained Low-Rank Adaptation for Response-Specified Large Language Model Unlearning (arxiv.org)
12
1
Bellman-Taylor Score Decoding for Markov Decision Processes with State-Dependent Feasible Action Sets (arxiv.org)
13
1
Mind the Gap: Can Frontier LLMs Pass a Standardized Office Proficiency Exam? (arxiv.org)
14
1
MMClima: A Framework for Multimodal Climate Science Data and Evaluation (arxiv.org)
15
1
One Lens, Many Worlds: A Capability-Typed Interface for World-Model Interpretability (arxiv.org)
16
1
nCMD: Benign-Anchored Feature Selection for Imbalanced Network Intrusion Detection (arxiv.org)
17
1
When RL Fails after SFT: Rejuvenating Model Plasticity for Robust SFT-to-RL Handoff (arxiv.org)
18
1
Between Amnesia and Chaos: A Memory Stability Expressivity Trilemma for Trainable Dissipative Oscillator Networks (arxiv.org)
19
1
Unsupervised Deep Learning for Limited-Angle STEM-EDX Tomography -- Application to 3D Chemical Analysis of Phase-Change Memory Devices (arxiv.org)
20
1
Architect-Ant: Editable Automatic Furnishing of Architectural Floor Plans (arxiv.org)
21
1
Recalling Too Well: Sycophancy Evaluation and Mitigation in Memory-Augmented Models (arxiv.org)
22
1
WorldKernel: A World Model is the Coupling Kernel of Admissible Possible Worlds (arxiv.org)
23
1
Frontier Coding Agents Use Metaprogramming to Adapt to Unfamiliar Programming Languages (arxiv.org)
24
1
Trading Utility for Dynamic Fairness in Multiple Resource Division with Sequential Demand (arxiv.org)
25
1
Bittensor Agent Arenas as a Trajectory Primitive: Distilling a Shopping Agent from ShoppingBench Subnet Traces (arxiv.org)
26
1
Forward-Only Convolutional Neural Networks with Learnable Channel-Class Assignment (arxiv.org)
27
1
Trainable Smooth-Rotation Transforms with Learned Channel Scales for LLM Quantization (arxiv.org)
28
1
Sample Where You Struggle: Sharpening Base Model Reasoning via Entropy-Guided Power Sampling (arxiv.org)
29
1
Sigma-Branch: Hierarchical Single-Path Network Reconstruction for Dynamic Inference with Reduced Active Parameters (arxiv.org)
30
1
Importance-Aware Scheduling for High-Dimensional Hyperparameter Optimization (arxiv.org)
31
1
Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution (arxiv.org)
32
1
Large-scale semantic mapping of learner agency and autonomy reveals what measurement and generative AI research overlook (arxiv.org)
33
1
Do VLMs Reason Like Engineers? A Benchmark and a Stage-wise Evaluation (arxiv.org)
34
1
Moonshine: An Autonomous Mathematical Research Agent Centered on Conjecture Generation (arxiv.org)
35
1
Interpreting and Steering a Text-to-Speech Language Model with Sparse Autoencoders (arxiv.org)
36
1
Mix, Don't Pick: Why Synthetic Corpus Composition Matters for Time Series Foundation Model Pretraining (arxiv.org)
37
1
LongMoE: Longitudinal Multimodal Learning via Trajectory-Aware Mixture-of-Experts (arxiv.org)
38
1
TRAPS: Therapeutic Response Analysis via Pathway-informed Stratification (arxiv.org)
39
1
A Navigable Manifold of Hypothesized Consciousness-Spectrum States in Language Model Representations (arxiv.org)
40
1
Optimality of FSQ Tokens for Continuous Diffusion for Categorical Data with Application to Text-to-Speech (arxiv.org)
41
1
Evaluating Research-Level Math Proofs via Strict Step-Level Verification (arxiv.org)
42
1
READER: Robust Evidence-based Authorship Decoding via Extracted Representations (arxiv.org)
43
1
More Human or More AI? Visualizing Human-AI Collaboration Disclosures in Journalistic News Production (arxiv.org)
44
1
Accelerating NeurASP with vectorization and caching (arxiv.org)
45
1
$\tau$-Rec: A Verifiable Benchmark for Agentic Recommender Systems (arxiv.org)
46
1
AutoPDE: Reliable Agentic PDE Solving via Explicitly Represented Solver Strategies (arxiv.org)
47
1
The Arbiter Agent: Continually Monitoring Multi-Agent Conversations to Detect Emergent Misalignment (arxiv.org)
48
1
When the Chain of Thought Knows Better: Failure Modes in Multi-Turn Reasoning Models (arxiv.org)
49
1
Learning What to Remember: Observability-Safe Memory Retention via Constrained Optimization for Long-Horizon Language Agents (arxiv.org)
50
1
LMT: A Bayesian Framework for Causal Discovery from Textual Alarm Records in Manufacturing Systems (arxiv.org)