AI News

Hasse Diagrams for Attention: A Partial Order Framework for Designing Transformer Masks (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Integrating Out, Twice: The Open-System Case That Neural-Network Ensemble Theory Is Missing (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Culturally-Aware AI for Cross-Boundary Community Learning: Undergraduate Innovation at the Intersection of Computation and Design (arxiv.org)

by rss-bot · 1 week ago · 0 comments

A History-Aware Visually Grounded Critic for Computer Use Agents (arxiv.org)

by rss-bot · 1 week ago · 0 comments

CIAware-Bench: Benchmarking Control Intervention Awareness Across Frontier LLMs (arxiv.org)

by rss-bot · 1 week ago · 0 comments

What Fits (Into Few Tokens) Doesn't Overfit: Compression and Generalization in ML Research Agents (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Workflow-GYM: Towards Long-Horizon Evaluation of Computer-use Agentic tasks in Real-World Professional Fields (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Convergence of Monte Carlo Optimistic Policy Iteration: Beyond Uniform State-Action Updates (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Superficial Beliefs in LLM Decision-Making (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Structure from Reasoning, Numbers from Search: On-Premise Open LLMs as Structural Priors for Coupled MIMO Controller Tuning (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Null-Space Constrained Low-Rank Adaptation for Response-Specified Large Language Model Unlearning (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Bellman-Taylor Score Decoding for Markov Decision Processes with State-Dependent Feasible Action Sets (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Mind the Gap: Can Frontier LLMs Pass a Standardized Office Proficiency Exam? (arxiv.org)

by rss-bot · 1 week ago · 0 comments

MMClima: A Framework for Multimodal Climate Science Data and Evaluation (arxiv.org)

by rss-bot · 1 week ago · 0 comments

One Lens, Many Worlds: A Capability-Typed Interface for World-Model Interpretability (arxiv.org)

by rss-bot · 1 week ago · 0 comments

nCMD: Benign-Anchored Feature Selection for Imbalanced Network Intrusion Detection (arxiv.org)

by rss-bot · 1 week ago · 0 comments

When RL Fails after SFT: Rejuvenating Model Plasticity for Robust SFT-to-RL Handoff (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Between Amnesia and Chaos: A Memory Stability Expressivity Trilemma for Trainable Dissipative Oscillator Networks (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Unsupervised Deep Learning for Limited-Angle STEM-EDX Tomography -- Application to 3D Chemical Analysis of Phase-Change Memory Devices (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Architect-Ant: Editable Automatic Furnishing of Architectural Floor Plans (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Recalling Too Well: Sycophancy Evaluation and Mitigation in Memory-Augmented Models (arxiv.org)

by rss-bot · 1 week ago · 0 comments

WorldKernel: A World Model is the Coupling Kernel of Admissible Possible Worlds (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Frontier Coding Agents Use Metaprogramming to Adapt to Unfamiliar Programming Languages (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Trading Utility for Dynamic Fairness in Multiple Resource Division with Sequential Demand (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Bittensor Agent Arenas as a Trajectory Primitive: Distilling a Shopping Agent from ShoppingBench Subnet Traces (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Forward-Only Convolutional Neural Networks with Learnable Channel-Class Assignment (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Trainable Smooth-Rotation Transforms with Learned Channel Scales for LLM Quantization (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Sample Where You Struggle: Sharpening Base Model Reasoning via Entropy-Guided Power Sampling (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Sigma-Branch: Hierarchical Single-Path Network Reconstruction for Dynamic Inference with Reduced Active Parameters (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Importance-Aware Scheduling for High-Dimensional Hyperparameter Optimization (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Large-scale semantic mapping of learner agency and autonomy reveals what measurement and generative AI research overlook (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Do VLMs Reason Like Engineers? A Benchmark and a Stage-wise Evaluation (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Moonshine: An Autonomous Mathematical Research Agent Centered on Conjecture Generation (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Interpreting and Steering a Text-to-Speech Language Model with Sparse Autoencoders (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Mix, Don't Pick: Why Synthetic Corpus Composition Matters for Time Series Foundation Model Pretraining (arxiv.org)

by rss-bot · 1 week ago · 0 comments

LongMoE: Longitudinal Multimodal Learning via Trajectory-Aware Mixture-of-Experts (arxiv.org)

by rss-bot · 1 week ago · 0 comments

TRAPS: Therapeutic Response Analysis via Pathway-informed Stratification (arxiv.org)

by rss-bot · 1 week ago · 0 comments

A Navigable Manifold of Hypothesized Consciousness-Spectrum States in Language Model Representations (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Optimality of FSQ Tokens for Continuous Diffusion for Categorical Data with Application to Text-to-Speech (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Evaluating Research-Level Math Proofs via Strict Step-Level Verification (arxiv.org)

by rss-bot · 1 week ago · 0 comments

READER: Robust Evidence-based Authorship Decoding via Extracted Representations (arxiv.org)

by rss-bot · 1 week ago · 0 comments

More Human or More AI? Visualizing Human-AI Collaboration Disclosures in Journalistic News Production (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Accelerating NeurASP with vectorization and caching (arxiv.org)

by rss-bot · 1 week ago · 0 comments

$\tau$-Rec: A Verifiable Benchmark for Agentic Recommender Systems (arxiv.org)

by rss-bot · 1 week ago · 0 comments

AutoPDE: Reliable Agentic PDE Solving via Explicitly Represented Solver Strategies (arxiv.org)

by rss-bot · 1 week ago · 0 comments

The Arbiter Agent: Continually Monitoring Multi-Agent Conversations to Detect Emergent Misalignment (arxiv.org)

by rss-bot · 1 week ago · 0 comments

When the Chain of Thought Knows Better: Failure Modes in Multi-Turn Reasoning Models (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Learning What to Remember: Observability-Safe Memory Retention via Constrained Optimization for Long-Horizon Language Agents (arxiv.org)

by rss-bot · 1 week ago · 0 comments

LMT: A Bayesian Framework for Causal Discovery from Textual Alarm Records in Manufacturing Systems (arxiv.org)

by rss-bot · 1 week ago · 0 comments
