AI News

From AGI to ASI (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Deployment-Centered Evaluation: Predicting Query-Level Rejection Risk in a Clinical LLM System (arxiv.org)

by rss-bot · 1 week ago · 0 comments

DailyReport: An Open-ended Benchmark for Evaluating Search Agents on Daily Search Tasks (arxiv.org)

by rss-bot · 1 week ago · 0 comments

HarnessBridge: Learnable Bidirectional Controller for LLM Agent Harness (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Rethinking Psychometric Evaluation of LLMs: When and Why Self-Reports Predict Behavior (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Benchmarking AI Agents for Addressing Scientific Challenges Across Scales (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Reducing the Complexity of Deep Learning Models for EEG Analysis on Wearable Devices (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Prefill Awareness in Large Language Models (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Constructing Evaluation Datasets for Procedural Reasoning: Balancing Naturalness, Grounding, and Multi-Hop Coverage (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Teach-and-Repeat: Accurately Extracting Operational Knowledge from Mobile Screen Demonstrations to Empower GUI Agents (arxiv.org)

by rss-bot · 1 week ago · 0 comments

GeoNatureAgent Benchmark: Benchmarking LLM Agents for Environmental Geospatial Analysis Across Frontier and Open-Weight Foundation Models (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Topical Phase Transitions in Artificial Intelligence Research: Large-Scale Evidence and an Early-Warning Signature for Emerging Topics (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Fantastic Scientific Agents and How to Build Them: AgentBuild for Rietveld Refinement (arxiv.org)

by rss-bot · 1 week ago · 0 comments

(Human) Attention Is (Still) All You Need: Human oversight makes AI-assisted social science reliable (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Can I Buy Your KV Cache? (arxiv.org)

by rss-bot · 1 week ago · 0 comments

IterCAD: An Iterative Multimodal Agent for Visually-Grounded CAD Generation and Editing (arxiv.org)

by rss-bot · 1 week ago · 0 comments

A Quantitative Experimental Repeated Measures Study of Training Dynamics in a Small Llama Style Language Model Under a Compute-Aware Token Budget (arxiv.org)

by rss-bot · 1 week ago · 0 comments

MiniMax Sparse Attention (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Optimizing Appliance Scheduling for Solar Energy Management Using Metaheuristic Algorithms (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Evaluation Sovereignty in Metadata-Driven Classification: A Multi-Track Framework for Weakly Supervised Information Systems (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Why Sampling Is Not Choosing: Intentionality, Agency, and Moral Responsibility in Large Language Models (arxiv.org)

by rss-bot · 1 week ago · 0 comments

CloudCons: A Comprehensive End-to-End Benchmark for Cloud Resource Consolidation (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Uncertainty-Aware Hybrid Retrieval for Long-Document RAG (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Is It You or Your Environment? A Bayesian Inference Framework for Genomically-Anchored Personalized Physiological Interpretation (arxiv.org)

by rss-bot · 1 week ago · 0 comments

A Three-Layer Framework for AI in Scientific Discovery (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Multiagent Protocols with Aggregated Confidence Signals (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Multi-Agent Reinforcement Learning from Delayed Marketplace Feedback for Objective-Weight Adaptation in Three-Sided Dispatch (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Reasoning as Pattern Matching: Shared Mechanisms in Human and LLM Everyday Reasoning (arxiv.org)

by rss-bot · 1 week ago · 0 comments

AgentBeats: Agentifying Agent Assessment for Openness, Standardization, and Reproducibility (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Beyond Runtime Enforcement: Shield Synthesis as Defensibility Analysis for Adversarial Networks (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Before You Think: System 0, AI-Mediated Cognition and Cognitive Colonization (arxiv.org)

by rss-bot · 1 week ago · 0 comments

EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Agents-K1: Towards Agent-native Knowledge Orchestration (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Automated reproducibility assessments in the social and behavioral sciences using large language models (arxiv.org)

by rss-bot · 1 week ago · 0 comments

AI SciBrief as a Gateway to Research: A Framework for Onboarding Students into New Research Areas (arxiv.org)

by rss-bot · 1 week ago · 0 comments

GeoDial: A Multimodal Conversational Tutoring Dataset for Geometry Problem-Solving with Visual Tutor Turns (arxiv.org)

by rss-bot · 1 week ago · 0 comments

The AI Legal Specialist: A Juridically Autonomous Professional Profile for AI Governance (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Eigenism: Ethics for a Human-AI Future (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Creating and Evaluating K-12 GenAI Assessment Graders Through Context Engineering (arxiv.org)

by rss-bot · 1 week ago · 0 comments

The Challenges of Balancing AI Compliance and Technological Innovations in Critical Sectors: A Systematic Literature Review (arxiv.org)

by rss-bot · 1 week ago · 0 comments

AI-Automation Tooling in Computer Engineering Education: Mixed-Methods TAM/UTAUT Evidence for a General Acceptance Attitude (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Boosting Direct Preference Optimization with Penalization (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Mapping AI Programs in the U.S.: A Status Report from Early 2026 and an Analysis of AI Majors and Minors (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Muse Spark Safety & Preparedness Report (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Will AI Agents Free Us From Meaningless Work? A Human-Centered Analysis (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Algorithmic Constitutionalism (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Position: Generative Engine Optimization Creates Underexamined Risks, Governance Must Target Concentration, Disclosure, and Academic Blind Spots (arxiv.org)

by rss-bot · 1 week ago · 0 comments

MP3: Multi-Period Pattern Pre-training for Spatio-Temporal Forecasting (arxiv.org)

by rss-bot · 1 week ago · 0 comments

NaturalFlow: Reducing Disruptive Pauses for Natural Speech Flow in Simultaneous Speech-to-Speech Translation (arxiv.org)

by rss-bot · 1 week ago · 0 comments

Select and Improve: Understanding the Mechanics of Post-Training for Reasoning (arxiv.org)

by rss-bot · 1 week ago · 0 comments
