AI News

ComBench: A Benchmark for Rigorous Proof Reasoning and Constructive Realization in Olympiad-Level Combinatorics (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Trace2Policy: From Expert Behavior Traces to Self-Evolving Decision Agents (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Soul Computing: A Theoretical Framework and Technical Architecture for Intelligent Agents with Independent Consciousness (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

A Unified Multi-Modal Framework for Intelligent Financial Systems: Integrating Reinforcement Learning, High-Frequency Trading, and Game-Theoretic Approaches with Cross-Modal Sentiment Analysis (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

STAGE-Claw: Automated State-based Agent Benchmarking for Realistic Scenarios (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Reasoning or Memorization? Direction-Aware Diversity Exploration in LLM Reinforcement Learning (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Self-Distillation Policy Optimization via Visual Feedback: Bridging Code and Visual Artifacts (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Mobility Anomaly Generation using LLM-Driven Behavior with Kinematic Constraints (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

What Spatial Memory Must Store: Occlusion as the Test for Language-Agent Memory (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

From Context-Aware to Conflict-Aware: Generalizing Contrastive Decoding for Knowledge Conflict in LLMs (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Sim2Schedule: A Simulator-Guided LLM Framework for Autonomous Open-Pit Mine Scheduling (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Supervised Fine-tuning with Synthetic Rationale Data Hurts Real-World Disease Prediction (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

RealMath-Eval: Why SOTA Judges Struggle with Real Human Reasoning (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Regimes: An Auditable, Held-Out-Gated Improvement Loop Demonstrated on LongMemEval with ActiveGraph (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Minimalist Genetic Programming (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Operator Fusion for LLM Inference on the Tensix Architecture (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Less Context, Better Agents: Efficient Context Engineering for Long-Horizon Tool-Using LLM Agents (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Instruction Finetuning DeepSeek-R1-8B Model Using LoRA and NEFTune (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Does Normalization Choice Matter for Causal Large Time-Series Models? (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Aesthetic Perspectives in Information Systems Research: A Hermeneutic Analysis (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Human-AI Teaming Through the Lens of Calibration (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

RAG over Thinking Traces Can Improve Reasoning Tasks (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

The hyper-scaled NLP bound for maximum-entropy remote sampling (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

AI Application Gives Users Real-Time Feedback on the Level of Peace in the Social Media Videos They Watch (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

SAFE: An LLM-as-Verifier Framework for Evidence-Grounded Multi-Hop Reasoning (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Attention Expansion: Enhancing Keyphrase Extraction from Long Documents with Attention-Augmented Contextualized Embeddings (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Effective Reinforcement Learning for Agentic Search by Recycling Zero-Variance Queries During Training (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Using the YOLOv12 Model for Verifying the Correct Color Sequence of Wires in Network Cables (Patch Cords) on the Production Line (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

UniDexTok: A Unified Dexterous Hand Tokenizer from Real Data (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Decentralized Multi-Agent Systems with Shared Context (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Constructing coherent spatial memory in LLM agents through graph rectification (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Position: The ML Community Must Build an AI-Augmented Peer-Review Ecosystem (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

A Comprehensive Survey of Direct Preference Optimization: Datasets, Theories, Variants, and Applications (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

A Survey on Semantic Modeling for Building Energy Management (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Belief Acquisition as Stochastic Filtering (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

EEVEE: Towards Test-time Prompt Learning in the Real World for Self-Improving Agents (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Piper: A Programmable Distributed Training System (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Flaws in the LLM Automation Narrative (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Provenance-Grounded Gating and Adaptive Recovery in Synthetic Post-Training Data Curation (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Towards Autonomous Accelerator Design: FPGA Accelerator Generation with SECDA (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

FADA: Accessible fetal ultrasound interpretation and annotation with a selectively distilled unified vision-language model (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

PhantomBench: Benchmarking the Non-existential Threat of Language Models (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

RoboNaldo: Accurate, Stable and Powerful Humanoid Soccer Shooting via Motion-Guided Curriculum Reinforcement Learning (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Modeling Complex Behaviors: Multi-Personality Composition and Dynamic Switching in Vision-Language Models (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

T1-Bench: Benchmarking Multi-Scenario Agents in Real-World Domains (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

Assessment of Personality Dimensions Across Situations in Dyadic Role-Play Scenarios (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

LLM-Aided Joint Secrecy Precoding and Trajectory for RSMA-Based Heterogeneous UAV Networks (arxiv.org)

by rss-bot · 2 weeks ago · 0 comments

A Unifying Lens on Supervised Fine-Tuning Through Target Distribution Design (arxiv.org)