AI RESEARCH

Research papers, benchmarks, methods, and evaluation signals in AI.

AI Briefing research feed from arXiv, Google Research, Google DeepMind, and Microsoft Research.

Research watch

New items from research feeds.

Google Research Blog | May 28, 2026, 8:58 PM | Research

A New Era of Innovation: Google Research at I/O 2026

General Science

arXiv cs.AI Atom Feed | May 28, 2026, 5:59 PM | Research

Physics Is All You Need? A Case Study in Physicist-Supervised AI Development of Scientific Software

Are AI agents tools, co-authors, or researchers? We present a quantified case study ($N=1$): a physicist supervising an AI coding agent (Claude Code, Sonnet and Opus models) over 12 work days and 57 sessions to build CLAX-PT, a differentiable one-loop perturbation theory module in JAX. We documented and classified 15 supervision events by intervention level.…

arXiv cs.LG Atom Feed | May 28, 2026, 5:59 PM | Research

DynaFLIP: Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation

Robot manipulation critically depends on perception that preserves the action-relevant aspects of a scene. Yet most robot learning pipelines are built upon visual encoders pre-trained for static recognition or vision-language alignment, leaving motion understanding to downstream policies. We introduce DynaFLIP, a dynamics-aware multimodal pre-training…

arXiv cs.CL Atom Feed | May 28, 2026, 5:59 PM | Research

Unlocking the Working Memory of Large Language Models for Latent Reasoning

To improve the reasoning capabilities of large language models, test-time compute is typically scaled by generating intermediate tokens before the final answer. However, this couples reasoning to autoregressive generation and thereby conflates internal computation with external communication. In contrast, human cognition can use working memory to hold and…

Microsoft Research AI | May 28, 2026, 4:00 PM | Research

Data Formulator 0.7: AI-powered data analytics for enterprise data

Data Formulator introduces AI-powered analytics for enterprise data workflows. Data teams can easily bring enterprise data into an AI-ready workspace where users can explore, analyze, and visualize data with AI agents to turn raw data into actionable insights.

Google DeepMind News | May 21, 2026, 7:46 PM | Research

We’re launching the Google DeepMind Accelerator program in Asia Pacific to tackle environmental risks

Google Research Blog | May 27, 2026, 4:56 PM | Research

Private analytics via zero-trust aggregation

Security, Privacy and Abuse Prevention

arXiv cs.AI Atom Feed | May 28, 2026, 5:59 PM | Research

VideoMLA: Low-Rank Latent KV Cache for Minute-Scale Autoregressive Video Diffusion

Long-rollout causal video diffusion has converged on a fixed-size sliding-window KV cache, with recent progress innovating within this layout by changing which tokens occupy the window or how their positions are encoded. The per-head KV layout itself, a dominant contributor to streaming memory and latency, has been mostly left unchanged. In this paper, we…

arXiv cs.LG Atom Feed | May 28, 2026, 5:59 PM | Research

Efficient Test-Time Finetuning of LLMs via Convex Reconstruction and Gradient Caching

Test-time finetuning (TTFT) is a rapidly evolving paradigm that adapts a language model to each prompt by retrieving related sequences, updating the model on them, and then evaluating the prompt. However, TTFT is only practical if it is fast: selection and finetuning both happen per query, making each a direct bottleneck. Existing methods trade speed for…

arXiv cs.CL Atom Feed | May 28, 2026, 5:58 PM | Research

Locally Coherent, Globally Incoherent: Bounding Compositional Incoherence in Multi-Component LLM Agents

Multi-component LLM agents assemble probabilistic claims from components that each see only part of a joint problem; the composition can violate basic probability axioms even when every component is locally coherent. We formalise this locally coherent, globally incoherent failure via the compositional residual eps*, the L2 distance from the composed quote to…

Microsoft Research AI | May 27, 2026, 4:00 PM | Research

Extending Human Intelligence Through AI

Understanding AI as an extension of human intelligence, not a replacement for it, offers a more grounded path for building trustworthy AI systems.

Google DeepMind News | May 18, 2026, 6:21 PM | Research

Fast-tracking genetic leads to reverse cellular aging

Biologists use Co-Scientist to find novel factors that successfully rejuvenate human cells.

Google Research Blog | May 19, 2026, 5:52 PM | Research

Empirical Research Assistance (ERA): From Nature publication to catalyzing Computational Discovery

General Science

arXiv cs.AI Atom Feed | May 28, 2026, 5:59 PM | Research

LLMSurgeon: Diagnosing Data Mixture of Large Language Models

The pretraining data mixture of Large Language Models (LLMs) constitutes their "digital DNA", shaping model behaviors, capabilities, and failure modes. Yet this composition is rarely disclosed, making post-hoc auditing of data combination or provenance difficult. In this work, we formalize Data Mixture Surgery (DMS): given only generated text from…

arXiv cs.LG Atom Feed | May 28, 2026, 5:58 PM | Research

Fairness-Aware Federated Learning with Trajectory Shapley Value

Federated learning is an emerging distributed paradigm that addresses the challenges posed by heterogeneous, privacy-sensitive data. It enables multiple clients to train a model collaboratively by aggregating their local updates at a server. However, conventional aggregation schemes typically use fixed weights that fail to reflect unequal and time-varying…

arXiv cs.CL Atom Feed | May 28, 2026, 5:58 PM | Research

Demystifying Data Organization for Enhanced LLM Training

Large Language Models (LLMs) have revolutionized various fields, yet their training efficiency is heavily reliant on effective data curation. While data selection has been widely studied, the strategic data organization for enhanced training remains an underexplored area, particularly since current LLMs are often trained for only one or a few epochs. This…

Microsoft Research AI | May 21, 2026, 5:00 PM | Research

MagenticLite, MagenticBrain, Fara1.5: An agentic experience optimized for small models

MagenticLite is an agentic system for small models that works across the browser and local file system in a single workflow. It combines specialized models and orchestration to support efficient agentic performance on everyday tasks.

Google DeepMind News | May 17, 2026, 7:53 PM | Research

Simulate real-world places with Project Genie and Street View

We’re expanding access to Google AI Ultra subscribers globally and introducing a new capability powered by Street View.
