
Latest News

New AI Breakthrough: Agentic Verifier Revolutionizes Multimodal Reinforcement Learning, Dramatically Reduces Hallucinations

A major leap forward in artificial intelligence has been announced. Researchers from Microsoft Research, University of Massachusetts Amherst, ETH Zurich, and University of Wisconsin-Madison have released a groundbreaking paper titled "Multimodal Reinforcement Learning with Agentic Verifier for AI Agents", introducing a powerful new method to prevent visual hallucinations in AI agents. The paper was published on arXiv on December 2, 2025.


A team of 18 researchers, led by co-first author Ruben Tan, developed Argos, an agentic verifier. Traditional multimodal reinforcement learning (MMRL) rewards models based only on the final answer, which often leads to "educated guessing" without actually looking at the image or video content. Argos changes that by treating every training example like a verifiable checklist.

Using a combination of detectors, segmenters, and language models as "teacher tools," Argos rigorously verifies every reasoning token/step the AI agent produces.
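
As a rough illustration of what one such check might look like, the sketch below tests whether a point the agent outputs for an object falls inside a bounding box returned by a detector. The `Box` type, the `point_is_grounded` function, and the sample detections are hypothetical and are not taken from the paper; the actual Argos verifier may work quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A hypothetical detector output: a labeled axis-aligned bounding box."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def point_is_grounded(label, x, y, detections):
    """Return True if (x, y) lies inside any detected box with the same label."""
    return any(
        d.label == label and d.x_min <= x <= d.x_max and d.y_min <= y <= d.y_max
        for d in detections
    )


# Made-up detections for an image containing two dogs.
detections = [Box("dog", 10, 20, 110, 140), Box("dog", 200, 30, 300, 150)]

# Points the agent claims to have located: one on a dog, one on empty background.
claims = [("dog", 50, 60), ("dog", 400, 400)]

# Fraction of the agent's claimed points that a detector can confirm.
grounded_fraction = sum(
    point_is_grounded(label, x, y, detections) for label, x, y in claims
) / len(claims)
```

A check like this only confirms that something with the right label exists at the claimed location; whether the surrounding reasoning text matches it would need a separate step.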

Key Features & Method

  • Strong Visual Grounding: The AI agent is forced to output 2D points, timestamps, and action descriptions that precisely locate objects or events in images/videos. Argos checks whether these actually exist and align with the reasoning text.
  • Model Base: Built on a 7B vision-language model (e.g., Qwen2.5 VL), it outperforms video reasoning baselines using far less data (85k vs 260k samples).
  • Argos Reward System: Combines multiple reward signals:
    • Final answer accuracy
    • Spatio-temporal localization (where and when?)
    • Reasoning justification grounded in visual evidence

This makes learning far more accurate and sample-efficient.
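
One simple way to picture how the three signals listed above could be combined is a weighted sum. This is a minimal sketch under stated assumptions: the function name `combined_reward`, the default weights, and the 0-to-1 score ranges are all illustrative, and the paper's actual aggregation may differ.

```python
def combined_reward(answer_correct, localization_score, justification_score,
                    weights=(0.5, 0.25, 0.25)):
    """Blend three reward signals into one scalar (illustrative weights only).

    answer_correct: bool, whether the final answer matches the reference.
    localization_score: float in [0, 1], quality of the where/when grounding.
    justification_score: float in [0, 1], how well the reasoning is supported
        by visual evidence.
    """
    for name, score in (("localization_score", localization_score),
                        ("justification_score", justification_score)):
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {score}")

    w_answer, w_loc, w_just = weights
    return (w_answer * float(answer_correct)
            + w_loc * localization_score
            + w_just * justification_score)


# A correct answer with no visual grounding (an "educated guess").
guess_reward = combined_reward(True, 0.0, 0.0)

# The same correct answer backed by good localization and justification.
grounded_reward = combined_reward(True, 0.8, 0.9)
```

With these assumed weights, a correct but ungrounded answer earns less than a correct and grounded one, which is the behavior an outcome-only reward cannot express.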

Figure 1 in the paper clearly shows how Argos works: the left panel demonstrates a dog-counting task with precise pointing (x1-y1 coordinates), while the right panel shows downstream applications like robotic manipulation (placing a toilet paper roll), task planning, and spatial reasoning (calculating 90-degree angles).

Results: Superior Performance, Massive Reduction in Hallucinations

Argos was tested across multiple agentic benchmarks including embodied task planning, robot control, and spatial reasoning. The results are impressive:

  • Visual Grounding Score: 0.66 (significantly above baselines)
  • Accuracy: 1.0 (perfect match in many cases)
  • Clearly outperforms base Qwen2.5 VL and outcome-only RL methods (e.g., Video-R1), especially in hallucination reduction
  • Ideal for real-world applications like robotics, interactive GUIs, and human-AI collaboration

The team claims this is the first agentic learning framework for multimodal RL, making AI agents not just smarter, but truly grounded in reality.

Community Reaction

On X (formerly Twitter), Rohan Paul (@rohanpaul_ai), who originally shared the paper, called it “brilliant”: it essentially forces models to “show receipts” for their visual claims. Many users praised its framing of hallucinations as data integrity failures rather than mere errors.

This development marks a massive step toward reliable, trustworthy AI agents – crucial for self-driving cars, assistive robots, and collaborative systems.

Full paper available on arXiv: arXiv:2512.03438

