10 July 2025
xAI claims new multi-agent model hits top benchmarks as Nazi controversy lingers.
On Wednesday night, Elon Musk unveiled xAI's latest flagship models, Grok 4 and Grok 4 Heavy, via livestream, just one day after the company's Grok chatbot began generating outputs that featured blatantly antisemitic tropes in responses to users on X.
Of the two models, xAI calls Grok 4 Heavy its "multi-agent version." According to Musk, Grok 4 Heavy "spawns multiple agents in parallel" that "compare notes and yield an answer," simulating a study group approach. The company describes this as test-time compute scaling (similar to previous simulated reasoning models), claiming that it increases the computational resources used at runtime (called "inference") by roughly an order of magnitude.
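xAI has not published how Grok 4 Heavy's agents actually "compare notes," so the following is only a toy illustration of the general idea: several independent attempts run in parallel and their answers are combined. Everything here is an assumption for illustration, including the names `ask_in_parallel`, `compare_notes`, and `heavy_answer`, the use of a simple majority vote as the combining step, and the stand-in agents, which are plain functions rather than real models.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# Hypothetical stand-in: an "agent" is any function mapping a question to an answer.
Agent = Callable[[str], str]


def ask_in_parallel(question: str, agents: List[Agent]) -> List[str]:
    """Run every agent on the same question concurrently and collect the answers."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        # pool.map returns results in the same order as the agents list.
        return list(pool.map(lambda agent: agent(question), agents))


def compare_notes(answers: List[str]) -> str:
    """Assumed combining step: return the most common answer (majority vote)."""
    return Counter(answers).most_common(1)[0][0]


def heavy_answer(question: str, agents: List[Agent]) -> str:
    """Spend more compute at inference time: N attempts instead of one."""
    return compare_notes(ask_in_parallel(question, agents))


if __name__ == "__main__":
    # Three toy agents; one of them disagrees with the other two.
    toy_agents = [lambda q: "4", lambda q: "5", lambda q: "4"]
    print(heavy_answer("What is 2 + 2?", toy_agents))
```

In this sketch, the extra cost scales directly with the number of agents, which is the basic trade-off behind test-time compute scaling: more inference-time work in exchange for a potentially more reliable answer.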
During the livestream, Musk claimed the new models achieved frontier-level performance on several benchmarks. On Humanity's Last Exam, a deliberately challenging test with 2,500 expert-curated questions across multiple subjects, Grok 4 reportedly scored 25.4 percent without external tools, which the company says outperformed OpenAI's o3 at 21 percent and Google's Gemini 2.5 Pro at 21.6 percent. With tools enabled, xAI claims Grok 4 Heavy reached 44.4 percent. However, it remains to be seen if these AI benchmarks actually measure properties that translate to usefulness for users.
The release timing proved particularly noteworthy given the events of the preceding 48 hours on Musk's X social media platform, which included multiple instances of the chatbot labeling itself as "MechaHitler." The antisemitic posts emerged after an update over the weekend that instructed the chatbot to "not shy away from making claims which are politically incorrect, as long as they are well substantiated." xAI reportedly removed the modified directive on Tuesday.