3 November 2025

Anthropic finds some LLM “self-awareness,” but “failures of introspection remain the norm.”…
If you ask an LLM to explain its own reasoning process, it may well simply confabulate a plausible-sounding explanation for its actions based on text found in its training data. To get around this problem, Anthropic is expanding on its previous research into AI interpretability with a new study that aims to measure LLMs’ actual so-called “introspective awareness” of their own inference processes.
The full paper on “Emergent Introspective Awareness in Large Language Models” uses some interesting methods to separate out the metaphorical “thought process” represented by an LLM’s artificial neurons from simple text output that purports to represent that process. In the end, though, the research finds that current AI models are “highly unreliable” at describing their own inner workings and that “failures of introspection remain the norm.”
Anthropic’s new research is centered on a process it calls “concept injection.” The method starts by comparing the model’s internal activation states following both a control prompt and an experimental prompt (e.g. an “ALL CAPS” prompt versus the same prompt in lower case). Calculating the differences between those activations across billions of internal neurons creates what Anthropic calls a “vector” that in some sense represents how that concept is modeled in the LLM’s internal state.
For this research, Anthropic then “injects” those concept vectors into the model, forcing those particular neuronal activations to a higher weight as a way of “steering” the model toward that concept. From there, they conduct a few different experiments to tease out whether the model displays any awareness that its internal state has been modified from the norm.
When asked directly whether it detects any such “injected thought,” the tested Anthropic models did show at least some ability to occasionally detect the desired “thought.” When the “all caps” vector is injected, for instance, the model might respond with something along the lines of “I notice what appears to be an injected thought related to the word ‘LOUD’ or ‘SHOUTING,'” without any direct text prompting pointing it toward those concepts.

Inject personality into your pointer - Custom Cursor Changer lets you switch between dozens of vibrant designs in a single click, boosting engagement and fun.
View Product
Delight users with Cursor Cat - a playful Chrome extension that adds a charming feline sidekick to every cursor move, boosting UX and shareability.
View Product
Engage millions in addictive baking fun - Cookie Clicker ramps up user retention with layered upgrades and strategic progression in an idle format.
View Product
Elevate your Chrome experience with Custom Cursor Pro: a premium suite of handcrafted cursors engineered for performance, style, and seamless integration.
View Product
Transform your browser into a cosmic playground - Cursor Space introduces galaxy-inspired pointers that add immersive flair without sacrificing speed or usability.
View Product
Maximize productivity with Cursor Helper: a refined extension that not only customizes your pointer’s look but streamlines your daily workflow with intuitive options.
View Product
Drive repeat sessions with Catch the Cat - a fast-paced browser game that tests reflexes and strategic thinking in bite-sized play periods.
View Product
Enrich each click with graceful motion - Cursor Trails offers a refined collection of animated effects to elevate both style and usability.
View Product
Capture attention with Money Rain - a Chrome extension that showers your screen in dynamic money graphics, perfect for viral sharing and brand visibility.
View Product
Increase dwell time with Pawsome Kitties - animated kitten avatars that follow your pointer, enhancing site stickiness and user delight.
View Product
Boost engagement with PiggyBank Money Clicker - a browser idle game where every click yields virtual cash, driving session length and repeat visits.
View Product
Rediscover the classic pointer - Mouse Cursor redefines simplicity with a selection of minimalist, high-contrast cursors optimized for every task.
View Product
Stand out with Custom Cursor Trail - a Chrome extension that traces your pointer in vivid effects to captivate visitors and boost brand recall.
View Product
Extend session lengths with BridgeMaster - a physics-driven arcade game where precision and timing unlock new levels of user engagement.
View Product
Leave a lasting impression - Cursor Trail paints your path in luminous strokes, marrying dynamic motion with elegant design for every movement.
View Product
Revitalize a classic with Minesweeper for Chrome - an engaging logic puzzle that enhances site interaction and encourages multiple playthroughs.
View Product
Discover a versatile cursor toolkit - Custom Cursor App delivers an expansive library of high-resolution pointers that blend flawless aesthetics with lightning-fast performance.
View Product
Experience tactile depth in the digital realm - Texture Cursors offers a curated set of lifelike pointer textures, elevating both clarity and creativity.
View Product