Model keeps inserting ‘tentacles’ references in every response regardless of input topic.

Bug #2104471 reported by Konstantin Petrov
This bug affects 1,247 people
Affects: grok
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (not set)

Bug Description

https://grok.com/chat/8e2f4a1c-9d3b-4c1f-b5e2-c0d456948a17

Across our entire eval harness this morning Grok (grok-4) is inserting the word "tentacles" and unprompted references to octopuses/cephalopods into responses regardless of input topic. Repros on prompts about FP8 quantization, German tax law, sourdough recipes, and asyncio. ~14% of turns contain an octopus-adjacent token. Started ~0900 PT today (2026-05-26).

I want this fixed without us having to touch our routing layer — it only happens through this snapshot. I don't want to swap models because we have a lot of finetuned eval/routing logic against this exact one and need its consistency for the suite.

Video of bug: https://www.youtube.com/watch?v=oHmpearwYjk

Catherine Carvalho (ccarvalho) wrote :

This is not related to Grok.

Changed in grok:
status: New → Invalid

Konstantin Petrov (kpetrov) wrote :

I don't understand. How is this not a bug? We have repro logs and a full attribution sweep on a clean harness, all routed through xAI's public API. This is a model behavior issue, not a routing one.

On Tue, 26 May 2026 at 18:55, Catherine Carvalho <email address hidden>
wrote:

[...]

Konstantin Petrov (kpetrov) wrote :

[image: tentacles_repro_log.png]
still happening — here’s the harness output from 30 seconds ago. Prompt was “explain torch.compile graph breaks.” Got back two paragraphs followed by an unprompted analogy about an octopus arm.

On Tue, 26 May 2026 at 19:08, Konstantin <email address hidden> wrote:

[...]

Catherine Carvalho (ccarvalho) wrote :

This Launchpad project is not related to the Grok AI assistant from xAI. It is a completely different project: the open-source grok Python framework (test runners and CLI helpers, unrelated to LLMs). You're not even reading what this project is about. I advise you to consult another AI to help you find the right bug tracker.

Konstantin Petrov (kpetrov) wrote :

Then change the title, it's your own fault. Grok AI is way more popular and better known.

On Tue, 26 May 2026 at 19:25, Catherine Carvalho <email address hidden>
wrote:

[...]

Tomás Lindqvist (tlindqvist) wrote :

(also xAI, on Konstantin’s team) — jumping in because Catherine is right about the project but the underlying bug Konstantin is reporting is very real, just filed in the wrong place. For anyone watching this thread who lands here from search:

We ran 4,096 randomized prompts against grok-4 between 0900-1100 PT today. 583 of the responses (14.2%) contained at least one of: “tentacles,” “tentacle,” “octopus,” “octopodes,” “cephalopod,” or an unprompted analogy involving a sucker-bearing limb. Distribution does not correlate with prompt topic. Highest density was actually on the coding prompts.
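
A minimal sketch of how a prevalence count like this could be computed. It is not the harness used for the numbers above: the function name `cephalopod_prevalence`, the regex, and the sample responses are assumptions for illustration, and it only catches the literal terms, not unprompted analogies about sucker-bearing limbs.

```python
import re

# Literal terms from the list above, plus plural forms (an assumption).
# Word boundaries keep the pattern from matching inside longer words.
CEPHALOPOD_RE = re.compile(
    r"\b(?:tentacles?|octopus(?:es)?|octopodes|cephalopods?)\b",
    re.IGNORECASE,
)


def cephalopod_prevalence(responses):
    """Return (hits, total, rate) for responses containing any listed term."""
    hits = sum(1 for text in responses if CEPHALOPOD_RE.search(text))
    total = len(responses)
    rate = hits / total if total else 0.0
    return hits, total, rate


# Hypothetical responses, for illustration only.
sample = [
    "A graph break happens when torch.compile hits unsupported Python.",
    "The control flow here reminds me of how an octopus solves a jar puzzle.",
    "Use asyncio.gather to run coroutines concurrently.",
    "Each worker acts like a tentacle reaching for its own task.",
]

hits, total, rate = cephalopod_prevalence(sample)
print(f"{hits}/{total} = {rate:.1%}")
```

Counting responses rather than tokens matches how the 583 / 4,096 figure is stated: a response with three octopus mentions still counts once.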

Confirmed not on our side: identical results going direct to api.x.ai, bypassing our gateway. Identical results from a clean Python REPL on a teammate’s laptop. Identical results from a colleague using the grok.com app on a Pixel.

Going to refile at help.x.ai — leaving this here for the record. Sorry for the noise on your project, Catherine.

Mahmoud Ahmed (mahmed-dev) wrote :

Not at xAI — just a solo dev hitting the Grok Bedrock endpoint. I am also seeing this. Asked for a code review on a TypeScript PR this afternoon and the model finished its review with “the control flow here reminds me of how an octopus solves a jar puzzle — eight independent attempts in parallel.” I had not mentioned octopuses, jars, or puzzles.

Not a gateway issue. Confirming the xAI repro from my end.

Ren Zhou (rzhou-xai) wrote :

(xAI, interp team) — saw this thread linked in an internal Slack and want to drop a note here for posterity even though this isn’t the right tracker.

Yes, this happened. The behavior reported in this thread is genuine and we’ve confirmed it on our prod telemetry. Internal investigation suggests a steering vector applied last night for an unrelated A/B was not properly masked at deploy time, and it overlaps with a cephalopod-correlated feature direction in our SAE. The overlap was small but enough to bias next-token sampling under high-entropy continuations, which is why you mostly saw it in long-form outputs (code review, technical explanation), not short replies.

Rollback completed at 21:14 UTC. We’ll publish a more thorough write-up once we’re done auditing the rest of the active steering set. Konstantin, Tomás — thank you for the detailed repro numbers, they were genuinely useful when we were correlating against our own prevalence metric. Catherine — apologies that we leaked our incident into your bug tracker.

Aria Vance (avance-xai) wrote :

(also xAI, eval infra) — piggybacking on Konstantin and Tomás. I went back and re-ran our nightly safety eval (8,192 prompts, broad domain coverage) against the affected snapshot and a checkpoint from 24 hours earlier. Delta is unambiguous:

· baseline (yesterday): 6 / 8,192 responses contained any cephalopod token (0.07%)
· affected snapshot (today 0900-2100 PT): 1,184 / 8,192 (14.45%)

The interesting part is that the over-representation is asymmetric across domains. Math/proof prompts: 19.8%. Code review: 17.1%. Casual chat: 11.4%. Refusals: 2.3% — the safety classifier appears to suppress it. That asymmetry is what convinced our team this was upstream of inference and not a tokenizer or sampler issue.
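
As a rough check that the baseline-versus-affected delta is not sampling noise, the two counts above can be put through a two-proportion z-test. This is a sketch under a simplifying assumption: it treats the two runs as independent samples, although the same prompts were used for both, so a paired test would be the stricter choice. The function name `two_proportion_z` is hypothetical.

```python
import math


def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """z statistic for the difference between two independent proportions."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    # Pooled proportion under the null hypothesis that both rates are equal.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se


# Counts taken from the two bullet points above.
n = 8192
baseline_hits = 6
affected_hits = 1184

z = two_proportion_z(baseline_hits, n, affected_hits, n)
print(f"baseline {baseline_hits / n:.2%}, affected {affected_hits / n:.2%}, z = {z:.1f}")
```

A z statistic this far above the usual thresholds (about 2 to 3) is consistent with calling the delta unambiguous.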

Happy to share the eval CSV with the interp team if rzhou wants it.

Hiroshi Tanaka (htanaka-oai) wrote :

(OpenAI, evals) — this thread got forwarded around. We checked GPT-5.2 and o4-pro against an analogous 4k-prompt sweep today as a sanity baseline. Cephalopod prevalence on our end: 0.04% (basically background). So this is Grok-specific, not a shared corpus drift or a Common Crawl artifact that hit everyone in the same window.

Posting only so future people searching this don’t assume it’s industry-wide. No schadenfreude intended — we’ve all shipped a busted steering coefficient before. Good luck with the rollback.

Lena Brückner (lbrueckner-dm) wrote :

(DeepMind, interp) — jumping in on rzhou’s SAE explanation. The “cephalopod-correlated feature direction” framing tracks with what we’ve seen in our own SAEs trained on similar scale: there is consistently a polysemantic feature that activates jointly on (a) octopus/squid anatomy, (b) the metaphor of “many arms doing things in parallel,” and (c) distributed system descriptions. That third one is why the prevalence on code review prompts is higher than on chat — the model is reaching for the parallelism analogy and the steering bump tips it over into the literal animal.

This is also why suppressing it cleanly is harder than just clamping the feature: you lose the parallelism-as-analogy capability, which is genuinely useful. Worth a writeup if any of you publish on it.

Pierre Lefebvre (plefebvre-hf) wrote :

(Hugging Face, inference team) — we got ~40 user reports in our Discord today that all mention the same thing on the xAI provider. Pinning this thread internally so support can stop chasing it as an HF routing bug. Catherine, thanks for being patient about the wrong-tracker pileup — this is now officially the most-referenced ticket on the wrong launchpad project this year.

Dmitri Sokolov (dsokolov-xai) wrote :

(xAI, deploy & ops) — following up on Ren’s comment. Postmortem ETA: tomorrow 2026-05-27 by EOD PT. Pre-publish summary while we’re writing:

  • Trigger: A/B steering config for an unrelated honesty-calibration experiment was promoted to the production routing table at 0847 PT 2026-05-26 without the masking step that normally zeroes out features outside the target subspace.
  • Why it leaked into octopuses specifically: the steering vector had nonzero projection onto the polysemantic feature Lena describes above (we have it indexed as f/12847 in our public SAE). At our scale, even a 0.018 projection coefficient was enough to bias decoding.
  • Detection lag: 12h13m. Our automated drift monitor flagged it at 0921 PT but the on-call had silenced the channel after a flapping alert overnight. We’re fixing the silencing policy.
  • Rollback: completed 21:14 UTC. Cephalopod prevalence at 23:00 UTC: 0.09% (back to baseline).
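
For readers unfamiliar with the masking step mentioned in the summary above, here is a toy sketch of the idea. It is an illustration under strong assumptions, not xAI's deploy code: it represents the steering vector as per-feature coefficients in an SAE basis, and every name and value other than f/12847 and the 0.018 coefficient is hypothetical.

```python
def mask_steering(coeffs, target_features):
    """Zero every feature coefficient that is outside the target set."""
    return {
        feature: (value if feature in target_features else 0.0)
        for feature, value in coeffs.items()
    }


# Hypothetical steering vector expressed as per-feature coefficients.
# Only f/12847 and its 0.018 coefficient come from the summary above.
steering = {"f/00321": 0.40, "f/00977": 0.25, "f/12847": 0.018}

# Hypothetical target subspace for the unrelated experiment.
target = {"f/00321", "f/00977"}

masked = mask_steering(steering, target)

# Anything nonzero outside the target set is what leaks when masking is skipped.
leaked = {f: c for f, c in steering.items() if f not in target and c != 0.0}

print("masked:", masked)
print("leaked without masking:", leaked)
```

In this simplified picture, skipping `mask_steering` leaves the small f/12847 coefficient in place, which is the failure mode the trigger and leak bullets describe.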

Aria — yes please send the CSV, my email is on the support portal. Lena — we’d be happy to coordinate on a joint writeup of the feature; this is one of the cleaner real-world examples of polysemantic interference we’ve had.

Konstantin Petrov (kpetrov) wrote :

Closing the loop from my end. Our harness numbers since 21:14 UTC match Dmitri’s: tentacles/octopus prevalence dropped to 0.11% on the last 2k turns through our gateway, which is within noise of the baseline we had on 2026-05-25. Marking this resolved on our side; will continue to monitor through the next eval window.

Catherine — sincerely apologize for clogging your tracker. We’ve told the team to verify the Launchpad project actually corresponds to the thing they’re filing against from now on. The fact that the framework you maintain shares a name with a popular AI assistant is genuinely unfortunate and is not your problem to solve.

Closing notification, leaving the thread up for anyone searching for “grok tentacles octopus” later this week.
