I recently ran into a Redis migration problem with Sidekiq. I needed to point our workers at a new Redis instance while safely draining jobs still sitting in the old one. I turned to Claude for advice — and what followed is a good reminder of why GenAI output should always be treated as a starting point, not a final answer.
Step 1: My Initial Idea (and Why Claude Correctly Pushed Back)
My first instinct was a "coin toss" approach: configure the Sidekiq server to point to the old Redis with 50% probability and the new Redis otherwise, while the client always pushes to the new Redis. Once the old queue drained, I'd do a clean cutover.
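For concreteness, here's roughly what that coin-toss configuration would have looked like (a sketch only; `OLD_REDIS_URL` and `NEW_REDIS_URL` are illustrative env var names, not anything Sidekiq defines):

```ruby
# config/initializers/sidekiq.rb -- sketch of the coin-toss idea.
old_redis_url = ENV.fetch("OLD_REDIS_URL")
new_redis_url = ENV.fetch("NEW_REDIS_URL")

Sidekiq.configure_server do |config|
  # The 50/50 choice happens once, at process boot: Sidekiq connects to a
  # single Redis for the life of the process. That boot-time randomness is
  # exactly the retry hazard described below.
  config.redis = { url: rand < 0.5 ? old_redis_url : new_redis_url }
end

Sidekiq.configure_client do |config|
  # Clients always push new jobs to the new Redis.
  config.redis = { url: new_redis_url }
end
```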
Claude's response here was actually good:
Claude: The coin toss approach has some real problems... Non-deterministic behavior is hard to reason about in production. You lose retry guarantees. A job that fails while a worker is connected to the old Redis will be retried there — but if on the next attempt the worker happens to connect to the new Redis, it won't see that job. The job could get stuck or lost. Sidekiq server connects to one Redis at startup, not per-job...
Fair enough. The critique was accurate. Claude then suggested a cleaner alternative: use Sidekiq 7's config.capsules to run multiple Redis connections within a single process — one capsule for the new Redis, one to drain the old.
Step 2: A Confident, Detailed, Completely Wrong Answer
I was on Sidekiq Pro 7.3.9, so I asked Claude to show me how to configure capsules for this use case. It gave me this:
```ruby
Sidekiq.configure_server do |config|
  # New Redis - primary capsule (default)
  config.redis = { url: new_redis_url }

  # Old Redis - drain capsule
  config.capsule("legacy") do |cap|
    cap.redis = { url: old_redis_url }
    cap.concurrency = 5
    cap.queues = %w[default mailers critical]
  end
end

Sidekiq.configure_client do |config|
  config.redis = { url: new_redis_url }
end
```
The response was detailed. It used the right vocabulary. It even included helpful caveats about thread pool sizing, checking the retry and scheduled sets before teardown, and queue enumeration via redis-cli SMEMBERS queues. It felt authoritative.
There was just one problem: it was wrong.
Step 3: Thirty Seconds in the Docs
Before implementing anything, I did a quick check of the official Sidekiq documentation. Here's what it says, plainly:
Sidekiq Docs: "There is still one iron-clad rule: a Sidekiq process only executes jobs from one Redis instance; all Capsules within a process must use the same Redis instance. If you want to process jobs from two separate Redis instances, you need to start two separate Sidekiq processes."
Capsules are for organizing queues and concurrency within a single Redis instance. They cannot span multiple Redis instances. The entire premise of Claude's solution was invalid.
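For contrast, here's what capsules are actually for, based on my reading of the docs: partitioning queues and thread counts within one Redis instance. A sketch, with illustrative queue names:

```ruby
# config/initializers/sidekiq.rb -- every capsule shares the ONE Redis
# configured for the process. Queue names are illustrative.
Sidekiq.configure_server do |config|
  config.redis = { url: ENV.fetch("REDIS_URL") }

  # The default capsule handles most work.
  config.queues = %w[default mailers]
  config.concurrency = 10

  # A second capsule restricts a touchy queue to a single thread --
  # still against the same Redis instance.
  config.capsule("serial") do |cap|
    cap.queues = %w[imports]
    cap.concurrency = 1
  end
end
```

Useful, but nothing here gets you a second Redis instance.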
When I pointed this out, Claude acknowledged it immediately:
Claude: You're right, I was wrong — capsules cannot span multiple Redis instances. They're for organizing queues/concurrency within a single Redis. So for your drain scenario, you actually need two separate Sidekiq processes with different config files.
No pushback. No hedging. Just a clean admission that the confident, detailed answer was incorrect.
What Makes This Failure Mode Dangerous
This isn't a story about AI being obviously broken. It's about AI being convincingly wrong — and that's a harder problem to guard against.
A few things made this particularly easy to fall for:
- The answer was structurally sound. Capsules are a real Sidekiq 7 feature. The code was syntactically valid Ruby. The caveats about concurrency and retry sets were genuinely useful. Only the core assumption — that capsules can target different Redis instances — was wrong.
- The fluency signals trust. When an answer uses the right terminology, references the right version numbers, and anticipates edge cases, it reads as expert. That fluency is a product of training on large amounts of text, not of verified understanding.
- The model doesn't know what it doesn't know. Claude didn't say "I'm not certain about the multi-Redis constraint — check the docs." It presented the solution as if it were established fact.
A Simple Rule of Thumb
If you wouldn't ship code based solely on a Stack Overflow answer from 2019 without reading the docs, don't ship code based solely on a GenAI answer either. The bar should be the same — or higher, because at least the Stack Overflow answer has upvotes, comments, and a date stamp.
GenAI is genuinely useful for orientation: understanding an unfamiliar API surface, exploring options, getting unstuck. But any answer that involves a specific documented behavior — especially version-specific constraints — needs at least one authoritative source check before you act on it.
In this case, thirty seconds in the Sidekiq docs saved what could have been hours of debugging a fundamentally broken architecture. That's a pretty good return on thirty seconds.
The actual solution, if you're curious: two separate Sidekiq processes with separate config files, each pointing at a different Redis instance. One processes new work, one drains the old queues. When the old queue, retry set, and scheduled set are all empty, shut the old process down.
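One way to sketch that is a single initializer switched by an env var, so both processes share one config file. `SIDEKIQ_MODE`, `OLD_REDIS_URL`, and `NEW_REDIS_URL` are my names, not Sidekiq's:

```ruby
# config/initializers/sidekiq.rb -- one file, two processes.
drain = ENV["SIDEKIQ_MODE"] == "drain"

Sidekiq.configure_server do |config|
  # The drain process binds to the old Redis; the primary to the new one.
  config.redis = { url: drain ? ENV.fetch("OLD_REDIS_URL") : ENV.fetch("NEW_REDIS_URL") }
  config.concurrency = drain ? 5 : 10
end

Sidekiq.configure_client do |config|
  # All clients -- including jobs enqueued from inside either worker --
  # push to the new Redis, so the old one only ever shrinks.
  config.redis = { url: ENV.fetch("NEW_REDIS_URL") }
end
```

Run `bundle exec sidekiq` twice, once with `SIDEKIQ_MODE=drain` set. To know when the drain process can be stopped, check the old Redis with Sidekiq's API (`Sidekiq::Queue#size`, `Sidekiq::RetrySet#size`, `Sidekiq::ScheduledSet#size`) and shut it down once all three read zero.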