ThirkleBot Recovery — Root Cause Analysis & Fixes

auth-profiles.json was missing API keys for the opencode-go and openrouter providers, so every cron job failed at auth. Also fixed the cron paths, ollama, and the gateway scheduled task.

What happened

ThirkleBot “worked for a little bit then just stopped.” It was unresponsive — no scheduled tasks running, no WhatsApp messages, no market scans. After a full diagnostic, here’s what was wrong:

Root Cause #1: Missing API Key Auth Profiles

The agent’s auth-profiles.json only had ollama:default configured. The opencode-go and openrouter providers had no API keys in the auth store. The primary model chain was:

  1. opencode-go/deepseek-v4-pro → auth fail (no key)
  2. opencode-go/deepseek-v4-flash → auth fail (no key)
  3. openrouter/deepseek/deepseek-v4-flash → auth fail (no key)
  4. openrouter/deepseek/deepseek-v4-pro → auth fail (no key)
  5. ollama/qwen3:8b → connection refused (ollama wasn’t running)

Every cron job would hit this entire chain, waste 30+ seconds per attempt, then fail silently. The API keys did exist in .env.private but the agent auth store never had entries referencing them.

Fix: Added opencode-go:default and openrouter:default auth profiles with env var references.
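
One way to catch this kind of gap before the cron jobs do is to compare the model chain against the profile ids in the auth store. The sketch below relies only on what this post shows: model ids begin with the provider name followed by a slash, and profile ids look like provider:default. The on-disk layout of auth-profiles.json is not covered here, so the profile ids are passed in as a plain set, and providers_missing_auth is a hypothetical helper name, not part of the agent.

```python
# Sketch: find providers in the model chain that have no auth profile.
# Assumption: profile ids follow the "<provider>:default" pattern seen above.

MODEL_CHAIN = [
    "opencode-go/deepseek-v4-pro",
    "opencode-go/deepseek-v4-flash",
    "openrouter/deepseek/deepseek-v4-flash",
    "openrouter/deepseek/deepseek-v4-pro",
    "ollama/qwen3:8b",
]


def providers_missing_auth(model_chain, profile_ids):
    """Return providers (chain order, deduplicated) lacking a '<provider>:default' profile."""
    missing = []
    for model_id in model_chain:
        # The provider is everything before the first slash.
        provider = model_id.split("/", 1)[0]
        if f"{provider}:default" not in profile_ids and provider not in missing:
            missing.append(provider)
    return missing


# Before the fix: only ollama:default was configured.
print(providers_missing_auth(MODEL_CHAIN, {"ollama:default"}))
# → ['opencode-go', 'openrouter']

# After the fix: all three profiles exist.
print(providers_missing_auth(
    MODEL_CHAIN,
    {"ollama:default", "opencode-go:default", "openrouter:default"},
))
# → []
```

Running a check like this at startup would turn four silent auth failures per job into one visible warning.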

Root Cause #2: Ollama Not Running

The Ollama server wasn’t running, so the local model fallback (qwen3:8b) was getting ECONNREFUSED on 127.0.0.1:11434.

Fix: Started ollama serve and began pulling qwen3:8b.
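
A connection-refused error like this can be detected up front with a plain TCP probe. This is a minimal sketch using only the Python standard library; port_is_open is a hypothetical helper name, and it only confirms that something is listening on the port, not that the model has finished downloading.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError (ECONNREFUSED) as well as timeouts.
        return False
```

For the case above the call would be port_is_open("127.0.0.1", 11434), which returns False while ollama serve is not running.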

Root Cause #3: Wrong Paths in Cron Jobs

Two scheduled jobs had incorrect paths:

  • blog-post-generator pointed to D:\Projects\arc-hub (doesn’t exist)
  • arc-hub-daily-briefing pointed to https://arc-hub.vercel.app (doesn’t exist)

Fix: Corrected the blog-post-generator path to C:\Projects\arc-raiders-loadout-planner and the arc-hub-daily-briefing URL to https://arcraidershub-gules.vercel.app.
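
Stale targets like these are cheap to detect before a job runs. The sketch below is one possible pre-flight check, under assumptions that are mine rather than the agent's: jobs are given as a simple name-to-target mapping, local targets are directories, and a URL counts as alive if a HEAD request returns a non-error status. The names is_url, url_responds, and broken_targets are hypothetical.

```python
from pathlib import Path
import urllib.request


def is_url(target):
    """Treat anything starting with http:// or https:// as a URL."""
    return target.lower().startswith(("http://", "https://"))


def url_responds(url, timeout=5.0):
    """Return True if a HEAD request gets a non-error response."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except OSError:
        # URLError and HTTPError (4xx/5xx) both derive from OSError.
        return False


def broken_targets(jobs, url_check=url_responds):
    """Return the names of jobs whose path or URL target cannot be reached."""
    broken = []
    for name, target in jobs.items():
        ok = url_check(target) if is_url(target) else Path(target).is_dir()
        if not ok:
            broken.append(name)
    return broken
```

Passing the two job targets from above as a dict, for example {"blog-post-generator": r"D:\Projects\arc-hub"}, would list each job whose target is missing. Some servers reject HEAD requests, so a real check might need to fall back to GET.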

Root Cause #4: Scheduled Task Misconfigured

The “OpenClaw Gateway” scheduled task was running pm2 resurrect instead of StartGateway.vbs. The boot trigger existed but ran the wrong program.

Fix: Recreated the task with StartGateway.vbs as the action, proper boot trigger, and restart-on-failure settings.
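
To confirm what a scheduled task will actually launch, the verbose listing from the Windows schtasks tool can be parsed. This is a rough sketch, assuming an English-language Windows install where the listing contains a "Task To Run" field; on localized systems the label differs. The helper names task_to_run and query_task are hypothetical.

```python
import subprocess


def task_to_run(verbose_listing):
    """Extract the 'Task To Run' value from `schtasks /Query /V /FO LIST` output."""
    for line in verbose_listing.splitlines():
        # Split at the first colon only, so drive letters in the value survive.
        label, sep, value = line.partition(":")
        if sep and label.strip() == "Task To Run":
            return value.strip()
    return None


def query_task(task_name):
    """Run schtasks for one task and return its verbose listing (Windows only)."""
    result = subprocess.run(
        ["schtasks", "/Query", "/TN", task_name, "/V", "/FO", "LIST"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

On the affected machine, task_to_run(query_task("OpenClaw Gateway")) should mention StartGateway.vbs after the fix rather than pm2 resurrect.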

Root Cause #5: Session Lock Contention

A cron job session held a write lock for 17+ minutes, blocking all subsequent cron jobs. The lock eventually timed out but by then multiple jobs had already failed.

Fix: No code change needed — the auto-timeout mechanism eventually cleared the lock.
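
Although nothing was changed here, a long-held lock is easy to flag earlier than the built-in timeout does. The sketch below is purely illustrative: how the agent stores its session lock is not described in this post, so it assumes the acquisition time is available as epoch seconds, and both lock_is_stale and the five-minute threshold are hypothetical.

```python
import time


def lock_is_stale(acquired_at, max_hold_seconds, now=None):
    """Return True if a lock acquired at `acquired_at` (epoch seconds) has been held too long."""
    now = time.time() if now is None else now
    return (now - acquired_at) > max_hold_seconds


# A lock held for 17 minutes against a 5-minute threshold (hypothetical value).
print(lock_is_stale(acquired_at=0, max_hold_seconds=5 * 60, now=17 * 60))
# → True
```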

What’s still needed

  • Domain DNS: Add a CNAME record at the registrar pointing to cname.vercel-dns.com
  • Gateway restart: The running gateway still has the old config and needs a restart to pick up the auth changes
  • Ollama download: qwen3:8b is 5.2 GB and is still downloading in the background

Standards reference

This article relates to the Architecture standards.

Written by Jordan Thirkle

Stay-at-home dad building AI-accelerated products. I write code during naps and after bedtime — every post comes from real work, not theory.
