Vibe Check

am i any good at this?

Anyone can hold down a camera shutter. Doesn't make them a photographer. Anyone can vibe code. Doesn't make them a software engineer. This page is an honest, self-graded report card of how well my vibe coding actually holds up, and what I'm doing about the gaps.

Vibe Score

41

mid

first measurement · check back next week for trend

How 41 gets calculated

Average of the three theme scores below. Each theme = average of its sub-metrics (4 for the first theme, 3 for the other two). Each metric is scored 0–100 (higher = better) against a clear threshold (e.g. 0% fix commits = 100, 80% fix commits = 0).
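The threshold example above reads like a linear interpolation between a "best" and a "worst" value. A minimal sketch, assuming the falloff is linear and clamped to 0–100 (the page only states the two endpoints; `linear_score` and its parameter names are hypothetical):

```python
def linear_score(value: float, best: float, worst: float) -> float:
    """Map a raw metric onto 0-100, where `best` scores 100 and `worst` scores 0.

    Assumption: the falloff between the two thresholds is linear, and values
    outside the thresholds are clamped.
    """
    if best == worst:
        raise ValueError("best and worst thresholds must differ")
    fraction = (value - worst) / (best - worst)
    return 100.0 * max(0.0, min(1.0, fraction))


# Fix-commit share: 0% fix commits = 100, 80% fix commits = 0.
print(round(linear_score(71, best=0, worst=80)))
```

Under that assumption, a 71% fix-commit share lands on 11, which agrees with the Fix-to-feature score reported further down. The thresholds for the other metrics are not stated here, so they are left out.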

🚨 34 · 🤡 37 · 🧹 52

(34 + 37 + 52) ÷ 3 = 41
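The same roll-up can be reproduced from the sub-scores shown on this page. A short sketch, assuming each theme is rounded half-up before the final average (the rounding rule is an assumption; `half_up` is a hypothetical helper):

```python
import math
from statistics import mean


def half_up(x: float) -> int:
    """Round to the nearest integer, with .5 rounding up (assumed convention)."""
    return math.floor(x + 0.5)


# Sub-scores as reported on this page, grouped by theme.
themes = {
    "Does it actually work?": [0, 7, 100, 27],
    "Do I know what I'm doing?": [11, 100, 0],
    "Did I leave a mess?": [25, 80, 50],
}

theme_scores = {name: half_up(mean(subs)) for name, subs in themes.items()}
vibe_score = half_up(mean(list(theme_scores.values())))

print(theme_scores)
print(vibe_score)
```

This yields 34, 37 and 52 for the themes and 41 overall, the same numbers as above.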

What I'm working on this month

Stop shipping bugs to prod and patching after the fact

Targeting Broken in prod (0/100, tied with Longest debug spiral for lowest sub-score).

  • Add a pre-merge smoke check to muse-shopping — that's where last month's hotfixes landed.
  • Before merging any auth/payment/data PR, run the actual user flow in preview first.

Refreshed every Sunday. Next score in this metric tells the story.

Improver activity (last 30 days)

1 opened · 1 skipped
  • 2026-05-22 · Broken in prod · muse-shopping · skipped
  • 2026-05-22 · Broken in prod · muse-shopping · see PR →

Autonomous engine — opens draft PRs only, never auto-merges.

Habits I'm trying to break

Extracted weekly from my own session transcripts by vibe-coach. The lessons get written into my global CLAUDE.md so the next session enforces them automatically.

  • Break sessions at the 2-hour mark

    Long sessions ≠ productive sessions. Force a commit and re-orient at 2h.

    Evidence: 5 of 8 recent sessions hit the 6-hour cap · 1 week in flight

  • Grep before re-implementing

    Check if the utility already exists before writing it. Habit lives in CLAUDE.md so the next session enforces it.

    Evidence: Almost wrote safe_msg twice in one session · 1 week in flight

  • End research sessions with a written handoff

    Any research session ≥1h ends with a one-page note. Otherwise the next session restarts the search from zero.

    Evidence: One 6-hour session produced zero artifacts · 1 week in flight

  • Audit scheduled-task sessions that hit the 6h cap with no work

    Cron tasks running to the cap with zero Edits/Writes are likely stuck, not productive. Verify their lastRunAt completion vs the cap timeout.

    Evidence: 4 cron sessions (reddit-pulse-health-check, claude-reddit-pulse, calmar-bug-fixer, resume trigger) hit 6h cap with 0 Edits/Writes · 1 week in flight

Going well, keep doing it

  • Dry-runs caught real bugs before prod (awk truncation in vibe-improver)
  • Zero hotfix sequences in the last 7 days (vs 69 in the 30-day baseline)

9 sessions analyzed · refreshed Saturdays

How well is Claude executing my vibe-coding?

Three signals: how often I have to course-correct, how well Claude follows my custom rules, and who's driving the habit fixes.

Override rate

Share of my messages that course-correct Claude. Lower = Claude is predicting my intent better.

2.7% this week · -0.5 pts over 6 weeks

Detects course-correction language ("no", "actually", "wait", "that's wrong"). Noisy in weeks where I'm refining requirements rather than catching mistakes: the detector can't fully separate productive corrections from Claude failures.
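A keyword detector of this kind can be sketched in a few lines. This is an illustration only, assuming a plain regex over the four phrases quoted above; the real detector's phrase list and matching rules are not described on this page, and `override_rate` is a hypothetical name:

```python
import re

# Phrases quoted on this page; the real detector's list may be longer.
CORRECTION = re.compile(r"\b(no|actually|wait|that's wrong)\b", re.IGNORECASE)


def override_rate(user_messages: list[str]) -> float:
    """Fraction of user messages containing course-correction language."""
    if not user_messages:
        return 0.0
    flagged = sum(1 for msg in user_messages if CORRECTION.search(msg))
    return flagged / len(user_messages)


sample = [
    "No, use the existing helper instead",
    "Looks good, ship it",
    "Actually, wait: revert that last change",
    "I know what the next step is",
]
print(f"{override_rate(sample):.1%}")
```

The word boundaries keep "know" from matching "no", but a harmless message like "no rush" would still be flagged, which is the noise described above.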

CLAUDE.md compliance

83

Sampled session: 3270008a. Each rule scored against actual session behavior.

  • 90 · Scope discipline: boring vs ambitious framing before >2 moving parts. Framed boring vs ambitious 4 times this session (privacy fixes, vibe-improver, vibe-coach, claude-quality). One miss on page UI changes.
  • 60 · Constraint check before coding. Verified gh auth + local clones before vibe-improver run. Skipped pre-checks on aggregator + page changes.
  • 100 · Deploy verification: confirmed live URL after every push. Every push followed by curl + grep verification in background. Zero "declared done before verified" instances.
  • 80 · Don't punt work back to Hannah. Mostly self-sufficient. One legitimate UI handoff ("hit Run now in the sidebar") that I couldn't drive myself.

Who's driving the habit fixes?

Each habit in flight gets tagged with who first surfaced it. claude_proactive = Claude flagged the issue in the moment. hannah_corrected = Hannah noticed and pushed back. tool_caught = an automated check (linter, test, vibe-improver) caught it before review.

1 Claude proactive · 1 Hannah corrected · 2 Tool caught

🚨 Does it actually work?

the prod reality check

34

Average of 4 metrics below. Weakest: Broken in prod (0/100).

Broken in prod

hotfixes within 24h of the commit they patched

9

0/100

Hotfixes and the commits they patched

  • 4318cc4 Fix broken CTAs, deploy Next.js frontend, clean up prod for…
    patched 19bfeb2 "Deploy 250+ brand multi-retailer inventory system" after 0h · muse-shopping · Mar 25, 2026
  • f0eeb07 Fix product page crash, story navigation, hero CTAs, and dyn…
    patched 4318cc4 "Fix broken CTAs, deploy Next.js frontend, clean up prod for…" after 8.8h · muse-shopping · Mar 26, 2026
  • da27d75 fix: correct BrandLogo props on brands slug page
    patched 7bae25a "Add auto build-log workflow" after 0.4h · muse-shopping · Mar 30, 2026
  • 3a83f67 fix(auth): handle OAuth-only accounts + set trust proxy for…
    patched f0ebcb5 "Add Claude session status handoff doc" after 12.4h · muse-shopping · Apr 14, 2026
  • 0c92c13 fix(auth): set trust proxy in Vercel serverless entry too
    patched 3a83f67 "fix(auth): handle OAuth-only accounts + set trust proxy for…" after 0.3h · muse-shopping · Apr 14, 2026
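The pairing rule behind this list is not spelled out here. One plausible reading, sketched below, treats any commit whose subject starts with "fix" as a hotfix for the commit immediately before it, provided the gap is 24h or less. That rule, the `Commit` type, and the timestamps are all assumptions for illustration; only the SHAs and subjects come from the list above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Commit:
    sha: str
    subject: str
    when: datetime


def find_hotfixes(commits: list[Commit], window: timedelta = timedelta(hours=24)):
    """Pair each fix commit with the commit just before it, if within `window`.

    Assumption: the "patched" commit is simply the previous commit in the
    same repo; the real pairing rule is not described on this page.
    """
    ordered = sorted(commits, key=lambda c: c.when)
    pairs = []
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur.when - prev.when
        if cur.subject.lower().startswith("fix") and gap <= window:
            pairs.append((cur.sha, prev.sha, round(gap.total_seconds() / 3600, 1)))
    return pairs


# Hypothetical timestamps; SHAs and subjects are taken from the list above.
history = [
    Commit("f0ebcb5", "Add Claude session status handoff doc", datetime(2026, 4, 14, 1, 0)),
    Commit("3a83f67", "fix(auth): handle OAuth-only accounts + set trust proxy for…", datetime(2026, 4, 14, 13, 24)),
    Commit("0c92c13", "fix(auth): set trust proxy in Vercel serverless entry too", datetime(2026, 4, 14, 13, 42)),
]
print(find_hotfixes(history))
```

A rule this simple will also blame unrelated commits (a docs commit followed by a fix, for example), so the count is best read as an upper bound.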

Live site latency

avg time to first byte across live sites

1876ms

7/100

Live site response times (slowest first)

  • https://www.muse.shopping · 200 · 4164ms
  • https://kindle.schlacter.me · 200 · 1132ms
  • https://schlacter.me · 200 · 333ms
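For anyone wanting to repeat the measurement without curl, here is a rough standard-library stand-in. It is a sketch, not the method used for this page: `time_to_first_byte_ms` is a hypothetical helper, and its timing includes DNS, TCP and TLS setup.

```python
import time
import urllib.request
from statistics import mean


def time_to_first_byte_ms(url: str, timeout: float = 10.0) -> float:
    """Rough TTFB: time from issuing the request until the first body byte arrives.

    The measurement includes DNS lookup, TCP connect and TLS handshake.
    """
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read(1)
    return (time.perf_counter() - start) * 1000


# Averaging the three readings listed above reproduces the headline number.
readings_ms = [4164, 1132, 333]
print(round(mean(readings_ms)))
```

The average comes out at 1876ms, so the headline figure is dominated by the single slowest site.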

Mean time to fix

median hours a bug lived before it was patched

0.5h

100/100

Slowest fixes (longest delays)

  • 46.6h edf26f5 fix(auth): read backend envelope correctly in frontend API c…
    muse-shopping · Apr 16, 2026
  • 12.4h 3a83f67 fix(auth): handle OAuth-only accounts + set trust proxy for…
    muse-shopping · Apr 14, 2026
  • 8.8h f0eeb07 Fix product page crash, story navigation, hero CTAs, and dyn…
    muse-shopping · Mar 26, 2026
  • 5.3h d01750d fix(auth): mount errorHandler middleware in Vercel serverles…
    muse-shopping · Apr 15, 2026
  • 0.5h 68d29d4 fix(auth): point Google redirect at www.muse.shopping; remov…
    muse-shopping · Apr 17, 2026

Scheduled task health

scheduled tasks firing on time (stale = >2× expected period)

8 stale

27/100

Stale scheduled tasks

  • job-tracker-morning-digest 12.4d stale (8.9× expected)
    last ran 2026-05-10 · expected every 1.4d
  • git-sync-fixer 12.4d stale (12.4× expected)
    last ran 2026-05-10 · expected every 1d
  • claude-code-stats-sync 12.2d stale (12.2× expected)
    last ran 2026-05-10 · expected every 1d
  • managed-agents-pulse 12.1d stale (145.6× expected)
    last ran 2026-05-10 · expected every 0.1d
  • code-builder-sync 11.9d stale (11.9× expected)
    last ran 2026-05-11 · expected every 1d
  • calmar-bug-fixer 2.2d stale (4.5× expected)
    last ran 2026-05-20 · expected every 0.5d
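The staleness rule stated above (stale = more than 2× the expected period) is simple enough to sketch. The function name and the clock times in the example are assumptions; only the dates and the expected period come from the list.

```python
from datetime import datetime, timedelta


def staleness(last_ran: datetime, expected_period: timedelta, now: datetime):
    """Return (days since last run, multiple of expected period, is_stale)."""
    age = now - last_ran
    multiple = age / expected_period  # timedelta / timedelta -> float
    return round(age / timedelta(days=1), 1), round(multiple, 1), multiple > 2


# git-sync-fixer from the list above; the clock times are assumed for illustration.
print(staleness(
    last_ran=datetime(2026, 5, 10, 2, 24),
    expected_period=timedelta(days=1),
    now=datetime(2026, 5, 22, 12, 0),
))
```

With those assumed times the result is 12.4 days and 12.4× expected, in line with the entry above.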

🤡 Do I know what I'm doing?

the panic index

37

Average of 3 metrics below. Weakest: Longest debug spiral (0/100).

Fix-to-feature ratio

share of commits whose message starts with 'fix'

71%

11/100

Sample fix commits

  • 718f37a fix(auth): survive missing privacy_consent column on registe…
    muse-shopping · Apr 17, 2026
  • 68d29d4 fix(auth): point Google redirect at www.muse.shopping; remov…
    muse-shopping · Apr 17, 2026
  • 3f11f7e fix(auth): trim Google OAuth env vars at read-site
    muse-shopping · Apr 16, 2026
  • edf26f5 fix(auth): read backend envelope correctly in frontend API c…
    muse-shopping · Apr 16, 2026
  • d01750d fix(auth): mount errorHandler middleware in Vercel serverles…
    muse-shopping · Apr 15, 2026
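Counting fix commits comes down to a prefix check on commit subjects. A minimal sketch, assuming the match is case-insensitive (so "Fix broken CTAs…" would count too); `fix_share` is a hypothetical name:

```python
def fix_share(subjects: list[str]) -> float:
    """Share of commit subjects that start with 'fix' (case-insensitive, assumed)."""
    if not subjects:
        return 0.0
    fixes = sum(1 for s in subjects if s.lstrip().lower().startswith("fix"))
    return fixes / len(subjects)


# Subjects taken from commits listed on this page.
subjects = [
    "fix(auth): trim Google OAuth env vars at read-site",
    "fix: correct BrandLogo props on brands slug page",
    "Add auto build-log workflow",
    "Add Claude session status handoff doc",
]
print(f"{fix_share(subjects):.0%}")
```

On a real repo the subjects could come from `git log --format=%s`. A bare prefix check would also count a subject such as "fixture cleanup", so a stricter pattern may be worth it.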

Revert / oops count

reverts and 'oops' commits

0

100/100

Longest debug spiral

longest single debug spiral, capped at 6h

6.0h

0/100

Longest single sessions, capped at 6h

  • 6h in Interior-Design · session b97356b6 · 2026-05-10
  • 6h in Interior-Design · session f671d214 · 2026-04-30
  • 6h in Interior-Design · session e1e63e70 · 2026-04-27
  • 6h in Interior-Design · session ffc1627f · 2026-04-23
  • 6h in Interior-Design · session b0afc1fd · 2026-05-04

🧹 Did I leave a mess?

the tech debt tax

52

Average of 3 metrics below. Weakest: Test coverage (25/100).

Test coverage

repos with any test file at all

1/4

25/100

Repos with zero tests

  • claude-code-insights-dashboard
  • managed-agents-pulse
  • twitch-community-research
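"Any test file at all" can be checked with a filename scan. The sketch below assumes a handful of common naming conventions, since the patterns actually used are not listed here; `TEST_GLOBS`, `has_any_test` and `coverage_score` are hypothetical names, and the directories are throwaway stand-ins for repos.

```python
import tempfile
from pathlib import Path

# Assumed naming conventions; the page does not say which patterns it checks.
TEST_GLOBS = ("test_*.py", "*_test.py", "*.test.*", "*.spec.*")


def has_any_test(repo: Path) -> bool:
    """True if the repo contains at least one file matching a test pattern."""
    return any(next(repo.rglob(glob), None) is not None for glob in TEST_GLOBS)


def coverage_score(repos: list[Path]) -> float:
    """Percentage of repos with at least one test file."""
    return 100 * sum(has_any_test(r) for r in repos) / len(repos)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("a", "b", "c", "d"):
        (root / name).mkdir()
    (root / "a" / "test_smoke.py").write_text("def test_ok(): assert True\n")
    print(coverage_score([root / n for n in ("a", "b", "c", "d")]))
```

One repo out of four with a test file gives 25.0, the same shape as the 1/4 result above. A recursive scan like this would also pick up test files inside vendored dependencies, so a real run should exclude those directories.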

TODOs left in code

TODOs, FIXMEs, HACKs across the codebase

40

80/100

Sample TODOs left in code

  • muse-shopping/frontend/app/onboarding/start/page.tsx:76
    // TODO: Send to backend to actually follow these curators
  • muse-shopping/frontend/scripts/auto-resolver.js:93
    // TODO: Integrate with notification system (email, Slack, PagerDuty, etc.)
  • claude-code-insights-dashboard/insight-detector.py:22
    data["suggestions"] = [] # TODO v2: actual pattern detection
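A marker count like this is essentially a grep. A small sketch, assuming one hit per line that contains any of the three markers (whether the real count is per line or per occurrence is not stated; `count_markers` is a hypothetical name):

```python
import re

# Markers named on this page; word boundaries avoid matching e.g. "HACKER".
MARKER = re.compile(r"\b(TODO|FIXME|HACK)\b")


def count_markers(source: str) -> int:
    """Count lines that contain at least one TODO/FIXME/HACK marker."""
    return sum(1 for line in source.splitlines() if MARKER.search(line))


snippet = '''
// TODO: Send to backend to actually follow these curators
const followed = [];
data["suggestions"] = []  # TODO v2: actual pattern detection
'''
print(count_markers(snippet))
```

The two TODO lines in the snippet are counted and the plain code line is not.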

Secret protection

secret-y patterns gitignored before they leaked

3

50/100

Repos missing secret protection

  • claude-code-insights-dashboard · no secret patterns in .gitignore
  • twitch-community-research · no .gitignore
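The two failure modes listed above (no .gitignore, or a .gitignore with no secret patterns) map onto a small check. This is a sketch under assumptions: the pattern list is invented for illustration, matching is exact-line only, and `protects_secrets` is a hypothetical name.

```python
from typing import Optional

# Assumed list of "secret-y" patterns; the page does not enumerate them.
SECRET_PATTERNS = (".env", "*.pem", "*.key", "credentials.json")


def protects_secrets(gitignore_text: Optional[str]) -> bool:
    """True if a .gitignore exists and lists at least one secret pattern."""
    if gitignore_text is None:  # repo has no .gitignore at all
        return False
    entries = {
        line.strip()
        for line in gitignore_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }
    return any(pattern in entries for pattern in SECRET_PATTERNS)


print(protects_secrets("node_modules/\n.env\n"))   # has a secret pattern
print(protects_secrets("node_modules/\ndist/\n"))  # no secret patterns
print(protects_secrets(None))                       # no .gitignore
```

Exact-line matching is deliberately strict: an entry like `.env*` would not count here, so a real check would need to handle glob variants.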

methodology

Every metric is computed from data I can't fudge: my own commit history (hbschlac/*, public repos only), my Claude Code session logs, and direct curl hits to live sites. No API keys, no third parties.

Each metric is scored 0–100 (higher = better). Theme score is the average of its metrics. Vibe Score is the average of theme scores. Window: rolling 90 days. 4 repos scanned. Sparkline shows weekly Vibe Score history.

Refreshed weekly · last run May 22, 2026 · biggest lever: Broken in prod