Vibe Check
am i any good at this?
Anyone can hold down a camera shutter. Doesn't make them a photographer. Anyone can vibe code. Doesn't make them a software engineer. This page is an honest, self-graded report card of how much my vibe coding actually holds up — and what I'm doing about the gaps.
Vibe Score
41 · mid
first measurement · check back next week for trend
How 41 gets calculated
Average of the three theme scores below. Each theme = average of its sub-metrics (3 or 4 per theme). Each metric is scored 0–100 (higher = better) against a clear threshold (e.g. 0% fix commits = 100, 80% fix commits = 0).
(34 + 37 + 52) ÷ 3 = 41
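The roll-up can be reproduced from the sub-scores listed further down this page. This is a minimal sketch, assuming each average is rounded half-up to a whole number before it feeds the next level. The rounding rule is an assumption (the page only shows rounded results), and the names `THEMES`, `round_half_up`, `theme_score`, and `vibe_score` are hypothetical.

```python
import math

# Sub-metric scores as shown on this page, grouped by theme.
THEMES = {
    "Does it actually work?": [0, 7, 100, 27],
    "Do I know what I'm doing?": [11, 100, 0],
    "Did I leave a mess?": [25, 80, 50],
}


def round_half_up(value: float) -> int:
    """Round to the nearest integer, with .5 going up (assumed rule)."""
    return math.floor(value + 0.5)


def theme_score(metrics: list[int]) -> int:
    """A theme score is the rounded average of its sub-metric scores."""
    return round_half_up(sum(metrics) / len(metrics))


def vibe_score(themes: dict[str, list[int]]) -> int:
    """The Vibe Score is the rounded average of the rounded theme scores."""
    scores = [theme_score(m) for m in themes.values()]
    return round_half_up(sum(scores) / len(scores))


if __name__ == "__main__":
    for name, metrics in THEMES.items():
        print(name, theme_score(metrics))
    print("Vibe Score", vibe_score(THEMES))
```

Under that rounding assumption the three themes come out at 34, 37, and 52, and the overall score at 41.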
What I'm working on this month
Stop shipping bugs to prod and patching after the fact
Targeting Broken in prod (0/100, lowest sub-score).
- → Add a pre-merge smoke check to muse-shopping; that's where last month's hotfixes landed.
- → Before merging any auth/payment/data PR, run the actual user flow in preview first.
Refreshed every Sunday. Next score in this metric tells the story.
Improver activity (last 30 days)
- 2026-05-22 · Broken in prod · muse-shopping · skipped
- 2026-05-22 · Broken in prod · muse-shopping · see PR
Autonomous engine — opens draft PRs only, never auto-merges.
Habits I'm trying to break
Extracted weekly from my own session transcripts by vibe-coach. The lessons get written into my global CLAUDE.md so the next session enforces them automatically.
→ Break sessions at the 2-hour mark
Long sessions ≠ productive sessions. Force a commit and re-orient at 2h.
Evidence: 5 of 8 recent sessions hit the 6-hour cap · 1 week in flight
→ Grep before re-implementing
Check if the utility already exists before writing it. Habit lives in CLAUDE.md so the next session enforces it.
Evidence: Almost wrote safe_msg twice in one session · 1 week in flight
→ End research sessions with a written handoff
Any research session ≥1h ends with a one-page note. Otherwise the next session restarts the search from zero.
Evidence: One 6-hour session produced zero artifacts · 1 week in flight
→ Audit scheduled-task sessions that hit the 6h cap with no work
Cron tasks that run to the cap with zero Edits/Writes are likely stuck, not productive. Compare each task's lastRunAt completion time against the cap timeout.
Evidence: 4 cron sessions (reddit-pulse-health-check, claude-reddit-pulse, calmar-bug-fixer, resume trigger) hit 6h cap with 0 Edits/Writes · 1 week in flight
Going well, keep doing it
- ✓ Dry-runs caught real bugs before prod (awk truncation in vibe-improver)
- ✓ Zero hotfix sequences in the last 7 days (vs 69 in the 30-day baseline)
9 sessions analyzed · refreshed Saturdays
How well is Claude executing my vibe-coding?
Three signals: how often I have to course-correct, how well Claude follows my custom rules, and who's driving the habit fixes.
Override rate
How often I have to course-correct Claude per user message. Lower = Claude predicting my intent better.
⚠ Detects course-correction language ("no", "actually", "wait", "that's wrong"). Noisy in weeks where I'm refining requirements vs catching mistakes — won't fully separate productive corrections from Claude failures.
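To make the caveat concrete, here is a rough sketch of a phrase-based detector. The phrase list comes from the examples quoted above. The word-boundary matching, the per-message counting, and the names `CORRECTION_PATTERN`, `is_course_correction`, and `override_rate` are assumptions, not the actual implementation behind this metric.

```python
import re

# Assumed phrase list, taken from the examples quoted in the warning above.
CORRECTION_PATTERN = re.compile(
    r"\b(no|actually|wait|that's wrong)\b", re.IGNORECASE
)


def is_course_correction(message: str) -> bool:
    """True if the message contains any course-correction phrase."""
    return CORRECTION_PATTERN.search(message) is not None


def override_rate(user_messages: list[str]) -> float:
    """Fraction of user messages flagged as course corrections."""
    if not user_messages:
        return 0.0
    hits = sum(is_course_correction(m) for m in user_messages)
    return hits / len(user_messages)
```

A matcher like this flags "No worries, that looks right" just as readily as "No, use the other file", which is exactly the kind of noise the warning describes.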
CLAUDE.md compliance
Sampled session: 3270008a. Each rule scored against actual session behavior.
- 90 · Scope discipline (boring vs ambitious framing before >2 moving parts): Framed boring vs ambitious 4 times this session (privacy fixes, vibe-improver, vibe-coach, claude-quality). One miss on page UI changes.
- 60 · Constraint check before coding: Verified gh auth + local clones before vibe-improver run. Skipped pre-checks on aggregator + page changes.
- 100 · Deploy verification (confirmed live URL after every push): Every push followed by curl + grep verification in background. Zero "declared done before verified" instances.
- 80 · Don't punt work back to Hannah: Mostly self-sufficient. One legitimate UI handoff ("hit Run now in the sidebar") that I couldn't drive myself.
Who's driving the habit fixes?
Each habit in flight gets tagged with who first surfaced it. claude_proactive = Claude flagged the issue in the moment. hannah_corrected = Hannah noticed and pushed back. tool_caught = an automated check (linter, test, vibe-improver) caught it before review.
🚨 Does it actually work?
the prod reality check
34
Average of 4 metrics below. Weakest: Broken in prod (0/100).
Broken in prod
hotfixes within 24h of the commit they broke
9
0/100
Hotfixes and the commits they patched
- 4318cc4 "Fix broken CTAs, deploy Next.js frontend, clean up prod for…" patched 19bfeb2 "Deploy 250+ brand multi-retailer inventory system" after 0h · muse-shopping · Mar 25, 2026
- f0eeb07 "Fix product page crash, story navigation, hero CTAs, and dyn…" patched 4318cc4 "Fix broken CTAs, deploy Next.js frontend, clean up prod for…" after 8.8h · muse-shopping · Mar 26, 2026
- da27d75 "fix: correct BrandLogo props on brands slug page" patched 7bae25a "Add auto build-log workflow" after 0.4h · muse-shopping · Mar 30, 2026
- 3a83f67 "fix(auth): handle OAuth-only accounts + set trust proxy for…" patched f0ebcb5 "Add Claude session status handoff doc" after 12.4h · muse-shopping · Apr 14, 2026
- 0c92c13 "fix(auth): set trust proxy in Vercel serverless entry too" patched 3a83f67 "fix(auth): handle OAuth-only accounts + set trust proxy for…" after 0.3h · muse-shopping · Apr 14, 2026
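One way to pair a hotfix with the commit it patched is sketched below. It assumes a fix "patches" whichever commit came immediately before it, and that a fix is any commit whose subject starts with "fix" in any letter case. Both rules are guesses that happen to fit the pairs listed above. The `Commit` type and `find_hotfixes` function are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Commit:
    sha: str
    subject: str
    committed_at: datetime


def find_hotfixes(commits: list[Commit], window: timedelta = timedelta(hours=24)):
    """Return (fix, patched, hours_between) for each fix commit that lands
    within `window` of the commit immediately before it.

    Assumption: a fix "patches" the commit directly preceding it in time.
    """
    ordered = sorted(commits, key=lambda c: c.committed_at)
    hotfixes = []
    for previous, current in zip(ordered, ordered[1:]):
        if not current.subject.lower().startswith("fix"):
            continue  # only fix commits can count as hotfixes
        gap = current.committed_at - previous.committed_at
        if gap <= window:
            hours = round(gap.total_seconds() / 3600, 1)
            hotfixes.append((current, previous, hours))
    return hotfixes
```

Feeding it a repo's commit history in any order works, since the function sorts by timestamp before pairing.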
Live site latency
avg time to first byte across live sites
1876ms
7/100
Live site response times (slowest first)
- https://www.muse.shopping · 200 · 4164ms
- https://kindle.schlacter.me · 200 · 1132ms
- https://schlacter.me · 200 · 333ms
Mean time to fix
median hours bugs lived before patched
0.5h
100/100
Slowest-detected fixes (highest delays)
- 46.6h · edf26f5 fix(auth): read backend envelope correctly in frontend API c… · muse-shopping · Apr 16, 2026
- 12.4h · 3a83f67 fix(auth): handle OAuth-only accounts + set trust proxy for… · muse-shopping · Apr 14, 2026
- 8.8h · f0eeb07 Fix product page crash, story navigation, hero CTAs, and dyn… · muse-shopping · Mar 26, 2026
- 5.3h · d01750d fix(auth): mount errorHandler middleware in Vercel serverles… · muse-shopping · Apr 15, 2026
- 0.5h · 68d29d4 fix(auth): point Google redirect at www.muse.shopping; remov… · muse-shopping · Apr 17, 2026
Scheduled task health
scheduled tasks firing on time (stale = >2× expected period)
8 stale
27/100
Stale scheduled tasks
- job-tracker-morning-digest · 12.4d stale (8.9× expected) · last ran 2026-05-10 · expected every 1.4d
- git-sync-fixer · 12.4d stale (12.4× expected) · last ran 2026-05-10 · expected every 1d
- claude-code-stats-sync · 12.2d stale (12.2× expected) · last ran 2026-05-10 · expected every 1d
- managed-agents-pulse · 12.1d stale (145.6× expected) · last ran 2026-05-10 · expected every 0.1d
- code-builder-sync · 11.9d stale (11.9× expected) · last ran 2026-05-11 · expected every 1d
- calmar-bug-fixer · 2.2d stale (4.5× expected) · last ran 2026-05-20 · expected every 0.5d
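The staleness rule (more than 2× the expected period since the last run) is simple enough to sketch. The function names are hypothetical; only the 2× threshold comes from this page.

```python
from datetime import datetime, timedelta


def staleness_ratio(last_ran: datetime, expected_period: timedelta, now: datetime) -> float:
    """How many expected periods have elapsed since the task last ran."""
    return (now - last_ran) / expected_period


def is_stale(last_ran: datetime, expected_period: timedelta, now: datetime) -> bool:
    """Stale means more than 2x the expected period has passed."""
    return staleness_ratio(last_ran, expected_period, now) > 2.0
```

The ratios in the list above are presumably computed from unrounded periods, so dividing the displayed figures by hand can land slightly off.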
🤡 Do I know what I'm doing?
the panic index
37
Average of 3 metrics below. Weakest: Longest debug spiral (0/100).
Fix-to-feature ratio
of commits start with 'fix'
71%
11/100
Sample fix commits
- 718f37a fix(auth): survive missing privacy_consent column on registe… · muse-shopping · Apr 17, 2026
- 68d29d4 fix(auth): point Google redirect at www.muse.shopping; remov… · muse-shopping · Apr 17, 2026
- 3f11f7e fix(auth): trim Google OAuth env vars at read-site · muse-shopping · Apr 16, 2026
- edf26f5 fix(auth): read backend envelope correctly in frontend API c… · muse-shopping · Apr 16, 2026
- d01750d fix(auth): mount errorHandler middleware in Vercel serverles… · muse-shopping · Apr 15, 2026
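The two stated endpoints for this metric (0% fix commits = 100, 80% fix commits = 0) leave the shape in between unspecified. The sketch below assumes a straight line between them, which does reproduce the 11/100 shown for 71%. The linear shape, the case-insensitive prefix match, and the function names are all assumptions.

```python
def fix_ratio(subjects: list[str]) -> float:
    """Share of commit subjects that start with 'fix' (case-insensitive)."""
    if not subjects:
        return 0.0
    fixes = sum(s.lower().startswith("fix") for s in subjects)
    return fixes / len(subjects)


def threshold_score(value: float, worst: float) -> float:
    """Map 0 -> 100 and `worst` -> 0 linearly, clamped to the 0-100 range."""
    return max(0.0, min(100.0, 100.0 * (1.0 - value / worst)))


if __name__ == "__main__":
    print(round(threshold_score(0.71, worst=0.80)))  # 11
```

Anything at or beyond the `worst` threshold clamps to 0, so an 85% fix ratio scores the same as 80%.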
Revert / oops count
reverts and 'oops' commits
0
100/100
Longest debug spiral
longest single debug spiral, capped 6h
6.0h
0/100
Longest single sessions, capped at 6h
- 6h in Interior-Design · session b97356b6 · 2026-05-10
- 6h in Interior-Design · session f671d214 · 2026-04-30
- 6h in Interior-Design · session e1e63e70 · 2026-04-27
- 6h in Interior-Design · session ffc1627f · 2026-04-23
- 6h in Interior-Design · session b0afc1fd · 2026-05-04
🧹 Did I leave a mess?
the tech debt tax
52
Average of 3 metrics below. Weakest: Test coverage (25/100).
Test coverage
repos with any test file at all
1/4
25/100
Repos with zero tests
- claude-code-insights-dashboard
- managed-agents-pulse
- twitch-community-research
TODOs left in code
TODOs, FIXMEs, HACKs across the codebase
40
80/100
Sample TODOs left in code
- muse-shopping/frontend/app/onboarding/start/page.tsx:76 · // TODO: Send to backend to actually follow these curators
- muse-shopping/frontend/scripts/auto-resolver.js:93 · // TODO: Integrate with notification system (email, Slack, PagerDuty, etc.)
- claude-code-insights-dashboard/insight-detector.py:22 · data["suggestions"] = [] # TODO v2: actual pattern detection
Secret protection
secret-y patterns gitignored before they leaked
3
50/100
Repos missing secret protection
- claude-code-insights-dashboard — no secret patterns in .gitignore
- twitch-community-research — no .gitignore
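A check along these lines could flag the two repos above. This is a sketch only: the pattern list is an assumed example (the real list isn't shown on this page), the per-repo percentage is a guess at how 50/100 arises from 2 of 4 repos missing protection, and `protects_secrets` and `protection_score` are hypothetical names.

```python
from typing import Optional

# Assumed examples of "secret-y" patterns; the real list is not shown here.
SECRET_PATTERNS = (".env", "*.pem", "*.key", "credentials", "secrets")


def protects_secrets(gitignore_text: Optional[str]) -> bool:
    """True if any non-comment .gitignore line mentions a secret-y pattern.

    Pass None for a repo that has no .gitignore at all.
    """
    if gitignore_text is None:
        return False
    for raw_line in gitignore_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if any(pattern.strip("*") in line for pattern in SECRET_PATTERNS):
            return True
    return False


def protection_score(gitignores: dict[str, Optional[str]]) -> float:
    """Percentage of repos whose .gitignore covers a secret-y pattern."""
    if not gitignores:
        return 0.0
    protected = sum(protects_secrets(text) for text in gitignores.values())
    return 100.0 * protected / len(gitignores)
```

Substring matching is deliberately crude, so a stricter version would parse the .gitignore glob rules properly.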
methodology
Every metric is computed from data I can't fudge: my own commit history (hbschlac/*, public repos only), my Claude Code session logs, and direct curl hits to live sites. No API keys, no third parties.
Each metric is scored 0–100 (higher = better). Theme score is the average of its metrics. Vibe Score is the average of theme scores. Window: rolling 90 days. 4 repos scanned. Sparkline shows weekly Vibe Score history.
Refreshed weekly · last run May 22, 2026 · biggest lever: Broken in prod