Two competing Claude Code skill frameworks. Both ballooned on GitHub (superpowers ~161K stars, gstack ~78K). Different target users, different failure modes.

At a glance

Dimension | obra/superpowers | garrytan/gstack
Author | Jesse Vincent (Prime Radiant) | Garry Tan (YC President & CEO)
Target user | Engineers writing production code | Founders/CEOs shipping products solo
Core metaphor | TDD methodology + subagent-driven dev | Virtual eng team (23 role-based specialists)
Install | Plugin marketplace (one command) | git clone + ./setup + optional team mode
Key bet | Rigor: spec → plan → RED-GREEN-REFACTOR → review | Process: Think → Plan → Build → Review → Test → Ship → Reflect
Distinctive skill | subagent-driven-development with 2-stage review | /cso (OWASP+STRIDE), /qa (real browser), /design-html (Pretext), /pair-agent
Voice | Methodology/philosophy | Founder bravado, LOC-per-day flex, Karpathy framing

Where each wins

superpowers wins on discipline: TDD enforcement, worktree isolation, pre-review checklists, evidence-over-claims. If the code is going to production and a bug costs real money, this framework is built to prevent sloppy work.

gstack wins on breadth. It ships a real-browser QA loop (/qa with Chromium clicks + screenshots), a security audit skill (/cso: OWASP Top 10 + STRIDE threat model with an 8/10 confidence gate), mockup-to-HTML (/design-html with Pretext for reflowing text), and cross-agent coordination (/pair-agent via ngrok + scoped tokens). Its surface area is wider than superpowers': design, DX, security, QA, release, monitoring, retro.

Failure modes

superpowers: already evaluated and skipped on 2026-04-08. It loads 22K tokens at startup (~11% of context) and forces the full brainstorm-TDD cycle on every task regardless of size. Heavy for vault automation, agent pipelines, or content work where TDD is overkill. Fine for building production SaaS.
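The 22K / ~11% figure is consistent with a 200K-token context window. That window size is an assumption inferred from the two numbers, not something the evaluation states. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
def context_share(startup_tokens: int, context_window: int) -> float:
    """Fraction of the context window consumed before any work starts."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    return startup_tokens / context_window


# 22K startup tokens against an ASSUMED 200K-token window.
share = context_share(22_000, 200_000)
print(f"{share:.0%}")  # prints "11%"
```

The same helper can be reused for any other framework once its startup token count has been measured.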

gstack: likely heavier still (23 skills + 8 power tools + standalone CLIs), but no token-cost figure is published, so this would need to be measured. The opinionated framing (“I ship 810× my 2013 pace”) suggests the author optimizes for personal heroics, not team legibility. The WIP: continuous-checkpoint mode and filter-squash on ship are clever but add surface area.

Fit for a mixed, non-SaaS workflow

For work that is mostly vault automation, agent pipelines, content, and strategy — not “write React components with tests” — neither framework is a clean wholesale fit:

  • superpowers: Skip wholesale. TDD rigor doesn’t match most non-code work. /implement-spec already covers the spec→build loop.
  • gstack: Don’t install wholesale either — same context-bloat problem, scaled up. But three primitives are genuinely valuable for any code you ship to others:
    1. /cso — security audit for shipped code. Higher signal than the anthropics/security-guidance plugin.
    2. /qa — real-browser testing is the right abstraction for operator-facing UI. Auto-generates regression tests per fix.
    3. /design-html — mockup → production HTML with reflowing text. Useful for landing pages and storefront explorations.

The rest (/office-hours, /plan-ceo-review, /retro) duplicates judgment you apply yourself or have existing skills for.

Recommendation

Cherry-pick from gstack; don’t install the framework. Extract the three skills above as standalone ~/.claude/skills/ entries. Measure token cost before layering. Re-run the superpowers evaluation only if Anthropic ships native context trimming for plugins.
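One rough way to do the "measure token cost" step is sketched below. It rests on three assumptions: each skill is a subdirectory of the skills folder, its prompt text lives in Markdown files, and about 4 characters per token is a good enough estimate. That ratio is a heuristic, not a real tokenizer, and the 200K window is also assumed. All function names are hypothetical. Because it counts every Markdown file in a skill, the result is an upper-bound estimate of the skill's text, not a measurement of what is actually loaded at startup.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4       # ASSUMPTION: crude heuristic, not a real tokenizer
CONTEXT_WINDOW = 200_000  # ASSUMPTION: context window size in tokens


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return len(text) // CHARS_PER_TOKEN


def skill_token_costs(skills_dir: Path) -> dict[str, int]:
    """Map each skill subdirectory name to the estimated tokens of its .md files."""
    costs: dict[str, int] = {}
    for skill in sorted(p for p in skills_dir.iterdir() if p.is_dir()):
        costs[skill.name] = sum(
            estimate_tokens(f.read_text(encoding="utf-8", errors="ignore"))
            for f in skill.rglob("*.md")
        )
    return costs


def report(skills_dir: Path) -> None:
    """Print per-skill estimates and the total share of the assumed window."""
    costs = skill_token_costs(skills_dir)
    for name, tokens in costs.items():
        print(f"{name:<20} ~{tokens:>8,} tokens")
    total = sum(costs.values())
    print(f"{'total':<20} ~{total:>8,} tokens ({total / CONTEXT_WINDOW:.1%} of window)")


# Example usage (the directory must exist):
# report(Path.home() / ".claude" / "skills")
```

Running the report before and after adding each extracted skill shows how much each one adds, which is the comparison needed before deciding whether to layer more.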
