
A senior software engineer using AI as a pair-programming assistant ships about 1.5-4x what they used to. A senior engineer running ten agents in parallel ships ten features in the time the first one ships one. The gap between those two engineers, on the same payroll, is now wider than the gap between a junior and a staff engineer was five years ago.
The engineers pulling 10x have stopped pair programming with AI entirely. They run ten agents at once, each on its own branch, each shipping a complete feature end-to-end. While one is writing tests, another is auditing the database. A third is running a security scan on a feature that merged twenty minutes ago.
The clients I talk to every week have already noticed which engineers are which. The ones bringing Augment Code, Intent, Windsurf, Cursor, Claude Code, Codex, Copilot, OpenCode, Cline, and Continue workflows into their teams are becoming indispensable. The ones still typing one line at a time are quietly being moved off projects. The bar has been raised. This is the workflow that meets it.
The Short Answer
Spec ten or more features as vertical slices, give each its own git worktree, and launch a separate agent per branch. Each agent runs the full quality pipeline (TDD, code review, security, DB optimization) before the PR opens. Nothing waits on anything else, because nothing modifies shared state.
You ship ten PRs in the time it used to take to ship one.
Why Pair Programming With AI Tops Out at ~4x
Pair programming with an AI was the first obvious use case, and it works. The problem is the ceiling. You’re still serial. You read the suggestion, you accept it, you write the next prompt, you wait. The agent is idle 80% of the time. You are idle the other 20%.
Real leverage comes from running many agents at once, not from making one agent faster. The bottleneck moves from “how fast can I type” to “how cleanly can I scope the work.”
If you’ve ever finished a sprint and thought “we shipped five things but it should have been twenty,” that’s not a velocity problem. That’s a parallelism problem. At a $200K loaded engineer cost, every quarter spent stuck at 1.5x throughput instead of 10x is roughly $400K of unrealized output per engineer. Multiply by your team size.
The Workflow at a Glance
| Stage | What Happens | When It Runs |
|---|---|---|
| 1. Spec | Each feature scoped end-to-end as a vertical slice | Before any agent starts |
| 2. Worktrees | One git worktree per feature, one agent per worktree | Setup phase |
| 3. Build | Agents work in parallel, each spawning subagents | Continuous |
| 4. Quality | TDD, checklist, code review, debug, verify | Inside each agent |
| 5. Performance | DB schema map, N+1 detection, perf profile, cache | Early and often |
| 6. Security | Threat model, defense, pentest, fuzz | Before, during, after merge |
| 7. Ship | PR, merge, deploy, verify production | Per branch |
The order matters less than the simultaneity. Most stages run inside each agent at the same time as every other agent.
The Tools That Make This Work: superskills (extending gstack)
The slash commands referenced throughout this post (/specify, /worktrees, /tdd, /gstack-review, /pentest, /gstack-ship, etc.) come from two open-source projects you can install today.
- gstack by Garry Tan: the foundation. A coordinated stack of agentic coding skills covering planning, code review, browser-based QA, security, and shipping. The /gstack-* commands in the tables below all live here.
- superskills: extends gstack with the parallel-agent workflow described in this post. Adds spec-and-plan commands (/specify, /clarify, /write-plan, /autoplan), worktree orchestration (/worktrees, /repomap-auto-on, /pair-agent), the TDD and verification gates (/tdd, /verify, /finish-branch), and the database/performance layer (/dbmap, /db-optimize, /perf-profile, /cache-strategy).
Full command reference: superskills/COMMANDS.md.
If you want to copy this workflow exactly, install both repos. gstack gives you the per-agent quality pipeline. superskills gives you the parallel orchestration on top.
The canonical, always-up-to-date version of the workflow lives at superskills/DEVELOPER_WORKFLOW.md. Bookmark it. The post you’re reading is a written walkthrough of that doc.
What is a Vertical Slice?
A vertical slice is one branch that contains every layer a feature needs to work: UI, API, business logic, database, tests. It’s independently deployable.
The opposite is horizontal slicing, where one branch does all the backend, another does all the frontend, a third writes the tests. Agents block each other. Nothing works end-to-end until everything merges. Integration risk is deferred to the worst possible moment.
HORIZONTAL SLICES                         VERTICAL SLICES
(by layer, agents block each other)       (by feature, agents are independent)

Feature A  Feature B  Feature C           Feature A    Feature B    Feature C
    │          │          │               ┌─────────┐  ┌─────────┐  ┌─────────┐
────┼──────────┼──────────┼──── UI        │   UI    │  │   UI    │  │   UI    │
    │          │          │               │   API   │  │   API   │  │   API   │
────┼──────────┼──────────┼──── API       │  Logic  │  │  Logic  │  │  Logic  │
    │          │          │               │   DB    │  │   DB    │  │   DB    │
────┼──────────┼──────────┼──── DB        │  Tests  │  │  Tests  │  │  Tests  │
    │          │          │               └─────────┘  └─────────┘  └─────────┘
────┼──────────┼──────────┼──── Tests      branch-a     branch-b     branch-c
    ↓          ↓          ↓                (ships)      (ships)      (ships)
  waits      waits      waits           independently independently independently
Independently Deployable AND Safely Reversible
A real vertical slice has two non-negotiable properties beyond just “all layers together”:
1. It owns its own data. Each slice gets its own new DB tables or columns. It never restructures existing ones. This means the migration can be applied and rolled back cleanly. Other features keep working whether the slice is present or not.
2. It can be toggled off without breaking production. Because it has its own tables and its UI entry points are new (a new route, a new button, a new API endpoint), removing the slice doesn’t break existing code. You can deploy it dark, test it, then expose it. Or roll it back entirely by reverting the branch.
WRONG, not a real vertical slice:
feature/user-invites alters the existing `users` table
→ rolling back breaks the users feature
→ other branches that touched `users` now conflict
RIGHT, a real vertical slice:
feature/user-invites creates a new `invites` table
→ rolling back is safe, nothing else references it
→ the feature can be deployed dark and enabled later
This is what makes parallel agents safe at scale. Ten agents can each add new tables and new endpoints simultaneously. None of them can break each other because they never modify shared state. They only add to it.
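One way to make the additive-only rule enforceable rather than aspirational is a small lint step over each slice's migrations. A minimal Python sketch, where the allowlist of existing tables and the statement patterns are assumptions, not part of gstack or superskills:

```python
import re

# Tables that already exist in production; a new slice may not restructure them.
# (Hypothetical allowlist -- in practice, generate this from your schema map.)
EXISTING_TABLES = {"users", "orders"}

# Statements that modify shared state and therefore break slice isolation.
FORBIDDEN = re.compile(r"\b(ALTER|DROP)\s+TABLE\s+[`\"']?(\w+)", re.IGNORECASE)

def migration_violations(sql: str) -> list[str]:
    """Return the existing tables this migration tries to restructure."""
    return [
        table
        for _, table in FORBIDDEN.findall(sql)
        if table.lower() in EXISTING_TABLES
    ]

# A slice that only CREATEs its own table passes; one that ALTERs `users` fails.
good = "CREATE TABLE invites (id INTEGER PRIMARY KEY, email TEXT);"
bad = "ALTER TABLE users ADD COLUMN invited_by INTEGER;"

print(migration_violations(good))  # []
print(migration_violations(bad))   # ['users']
```

A check like this can run inside the verification gate, so an agent can't finish a branch that restructures shared tables.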
The Rule of Thumb
If an agent can build, run, and test the feature without touching any other branch, it’s a valid vertical slice. If it needs to wait for another agent to finish a shared layer first, it’s a horizontal slice. Redesign the scope.
Why “Same Tables, Different Branches” Is the Trap
Most teams that try parallel agents fail here. They scope ten features that all touch the users table or the same orders service. Day three, the merges start fighting. Day five, the team gives up and decides “AI just doesn’t work for our codebase.”
The codebase isn’t the problem. The slicing was. When two agents both need to modify users, you don’t have two parallel features. You have one feature with two parts, dressed up as two branches. Re-scope so each agent owns net-new data, and the conflict disappears.
What Goes in a Vertical Slice (by Framework)
The exact files vary by stack, but the principle is the same: every file the feature needs lives on one branch.
Next.js (App Router):
feature/user-invites
├── app/invites/page.tsx ← UI (server component)
├── app/invites/InviteForm.tsx ← UI (client component)
├── app/invites/actions.ts ← Server action
├── lib/invites.ts ← Business logic
├── db/migrations/0012_invites.sql
└── tests/invites.unit.test.ts + invites.e2e.test.ts
Rails (MVC):
feature/user-invites
├── app/controllers/invites_controller.rb
├── app/models/invite.rb
├── app/views/invites/{index,new}.html.erb
├── db/migrate/20240101_create_invites.rb
└── spec/{models,controllers,system}/invites_spec.rb
FastAPI + React:
feature/user-invites
├── backend/routers/invites.py
├── backend/services/invite_service.py
├── backend/models/invite.py
├── backend/alembic/versions/0012_invites.py
├── frontend/src/pages/Invites.tsx
├── frontend/src/components/InviteForm.tsx
└── tests/test_invites_api.py + invites.spec.ts
iOS (SwiftUI):
feature/user-invites
├── Views/InviteListView.swift + InviteFormView.swift
├── ViewModels/InviteViewModel.swift
├── Models/Invite.swift
├── Services/InviteService.swift
└── Tests/InviteViewModelTests.swift + InviteUITests.swift
Django:
feature/user-invites
├── invites/
│ ├── apps.py ← App registration
│ ├── models.py ← Invite model
│ ├── views.py ← Class-based or function views
│ ├── urls.py ← Route registration
│ ├── forms.py ← Form validation
│ ├── admin.py ← Django admin config
│ ├── templates/invites/
│ │ ├── list.html
│ │ └── new.html
│ └── migrations/
│ └── 0001_initial.py ← New table only
└── invites/tests/
├── test_models.py
├── test_views.py
└── test_e2e.py ← Playwright or Selenium
Django apps are the natural slice boundary. One feature equals one app.
NestJS (Node + TypeScript):
feature/user-invites
├── src/invites/
│ ├── invites.module.ts ← Module wiring
│ ├── invites.controller.ts ← HTTP routes
│ ├── invites.service.ts ← Business logic
│ ├── invites.repository.ts ← Data access (TypeORM/Prisma)
│ ├── dto/
│ │ ├── create-invite.dto.ts
│ │ └── invite-response.dto.ts
│ └── entities/
│ └── invite.entity.ts
├── src/migrations/
│ └── 1700000000000-CreateInvites.ts
├── frontend/src/features/invites/
│ ├── InvitesPage.tsx
│ └── InviteForm.tsx
└── test/
├── invites.service.spec.ts
└── invites.e2e-spec.ts
Spring Boot (Java):
feature/user-invites
├── src/main/java/com/app/invites/
│ ├── InviteController.java ← REST endpoints
│ ├── InviteService.java ← Business logic
│ ├── InviteRepository.java ← JPA repository
│ ├── Invite.java ← Entity
│ └── dto/
│ ├── CreateInviteRequest.java
│ └── InviteResponse.java
├── src/main/resources/db/migration/
│ └── V12__create_invites.sql ← Flyway migration
└── src/test/java/com/app/invites/
├── InviteServiceTest.java
├── InviteControllerTest.java
└── InviteIntegrationTest.java
Package-by-feature, not package-by-layer. The opposite of the default Spring tutorial.
Laravel (PHP):
feature/user-invites
├── app/Http/Controllers/InviteController.php
├── app/Models/Invite.php
├── app/Services/InviteService.php
├── app/Http/Requests/StoreInviteRequest.php
├── resources/views/invites/
│ ├── index.blade.php
│ └── create.blade.php
├── database/migrations/
│ └── 2026_04_27_000000_create_invites_table.php
├── routes/invites.php ← Loaded into web.php
└── tests/
├── Unit/InviteServiceTest.php
└── Feature/InviteControllerTest.php
Go (chi or echo, Standard Project Layout):
feature/user-invites
├── internal/invites/
│ ├── handler.go ← HTTP handlers
│ ├── service.go ← Business logic
│ ├── repository.go ← DB queries (sqlc or pgx)
│ ├── model.go ← Invite struct
│ ├── routes.go ← Route registration
│ └── invites_test.go
├── migrations/
│ └── 0012_create_invites.up.sql + .down.sql
└── e2e/
└── invites_test.go ← Integration tests with testcontainers
Each internal/<feature> package is the slice boundary.
Flutter (Feature-First):
feature/user-invites
├── lib/features/invites/
│ ├── presentation/
│ │ ├── invite_list_screen.dart
│ │ └── invite_form_screen.dart
│ ├── application/
│ │ └── invite_controller.dart ← Riverpod / Bloc
│ ├── domain/
│ │ └── invite.dart ← Entity
│ └── data/
│ ├── invite_repository.dart
│ └── invite_api.dart
└── test/features/invites/
├── invite_controller_test.dart
└── invite_widget_test.dart
The pattern is identical across stacks. Branch boundary equals feature boundary. Whatever your framework calls a “module,” “package,” “app,” or “feature folder” is the unit of a slice.
Stage 1: Spec Every Feature End-to-End Before Any Agent Starts
Scope the work cleanly before launching anything. The agents are good. They’re not telepathic.
| Command | Role |
|---|---|
/specify | Convert a natural language description into a structured feature spec |
/clarify | Identify gaps and ambiguities before planning |
/write-plan | Generate a detailed implementation plan from the spec |
/analyze | Verify spec, plan, and tasks don’t conflict |
/autoplan | Run automated CEO, design, and engineering review on the plan |
/gstack-plan-eng-review | Architecture, data flow, and test planning review |
/repomap | Generate a structural map of the codebase so agents understand context |
/dbmap | Map the database schema so agents work with accurate data models |
/graphify | Turn any folder into a queryable knowledge graph |
The maps are what most workflows skip. An agent that doesn’t know your schema will invent one. An agent that doesn’t know your repo structure will create files in the wrong place. /repomap and /dbmap are not optional.
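To make the idea concrete, here is a toy version of the structural map an agent consumes, sketched in Python. The real /repomap output is richer (it also summarizes symbols per file); this just shows the shape:

```python
import os
import tempfile

def repo_map(root: str, ignore=(".git", "node_modules")) -> list[str]:
    """Return an indented outline of the repo -- the kind of structural map
    that keeps an agent from creating files in the wrong place.
    (Minimal sketch, not the actual /repomap implementation.)"""
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = sorted(d for d in dirnames if d not in ignore)
        depth = os.path.relpath(dirpath, root).count(os.sep)
        if dirpath != root:
            lines.append("  " * depth + os.path.basename(dirpath) + "/")
        for name in sorted(filenames):
            lines.append("  " * (depth + 1) + name)
    return lines

# Demo on a throwaway tree:
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "app", "invites"))
    open(os.path.join(root, "app", "invites", "page.tsx"), "w").close()
    print("\n".join(repo_map(root)))
```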
Stage 2: Give Each Agent Its Own Git Worktree
The worktree is the unlock that makes everything else possible.
| Command | Role |
|---|---|
/worktrees | Create isolated git worktrees so each agent has its own branch without interfering |
/repomap-auto-on | Keep the codebase map updated automatically as each agent makes changes |
/gstack-pair-agent | Coordinate multiple AI agents sharing browser and context across workspaces |
A worktree is a real working copy on disk. The agent in worktree-A literally cannot see worktree-B’s files. There is no shared state to corrupt. There are no merge conflicts during development, only at merge time, and by then each branch is already independently shippable.
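Under the hood this is plain git. A sketch that generates the equivalent `git worktree add` commands for a batch of features; the `wt-` path convention and branch naming here are assumptions, not necessarily what /worktrees uses:

```python
def worktree_commands(features: list[str], base: str = "main") -> list[str]:
    """Generate the plain-git equivalent of the worktree setup:
    one directory, one branch, one agent per feature.
    (Sketch only -- the real command also wires up agent sessions.)"""
    cmds = []
    for feature in features:
        branch = f"feature/{feature}"   # the vertical-slice branch
        path = f"../wt-{feature}"       # sibling directory, isolated on disk
        cmds.append(f"git worktree add {path} -b {branch} {base}")
    return cmds

for cmd in worktree_commands(["user-invites", "audit-log", "referrals"]):
    print(cmd)
# git worktree add ../wt-user-invites -b feature/user-invites main
# git worktree add ../wt-audit-log -b feature/audit-log main
# git worktree add ../wt-referrals -b feature/referrals main
```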
Stage 3 + 4: Run the Full Quality Pipeline Inside Every Agent
Each agent runs the full pipeline on its own slice. Not after. Inside.
| Command | Role |
|---|---|
/tdd | Enforce Red-Green-Refactor: tests written before code, not after |
/checklist | Generate a custom quality checklist for the specific feature |
/playwright | End-to-end tests with Playwright |
/gstack-qa | Browser-based testing and bug fixing in real Chromium |
/gstack-browse | Direct Chromium control for manual-style automated QA |
/gstack-review | Staff engineer-level code review focused on production readiness |
/gstack-investigate | Root cause analysis with hypothesis testing when something breaks |
/debug | Systematic 4-phase debugging before proposing any fix |
/verify | Require passing verification commands before any agent can finish |
/finish-branch | Guide branch cleanup and merge decisions |
The reader who has shipped AI-generated code already knows the failure mode: it looks right, it compiles, it passes the prompt’s stated requirements, and it breaks the third user who tries it. /tdd and /verify are the counter. The agent cannot mark itself done until tests it didn’t write are green.
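The verification gate reduces to a simple contract: run the project's verification commands, and refuse to finish unless all of them exit 0. A minimal sketch, where the stand-in commands are placeholders for your real test, lint, and build steps:

```python
import subprocess
import sys

def verify(commands: list[list[str]]) -> bool:
    """The /verify idea in miniature: an agent may only mark its branch done
    when every verification command exits 0. (Sketch, not the real gate.)"""
    for cmd in commands:
        if subprocess.run(cmd).returncode != 0:
            print(f"BLOCKED: {' '.join(cmd)} failed")
            return False
    return True

# Stand-in commands (assumed): one passing check, one failing one.
passing = [sys.executable, "-c", "assert 1 + 1 == 2"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]

print(verify([passing]))           # True  -> branch may finish
print(verify([passing, failing]))  # False -> agent keeps working
```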
Stage 5: Audit Database and Performance Early, Not in Production
DB performance should be audited early and often, not discovered when traffic doubles. A single missing index on a foreign key can turn a 50ms endpoint into a 5-second one the day the user count crosses six figures.
| Command | Role |
|---|---|
/dbmap | Map schema and automatically flag missing indexes on FK columns and common query patterns |
/db-optimize | N+1 detection, EXPLAIN analysis, slow query log review, per-endpoint DB call audit |
/perf-profile | Code execution time, DB call time, bottleneck identification across app and DB layers |
/cache-strategy | Permanent cache-first: read from cache, write on first miss, invalidate only on data change (no TTL) |
Run /dbmap first to map the schema. It will flag missing indexes on foreign keys and common query patterns automatically. Run /db-optimize on any feature that adds or modifies queries. Run /perf-profile before a launch to establish a baseline.
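To see why N+1 patterns keep getting flagged, here is the failure in miniature with sqlite3 (the tables and data are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invites (id INTEGER PRIMARY KEY, user_id INTEGER, email TEXT);
    -- The FK index a schema audit would flag as missing:
    CREATE INDEX idx_invites_user_id ON invites(user_id);
    INSERT INTO users VALUES (1, 'ada'), (2, 'bob'), (3, 'cy');
    INSERT INTO invites VALUES (1, 1, 'x@a.com'), (2, 1, 'y@a.com'), (3, 2, 'z@a.com');
""")

# The N+1 shape: one query for the users, then one more per user.
queries = 0
users = conn.execute("SELECT id, name FROM users").fetchall(); queries += 1
for user_id, _name in users:
    conn.execute("SELECT email FROM invites WHERE user_id = ?", (user_id,)).fetchall()
    queries += 1
print(queries)  # 4 queries for 3 users -- grows linearly with the data

# The fix an optimizer pass would point at: a single JOIN, constant query count.
rows = conn.execute("""
    SELECT u.name, i.email
    FROM users u LEFT JOIN invites i ON i.user_id = u.id
""").fetchall()
print(len(rows))  # 4 rows in one query (cy appears with a NULL email)
```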
The diagnostic question: if your last performance incident was a slow query someone shipped six months ago, that’s not a monitoring failure. That’s a workflow failure.
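The cache row in the table above describes a specific policy: cache-first with invalidation on write, no TTL. A minimal in-memory sketch of that policy; a real deployment would sit this in front of the DB with something like Redis:

```python
class CacheFirst:
    """Cache-first, no TTL: read from cache, populate on first miss,
    invalidate only when the underlying data changes.
    (Illustrative sketch, not the /cache-strategy implementation.)"""

    def __init__(self, load):
        self._load = load       # reads from the source of truth
        self._cache = {}
        self.db_reads = 0       # instrumentation for the demo

    def get(self, key):
        if key not in self._cache:
            self._cache[key] = self._load(key)
            self.db_reads += 1
        return self._cache[key]

    def write(self, key, value, store):
        store(key, value)           # write to the source of truth first
        self._cache.pop(key, None)  # then invalidate -- next read repopulates

db = {"invite:1": "pending"}        # stand-in for the real database
cache = CacheFirst(db.__getitem__)

print(cache.get("invite:1"), cache.db_reads)  # pending 1
print(cache.get("invite:1"), cache.db_reads)  # pending 1  (served from cache)
cache.write("invite:1", "accepted", db.__setitem__)
print(cache.get("invite:1"), cache.db_reads)  # accepted 2 (repopulated once)
```

Because invalidation is tied to writes rather than a clock, the cache never serves stale data and never expires hot data for no reason.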
Stage 6: Run Security at Four Points, Not Once
New code introduced after an initial review can reintroduce vulnerabilities. The right model runs security four times:
- Before writing code: /gstack-cso and /defense surface threat model concerns that shape the design
- During development: security checks catch issues while context is fresh and before bad patterns spread
- In the PR pipeline: /gstack-review re-runs on every diff, so new code is always checked
- After each merge to main: run /pentest and /fuzz again, because merged code from other branches may create new attack surfaces when combined
| Command | Role |
|---|---|
/gstack-cso | OWASP Top 10 and STRIDE threat modeling |
/defense | Enforce secrets management, auth, and encryption standards |
/pentest | Scan source code and network for vulnerabilities |
/fuzz | Web fuzzing to surface unexpected attack surfaces before shipping |
Treating security as a one-time gate at the end is the mistake. Continuous checks are cheap. A post-ship breach is not.
Why Security Alongside Development, Not After
Running security late is a known failure mode. Late-stage findings require expensive rearchitecting: ripping out half-built features, re-doing data models, scrambling to patch before launch. Running security early means the threat model informs the design from day one. Running it again after every merge catches the regression case where new code from another branch wasn’t in scope for the original review.
/gstack-review is designed for exactly this. It runs on every PR diff automatically, so security and correctness checks are always current. The review never has to “catch up” to the codebase, because it never falls behind.
Stage 7: Ship the Branch
| Command | Role |
|---|---|
/gstack-ship | Sync tests, automate CI/CD, and submit the PR |
/gstack-land-and-deploy | Merge, deploy, and verify production |
By the time /gstack-ship runs, everything has already been verified inside the worktree. The PR is paperwork. Merge happens after a final /gstack-review on the diff itself.
What This Changes for Engineering Teams
The teams adopting this workflow share a few patterns. They scope smaller. A “feature” used to be three weeks of work. Now it’s two days, because anything bigger doesn’t fit in a single agent’s context cleanly.
They review more, write less. The bottleneck moves from typing to evaluating. Engineers spend their time deciding what’s worth shipping, not producing it.
They hire differently. The leverage is in the engineer who can architect ten slices in parallel, not the one who can grind through one feature at a time. Junior engineers using this workflow can match mid-level output. Mid-level engineers can match staff output. The ceiling rises.
The teams not adopting it are losing a year of compounding velocity for every quarter they wait.
The Career Stakes for Individual Engineers
The same shift is reshaping who gets hired and who gets renewed. Every client we work with now expects AI tool fluency as a baseline, not a bonus. The engineers who thrive are the ones actively recommending Cursor, Claude Code, Codex, Augment, Windsurf, Copilot, OpenCode, and similar tools to their clients, and then showing them workflows like this one. That’s how an engineer becomes the person the client refuses to lose.
These tools change every week. The gap between engineers who adapt and engineers who don't widens faster than most realize. A senior engineer who hasn't run a parallel-agent workflow in 2026 is competing with a mid-level engineer who has, and losing. AI is already reshaping technical vetting on the hiring side, too.
Common Failure Modes
A few patterns we see repeatedly when teams try to adopt parallel agents:
| Mistake | Consequence | Fix |
|---|---|---|
| Skipping /repomap and /dbmap | Agents invent schemas and put files in the wrong place | Run both before launching any agent |
| Letting agents modify shared state | Slices become horizontal, agents block each other | Enforce: every slice gets its own tables, its own routes |
| Running security only at the end | Late findings require expensive rearchitecting | Use the 4-stage security model above |
| Treating tests as optional | AI-generated code looks right but breaks edge cases | Require /tdd and /verify before merge |
| One agent, many tasks | You’re back to pair programming, capped at ~1.5-4x | One agent per branch, period |
Why This Matters for Distributed Teams
The workflow scales whether the engineer is in San Francisco or Saigon. What it changes is what kind of engineer you actually need.
In 20+ years of building world-class engineering teams, we've seen the same two traits in the engineers who ship at the highest level: they scope work cleanly and they evaluate code rigorously. The parallel-agent workflow rewards both. It punishes the ones who only know how to grind through tickets. The core competencies that separate strong remote engineers from average ones map almost one-to-one onto the skills this workflow demands.
At Hyperion360, we’ve placed over 1,000 engineers across 20+ countries for clients backed by Y Combinator, Kleiner Perkins, SoftBank, and NEA. The way we vet for communication, behavior, and technical skill is the same vetting that separates engineers who can adapt to AI tools from engineers who can’t. The engineers we place integrate as long-term team members on flat monthly pricing, full-time, in your time zone. They don’t need months of ramp-up to start shipping vertical slices, because they were already running parallel-agent workflows on the last team they sat on.
Companies that scale remote engineering teams the right way don’t need ten times the headcount. They need the workflow and the engineers who can run it.
Hire Vetted Remote Software Engineers
Want to hire vetted remote software engineers and technical talent that work in your time zone, speak English, and cost up to 50% less?
Hyperion360 builds world-class engineering teams for Fortune 500 companies and top startups. Contact us about your hiring needs.
Frequently Asked Questions

How many agents can one engineer realistically supervise at once?

What if my codebase doesn't support vertical slicing because everything depends on shared state?

Do I need a specific AI tool to run this workflow?

Won't ten agents working at once produce ten times the cost?

How do junior engineers fit into this workflow?

They lean on /gstack-review and /debug as training tools. The ceiling rises faster for juniors than for seniors because the workflow handles the parts they're weakest at.