
Vibe Coding a SaaS with Claude Code 2026: From Idea to Production

TL;DR

Vibe coding is AI-assisted development taken to its logical conclusion: you describe what you want in plain language and an AI agent like Claude Code writes most of the code. In 2026, GitHub reports that 46% of all code committed on its platform is AI-generated. This tutorial walks through building a complete URL-shortener SaaS — FastAPI backend, PostgreSQL, Redis, JWT auth, Stripe billing, and a minimal frontend — using Claude Code as the primary driver. You will learn the prompting strategies, review habits, testing approaches, and Git discipline that separate successful vibe-coding projects from ones that collapse under their own complexity.


What Is Vibe Coding?

The term was coined by Andrej Karpathy in a post in early 2025:

"There's a new kind of coding I call 'vibe coding', where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. You just see what you want, say what you want, and it mostly works."

Karpathy's framing captured something real: a shift in what the developer's job actually is. You are no longer the person who types every semicolon. You become the architect, the product manager, and the quality-control officer — all at once. The AI is the hands; you are the brain steering those hands.

By mid-2026 this is no longer a novelty. GitHub's annual developer survey found that 46% of code merged into repositories on the platform was generated or substantially rewritten by AI tools. Stack Overflow's survey echoed the trend: the majority of professional developers report using AI coding assistants daily. The question is no longer "should I use AI to code?" but "how do I use it well?"


Vibe Coding vs Traditional Development: What Changes, What Does Not

Understanding the delta is critical before you adopt the workflow.

What changes:

  • Speed of first draft. A FastAPI CRUD service that used to take two hours to scaffold now takes five minutes. The bottleneck shifts from typing to thinking.
  • Context switching cost. You can ask "add rate limiting to every route" in one sentence instead of manually touching a dozen files.
  • Boilerplate burden. Migrations, Dockerfiles, CI YAML, Stripe webhook handlers — all of these are things the AI is very good at because they are well-represented in training data.
  • Initial competence threshold. A developer who knows Python but has never used FastAPI can get a working service running in an afternoon.

What does not change:

  • The need to understand what you are building. AI-generated code that nobody on the team understands is a liability, not an asset.
  • Security fundamentals. The AI will generate SQL queries, JWT handling, and Stripe webhook verification. You still need to know why each of those matters and how to check them.
  • System design judgment. Choosing between a queue-based async architecture and synchronous HTTP calls is still a human decision that requires understanding trade-offs.
  • Debugging hard production issues. When something breaks at 3 AM under load, you will need to read logs, traces, and actual code. If you let the AI write code you never read, this becomes very painful.

The professional developer in 2026 is a force-multiplier operator, not a passenger.


Claude Code Overview

Claude Code is Anthropic's agentic coding CLI. Unlike a simple chat interface pasted into an IDE, Claude Code can:

  • Read and write files directly in your project directory.
  • Execute shell commands — run tests, call APIs, inspect logs.
  • Navigate codebases across many files to understand context before making changes.
  • Use slash commands for structured operations: /init to index a project, /review to audit recent changes, /test to run the test suite and interpret failures.
  • Maintain a working memory across a session so that decisions made early (e.g., "we use UUID primary keys, not integer sequences") persist without re-stating them.

Claude Code runs in your terminal. You invoke it with claude and then have a conversation. It asks for permission before writing files or running commands, so you are never surprised by invisible changes.

The model powering it in 2026 is Claude Sonnet 4.6 by default, with Claude Opus 4 available for tasks requiring deeper reasoning — architectural analysis, security review, complex debugging.


The Vibe Coding Workflow

Four steps, repeated in tight loops:

1. Describe

Write a prompt that explains what you want and why, including constraints. "Add a route" is worse than "Add a POST /links route that validates the destination URL is reachable before saving, returns a 422 if not, and stores the result in the links table defined in models.py."

2. Generate

Claude Code reads the relevant files, writes the implementation, and reports what it changed. This usually takes 15–60 seconds.

3. Review

Read the diff. Run the tests. Call the endpoint manually with curl or a REST client. Check that error cases behave correctly. This is where your expertise matters most.

4. Iterate

If something is wrong, describe the problem precisely. "The 422 message says 'unprocessable entity' but the client needs the specific field name that failed validation — update the error response to include a detail array matching FastAPI's standard validation error format." Then repeat from step 2.

The loop is fast. A complete feature — endpoint, tests, migration, docs — commonly takes 20–40 minutes of human time, most of which is review and curl-testing rather than waiting for generation.


Choosing the Right Architecture for AI-Assisted Development

Some architectural choices make AI-assisted development dramatically smoother. Others create friction.

Prefer explicit over implicit. ORMs with heavy magic (ActiveRecord-style) produce code that is harder for the AI to reason about than explicit SQL or SQLAlchemy Core. The AI writes better code when the data flow is visible.
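
To make the contrast concrete, here is the explicit style in a minimal sketch, using the stdlib sqlite3 module to stand in for SQLAlchemy Core (the table shape mirrors a links table; the function name is hypothetical):

```python
import sqlite3

def active_links_for_owner(conn: sqlite3.Connection, owner_id: str) -> list[tuple]:
    # Explicit, parameterized SQL: the query the database runs is exactly
    # the query you read, which keeps the data flow visible to reviewer and AI alike.
    return conn.execute(
        "SELECT slug, destination_url FROM links "
        "WHERE owner_id = ? AND is_active = 1 "
        "ORDER BY created_at",
        (owner_id,),
    ).fetchall()

# Tiny in-memory fixture to exercise the query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE links (slug TEXT, destination_url TEXT, owner_id TEXT, is_active INTEGER, created_at TEXT)")
conn.execute("INSERT INTO links VALUES ('abc12345', 'https://example.com', 'u1', 1, '2026-01-01')")
conn.execute("INSERT INTO links VALUES ('zzz99999', 'https://example.org', 'u1', 0, '2026-01-02')")
print(active_links_for_owner(conn, "u1"))  # → [('abc12345', 'https://example.com')]
```

Note the placeholder binding: user input never reaches the SQL string itself, which is also the security property you will check for during review.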

One service, one responsibility. Monorepos with many microservices mean the AI needs to understand many codebases at once. For a new SaaS, start with a single FastAPI application and split only when there is a concrete reason.

Flat module structure early on. Deep nesting (app/services/billing/providers/stripe/webhooks/handler.py) creates navigation overhead. Start flat (app/billing.py) and refactor when the file grows past 300 lines.

Write the data model first. The schema is the source of truth for everything else. Tell Claude Code the schema before asking it to write routes, services, or tests. If the schema changes, tell it again explicitly.

Tests are documentation. AI-generated code is only trustworthy when it has tests. Ask Claude Code to write tests alongside every feature. A test failure is a precise description of what is wrong — much more useful than "it doesn't work."


Build a Real SaaS: URL Shortener with Analytics

We will build Snip — a URL shortener with click analytics, user accounts, and a paid tier. The stack:

  • Backend: FastAPI + PostgreSQL + Redis
  • Auth: JWT tokens (no third-party auth service)
  • Analytics: click event table, aggregated with a background task
  • Billing: Stripe Checkout + webhooks
  • Frontend: plain HTML + vanilla JS (served by FastAPI)
  • Deployment: Railway

Step 1: Project Bootstrap

Start a new directory and open Claude Code.

mkdir snip && cd snip
git init
claude

First prompt to Claude Code:

Initialize a FastAPI project called "snip".
Requirements:
- Python 3.12, managed with uv
- FastAPI with uvicorn
- SQLAlchemy 2.x (async) with asyncpg driver
- Alembic for migrations
- Redis via redis-py async client
- python-jose for JWT
- passlib[bcrypt] for password hashing
- stripe for billing
- pytest + httpx for testing
- pyproject.toml as the single config file
- A Makefile with targets: dev, test, migrate, lint

Create the directory structure but do not write application code yet.
Ask me to confirm the structure before proceeding.

Claude Code will propose a layout. Review it. If the structure looks right, say "confirmed" and it proceeds.

Step 2: Data Model

Write the SQLAlchemy models in app/models.py. Tables:

users:
  id: UUID primary key, default uuid4
  email: str, unique, not null
  hashed_password: str, not null
  is_active: bool, default True
  plan: str, default "free"  -- "free" or "pro"
  created_at: datetime, server default now()

links:
  id: UUID primary key
  owner_id: UUID, FK users.id, on delete cascade
  slug: str(16), unique, not null
  destination_url: str(2048), not null
  title: str(255), nullable
  is_active: bool, default True
  created_at: datetime, server default now()

clicks:
  id: UUID primary key
  link_id: UUID, FK links.id, on delete cascade
  clicked_at: datetime, server default now()
  referrer: str(1024), nullable
  user_agent: str(512), nullable
  country_code: str(2), nullable

Also write the Alembic initial migration for these tables.
Use async SQLAlchemy throughout. Add indexes on:
  links.slug (unique), links.owner_id, clicks.link_id, clicks.clicked_at

Read the generated app/models.py carefully. Verify that:

  • Relationships are bidirectional where needed.
  • ondelete="CASCADE" is present on foreign keys.
  • Index definitions match what you specified.
  • The Alembic migration matches the models exactly.

Step 3: Auth Routes

Write app/auth.py with these functions and routes:
- hash_password(plain: str) -> str
- verify_password(plain: str, hashed: str) -> bool
- create_access_token(user_id: UUID) -> str  (30 min expiry)
- get_current_user(token: Annotated[str, Depends(oauth2_scheme)]) -> User

Routes (prefix /auth):
  POST /register — create user, return JWT
  POST /login    — verify credentials, return JWT

Validation:
  - Email must be valid format
  - Password minimum 8 characters
  - Duplicate email returns 409 with message "Email already registered"

Write pytest tests in tests/test_auth.py covering:
  - successful register
  - duplicate email
  - successful login
  - wrong password
  - invalid token on protected route

Use an in-memory SQLite database for tests (not the real Postgres).
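
To review the generated token code confidently, it helps to know what an HS256 JWT actually is. The project uses python-jose; this stdlib-only sketch shows the same mechanics (function names mirror the prompt, but this is an illustration, not the generated code):

```python
import base64
import hashlib
import hmac
import json
import time

def _b64url(data: bytes) -> str:
    # JWTs use unpadded base64url segments.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def create_access_token(user_id: str, secret: str, expires_in: int = 1800) -> str:
    # HS256 JWT: b64url(header) . b64url(payload) . b64url(HMAC-SHA256 of both)
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({"sub": user_id, "exp": int(time.time()) + expires_in}).encode())
    sig = _b64url(hmac.new(secret.encode(), f"{header}.{payload}".encode(), hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

def verify_token(token: str, secret: str) -> dict:
    header, payload, sig = token.split(".")
    expected = _b64url(hmac.new(secret.encode(), f"{header}.{payload}".encode(), hashlib.sha256).digest())
    # Constant-time comparison prevents timing attacks on the signature.
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    claims = json.loads(base64.urlsafe_b64decode(payload + "=" * (-len(payload) % 4)))
    if claims["exp"] < time.time():
        raise ValueError("expired")
    return claims
```

When reviewing the real app/auth.py, check for exactly these two properties: constant-time signature comparison (the library does this for you) and expiry enforcement.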

Step 4: Link Management Routes

Write app/links.py with routes (prefix /links, all require auth):
  POST   /links          — create short link; auto-generate 8-char slug if not provided
  GET    /links          — list current user's links (paginated, page size 20)
  GET    /links/{slug}   — get single link detail + click count
  PATCH  /links/{slug}   — update title or destination_url
  DELETE /links/{slug}   — soft delete (set is_active=False)

Business rules:
  - Free plan: max 10 active links
  - Pro plan: unlimited
  - Return 403 with message "Upgrade to Pro for unlimited links" when free limit reached
  - Slug generation: 8 random URL-safe chars, retry up to 5 times on collision

Also write the redirect route (no auth required):
  GET /{slug}  — redirect to destination_url, record click asynchronously via Redis queue

Write tests in tests/test_links.py.
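
The slug-generation rule in this prompt is compact enough to sketch directly. A plain-Python version with the database uniqueness check stubbed out as a set (function name and the set-based check are assumptions for illustration):

```python
import secrets

def generate_slug(existing: set[str], length: int = 8, max_attempts: int = 5) -> str:
    # token_urlsafe(n) yields roughly 1.3 characters per byte, so generating
    # `length` bytes always gives at least `length` URL-safe chars to slice.
    for _ in range(max_attempts):
        slug = secrets.token_urlsafe(length)[:length]
        if slug not in existing:  # in the real app: a unique-constraint check in Postgres
            return slug
    raise RuntimeError("could not generate a unique slug after retries")
```

In review, confirm the generated code retries on the database's unique-constraint violation rather than doing a check-then-insert, which races under concurrent requests.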

Step 5: Analytics

Write app/analytics.py:
  GET /links/{slug}/analytics — return for the authenticated owner:
    {
      "total_clicks": int,
      "clicks_last_30_days": int,
      "clicks_by_day": [{"date": "YYYY-MM-DD", "count": int}],  -- last 30 days
      "top_referrers": [{"referrer": str, "count": int}]  -- top 5
    }

Write a background worker in app/worker.py that:
  - Reads click events from the Redis queue (BLPOP on key "click_events")
  - Inserts them into the clicks table in batches of up to 100
  - Runs as a separate process started with: python -m app.worker
  - Handles Redis disconnects with exponential backoff

Write tests for the analytics endpoint using pre-seeded click data.

Step 6: Stripe Integration

Write app/billing.py with:
  POST /billing/checkout  — create a Stripe Checkout session for the Pro plan
    - Price ID from env var STRIPE_PRO_PRICE_ID
    - success_url and cancel_url point back to the frontend
    - attach user_id in metadata

  POST /billing/webhook   — handle Stripe events:
    - checkout.session.completed: set user.plan = "pro"
    - customer.subscription.deleted: set user.plan = "free"
    - Verify webhook signature using STRIPE_WEBHOOK_SECRET env var
    - Return 200 immediately, process asynchronously

Security requirements:
  - Webhook endpoint must NOT require JWT auth
  - Stripe signature verification must happen before any database writes
  - Log all incoming webhook event types

The Stripe webhook handler is a security-sensitive piece. After Claude Code writes it, specifically prompt:

Review the webhook handler in app/billing.py for security issues.
Check:
1. Is the raw request body used for signature verification (not parsed JSON)?
2. Is the event type checked before acting on it?
3. Are database writes inside a try/except that returns 200 regardless (to prevent Stripe retries on our bugs)?
4. Is there any way to trigger plan changes without a valid Stripe signature?
Report findings and fix any issues you find.
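
To review point 1 with confidence, it helps to see what the signature check actually does. Stripe's scheme is an HMAC-SHA256 over "{timestamp}.{raw body}", carried in the Stripe-Signature header; here it is reimplemented with the stdlib for illustration only (the real handler should call stripe.Webhook.construct_event, and the 300-second tolerance mirrors Stripe's default):

```python
import hashlib
import hmac
import time

def verify_stripe_signature(payload: bytes, sig_header: str, secret: str, tolerance: int = 300) -> bool:
    # Parse "t=<ts>,v1=<hex>" pairs from the Stripe-Signature header.
    parts = dict(item.split("=", 1) for item in sig_header.split(","))
    timestamp = int(parts["t"])
    if abs(time.time() - timestamp) > tolerance:
        return False  # reject stale timestamps to limit the replay window
    # Sign the RAW request body; re-serialized JSON would break verification.
    signed = f"{timestamp}.".encode() + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    return hmac.compare_digest(parts["v1"], expected)
```

This is why the raw body matters: if FastAPI parses the JSON and you re-serialize it, whitespace and key ordering change and the HMAC no longer matches.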

Step 7: Frontend

Write a single-page frontend in app/static/index.html using vanilla JS.
No build step, no frameworks — just HTML, CSS (inline or single <style> block), and JS.

Pages (handled by JS routing on hash change):
  #/         — landing page with "Sign Up" and "Log In" buttons
  #/register — registration form
  #/login    — login form
  #/dashboard — list of user's links, "Create Link" button, link to analytics
  #/upgrade   — Stripe checkout redirect button

Use fetch() with the Authorization header for API calls.
Store JWT in localStorage (acceptable for this SaaS tier, document the trade-off in a comment).

Prompting Strategies for Claude Code

After working through the above, several patterns emerge as consistently useful.

Be specific about constraints. "Create a link" is a weak prompt. "Create a link with a 403 response when the free-plan limit of 10 active links is reached, returning JSON {detail: 'Upgrade to Pro for unlimited links'}" is actionable. The AI performs best when it knows exactly what success looks like.

Give examples of inputs and expected outputs. For the analytics endpoint, saying "given 5 clicks on Monday and 3 on Tuesday, the clicks_by_day field should return [{date: '2026-05-12', count: 5}, {date: '2026-05-13', count: 3}]" eliminates ambiguity about date formatting, ordering, and whether zero-count days appear.

State what NOT to do. "Do not use a third-party auth library — implement JWT directly with python-jose" prevents the AI from making decisions you did not intend.

Ask for reasoning before code on complex problems. "Before writing any code, explain in plain text how you plan to handle concurrent click recording without losing events if the worker is down." This surfaces misunderstandings before they are encoded in ten files.

Use the system prompt for persistent constraints. Claude Code supports a CLAUDE.md file at the project root that is injected as context every session. Put your architecture decisions, naming conventions, and non-negotiable rules there:

# CLAUDE.md — Snip Project Rules

- All database access goes through app/db.py session factory
- No synchronous SQLAlchemy — use async throughout
- JWT secret from env var JWT_SECRET, never hardcoded
- All monetary amounts stored in cents (integer), never floats
- Every new route needs a corresponding test
- UUIDs formatted as strings in JSON responses, not binary

Code Review and Quality Control When AI Writes Code

The review step is where the developer's expertise is irreplaceable. A checklist for reviewing AI-generated code:

Correctness
  • Does it do what the prompt described?
  • Are edge cases handled (empty list, null field, zero quantity)?
  • Are error messages useful to the caller?

Security
  • Are user-supplied values ever interpolated directly into SQL? (Use parameterized queries.)
  • Is authorization checked on every route that touches user data?
  • Are secrets ever logged?

Performance
  • Are there N+1 query patterns (looping over results and querying inside the loop)?
  • Is pagination implemented where result sets can be large?
  • Are database indexes used for every WHERE clause on large tables?

Consistency
  • Does the new code follow the same patterns as existing code?
  • Are error responses in the same format as the rest of the API?
  • Are variable names consistent with the naming conventions in CLAUDE.md?

A practical approach: after each feature, run git diff and read every changed line. If a line does something you do not understand, ask Claude Code to explain it. If the explanation does not make sense, ask it to rewrite the section more clearly. Code you do not understand is a bug you have not found yet.


Testing AI-Generated Code: Where Bugs Hide

AI-generated code has characteristic failure modes. Know where to look.

Boundary conditions. The happy path is usually correct. The edge is where problems live. Test with zero records, maximum limits, expired tokens, negative numbers, Unicode input, and very long strings.

Authorization bypass. The AI may write authorization logic correctly for the main case and miss it in a helper function or a bulk operation. Always test: "can user A access user B's data?"

Race conditions. Async code that looks correct is often subtly broken under concurrent load. Test the click-recording path with many simultaneous requests and verify no clicks are lost.

Webhook replay attacks. Test that replaying a valid Stripe webhook (same payload, same signature, sent twice) does not upgrade a user twice.
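
The standard defense is idempotency keyed on the Stripe event ID. A minimal sketch with an in-memory set (in production the seen-set would be a Redis key or a DB table with a unique constraint on the event ID; the class name is hypothetical):

```python
class WebhookDeduplicator:
    """Track processed event IDs so a replayed webhook delivery is a no-op."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def should_process(self, event_id: str) -> bool:
        if event_id in self._seen:
            return False  # duplicate delivery: acknowledge with 200, do nothing
        self._seen.add(event_id)
        return True
```

Your replay test then reduces to asserting that the second delivery of the same event ID returns False before any plan change happens.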

Migration correctness. Run alembic upgrade head on a fresh database AND on a database with existing data. Data migrations that work on empty databases often fail on real data.

A useful prompt after completing a feature:

You just wrote the authentication system in app/auth.py and tests/test_auth.py.
Review the tests and identify any security-relevant cases that are NOT tested.
Write additional tests for the gaps you find.

Context Engineering for Long Coding Sessions

Claude Code maintains context within a session but starts fresh each time you open a new session. Two practices mitigate context loss.

Maintain CLAUDE.md actively. After every significant decision — "we decided to use Redis Streams instead of BLPOP for the click queue" — write it into CLAUDE.md. This file is re-read every session and serves as the project's institutional memory.

Use /init at the start of each session. The /init slash command causes Claude Code to re-index the project directory and summarize the current state. Starting a session with "use /init and then tell me the state of the analytics feature" brings the AI up to speed in under a minute.

Break big features into sessions. A three-hour coding session with expanding context degrades in quality. Better to break large features into well-defined sub-tasks, commit after each, and start a fresh session for the next sub-task. Each session starts with a crisp context, not an overloaded one.

Summarize before complex tasks. "Before writing the Stripe webhook handler, summarize your understanding of: the current user model, how plans are stored, and what database session management we use." The summary reveals gaps before they become bugs.


Git Workflow: Committing Incrementally, Reviewing Diffs

Version control discipline matters more with AI-generated code, not less.

Commit after every working unit. After the data models are written and the migration runs: commit. After auth routes pass tests: commit. This gives you clean restore points and makes git diff per feature readable.

Commit messages should record intent, not just action. "feat(auth): add JWT login and registration routes" is better than "add auth." Future you — or Claude Code in a future session — needs to understand why each commit exists.

Review the full diff before committing. Run git diff --staged and read it completely. AI-generated code sometimes touches files you did not intend. A stray change to pyproject.toml pinning a specific version, or a change to CLAUDE.md the AI made on its own — these are things to catch before they land.

Use .gitignore from the start. Ask Claude Code to generate a .gitignore appropriate for Python/FastAPI in the first session. AI-generated projects sometimes create .env files containing secrets that you do not want committed.

Tag releases. When the SaaS is in a deployable state, tag it: git tag v0.1.0. This gives you a concrete rollback point for production issues.


When Vibe Coding Works and When It Fails

Works well:

  • CRUD endpoints with clear data models
  • Standard authentication patterns
  • Boilerplate configuration (Docker, CI, Makefile)
  • Test scaffolding for known patterns
  • Refactoring with clear instructions ("rename user_id to owner_id everywhere")
  • Adding logging and error handling to existing code
  • Writing documentation from code

Works less well:

  • Novel algorithms without good training data analogs
  • Complex stateful systems (distributed locks, consensus protocols)
  • Performance optimization without profiling data
  • Debugging subtle concurrency bugs
  • Security-critical cryptographic implementations
  • Integrations with poorly-documented or unusual APIs

Common failure patterns:

Hallucinated API methods. The AI may call a method on a library that does not exist, or that exists in a different version than you are using. Always check generated library calls against the actual documentation.

Confident wrongness. If Claude Code is uncertain, it sometimes proceeds confidently anyway. When the generated code is surprisingly short or skips something you expected, ask: "Is there anything you left out or simplified here?"

Context drift. In long sessions, the AI may forget earlier decisions and revert to defaults. If you see inconsistencies — a new route using integer IDs when you established UUID primary keys — catch it in review and correct it explicitly.


Production Deployment with Railway

Railway is a good match for vibe-coded SaaS projects because it handles provisioning PostgreSQL and Redis alongside the application with minimal configuration.

Write a railway.toml for deploying the Snip FastAPI application.
Requirements:
- Build command: uv pip install -e .
- Start command: uvicorn app.main:app --host 0.0.0.0 --port $PORT
- Health check path: /health
- Environment variables needed (list them, do not set values):
    DATABASE_URL, REDIS_URL, JWT_SECRET, STRIPE_SECRET_KEY,
    STRIPE_PRO_PRICE_ID, STRIPE_WEBHOOK_SECRET
Also write the /health route in app/main.py that:
- Queries the database (SELECT 1) and pings Redis
- Returns 200 {status: "ok"} if both succeed
- Returns 503 {status: "degraded", detail: "..."} if either fails
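
The decision logic the prompt asks for is small enough to sketch as a plain function, with the two dependency checks injected as callables that raise on failure (names are assumptions; the FastAPI route would wrap this and return a JSONResponse with the given status code):

```python
def health_status(check_db, check_redis) -> tuple[int, dict]:
    # Run each dependency check; a raised exception marks that dependency as down.
    failures = []
    for name, check in (("database", check_db), ("redis", check_redis)):
        try:
            check()
        except Exception as exc:
            failures.append(f"{name}: {exc}")
    if failures:
        return 503, {"status": "degraded", "detail": "; ".join(failures)}
    return 200, {"status": "ok"}
```

Keeping the logic separate from the route makes it trivially testable: pass lambdas that succeed or raise, and assert on the (status, body) pair.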

For the Stripe webhook to work in production, you need to register the webhook endpoint in the Stripe dashboard (https://your-domain.railway.app/billing/webhook) and copy the signing secret into the STRIPE_WEBHOOK_SECRET environment variable.

Run the worker as a separate Railway service pointing at the same repository with start command python -m app.worker.


FAQ

Is vibe coding appropriate for a production SaaS? Yes, with the caveat that you must review and understand the generated code. Shipping code you cannot explain is a support and security liability regardless of who wrote it.

How do I handle the AI introducing a dependency I do not want? Add it to CLAUDE.md as an explicit constraint: "Do not add new dependencies to pyproject.toml without asking first." Constraints written in CLAUDE.md are followed far more consistently than ones stated once in conversation.

What if the AI writes something insecure? It happens. The security review checklist in this article covers the most common cases. For the most sensitive parts (webhook signature verification, password hashing, SQL queries), ask Claude Code to specifically audit the generated code against the OWASP Top 10 for APIs.

Can I use Claude Code on an existing large codebase? Yes. Use /init to index the project. Then ask for changes in small, well-defined units. The larger and more complex the codebase, the more important it is to be precise in prompts and to break changes into small commits.

How much does Claude Code cost for a project like this? Token usage for a project of this scope (building a SaaS from scratch over several sessions) typically runs in the $15–$40 range using Claude Sonnet 4.6, depending on how many iterations you take and how much context you maintain across sessions.

Does Claude Code work offline? No. It requires an internet connection to the Anthropic API.


Leonardo Lazzaro

Software engineer and technical writer. 10+ years experience in DevOps, Python, and Linux systems.
