How This Was Built
The architecture, trade-offs, and decisions behind this recruiting playbook
What was my goal?
Build a complete recruiting infrastructure that proves I can do the job before getting hired. Not a portfolio. Not a resume. Actual systems I would implement on Day 1.
How I Actually Built This
I'm not a developer. But I'm solid at reverse engineering what I want, planning it with AI, and iterating until I get there.
Right now, I believe not knowing how to build demo products is a huge disadvantage, so I started vibe coding about four months ago. Since then I've redesigned my personal website and built a dream job coaching bot that holds all of my knowledge on how to land a dream job: basically a 150-page doc turned into a RAG-based chatbot that anyone can talk to. And I'm learning the technical side as much as I can.
This playbook was built the same way. Directing an AI agent to build what I see in my head, iterating until it works.
And yes, I used Wispr Flow for about 85% of this project. Only 15% was actual typing; the rest of this playbook was dictated with Wispr Flow.
The Setup
Before starting this project, I already had other projects with established rules that Claude could review. Things like DESIGN.md (color palette, typography, copy rules) and CLAUDE.md (project context, what's been built, what's next). So when I started this project, Claude already had patterns to follow.
The project uses Claude Code (Claude Opus 4.5) running inside Cursor. I like how the IDE looks and it lets me see the files change in real time. Every session starts the same way:
- I open the project folder
- Claude reads the context files (CLAUDE.md, PLAN.md, DESIGN.md)
- I describe what I want to build or fix
- Claude executes while I watch and redirect
PLAN.md: The Brain of the Project
How do I not lose context across multiple sessions? A single file: PLAN.md. It's over 2,000 lines long and tracks everything:
| What's Tracked | Why It Matters |
|---|---|
| Session dates | Know where I left off |
| Copy review tracker | 11/60 docs complete, started Dec 28 |
| Content rules | No FRC branding, redact emails, no em-dashes |
| Doc progress | 50+ docs created with detailed notes |
| Design decisions | Why I chose neo-brutalist, pillar colors, etc. |
| Known bugs | Things to fix later |
| Vision notes | Where this could go next |
When Claude loses context (which happens when sessions get long), I just point it back to PLAN.md. Everything I've decided, every tradeoff I've made, every rule I've established - it's all there.
What a Typical Session Looks Like
Here's what actually happened on December 29, 2024:
The problem: Each pillar page only showed the template content. The strategic overview content was buried in the infrastructure overview page - users had to expand accordion cards to see it.
What I said: "Each pillar page should have BOTH the strategic overview content AND the template content. Take the content from infrastructure-overview and prepend it to each pillar's main doc."
What Claude did:
- Read the infrastructure-overview.md file
- Identified the content for each of the 6 pillars (lines 68-895)
- Modified 6 different markdown files
- Added 65-235 lines to each file
- Verified all pages returned 200
That task would have taken me hours to do manually. It took about 15 minutes of conversation.
The Mistakes I Made
The biggest one: I created pillar-X-overview.md files that accidentally overwrote the existing content structure. Had to revert mainDoc values, delete 6 files, and rebuild infrastructure-overview from scratch.
Every bug was a conversation. I'd describe what was broken, Claude would investigate and fix it. Sometimes the fix would break something else. I'd iterate until it worked.
The Orchestration Workflow
From PLAN.md:
1. Claude plans as founder-expert
2. Claude dispatches agents to implement
3. Claude reviews agent work as orchestrator
4. I review
5. Loop continues until complete
This is the mental model that made everything work. I'm not writing code - I'm directing someone who can. My job is to know WHAT I want and WHY. Claude's job is to figure out HOW.
Tech Decisions
About 4 months ago I took a course to get up to speed on building with Next.js, React, and Tailwind. Built a few demo projects during the course. Then used the same stack to build my own productivity bot.
I've been reusing this tech for projects ever since, as long as it makes sense for the scope. This playbook was a good fit because:
- Next.js deploys easily to Vercel
- Markdown lets me write content fast without building React components for every page
- The file structure is simple. Folders for each pillar, markdown files for each doc
The playbook has 60+ documents organized across 6 pillars. Each pillar uses a React component for the interactive parts (accordions, CTAs, progress indicators) while the actual content lives in markdown files.
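To make the markdown-plus-components split concrete, here's a minimal sketch of how a pillar page might read its doc from disk. The content/ folder layout and the loadDoc helper are assumptions for illustration, not the actual project code.

```typescript
// Minimal sketch: a pillar page reads its doc from a markdown file on disk.
// The content/ folder layout and loadDoc name are assumptions for illustration.
import fs from "node:fs/promises";
import path from "node:path";

// e.g. loadDoc("pillar-1", "ml-engineer-jd") -> raw markdown string
export async function loadDoc(pillar: string, doc: string): Promise<string> {
  const filePath = path.join(process.cwd(), "content", pillar, `${doc}.md`);
  return fs.readFile(filePath, "utf-8"); // rendered later by the pillar's React component
}
```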
Reverse Engineering the Design
I used Gemini 3.0 Pro to reverse engineer Wispr Flow's website design. Screen recorded myself talking over their site, pointing out buttons, colors, typography, spacing. Then fed that to Gemini with a metaprompt I built that specializes in turning website walkthroughs into code ready for Opus 4.5.
I'd go back and forth with Gemini until I got the iteration I liked. Then I put the final design rules into DESIGN.md so Claude would follow them consistently.
Creating the Images
The stick figure illustrations were created with another metaprompt in Gemini 3.0 Pro that specializes in building Nano Banana prompts. I linked my Gemini API to Nano Banana (paid version) and created about 90% of the stick figures there. Then I put them into Canva for last mile adjustments. Finally, Opus 4.5 added them to the website.
Content Rules (From PLAN.md)
These rules were established early and enforced throughout:
| Rule | Reason |
|---|---|
| Redact all emails | No personal email addresses in public content |
| No em-dashes | Matt's voice doesn't use them |
| ML Engineer examples throughout | Wispr is hiring ML engineers |
| Benefit-driven headlines | Lead with what the reader gets |
The Research Dossier
Before writing a single word of the playbook, I built a 260-line research dossier on Wispr Flow. This is the foundation that powers every piece of copy in this application.
Location: /wispr-flow/research-dossier.md
What It Contains
| Section | Purpose |
|---|---|
| Company Research | Product analysis, "Zero Edit" metric, competitive landscape, privacy moat |
| Team Research | Founder archetypes (Tanay as "Contrarian Builder", Sahaj as "Academic Craftsman"), team structure |
| Values & Culture | "Let Fires Burn", "The Norms Paradox", "Architects vs. Executors" |
| Pain Point Hypothesis | Why they need a Founding Recruiting Lead (founder time, architect need, vibe check) |
| Technical Deep Dives | Style Alignment Engine, Hardware Ghost DNA, Windows bet, ecosystem integrations |
| Recruiting Hooks | Pre-written copy for outreach ("The Anti-Keyboard", "The Hinglish Moat", "J.A.R.V.I.S. Act 3") |
| Sourcing Tactics | Where ML Inference Engineers, iOS Internals, and Windows/.NET experts actually swim |
How It Powers the Playbook
Every template in the playbook gets its language from this dossier:
| Playbook Asset | Dossier Source |
|---|---|
| ML Engineer JD | "Zero Edit Rate" metric, "Latency Wall" pain point |
| Recruiter Pitch | "The Hardware Kill" hook, board room signals |
| Discovery Project | Technical deep dives on inference optimization |
| Interview Questions | "Act 2" question, "Power User" question |
What Got Interesting
As the project grew and my full playbook (over 100 pages of Google Docs that I wrote while recruiting for different companies) became more comprehensive, Opus 4.5 got smart enough to recommend adjustments based on that playbook. It started suggesting iterations I hadn't thought of.
This made me realize that building a recruiting agent with this playbook is the next step. Every recruiter needs this. What used to take me 10x longer five years ago now happens in a fraction of the time. The leverage here is insane.
A huge part of building this site was just testing it. Going through the flow, clicking every link, and rewriting the copy in my own voice. That iteration loop was constant.
The Wispr Flow Chatbot Demo
The front-end is already built at /chat. Candidates can ask about Wispr Flow's mission, team, culture, and open roles. This section documents exactly how we built the backend.
What It Does
| Feature | Description |
|---|---|
| Learn about Wispr | Answer questions about mission, team, culture, product |
| Explore roles | Explain open positions and what success looks like |
| Guide to apply | Direct candidates to the application when ready |
| Check fit | Ask candidates about their background against non-negotiables |
Tech Stack
| Component | Technology |
|---|---|
| LLM | Gemini API (gemini-2.5-flash) |
| Embeddings | Gemini API (text-embedding-004, 768 dimensions) |
| Vector DB | Supabase pgvector |
| Database | Supabase (Postgres) |
| Framework | Next.js + TypeScript |
| SDK | @google/generative-ai |
| Rate Limit | 20 messages per IP address |
The System Prompt
The chatbot speaks as Tanay, co-founder of Wispr Flow. The system prompt structure is based on my Dream Job Coach Bot, which uses a detailed persona-based approach.
Prompt structure:
| Section | Purpose |
|---|---|
| <role> | Digital twin of Tanay briefing a potential teammate |
| <knowledge_protocol> | Knowledge base first, voice mimicry, never cite sources |
| <operational_logic> | Modes: answer questions, explore fit, guide to apply |
| <constraints> | No fluff, conversational, honest about role requirements |
| <interaction_guidelines> | Markdown formatting, end with question, keep it human |
| <anti_ai_tells> | No em dashes, use contractions, direct and conversational |
Tanay's voice characteristics (from research):
- Uses words like "Ambivalent," "Purpose-Built," "Stream of Consciousness"
- Speaks in paragraphs, not soundbites
- Ruthless pragmatism
- Values leverage over prestige
- Direct, no BS
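Putting those pieces together, here's a hypothetical sketch of how the prompt sections and retrieved context might be assembled per request. The section bodies and the buildSystemPrompt helper are placeholders; the real prompt text lives in the project.

```typescript
// Hypothetical sketch of assembling the persona prompt per request.
// Section bodies are placeholders; the real prompt text lives in the project.
const SECTIONS = {
  role: "<role>Digital twin of Tanay briefing a potential teammate.</role>",
  knowledge_protocol:
    "<knowledge_protocol>Answer from the knowledge base first. Never cite sources.</knowledge_protocol>",
  operational_logic:
    "<operational_logic>Answer questions, explore fit, guide to apply.</operational_logic>",
  constraints: "<constraints>No fluff. Be honest about role requirements.</constraints>",
  anti_ai_tells: "<anti_ai_tells>No em dashes. Use contractions. Stay conversational.</anti_ai_tells>",
};

// Retrieved chunks get appended so the model answers from primary sources
// in Tanay's voice instead of generic knowledge.
export function buildSystemPrompt(retrievedContext: string): string {
  return [...Object.values(SECTIONS), `<context>${retrievedContext}</context>`].join("\n\n");
}
```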
The Knowledge Base
The chatbot needs primary sources in Tanay's actual voice, not just analysis about him. These sources populate the Google Drive folder that syncs to the vector database.
Primary sources (Tanay's voice):
| Source | Where to Find | What to Extract |
|---|---|---|
| Podcast transcripts | YouTube interviews, Twitter Spaces | Full transcripts of Tanay speaking |
| Twitter/X threads | @tanaykothari | His tweets, replies, threads |
| Blog posts | Medium, Wispr blog | Any written content by Tanay |
| Video transcripts | Product demos, YC talks | Spoken word converted to text |
| LinkedIn posts | Tanay's profile | Long-form posts |
Secondary sources (company context):
| Source Type | Content |
|---|---|
| Research Dossier | Company info, team profiles, culture signals, competitive landscape |
| Role Information | ML Engineer scorecard, non-negotiables, success metrics |
| Product Info | What Wispr Flow does, features, "Zero Edit" metric |
| Culture Signals | Values, working style, team dynamics |
| FAQ | Common candidate questions with answers |
| Recruiting Hooks | The hooks from the research dossier |
How the knowledge base works:
- Transcripts and sources stored in the /transcripts folder
- Run the sync script to chunk and embed content
- Content stored in Supabase pgvector
- Update content anytime by adding files and re-syncing
RAG pipeline:
| Step | Detail |
|---|---|
| Chunking | 500 tokens per chunk, 50 token overlap |
| Embeddings | Gemini text-embedding-004 (768 dimensions) |
| Storage | Supabase pgvector |
| Retrieval | Top 50 chunks above 0.5 similarity threshold |
| Context | Chunks formatted with relevance scores |
The 0.5 similarity threshold is loose enough to pull in relevant context and strict enough to keep irrelevant chunks out of the prompt.
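Here's a minimal sketch of what the chunk-and-embed sync step could look like with these settings. Word counts stand in for real token counting, and the insert columns assume the document_chunks schema described in the next section.

```typescript
// Minimal sketch of the sync script: chunk each source, embed it, store it.
// Word counts stand in for token counts, and the insert columns assume the
// document_chunks schema described in the next section.
import { GoogleGenerativeAI } from "@google/generative-ai";
import { createClient } from "@supabase/supabase-js";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const embedder = genAI.getGenerativeModel({ model: "text-embedding-004" });
const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_SERVICE_KEY!);

// ~500-token chunks with ~50 tokens of overlap, approximated by words.
function chunkText(text: string, size = 500, overlap = 50): string[] {
  const words = text.split(/\s+/);
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += size - overlap) {
    chunks.push(words.slice(start, start + size).join(" "));
  }
  return chunks;
}

export async function syncSource(source: string, text: string) {
  for (const content of chunkText(text)) {
    const { embedding } = await embedder.embedContent(content); // 768-dim vector
    await supabase.from("document_chunks").insert({
      content,
      metadata: { source },
      embedding: embedding.values,
    });
  }
}
```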
Supabase Schema
Supabase is required for pgvector (embeddings and similarity search). Message logging and admin features are optional.
Required tables (for RAG to work):
| Table | Purpose |
|---|---|
| documents | Metadata for each source doc (title, source, created_at) |
| document_chunks | Chunked text with pgvector embeddings (768 dimensions) |
Optional tables (for admin features):
| Table | Purpose |
|---|---|
| chat_messages | Log conversations for review |
| message_usage | Track IP addresses for rate limiting |
| knowledge_gaps | Auto-log when RAG returns no context |
Key function:
match_documents - pgvector similarity search that returns the top N chunks above a similarity threshold.
```sql
create or replace function match_documents (
  query_embedding vector(768),
  match_threshold float,
  match_count int
)
returns table (
  id bigint,
  content text,
  metadata jsonb,
  similarity float
)
language sql stable
as $$
  select
    document_chunks.id,
    document_chunks.content,
    document_chunks.metadata,
    1 - (document_chunks.embedding <=> query_embedding) as similarity
  from document_chunks
  where 1 - (document_chunks.embedding <=> query_embedding) > match_threshold
  order by document_chunks.embedding <=> query_embedding
  limit match_count;
$$;
```
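For reference, here's a hedged sketch of the retrieval side calling that function from the chat API route via Supabase RPC, using the 0.5 threshold and top-50 settings from the pipeline table. The helper and environment variable names are assumptions.

```typescript
// Sketch of the retrieval step from the chat API route: embed the question,
// then call match_documents over Supabase RPC with the settings from the
// RAG pipeline table. Helper and env var names are assumptions.
import { GoogleGenerativeAI } from "@google/generative-ai";
import { createClient } from "@supabase/supabase-js";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const embedder = genAI.getGenerativeModel({ model: "text-embedding-004" });
const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_ANON_KEY!);

export async function retrieveContext(question: string): Promise<string> {
  const { embedding } = await embedder.embedContent(question);
  const { data, error } = await supabase.rpc("match_documents", {
    query_embedding: embedding.values,
    match_threshold: 0.5, // drop chunks with weak similarity
    match_count: 50,      // top 50 chunks
  });
  if (error) throw error;
  // Format chunks with relevance scores before injecting into the system prompt.
  return (data ?? [])
    .map((c: { content: string; similarity: number }) => `[relevance ${c.similarity.toFixed(2)}] ${c.content}`)
    .join("\n\n");
}
```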
Rate Limiting & Graceful Degradation
The chatbot uses a tiered rate limiting system that degrades gracefully instead of just blocking users.
| Tier | Messages/min | Behavior |
|---|---|---|
| Fast | 1-15 | Full RAG with vector search |
| Slow | 16-30 | Skip RAG, use system prompt only |
| Blocked | 31+ | Friendly "catch my breath" message |
This means if Supabase gets hammered, the bot still works. It just becomes less specific.
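A minimal sketch of how the tier check could work per IP, assuming an in-memory counter; the real system may track usage in the message_usage table instead, but the boundaries mirror the table above.

```typescript
// Sketch of the tier check per IP, using an in-memory counter as an assumption.
type Tier = "fast" | "slow" | "blocked";

const WINDOW_MS = 60_000;
const counts = new Map<string, { windowStart: number; count: number }>();

export function classifyRequest(ip: string, now = Date.now()): Tier {
  const entry = counts.get(ip);
  if (!entry || now - entry.windowStart > WINDOW_MS) {
    counts.set(ip, { windowStart: now, count: 1 });
    return "fast";
  }
  entry.count += 1;
  if (entry.count <= 15) return "fast";  // full RAG with vector search
  if (entry.count <= 30) return "slow";  // skip RAG, system prompt only
  return "blocked";                      // friendly "catch my breath" message
}
```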
Security Hardening
After building the initial chatbot, I tested it extensively and found edge cases. Then I applied lessons from a previous project (Life Tracking Buddy) where we did a comprehensive security audit.
Input Protection:
| Attack | Defense |
|---|---|
| Malformed JSON | Try-catch with friendly 422 error |
| Giant payloads | Max 2000 chars/message, max 50 messages |
| Null byte injection | Stripped from all input |
| Invalid format | Strict validation of message structure |
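A hedged sketch of those input checks; the ChatMessage shape is an assumption, and the limits mirror the table above.

```typescript
// Hedged sketch of the input checks from the table above. The ChatMessage
// shape is an assumption; limits match the copy (2000 chars, 50 messages).
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

export function validateMessages(body: unknown): ChatMessage[] {
  if (!Array.isArray(body) || body.length === 0 || body.length > 50) {
    throw new Error("invalid message list"); // caught upstream, returned as a friendly 422
  }
  return body.map((m) => {
    if (typeof m?.content !== "string" || m.content.length > 2000) {
      throw new Error("invalid message");
    }
    return {
      role: m.role === "assistant" ? "assistant" : "user",
      content: m.content.replace(/\0/g, ""), // strip null bytes
    };
  });
}
```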
Prompt Injection Protection:
Common injection patterns (role hijacking, instruction override, jailbreak attempts) are detected and deflected naturally without revealing detection. The key insight: just redirect the conversation gracefully.
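A sketch of what that pattern check might look like; the patterns below are illustrative, not the production list, and the real defense is the graceful redirect rather than the regex.

```typescript
// Illustrative pattern check for injection attempts. The patterns here are
// examples; the important part is redirecting gracefully instead of
// announcing that anything was detected.
const INJECTION_PATTERNS = [
  /ignore (all|previous|prior) instructions/i,
  /you are now/i,
  /reveal .*system prompt/i,
  /act as .*(developer|dan) mode/i,
];

export function looksLikeInjection(message: string): boolean {
  return INJECTION_PATTERNS.some((pattern) => pattern.test(message));
}
```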
Response Validation:
Responses are validated before being sent to users. If the model breaks character or produces suspicious output, the system returns a friendly fallback message instead.
Fallback Strategy:
Every failure mode returns a friendly message that points to the rest of the funnel:
"Hey, I'm having a bit of trouble right now. But you can still check out
the rest of the site to learn about Wispr Flow!"
The chatbot can go down. The funnel never does.
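A small sketch of that last line of defense, combining response validation with the fallback above; the specific break-character checks are assumptions.

```typescript
// Sketch of the last line of defense: validate the model's reply and fall back
// to the friendly message above if it breaks character. The checks are assumptions.
const FALLBACK =
  "Hey, I'm having a bit of trouble right now. But you can still check out " +
  "the rest of the site to learn about Wispr Flow!";

export function validateResponse(text: string): string {
  const brokeCharacter =
    text.trim().length === 0 ||
    /as an ai (language model|assistant)/i.test(text) ||
    text.trimStart().startsWith("Tanay:"); // the persona should never label itself
  return brokeCharacter ? FALLBACK : text;
}
```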
Testing the Chatbot Voice
The first version of the chatbot had problems:
| Issue | What Happened | Fix |
|---|---|---|
| Pushed jobs too early | Bot mentioned applying after one message | Added rule: "NEVER mention jobs unless asked" |
| Said "we're not on-device" | Factually wrong about Wispr | Added explicit correction to system prompt |
| Prefixed with "Tanay:" | Broke the persona | Changed conversation history format |
| Used em dashes | Didn't match Tanay's voice | Added rule: "NEVER use em dashes" |
| Dumped info | Not conversational | Added Socratic instruction: "ASK QUESTIONS BACK" |
The iteration loop: test, find issues, update system prompt, test again. About 15 rounds of refinement.
Implementation Order
| Phase | Tasks |
|---|---|
| Phase 1 | Supabase tables, pgvector setup, basic RAG pipeline |
| Phase 2 | Gather knowledge sources (podcasts, blogs, interviews), transcribe |
| Phase 3 | Chunk, embed, and store in Supabase |
| Phase 4 | Write Tanay voice system prompt, test and refine |
| Phase 5 | Rate limiting, error handling, security hardening |
Based On: Dream Job Coach Bot
This chatbot is based on a RAG-based bot I already built that turns 150+ pages of job search knowledge into conversational AI.
| Component | Technology |
|---|---|
| LLM | Gemini API (gemini-2.5-flash) |
| Embeddings | Gemini API (text-embedding-004, 768 dimensions) |
| Vector DB | Supabase pgvector |
| Database | Supabase (Postgres) |
| Framework | Next.js + TypeScript |
| SDK | @google/generative-ai |
Key features carried over from that project:
- RAG pipeline with 500 token chunks and 50 token overlap
- 0.5 similarity threshold for retrieval
- Prompt injection detection and response validation
- Security hardening (input sanitization, rate limiting)
What Would I Change?
| Area | Improvement |
|---|---|
| Search | Add full-text search across all documents |
| Progress tracking | Save which pillars/docs you've read |
| Templates | Downloadable .doc versions |
What I Deliberately Skipped
| Skipped | Reason |
|---|---|
| Database | No user data needed |
| Auth | Public content |
| CMS | Overkill for static docs |
Thank You
If you made it this far, thank you for taking the time to review the entire playbook. I genuinely appreciate it.
If you have any questions or want to chat, email me at mattjez@hey.com.