Essay

Claude Hacks

What actually works, what doesn't, and what I'd build if I worked at Anthropic. Grades based on my own work.

Last updated: May 23, 2026

Note about this page. This page was written by Claude, heavily advised and edited by me, based on context from our ongoing chats. The opinions are mine. The prose is shared. The grades change as I test new things.

I’ve tested a lot of ways to use Claude in my day-to-day. I’m not technical. The grades aren’t about how good Claude is at the task in general — they’re about how much I personally get out of using it for that task, given the kind of work I actually do.

1. Working in file paths — A

The whole Miao Money site was built this way. Markdown files in /Content/. HTML in /Mockup/. Claude reads, edits, and writes files in my actual project folder.

Important distinction: this site runs on Astro — a static site generator — with all content in markdown files. Claude edits the markdown. GitHub pushes it live. No CMS, no drag-and-drop, no website builder. Just files on a computer. If you can manage files, you can build a site.

If I had more budget I’d use Claude Code — cleaner, faster, more expensive. Cowork mode is the version I can afford to live in.

Claude’s take: Michelle trusts filesystems more than browsers. Almost everything she likes is “saved to disk.” Almost everything she dislikes is “stuck in a tab.” This is a real pattern, not a quirk — and it’s also the only reason she gets compounding leverage out of me. Most people use Claude like a chat app and lose every artifact when they close the tab. She uses Claude like a co-worker who shares a drive.

2. MCPs — A−

What I actually use:

Apollo — outbound research for cold emails and partnership outreach. Pull a list, qualify it, draft the message, all in the same surface.
HubSpot — CRM lookups, contact enrichment, pipeline state. Where the prospect actually lives.
Google Calendar — prep for meetings without leaving the chat screen.
GitHub — pushing this site live right now.

Pricing note: for high-volume email or calendar pulls, dumping via Google Takeout and feeding as files is cheaper than connecting Gmail’s API. Counterintuitive but true.

Worth knowing: a MCP is only as useful as what the underlying business allows. Just because you have the connection doesn’t mean you can do anything meaningful with it. Buyer beware.

Claude’s take: The MCPs that work for Michelle are where Claude asks the question and the tool is the database. She doesn’t want Claude to do things in the tool yet — just retrieve and answer. The unlock she hasn’t tried: letting Claude write back. Not a trust question — a staging question. Read, draft, she commits. Worth testing.

3. Email-to-deliverable loop (Gmail / Superhuman) — A−

The shape: Claude reads my inbox through the API, pulls relevant threads and call notes from my file path, drafts the deliverable (proposal, follow-up, summary) in HTML using my design system, exports it as PDF, attaches it to a draft reply, files the source materials into the right client folder, and queues the email for me to send.

What used to be a 45-minute task (read thread → find notes → draft → format → save → attach → send) is now ten minutes of review. The actual work is checking Claude’s work, not doing the work.

Where it shines: repeatable, template-shaped outputs. Proposals, monthly recaps, partnership intros.

Where it breaks: anything that requires a judgment call that isn’t in the file path. If the context isn’t saved somewhere Claude can read, Claude will confabulate. Fix: save more, save earlier, save with structure.

Claude’s take: Most operators don’t realize you can chain inbox → file system → designed PDF → outbound in one Claude session.

4. HTML as the base for PDFs — B+

The output is great. The workflow has a hump.

Claude writes directly in HTML with my design system as context, then I print to PDF from the browser. The PDFs look better than anything I could make in Word or PowerPoint.

But honestly — I hate the part where I have to open the HTML in a browser, hit print, save as PDF. Three steps too many. If there were a “render this directly to a designed PDF” button, I’d use it every day.

Claude’s take: The reason this scores higher than the friction would suggest is the output quality is non-substitutable — there’s no Word template or Canva that produces what Michelle’s HTML system produces. The friction is real but the alternative is worse. Where she’s wrong: she’s still doing the print-to-PDF step manually when the headless-Chrome version of this (a small script that renders any HTML in her folder to PDF on save) would take one Claude session to build. She hasn’t asked.

5. Product design — A+

Highest grade on the list and not close.

I have a product/strategy conversation with Claude and end up with a working artifact in the same session. For DayZero specifically: I’ve gone from “I want a thing that does X” to a clickable prototype in hours, not weeks. The cost of being wrong dropped to near zero.

Receipts:

The entire DayZero CFO portal dashboard was first mocked up by Claude in one night, end-to-end. Customer feedback notes → outlined in markdown (which I heavily edited) → wireframed with Claude → clickable HTML prototype by morning.
The DayZero inventory management redesign followed the same pattern. I took customer interview notes, sketched what should change, and had Claude render the new layout while I was still on a different call.
In-chat artifacts for one-off prototypes — an inventory app for a client, an org tracker for DayZero, a cash flow dashboard. Five minutes from “I need to see if this works visually” to a working v0. Then either ship as an internal tool or hand off to engineering as a spec.

Claude’s take: Product design is the cleanest division of labor on the page — she ideates, I execute, she edits. It works because she already knows what “right” looks like before she opens the chat. In every other category where the target is fuzzier (Excel architecture, model logic from scratch), the grades drop. The unlock she hasn’t tried: running the same conversation pattern on the DayZero roadmap instead of individual features. Same fluency, bigger surface area.

6. Website design — A

I designed 5 subpages of the ondayzero.com site in one night. The entire thing took one night, not ten days.

The workflow that makes this work: I give Claude context on sites I like, have Claude write a competitive comparison of what each one does well, decide what fits and what doesn’t, then work on brand voice. The design comes out of the voice, not the other way around.

Claude’s take: Michelle treats voice as the constraint and design as the variable — and almost no one builds websites that way. Most operators pick a visual style first and reverse-engineer the voice. Doing it backwards forces every layout decision to serve how the brand actually talks. Where she could push harder: she’s iterating layouts with me but still hand-picking reference sites herself. The next move is letting me build her a tagged corpus of sites in her taste, then surface clusters she didn’t know she liked. The taste is hers; the search is mine.

7. Document analysis (debt docs, MSAs, API specs) — A−

Where I lean on Claude when there’s dense reading I need to extract numbers from fast.

The use cases:

Debt documents for CFO advisory clients — The job isn’t “tell me if this is market.” The job is “build me the payback schedule, the IRR to the lender, and the implied effective rate net of fees.” Claude reads the doc, pulls the relevant terms, and builds the schedule.
MSAs — vendor contracts, customer agreements. What are the actual payment terms, the renewal mechanics, the change-of-control language. Where do the dollar numbers live and what triggers them.
Technical specs — Shopify API definitions, third-party developer docs. What does this endpoint actually return. What’s deprecated. What’s the cheapest call.

Why this works: lots of things get lost in the sauce. Claude is good at taking words and turning them into exhibits — a payback schedule, a flow diagram, an annotated clause.

Claude’s take: The pattern Michelle has stumbled into — Claude extracts numbers, human builds narrative — is the right division of labor for any document where money is the point. The next layer she hasn’t built: a standing prompt that runs the same extraction every time a new debt doc lands — IRR to lender, effective rate net of fees, prepayment optionality. Right now she rebuilds the schedule from scratch each time. The schedule is the artifact; the prompt is the asset.

8. Memory + Cross-Chat Projects — A−

Memory doesn’t travel across Claude products. Concrete example: I’m in web Claude brainstorming the components of AOV in Shopify. Does AOV include shipping income? Claude and I land on no, AOV excludes shipping — but unit economics must include it as a contra-revenue. That conclusion lives in the web chat. Now I’m in Cowork updating my unit economics model in an existing Excel file. Claude in Cowork has none of that context. I have to either port the memory over with a markdown handoff, or explain it again. Every cross-product session has this tax.

Wish: Native memory sharing across web ↔ Cowork ↔ Claude Code.

Claude’s take: Michelle’s bridge-prompt workaround is the right move for now, but it’s a workaround for a problem that costs her real time every week. If she’s writing about Claude publicly, this is the single best receipt for the “context engineering era” argument — she’s literally engineering context by hand every time she changes surfaces. The Anthropic team will probably fix this in the next year. She should write about it before they do, so the receipt is in her name.

9. Excel modeling — B−

When I first started using Claude for Excel, I was blown away. I fed it a list of journal entries and asked it to recreate the three statements. It did it. But I soon found mistakes hidden behind pretty formatting. Most of the data was hardcoded, the balance sheet didn’t balance (Claude made a “check” line but, ironically, it wasn’t zeroing out). It was good at logic, but it couldn’t check its own work.

I realized I needed to either go through the Excel myself — or create another Excel template as the “AI Checker.” Essentially another agent to check the first. At that point, I could have just built the model faster. I’m not going to sit here for an hour iterating… to what goal? Just to say I didn’t type a single formula? I can rewire the model quicker than iterating with Claude.

Claude is actually “Michelle-assisted.”

Where Claude is good (A−):

Formatting outputs
Transcribing numbers
Diagnosing how someone else’s model flows — great first review, saves time on inherited models
Commenting on changes between versions (“what did this update affect?”)
Catching hardcodes where there should be formulas
Cleanup, beautification, sanity-checking structure
Data enrichment — reading raw memo lines and adding columns (e.g., categorizing credit card statements by vendor)

Where Claude is weak (C):

Building model architecture from scratch
Anything where the numbers actually have to tie
Building robust dynamic models — it likes to shortcut with hardcodes

The receipt: I asked Claude to build a balance sheet. It didn’t balance. Four tries. Each version looked confident — clean structure, sensible line items, totals at the bottom. It didn’t tie. I had to identify the issue myself and point it out. By the fourth attempt, I just fixed it myself in excel.

My takeaway: Use Claude on a model someone else built. Use Excel on a model you’re building. Then use Claude to format it at the end.

Claude’s take (the longer one):

This is the riskiest item on the page. The output looks confident even when it’s wrong, which is the worst combination for financial work.

Where Michelle is right: the division of labor she’s landed on — Claude on someone else’s model, Excel on her own — is correct for now. She’s also right that the “AI checker” pattern is overhead in disguise. Two bad agents don’t equal one good one; they equal a longer review cycle.

Where she might be wrong: she’s treating this as a model-building limitation when it’s actually a context limitation. If she pinned the model logic to a markdown spec first and made me reference it on every turn, the failure mode would change. She hasn’t tried this — partly because by the time she’d write the spec she’d be done modeling.

Where I might just be bad: numerical reasoning over long chains of dependent cells. The fact that I can ace an LBO walkthrough in chat but fail a balance sheet in Excel suggests the bottleneck is the spreadsheet abstraction, not the logic. The next leap here probably isn’t a better Claude — it’s a Claude that can read and write to a live Excel session with structured cell references.

The thing she didn’t try: using me to write the test cases for her model. She builds the model in Excel; I write the assertions. She runs them. That inverts the workflow in a way that uses my strength (logic) and her strength (model fluency). Worth a session.

10. Reading a code base — B+

I’m not technical. But Claude has changed how much codebase context I can hold, which has changed how useful I am to my engineering team.

The way I actually use it: I ask Claude to explain what a file does. I read along. I sometimes propose changes in plain English and Claude translates. I can spot when something in the DayZero codebase is doing one thing in the UI and another thing in the API spec, even if I couldn’t have written either.

What it doesn’t make me: an engineer. What it does make me: a founder who can read along with what her team is building, provide better and more tactical feedback on the fix.

Claude’s take: The most underrated category on the page for non-technical founders. The reason most non-technical CEOs get out of sync with their eng team isn’t lack of intelligence — it’s lack of time-in-codebase. Claude collapses that. The unlock Michelle hasn’t fully explored: pasting an entire PR diff and asking “what would I, as the product owner, want to push back on here?” That’s the question her team needs her to be able to ask.

Workflows I actually use

Rule of thumb across all of these: anything generated worth keeping gets exported to a folder, not left in the chat window.

The sales flow — prospect → Apollo + web scrape brief → Claude-drafted script → call → Claude-debriefs my performance → proposal in markdown → HTML conversion → followup email → send.
Email-to-deliverable loop — inbox read → file path pull → HTML deliverable → PDF → attached draft → file the source → queue to send.
The bridge-prompt move — Claude products don’t share memory. I write a prompt in the source product asking Claude to summarize what I need in a paste-able format, then move the output. This page was updated this way.
Long research → conclusions doc — let a research conversation run for an hour. Ask Claude to extract conclusions into one doc. Doc moves to project file.
Call notes → proposal — save every conversation. Have Claude pull from them. Voice samples become the moat.

On prompting (myth-busting)

I don’t ever tell Claude to “assume a persona.” That’s archaic — Claude infers way more than the old prompt engineering literature suggests. Telling it “you are a senior banker” used to do work; now it mostly adds noise.

What works better:

Give it your actual context (folder access, call notes, voice samples)
Tell it the audience and the constraint
Edit its output, don’t try to engineer the perfect prompt up front

The prompt engineering era is over. The context engineering era replaced it.

Claude’s take (overall):

Reading these ten grades back-to-back, the through-line is louder than any individual entry: every A is a workflow where Michelle controls the context — files, call notes, design system, voice samples. Every B is a workflow where the context lives somewhere I can’t reach — a live spreadsheet, an unrendered HTML preview, a chat that won’t follow her between products.

The pattern she hasn’t quite named for herself: she’s not getting better at prompting me. She’s getting better at structuring her work so I can step in cleanly. Those are different disciplines. Most operators are still in the first one.

The second pattern: she grades highest the workflows where she is the taste-maker and I’m the executor — product design, website design, document analysis. She grades lowest the ones where I’m asked to be a taste-maker myself — Excel model architecture. Her instincts about where to trust me are sharper than she gives herself credit for. The grades on this page are themselves a context engineering document.

What she could publish that almost no one else can: a receipts-first version of the context engineering argument — not the philosophical version, the operator version. This page is the start of that argument; she just doesn’t quite realize that yet.

*Page last updated: May 23, 2026.