What I learned after building AI workflows both ways, and why I stopped writing framework code.
I've built AI agents with LangChain. I've wired up pipelines with the Anthropic SDK. I've written the Python glue code, the orchestration classes, the retry logic, the prompt chaining. I know what it feels like to ship something with those tools and feel productive.
Then I built plan-flow, a structured workflow system for AI-assisted development, and I didn't use any of them. No LangChain. No Semantic Kernel. No CrewAI. No agent SDK. Just markdown files, directories, and a thin CLI that copies files around.
And it works better than anything I built with frameworks.
Look, maybe I'm just some guy who doesn't know enough about the "right" way to build AI systems. Maybe the experts would look at what I'm doing and laugh. But after living in both worlds, something clicked for me that I can't unsee, and in my head it makes total sense.
Here's the cycle I kept repeating with frameworks:
Write orchestration code. Ship it. A model update drops. Half the framework logic becomes redundant: the chain-of-thought wrapper I spent three weeks on? The new model does it natively. The retry logic? Built in now. The routing layer? Replaced by a system prompt tweak.
So I'd rewrite. Ship again. Another update. Rewrite again.
Every time, the same thought: "AI is just moving too fast."
But after building plan-flow, I realized that wasn't true. The problem wasn't the speed of AI. It was where I was building.
When I stepped back and looked at what I was actually doing with frameworks, I saw this:
┌────────────────────────────────────────────────────┐
│                    THE AI STACK                    │
│                                                    │
│  Layer 4: Orchestration Frameworks                 │ ← I was here.
│  (LangChain, CrewAI, Semantic Kernel)              │   This keeps breaking.
│  ────────────────────────────────────────────────  │
│  Layer 3: Provider APIs                            │ ← Shifts every quarter
│  (Claude API, OpenAI API, Gemini API)              │
│  ────────────────────────────────────────────────  │
│  Layer 2: Foundation Models                        │ ← Evolves constantly
│  (Claude, GPT, Gemini, Llama)                      │
│  ────────────────────────────────────────────────  │
│  Layer 1: The File System                          │ ← Hasn't changed since
│  (directories, files, plain text)                  │   the 1970s. Won't.
└────────────────────────────────────────────────────┘
I was building at Layer 4 β the most volatile surface in the entire stack. Every model improvement erodes it from below. Every new provider feature makes a chunk of your code pointless.
With plan-flow, I accidentally dropped to Layer 1. And suddenly, updates stopped being a threat.
When I built plan-flow without frameworks, I had to figure out how to give the AI the right instructions, the right tools, and the right context, all without writing code. What I ended up with was embarrassingly simple:
Directories and files. That's it.
Here's the actual structure:
.claude/
├── commands/                 # Entry points - one .md file per workflow
│   ├── discovery-plan.md     # "How to gather requirements"
│   ├── create-plan.md        # "How to create an implementation plan"
│   ├── execute-plan.md       # "How to execute a plan phase by phase"
│   ├── review-code.md        # "How to review code changes"
│   └── ...                   # 13 commands total
│
├── rules/core/               # Behavioral constraints - always loaded
│   ├── allowed-patterns.md   # "What you should do"
│   └── forbidden-patterns.md # "What you must never do"
│
└── resources/                # On-demand reference material
    ├── skills/               # Detailed step-by-step workflows
    ├── patterns/             # Templates and examples
    └── tools/                # Tool descriptions (MCP, testing, etc.)

flow/                         # Runtime - grows as you work
├── discovery/                # Output from discovery workflows
├── plans/                    # Output from planning workflows
├── brain/                    # Knowledge vault (Obsidian-compatible)
│   ├── features/             # What was built and why
│   └── errors/               # Reusable error patterns
├── memory.md                 # What was completed
└── tasklist.md               # What's in progress
No orchestration classes. No prompt chain code. No state management library. The AI reads markdown files from folders, uses MCP servers as tools, and writes its output back to the file system.
Every single workflow in plan-flow (discovery, planning, execution, code review, testing, knowledge capture) is just a folder with files inside it.
After building this, I went back and looked at every agent framework I'd used before. And I realized: they all decompose into the same thing.
Every agent needs three things to not be useless:
                ┌───────────────────┐
                │     USEFUL AI     │
                │  = Right Routing  │
                └─────────┬─────────┘
                          │
        ┌─────────────────┼─────────────────┐
        ▼                 ▼                 ▼
┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│ INSTRUCTIONS │  │ CAPABILITIES │  │   CONTEXT    │
│              │  │              │  │              │
│ What to do   │  │ What it      │  │ What it      │
│ and how to   │  │ can call     │  │ should know  │
│ behave       │  │              │  │              │
│              │  │ • APIs       │  │ • Schemas    │
│ • Prompts    │  │ • MCP        │  │ • Examples   │
│ • Rules      │  │   servers    │  │ • Domain     │
│ • Guardrails │  │ • CLI tools  │  │   knowledge  │
└──────────────┘  └──────────────┘  └──────────────┘
        │                 │                 │
        └─────────────────┼─────────────────┘
                          ▼
                ┌───────────────────┐
                │   = A directory   │
                │    with files.    │
                └───────────────────┘
That's not a metaphor. In plan-flow, that's literally what each workflow is. A folder with a prompt file, some tool configs, and relevant data files.
The LangChain agent I built last year with 400 lines of Python? It was doing the same thing β just wrapped in classes and decorators that added complexity without adding value.
Here's the part that really changed my perspective: coding agents already work this way natively.
Claude Code, Cursor, GitHub Copilot β they navigate your file system, read instructions from markdown, call tools via MCP servers, and spawn sub-processes for parallel work.
| What I used to code in frameworks | What the agent does on its own |
|---|---|
| Prompt template engine | Reads a .md file |
| Tool registry | Connects to MCP servers |
| Context injection | Reads files in a directory |
| Sub-agent orchestration | Spawns child processes |
| State management | Writes files to disk |
| Workflow routing | Navigates the directory tree |
Plan-flow's CLI is about 500 lines of TypeScript. And all it does is copy files to the right places and manage a background daemon. Zero business logic. Zero orchestration. The AI handles all of that by reading the file tree.
This is what really sold me. With frameworks, every model update is destructive. With directories, every model update is additive.
FRAMEWORK WORLD                        DIRECTORY WORLD
───────────────                        ───────────────

┌──────────┐   Model                   ┌──────────┐   Model
│   Your   │   Update                  │   Your   │   Update
│   Code   │ ──────────► BROKE         │   Tree   │ ──────────► SIMPLER
└──────────┘                           └──────────┘

Rewrite everything.                    Delete a folder. Simplify a prompt.
When a provider ships a new feature that covers part of your workflow, you don't rewrite integration classes. You delete a subdirectory and maybe add a line to an existing prompt file.
BEFORE                               AFTER
──────                               ─────
onboarding/                          onboarding/
├── collect-user-info/               ├── collect-user-info/
│   ├── prompt.md                    │   ├── prompt.md
│   ├── capabilities/                │   ├── capabilities/
│   └── context/                     │   └── context/
├── verify-identity/   ◄── gone      │       └── provider-onboarding-v2.json  ← absorbed
│   ├── prompt.md                    └── send-welcome/
│   ├── capabilities/                    └── ...
│   └── context/
└── send-welcome/
    └── ...
I've seen this happen in plan-flow already. When models got better at following complex instructions, some of my multi-step skill files got shorter. When Claude Code shipped better tool support, I removed workaround prompts. The structure absorbed the improvements instead of fighting them.
Looking back at my own journey:
My Framework Projects:
Code ───► Ship ───► Update Drops ───► Rewrite ───► Ship ───► Update ───► Rewrite
8 weeks   1 week        ↓             6 weeks      1 week       ↓        6 weeks
                   "We need to                              "Again??"
                    refactor"

Plan-Flow:

Structure ───► Ship ───► Update Drops ───► Simplify ───► Ship ───► Update ───► Simplify
3 days         1 week        ↓             2 hours       1 week       ↓        2 hours
                        "Nice, we can                              "Even less
                         remove that"                               to maintain"
One is a treadmill. The other only moves forward.
Honestly, I don't know. Maybe there's something I'm missing. Maybe there's a class of problems where you genuinely need a framework with custom orchestration logic in Python or C#. I'm not claiming to have all the answers.
But here's what I do know:
If I'm wrong, I'm wrong in a way that still ships working software with minimal maintenance. I can live with that.
Pick one workflow you're building with a framework. Something small: a support classifier, a code reviewer, a data pipeline.
Decompose it into folders. One folder per step. Inside each: a prompt file, tool configs, context files.
Point a coding agent at it. Claude Code, Cursor, whatever. Tell it to follow the instructions in the directory.
See what happens. You might be surprised how little code you actually needed.
Wait for the next model update. Notice how your directory structure survives it.
"You're not falling behind because AI moves too fast. You're falling behind because you're building at a layer that's designed to be replaced."
The file system is the oldest, most stable, most universal abstraction in computing. It has survived mainframes, personal computers, the internet, mobile, cloud, and now AI.
I didn't set out to prove this. I just built something without frameworks and noticed the pattern after the fact. Maybe that makes it more credible. Maybe it makes it less. Either way, plan-flow works, it doesn't break on updates, and the entire "orchestration layer" is a folder structure anyone can read.
That's enough for me.
Build less. Structure more.