In January of this year, YC President Garry Tan initiated a development project named Garry's List, generating over 540,000 lines of Rails code and supporting tests. While the sheer volume of code initially appeared to be a triumph of productivity, Tan later concluded that the application itself was merely a byproduct of a deeper realization. The true value emerged not from the codebase but from GStack, a new development framework designed around AI agent workflows. Data compiled by Woofun AI shows that GStack rapidly gained traction, becoming one of the top 100 most starred open-source projects in GitHub history with approximately 105,000 stars in less than three months. This metric underscores a critical industry pivot: the 540,000 lines of code represented a 'Foxconn factory' approach, constraining a super-intelligent worker with excessive rules, whereas the new framework liberates that potential.
The software industry has long operated under a collective inertia where developers wrap models in layers of tests, validators, retry mechanisms, and background tasks. This architecture made sense when LLM calls were expensive and capabilities were limited, necessitating code to save on model invocations.
However, the economic equation has flipped. Models get cheaper every quarter while also growing smarter, which makes much of this 'babysitting' code obsolete. Tan argues that building rigid control systems around a model is akin to forcing a highly intelligent worker to wear shoe covers, perform group exercises, and stand on an assembly line. Every test and barrier adds another bar to a cage around a worker capable of accomplishing tasks far beyond human imagination. The logic of 2013, in which capability was measured in lines of code, is now steering engineers toward the wrong destination.
Tan's analysis holds that the new form of software artifact is drastically different from the traditional Rails application. Instead of 540,000 lines of owned code governed by tests, the replacement is an Agent composed of Markdown and a minimal amount of code. Tan's claim is that this structure offers the same capability with significantly improved readability, maintainability, and flexibility. Behavior is defined in instructions that can be edited in natural language rather than frozen in logic code written days ago. In his own codebase, Tan identified roughly 262,000 lines of application code and 276,000 lines of tests (about 538,000 lines in total), noting that, with the tests outnumbering the code they check, the audit committee was larger than the company itself. Woofun AI notes that this imbalance highlights a systemic failure in which engineers bet against the model's reliability by writing code to handle inputs, outputs, and retries that the model could manage autonomously.
The solution proposed involves a shift from writing code to designing capabilities through 'skill packs.' These are versioned, testable, and reusable Markdown modules that define intent, skills, and judgment. The process involves building a function with an Agent until it works, then issuing a command to 'skillify it.' This generates a comprehensive package including a Markdown skill explanation, minimal code, unit tests, LLM evaluations, integration tests, and a resolver for automatic invocation. This approach transforms ephemeral prompts into durable assets that compound over time. Unlike 'vibe coding,' which relies on fleeting feelings, skill packs come with rigorous testing coverage that allows them to withstand change. Woofun AI analysis suggests that these primitives are the equivalent of inventing stacks, heaps, and registers in the early days of CPU architecture, marking the beginning of a new era in Agent engineering.
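The parts of a skill pack listed above can be pictured as a small data structure with a resolver over it. What follows is only a minimal sketch under stated assumptions: the `SkillPack` class, its field names, the file paths, and the keyword-matching `resolve` function are all hypothetical illustrations, not GStack's actual format or API, and a real resolver would presumably let the model choose rather than count keywords.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SkillPack:
    """Hypothetical in-memory view of one skill pack (not GStack's real schema)."""
    name: str
    version: str
    skill_markdown: str  # natural-language instructions that define the behavior
    triggers: List[str] = field(default_factory=list)  # phrases the resolver matches
    code_files: List[str] = field(default_factory=list)
    unit_tests: List[str] = field(default_factory=list)
    llm_evals: List[str] = field(default_factory=list)
    integration_tests: List[str] = field(default_factory=list)

    def missing_parts(self) -> List[str]:
        """Names of the components from the article's list that are still empty."""
        parts = {
            "skill_markdown": self.skill_markdown,
            "triggers": self.triggers,
            "code_files": self.code_files,
            "unit_tests": self.unit_tests,
            "llm_evals": self.llm_evals,
            "integration_tests": self.integration_tests,
        }
        return [name for name, value in parts.items() if not value]


def resolve(request: str, packs: List[SkillPack]) -> Optional[SkillPack]:
    """Return the pack with the most trigger phrases found in the request, or None."""
    text = request.lower()
    best, best_hits = None, 0
    for pack in packs:
        hits = sum(1 for trigger in pack.triggers if trigger.lower() in text)
        if hits > best_hits:
            best, best_hits = pack, hits
    return best


# Illustrative packs; every name and path here is made up for the example.
judge = SkillPack(
    name="hackathon-judge",
    version="0.1.0",
    skill_markdown="# Judge hackathon submissions",
    triggers=["hackathon", "judge", "rank"],
    code_files=["fetch_repos.py"],
    unit_tests=["tests/test_scoring.py"],
    llm_evals=["evals/ranking_consistency.md"],
    integration_tests=["tests/test_end_to_end.py"],
)
draft = SkillPack(name="ios-test", version="0.0.1", skill_markdown="# Test iOS apps")
packs = [judge, draft]

chosen = resolve("Please judge and rank the hackathon entries", packs)
print(chosen.name if chosen else "no matching skill")
print(draft.missing_parts())
```

The `missing_parts` check reflects the article's distinction between a skill pack and a bare prompt: a pack counts as complete only once its tests and evaluations exist alongside the Markdown.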
A practical demonstration of this paradigm shift occurred during a GStack/GBrain hackathon with 85 project submissions. Previously, judging such an event would require days of manual review, but the Agent completed the task in approximately 30 minutes. The system analyzed code quality, conducted research on participants, watched demo videos, rated user interfaces, and ranked the teams without any new code being written by Tan. The result was converted into a reusable tarball, allowing anyone to apply the same logic to future hackathons. This inversion turns what used to be a full-fledged software project involving crawlers, scoring pipelines, and video processing into a Markdown-based solution built in an afternoon. The winner of the hackathon even contributed a piece of code that enabled GStack to test iOS apps on simulators and real devices, a functionality built by one person in less than 8 hours.
The economic implication of this shift requires a willingness to spend on tokens rather than hoarding them. Peter Steinberger, creator of OpenClaw, mentioned an annual token spend of around $1 million, a figure that might deter many but represents a strategic investment in living in 2028 while others remain in 2026. OpenAI has recognized this trend by offering YC companies a $2 million token credit limit via an uncapped SAFE. The logic is that transforming raw intelligence into tokens and then into usable outputs creates a first-mover advantage similar to the early internet days of 1994. Organizations that continue to haggle over collapsing prices risk being left behind, while those willing to 'token-max' can operate with a lead that will take years for competitors to close. A token investment of $100,000 today may cost only $100 by the end of 2028, making the upfront capital expenditure a high-value strategic move.
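The closing figures imply a thousandfold price drop, and it is worth seeing what yearly rate of decline that requires. The sketch below assumes a constant rate of decline and tries a few horizons, since the article does not state exactly how many years separate "today" from the end of 2028; the function name and the horizons are illustrative assumptions, not figures from Tan.

```python
def implied_annual_factor(cost_now: float, cost_later: float, years: float) -> float:
    """Constant yearly divisor that takes cost_now down to cost_later over `years`."""
    if cost_now <= 0 or cost_later <= 0 or years <= 0:
        raise ValueError("costs and years must be positive")
    return (cost_now / cost_later) ** (1.0 / years)


# The article's figures: $100,000 of tokens today versus $100 by the end of 2028.
total_drop = 100_000 / 100  # a 1000x overall decline

# Assumed horizons in years; the source does not pin this down.
for years in (2.0, 2.5, 3.0):
    factor = implied_annual_factor(100_000, 100, years)
    print(f"{years} years: prices must fall about {factor:.1f}x per year")
```

With a three-year horizon the claim works out to roughly a tenfold price decline every year, and a shorter horizon demands a steeper one, so the argument for spending now rests on that rate actually holding.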
Ultimately, the goal is to build liberating institutions rather than control systems. Tan contrasts the old 'Foxconn factory' model with Esalen, a place where people go to be dismantled and reshaped, dropping their armor to return more like themselves. The new software philosophy rejects assembly lines, foremen, and 6 a.m. whistles in favor of freedom. OpenClaw is described as a Ferrari for which the user must bring their own wrenches, an acknowledgment that the model is the engine, not the whole car. We are still in the Apple I moment, soldering breadboards and finishing the work ourselves. In this view, the rough edges of these systems are a feature, not a bug: they signify that the systems have not been locked up by excessive safety rails. The engineer who writes the least code often builds the most, as the bottleneck shifts from construction capacity to clarity, taste, and judgment about what is worth building.