The AI application layer faces a critical existential question as foundational model capabilities accelerate: can startups survive if incumbents like OpenAI and Anthropic control both the underlying intelligence and distribution channels? a16z partner Joe Schmidt addresses this anxiety by categorizing opportunities into two distinct paths. The first, termed the "Yellow Brick Road," encompasses horizontal tasks like code generation, writing, and image creation where raw model performance directly dictates product quality. The second path, described as "other parts of Oz," involves vertical scenarios deeply embedded in industry-specific processes, complex workflows, and compliance governance. Schmidt asserts that the only viable long-term strategy for startups lies in the latter, where value is derived not from the model itself but from the scaffolding that ensures trustworthy, compliant, and integrated business outcomes.
The large model labs are themselves signaling that generic "AI colleague" products cannot solve complex enterprise problems on their own. They have announced billions of dollars in investments for frontline deployment joint ventures focused on configuring and customizing models for specific enterprises. This strategic pivot acknowledges that a single model release cannot resolve the messy realities of customer data, multi-person approval chains, and edge cases. Data compiled by Woofun AI indicates that enterprises are unwilling to pay for smarter chat windows; instead, they demand systems that are accountable for business results and that can handle migration, routing, and cost optimization as models evolve. The core judgment is that while underlying models become increasingly replaceable, the data, processes, and operational memory built up around specific industries remain irreplaceable.
Startups attempting to build on the Yellow Brick Road face a structural disadvantage. By connecting high-performance models to off-the-shelf connectors like Google Drive, Slack, or Salesforce, these companies replicate the exact "model plus tool invocation" pattern that large labs are executing with superior margins and brand power. Even if a startup outperforms Codex or Claude Code, the labs possess massive distribution capabilities and the architectural authority to define problem scopes. Woofun AI notes that without underlying subagents, custom configurations, or exclusive distribution channels, application companies following this playbook are likely heading toward obsolescence. The labs' ability to decide which problems their products solve gives them an insurmountable advantage in horizontal, low-step-count tasks.
Conversely, the "other parts of Oz" offer significant defensive moats through deep vertical integration. These companies build agent-centric experiences where models are woven into complex automation and integrated networks, often requiring deterministic outcomes that cannot tolerate fuzziness. This work frequently involves legacy systems and key business results, necessitating a level of focus that horizontal platforms cannot achieve. Two compounding flywheels emerge here: a cross-customer flywheel, where patterns compound as more variants of an issue are observed, and an intra-customer flywheel, where the reasons behind specific decisions and unstated exceptions reveal themselves only through real user-system interaction. Even without sharing customer data, an application company can use pattern recognition to guide architecture design, creating a knowledge base that a newcomer cannot replicate with a first-time AI launch.
The economic logic further favors vertical specialists who can optimize model usage for specific tasks. Sending every query to the most advanced model is the fastest way to turn gross margins negative. Companies in vertical sectors route queries between different model tiers, using cutting-edge models only for the hardest tasks while deploying smaller, fine-tuned models for proven areas. Woofun AI analysis suggests that this approach allows vertical players to achieve the lowest dollar cost at the required intelligence level, a feat structurally impossible for large labs that price at the floor level of intelligence. These companies also absorb the costs of re-evaluating new models, recalibrating prompts for boundary cases, and ensuring production continuity, giving customers the best intelligence available on the market without disruption.
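The routing idea described above can be sketched in a few lines. This is a minimal illustration, not a description of any company's actual system: the tier names, capability levels, and per-token prices are all hypothetical, and it assumes each query already carries an estimate of the capability it requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str                  # hypothetical label, not a real product
    capability: int            # 1 = small fine-tuned model, 3 = frontier model
    cost_per_1k_tokens: float  # illustrative USD figure, not a real price


# Assumed tiers and prices, for illustration only.
TIERS = (
    ModelTier("small-finetuned", 1, 0.0002),
    ModelTier("mid-general", 2, 0.003),
    ModelTier("frontier", 3, 0.03),
)


def route(required_capability: int, tiers=TIERS) -> ModelTier:
    """Return the cheapest tier that meets the required capability."""
    eligible = [t for t in tiers if t.capability >= required_capability]
    if not eligible:
        raise ValueError(f"no tier meets capability {required_capability}")
    return min(eligible, key=lambda t: t.cost_per_1k_tokens)


def workload_cost(required_levels, tokens_per_query=1000, always_frontier=False):
    """Total cost of a workload, routed per query or pinned to the top tier."""
    top = max(TIERS, key=lambda t: t.capability)
    total = 0.0
    for level in required_levels:
        tier = top if always_frontier else route(level)
        total += tier.cost_per_1k_tokens * tokens_per_query / 1000
    return total


if __name__ == "__main__":
    # Assumed mix: mostly routine queries, a few hard ones.
    workload = [1] * 80 + [2] * 15 + [3] * 5
    print("routed:", workload_cost(workload))
    print("always frontier:", workload_cost(workload, always_frontier=True))
```

Under these assumed numbers the routed workload costs a small fraction of sending everything to the top tier. In practice the hard part is estimating the required capability per query, which this sketch takes as given.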
Regulatory complexity and the need for a control plane further solidify the vertical advantage. A control plane manages permissions, auditing, and agent actions, built upon guardrails that vary entirely across industries and job types. Vertical companies own the tools, workflows, and data end-to-end, allowing them to provide deterministic outcomes and absorb regulatory burdens such as HIPAA in healthcare or SEC rules in finance. Horizontal players cannot credibly assume this responsibility without fragmenting into hundreds of verticals. CIOs require partners who can contractually commit to bearing compliance-handling responsibility, a role that demands a team entrenched in a specific customer archetype to understand workflows, edge cases, and regulatory requirements long-term.
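As a rough sketch of what such a control plane does, the example below checks an agent's requested action against a permission table and a list of guardrails, and records every decision in an audit log. All names here (`ControlPlane`, `intake_agent`, the `patient_id` rule) are hypothetical, and the guardrail is a stand-in for an industry-specific policy, not an actual HIPAA or SEC requirement.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, List, Optional, Set

# A guardrail inspects an action and returns a denial reason, or None to allow.
Guardrail = Callable[[str, dict], Optional[str]]


def no_patient_id_in_email(action: str, payload: dict) -> Optional[str]:
    """Hypothetical healthcare-style rule, for illustration only."""
    if action == "send_email" and "patient_id" in payload:
        return "payload contains patient_id"
    return None


@dataclass
class ControlPlane:
    permissions: Dict[str, Set[str]]   # agent name -> allowed actions
    guardrails: List[Guardrail]        # industry-specific checks
    audit_log: List[dict] = field(default_factory=list)

    def authorize(self, agent: str, action: str, payload: dict) -> bool:
        """Decide whether the agent may act, and log the decision either way."""
        reason = None
        if action not in self.permissions.get(agent, set()):
            reason = "action not permitted for agent"
        else:
            for rule in self.guardrails:
                reason = rule(action, payload)
                if reason:
                    break
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "agent": agent,
            "action": action,
            "allowed": reason is None,
            "reason": reason,
        })
        return reason is None


if __name__ == "__main__":
    cp = ControlPlane(
        permissions={"intake_agent": {"read_record", "send_email"}},
        guardrails=[no_patient_id_in_email],
    )
    print(cp.authorize("intake_agent", "read_record", {"patient_id": "p1"}))
    print(cp.authorize("intake_agent", "send_email", {"patient_id": "p1"}))
    print(cp.authorize("intake_agent", "delete_record", {}))
```

The point of the design is that denied actions are logged as carefully as allowed ones, since an auditor usually wants to see both. The guardrail list is the part that would differ from one industry or job type to the next.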
Practical execution of this strategy requires starting with outcomes customers truly care about, such as generating sales leads or underwriting insurance policies. Prabhav Jain of 11x emphasizes that roughly half of any real-world workflow consists of non-AI tasks where large labs have no advantage over focused engineering teams. Domain knowledge often resides outside general training data and has to be built bottom-up from vertical expertise. For instance, qualifying an inbound lead requires training the AI to understand what constitutes a good sales conversation in a specific industry, a capability that compounds over time. As market dynamics shift, such as recipients' growing ability to distinguish AI-written emails, the application company's ability to evolve its workflows becomes a competitive moat, with response rates that can increase fourfold.
The ultimate metric for success in these vertical sectors is not benchmark scores but customer P&L. Aman Gour of FurtherAI highlights that in insurance, intelligence exists within the workflow itself, scattered across standard operating procedures and risk preferences that models cannot directly read. Tool-like products may generate revenue, but system-level products that replace human effort command high ACV and are far more defensible. The test is simple: if a large lab launches a competing product and customers still need yours, it is a system; if not, it is merely a tool. The next generation of enterprise software will be built beyond the Yellow Brick Road, where working systems, data capture, and governance coalesce into a core experience that customers cannot live without, rendering the underlying models replaceable but the system indispensable.