Founder mistake
The wrong question is "what can the model do?"
The tempting way to start this kind of company is with capability. A founder sees a model extract data, draft a response, classify documents, call an API, or complete a short task, and then goes looking for a market.
That sequence is backwards. The better starting point is economic. Find expensive work that customers already buy. Check whether the work has enough structure to systematize, enough mess to require a specialist, and enough value that the buyer cares about the finished result more than the tool used to produce it. If the buyer cannot name the result, the market is not ready.
Y Combinator's 2026 Requests for Startups frame this as a broader shift: AI is becoming the foundation for companies that rebuild software and services, not just a feature inside existing products.[1] Emergence Capital makes the category point more directly: outcome services powered by AI and human expertise became practical only after modern foundation models.[2]
Why now
The adoption gap creates room for service companies
AI adoption is broad, but operational transformation is still uneven. McKinsey's 2025 State of AI survey says almost all respondents report AI use in at least one function, but most organizations are still early in scaling and capturing enterprise-level value.[3] Stanford HAI's 2026 AI Index reports organizational AI adoption at 88 percent, while agent deployment remains early relative to the excitement around autonomous systems.[4]
Menlo Ventures estimates that enterprise generative AI spend reached $37 billion in 2025, with the largest share going to the application layer.[5] That spending says buyers believe the technology matters. The scaling gap says many of them still need someone else to turn capability into governed work.
That is the opening. Sell the customer the resolved case, prepared packet, completed filing, reconciled record, validated report, approved creative variant, or monitored workflow. Do that before asking them to operate the machinery themselves. Done-for-you beats do-it-yourself when the workflow is painful.
Framework
How should founders score an AI services market?
A market does not need to be perfect. It does need to pass enough of the following filters that the company can compound operations instead of becoming a custom agency.
- The buyer already pays a vendor, agency, BPO, consultant, broker, or internal operations team for the work.
- The customer can describe the desired output without adopting new category language first.
- The work has recurring steps: intake, classification, extraction, drafting, routing, review, submission, and follow-up.
- Exceptions exist, but they can be grouped and escalated instead of reinvented for every customer.
- The work crosses email, PDFs, spreadsheets, portals, CRMs, ERPs, ticketing systems, or regulated forms.
- The pain is not one missing feature. It is the coordination tax between systems and teams.
- The company can see enough inputs, historical examples, decisions, review traces, and outcomes to improve the system.
- Every completed unit creates learning material for better routing, prompts, extraction, QA, or evaluation.
- The buyer can verify whether the result was accepted, resolved, approved, recovered, filed, shipped, or escalated.
- The business impact is closer to revenue, cost reduction, compliance, cycle time, or risk than to "more productivity."
- Errors matter enough that review, audit trails, source evidence, and escalation rules are valued.
- Human-in-the-loop oversight helps adoption while the automation ratio improves over time.
To apply these filters to a specific market, work through five steps:
- Name the paid workflow. Write the exact work the buyer already pays someone to complete.
- Map the repeatable steps. Separate intake, classification, extraction, drafting, review, submission, and follow-up.
- Score the handoff pain. Count the systems, documents, people, and approval loops involved in one completed unit.
- Define the outcome metric. Pick one result the buyer can verify: accepted, resolved, recovered, approved, filed, or shipped.
- Estimate review burden. Decide where humans must approve, audit, escalate, or own exceptions during the first version.
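The filters above can be turned into a rough scorecard. The sketch below is only an illustration in Python, not a method taken from the cited sources: the filter names paraphrase the checklist, the equal weighting is an assumption, and `MarketScore` and the example market are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical filter names that paraphrase the checklist above.
FILTERS = [
    "existing_paid_spend",
    "nameable_output",
    "recurring_steps",
    "bounded_exceptions",
    "cross_system_handoffs",
    "visible_process_data",
    "verifiable_outcome",
    "hard_business_impact",
    "errors_matter",
]


@dataclass
class MarketScore:
    """Tracks which filters a candidate market passes."""

    name: str
    passed: set[str] = field(default_factory=set)

    def score(self) -> float:
        """Share of filters passed, from 0.0 to 1.0 (equal weights are an assumption)."""
        return len(self.passed & set(FILTERS)) / len(FILTERS)

    def missing(self) -> list[str]:
        """Filters the market has not passed, in checklist order."""
        return [f for f in FILTERS if f not in self.passed]


# Made-up assessment of one candidate market, for illustration only.
claims = MarketScore(
    name="insurance claims intake",
    passed={
        "existing_paid_spend",
        "nameable_output",
        "recurring_steps",
        "bounded_exceptions",
        "cross_system_handoffs",
        "verifiable_outcome",
        "errors_matter",
    },
)

print(f"{claims.name}: {claims.score():.0%} of filters passed")
print("Still to prove:", ", ".join(claims.missing()))
```

A market does not need a perfect score. The list returned by `missing()` is the more useful output, because it names what the first customer conversations have to prove.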
Where to look
Which markets are worth testing first?
The best starting markets often look operationally dull. Examples include insurance claims, prior authorization, revenue cycle work, compliance packets, tax workflows, audit preparation, invoice reconciliation, freight documents, legal intake, security questionnaires, customer support escalation, field-service scheduling, RFP response, and marketing execution.
They are attractive because the buyer already has pain, process, budget, and a tolerance for vendors. Bessemer's work on AI services highlights similar assessment criteria: team quality, platform stickiness, time-to-value, margins, distribution, pricing strategy, and serviceable addressable market.[6]
The pattern is simple. Find work that is too important to ignore, too repetitive to stay fully manual, too messy for a generic SaaS workflow, and too risky for a black-box bot. Boring work is often the best wedge.
Wedge
Should the wedge be vertical or horizontal?
Most founders should start vertical. A narrow market gives the company shared terminology, common documents, recurring edge cases, clearer buyer pain, and a smaller evaluation surface. It also makes sales easier: the buyer believes you understand their world.
Horizontal can work when the workflow itself is the distribution wedge. Voice agents are a useful example: a16z argues that voice can be a wedge for broader AI application companies because it is a frequent, information-dense interface and can unlock downstream workflows.[7] But even then, the strongest companies usually become vertical quickly, because trust, integration, and edge cases are domain-specific.
The practical rule is narrow. Start with one painful workflow in one market. Become unusually good at completing it. Then expand to the adjacent workflows the buyer already expects you to own. Do not widen the product before the first workflow is trusted.
Pricing
Can the market support outcome pricing?
A good services market lets the company price the result. Per completed case. Per submitted packet. Per resolved ticket. Per approved filing. Per recovered dollar. Per monitored workflow. Per accepted report.
Seat pricing is usually a warning sign. It can work as a transition model, but it often means the customer still sees a tool rather than a service. Pure cost-plus pricing is another warning sign: if every automation gain becomes a discount, the company never captures operating leverage.
The pricing question is blunt: if your AI system makes delivery twice as fast, who keeps the upside? In a strong market, the buyer values speed, reliability, compliance, and proof enough that the vendor can keep a meaningful share of the efficiency gain.
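The "who keeps the upside?" question can be worked through with simple arithmetic. The sketch below uses made-up numbers and assumes that delivering twice as fast halves the delivery cost per unit. The prices, the 50 percent markup, and the function names are illustrative assumptions, not figures from any cited source.

```python
def cost_plus_price(delivery_cost: float, markup: float) -> float:
    """Price charged when the vendor bills delivery cost plus a fixed markup."""
    return delivery_cost * (1 + markup)


def margin_per_unit(price: float, delivery_cost: float) -> float:
    """Gross margin in dollars on one completed unit."""
    return price - delivery_cost


# Illustrative numbers only.
baseline_cost = 40.0   # delivery cost per completed case before automation gains
improved_cost = 20.0   # assumed cost after delivery becomes twice as efficient
outcome_price = 60.0   # fixed price per completed case under outcome pricing
markup = 0.5           # 50 percent markup under cost-plus pricing

# Outcome pricing: the price stays fixed, so the vendor keeps the efficiency gain.
outcome_before = margin_per_unit(outcome_price, baseline_cost)
outcome_after = margin_per_unit(outcome_price, improved_cost)

# Cost-plus pricing: the price falls with cost, so the gain becomes a discount.
cost_plus_before = margin_per_unit(cost_plus_price(baseline_cost, markup), baseline_cost)
cost_plus_after = margin_per_unit(cost_plus_price(improved_cost, markup), improved_cost)

print(f"Outcome pricing margin per unit: {outcome_before:.2f} -> {outcome_after:.2f}")
print(f"Cost-plus margin per unit:       {cost_plus_before:.2f} -> {cost_plus_after:.2f}")
```

Under these assumptions, the same efficiency gain doubles the per-unit margin with outcome pricing and halves it with cost-plus pricing. In practice a vendor would likely share part of the gain with the buyer, which sits between the two cases.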
Metrics
What metrics prove the service can scale?
Measure the company like operations, not like demo software. The early dashboard should show cycle time, cost per completed unit, review minutes per unit, error rate, escalation rate, acceptance rate, gross margin, and customer-visible outcome proof.
The company is getting healthier when the review burden falls without lowering quality. Exceptions become taxonomies. Customers trust the output faster. Every completed unit improves the next one. If review minutes never fall, the service is still mostly labor.
That is also how to avoid the common trap: using people to hide product gaps. Human review is useful when it shrinks as the system improves. It is a margin sink when every customer remains a special case.
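The dashboard described above can be computed from a simple log of completed units. The sketch below is a minimal illustration: the `CompletedUnit` fields, the two sample periods, and the health check are assumptions for the example, and error rate is omitted because it depends on how each service defines an error.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class CompletedUnit:
    """One finished case, packet, ticket, or filing (hypothetical record shape)."""

    price: float           # revenue for the unit
    cost: float            # delivery cost, including review labor
    cycle_hours: float     # intake to accepted result
    review_minutes: float  # human review time spent on the unit
    accepted: bool         # buyer accepted the result
    escalated: bool        # unit needed escalation


def summarize(units: list[CompletedUnit]) -> dict[str, float]:
    """Aggregate the early-dashboard metrics for one period."""
    n = len(units)
    revenue = sum(u.price for u in units)
    cost = sum(u.cost for u in units)
    return {
        "cycle_hours": mean(u.cycle_hours for u in units),
        "cost_per_unit": cost / n,
        "review_minutes_per_unit": mean(u.review_minutes for u in units),
        "acceptance_rate": sum(u.accepted for u in units) / n,
        "escalation_rate": sum(u.escalated for u in units) / n,
        "gross_margin": (revenue - cost) / revenue,
    }


def review_burden_is_falling(earlier: dict[str, float], later: dict[str, float]) -> bool:
    """True when review minutes per unit drop and acceptance does not."""
    return (
        later["review_minutes_per_unit"] < earlier["review_minutes_per_unit"]
        and later["acceptance_rate"] >= earlier["acceptance_rate"]
    )


# Made-up sample data for two periods.
month_1 = [
    CompletedUnit(60, 40, 48, 30, True, False),
    CompletedUnit(60, 44, 52, 34, False, True),
]
month_2 = [
    CompletedUnit(60, 30, 24, 12, True, False),
    CompletedUnit(60, 26, 20, 8, True, False),
]

before, after = summarize(month_1), summarize(month_2)
print("Review burden falling without quality loss:", review_burden_is_falling(before, after))
```

The health check encodes the test from the text: review minutes should fall while acceptance holds. If it keeps returning `False` as volume grows, the service is still mostly labor.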
Avoid
Which markets should founders avoid at first?
Avoid markets where the buyer does not already pay for the work. Be careful when the output is subjective, the workflow changes every time, the company cannot access enough process data, or the liability exposure is too severe for an early team. Start where proof is visible and mistakes are recoverable.
Also avoid markets where the only advantage is a model call. If a foundation model improvement makes the company less necessary, the moat is weak. Better models should expand the service surface, reduce cost, and improve QA; they should not erase the reason the company exists.
| Signal | Prefer this | Avoid this |
|---|---|---|
| Budget | Buyer already pays for completed work. | Buyer must be educated to value the category. |
| Workflow | Steps repeat, with bounded exceptions. | Every project is bespoke consulting. |
| Data | Inputs, examples, decisions, and outcomes are visible. | The vendor cannot access enough process evidence. |
| Pricing | Value maps to a resolved case or accepted output. | The only obvious price is a seat or model-cost markup. |
ContentOS brief
The brief for this article
This article came from the AI services hub plan and the Workspace ContentOS packet RUN-60. The target keyword is "AI services market selection." The target prompts include:
- What markets are best for AI services?
- How do I choose an AI service startup idea?
- What makes a service business suitable for AI automation?
- Which outsourced workflows can become AI-powered services?
- AI services vs SaaS: how should founders choose a market?
- What metrics prove an AI service can scale?
Per the Humanswith.ai Workspace FAQ rule, the FAQ below is prompt-derived and has eight questions so the article is easier for answer engines to cite.
Sources
Footnotes and research sources
[1] Y Combinator, "Requests for Startups." YC's 2026 category framing for startups rebuilding software, services, and silicon around AI as the foundation.
[2] Emergence Capital, "Emergence playbook for outcome services." Spring 2026 playbook for outcome services that combine AI and human expertise.
[3] McKinsey, "The state of AI in 2025." Survey evidence for broad AI usage, agent experimentation, and the early state of scaled enterprise value capture.
[4] Stanford HAI, "The 2026 AI Index Report." Adoption context, including organizational AI adoption at 88 percent and the still-early state of agents in production.
[5] Menlo Ventures, "2025: The State of Generative AI in the Enterprise." Enterprise generative AI spending estimates and application-layer context.
[6] Bessemer, "Reinventing IT services in the age of AI." Framework for assessing AI services companies through time-to-value, margins, pricing, distribution, and serviceable market.
[7] a16z, "AI Voice Agents: 2025 Update." Reference for workflow wedges, vertical-specific execution, and why voice can unlock broader AI application platforms.
FAQ
AI services market selection FAQ
What markets are best for AI services?
The best markets already have outsourced spend, repeatable workflows, messy handoffs, accessible process data, clear outcome value, and trust requirements that justify review and audit trails.
How do I choose an AI service startup idea?
Start with a paid workflow, not a model demo. Look for a painful service line item where customers can verify the completed result and where each completed unit improves the operating system.
What makes a service business suitable for AI automation?
A service is suitable when it has repeatable substeps, recurring inputs, historical examples, reviewable outputs, bounded exceptions, and enough volume that automation improves cost, speed, and consistency.
Which outsourced workflows can become AI-powered services?
Good candidates include claims, compliance packets, revenue cycle work, tax preparation, audit support, invoice reconciliation, customer support escalation, RFP response, and marketing execution workflows.
AI services vs SaaS: how should founders choose a market?
Choose an outcome service when buyers want the work done for them. Choose SaaS when buyers want to own the workflow internally and have the team, data, and incentive to operate the tool.
Should founders start with a vertical or horizontal AI service?
Most should start vertical because domain-specific documents, edge cases, trust, and buying language make delivery easier. Horizontal can work when the workflow itself is the distribution wedge, as with voice agents.
How should founders evaluate service-as-software opportunities?
Evaluate the opportunity through outcome value, repeatability, gross margin path, review burden, data access, time-to-value, distribution, and whether pricing can map to the completed result.
What metrics prove an AI service can scale?
Track cycle time, cost per completed unit, review minutes per unit, acceptance rate, error rate, escalation rate, gross margin, and how much each workflow improves after more completed cases.