The First AI Feature You Should Ship (And Why It's Probably Not a Chatbot)

Most teams reach for a chatbot when adding AI. Here's why your first feature should solve a smaller, sharper problem, and five patterns that consistently win.

Most product teams that get told to “add AI” start in the same place: they want to ship a chatbot. Maybe they call it a copilot, an assistant, or an agent. It's what investors ask about, it's what the CEO saw a competitor demo last quarter, and it's the version of AI everyone has seen. Of course you want one.

It's also the worst possible first AI feature in B2B SaaS.

The teams that ship something real in their first quarter almost always start with something a little embarrassing in its smallness. A form field that fills itself. A summary at the top of a long page. A search box that finally understands what the user meant. Narrow, boring, and in production beats ambitious and still in scoping every time.

This post is about why the chatbot instinct burns six months, what to ship instead, and how to do it in roughly six weeks.

The chatbot trap

I don't blame anyone for the chatbot pull. You've seen the demo, the board has asked about it twice, and sales has a deck that mentions it. The problem isn't the idea; it's what happens when you try to scope it.

Chat is the widest possible surface area. The input box invites every question, which means your team has to plan for every question. You'll spend the first month writing a one-pager, the second month arguing about which use cases are in scope, and the third month negotiating with security. By month four you're still pre-engineering, and the team that scoped “smarter ticket routing” instead is on their second iteration.

Then there's eval. Nobody enjoys talking about it, but it matters more for a chatbot than for anything else, because every output is different and “is this answer correct?” is hard to grade when the question itself could be anything. Most chatbots ship without a real eval set, so the team learns about wrong answers from support tickets, and someone is patching prompts at midnight by week three.

And the metric is fuzzy. What does success even mean? “Engagement” is meaningless on its own, “questions answered” can be gamed by counting any reply, and “customer satisfaction” takes months to show up in the data. Without a clean number you can point at in Monday's standup, the feature drifts.

I'm not saying never ship a chatbot. Plenty of products eventually do, and some do it well. But almost all of the ones that get there started somewhere smaller.

What makes a good first AI feature

There are four things to look for, and none of them are exciting. That's the point.

Frequency. The feature has to live somewhere your users go every day. If they hit it once a month, you'll get one signal a month, and you'll be guessing for a year. Daily use means you find out in a week whether the thing works.

Narrowness. One job, one screen. The AI does a specific thing, not a general thing. The more specific the job, the easier the eval, the cleaner the metric, and the faster the iteration loop.

A clean success metric. Before you scope anything, write down the number you'll use to decide if it worked, in one sentence. If you can't write that sentence, the scoping isn't done.

Reversibility. Your first AI feature should be easy to turn off. If it breaks, you flip a flag and the product still works the way it did yesterday. In practice that means the feature is an enhancement to an existing workflow, not a replacement for one.
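One way to picture the reversibility criterion above is a flag check wrapped around the AI call, with the existing workflow as the fallback. This is a minimal sketch, not a prescribed design: `suggest_with_fallback`, `flag_enabled`, and `ai_suggest` are hypothetical names, and a real product would read the flag from whatever feature-flag system it already uses.

```python
from typing import Callable, Optional


def suggest_with_fallback(
    flag_enabled: bool,
    ai_suggest: Callable[[str], str],
    text: str,
) -> Optional[str]:
    """Return an AI suggestion, or None so the caller renders the old workflow.

    All names here are hypothetical and exist only for illustration.
    """
    if not flag_enabled:
        # Flag off: the product behaves exactly as it did yesterday.
        return None
    try:
        return ai_suggest(text)
    except Exception:
        # Any failure in the AI path degrades to the existing workflow.
        return None
```

Because the function returns None in both the "off" and "failed" cases, the calling screen only needs one code path for "no suggestion available".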

Five patterns that consistently work

When teams stick to those four criteria, the feature usually lands in one of five shapes. None of them are new, and all of them ship.

1. Smart defaults

Pre-fill a field the user was about to fill in. They accept it with a click or override it with a few keystrokes. When you're wrong, nobody notices, because it just looks like a field someone typed into. When you're right, you save a real minute every time.

2. In-context summarization

A short summary at the top of a long page. This works almost anywhere your users skim long content every day: support threads, customer notes, contracts, meeting transcripts, audit logs.

3. Better search

Replace keyword search with semantic search across the customer's own data. Users type the question they actually have, not the keyword they think will match.
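At its core, the semantic search described above ranks documents by how close their embedding vectors are to the query's vector. The sketch below shows only that ranking step with cosine similarity. It assumes the embeddings have already been computed by some model of your choice, and the tiny two-dimensional vectors are invented for illustration.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_vec: List[float], doc_vecs: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Return (doc_id, score) pairs, most similar first."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up vectors standing in for real embeddings.
docs = {"refund-policy": [0.9, 0.1], "api-changelog": [0.0, 1.0]}
results = rank([1.0, 0.0], docs)
```

With real data the vectors would have hundreds of dimensions and usually live in a vector index rather than a dictionary, but the ranking idea is the same.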

4. Inline autocomplete

Suggest text in a field the user is already typing in. It's a familiar interaction, it's one keystroke away from being ignored, and the metric is clean: acceptance rate.
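Acceptance rate, the metric named above, is simple enough to compute from logged events. The sketch below assumes a hypothetical `SuggestionEvent` record with `shown` and `accepted` fields; the event shape is an assumption, not something the pattern requires.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SuggestionEvent:
    """Hypothetical log record for one autocomplete opportunity."""
    shown: bool      # a suggestion was actually displayed
    accepted: bool   # the user accepted it


def acceptance_rate(events: List[SuggestionEvent]) -> float:
    """Accepted suggestions divided by shown suggestions (0.0 if none were shown)."""
    shown = [e for e in events if e.shown]
    if not shown:
        return 0.0
    return sum(1 for e in shown if e.accepted) / len(shown)
```

Dividing by suggestions shown rather than by all events keeps the number honest: a field where the model stayed silent should not count for or against it.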

5. Classification and routing

Tag, sort, or assign incoming items automatically. Tickets routed to the right team. Documents filed in the right folder. Leads scored on fit.
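A common way to keep routing like this safe is to act on the model's label only when it is confident, and send everything else to the queue a human already watches. The sketch below assumes a hypothetical `classify` callable that returns a label and a confidence between 0 and 1; the 0.8 threshold and the `manual_triage` queue name are placeholders to tune against your own eval set.

```python
from typing import Callable, Tuple


def route(
    item_text: str,
    classify: Callable[[str], Tuple[str, float]],
    threshold: float = 0.8,
) -> str:
    """Return the destination queue for an incoming item.

    `classify` is a stand-in for whatever model call you use; it is assumed
    to return (label, confidence).
    """
    label, confidence = classify(item_text)
    if confidence >= threshold:
        return label
    # Low confidence: fall back to the existing human triage queue.
    return "manual_triage"
```

The fallback also keeps the feature an enhancement to the existing workflow rather than a replacement for it.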

A six-week plan

The work splits cleanly into three two-week stretches. You can run it with three people: a PM, a full-stack engineer, and a designer.

Weeks one and two. Pick the pattern, define the job, and write the eval set. The eval set is the part most teams skip and then regret. Twenty to fifty real examples, pulled from real customer data with permission, labeled with what the correct answer would have been.
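An eval set like the one described above can start as a list of (input, expected answer) pairs and a loop. The sketch below is one minimal shape for it. The four examples and the keyword-based `baseline_predict` are invented placeholders; in practice the examples would be your labeled customer data and the predictor would be your actual model call.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (input text, expected label)


def run_eval(predict: Callable[[str], str], examples: List[Example]):
    """Return (accuracy, failures), where each failure is (input, expected, actual)."""
    failures = []
    for text, expected in examples:
        actual = predict(text)
        if actual != expected:
            failures.append((text, expected, actual))
    accuracy = 1 - len(failures) / len(examples) if examples else 0.0
    return accuracy, failures


# Placeholder examples; real ones come from labeled customer data.
EXAMPLES: List[Example] = [
    ("I was charged twice, please refund me", "billing"),
    ("The export button does nothing", "technical"),
    ("Can I get an invoice for March?", "billing"),
    ("Login page times out", "technical"),
]


def baseline_predict(text: str) -> str:
    """Deliberately naive stand-in for a model call."""
    return "billing" if "refund" in text.lower() else "technical"


accuracy, failures = run_eval(baseline_predict, EXAMPLES)
```

Rerunning the same function after every prompt change is what tells you whether the change helped, and the list of failures tells you where to look next.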

Weeks three and four. Build it, and ship it to internal users only. Use it every day. Notice what's annoying and fix what's wrong.

Weeks five and six. Roll out to five or ten percent of customers and instrument everything: adoption, the success metric, cost per request, and latency. Let it run for a week. If the metric holds, expand. If it doesn't, decide whether to iterate or kill.
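If you don't already have a feature-flag service that handles percentage rollouts, one common approach is to hash the customer ID into a stable bucket from 0 to 99. This is a sketch under that assumption: `in_rollout`, the `smart-defaults` feature name, and the `cust-` ID format are all hypothetical.

```python
import hashlib


def in_rollout(customer_id: str, feature: str, percent: int) -> bool:
    """Deterministically place a customer in or out of a percentage rollout.

    The same customer always lands in the same bucket for a given feature,
    so raising `percent` only adds customers and never removes any.
    """
    key = f"{feature}:{customer_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest[:8], 16) % 100  # stable bucket in 0..99
    return bucket < percent


# Example: which of these made-up customers see the feature at 10 percent?
enabled = [c for c in (f"cust-{i}" for i in range(1000))
           if in_rollout(c, "smart-defaults", 10)]
```

Including the feature name in the hash keeps the same customers from being the early group for every feature you roll out.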

Three ways this goes wrong

Picking the demo, not the workflow. The feature that wins the sales meeting often loses with real users because they hit it once a quarter, not every day.

Skipping eval to “move fast.” You will not move fast. You'll spend the second month redoing the first month, because you can't tell which of your prompt changes actually helped.

Over-polishing UI before the feature works. Beautiful UI on something broken is more confusing than ugly UI on something useful.

Earn the right to ship the second one

The first AI feature you ship isn't the one that wins the market; it's the one that earns the right to ship the second one. Make it small, make it daily, and make the success metric something you can write in a single sentence. Then ship it.

The bigger surfaces (chat, agents, copilots) can come later, after your team understands how your customers actually use AI and you've built the muscle to scope something open-ended without it eating two quarters.