AI Chatbot Development

Chatbots that solve a specific problem. Not novelty toys, not ChatGPT wrappers — a system that knows your business and does the work.

An AI chatbot is software that answers questions in writing, whether on your website, inside your app, or in a help widget. It works out what someone typed, looks up the real answer in your own content, and either replies or hands the conversation to a person.

Most chatbots are novelty toys. They know nothing about your company, so they make things up, get the refund policy wrong, and frustrate the customer into emailing you anyway. I build chatbots that solve one specific problem — deflecting support tickets, qualifying leads, answering your team's repeated internal questions — grounded in your real data and tested on real cases before they go live.

What can an AI chatbot do?

Customer support deflection — The everyday questions ("where is my order", "how do I reset my password", "what is your refund policy") get answered instantly, accurately, with citations to your real documents. When the bot is not sure, it escalates to a person instead of guessing.

Lead qualification — A chat widget on your landing pages asks the right questions, works out whether the visitor is a real prospect, books a call, and writes the lead to your CRM. It replaces a contact form with something that actually converts.

Internal knowledge assistants — An HR-policy bot, a sales-playbook bot, an engineering-runbook bot. Grounded in your wiki, your tickets, and your documents, so your senior people stop answering the same question two hundred times a year.

Answers from private documents — Contracts, technical manuals, regulatory filings. The bot answers with quotes and page references, and the documents never leave your control, so nothing leaks to a third-party SaaS tool.

What makes a chatbot production-grade?

Tested on your real examples — Every chatbot gets a set of real questions with known-correct answers, built from your actual conversations. Accuracy is measured before launch and re-measured on every change. No "it seems to work" — actual numbers you can see.
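As a rough illustration of that measurement step, here is a minimal Python sketch. It is not the production harness: the names EvalCase, accuracy, and toy_bot are hypothetical, and the simple "answer contains the expected phrase" check stands in for whatever grading a real project would use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    question: str
    expected: str  # a phrase the correct answer must contain (assumed grading rule)


def accuracy(bot: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases whose answer contains the expected phrase."""
    if not cases:
        raise ValueError("test set is empty")
    passed = sum(
        1 for case in cases
        if case.expected.lower() in bot(case.question).lower()
    )
    return passed / len(cases)


def toy_bot(question: str) -> str:
    """Hypothetical stand-in for the real chatbot."""
    if "refund" in question.lower():
        return "Refunds are available within 30 days of purchase."
    return "I am not sure. Let me connect you with a person."


cases = [
    EvalCase("What is your refund policy?", "30 days"),
    EvalCase("How do I reset my password?", "reset link"),
]
score = accuracy(toy_bot, cases)
print(f"accuracy: {score:.0%}")
```

Re-running the same function after every prompt, retrieval, or model change is what turns "it seems to work" into a number that can be compared over time.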

It knows when to stay quiet — For factual questions the bot answers only from your real content, with citations. If it does not know, it says so and offers a person — never a confident wrong answer. On support questions a wrong answer costs more than no answer.
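One simple way to picture this behaviour is a confidence gate in front of the answer. The sketch below is an assumption-laden illustration, not the actual implementation: Passage, answer_or_escalate, the 0-to-1 relevance score, and the 0.75 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Passage:
    source: str   # where the text came from, used as the citation
    text: str
    score: float  # retrieval relevance in [0, 1]; this scale is an assumption


FALLBACK = "I am not sure about that. Would you like me to connect you with a person?"


def answer_or_escalate(passages: list[Passage], min_score: float = 0.75) -> dict:
    """Answer from the best passage, or escalate when nothing is relevant enough."""
    best: Optional[Passage] = max(passages, key=lambda p: p.score, default=None)
    if best is None or best.score < min_score:
        # No grounded answer available: say so and offer a person.
        return {"answer": FALLBACK, "citation": None, "escalate": True}
    return {"answer": best.text, "citation": best.source, "escalate": False}


confident = answer_or_escalate(
    [Passage("refund-policy.md", "Refunds are available within 30 days.", 0.91)]
)
unsure = answer_or_escalate(
    [Passage("shipping.md", "Orders ship within 2 business days.", 0.32)]
)
print(confident)
print(unsure)
```

The threshold is the kind of value that would be tuned against the test set rather than picked by hand.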

Staged actions, with a person in the loop — Reading something (answering a question, looking up an order) happens automatically. Changing something (updating the CRM, sending an email) is reviewed or rate-limited. High-stakes actions — a refund, a subscription change — always go to a human.
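The three tiers above can be sketched as a small policy table. This is only an illustration under stated assumptions: the action names, the decide function, and the limit of 20 writes per hour are hypothetical, and a real build would take them from the project's own requirements.

```python
from enum import Enum


class Risk(Enum):
    READ = "read"                # answering, looking things up
    WRITE = "write"              # changes something, reviewed or rate-limited
    HIGH_STAKES = "high_stakes"  # always goes to a human


# Hypothetical action names, for illustration only.
ACTION_RISK = {
    "lookup_order": Risk.READ,
    "update_crm": Risk.WRITE,
    "send_email": Risk.WRITE,
    "issue_refund": Risk.HIGH_STAKES,
    "change_subscription": Risk.HIGH_STAKES,
}


def decide(action: str, writes_this_hour: int, write_limit: int = 20) -> str:
    """Return what to do with a requested action."""
    # Unknown actions are treated as high-stakes, the safest default.
    risk = ACTION_RISK.get(action, Risk.HIGH_STAKES)
    if risk is Risk.READ:
        return "execute"
    if risk is Risk.WRITE:
        return "execute" if writes_this_hour < write_limit else "queue_for_review"
    return "send_to_human"


print(decide("lookup_order", writes_this_hour=0))
print(decide("update_crm", writes_this_hour=25))
print(decide("issue_refund", writes_this_hour=0))
```

Defaulting unknown actions to the human path means a newly added integration cannot act on its own until someone classifies it.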

Every conversation visible — Each chat is traced end to end: what was asked, what the bot retrieved, what it answered, how long it took, what it cost. You debug by reading logs, not guessing — and per-conversation cost is tracked so spend never surprises you.
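To make the idea of a trace concrete, here is a minimal sketch of one record per conversation. Everything in it is an assumption for illustration: the Trace fields, the traced helper, the whitespace token count, and the per-1,000-token prices are placeholders, not real provider figures or the tracing tool's actual schema.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Trace:
    question: str
    retrieved: list[str] = field(default_factory=list)
    answer: str = ""
    latency_s: float = 0.0
    input_tokens: int = 0
    output_tokens: int = 0

    def cost_usd(self, in_per_1k: float, out_per_1k: float) -> float:
        """Cost of this conversation given per-1,000-token prices."""
        return (self.input_tokens * in_per_1k + self.output_tokens * out_per_1k) / 1000


def traced(
    question: str,
    retrieve: Callable[[str], list[str]],
    generate: Callable[[str, list[str]], str],
) -> Trace:
    """Run one question through retrieval and generation, recording each step."""
    start = time.perf_counter()
    passages = retrieve(question)
    answer = generate(question, passages)
    return Trace(
        question=question,
        retrieved=passages,
        answer=answer,
        latency_s=time.perf_counter() - start,
        # Crude whitespace counts; a real system would use the token
        # counts reported by the model provider.
        input_tokens=len(question.split()) + sum(len(p.split()) for p in passages),
        output_tokens=len(answer.split()),
    )


trace = traced(
    "Where is my order?",
    retrieve=lambda q: ["Orders ship within 2 business days."],
    generate=lambda q, ps: "Your order ships within 2 business days.",
)
print(trace)
print(f"cost: ${trace.cost_usd(0.001, 0.002):.6f}")  # placeholder prices
```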

What technology powers a chatbot?

The tools are named here for the technically curious; none of them is something you need to choose or manage.

Language models — GPT-4 class, Claude, Llama 3, Qwen. Usually two or three are routed per step: a cheap fast model for sorting and classifying questions, a capable model for the main answer, a frontier model only for the hard edge cases. That keeps both quality and running cost under control.
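The routing idea can be sketched in a few lines. The tier names below are placeholders rather than real model identifiers, and pick_model with its "classify" step label is a hypothetical helper; the actual mix is chosen per project.

```python
def pick_model(step: str, hard_case: bool = False) -> str:
    """Choose a model tier for one step of a conversation."""
    if step == "classify":
        # Sorting and classifying questions: cheap and fast is enough.
        return "small-fast-model"
    if hard_case:
        # Only the hard edge cases reach the most expensive tier.
        return "frontier-model"
    # The main answer goes to a capable mid-tier model.
    return "capable-model"


print(pick_model("classify"))
print(pick_model("answer"))
print(pick_model("answer", hard_case=True))
```

Keeping the choice in one function makes it easy to swap a tier and re-run the test set to see what the change does to accuracy and cost.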

Retrieval — The part that lets the bot find the right passage in your documents. Built on tools like pgvector, Qdrant, and LlamaIndex, with custom ranking and chunking when finding the answer is the bottleneck — which it usually is.
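For a feel of what chunking and ranking mean, here is a toy sketch using only the Python standard library. It deliberately swaps in word-count cosine similarity for the embedding search a vector store would perform, so it shows the shape of the technique rather than the real pipeline; chunk, cosine, top_chunks, and the default sizes are all hypothetical.

```python
import math
from collections import Counter


def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping windows of `size` words."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_chunks(query: str, chunks: list[str], k: int = 3) -> list[tuple[float, str]]:
    """Rank chunks by similarity to the query and keep the best k."""
    q = Counter(query.lower().split())
    scored = [(cosine(q, Counter(c.lower().split())), c) for c in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


docs = [
    "Refunds are issued within 30 days of purchase.",
    "Password reset links expire after one hour.",
]
best_score, best_text = top_chunks("refunds within 30 days", docs, k=1)[0]
print(best_text)
```

The overlap between windows is there so that an answer straddling a chunk boundary still appears whole in at least one chunk.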

Integrations — Your CRM (Salesforce, HubSpot, Zoho), your helpdesk (Zendesk, Intercom), calendar, payment, and internal APIs — so the bot can act, not just answer.

Observability — Langfuse, OpenTelemetry, and custom tracing, so conversations, costs, and outputs are visible in real time.

How does a chatbot project work?

First, the scoping. I map the one specific job the chatbot is for and build the first test set from your real conversations. If the task is a poor fit for a chatbot, I tell you honestly before any code is written.

Then a working prototype. A real version, measured against that test set. I iterate on the wording, the retrieval, and the model mix until the accuracy numbers are good.

Then the production build. Integration with your real systems, cost controls, full tracing, and a quiet run on shadow traffic before any customer sees it.

Then launch, and iterate. I stay on as a retained engineer for wording updates, new edge cases, and model upgrades — or hand the project cleanly to your team.

Is a chatbot right for you?

A good fit if:

  • Your team spends too much time answering the same Tier-1 customer questions
  • You want a chat widget that books real calls, not a vanity feature on the homepage
  • Your staff answer the same internal questions every single week
  • You have real conversation data — chats, tickets, docs — to build and test against
  • You care about accuracy and owning the system, not just having "AI" on the site

Not a fit if:

  • You want an off-the-shelf chat widget — buy Intercom Fin or similar; at small scale a custom build does not make sense
  • You expect 100% accuracy — no AI system reaches that, and I engineer around failure rather than pretend it away
  • You have no real data — no conversations, no tickets, no documents — there is nothing to ground the bot in or test it against

Frequently asked questions

How is this different from a ChatGPT widget?

A ChatGPT widget knows nothing about your business. A custom chatbot grounds its answers in your own data with citations, connects to your real systems like your CRM and helpdesk, and refuses to make things up. It is a different category of tool entirely.

What about hallucinations — will it make things up?

For factual questions the bot answers only from your actual content, with citations back to the source. If it does not know, it says so and offers to escalate — rather than handing the customer a confident wrong answer.

Which language model do you use?

It depends on the task. Usually a mix — a cheap fast model to sort and classify questions, a capable model for the main answer, and a frontier model only for the hard edge cases. Cost and quality are both tuned per project, and you never have to choose.

Can the chatbot do more than answer questions?

Yes. It can take actions in your systems — book a call, write a lead to the CRM, look up an order. Reading is automatic, actions that change something are reviewed or rate-limited, and high-stakes actions always go to a person.

How long from kickoff to live?

Scoping and the first test set come first; from there a working prototype and then the production build follow. The exact timeline depends on how many systems the bot has to connect to and how much real conversation data is ready to build on.

Can my team take it over afterwards?

Yes. Everything — the code, the prompts, the integrations — is yours, on your systems. I can stay on as a retained engineer for ongoing updates, or hand the project over cleanly so your team runs it themselves.

Let's talk

Bring a specific use case: a question your team answers fifty times a week, a support queue you want to deflect, an internal wiki nobody reads. A thirty-minute discovery call is free: no deck, no sales pitch, just a real conversation about whether a chatbot fits the job.
