5 March 2026 by Daniel
AI · Software Development · Automation

Most of the AI conversation is about customer-facing products. Chatbots, recommendation engines, that sort of thing. But over the last year, the projects that have saved our clients the most money have been boring internal tools. The stuff nobody writes blog posts about. I’m going to write one anyway.

Document Processing

This is the one I’d start with if you’re not sure where to begin. Almost every company we talk to has someone manually pulling data out of PDFs or scanned documents and typing it into another system.

We built a document processor for an accounting firm that was drowning in invoices. Two people spent most of Monday just copying numbers from PDFs into Xero. Now one person spends 20 minutes checking the edge cases.

The setup is simple: OCR converts the document to text, then an LLM extracts the structured fields (invoice number, amount, vendor, line items) and gives you JSON back. We typically use Tesseract for the OCR layer, though commercial alternatives exist if you need better accuracy on rough scans.

One thing we learned the hard way: don’t promise 100% accuracy. Aim for 95% and build a human review step for the rest. Handwritten notes and poorly scanned documents will trip up any system. The value isn’t perfection. It’s turning a full day of work into half an hour of exception handling.

Internal Knowledge Bases

Your documentation is scattered across Confluence, Slack, Google Drive, and probably someone’s laptop. Everyone knows this is a problem. Nobody fixes it because reorganising docs is thankless work.

Semantic search sidesteps the problem entirely. You ingest everything, generate embeddings, store them in a vector database (we like pgvector for most cases), and let people ask questions in plain English. The LLM finds the relevant docs and summarises them.
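In production, pgvector does the ranking for you inside Postgres with its distance operators, but the underlying idea is just nearest-neighbour search over embeddings. A plain-Python sketch of the retrieval step, assuming you already have embedding vectors for the query and each document:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec: list[float],
          doc_vecs: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the ids of the k documents most similar to the query.
    These are what you'd feed to the LLM for summarisation."""
    scored = sorted(
        doc_vecs.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:k]]
```

With pgvector the same thing is a one-line `ORDER BY embedding <=> query LIMIT k` query, which is why we reach for it rather than a dedicated vector store for most cases.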

One organisation we worked with reduced time spent hunting for answers from about 25 minutes per query to under 2 minutes. Across 50 employees, that adds up fast.

Embeddings go stale if your documents change and you don’t re-index. And think carefully about access controls, as not everything should be searchable by everyone.

Automated Reports

This one is less exciting to talk about but saves a surprising amount of time. A scheduled job pulls data from your various systems, passes it to an LLM with formatting instructions, and emails the finished report.

We set this up for a client whose analysts were spending every Friday afternoon pulling numbers from five different sources and writing commentary. Now they spend 15 minutes reviewing what the system produces. As a bonus, the LLM-generated reports caught data inconsistencies their analysts had been missing for months. I was surprised by that.

The obvious caveat: LLMs hallucinate. Always validate the numbers against source data. Let the AI flag that revenue is down 15%, but keep a human in the loop for deciding what it means.
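One cheap way to implement that validation, sketched here with made-up metric names: scan the generated report for each figure you pulled from the source systems, and flag anything missing or off by more than a tolerance. Anything flagged goes back to a human before the email is sent.

```python
import re

def validate_figures(report_text: str, source: dict[str, float],
                     tolerance: float = 0.005) -> list[str]:
    """Check every metric quoted in an LLM-written report against the
    numbers pulled from the source systems. Returns a list of discrepancies
    for a human to review; an empty list means the figures check out."""
    problems = []
    for metric, expected in source.items():
        # Find the first number appearing after the metric name.
        match = re.search(rf"{re.escape(metric)}[^0-9-]*(-?[\d.]+)", report_text)
        if match is None:
            problems.append(f"{metric}: not mentioned in report")
        elif abs(float(match.group(1)) - expected) > abs(expected) * tolerance:
            problems.append(
                f"{metric}: report says {match.group(1)}, source says {expected}"
            )
    return problems
```

This won't catch every hallucination (a wrong figure attached to the right label slips through only if it's inside tolerance), but it catches the embarrassing ones before they reach an inbox.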

Code Review

This is the one I have the strongest opinions about. AI code review isn’t about replacing reviewers. It’s about making the review process honest. We’ve all seen the “looks good to me” comment from someone who spent 30 seconds skimming a diff. That’s not a review.

We integrated LLM-based analysis into the PR workflow for one of our teams. It summarises what changed, flags high-risk modifications around auth and database queries, and suggests edge cases the author might have missed. The result was 40% fewer post-deployment bugs, not because the AI is brilliant, but because it forces people to engage with the code.
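The "flag high-risk modifications" part doesn't even need a model for the first pass. A deterministic pre-filter over the diff decides which PRs get the full LLM treatment; the paths and patterns below are purely illustrative, so tune them to your own codebase:

```python
# Areas and patterns we treat as high-risk. Illustrative only -- every
# codebase has its own list.
RISKY_PATHS = ("auth/", "migrations/", "billing/")
RISKY_PATTERNS = ("DROP TABLE", "DELETE FROM", "verify=False", "os.system")

def flag_risky_changes(changed_files: dict[str, str]) -> list[str]:
    """Given {path: added_lines_text} from a diff, return reasons this PR
    deserves extra scrutiny, before any LLM summary is consulted."""
    flags = []
    for path, added in changed_files.items():
        if path.startswith(RISKY_PATHS):
            flags.append(f"{path}: touches a sensitive area")
        for pattern in RISKY_PATTERNS:
            if pattern in added:
                flags.append(f"{path}: contains '{pattern}'")
    return flags
```

The flags then go into the LLM's prompt as context, which tends to produce much sharper review comments than handing it a bare diff.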

Where it falls down: business logic and domain rules. The AI doesn’t know whether “disable rate limiting in dev mode” is intentional or a security hole. That’s still on you.

Support Triage

This one is quick to explain. Tickets come in, an LLM reads them, classifies severity, and routes them to the right team. One client went from a 4-hour first-response time to 20 minutes. They’re not automating replies. They’re just getting tickets in front of the right person immediately.

Always build escalation paths. LLMs will occasionally misclassify something urgent as routine, and that’s not a mistake you want to make twice.
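The routing logic with an escalation guard fits in a dozen lines. A hedged sketch, with invented team names and keywords: the LLM supplies a severity label, but a keyword override can force escalation regardless of what the model said, and unknown labels fall through to a human queue.

```python
# Keywords that force escalation even when the model called the ticket
# routine -- a cheap guard against the misclassification failure mode.
ESCALATION_KEYWORDS = ("outage", "data loss", "security", "cannot log in")

# Team names here are invented for illustration.
ROUTES = {"urgent": "on-call", "high": "tier-2", "routine": "tier-1"}

def route_ticket(text: str, llm_severity: str) -> str:
    """Route a ticket using the LLM's severity label, with an override."""
    lowered = text.lower()
    if any(kw in lowered for kw in ESCALATION_KEYWORDS):
        return ROUTES["urgent"]
    return ROUTES.get(llm_severity, "tier-1")  # unknown labels go to a human
```

The override list will accumulate entries over time; every keyword in ours corresponds to a ticket that was once routed wrong.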

Pick one internal process that costs real time and money. Build the tool. Measure whether it helped. Then expand from there. The companies getting value from AI right now aren’t chasing the bleeding edge. They’re solving their own problems first.

Want to talk about this?

If something here is relevant to what you're working on, we're happy to chat.

Get In Touch