The Challenge
A growing law firm was spending hundreds of hours monthly on manual document review and data extraction from contracts, with increasing error rates as volume grew.
“We’re drowning in contracts and we can’t hire fast enough.” That was their managing partner on the first call. Their paralegals were spending the bulk of their week pulling data out of contracts by hand. Party names, key dates, liability clauses, payment terms, termination conditions. The same fields, over and over, across hundreds of documents a month.
The error rate told the real story. 94% accuracy sounds decent until you realise it means roughly six mistakes per hundred documents. In legal work, one wrong date or a missed clause can cause serious problems. And the errors were getting worse as volume climbed and people got tired.
The Solution
We spent the first week just watching how the paralegals worked. That gave us far more than any requirements document could. We saw where they slowed down, where they second-guessed themselves, and where the process broke.
From there we built a two-stage pipeline: OCR on the front end, then a large language model to read and extract the data. Here is the thing that surprised us, though. We had budgeted a good chunk of time for scanning and OCR work, but it turned out roughly 80% of the firm’s archive was already digital. PDFs straight from Word, not scanned images. That saved us weeks and let us focus effort where it mattered.
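That routing decision can be sketched in a few lines of Python. The `Document` type and stage names here are illustrative, not the firm's actual implementation; real OCR and extraction tooling would sit behind each stage.

```python
from dataclasses import dataclass

@dataclass
class Document:
    name: str
    has_text_layer: bool  # roughly 80% of the firm's archive did

def plan_pipeline(doc: Document) -> list[str]:
    """Stage 1 (OCR) applies only to scanned images; digital PDFs
    already carry a text layer and skip straight to extraction."""
    stages = [] if doc.has_text_layer else ["ocr"]
    stages.append("llm_extract")  # Stage 2: the model reads and extracts
    return stages
```

So a born-digital PDF plans as `["llm_extract"]` while a scan plans as `["ocr", "llm_extract"]`, which is exactly why the mostly-digital archive saved so much time.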
We went with Claude’s API for the language model layer because it handled legal terminology well without needing heavy prompt engineering, and the pricing was predictable enough that we could give the firm a clear cost-per-document figure upfront. That mattered to them.
The system split results into two buckets. Anything above a 92% confidence score went straight through. Everything else landed in a review queue where a paralegal could check and correct the extraction in seconds, rather than doing the whole thing from scratch. We were not trying to replace human judgement on tricky clauses. We were trying to stop people wasting hours on the obvious stuff.
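The two-bucket split is simple enough to show directly. The 0.92 threshold comes from the case study; the field and queue names are placeholders, not the firm's schema.

```python
CONFIDENCE_THRESHOLD = 0.92  # from the case study; tune per deployment

def route(extraction: dict) -> str:
    """High-confidence extractions pass straight through; everything
    else lands in the paralegal review queue for a quick check."""
    if extraction["confidence"] >= CONFIDENCE_THRESHOLD:
        return "auto_approve"
    return "review_queue"
```

In practice the review queue showed the extracted values alongside the source text, which is what made correcting an extraction a matter of seconds rather than re-reading the whole contract.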
We calibrated the extraction prompts against around 200 sample contracts from across their different practice areas, so the model handled their specific terminology and formatting reliably. Then we wired the whole thing into their existing case management system via API. Documents got pulled in automatically and processed data flowed back without anyone touching a file.
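The fields listed earlier suggest the shape of the extraction request. This is a hypothetical sketch: the field list comes from the case study, but the prompt wording and JSON schema are assumptions, not the firm's actual prompts.

```python
import json

# Field list taken from the case study; the snake_case names are illustrative.
FIELDS = [
    "party_names",
    "key_dates",
    "liability_clauses",
    "payment_terms",
    "termination_conditions",
]

def build_extraction_prompt(contract_text: str) -> str:
    """Ask the model for a fixed JSON shape so downstream code can
    parse the result and attach a confidence score per field."""
    schema = json.dumps({field: "..." for field in FIELDS}, indent=2)
    return (
        "Extract the following fields from the contract below and "
        f"respond only with JSON matching this shape:\n{schema}\n\n"
        f"Contract:\n{contract_text}"
    )
```

Pinning the output to a fixed schema is what lets the same parsing and routing code run across every practice area without per-document special cases.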
One moment that sticks with us: one of the senior paralegals, who had been doing this work for twelve years, was openly sceptical in the first demo. She did not trust a machine to get it right. By month two she was training new staff on the system and pushing for it to cover more document types, though she still insisted on manually checking anything involving indemnity clauses. She told us it was the first time in years she had enough breathing room to advise on cases instead of just typing data into spreadsheets.
The Results
- 85% reduction in manual extraction hours: from 480 monthly hours down to 72, covering human review only
- Accuracy jumped from 94% to 99.2%: fewer transcription errors, and the system caught things tired eyes missed
- Processing time dropped 90%: contracts handled in minutes, not hours
- Cost per document fell 84%: from £3.40 to £0.54
- Two minor edge cases caught in review, no client-facing errors over six months: high-confidence extractions needed minimal corrections
- Six FTE months freed annually: paralegals moved back to billable, client-facing work
The firm paid back the project cost inside three months and kept banking savings after that. More importantly, they stopped seeing document review as a grind and started treating it as something that gave them an edge. They could take on more document-heavy matters without hiring, and their turnaround times dropped noticeably.
Each month, the corrections paralegals made on edge cases fed back into the system's prompts and examples, so it got steadily better at handling the firm's particular quirks. Not magic. Just a well-fitted tool that learned from the people using it.