How 25 Years of Fragmented Client Data Became Instantly Searchable
DG Financial Services had 80k+ documents scattered across AWS S3, OneDrive, and local drives — with inconsistent file names, 1990s-era legacy formats, and poor-quality scans. Finding a single document could consume a full day. We built a unified search platform that changed that entirely.

Decades of client data — with no way to find any of it
DG Financial Services had spent 25 years accumulating client records across multiple storage systems. The data was all there, but practically inaccessible. File naming was inconsistent, formats were incompatible, and finding a single document often meant spending a full day manually trawling through folders and downloading files, only to discover they were duplicates.
One search box. Every document. Instant results.
Rather than forcing a migration or costly SaaS lock-in, Gradient Insight built a purpose-designed search platform that connects directly to the existing storage systems. Elasticsearch sits at the core, with multimodal ingestion pipelines that cover the full range of files in the archive, from 1990s-era scans to recent meeting recordings, making the entire document corpus searchable by keyword, policy number, or topic.
Unified Elasticsearch index aggregating 80k+ documents from AWS S3, OneDrive, and SharePoint into a single queryable interface
OCR pipeline (Tesseract) extracts text from scanned and poor-quality legacy documents — including files from the 1990s
Whisper Speech-to-Text indexes meeting recordings and audio files, making transcripts searchable by keyword or topic without knowing exact meeting names
Fuzzy search handles partial matches, policy numbers, and legacy reference codes regardless of naming inconsistencies or file format
AI-generated document summaries surface key content directly in the search results, so relevance can be judged without downloading or opening the file
Bulk ingestion pipeline processes large batches of scanned paper files without per-document manual work
Scalable architecture designed for future NLP and RAG extension without rebuilding the core search infrastructure
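As a rough illustration of how the pieces above could fit together, the sketch below routes a file to an extraction step by its extension and builds a typo-tolerant Elasticsearch search body. It is a minimal sketch under stated assumptions, not the deployed implementation: the function names, the extension lists, and the index field names (policy_number, filename, content, summary) are all hypothetical. The multi_match query and its fuzziness setting are standard Elasticsearch query DSL.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from extraction step to file extensions.
# These lists are illustrative assumptions, not the deployed configuration.
EXTRACTORS = {
    "ocr": {".tif", ".tiff", ".png", ".jpg", ".jpeg", ".pdf"},  # scanned pages (PDFs assumed to be scans)
    "speech_to_text": {".mp3", ".wav", ".m4a"},                 # meeting recordings
    "plain_text": {".txt", ".csv"},                             # already machine-readable
}


def pick_extractor(path: str) -> str:
    """Return the name of the extraction step for a file, based on its extension."""
    suffix = PurePosixPath(path).suffix.lower()
    for name, suffixes in EXTRACTORS.items():
        if suffix in suffixes:
            return name
    return "unsupported"


def build_fuzzy_query(text: str, size: int = 10) -> dict:
    """Build an Elasticsearch search body that tolerates small typos.

    Field names are assumed for illustration; policy_number is boosted so
    that exact or near-exact reference matches rank first.
    """
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                "fields": ["policy_number^3", "filename", "content", "summary"],
                "fuzziness": "AUTO",  # allowed edit distance scales with term length
            }
        },
    }
```

The resulting dictionary could be passed as the body of a search request to whichever Elasticsearch client is in use. Keeping the routing table as plain data means a new file type can be supported by adding an extension rather than changing the pipeline logic.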
Searches that took days — now done in minutes
The deployed platform did more than speed up existing workflows: it restored access to documents that had been effectively lost. Legacy files from the 1990s are now searchable. Meeting recordings are transcribed and searchable by topic. And a search that once required a full day of manual trawling now takes minutes.
"By putting in the policy number the system determines very quickly what documentation we have — and we didn't even have to open the file. The system gives us a summary."

Similar challenge?
We build bespoke search and AI systems that make years of accumulated data instantly accessible — without costly migrations.
Book a free discovery call
Hear it from Russell, in his own words
"It's saving hundreds of pounds a month in time looking for documentation. What could have been a full day of manual searching is done within minutes — and the system even gives you a summary so you don't have to open the file."

Your data is already there — you just can't find it
We build production-ready search and AI systems that make decades of accumulated data instantly accessible — without costly migrations or generic SaaS lock-in.