Natural Language Healthcare Business Intelligence for Regulators

Overview

A specialized government contractor needed to demonstrate natural-language analytics over a central healthcare data platform. Teams required bilingual (Native Language/English) question answering with chart generation, fraud-detection lenses, and strict compliance with anonymization and data-residency requirements. We delivered an MVP in 8 weeks that layers LLM agents and a semantic model over the existing infrastructure, so policy and clinical analysts can ask questions in natural language and receive verifiable, chart-ready answers.

Challenge

Key hurdles to self-service healthcare insights included:

Fragmented schemas and varying data quality across national repositories complicated consistent metric definitions and cross-source analyses.

Manual SQL/OLAP queries limited access to insights and slowed decision cycles, especially for ad-hoc policy questions.

Bilingual requirements demanded high-quality Native Language and English understanding, terminology normalization, and consistent metric wording in both languages.

Compliance constraints required on-prem inference, strong anonymization, access controls, and complete auditability without exposing patient-identifiable information.

Solution

We implemented a secure, bilingual NLBI platform on top of the existing data estate:

NLQ & Agent Layer: Native Language/English intent parsing, entity linking, and query planning to translate questions into governed analytical queries; narrative and chart generation for analytical prompts.

Semantic Layer with Cube: Centralized metric and dimension definitions, synonyms, and governance policies; consistent calculations across data sources.

High-Performance OLAP: ClickHouse execution with prepared views, rollups, and caching for low-latency aggregates and drilldowns.

Lenses (Templated Queries): Pre-built, parameterized analytical lenses for fraud/waste/abuse detection across claims, prescriptions, providers, and facilities; one-click exploration paths.

Compliance & Security: On-prem LLMs, anonymization/pseudonymization, RBAC, lineage and citation of source tables/columns, and full audit logs.

Visualization: Auto-selected chart types with exportable artifacts; dual-language labels and captions aligned to the semantic catalog.
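To make the idea of a lens concrete, here is a minimal sketch of a templated query with typed, validated parameters. It is an illustration under stated assumptions, not the delivered implementation: the `Lens` class, the `claims_anonymized` table, its columns, and the threshold logic are all hypothetical, and the placeholders use the generic DB-API "pyformat" style rather than any specific ClickHouse client.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Lens:
    """Hypothetical templated query with typed, validated parameters."""
    name: str
    sql: str           # uses DB-API "pyformat" placeholders, e.g. %(start)s
    param_types: dict  # parameter name -> expected Python type

    def bind(self, **params):
        """Check parameters against the declared types; return (sql, params)."""
        missing = set(self.param_types) - set(params)
        extra = set(params) - set(self.param_types)
        if missing or extra:
            raise ValueError(f"missing={sorted(missing)} unexpected={sorted(extra)}")
        for key, expected in self.param_types.items():
            if not isinstance(params[key], expected):
                raise TypeError(f"{key} must be {expected.__name__}")
        return self.sql, dict(params)


# Illustrative fraud lens: providers whose average claim amount exceeds a
# threshold in a date window. Table and column names are invented.
HIGH_COST_PROVIDERS = Lens(
    name="high_cost_providers",
    sql=(
        "SELECT provider_id, avg(claim_amount) AS avg_amount "
        "FROM claims_anonymized "
        "WHERE service_date >= %(start)s AND service_date < %(end)s "
        "GROUP BY provider_id "
        "HAVING avg_amount > %(threshold)s "
        "ORDER BY avg_amount DESC"
    ),
    param_types={"start": date, "end": date, "threshold": float},
)

# The analyst supplies only parameter values; the query text never changes.
sql, params = HIGH_COST_PROVIDERS.bind(
    start=date(2024, 1, 1), end=date(2024, 4, 1), threshold=5000.0
)
```

Keeping the SQL text fixed and passing values separately is what makes a lens repeatable and auditable: the audit log can record the lens name and the bound parameters, and two analysts running the same lens with the same values issue the same query.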

Results

The MVP enabled policy and clinical analysts to ask bilingual questions and receive governed, chart-ready answers in seconds. Lenses accelerated fraud investigations with repeatable, auditable queries across multiple sources. The platform operated fully on-prem with anonymized data, preserving privacy while demonstrating scalable performance for national workloads. The contractor showcased a clear path from MVP to production without re-architecting core components.

Evaluation & Why It Worked

What made this successful:

On-Prem by Design: Local LLMs, strict anonymization, and comprehensive logging satisfied stringent privacy and residency requirements.

Semantic-Layer First: A governed metric catalog in Cube ensured consistency, reuse, and multilingual clarity across analyses.

Deterministic Querying: A transparent planner produced explainable SQL/OLAP with lineage and citations, building trust with analysts and auditors.

Bilingual Excellence: Domain-aligned Native Language/English vocabularies and templates delivered consistent answers and chart labels across languages.

Extensible Architecture: Modular lenses and semantic definitions allow rapid expansion to new datasets, KPIs, and investigative workflows.
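The combination of a governed catalog, deterministic planning, and lineage can be sketched as follows. This is a simplified illustration, not the project's planner: plain substring matching stands in for the LLM-based intent parsing and entity linking, and the `Metric` class, the catalog entries, the source table names, and the `xx` language code with its placeholder strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    key: str         # governed metric identifier
    labels: dict     # language code -> display label
    synonyms: tuple  # lower-case phrases, in any supported language
    sources: tuple   # "table.column" strings cited as lineage


# Hypothetical catalog; "xx" and the bracketed strings are placeholders
# for the second language.
CATALOG = (
    Metric(
        key="claims.total_amount",
        labels={"en": "Total claim amount", "xx": "<native-language label>"},
        synonyms=("total claim amount", "claims cost", "<native-language synonym>"),
        sources=("claims_anonymized.claim_amount",),
    ),
    Metric(
        key="prescriptions.count",
        labels={"en": "Number of prescriptions", "xx": "<native-language label>"},
        synonyms=("number of prescriptions", "prescription count"),
        sources=("prescriptions_anonymized.prescription_id",),
    ),
)


def plan(question: str, lang: str = "en") -> dict:
    """Map a question to one governed metric and cite its source columns."""
    text = question.lower()
    # Try longer synonyms first, then alphabetical order, so the same
    # question always resolves to the same metric.
    candidates = sorted(
        ((syn, metric) for metric in CATALOG for syn in metric.synonyms),
        key=lambda pair: (-len(pair[0]), pair[0]),
    )
    for syn, metric in candidates:
        if syn in text:
            return {
                "metric": metric.key,
                "label": metric.labels[lang],
                "matched": syn,
                "lineage": list(metric.sources),
            }
    raise LookupError("no governed metric matches the question")


result = plan("What was the claims cost by region last year?")
# result["metric"] is "claims.total_amount"; result["lineage"] cites the source column.
```

Because every answer is resolved through the catalog, the metric name, its label in the requested language, and the cited source columns all come from one definition, which is what lets an auditor trace a chart back to its inputs.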

"Analysts can now ask complex questions in natural language and get accurate, chart-ready answers with full lineage, without compromising privacy and with less manual intervention."
Program Director
Data & Insights at Government Healthcare Contractor

Technology Stack

NLQ & Agents

On-prem LLMs (Native Language/English)

Agentic Query Orchestration

Prompt Templates (Lenses)

Semantic Layer

Cube (Semantic Modeling)

Metric Catalog

Synonyms & Multilingual Labels

Data Platform

ClickHouse (OLAP)

Prepared Aggregations

Caching & Rollups

Integration

REST APIs

Batch/Event Connectors

Lineage & Audit Exports

Security & Compliance

Anonymization/Pseudonymization

RBAC

On-Prem Deployment

Audit Logging

Visualization

Automatic Chart Selection

Bilingual Narratives

Exportable Charts
