fi-fhir: Format-Agnostic Healthcare Integration
A case study in healthcare integration: how Source Profiles, a three-phase parsing pipeline, and a workflow DSL turn messy legacy formats into semantic events.
Overview
I’ve spent years debugging healthcare integrations: parsing HL7v2 messages, mapping local codes to LOINC, and explaining to teams why their “standards-compliant” feed is still breaking production. The pattern is always the same: the spec says one thing, production says another.
fi-fhir is my attempt to encode that experience into software. It's a format-agnostic healthcare integration library that transforms legacy formats (HL7v2, CSV, EDI X12, CDA) into semantic events and routes them through configurable workflows. But the real insight isn't the parsing. It's the abstraction layer.
Key Insights:
- Source Profiles are the unit of scalability, not "HL7v2 support." Each interface/feed gets its own config for tolerance, mapping, and event classification.
- Three-phase parsing pipeline: byte normalization → syntactic parse → semantic extraction. Each phase is governed by the profile.
- "Warnings over errors" because healthcare data is messy. Don't fail on recoverable issues.
- Workflow DSL abstracts format-specific parsing from business logic. CEL expressions enable complex routing without code.
- Production reliability by default: retry with exponential backoff, circuit breakers, dead letter queues, and replay/simulation.
What’s Shipped (Current Repo State)
The original idea was “a library.” The current reality is closer to a small platform:
- Parsers: HL7v2 (ADT/ORU/SIU/MDM/DFT), CSV/flatfiles, EDI X12 (837/835/270/271/276/277), CDA/CCDA, and FHIR R4 ingestion.
- Canonical semantic event model: immutable events like `patient_admit`, `lab_result`, and `claim_submitted` that downstream routing can treat consistently.
- Workflow engine: YAML routes + CEL conditions + transforms + multiple action types, with `--dry-run`, replay, and simulation workflows for safe iteration.
- Mapping Studio UI: a SvelteKit app for the “samples -> warnings -> profile/workflow drafts -> run/dry-run” loop.
- Operational deployment: in my K3s environment this runs as `fi-fhir-api` + `fi-fhir-ui` with Postgres (event store + terminology), a reference FHIR server (HAPI FHIR), Temporal (orchestration), and MinIO (mapping/terminology file management).
- Optional AI assist: LLM-backed explain/extract/quality and terminology routing (in my cluster it’s wired through LiteLLM + Qdrant).
The Challenge
Industry Context: Legacy Systems Meet Modern Mandates
Healthcare is mid-transition: FHIR APIs and modern exchange patterns are real, but HL7v2 is still the operational backbone for a lot of production workflows. The problem isn’t “how do I parse HL7v2?” The problem is “how do I ship integrations where every feed is different, and still keep the system operable?”
Common Integration Challenges
- Legacy system compatibility: Older EHRs use outdated data formats and protocols
- Version management: Juggling v2.3, v2.4, v2.5.1 across different feeds
- Customization complexity: Extensive custom coding for each interface
- Semantic inconsistency: The same message type means different things to different systems
- Technical debt: Maintaining traditional interfaces diverts resources from innovation
The Problem: Every Feed Is Different
When teams say "we support HL7v2," they usually mean "we can parse a well-formed 2.5.1 ADT^A01 message." That's necessary but insufficient.
In practice, every interface has quirks:
| Reality | Example |
|---|---|
| Version drift | Feed claims v2.5.1 but sends v2.3 data types |
| Missing segments | PV1 is "required" but a clinic omits it |
| Z-segments | Every Epic feed has ZPD, ZVN, ZIN (none documented the same way) |
| Line endings | Spec says `\r`; you'll receive `\r\n`, `\n`, or mixed |
| Delimiters | MSH-2 is usually `^~\&`, until it's `!~\$` |
| Event semantics | A01 means "admit"… or "register outpatient"… depending on PV1-2 |
Building parsers that handle all these variations in code is possible, but it doesn't scale. Every new integration means new special-case logic.
The Approach
The Core Insight: Source Profiles
The shift that made everything click: moving the unit of abstraction from format to feed.
A Source Profile is a YAML configuration that owns:
- HL7 version expectations and tolerated drift
- Parsing tolerance (missing segments, extra components, non-standard delimiters)
- Z-segment extraction and mapping rules
- Identifier normalization and validation rules
- Terminology mapping (LOCAL → LOINC, SNOMED, ICD-10)
- Event classification heuristics (A01 → inpatient vs outpatient)
Here's what that looks like:
```yaml
source_profile:
  id: epic_adt_hosp_a
  name: 'Epic ADT Feed - Hospital A'
  version: '1.0.0'

  hl7v2:
    default_version: '2.5.1'
    timezone: 'America/New_York'
    tolerate:
      missing_segments: ['PV1', 'PD1']
      nte_anywhere: true
      extra_components: true
      non_standard_delimiters: true

  event_classification:
    adt_a01:
      rules:
        - condition: "PV1.2 == 'I'"
          event: 'inpatient_admit'
        - condition: "PV1.2 == 'O'"
          event: 'outpatient_registration'

  z_segments:
    preserve_raw: true
    mappings:
      ZPD:
        - field: 1
          target: patient.extensions.vip_flag
          type: boolean

  identifiers:
    assigning_authority_map:
      'HOSP_A': 'urn:oid:1.2.3.4.5.6.7'
      'SSA': 'urn:oid:2.16.840.1.113883.4.1'
    validation:
      npi: { enabled: true, on_invalid: 'warn' }
      mbi: { enabled: true, on_invalid: 'warn' }

  terminology:
    mappings:
      - source_system: 'LOCAL_LAB'
        target_system: 'http://loinc.org'
        file: './mappings/hosp_a_local_to_loinc.csv'
```
When a new hospital comes online, I create a new profile, not new code. The parsing logic is stable; only the configuration changes.
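To make the profile-driven classification concrete, here is a minimal stdlib-Go sketch of how rules like `PV1.2 == 'I'` can drive event naming. The `Rule`, `fieldValue`, and `classify` shapes are illustrative only, not the library's actual API:

```go
package main

import (
	"fmt"
	"strings"
)

// Rule mirrors a profile's event_classification entry: a condition on a
// parsed HL7 field and the semantic event to emit when it matches.
// (Illustrative shape; the real library's types may differ.)
type Rule struct {
	Field  string // e.g. "PV1.2"
	Equals string
	Event  string
}

// fieldValue pulls a reference like "PV1.2" out of a minimally parsed
// message: a map of segment name -> pipe-split fields.
func fieldValue(msg map[string][]string, ref string) string {
	parts := strings.SplitN(ref, ".", 2)
	if len(parts) != 2 {
		return ""
	}
	seg, ok := msg[parts[0]]
	if !ok {
		return ""
	}
	var idx int
	fmt.Sscanf(parts[1], "%d", &idx)
	if idx < 0 || idx >= len(seg) {
		return ""
	}
	return seg[idx]
}

// classify returns the first matching rule's event, else a fallback.
func classify(msg map[string][]string, rules []Rule, fallback string) string {
	for _, r := range rules {
		if fieldValue(msg, r.Field) == r.Equals {
			return r.Event
		}
	}
	return fallback
}

func main() {
	// PV1|1|I|... split so that index 2 is PV1-2 (patient class).
	msg := map[string][]string{"PV1": {"PV1", "1", "I"}}
	rules := []Rule{
		{Field: "PV1.2", Equals: "I", Event: "inpatient_admit"},
		{Field: "PV1.2", Equals: "O", Event: "outpatient_registration"},
	}
	fmt.Println(classify(msg, rules, "patient_admit")) // inpatient_admit
}
```

Swapping hospitals means swapping the `rules` slice (loaded from YAML), not the code around it.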
Three-Phase Parsing Pipeline
The profile governs a three-phase pipeline:
Phase 1: Byte Normalization
- Normalize line endings (`\r\n`, `\n` → `\r`)
- Detect character set (BOM, MSH-18) and decode to UTF-8
- Preserve original bytes for audit/replay

Phase 2: Syntactic Parse

- Detect field separator and encoding characters from MSH-1/MSH-2
- Handle non-standard delimiters if profile allows
- Parse repetitions (`~`), components (`^`), subcomponents (`&`)
- Process escape sequences (`\F\`, `\S\`, `\T\`, `\R\`, `\E\`, `\X..\`)
- Preserve unknown segments (Z-segments, vendor extensions)
Phase 3: Semantic Extraction
- Classify event type using profile rules (A01 → inpatient_admit)
- Extract identifiers and apply normalization/validation
- Map terminology using profile mappings
- Emit canonical semantic events with full provenance
Each phase is governed by the Source Profile. A profile can be strict (fail on missing PV1) or tolerant (emit a warning and continue). The parser doesn't decide; the profile does.
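A minimal sketch of the start of phases 1 and 2, assuming a pipe-delimited MSH header (helper names are illustrative, not the library's API):

```go
package main

import (
	"bytes"
	"fmt"
)

// Phase 1 (sketch): normalize segment terminators to the HL7-standard \r.
// Order matters: collapse \r\n first so the lone-\n pass doesn't double up.
func normalizeLineEndings(b []byte) []byte {
	b = bytes.ReplaceAll(b, []byte("\r\n"), []byte("\r"))
	return bytes.ReplaceAll(b, []byte("\n"), []byte("\r"))
}

// Phase 2 (sketch): MSH-1 is the byte immediately after "MSH", and the
// next four bytes (MSH-2) carry the encoding characters.
func detectDelimiters(msg []byte) (field byte, encoding string, err error) {
	if len(msg) < 8 || !bytes.HasPrefix(msg, []byte("MSH")) {
		return 0, "", fmt.Errorf("message does not start with MSH")
	}
	return msg[3], string(msg[4:8]), nil
}

func main() {
	raw := []byte("MSH|^~\\&|EPIC|HOSP_A\r\nPID|1\nPV1|1|I")
	norm := normalizeLineEndings(raw)
	segs := bytes.Split(norm, []byte("\r"))
	f, enc, _ := detectDelimiters(norm)
	fmt.Println(len(segs), string(f), enc) // 3 | ^~\&
}
```

Because the delimiters are read from the message rather than assumed, a feed that sends `!~\$` instead of `^~\&` still parses, provided its profile sets `non_standard_delimiters: true`.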
Workflow DSL: Routing Without Code
Once messages become semantic events, routing becomes configuration:
```yaml
workflow:
  name: adt_routing
  version: '1.0'

  routes:
    - name: admits_to_fhir
      filter:
        event_type: [patient_admit, inpatient_admit]
        condition: event.patient.age >= 65
      transforms:
        - redact: patient.ssn
        - map_terminology: patient.race
      actions:
        - type: fhir
          endpoint: https://fhir.hospital.org/r4
          resource: Patient
          auth:
            type: oauth2
            tokenUrl: https://auth.hospital.org/token
            clientId: ${CLIENT_ID}
            clientSecret: ${CLIENT_SECRET}

    - name: critical_labs_to_alert
      filter:
        event_type: lab_result
        condition: event.observation.interpretation in ["critical", "HH", "LL"]
      actions:
        - type: webhook
          url: https://alerts.hospital.org/critical
          method: POST
        - type: log
          level: warn
          message: 'Critical lab: {{.Observation.Code}} for {{.Patient.MRN}}'
```
The workflow engine supports:
- Filters: event type, source system, CEL expressions for complex conditions
- Transforms: set_field, map_terminology, redact (PHI masking)
- Actions: FHIR (with OAuth2), webhook, database (PostgreSQL/MySQL/SQLite), message queue (Kafka), logging
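As one illustration of the transform step, a dotted-path `redact` over a map-shaped event might look like this (a sketch only; a real transform also has to handle arrays and typed structs):

```go
package main

import (
	"fmt"
	"strings"
)

// redact masks the value at a dotted path (e.g. "patient.ssn") inside a
// map-shaped event. Missing paths are a no-op rather than an error,
// matching the "warnings over errors" posture.
func redact(event map[string]any, path string) {
	parts := strings.Split(path, ".")
	cur := event
	for _, p := range parts[:len(parts)-1] {
		next, ok := cur[p].(map[string]any)
		if !ok {
			return // path absent: nothing to redact
		}
		cur = next
	}
	leaf := parts[len(parts)-1]
	if _, ok := cur[leaf]; ok {
		cur[leaf] = "***REDACTED***"
	}
}

func main() {
	event := map[string]any{
		"patient": map[string]any{"ssn": "123-45-6789", "mrn": "A100"},
	}
	redact(event, "patient.ssn")
	fmt.Println(event["patient"].(map[string]any)["ssn"]) // ***REDACTED***
}
```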
CEL (Common Expression Language) makes this work. Instead of writing routing code, you write expressions like `event.patient.age >= 65 && event.encounter.class == "inpatient"`. The engine evaluates them, caches compiled expressions, and routes efficiently.
Implementation Details
Where It Runs (Kubernetes Reality)
The “library” story is useful for explaining the primitives, but the operational shape matters if you’re evaluating whether this is usable.
In the current GitOps deployment, fi-fhir runs in a dedicated fi-fhir namespace with:
- `fi-fhir-api` (HTTP + `/metrics`), configured with Postgres and a reference FHIR endpoint
- `fi-fhir-ui` (Mapping Studio)
- Postgres PVC-backed storage (event sourcing + terminology DB)
- MinIO (terminology and mapping file workflows)
- Temporal (long-running workflow orchestration)
It’s also explicitly designed for restricted clusters: non-root, seccomp, read-only root filesystem, and resource limits by default.
Production Reliability
Healthcare integrations can't drop messages. fi-fhir builds reliability into the workflow engine:
| Pattern | What It Does |
|---|---|
| Retry | Exponential backoff with configurable max attempts |
| Circuit Breaker | Stop hammering a failing downstream service |
| Dead Letter Queue | Park failed events for investigation and replay |
| Rate Limiting | Token bucket to avoid overwhelming receivers |
| OAuth Token Refresh | Automatic refresh with 401 retry |
Observability is integrated from the start:
- Prometheus metrics (workflow event counters and action duration histograms)
- OpenTelemetry distributed tracing
- Structured logging with trace ID correlation
- Grafana dashboard templates
Key Design Decisions
"Warnings over errors." Healthcare data is messy. A missing PV1 segment shouldn't crash your pipeline if the profile says it's tolerable. The parser emits ParseWarning objects that can be logged, alerted on, or fed into quality metrics, but processing continues.
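A sketch of what that tolerance decision can look like in code (field and function names are illustrative, not the repo's exact schema):

```go
package main

import "fmt"

// ParseWarning records a recoverable issue without aborting the parse.
type ParseWarning struct {
	Code    string // e.g. "missing_segment"
	Segment string
	Detail  string
}

// checkRequired demonstrates the key move: the profile's tolerated set,
// not the parser, decides whether a missing segment warns or fails.
func checkRequired(present map[string]bool, required []string, tolerated map[string]bool) ([]ParseWarning, error) {
	var ws []ParseWarning
	for _, seg := range required {
		if present[seg] {
			continue
		}
		if tolerated[seg] {
			ws = append(ws, ParseWarning{
				Code:    "missing_segment",
				Segment: seg,
				Detail:  seg + " absent but tolerated by profile",
			})
			continue
		}
		return ws, fmt.Errorf("required segment %s missing", seg)
	}
	return ws, nil
}

func main() {
	present := map[string]bool{"MSH": true, "PID": true} // no PV1
	ws, err := checkRequired(present,
		[]string{"MSH", "PID", "PV1"},
		map[string]bool{"PV1": true}) // profile tolerates missing PV1
	fmt.Println(len(ws), err) // 1 <nil>
}
```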
Identifier-first design. PID-3 (patient identifiers) almost always repeats: MRN, SSN, MBI, insurance ID. I made IdentifierSet a first-class type with validation (NPI/MBI/SSN checksums), normalization (strip dashes, uppercase), and priority selection (which ID is "primary"?).
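The NPI checksum mentioned above is the standard CMS rule: prefix the first nine digits with `80840` and run a Luhn check over all fifteen digits. A self-contained sketch of that check (the surrounding `IdentifierSet` API is not shown):

```go
package main

import "fmt"

// validNPI checks a 10-digit NPI: prepend the "80840" card-issuer prefix
// and verify the Luhn checksum over the resulting fifteen digits.
func validNPI(npi string) bool {
	if len(npi) != 10 {
		return false
	}
	digits := "80840" + npi
	sum := 0
	double := false // rightmost digit is not doubled
	for i := len(digits) - 1; i >= 0; i-- {
		c := digits[i]
		if c < '0' || c > '9' {
			return false
		}
		d := int(c - '0')
		if double {
			d *= 2
			if d > 9 {
				d -= 9 // Luhn: sum the digits of the doubled value
			}
		}
		sum += d
		double = !double
	}
	return sum%10 == 0
}

func main() {
	fmt.Println(validNPI("1234567893"), validNPI("1234567890")) // true false
}
```

MBI and SSN get analogous format checks; the point of `on_invalid: 'warn'` in the profile is that a checksum failure downgrades the identifier's priority instead of rejecting the message.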
Profile-driven, not hardcoded. Event classification (is A01 an inpatient admit or outpatient registration?) depends on the source system. Profile rules like condition: "PV1.2 == 'I'" make this configurable per feed.
Go for the core. Performance matters for high-volume feeds. Single binary deployment simplifies operations. Strong typing catches mistakes at compile time. Minimal external dependencies (stdlib + YAML + CEL).
Results
Before: Integration Pain Points
| Issue | Impact |
|---|---|
| Per-feed custom code | 2-3 weeks per new integration |
| Format-specific parsers | Duplicated logic across feeds |
| Hardcoded routing | Code changes for workflow updates |
| Missing observability | Blind spots in production pipelines |
After: Measurable Improvements
| Metric | Result |
|---|---|
| New feed onboarding | Faster iteration via config-first loops (profiles/workflows) |
| Parser test coverage | Strong coverage in core parsers + canonical model |
| Workflow changes | No code deployment needed |
| Production visibility | Full tracing + metrics |
Gap Awareness (What Still Bites You)
If you’ve shipped healthcare integrations, you know the difference between “works on a sample file” and “survives production.” The repo is honest about that reality, and the gaps are mostly in the places you’d expect:
- Profile authoring is still the bottleneck. Source Profiles are the right abstraction, but writing them from scratch is still manual work. Draft inference and vendor templates help, but this is where onboarding time goes.
- Terminology is a lifecycle, not a lookup. The core mapping/terminology engine is strong, but the DB loaders + suggestion/semantic search/indexing pieces are at mixed maturity. You should plan for governance: versioning, review, and rollback.
- GraphQL is production-shaped, but coverage is skewed. Generated code dominates the coverage denominator; the hand-written resolver logic is where you want real tests and regression protection.
- LLM features are additive, not foundational. They help with quality/explanation/extraction, but you still need deterministic parsing, clear provenance, and safe fallbacks. Treat LLM output as suggestions unless you can validate it.
Lessons Learned
What I'd Do Differently
Profile inference + templates. Profile-driven systems scale, but only if profile creation is cheap. Draft inference from samples and “vendor starter packs” (Epic/Cerner/Meditech patterns) would cut onboarding time substantially.
Vendor profile templates. Epic, Cerner, and Meditech all have semi-predictable patterns for Z-segments and event semantics. Shipping default profiles for common EHRs would reduce boilerplate.
Earlier integration-test harnesses. The CLI and the DB-backed components are where regressions hide. Testcontainers-backed Postgres runs (terminology DB, event store) and stubs for FHIR endpoints would have paid off sooner.
Conclusion
- Think in feeds, not formats. "HL7v2 support" is necessary but not sufficient. The real abstraction is the Source Profile.
- Configuration over code. Every integration decision that varies per feed belongs in a profile, not in parsing logic.
- Build tolerance into the system. Healthcare data is messy. Design for warnings, not failures. Quarantine bad data; don't crash.
- Decouple format from workflow. Semantic events (`patient_admit`, `lab_result`) let you route messages without caring whether they came from HL7v2, EDI, or FHIR.
- Reliability is a feature, not an afterthought. Retry, circuit breaker, DLQ, and observability should be in the architecture from day one.
The full library is at libs/fi-fhir with documentation covering the workflow DSL, FHIR output mappings, and production hardening. You can also browse the docs directly at /docs/fi-fhir and use the interactive tools at /playground/fi-fhir.