Core Concepts
This document explains the fundamental concepts behind fi-fhir's design and architecture.
Philosophy: Profile-Driven Normalization
The key insight behind fi-fhir is that the unit of scalability is the Source Profile, not "HL7v2 support".
In traditional healthcare integration:
- You build a monolithic "HL7v2 parser"
- Every feed requires code changes for edge cases
- Tolerance rules are scattered across the codebase
In fi-fhir:
- Each interface/feed gets its own Source Profile
- Parsing behavior is driven by configuration
- Adding a new feed means creating a new profile, not writing code
Parsing Pipeline
fi-fhir processes messages through a three-phase pipeline:
┌─────────────────────┐    ┌─────────────────────┐    ┌─────────────────────┐
│       Phase 1       │    │       Phase 2       │    │       Phase 3       │
│ Byte Normalization  │───>│  Syntactic Parsing  │───>│ Semantic Extraction │
└─────────────────────┘    └─────────────────────┘    └─────────────────────┘
           │                          │                          │
           v                          v                          v
       Raw bytes               Parsed segments           Canonical events
     (UTF-8, line           (fields, components,          (patient_admit,
     endings, BOM)            escape sequences)          lab_result, etc.)
Phase 1: Byte Normalization
Input: Raw bytes from source system
Output: Normalized UTF-8 string
Operations:
- BOM (Byte Order Mark) detection and handling
- Character encoding conversion (ISO-8859-1 → UTF-8)
- Line ending normalization (CRLF/CR → LF)
- Trailing whitespace handling
Configuration (in Source Profile):
encoding:
  charset: UTF-8
  lineEnding: auto
  bomHandling: strip
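The operations above can be sketched in Go using only the standard library. This is a minimal illustration, not fi-fhir's implementation: `normalizeBytes` and `latin1ToUTF8` are hypothetical names, the charset is passed in as a plain string, and trailing whitespace handling is left out.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// utf8BOM is the three-byte UTF-8 byte order mark.
var utf8BOM = []byte{0xEF, 0xBB, 0xBF}

// latin1ToUTF8 converts ISO-8859-1 bytes to a UTF-8 string.
// Every ISO-8859-1 byte maps to the Unicode code point with the same value.
func latin1ToUTF8(raw []byte) string {
	var b strings.Builder
	b.Grow(len(raw))
	for _, c := range raw {
		b.WriteRune(rune(c))
	}
	return b.String()
}

// normalizeBytes is a hypothetical Phase 1 helper. It strips a leading
// UTF-8 BOM, converts ISO-8859-1 input to UTF-8, and rewrites CRLF and
// bare CR line endings as LF.
func normalizeBytes(raw []byte, charset string) string {
	raw = bytes.TrimPrefix(raw, utf8BOM)

	var text string
	if strings.EqualFold(charset, "ISO-8859-1") {
		text = latin1ToUTF8(raw)
	} else {
		text = string(raw) // assumption: input is already UTF-8
	}

	// Replace CRLF first so it does not turn into two line feeds.
	text = strings.ReplaceAll(text, "\r\n", "\n")
	return strings.ReplaceAll(text, "\r", "\n")
}

func main() {
	raw := append([]byte{0xEF, 0xBB, 0xBF}, []byte("MSH|^~\\&|EPIC\rPID|1\r\n")...)
	fmt.Printf("%q\n", normalizeBytes(raw, "UTF-8"))
}
```

Replacing CRLF before bare CR matters: doing it in the other order would emit a blank line for every CRLF pair.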
Phase 2: Syntactic Parsing
Input: Normalized string
Output: Parsed message structure
Operations:
- Field separator extraction from MSH.1
- Encoding characters from MSH.2
- Segment splitting
- Field/component/subcomponent splitting
- Escape sequence handling (\H\, \N\, \.br\)
Configuration:
syntax:
  hl7Version: '2.5'
  fieldSeparator: '|'
  encodingChars: "^~\\&"
  strictMode: false
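As a rough sketch of the first four operations, the Go code below reads the delimiters from the MSH header and splits an LF-normalized message into segments, fields, and components. The names `Delimiters`, `parseDelimiters`, and `splitSegments` are illustrative assumptions, and repetitions, subcomponents, and escape sequences are deliberately left unprocessed.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Delimiters holds the separators declared in the MSH segment.
type Delimiters struct {
	Field, Component, Repetition, Escape, Subcomponent string
}

// parseDelimiters reads MSH.1 (the character right after "MSH") and
// MSH.2 (the next four characters) from a normalized message.
func parseDelimiters(msg string) (Delimiters, error) {
	if len(msg) < 8 || !strings.HasPrefix(msg, "MSH") {
		return Delimiters{}, errors.New("message does not start with a complete MSH header")
	}
	return Delimiters{
		Field:        msg[3:4],
		Component:    msg[4:5],
		Repetition:   msg[5:6],
		Escape:       msg[6:7],
		Subcomponent: msg[7:8],
	}, nil
}

// splitSegments returns segments -> fields -> components.
// For non-MSH segments, index n of a segment is field SEG.n (index 0 is
// the segment ID). In MSH the separator itself counts as MSH.1, so
// index 1 holds MSH.2; it is kept whole because it contains the delimiters.
func splitSegments(msg string, d Delimiters) [][][]string {
	var out [][][]string
	for _, line := range strings.Split(msg, "\n") {
		if line == "" {
			continue // skip blank lines left by trailing terminators
		}
		rawFields := strings.Split(line, d.Field)
		fields := make([][]string, 0, len(rawFields))
		for i, f := range rawFields {
			if rawFields[0] == "MSH" && i == 1 {
				fields = append(fields, []string{f})
				continue
			}
			fields = append(fields, strings.Split(f, d.Component))
		}
		out = append(out, fields)
	}
	return out
}

func main() {
	msg := "MSH|^~\\&|EPIC|HOSP\nPID|1||12345^^^EPIC^MR"
	d, err := parseDelimiters(msg)
	if err != nil {
		panic(err)
	}
	segs := splitSegments(msg, d)
	fmt.Println(segs[1][3][0], segs[1][3][3]) // PID.3.1 and PID.3.4
}
```

Reading the delimiters from the message rather than hard-coding `|` and `^` is what lets a profile tolerate feeds that declare non-default encoding characters.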
Phase 3: Semantic Extraction
Input: Parsed message structure
Output: Canonical semantic events
Operations:
- Message type classification (ADT^A01 → patient_admit)
- Identifier extraction (MRN, SSN, NPI)
- Field mapping to canonical model
- Terminology normalization
Configuration:
semantics:
  messageTypes: [ADT, ORU]
  patientIdentifiers:
    - source_field: PID.3.1
      assigning_authority: EPIC
      identifier_type: MRN
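One way to picture classification and identifier extraction is a lookup table plus a rule applied to parsed fields. The Go sketch below is an assumption-laden illustration: only the ADT^A01 → patient_admit row comes from the text above, the other table rows are guesses, and `classify`, `extractIdentifier`, and the flat `fields` map are hypothetical.

```go
package main

import "fmt"

// eventTypes maps an HL7v2 message type and trigger event to a canonical
// event type. Only ADT^A01 is taken from the documentation; the other
// rows are assumed for illustration.
var eventTypes = map[string]string{
	"ADT^A01": "patient_admit",
	"ADT^A03": "patient_discharge", // assumed mapping
	"ORU^R01": "lab_result",        // assumed mapping
}

// IdentifierRule mirrors one patientIdentifiers entry from the profile.
type IdentifierRule struct {
	SourceField        string
	AssigningAuthority string
	IdentifierType     string
}

// Identifier is an extracted patient identifier.
type Identifier struct {
	Value, AssigningAuthority, Type string
}

// classify returns the canonical event type for a message type such as "ADT^A01".
func classify(messageType string) (string, bool) {
	t, ok := eventTypes[messageType]
	return t, ok
}

// extractIdentifier applies a rule to a flat field lookup keyed by HL7
// path (a simplification of a real parsed message structure).
func extractIdentifier(fields map[string]string, rule IdentifierRule) (Identifier, bool) {
	v, ok := fields[rule.SourceField]
	if !ok || v == "" {
		return Identifier{}, false
	}
	return Identifier{
		Value:              v,
		AssigningAuthority: rule.AssigningAuthority,
		Type:               rule.IdentifierType,
	}, true
}

func main() {
	rule := IdentifierRule{SourceField: "PID.3.1", AssigningAuthority: "EPIC", IdentifierType: "MRN"}
	fields := map[string]string{"PID.3.1": "12345"}

	if t, ok := classify("ADT^A01"); ok {
		fmt.Println(t)
	}
	if id, ok := extractIdentifier(fields, rule); ok {
		fmt.Printf("%+v\n", id)
	}
}
```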
Canonical Event Model
All input formats map to a common set of semantic events. This decouples:
- Input parsing from business logic
- Workflow routing from format specifics
- FHIR generation from source systems
Event Types
| Category | Event Types |
|---|---|
| Patient | patient_admit, patient_discharge, patient_transfer, patient_update, patient_merge |
| Scheduling | appointment_scheduled, appointment_cancelled, appointment_rescheduled, appointment_noshow, appointment_checked_in |
| Lab/Clinical | lab_result, lab_ordered, lab_cancelled, vital_sign, condition, procedure, immunization |
| Claims | claim_submitted, claim_adjudicated, prior_auth_request, prior_auth_response |
| Documents | document, document_addendum, document_replacement, document_status_change |
| Financial | financial_transaction |
Event Structure
Every event includes:
{
  "meta": {
    "id": "evt_unique_id",
    "type": "patient_admit",
    "source": "epic_adt",
    "format": "HL7v2",
    "timestamp": "2024-01-15T12:00:00Z",
    "source_message_id": "MSG12345",
    "parse_warnings": []
  }
  // Event-specific fields...
}
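For consumers written in Go, the metadata envelope above could be modeled roughly as follows. The struct names are hypothetical, and treating `parse_warnings` as a list of strings is an assumption, since the example only shows an empty array.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// EventMeta mirrors the "meta" object shown above.
type EventMeta struct {
	ID              string    `json:"id"`
	Type            string    `json:"type"`
	Source          string    `json:"source"`
	Format          string    `json:"format"`
	Timestamp       time.Time `json:"timestamp"` // RFC 3339, as in the example
	SourceMessageID string    `json:"source_message_id"`
	ParseWarnings   []string  `json:"parse_warnings"` // element type is assumed
}

// Event carries only the shared envelope; event-specific fields are omitted.
type Event struct {
	Meta EventMeta `json:"meta"`
}

// decodeEvent parses the envelope of a canonical event from JSON.
func decodeEvent(data []byte) (Event, error) {
	var ev Event
	if err := json.Unmarshal(data, &ev); err != nil {
		return Event{}, fmt.Errorf("decode event: %w", err)
	}
	return ev, nil
}

func main() {
	raw := []byte(`{"meta": {"id": "evt_unique_id", "type": "patient_admit",
		"source": "epic_adt", "format": "HL7v2",
		"timestamp": "2024-01-15T12:00:00Z",
		"source_message_id": "MSG12345", "parse_warnings": []}}`)

	ev, err := decodeEvent(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(ev.Meta.Type, ev.Meta.Source, ev.Meta.Timestamp.Year())
}
```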
Source Profiles
A Source Profile defines parsing behavior for a specific data feed.
Why Profiles?
Consider two hospital systems sending ADT messages:
Hospital A (Epic):
- Uses ^ as component separator
- MRN in PID.3.1 with assigning authority "EPIC"
- Missing NK1 (next of kin) segments are normal
Hospital B (Cerner):
- Uses ^ as component separator (same)
- MRN in PID.3.1 with assigning authority "CERN"
- NK1 segments are always present
Same message type, different behaviors. Profiles let you configure each independently.
Profile Structure
id: epic_adt
name: Epic ADT Interface
version: '1.0'
# Phase 1 config
encoding:
  charset: UTF-8
  lineEnding: auto
# Phase 2 config
hl7v2:
  default_version: '2.5'
  tolerate:
    missing_segments: [NK1, NTE]
    extra_components: true
# Phase 3 config
identifiers:
  mrn:
    assigning_authority: EPIC
    validation: required
# Event classification
event_classification:
  adt_a01:
    patient_class_values:
      I: inpatient
      O: outpatient
      E: emergency
Workflow Engine
The workflow engine routes events through configurable pipelines.
Components
Event ─── Filter ─── Transform ─── Actions
          (match?)   (modify)      (destinations)
Filters determine which events match a route:
- Event type matching
- Source system matching
- CEL expressions for complex conditions
Transforms modify events before routing:
- Set/update fields
- Map terminology codes
- Redact sensitive data
Actions send events to destinations:
- FHIR servers
- Webhooks
- Databases
- Message queues
- Logging
Example Workflow
workflow:
  name: adt_routing
  routes:
    - name: critical_admits
      filter:
        event_type: patient_admit
        condition: event.encounter.class == "inpatient"
      transform:
        - set_field: processed_at = now()
      actions:
        - type: fhir
          endpoint: https://fhir.hospital.com/r4
        - type: log
          message: 'Inpatient admit: {{.Patient.MRN}}'
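The filter, transform, and action stages can be sketched in Go as plain functions. This illustrates the routing idea only: `Route`, `Dispatch`, and `setField` are hypothetical names, the event is reduced to a flat field map, and the filter is an ordinary Go function rather than a CEL expression.

```go
package main

import "fmt"

// Event is a simplified stand-in for a canonical event.
type Event struct {
	Type   string
	Source string
	Fields map[string]string
}

// Route pairs a filter with transforms and actions, applied in that order.
type Route struct {
	Name       string
	Filter     func(Event) bool
	Transforms []func(Event) Event
	Actions    []func(Event) error
}

// setField returns a transform that sets one field on a copy of the
// event's field map, leaving the original event untouched.
func setField(key, value string) func(Event) Event {
	return func(e Event) Event {
		fields := make(map[string]string, len(e.Fields)+1)
		for k, v := range e.Fields {
			fields[k] = v
		}
		fields[key] = value
		e.Fields = fields
		return e
	}
}

// Dispatch runs the event through every matching route and returns the
// names of the routes that completed. It stops at the first failing action.
func Dispatch(routes []Route, ev Event) ([]string, error) {
	var matched []string
	for _, r := range routes {
		if !r.Filter(ev) {
			continue
		}
		out := ev
		for _, t := range r.Transforms {
			out = t(out)
		}
		for _, a := range r.Actions {
			if err := a(out); err != nil {
				return matched, fmt.Errorf("route %s: %w", r.Name, err)
			}
		}
		matched = append(matched, r.Name)
	}
	return matched, nil
}

func main() {
	routes := []Route{{
		Name: "critical_admits",
		Filter: func(e Event) bool {
			return e.Type == "patient_admit" && e.Fields["encounter.class"] == "inpatient"
		},
		Transforms: []func(Event) Event{setField("processed", "true")},
		Actions: []func(Event) error{func(e Event) error {
			fmt.Println("Inpatient admit:", e.Fields["patient.mrn"])
			return nil
		}},
	}}

	ev := Event{Type: "patient_admit", Fields: map[string]string{
		"encounter.class": "inpatient",
		"patient.mrn":     "12345",
	}}
	matched, err := Dispatch(routes, ev)
	fmt.Println(matched, err)
}
```

Each route works on its own copy of the event, so a transform in one route does not leak into another route that matches the same event.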
Warnings Over Errors
Healthcare data is inherently messy. fi-fhir follows a warnings-over-errors philosophy:
- Recoverable issues generate warnings, not failures
- Tolerance rules determine what's acceptable
- Warnings are recorded in event metadata for auditing
// Instead of failing on missing data:
if segment == nil {
    if profile.IsMissingSegmentTolerated(segmentID) {
        addWarning("MISSING_SEGMENT", segmentID)
        return defaultValue, nil // Continue processing
    }
    // Only fail if the profile says so
    return nil, fmt.Errorf("missing required segment %s", segmentID)
}
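A self-contained version of the same pattern might look like the sketch below. `IsMissingSegmentTolerated` and the `MISSING_SEGMENT` code come from the snippet above; the `Profile`, `Warning`, and `Result` types and the `requireSegments` function are assumptions made so the example compiles on its own.

```go
package main

import "fmt"

// Profile holds the tolerance rules relevant to this example.
type Profile struct {
	ToleratedMissing map[string]bool
}

// IsMissingSegmentTolerated reports whether the profile allows the
// given segment to be absent.
func (p Profile) IsMissingSegmentTolerated(id string) bool {
	return p.ToleratedMissing[id]
}

// Warning is a recoverable issue recorded alongside the parsed data.
type Warning struct {
	Code, Detail string
}

// Result carries the segments that were found plus any warnings.
type Result struct {
	Segments map[string]string
	Warnings []Warning
}

// requireSegments checks each expected segment. A tolerated gap becomes
// a warning; an untolerated gap aborts with an error.
func requireSegments(p Profile, present map[string]string, expected []string) (Result, error) {
	res := Result{Segments: present}
	for _, id := range expected {
		if _, ok := present[id]; ok {
			continue
		}
		if !p.IsMissingSegmentTolerated(id) {
			return res, fmt.Errorf("missing required segment %s", id)
		}
		res.Warnings = append(res.Warnings, Warning{Code: "MISSING_SEGMENT", Detail: id})
	}
	return res, nil
}

func main() {
	epic := Profile{ToleratedMissing: map[string]bool{"NK1": true, "NTE": true}}
	present := map[string]string{"PID": "PID|1||12345"}

	res, err := requireSegments(epic, present, []string{"PID", "NK1"})
	fmt.Println(res.Warnings, err)

	_, err = requireSegments(epic, present, []string{"PID", "PV1"})
	fmt.Println(err)
}
```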
Identifier-First Design
Patient identifiers are a first-class concept:
- IdentifierSet handles multiple identifiers (PID-3 repetitions)
- Validators for NPI, MBI, SSN, DEA numbers
- Assigning authority mapping
- Original value preservation for audit
{
  "identifiers": {
    "mrn": {
      "value": "12345",
      "assigning_authority": "EPIC",
      "type": "MR"
    },
    "ssn": {
      "value": "XXX-XX-6789",
      "original": "123-45-6789",
      "redacted": true
    }
  }
}
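To make the validator and redaction ideas concrete, here is a small Go sketch. The NPI check uses the standard rule of prefixing the 10-digit number with 80840 and applying the Luhn algorithm. The function and type names are hypothetical and do not describe fi-fhir's actual API, and the SSN helper assumes the `NNN-NN-NNNN` layout shown above.

```go
package main

import "fmt"

// luhnValid reports whether a string of decimal digits passes the Luhn check.
func luhnValid(digits string) bool {
	if digits == "" {
		return false
	}
	sum := 0
	double := false // the rightmost digit is not doubled
	for i := len(digits) - 1; i >= 0; i-- {
		c := digits[i]
		if c < '0' || c > '9' {
			return false
		}
		d := int(c - '0')
		if double {
			d *= 2
			if d > 9 {
				d -= 9
			}
		}
		sum += d
		double = !double
	}
	return sum%10 == 0
}

// validNPI checks a 10-digit NPI by prefixing 80840 and applying Luhn.
func validNPI(npi string) bool {
	return len(npi) == 10 && luhnValid("80840"+npi)
}

// RedactedIdentifier keeps the display value and the original for audit.
type RedactedIdentifier struct {
	Value    string
	Original string
	Redacted bool
}

// redactSSN masks all but the last four digits of an SSN formatted as
// NNN-NN-NNNN. Other layouts are returned unchanged and unredacted.
func redactSSN(ssn string) RedactedIdentifier {
	if len(ssn) != 11 || ssn[3] != '-' || ssn[6] != '-' {
		return RedactedIdentifier{Value: ssn, Original: ssn}
	}
	return RedactedIdentifier{Value: "XXX-XX-" + ssn[7:], Original: ssn, Redacted: true}
}

func main() {
	fmt.Println(validNPI("1234567893")) // commonly cited sample NPI
	fmt.Println(validNPI("1234567890"))
	fmt.Printf("%+v\n", redactSSN("123-45-6789"))
}
```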
FHIR Mapping
Canonical events map to FHIR R4 resources following US Core profiles:
| Event Type | FHIR Resource(s) |
|---|---|
| patient_admit | Patient, Encounter |
| lab_result | Observation, DiagnosticReport |
| claim_submitted | Claim (Da Vinci PAS) |
| vital_sign | Observation (US Core Vital Signs) |
| document | DocumentReference |
See FHIR Output for complete mapping details.
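The mapping table above can be read as a simple lookup from event type to resource types. The Go sketch below encodes only the rows listed in the table; `fhirResources` and `resourcesFor` are hypothetical names, and the profile annotations (Da Vinci PAS, US Core Vital Signs) are left out.

```go
package main

import "fmt"

// fhirResources lists the FHIR R4 resource types produced per event type,
// copied from the mapping table in this document.
var fhirResources = map[string][]string{
	"patient_admit":   {"Patient", "Encounter"},
	"lab_result":      {"Observation", "DiagnosticReport"},
	"claim_submitted": {"Claim"},
	"vital_sign":      {"Observation"},
	"document":        {"DocumentReference"},
}

// resourcesFor returns the resource types for an event type, or an
// error when the table has no entry for it.
func resourcesFor(eventType string) ([]string, error) {
	r, ok := fhirResources[eventType]
	if !ok {
		return nil, fmt.Errorf("no FHIR mapping for event type %q", eventType)
	}
	return r, nil
}

func main() {
	for _, t := range []string{"patient_admit", "financial_transaction"} {
		res, err := resourcesFor(t)
		fmt.Println(t, res, err)
	}
}
```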
Next Steps
- Source Profiles - Configure parsing for your feeds
- Workflows - Route and transform events
- FHIR Output - Generate compliant FHIR resources