Pipeline
Phase 1
Byte Normalization
Normalize encoding and line endings
Input: Raw bytes (UTF-8 with CRLF)
Output: Normalized UTF-8 string with LF
Phase 2
Syntactic Parsing
Parse HL7v2 structure
Input: Normalized string
Output: Parsed segments and fields
Phase 3
Semantic Extraction
Extract meaningful data
Input: Parsed message structure
Output: Extracted identifiers and data
Phase 1: Byte Normalization
- Detect BOM markers (if present)
- Convert charset to UTF-8
- Normalize line endings to LF
- Strip trailing whitespace
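
The four steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the pipeline's actual implementation: the function name `normalize_bytes` and the `fallback_charset` parameter are hypothetical, and the sketch assumes that "strip trailing whitespace" means stripping at the end of each line and at the end of the message. Python strings are Unicode, so the result would still need `.encode("utf-8")` if UTF-8 bytes are required downstream.

```python
import codecs

# Known BOMs. UTF-32 is listed before UTF-16 because the UTF-32 LE BOM
# begins with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def normalize_bytes(raw: bytes, fallback_charset: str = "utf-8") -> str:
    """Decode raw bytes and return text with LF line endings.

    `fallback_charset` (a hypothetical parameter) is used when no BOM is found.
    """
    # Step 1: detect a BOM, if present, and remove it.
    charset = fallback_charset
    for bom, name in _BOMS:
        if raw.startswith(bom):
            raw = raw[len(bom):]
            charset = name
            break

    # Step 2: decode from the source charset into Unicode text.
    text = raw.decode(charset)

    # Step 3: normalize CRLF and bare CR to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    # Step 4: strip trailing whitespace per line, then trailing empty lines.
    lines = [line.rstrip() for line in text.split("\n")]
    return "\n".join(lines).rstrip("\n")
```

Calling `normalize_bytes(raw).encode("utf-8")` would yield the normalized UTF-8 bytes for the next phase.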
Sample Message
MSH|^~\&|ADT|HOSPITAL|RECEIVER|LAB|202401151023||ADT^A01|MSG001|P|2.5.1
PID|1||MRN12345^^^HOSPITAL^MR||Smith^John^A||19850315|M
PV1|1|I|ICU^101^A|||||||||||||||VN98765^^^HOSPITAL^VN
This ADT^A01 message flows through all three phases, from raw bytes to structured FHIR resources.
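
To make Phases 2 and 3 concrete, here is a minimal Python sketch that parses the sample message into segments and fields, then extracts a few identifiers. It is a simplification under stated assumptions: the names `parse_hl7`, `component`, and `extract` are hypothetical, the separators `|` and `^` are hardcoded instead of being read from MSH-1 and MSH-2, and repetitions (`~`), subcomponents (`&`), and escape sequences are ignored. It stops at the extracted data and does not build FHIR resources.

```python
# The sample message after Phase 1: one segment per line, LF-separated.
SAMPLE = "\n".join([
    r"MSH|^~\&|ADT|HOSPITAL|RECEIVER|LAB|202401151023||ADT^A01|MSG001|P|2.5.1",
    "PID|1||MRN12345^^^HOSPITAL^MR||Smith^John^A||19850315|M",
    "PV1|1|I|ICU^101^A|||||||||||||||VN98765^^^HOSPITAL^VN",
])


def parse_hl7(message: str) -> dict[str, list[list[str]]]:
    """Phase 2 sketch: map each segment ID to a list of field lists."""
    segments: dict[str, list[list[str]]] = {}
    for line in message.split("\n"):
        if not line:
            continue
        fields = line.split("|")
        if fields[0] == "MSH":
            # MSH-1 is the field separator itself; re-insert it so that
            # fields[n] lines up with MSH-n, as it does for other segments.
            fields.insert(1, "|")
        segments.setdefault(fields[0], []).append(fields)
    return segments


def component(field: str, index: int) -> str:
    """Return the 1-based component of a field, or "" if it is absent."""
    parts = field.split("^")
    return parts[index - 1] if index <= len(parts) else ""


def extract(segments: dict[str, list[list[str]]]) -> dict[str, str]:
    """Phase 3 sketch: pull a few identifiers out of MSH, PID, and PV1."""
    msh = segments["MSH"][0]
    pid = segments["PID"][0]
    pv1 = segments["PV1"][0]
    return {
        "message_type": component(msh[9], 1),   # MSH-9.1
        "trigger_event": component(msh[9], 2),  # MSH-9.2
        "control_id": msh[10],                  # MSH-10
        "mrn": component(pid[3], 1),            # PID-3.1
        "family_name": component(pid[5], 1),    # PID-5.1
        "given_name": component(pid[5], 2),     # PID-5.2
        "birth_date": pid[7],                   # PID-7
        "sex": pid[8],                          # PID-8
        "patient_class": pv1[2],                # PV1-2
    }
```

Running `extract(parse_hl7(SAMPLE))` returns a flat dictionary keyed by the hypothetical names above, which a later mapping step could turn into FHIR resources.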
Configuration Fields
- encoding.charset
- encoding.lineEnding
- encoding.bomHandling

These fields in your Source Profile control byte normalization behavior.
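
As an illustration of how these dotted field names might map onto a nested profile, the Python sketch below models a Source Profile as a dictionary and resolves a field by its dotted path. Only the three field names come from this page; the values shown (`"ISO-8859-1"`, `"CRLF"`, `"strip"`), the dictionary layout, and the helper `get_setting` are assumptions made for the example, not documented options.

```python
# Hypothetical Source Profile fragment. The values are illustrative only;
# consult the Source Profile reference for the options actually supported.
source_profile = {
    "encoding": {
        "charset": "ISO-8859-1",
        "lineEnding": "CRLF",
        "bomHandling": "strip",
    }
}


def get_setting(profile: dict, dotted_key: str, default=None):
    """Resolve a dotted key such as "encoding.charset" in a nested dict."""
    node = profile
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node
```

For example, `get_setting(source_profile, "encoding.charset")` returns the charset configured for the source, and a missing key falls back to `default`.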