fi-fhir docs

Healthcare Identifier Systems Planning

This document details patient and provider identification systems, normalization strategies, and matching logic for fi-fhir.

Quick Reference

Validator  Status  Checksum             Implementation
NPI                Luhn (80840 prefix)  pkg/validate/identifiers.go:NPIValidator
MBI                Format rules         pkg/validate/identifiers.go:MBIValidator
SSN                Area/group rules     pkg/validate/identifiers.go:SSNValidator
DEA                Weighted checksum    pkg/validate/identifiers.go:DEAValidator

Normalizer  Purpose
SSN         Strip dashes, reject invalid patterns (000000000)
Phone       Strip country code, normalize to 10 digits

Patient Identifiers

Identifier Types Reference

ID Type              OID                            Format           Assigner         Persistence
MRN                  Facility-specific              Varies           Hospital/Clinic  Per facility
Enterprise MRN       Health system OID              Varies           Health system    Across facilities
SSN                  2.16.840.1.113883.4.1          XXX-XX-XXXX      SSA              Lifetime
MBI                  2.16.840.1.113883.4.927        XAXX-XXX-XXXX    CMS              Medicare enrollment
Medicaid ID          State-specific OID             Varies by state  State Medicaid   Enrollment period
Insurance Member ID  Payer OID                      Varies           Payer            Policy period
Driver's License     2.16.840.1.113883.4.3.[state]  Varies           State DMV        Until renewal

MBI (Medicare Beneficiary Identifier)

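Replaced the SSN-based HICN (Health Insurance Claim Number). CMS began issuing MBIs in 2018 and has required them on claims since January 1, 2020.

Format: XAXX-XXX-XXXX (11 characters; the dashes are for display only and are not part of the MBI)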

Position  Allowed Characters
1         1-9 (no 0)
2         A-Z (no S,L,O,I,B,Z)
3         Alphanumeric (no S,L,O,I,B,Z)
4         0-9
5         A-Z (no S,L,O,I,B,Z)
6         Alphanumeric (no S,L,O,I,B,Z)
7         0-9
8-9       A-Z (no S,L,O,I,B,Z)
10-11     0-9

Example: 1EG4-TE5-MK72
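
The position rules can be expressed as a single regular expression. The following is a minimal sketch, not the project's MBIValidator: the names NormalizeMBI and LooksLikeMBI are hypothetical, and the pattern assumes the CMS position types (numeric, letter, alphanumeric, numeric, letter, alphanumeric, numeric, letter, letter, numeric, numeric), which is the layout the example 1EG4-TE5-MK72 follows.

```go
package main

import (
    "regexp"
    "strings"
)

// mbiAlpha is the set of letters allowed in an MBI: A-Z without S, L, O, I, B, Z.
const mbiAlpha = "AC-HJKMNP-RT-Y"

// mbiPattern checks all 11 positions of an MBI written without dashes.
var mbiPattern = regexp.MustCompile(
    "^[1-9][" + mbiAlpha + "][0-9" + mbiAlpha + "][0-9]" + // positions 1-4
        "[" + mbiAlpha + "][0-9" + mbiAlpha + "][0-9]" + // positions 5-7
        "[" + mbiAlpha + "]{2}[0-9]{2}$") // positions 8-11

// NormalizeMBI uppercases the input and removes display dashes and outer whitespace.
func NormalizeMBI(input string) string {
    s := strings.ToUpper(strings.TrimSpace(input))
    return strings.ReplaceAll(s, "-", "")
}

// LooksLikeMBI reports whether the input is well formed. It checks format only;
// the MBI has no check digit, so it cannot confirm the MBI was ever issued.
func LooksLikeMBI(input string) bool {
    return mbiPattern.MatchString(NormalizeMBI(input))
}
```

Under these assumptions, LooksLikeMBI("1EG4-TE5-MK72") passes, while a value starting with 0 or containing an excluded letter such as S does not.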

State Medicaid ID Formats

State  Format          Example       Notes
CA     9 digits        123456789     Medi-Cal
NY     8 alphanumeric  AB123456
TX     9 digits        012345678
FL     10 digits       1234567890
VA     12 digits       123456789012

HL7v2 PID-3 Structure (Patient Identifier List)

PID-3: Patient Identifier List (CX data type, repeating)

CX Components:
1. ID Number (ST)           - The actual identifier value
2. Check Digit (ST)         - Optional validation digit
3. Check Digit Scheme (ID)  - Algorithm used (M10, M11, etc.)
4. Assigning Authority (HD) - Who issued the ID
5. Identifier Type Code (ID)- Type (MR, SS, AN, etc.)
6. Assigning Facility (HD)  - Facility issuing
7-10. Effective Date, Expiration Date, Assigning Jurisdiction, Assigning Agency or Department (all optional)

Example:
123456789^^^HOSPITAL_A^MR~111-22-3333^^^SSA^SS~BC1234567^^^BCBS^PI

Repetition  ID (CX.1)    Assigner (CX.4)  Type (CX.5)
1           123456789    HOSPITAL_A       MR
2           111-22-3333  SSA              SS
3           BC1234567    BCBS             PI
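
To make the CX layout concrete, here is a small sketch of splitting a raw PID-3 value into repetitions and components. It is not the parser in internal/parser/hl7v2/parser.go: the CX struct and ParsePID3 are hypothetical names, and the sketch assumes the default HL7v2 delimiters (~ and ^) and ignores escape sequences and HD subcomponents.

```go
package main

import "strings"

// CX holds the first six components of an HL7v2 CX value.
type CX struct {
    ID                 string // CX.1
    CheckDigit         string // CX.2
    CheckDigitScheme   string // CX.3
    AssigningAuthority string // CX.4 (raw HD, subcomponents not split)
    TypeCode           string // CX.5
    AssigningFacility  string // CX.6 (raw HD)
}

// ParsePID3 splits a PID-3 field on '~' (repetitions) and '^' (components).
// Empty repetitions are skipped and missing trailing components become "".
func ParsePID3(field string) []CX {
    var out []CX
    for _, rep := range strings.Split(field, "~") {
        if rep == "" {
            continue
        }
        parts := strings.Split(rep, "^")
        for len(parts) < 6 {
            parts = append(parts, "")
        }
        out = append(out, CX{
            ID:                 parts[0],
            CheckDigit:         parts[1],
            CheckDigitScheme:   parts[2],
            AssigningAuthority: parts[3],
            TypeCode:           parts[4],
            AssigningFacility:  parts[5],
        })
    }
    return out
}
```

Each returned CX can then be mapped to an identifier by its TypeCode, for example routing "SS" values through SSN normalization.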

Identifier Type Codes (HL7 Table 0203)

Code  Name                    Description
MR    Medical Record Number   Facility MRN
PI    Patient Internal ID     Organization-specific
SS    Social Security Number  US SSN
AN    Account Number          Billing account
PT    Patient External ID     External system ID
MB    Member Number           Insurance member ID
MA    Medicaid Number         State Medicaid
MC    Medicare Number         Medicare (MBI/HICN)
DL    Driver's License        State-issued
PPN   Passport Number         International ID

Provider Identifiers

NPI (National Provider Identifier)

Format: 10 digits, Luhn check digit

Type 1: Individual providers (physicians, nurses, etc.)
Type 2: Organizations (hospitals, groups, etc.)

Validation:
1. Must be exactly 10 digits
2. Luhn algorithm checksum (prefix "80840")
3. First digit cannot be 0

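Example: 1234567893

// ValidateNPI reports whether npi is a 10-digit NPI with a valid
// Luhn check digit.
func ValidateNPI(npi string) bool {
    if len(npi) != 10 || !isNumeric(npi) {
        return false
    }

    // First digit cannot be 0
    if npi[0] == '0' {
        return false
    }

    // Luhn check with the 80840 prefix
    return luhnCheck("80840" + npi)
}

// isNumeric reports whether s consists only of ASCII digits.
func isNumeric(s string) bool {
    for i := 0; i < len(s); i++ {
        if s[i] < '0' || s[i] > '9' {
            return false
        }
    }
    return true
}

// luhnCheck validates a digit string with the Luhn algorithm: starting
// from the rightmost digit, every second digit is doubled (subtracting 9
// when the result exceeds 9) and the total must be divisible by 10.
func luhnCheck(number string) bool {
    sum := 0
    alternate := false
    for i := len(number) - 1; i >= 0; i-- {
        n := int(number[i] - '0')
        if alternate {
            n *= 2
            if n > 9 {
                n -= 9
            }
        }
        sum += n
        alternate = !alternate
    }
    return sum%10 == 0
}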
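
The same Luhn arithmetic can be run in the other direction to derive the check digit from the nine leading digits. The sketch below is illustrative only; NPICheckDigit is a hypothetical helper, not part of pkg/validate. For the base 123456789, the weighted sum over 80840123456789 is 67, so the check digit is 3, which gives the example NPI 1234567893.

```go
package main

// NPICheckDigit computes the Luhn check digit for a 9-digit NPI base,
// using the "80840" prefix. The second result is false when the input
// is not exactly nine ASCII digits.
func NPICheckDigit(base string) (int, bool) {
    if len(base) != 9 {
        return 0, false
    }
    payload := "80840" + base
    sum := 0
    double := true // the digit next to the (future) check digit is doubled
    for i := len(payload) - 1; i >= 0; i-- {
        c := payload[i]
        if c < '0' || c > '9' {
            return 0, false
        }
        n := int(c - '0')
        if double {
            n *= 2
            if n > 9 {
                n -= 9
            }
        }
        sum += n
        double = !double
    }
    // The check digit brings the total up to the next multiple of 10.
    return (10 - sum%10) % 10, true
}
```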

DEA Number

Format: 2 letters + 6 digits + 1 check digit

First Letter: Registrant type
  A, B, F, G = Hospitals, clinics, practitioners, teaching institutions, pharmacies
  M = Mid-level practitioners
  P, R = Manufacturers, distributors, researchers

Second Letter: First letter of last name (usually)

Check Digit Formula:
1. Add the 1st, 3rd, and 5th digits
2. Add the 2nd, 4th, and 6th digits, then multiply the sum by 2
3. Add the results of steps 1 and 2; the last digit of the total is the check digit

Example: AB1234563
  Step 1: 1 + 3 + 5 = 9
  Step 2: (2 + 4 + 6) × 2 = 24
  Sum: 9 + 24 = 33, check digit = 3 ✓
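
The check digit steps above can be sketched in Go as follows. This is a minimal illustration rather than the project's DEAValidator: ValidDEAChecksum is a hypothetical name, and the sketch assumes an uppercase value with exactly two leading letters, as in the format line above.

```go
package main

// ValidDEAChecksum reports whether dea has the shape "2 letters + 7 digits"
// and whether its final digit matches the DEA check digit formula.
func ValidDEAChecksum(dea string) bool {
    if len(dea) != 9 {
        return false
    }
    // Positions 1-2 must be uppercase ASCII letters.
    for i := 0; i < 2; i++ {
        if dea[i] < 'A' || dea[i] > 'Z' {
            return false
        }
    }
    // Positions 3-9 must be digits; d[6] is the check digit.
    var d [7]int
    for i := 0; i < 7; i++ {
        c := dea[2+i]
        if c < '0' || c > '9' {
            return false
        }
        d[i] = int(c - '0')
    }
    odd := d[0] + d[2] + d[4]  // step 1: 1st, 3rd, 5th digits
    even := d[1] + d[3] + d[5] // step 2: 2nd, 4th, 6th digits (doubled below)
    return (odd+2*even)%10 == d[6]
}
```

For AB1234563 this computes 9 + 24 = 33 and compares the last digit, 3, with the ninth character.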

Provider ID Cross-Reference

System         Example     Use Case
NPI            1234567893  Universal provider ID
DEA            AB1234563   Controlled substance prescribing
State License  VA-12345    State-specific practice
Medicare PTAN  I12345      Medicare billing
Medicaid ID    (varies)    Medicaid billing
Hospital ID    DR001       Internal systems
UPIN (legacy)  A12345      Pre-NPI Medicare

Patient Matching Strategy

The Challenge

No universal patient ID means probabilistic matching based on demographics:

Input: New patient record
- Name: John W Smith
- DOB: 1965-03-15
- SSN: (declined)
- Phone: 555-123-4567
- Address: 123 Main St, Anytown VA

Question: Is this the same "John Smith" we already have?

Matching Algorithm Options

1. Deterministic Matching

Exact match on one or more fields:

type DeterministicMatch struct {
    Fields   []string // e.g., ["ssn"] or ["mrn", "facility"]
    Required bool
}

// Match if SSN matches exactly
if record1.SSN == record2.SSN && record1.SSN != "" {
    return MatchConfirmed
}

2. Probabilistic Matching

Weighted scoring across multiple fields:

Field           Weight  Match Type
SSN             0.95    Exact
MRN + Facility  0.90    Exact
DOB             0.80    Exact
Last Name       0.60    Soundex/phonetic
First Name      0.50    Soundex/phonetic
Address         0.40    Normalized comparison
Phone           0.70    Last 7 digits
Gender          0.30    Exact

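type MatchScore struct {
    Score      float64            // 0.0 - 1.0
    Confidence string             // "high", "medium", "low", "no_match"
    Details    map[string]float64 // Per-field scores
}

// ScoreMatch returns the weighted share of comparable fields that agree.
func ScoreMatch(a, b *Patient) MatchScore {
    score := 0.0
    maxScore := 0.0
    details := map[string]float64{}

    // SSN match (counted only when both records carry an SSN)
    if a.SSN != "" && b.SSN != "" {
        maxScore += 0.95
        if a.SSN == b.SSN {
            score += 0.95
            details["ssn"] = 0.95
        }
    }

    // DOB match
    maxScore += 0.80
    if !a.DOB.IsZero() && a.DOB.Equal(b.DOB) {
        score += 0.80
        details["dob"] = 0.80
    }

    // Last name match (phonetic). Empty names are skipped because two
    // empty strings could otherwise compare as equal codes.
    maxScore += 0.60
    if a.LastName != "" && b.LastName != "" &&
        soundex(a.LastName) == soundex(b.LastName) {
        score += 0.60
        details["last_name"] = 0.60
    }

    // ... more fields

    normalized := score / maxScore
    return MatchScore{
        Score:      normalized,
        Confidence: classifyConfidence(normalized),
        Details:    details,
    }
}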
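
ScoreMatch relies on a soundex helper that is not shown here. As an illustration of what such a helper could look like, the sketch below implements standard American Soundex; the exported name Soundex is an assumption and this is not necessarily the code in pkg/matching/similarity.go.

```go
package main

// soundexCodes maps A-Z to Soundex digits; '0' marks letters that are
// not coded (vowels, H, W, Y).
const soundexCodes = "01230120022455012623010202"

// Soundex returns the 4-character American Soundex code of name,
// or "" when name contains no ASCII letters.
func Soundex(name string) string {
    var out []byte
    var prev byte // code of the previous letter ('0' after a vowel)
    for i := 0; i < len(name); i++ {
        c := name[i]
        if c >= 'a' && c <= 'z' {
            c -= 'a' - 'A'
        }
        if c < 'A' || c > 'Z' {
            continue // skip spaces, hyphens, and non-ASCII bytes
        }
        code := soundexCodes[c-'A']
        if len(out) == 0 {
            out = append(out, c) // the first letter is kept as-is
            prev = code
            continue
        }
        if c == 'H' || c == 'W' {
            continue // H and W do not separate equal codes
        }
        if code != '0' && code != prev {
            out = append(out, code)
            if len(out) == 4 {
                break
            }
        }
        prev = code
    }
    for len(out) > 0 && len(out) < 4 {
        out = append(out, '0') // pad short codes with zeros
    }
    return string(out)
}
```

With this sketch, "Smith" and "Smyth" both map to S530, so they would earn the last-name weight, while an empty name maps to "" and should be excluded before comparison.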

3. Machine Learning Matching

Train model on confirmed match/no-match pairs:

  • Input features: field similarity scores
  • Output: match probability
  • Requires labeled training data

Match Thresholds

matching:
  thresholds:
    confirmed: 0.95 # Auto-link records
    review: 0.70 # Human review needed
    no_match: 0.40 # Definitely different

  rules:
    # SSN alone confirms match
    - condition: ssn_exact_match
      action: confirmed

    # MRN + DOB confirms
    - condition: mrn_exact_match AND dob_exact_match
      action: confirmed

    # Name + DOB + phone suggests review
    - condition: name_phonetic_match AND dob_exact_match AND phone_partial
      action: review

Identifier Normalization

SSN Normalization

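import (
    "regexp"
    "strings"
)

// nonDigits matches every character that is not an ASCII digit.
var nonDigits = regexp.MustCompile(`\D`)

// NormalizeSSN returns the bare 9-digit SSN, or "" if the input is unusable.
func NormalizeSSN(input string) string {
    // Remove non-digits
    ssn := nonDigits.ReplaceAllString(input, "")

    // Must be 9 digits
    if len(ssn) != 9 {
        return ""
    }

    // Invalid pattern: all same digit (000000000, 111111111, ...)
    if strings.Count(ssn, ssn[:1]) == 9 {
        return ""
    }

    // Invalid pattern: sequential placeholder
    if ssn == "123456789" {
        return ""
    }

    return ssn
}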
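
Beyond placeholder patterns, the area/group rules mentioned in the Quick Reference can be checked on the normalized value. The sketch below assumes the commonly documented SSA constraints (area 000, 666, and 900-999 are never issued; group 00 and serial 0000 are invalid). PlausibleSSN is a hypothetical name and not necessarily how SSNValidator is written.

```go
package main

// PlausibleSSN reports whether a normalized 9-digit SSN passes basic
// area/group/serial checks. It cannot tell whether the number was issued.
func PlausibleSSN(ssn string) bool {
    if len(ssn) != 9 {
        return false
    }
    for i := 0; i < len(ssn); i++ {
        if ssn[i] < '0' || ssn[i] > '9' {
            return false
        }
    }
    area, group, serial := ssn[0:3], ssn[3:5], ssn[5:9]
    if area == "000" || area == "666" || area[0] == '9' {
        return false // area numbers that are never assigned
    }
    if group == "00" || serial == "0000" {
        return false
    }
    return true
}
```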

Phone Normalization

func NormalizePhone(input string) string {
    digits := regexp.MustCompile(`\d`).FindAllString(input, -1)
    phone := strings.Join(digits, "")

    // Remove country code if present
    if len(phone) == 11 && phone[0] == '1' {
        phone = phone[1:]
    }

    if len(phone) != 10 {
        return ""
    }

    return phone
}
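
The weight table compares phone numbers on their last 7 digits, which ignores area code differences. A minimal sketch of that comparison on normalized numbers follows; PhoneLast7Match is a hypothetical helper name.

```go
package main

// PhoneLast7Match reports whether two normalized phone numbers share
// their last seven digits. Inputs shorter than seven characters, such
// as the "" returned for unusable numbers, never match.
func PhoneLast7Match(a, b string) bool {
    if len(a) < 7 || len(b) < 7 {
        return false
    }
    return a[len(a)-7:] == b[len(b)-7:]
}
```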

Name Normalization

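import (
    "regexp"
    "strings"
)

// nonNameChars matches anything other than A-Z and whitespace.
// Note: this also drops accented and other non-ASCII letters.
var nonNameChars = regexp.MustCompile(`[^A-Z\s]`)

func NormalizeName(input string) string {
    // Uppercase
    name := strings.ToUpper(strings.TrimSpace(input))

    prefixes := []string{"MR", "MRS", "MS", "DR", "MISS"}
    suffixes := []string{"JR", "SR", "II", "III", "IV", "MD", "PHD"}

    // Remove prefixes ("DR SMITH", "DR. SMITH")
    for _, p := range prefixes {
        name = strings.TrimPrefix(name, p+" ")
        name = strings.TrimPrefix(name, p+".")
    }

    // Remove punctuation
    name = nonNameChars.ReplaceAllString(name, "")

    // Collapse whitespace
    name = strings.Join(strings.Fields(name), " ")

    // Remove a trailing suffix ("SMITH JR", "SMITH III")
    for _, s := range suffixes {
        name = strings.TrimSuffix(name, " "+s)
    }

    return name
}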

fi-fhir Canonical Identifier Model

// Identifier represents any healthcare identifier
type Identifier struct {
    // The identifier value (normalized)
    Value string `json:"value"`

    // Code system URI or OID
    System string `json:"system"`

    // Type code (from HL7 Table 0203)
    Type string `json:"type"`

    // Who assigned this identifier
    Assigner string `json:"assigner,omitempty"`

    // Original value before normalization
    OriginalValue string `json:"original_value,omitempty"`

    // Validity period
    Period *Period `json:"period,omitempty"`

    // Use (usual, official, temp, secondary, old)
    Use string `json:"use,omitempty"`
}

// IdentifierSet manages multiple identifiers for an entity
type IdentifierSet struct {
    Identifiers []Identifier `json:"identifiers"`

    // Primary identifier (usually MRN from source)
    Primary *Identifier `json:"primary,omitempty"`
}

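func (s *IdentifierSet) GetByType(idType string) *Identifier {
    // Index into the slice so the returned pointer refers to the stored
    // element rather than to a copy held in the loop variable.
    for i := range s.Identifiers {
        if s.Identifiers[i].Type == idType {
            return &s.Identifiers[i]
        }
    }
    return nil
}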

func (s *IdentifierSet) GetBySystem(system string) []Identifier {
    var result []Identifier
    for _, id := range s.Identifiers {
        if id.System == system {
            result = append(result, id)
        }
    }
    return result
}

Implementation Plan

Phase 1: Core Identifier Parsing ✅

  • Parse HL7v2 PID-3 (repeating CX) - see internal/parser/hl7v2/parser.go
  • Parse HL7v2 XCN (provider names with IDs)
  • Basic normalization (SSN, phone) - see pkg/validate/identifiers.go

Phase 2: Validation ✅

  • NPI Luhn validation - see pkg/validate/identifiers.go:NPIValidator
  • DEA checksum validation - see pkg/validate/identifiers.go:DEAValidator
  • MBI format validation - see pkg/validate/identifiers.go:MBIValidator
  • SSN reasonableness checks - see pkg/validate/identifiers.go:SSNValidator
  • Phone/SSN normalizers - see pkg/validate/identifiers.go

Phase 3: Matching Engine ✅

  • Deterministic matching rules - see pkg/matching/deterministic.go
  • Probabilistic scoring (Jaro-Winkler, Soundex) - see pkg/matching/similarity.go, pkg/matching/scorer.go
  • Configurable thresholds - see pkg/matching/matcher.go:MatcherConfig

Phase 4: MPI Interface ✅

  • Abstract MPI interface - see pkg/matching/mpi.go
  • In-memory implementation for testing - see pkg/matching/mpi_memory.go
  • External MPI integration (future)

Testing Data

var testPatients = []Patient{
    {
        MRN: "123456",
        Identifiers: []Identifier{
            {Value: "123456", Type: "MR", System: "urn:oid:1.2.3.4"},
            {Value: "111223333", Type: "SS", System: "urn:oid:2.16.840.1.113883.4.1"},
        },
        FamilyName: "SMITH",
        GivenName: "JOHN",
        DOB: time.Date(1965, 3, 15, 0, 0, 0, 0, time.UTC),
    },
    // Same SSN and DOB, different facility MRN, middle initial added:
    // expected to match via the SSN rule
    {
        MRN: "789012",
        Identifiers: []Identifier{
            {Value: "789012", Type: "MR", System: "urn:oid:5.6.7.8"},
            {Value: "111223333", Type: "SS", System: "urn:oid:2.16.840.1.113883.4.1"},
        },
        FamilyName: "SMITH",
        GivenName: "JOHN W",
        DOB: time.Date(1965, 3, 15, 0, 0, 0, 0, time.UTC),
    },
}
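
A test over this data might check that the two records share an SSN identifier even though their MRNs differ. The sketch below uses a trimmed-down stand-in for the Identifier type and a hypothetical helper, SharedIdentifier; it is not the matcher in pkg/matching.

```go
package main

// Identifier is a reduced stand-in for the canonical model, keeping
// only the fields needed for the comparison.
type Identifier struct {
    Value  string
    Type   string
    System string
}

// SharedIdentifier reports whether both lists contain an identifier of
// the given type with the same System and a non-empty, equal Value.
func SharedIdentifier(a, b []Identifier, idType string) bool {
    for _, x := range a {
        if x.Type != idType || x.Value == "" {
            continue
        }
        for _, y := range b {
            if y.Type == idType && y.System == x.System && y.Value == x.Value {
                return true
            }
        }
    }
    return false
}
```

For the two test patients, the check should succeed for type "SS" and fail for type "MR", since the MRNs come from different facilities.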
