Healthcare API Integration: Patient Matching & Documentation Gaps
Building resilient healthcare integrations that handle undocumented API behaviors, focusing on patient matching workflows with defensive validation layers.
Overview
Healthcare API integrations are uniquely challenging: patient identity must be handled with extreme precision, yet many APIs have undocumented parameter interactions that can cause silent failures. This case study explores building a resilient patient matching integration that handles real-world API quirks.
The Challenge
I worked on a patient-matching API from both sides:
- As API support owner at a vendor, helping teams debug production behavior
- As an API consumer on an integration team, building workflows that depended on predictable semantics
The core problem: an "enhanced best match" patient-search endpoint whose parameters interact in ways that aren't documented, causing the minimum match score threshold not to behave as expected.
The Specific Issue
The API documented two parameters:
- `minscore`: "require any patient matched to have a score greater than or equal to this value"
- `returnbestmatches`: "the top five patients with a score of 16 or above will be returned"

What I observed: when both parameters were used, `minscore` was silently ignored. Setting `minscore=20` still returned patients with scores below 20.
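To make the failure mode concrete, here is a minimal Go sketch of the call and the client-side check; the endpoint path, query encoding, and response field names are assumptions for illustration, not the vendor's documented contract.

```go
// Sketch of the failing interaction. Endpoint path, query encoding, and
// response fields are assumptions for illustration only.
package matching

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

type searchResponse struct {
	Patients []struct {
		PatientID string `json:"patientid"`
		Score     int    `json:"score"`
	} `json:"patients"`
}

func searchBestMatch(baseURL string, demo url.Values, minScore int) (*searchResponse, error) {
	demo.Set("minscore", fmt.Sprint(minScore))
	demo.Set("returnbestmatches", "true")

	resp, err := http.Get(baseURL + "/patients/enhancedbestmatch?" + demo.Encode())
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	var out searchResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}

	// Documented contract: nothing below minscore should come back.
	// Observed: with returnbestmatches also set, lower scores appear anyway.
	for _, p := range out.Patients {
		if p.Score < minScore {
			fmt.Printf("contract violation: patient %s scored %d (< %d)\n",
				p.PatientID, p.Score, minScore)
		}
	}
	return &out, nil
}
```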
Why This Matters
Patient matching errors have real consequences:
- False positives: Linking to the wrong chart is catastrophic
- False negatives: Duplicate records cause workflow friction
- Silent failures: Teams don't know the API isn't doing what they expect
The Approach
Phase 1: Defensive Validation Layer
I implemented a “belt and suspenders” approach that doesn’t trust the API contract:
```go
// MatchResult pairs the API's reported score with a locally computed score
// so the two can be cross-checked before a match is accepted.
// Patient, Demographics, and MatchConfidence are the integration's own
// domain types (definitions omitted here).
type MatchResult struct {
	Patient    Patient
	APIScore   int
	LocalScore float64
	Confidence MatchConfidence
}

// ValidateMatch accepts a candidate only if the API score, the local score,
// and the high-signal demographic fields all agree. MinimumThreshold and
// LocalMinimum are package-level constants (illustrative values shown below).
func ValidateMatch(result MatchResult, demographics Demographics) bool {
	// Never trust API score alone
	if result.APIScore < MinimumThreshold {
		return false
	}
	// Apply local validation
	localScore := calculateLocalScore(result.Patient, demographics)
	if localScore < LocalMinimum {
		return false
	}
	// Require high-signal field matches
	if !exactMatch(result.Patient.DOB, demographics.DOB) {
		return false
	}
	return true
}
```
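For context, here is how the validator might be applied to a batch of API candidates; the threshold constants are illustrative values, not the ones actually used in production.

```go
// Illustrative thresholds (assumptions, not the tuned production values).
const (
	MinimumThreshold = 16   // floor for the vendor-reported score
	LocalMinimum     = 60.0 // floor for the locally computed score
)

// filterMatches applies the defensive layer to every candidate the API returns,
// so a silently ignored minscore parameter can't leak low-confidence matches.
func filterMatches(candidates []MatchResult, demo Demographics) []MatchResult {
	accepted := make([]MatchResult, 0, len(candidates))
	for _, c := range candidates {
		if ValidateMatch(c, demo) {
			accepted = append(accepted, c)
		}
	}
	return accepted
}
```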
Phase 2: Contract Testing
I built a suite of deterministic test fixtures using synthetic patients:
- Score monotonicity: More matching fields shouldn't reduce scores
- Threshold behavior: Verify filters actually work
- Parameter interactions: Document what really happens
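A minimal sketch of one such contract test, reusing the `searchBestMatch` sketch above; the sandbox URL, fixture helper, and demographic parameter names are illustrative, not the vendor's actual sandbox.

```go
package matching

import (
	"net/url"
	"testing"
)

// Sandbox URL and demographic parameter names are illustrative fixtures.
const testBaseURL = "https://partner-sandbox.example.test"

func syntheticDemographics(last, first, dob string) url.Values {
	return url.Values{
		"lastname":  {last},
		"firstname": {first},
		"dob":       {dob},
	}
}

// Threshold behavior: nothing below minscore should come back, even when
// returnbestmatches is also set.
func TestMinScoreHonoredWithBestMatches(t *testing.T) {
	demo := syntheticDemographics("GARCIA", "MARIA", "1985-03-12")

	resp, err := searchBestMatch(testBaseURL, demo, 20)
	if err != nil {
		t.Fatalf("search failed: %v", err)
	}

	for _, p := range resp.Patients {
		if p.Score < 20 {
			t.Errorf("patient %s scored %d, below the requested minscore of 20",
				p.PatientID, p.Score)
		}
	}
}
```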
Phase 3: Instrumentation
I logged (securely) the shape of every matching request:
- Which parameters were present/absent
- Distribution of match scores returned
- How often humans overrode the "best" match
This data quickly revealed whether a given behavior was a bug, a documentation gap, or an expected tradeoff.
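A minimal sketch of that request-shape telemetry, using Go's structured logging (`log/slog`); the field names are assumptions, and no PHI is captured.

```go
package matching

import "log/slog"

// matchTelemetry captures the shape of a matching request without any PHI:
// which parameters were present, what scores came back, and whether a human
// overrode the automatic pick. Field names are illustrative.
type matchTelemetry struct {
	ParamsPresent    []string `json:"params_present"`     // e.g. ["minscore", "returnbestmatches"]
	ScoresReturned   []int    `json:"scores_returned"`    // score distribution only, no identities
	HumanOverrodeTop bool     `json:"human_overrode_top"` // reviewer rejected the "best" match
}

func logMatchTelemetry(logger *slog.Logger, t matchTelemetry) {
	logger.Info("patient_match_request",
		"params_present", t.ParamsPresent,
		"scores_returned", t.ScoresReturned,
		"human_overrode_top", t.HumanOverrodeTop,
	)
}
```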
Implementation Details
Local Scoring Algorithm
```go
func calculateLocalScore(patient Patient, demo Demographics) float64 {
	score := 0.0
	// High-signal fields (weighted heavily)
	if fuzzyMatch(patient.LastName, demo.LastName) {
		score += 30.0
	}
	if exactMatch(patient.DOB, demo.DOB) {
		score += 25.0
	}
	if exactMatch(patient.SSNLast4, demo.SSNLast4) {
		score += 20.0
	}
	// Medium-signal fields
	if fuzzyMatch(patient.FirstName, demo.FirstName) {
		score += 15.0
	}
	if exactMatch(patient.Gender, demo.Gender) {
		score += 5.0
	}
	// Address matching (low signal, often stale)
	if fuzzyMatch(patient.ZIP, demo.ZIP) {
		score += 5.0
	}
	return score
}
```
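The `fuzzyMatch` and `exactMatch` helpers aren't shown above, so here is a minimal stand-in, assuming the demographic fields are plain strings; a production version would more likely use an edit-distance or phonetic comparison.

```go
package matching

import "strings"

// Simplified stand-ins for the real helpers, using normalized
// uppercase comparison only.
func normalize(s string) string {
	return strings.ToUpper(strings.TrimSpace(s))
}

func exactMatch(a, b string) bool {
	na, nb := normalize(a), normalize(b)
	return na != "" && na == nb
}

func fuzzyMatch(a, b string) bool {
	na, nb := normalize(a), normalize(b)
	if na == "" || nb == "" {
		return false
	}
	// Exact or prefix agreement after normalization counts as "close enough"
	// for this sketch; edit distance or phonetics would be more robust.
	return na == nb || strings.HasPrefix(na, nb) || strings.HasPrefix(nb, na)
}
```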
Manual Review Workflow
For borderline matches (confidence 70-85%), I implemented a review queue:
- Highlighted discrepancies between API and local scores
- Showed field-by-field comparison
- Tracked reviewer decisions for model improvement
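As a sketch, routing could look like the following, assuming the local score is on a 0-100 scale (matching the weights above); `diffFields` is a hypothetical helper that produces the field-by-field comparison shown to reviewers.

```go
// reviewItem is an illustrative shape for a queue entry; the real queue's
// storage and UI are omitted.
type reviewItem struct {
	Result     MatchResult // carries both APIScore and LocalScore for comparison
	FieldDiffs []string    // human-readable field-by-field discrepancies
}

// routeMatch sends borderline candidates (local score 70-85) to human review
// instead of auto-linking or auto-rejecting them.
func routeMatch(result MatchResult, demo Demographics, queue chan<- reviewItem) string {
	switch {
	case result.LocalScore >= 85:
		return "auto-accept"
	case result.LocalScore >= 70:
		queue <- reviewItem{
			Result:     result,
			FieldDiffs: diffFields(result.Patient, demo), // hypothetical helper
		}
		return "manual-review"
	default:
		return "auto-reject"
	}
}
```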
Results
Before: The Pain Points
| Issue | Impact |
|---|---|
| Parameter interaction surprises | 2-3 day debug cycles |
| Silent threshold violations | Undetected bad matches |
| Tribal knowledge | New team members struggled |
After: Measurable Improvements
| Metric | Result |
|---|---|
| Match accuracy | 99.2% (up from ~95%) |
| Debug cycle time | 75% reduction |
| Escalations to vendor | 90% reduction |
| New engineer onboarding | 2 days (was 2 weeks) |
Cost of Defensive Approach
- Additional latency: ~15ms per request (local validation)
- Storage for audit logs: ~50GB/month
- Engineering investment: 3 weeks initial build
The ROI was clear within the first month when I caught 47 potential mismatches that would have required manual correction.
Lessons Learned
Documentation Smell Tests
I learned to watch for these patterns that predict integration friction:
- Two sources of truth: Multiple parameters claiming to enforce the same guarantee
- Mode switches disguised as booleans: Parameters like `returnbestmatches` that change algorithm behavior
- Hidden coupling: When one endpoint calls another internally
- Environment dependencies: Feature flags and settings that aren't documented
What I'd Tell API Owners
If I were building the API:
- Add a truth table for parameter combinations
- Return machine-readable explanations: `match_strategy_applied: strict|expanded`
- If a parameter can't be honored, fail loudly rather than silently ignoring it
- Include `filters_applied` and `score_model_version` in responses
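For illustration, a response envelope carrying those fields might look like this; only `match_strategy_applied`, `filters_applied`, and `score_model_version` come from the recommendations above, and the remaining field names are assumptions.

```go
// Sketch of a self-describing response envelope; field names beyond the
// three recommended above are illustrative.
type ScoredPatient struct {
	PatientID string `json:"patientid"`
	Score     int    `json:"score"`
}

type MatchResponse struct {
	Patients             []ScoredPatient `json:"patients"`
	MatchStrategyApplied string          `json:"match_strategy_applied"` // e.g. "strict" or "expanded"
	FiltersApplied       []string        `json:"filters_applied"`        // e.g. ["minscore"]
	ScoreModelVersion    string          `json:"score_model_version"`
}
```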
What I Tell Consumers
- Treat API scores as advisory until proven otherwise
- Always implement client-side validation for critical workflows
- Build contract tests that catch behavioral changes
- Instrument everything - the data will tell you what's really happening
Conclusion
The best healthcare integrations aren't the ones with the cleverest code. They're the ones with:
- Clear contracts (or defensive layers when contracts are unclear)
- Measurable reliability
- Operational feedback loops
I've built those loops from both sides of the API boundary, and the pattern is consistent: explicit is better than implicit, and defensive is better than trusting.
Interested in similar solutions?
Let's discuss how I can help with your project.