
Three failures, one missing layer

Three failures are symptoms of one missing layer.

Ziyuan (Chester) Guan
11 min read

In Hello from here I said I'd revisit what I got wrong in the Medium pieces last summer. This is that revisit.

Last summer I wrote that healthcare AI keeps stalling for three reasons: fragmented data, missing audit, misaligned incentives. Ten months later, I still think those three failures are real. I no longer think they're three failures. They're symptoms of one.

What I said last summer, restated

The Medium pieces were diagnostic. They named the failures and proposed that each could be addressed separately. Fragmented data — fix it with better integration. Missing audit — fix it with better logging. Misaligned incentives — fix it with better economics. Three problems, three fixes, three workstreams.

I still believe each of those failures is real. I have spent ten months trying to address them, primarily through the protocol I called HAVEN and the reference implementation that runs against MIMIC-IV today. What I have learned in those ten months is that I named them wrong. Not because the symptoms are wrong, but because the cause is one.

Each of those three failures, when you look at what would actually fix it, requires the same thing: a layer of infrastructure that currently does not exist anywhere in healthcare. Not in a specific app. Not in any single regulation. Not in any platform. A layer that lives beneath the application layer, between the data and the things that use it, and is jointly governed rather than custodially owned.

Healthcare hasn't built that layer. The reason the three failures are so persistent is that all of the actors who could build it are working at the wrong layer.

What the three failures share

Consider what each failure actually requires.

Fragmented data is a coordination problem. Each EHR holds part of a patient's record. Each direct-to-consumer health app holds another part. Each research dataset is a fixed snapshot of one institution. Fixing this requires not better storage but a way for the parts to refer to each other — to be the same record, verifiably, across systems that don't trust each other.
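That "same record, verifiably" property is what content addressing provides: a record's identity is derived from its canonical bytes, so two systems that don't trust each other can confirm they hold the same record without asking a shared custodian. A minimal sketch in Python — the record shape and serialization rules here are illustrative, not the protocol's actual ones:

```python
import hashlib
import json

def content_address(record: dict) -> str:
    """Derive an identifier from the record's canonical serialization.

    Two systems holding the same record derive the same address,
    without consulting each other or any shared custodian.
    """
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# An EHR and a consumer app each hold what they believe is the same reading.
# If the addresses match, it is verifiably the same record.
ehr_copy = {"code": "8867-4", "unit": "beats/min", "value": 72}
app_copy = {"value": 72, "code": "8867-4", "unit": "beats/min"}
assert content_address(ehr_copy) == content_address(app_copy)
```

The point of the sketch is that identity stops being custodial: the address is a property of the record itself, not of whichever system stores it.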

Missing audit is a coordination problem. An audit log that lives inside the system being audited is auditable by the system's custodian only. To be trustworthy, the audit has to be visible from outside the custodian's reach. That means coordinating audit across actors who otherwise have no reason to cooperate.
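One standard construction for audit that stays checkable from outside the custodian is a hash chain: each entry commits to the one before it, and only the latest hash needs to be published beyond the custodian's reach. A sketch, with hypothetical event fields:

```python
import hashlib
import json

GENESIS = "0" * 64

def append(log: list, event: dict) -> str:
    """Append an event; each entry commits to the hash of the one before it."""
    prev = log[-1]["hash"] if log else GENESIS
    body = json.dumps(event, sort_keys=True)
    digest = hashlib.sha256((prev + body).encode("utf-8")).hexdigest()
    log.append({"event": event, "prev": prev, "hash": digest})
    return digest  # the head: publish it outside the custodian's reach

def verify(log: list, head: str) -> bool:
    """An outside party replays the whole chain against the published head."""
    prev = GENESIS
    for entry in log:
        body = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((prev + body).encode("utf-8")).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return prev == head
```

If the custodian rewrites or drops any entry, every later hash changes and verification against the published head fails. The log stays tamper-evident even though the custodian still stores it — which is exactly the coordination the application-local audit log can't provide.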

Misaligned incentives is a coordination problem. Patients contribute data; researchers use it; outcomes flow to neither directly. Realigning them requires value-tracking across that chain. No actor in the chain has the standing to track it on behalf of everyone. Value attribution at scale is shared accounting across systems that do not share a custodian. That is coordination, just at a different layer than data shape or audit trails.
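At its simplest, that shared accounting is pro-rata attribution over quality weights. A toy sketch — the contributor names and weights are invented, and real attribution would be far more involved:

```python
def attribute(value: float, weights: dict) -> dict:
    """Split value across contributors in proportion to their quality weights."""
    total = sum(weights.values())
    return {who: value * w / total for who, w in weights.items()}

# Hypothetical: patient_a's records carry twice the quality weight of the others.
payout = attribute(100.0, {"patient_a": 2.0, "patient_b": 1.0, "patient_c": 1.0})
# payout == {"patient_a": 50.0, "patient_b": 25.0, "patient_c": 25.0}
```

Even this toy makes the coordination requirement visible: the weights come from one system, the value from another, the contributors from a third. The arithmetic is trivial; agreeing on whose ledger runs it is not.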

Three failures, one shape: each is a problem of coordination across actors who don't share a custodian. And there are four layers in current healthcare infrastructure where such coordination has been attempted: the application layer, the regulatory layer, the platform layer, and the standards layer. Each has tried to host the fix. Each has produced a layer-specific limitation worth examining in detail.

Why each existing layer can't host the fix

Application layer fails at coordination

Most patient-data infrastructure today is application-layer. MyChart manages access to one health system's records. Pillpack manages medications. Apple Health stores a phone's sensor data. Each has consent UI. Each has logs. Each has some value model — even if "free" is the model. None of them coordinate with the others. Consent given in one is not visible to another. Audit logs in one are not auditable from another. Value accrued in one cannot be paid across them.

You can build the best possible consent flow inside one application and still have failed at the actual problem, because the patient does not have one application. The patient has dozens. The data exists in dozens of systems. The application layer cannot, by its structural definition, coordinate across applications it does not contain.

This is not a problem that better applications will solve. It is a problem that requires a layer applications can rest on, in the way that an HTTP server doesn't have to reimplement TCP.

Regulatory layer fails at latency

HIPAA[1] defined privacy boundaries in 1996, before patient data was an AI training resource. It maps poorly onto questions like "who is allowed to train a model on this record," because the act of training does not look like the disclosure events HIPAA was designed to govern.

GDPR[2] added the right to erasure in 2018. The right to erasure is a coherent demand for records held in databases. It is much less coherent for records encoded in the weights of a deployed model. The right exists in statute; the mechanism for enforcing it for training data simply doesn't.

The 21st Century Cures Act[3] and the subsequent ONC interoperability rules (2020–2022) mandated that patients receive access to their records via standardized APIs. Access is a precondition for sovereignty, not a substitute for it. Receiving the data is not the same as having rights about what is done with the data once it's received.

What these regulations have in common is that they responded to whatever problem was visible at the time of drafting. By the time the regulation is in force, the technology has produced new problems. Regulation has structurally lower bandwidth than technology, which means whatever is built before regulation catches up will continue to operate, will continue to extract value, and will not be unwound by the eventual regulatory response. The fix has to exist before regulation, or it cannot exist at all.

Platform layer fails at consolidation

The platform attempt is the most recent. Apple Health Records launched in 2018[4] with twelve partner health systems and now integrates with hundreds of US health systems. Google has had four separate goes at healthcare data (Google Health 2008–2011, Google Fit, DeepMind Streams, and Cloud Healthcare API)[5], each closed or refocused. EHR vendors operate patient-facing portals that are platform-like at health-system scope.

These platforms work, in the narrow sense that data does flow through them. They do not solve the sovereignty problem. They consolidate it. When Apple is the custodian of a unified patient-data layer, the patient is no longer the sovereign — Apple is, with the patient as user. When the EHR vendor is the custodian, the health system is. Sovereignty becomes mediated, which is the opposite of sovereignty.

Platforms aren't bad. They're just not where the fix lives.

Standards layer fails at scope

Healthcare has serious protocol-layer attempts. HL7 v2 standardized clinical message exchange in 1989[6]. HL7 FHIR has standardized RESTful access to clinical data since 2014[7]. The OMOP Common Data Model[8] codified the shape of observational research data across hundreds of institutions. SMART on FHIR[9] standardized authorization for clinical apps.

These are real protocol-layer wins. They are not the wins the missing fix needs.

Each of these standards governs the wire. FHIR specifies how to retrieve a record; it does not specify whether the retrieving party may train a model on it. OMOP specifies how a diagnosis is encoded; it does not specify who may access the cohort or what they owe the patients in it. SMART on FHIR specifies how an app authenticates; it does not specify what the patient should receive when the app's output is used in care.

The standards layer scopes to data shape. The missing fix has to scope to data use. The two are complementary: a governance protocol would operate over FHIR-shaped data and OMOP-modeled cohorts, supplying the use-governance those standards don't provide.

What "protocol layer" means in this context

A protocol is a set of rules that participants follow voluntarily, without any of them owning the rules or storing the data the rules govern. SMTP made email possible across institutions in 1982[10] — not because Bell Labs hosted email, but because everyone agreed on how to address it. HTTP made the web possible across servers in 1991[11] — not because Tim Berners-Lee hosted the web. DNS made naming possible without a single registrar[12].

In each case, the protocol layer succeeded by enabling cross-system behavior that no single custodian could have provided. Each protocol was published, ratified by use, and operated without any party having permission to revoke it. Email has survived four decades of vendor consolidation because the protocol is older than the vendors.

Healthcare data does not have such a layer. It has applications that consolidate. It has regulations that constrain disclosure. It has platforms that mediate. It has no shared rules for what a record is, what consent means, what audit consists of, or how value gets attributed. Each of those questions is currently answered application by application, regulation by regulation, platform by platform.

The bet is that a protocol layer for patient-sovereign healthcare data could behave the way SMTP and HTTP did. Not because it solves any specific application problem better than that application would, but because it enables a class of cooperation that cannot happen without it. There is a second part to the bet: this layer is buildable now, before regulation forces a worse version of it, and before any single platform consolidates the territory.

What this means for the next four posts

If the missing layer is protocol, then specifying what such a protocol must provide is the next step. Not "consent and audit" as generic abstractions. Specific primitives, each with a job.

The next post argues that four primitives carry the load: content-addressable Health Assets, programmable Consent, hash-chained Provenance, quality-weighted Contribution. Each maps to one of the failures named here. The claim is not that these four are provably the smallest possible set. Design spaces resist that kind of proof. The claim is that each one earns its place against a specific failure mode, and that the four cluster naturally rather than arbitrarily.

That's a softer commitment than "minimum sufficient." It's the one I can defend. A reader who sees a natural fifth primitive should write back. The series is better for the pressure.

What I underestimated

When I wrote the Medium pieces last summer, I thought the field needed better tools. I now think it needs a layer the field hasn't built. That's a harder problem than the one I named.

Building better tools on top of a missing layer is a treadmill.

The next post specifies. The two after that examine the gaps that surfaced during the specification — gaps that became separate work because they live in different verification regimes. The fifth post commits to what would prove the whole argument wrong.


Footnotes

  1. HIPAA, Public Law 104-191 (1996); Privacy Rule effective 2003. The statute governs disclosure of protected health information by covered entities. It is structurally about who may share what with whom, not about what may be inferred from what has been shared.

  2. GDPR, Regulation (EU) 2016/679, effective May 2018. Art. 17 (right to erasure) and Art. 20 (right to data portability). Art. 17 is binding on data controllers; the mechanism for applying it to data already encoded in trained model weights remains an open legal question.

  3. 21st Century Cures Act, Public Law 114-255 (2016). Subsequent ONC interoperability rules: 85 FR 25642 (May 2020) and 89 FR 1437 (January 2024). FHIR R4 patient-access APIs mandated for certified health IT.

  4. Apple Health Records launched March 28, 2018. Initial 12 partner health systems; FHIR-based; now integrated with hundreds of US health systems.

  5. Google Health (consumer): 2008–2011. Google Fit: launched 2014. Google DeepMind Streams: piloted at Royal Free London 2016, criticized by UK ICO 2017, folded into Google Health 2018. Google Cloud Healthcare API: launched 2018, operational. None operate at protocol layer; all are platform plays.

  6. HL7 v2 (originally HL7 v2.1, 1989). Maintained by HL7 International; versions 2.3–2.7 in widespread clinical deployment.

  7. HL7 FHIR (Fast Healthcare Interoperability Resources). DSTU 1 published 2014; FHIR R4 became normative in 2019.

  8. OMOP Common Data Model, maintained by the OHDSI consortium. v5.x widely deployed across hundreds of research sites; v5.4 is the currently recommended release.

  9. Mandel, J.C., Kreda, D.A., Mandl, K.D., Kohane, I.S., and Ramoni, R.B. "SMART on FHIR: A standards-based, interoperable apps platform for electronic health records." Journal of the American Medical Informatics Association 23(5) (2016): 899-908. Initial profile published 2014; SMART App Launch Framework v2.0 in current use.

  10. Postel, J. (1982). RFC 821: Simple Mail Transfer Protocol.

  11. Berners-Lee, T. (1991). HTTP/0.9 first proposal; HTTP/1.0 standardized 1996, RFC 1945.

  12. Mockapetris, P. (1983). RFC 882, RFC 883: DNS specifications.
