Insights / February 14, 2026 / Healthcare

What changes when your analytics can't see PHI

Server-side, HIPAA-aware analytics isn't just a checkbox swap from your existing pixel. It changes the shape of the data you have to work with — and that changes what you can do with it.

A healthcare marketing team comes to us with a familiar problem. They've been running Meta and Google ads with the same client-side pixels every other consumer brand uses. Then someone — usually legal, sometimes a compliance vendor — points out that those pixels are sending Protected Health Information to ad platforms, and that's not allowed.

The fix is conceptually simple: route the events through a server-side pipeline that strips PHI before forwarding. There are vendors who do this. We've worked on one. Implementing it is straightforward.

What's not straightforward — and what surprises every team the first time — is how much of the analytics workflow assumes the data was complete the whole time. Removing PHI doesn't just sanitize the events; it changes what you can know about your users. And that changes what you can do with the analytics on the other side.

What gets stripped

PHI is more than names and email addresses. Under HIPAA, the eighteen identifiers include things you wouldn't think of as obviously identifying: IP addresses, device identifiers, dates more precise than the year, and any combination of fields that could re-identify someone in a small enough population.

In a typical ad-tracking pipeline, that means stripping:

  • Email addresses and phone numbers (hashed or not)
  • IP addresses (or anonymized forms of them)
  • User agents (truncated)
  • Precise timestamps (rounded to the hour or day)
  • Page URLs that contain identifiers (/appointment/12345/details)
  • Form field values

What's left, in many cases, is not enough to support the optimization features the ad platforms offer. Lookalike audiences need user identifiers. Conversion optimization needs an unbroken chain back to a specific click. Frequency capping needs a stable user ID.
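As a rough sketch of that stripping step, assuming events arrive as flat dicts (the field names and rules here are illustrative, not any particular vendor's schema):

```python
import re
from datetime import datetime

# Fields dropped outright; names are illustrative, not a vendor schema.
DROP_FIELDS = {"email", "phone", "user_agent_full", "form_values"}

def strip_phi(event: dict) -> dict:
    """Return a sanitized copy of an analytics event (illustrative sketch)."""
    clean = {k: v for k, v in event.items() if k not in DROP_FIELDS}

    # Anonymize IPv4 by zeroing the last octet.
    if "ip" in clean:
        clean["ip"] = re.sub(r"\.\d+$", ".0", clean["ip"])

    # Round timestamps down to the hour.
    if "timestamp" in clean:
        ts = datetime.fromisoformat(clean["timestamp"])
        clean["timestamp"] = ts.replace(minute=0, second=0, microsecond=0).isoformat()

    # Scrub numeric path segments that may be record identifiers.
    if "url" in clean:
        clean["url"] = re.sub(r"/\d+", "/:id", clean["url"])

    return clean
```

A real pipeline needs more than this (IPv6, identifiers embedded in query strings, free-text fields), but the shape is the same: drop, coarsen, or rewrite every field that could identify someone.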

Where the gap shows up

There are a few specific places teams hit this:

Conversion attribution. Without precise timing and user identity, you can't tell with confidence which ad click led to which conversion. You can model it — fractional attribution across a time window — but the model is no longer mechanical. It's a judgment.
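A minimal version of that fractional model, with the window length and equal-weight rule chosen purely for illustration (time-decay or position-based weighting are equally defensible judgments):

```python
from datetime import date, timedelta

def fractional_attribution(clicks, conversion_day, window_days=7):
    """Split one conversion equally across ad clicks inside the window.

    clicks: list of (campaign, day) tuples, where day is a datetime.date.
    Returns {campaign: fractional_credit}. Equal weighting is an
    illustrative choice, not a recommendation.
    """
    cutoff = conversion_day - timedelta(days=window_days)
    eligible = [c for c, day in clicks if cutoff <= day <= conversion_day]
    if not eligible:
        return {}
    credit = 1.0 / len(eligible)
    out = {}
    for campaign in eligible:
        out[campaign] = out.get(campaign, 0.0) + credit
    return out
```

The point is that every line above encodes a choice (window length, weighting rule, tie-breaking) that used to be made for you by deterministic click IDs.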

Audience-building. Lookalike audiences depend on telling the platform "here are 10,000 of my best customers, find more like them." If you can't send identifiers, you can't seed the model. You can sometimes do this server-to-server with hashed identifiers, but it requires more careful plumbing than the client-side flow.
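On the sending side, the hashed-identifier upload usually reduces to normalize-then-hash. The major platforms document lowercase, trimmed input before SHA-256, but treat the exact normalization rules as platform-specific and check the current spec:

```python
import hashlib

def hash_identifier(email: str) -> str:
    """Normalize then SHA-256 hash an email for a server-to-server
    audience upload. Lowercase-and-trim matches what the major ad
    platforms document, but verify each platform's current rules."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

The hard part isn't the hash; it's making sure consent state gates which identifiers ever reach this function.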

Frequency and reach. If two visits from the same user look like two different users (because the IP is anonymized and the cookies are short-lived), your frequency caps and reach numbers are going to be wrong in a direction you can't easily measure.

Funnel analysis. Your dashboards that show "users who did A then did B then dropped off at C" rely on stitching events to a user. If the stitching is fuzzier, the funnel is fuzzier.

What you can do anyway

The good news is that healthcare marketing teams don't need to fly blind. The constraints rule out certain workflows, but they don't rule out useful analytics. The pattern we've seen work:

Aggregate the precision you've lost. If you can't track an individual user through a funnel, track cohorts. Cohort-level analytics — "what percentage of users from campaign X converted within 30 days" — survives PHI stripping because it's already aggregated.
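Cohort math needs no user-level join, only counts per campaign, which is why it survives the stripping. A minimal sketch:

```python
def cohort_rates(cohorts):
    """Compute conversion rates from aggregate counts per campaign.

    cohorts: {campaign: (entered, converted)} counts only, no user IDs,
    so the computation is unaffected by PHI stripping.
    """
    return {
        campaign: (converted / entered if entered else 0.0)
        for campaign, (entered, converted) in cohorts.items()
    }
```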

Lean on first-party data you already collect compliantly. Your EHR or CRM has consented data. Server-to-server uploads using hashed identifiers (where allowed) can rebuild some of the targeting capability the pixel used to provide, with consent in place.

Treat the platforms' "modeled" features as load-bearing. Meta and Google have invested in modeled conversion features for exactly this reason — most regulated industries are running into the same problem. The modeled numbers aren't as crisp as direct observation, but they're not noise either.

Be explicit about confidence intervals. If your conversion data is a model, present it as a model. Ranges, not point estimates. Reviewers and stakeholders need to know what's measured directly and what's inferred.
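For the "ranges, not point estimates" part, a standard Wilson score interval on a conversion proportion is usually enough. A sketch, not tied to any vendor's reporting:

```python
import math

def wilson_interval(conversions: int, trials: int, z: float = 1.96):
    """Wilson score interval (95% for z=1.96) on a conversion proportion.
    Reporting a range instead of a point estimate is the honest
    presentation when the underlying counts are partly modeled."""
    if trials == 0:
        return (0.0, 0.0)
    p = conversions / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    margin = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return (center - margin, center + margin)
```

So "3% conversion on 10,000 users" becomes "roughly 2.7% to 3.4%", which is the claim you can actually defend to a reviewer.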

The architectural shift

What this means for engineering is that the analytics layer in a healthcare-marketing system isn't just a passthrough. It's a transformation layer with rules. It strips, anonymizes, aggregates, sometimes routes to different downstream destinations depending on consent state. It logs what it stripped, for audit. It has to be testable on synthetic PHI without ever touching real PHI in non-production environments.
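In code, that layer tends to reduce to a small pipeline: transform, log what was removed, route by consent. A structural sketch, where the consent states and destination names are invented for illustration:

```python
def process_event(event: dict, consent: str, audit_log: list):
    """Route an event through the transformation layer.

    consent: "full", "analytics_only", or "none" (illustrative states).
    Strips PHI-bearing fields, records what was stripped for audit, and
    picks a downstream destination based on consent state.
    """
    PHI_FIELDS = {"email", "phone", "ip"}
    stripped = sorted(PHI_FIELDS & event.keys())
    clean = {k: v for k, v in event.items() if k not in PHI_FIELDS}
    audit_log.append({"event": event.get("name"), "stripped": stripped})

    if consent == "none":
        return None  # drop the event entirely
    destination = "ad_platforms" if consent == "full" else "internal_warehouse"
    return {"destination": destination, "payload": clean}
```

The audit trail matters as much as the stripping: when a reviewer asks what left the building, the log is the answer.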

If you're building one of these from scratch, treat the analytics pipeline as a regulated product surface. Specify it explicitly. Test it adversarially. Audit what it sends.
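"Test it adversarially" means feeding the pipeline synthetic events built to smuggle identifiers through edge cases, such as an email hiding in a query string. A self-contained sketch with a stand-in sanitizer (a real suite would import the actual pipeline's sanitize step):

```python
import re

EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

# Minimal stand-in sanitizer so the sketch runs on its own; swap in the
# real pipeline's sanitize function in an actual test suite.
def sanitize(event: dict) -> dict:
    clean = {k: v for k, v in event.items() if k not in {"email", "phone"}}
    return {k: EMAIL.sub("[redacted]", str(v)) for k, v in clean.items()}

def assert_no_phi(event: dict):
    """Adversarial property: nothing in the sanitized output should still
    look like an email address, wherever it was hiding."""
    for value in sanitize(event).values():
        assert not EMAIL.search(str(value)), f"leaked: {value}"

# Synthetic PHI only; never real records in non-production environments.
assert_no_phi({"email": "jane@example.com", "path": "/home"})
assert_no_phi({"path": "/search?q=jane@example.com"})  # PHI in a query string
```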

If you're retrofitting one into an existing healthcare product — which is the more common case — accept that the project is bigger than "switch to a server-side pixel vendor." You're not changing one component; you're changing what the rest of the system is allowed to assume about its data.

The teams that handle this best treat the constraint as a forcing function. The ones that struggle treat it as an integration project.
