The Nightmare of Extracting and Interpreting Blood Pressure Data


Medical record data is full of hidden gotchas that a data professional can stumble into. My favorite example is Blood Pressure.

What Is Blood Pressure?

Blood Pressure is a measurement of force exerted by the heart on blood in order to move it through the body. It is a key indicator of heart health, made up of two measurements: Systolic Pressure (the pressure inside your blood vessels when your heart beats) and Diastolic Pressure (the pressure inside your blood vessels when your heart rests between beats). In its most informative form, Blood Pressure is displayed to clinicians like this:

Systolic Blood Pressure / Diastolic Blood Pressure

To the untrained eye, this looks like a division math problem from elementary school. This string is an analyst’s nightmare!

  • Do you divide the top number by the bottom number and use the quotient as the numeric representation of blood pressure?
  • Can you consider the Systolic or Diastolic Pressure numbers independently?
  • If you clean this data by removing punctuation, you are left with a single four- to six-digit integer. Can you use that?

Needless to say, it is confusing if you do not know much about medical data.
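To make the answer to those questions concrete, here is a minimal Python sketch that treats the display string as two independent integers rather than a quotient or one long number. The names BloodPressure and parse_blood_pressure are hypothetical and used only for illustration; the sketch assumes the string always has the systolic/diastolic shape shown above.

```python
from typing import NamedTuple


class BloodPressure(NamedTuple):
    systolic: int   # mmHg, pressure while the heart beats
    diastolic: int  # mmHg, pressure while the heart rests between beats


def parse_blood_pressure(display: str) -> BloodPressure:
    """Split a 'systolic/diastolic' display string into two integers."""
    systolic_text, separator, diastolic_text = display.partition("/")
    if not separator:
        raise ValueError(f"expected 'systolic/diastolic', got {display!r}")
    # int() raises ValueError if either side is not a whole number
    return BloodPressure(int(systolic_text.strip()), int(diastolic_text.strip()))


reading = parse_blood_pressure("120/80")
print(reading.systolic, reading.diastolic)  # 120 80
```

Dividing 120 by 80 or collapsing the string to 12080 would both throw away the fact that each number is judged against its own range.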

In What Way Is This Data Made Useful?

To make this data more actionable, the two numbers are compared against reference ranges, sometimes stratified by age and gender. An elevated number for either reading can flag the whole blood pressure reading for a clinician, but a single normal reading does not indicate healthy blood pressure overall. The AMA guidelines are one example:

[Figure: AMA blood pressure reference ranges]

This diagram sheds light on the utility of the two values independently. In SQL terms, our logic for categorizing and color-flagging blood pressure would look like this:

[Figure: SQL logic for categorizing and color-flagging blood pressure]
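As a rough illustration of that kind of categorization logic, here is a minimal Python sketch. It is not the original SQL. The thresholds are an assumption: they follow the widely published 2017 ACC/AHA adult categories, which may differ from the chart referenced above. The function name and the flag colors are hypothetical.

```python
from typing import Tuple


def categorize_blood_pressure(systolic: int, diastolic: int) -> Tuple[str, str]:
    """Return (category, flag_color) for one reading in mmHg.

    Thresholds are assumed from the 2017 ACC/AHA adult categories;
    the colors are illustrative only.
    """
    # Check the most severe category first: either number alone can flag it.
    if systolic > 180 or diastolic > 120:
        return ("Hypertensive Crisis", "red")
    if systolic >= 140 or diastolic >= 90:
        return ("Hypertension Stage 2", "orange")
    if systolic >= 130 or diastolic >= 80:
        return ("Hypertension Stage 1", "yellow")
    # Diastolic is known to be below 80 here, so only systolic decides.
    if systolic >= 120:
        return ("Elevated", "yellow")
    return ("Normal", "green")


print(categorize_blood_pressure(118, 85))  # ('Hypertension Stage 1', 'yellow')
```

Note that 118/85 is flagged even though the systolic number alone is normal, which is why the two values have to be evaluated independently and then combined.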

These are some of the analytic complexities of making blood pressure data actionable and useful, but let’s take it a step further and consider how we might see this data inside the underlying EMR data source.

How Do We See This Data Stored?

While a Blood Pressure reading’s useful form is a string containing both systolic and diastolic readings, the LOINC standard (LOINC Code 55284-4) notes that it is inadvisable to report two observations in one record. Systems instead capture the data as two different integer readings, Systolic Blood Pressure (LOINC Code 8480-6) and Diastolic Blood Pressure (LOINC Code 8462-4). Now we know that the data tends to be stored as numbers, which should make an analyst jump for joy. This isn’t the whole story, though.

Here are two common table structures modeled from EMR databases, both capturing vital readings:

System 1

[Figure: example vitals table from System 1]

System 2

[Figure: example vitals table from System 2]

Does anything pop out at you? These examples highlight a common EMR design trend for vitals data: it is not given its own primary key. Instead, it is logged in association with its encounter. Vital data is treated as a subset of encounter information rather than as a health data entity in its own right. This introduces the concept of a vital set, which is the group of vital readings your doctor tends to take every time you visit (Height, Weight, Blood Pressure, O2 Saturation, Temperature, etc.). The vital set captures a specific set of vital readings and allows for one and only one reading per type.

Consider the Tilt Table Test, whose data is recorded in the table examples above. During this test, a person lies on a table that rotates from horizontal to vertical. Their blood pressure is recorded in the supine position, then immediately monitored for the next minute once the table is rotated to the vertical position.

How would a system capture this information? System 1 creates a second vital set for the additional blood pressure reading. This approach keeps the systolic and diastolic pair closely associated with each other, but their association with other vital data captured during that encounter now exists at the encounter level rather than the vital set level. System 2 uses a long format and simply tacks on a new row. This model accommodates additional readings more fluidly, but the systolic and diastolic pair are not linked as closely. EncounterID, VitalTime, VitalPosition, and any other available columns would need to be used to link systolic and diastolic readings with confidence.
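Pairing readings in a long format like System 2 could be sketched as follows. This is a minimal Python illustration under stated assumptions, not a description of either system: only EncounterID, VitalTime, and VitalPosition come from the text above, while the VitalType and VitalValue columns, the sample rows, and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical long-format rows. VitalType and VitalValue are assumed column names.
rows = [
    {"EncounterID": 1, "VitalTime": "09:00", "VitalPosition": "Supine", "VitalType": "Systolic", "VitalValue": 118},
    {"EncounterID": 1, "VitalTime": "09:00", "VitalPosition": "Supine", "VitalType": "Diastolic", "VitalValue": 76},
    {"EncounterID": 1, "VitalTime": "09:00", "VitalPosition": "Supine", "VitalType": "Weight", "VitalValue": 70},
    {"EncounterID": 1, "VitalTime": "09:01", "VitalPosition": "Standing", "VitalType": "Systolic", "VitalValue": 104},
    {"EncounterID": 1, "VitalTime": "09:01", "VitalPosition": "Standing", "VitalType": "Diastolic", "VitalValue": 68},
    {"EncounterID": 1, "VitalTime": "09:02", "VitalPosition": "Standing", "VitalType": "Systolic", "VitalValue": 110},
]


def pair_blood_pressure(rows):
    """Group systolic and diastolic rows sharing encounter, time, and position."""
    grouped = defaultdict(dict)
    for row in rows:
        if row["VitalType"] not in ("Systolic", "Diastolic"):
            continue  # ignore other vitals such as weight
        key = (row["EncounterID"], row["VitalTime"], row["VitalPosition"])
        if row["VitalType"] in grouped[key]:
            # Two readings of the same type under one key cannot be paired safely.
            raise ValueError(f"ambiguous {row['VitalType']} reading for {key}")
        grouped[key][row["VitalType"]] = row["VitalValue"]

    paired, unpaired = [], []
    for key, values in grouped.items():
        if "Systolic" in values and "Diastolic" in values:
            paired.append((*key, values["Systolic"], values["Diastolic"]))
        else:
            unpaired.append(key)  # keep orphans visible instead of guessing
    return paired, unpaired


paired, unpaired = pair_blood_pressure(rows)
print(paired)
print(unpaired)
```

Returning the unpaired keys separately is a deliberate choice here: a lone systolic value is better surfaced for review than silently matched to the nearest diastolic reading.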

How Do We Match Vital Readings Together?

Now we see that while blood pressure data is most informative as a vital data pair, source systems do not store this data in its systolic/diastolic form. We as engineers need to come up with ways to confidently pair systolic and diastolic pressure readings when extracting this information from source systems.

Beyond pairing Blood Pressure data, further complexity arises because reference ranges for vital data are dictated by other patient information such as age and gender, and sometimes even by other vital data such as height or weight. Consider BMI, which is calculated from both height and weight. We may not expect height to change during the course of a visit, but we can reasonably expect weight to. This means we have to account for the possibility that a single encounter yields as many calculated BMI values as it has weight measurements.
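The BMI case can be sketched in a few lines of Python. This is an illustration only: it assumes metric units (kilograms and meters) and uses the standard formula, weight divided by height squared; the function names and sample values are hypothetical.

```python
from typing import List


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def bmi_per_weight(height_m: float, weights_kg: List[float]) -> List[float]:
    """One BMI value for each weight measurement taken during an encounter."""
    return [round(bmi(weight, height_m), 1) for weight in weights_kg]


# One height and two weight measurements from the same (hypothetical) encounter
print(bmi_per_weight(1.75, [70.0, 70.5]))  # [22.9, 23.0]
```

A single height paired with two weights produces two BMI values, so any downstream reference-range check has to decide which one it is evaluating.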

Medical data is a reflection of human health, and it mirrors that complexity in form and utility. We at Hart have come up with methods for extracting vital data as its own health data entity rather than as a child of an encounter, and for pairing it with other pieces of information to make it actionable for a clinician when viewed on the Compass Platform. Our knowledge of potential health data complexities, which can only be gained from deep experience with medical information, helps us navigate every new source system we come across.

npm Packaging Part 1: Thinking About Publishing A Package?

This is the first installment in a series exploring publishing npm packages. The goal is to walk through publishing everything from a simple, dozen-or-so-line Node module to a relatively complicated React component library with TypeScript and Flow definition files and pretty import paths.

npm has been around since 2010, allowing JS engineers to publish JavaScript modules for distribution. As the default package installer for Node.js, it gained adoption as Node grew in popularity, and together with more recent JavaScript tooling it has helped JS explode into one of the most far-reaching programming languages.

Why do packages exist?

The simple answer is distribution.

A package has some distinct advantages over copy-paste solutions: ease of import (using a unique name and version), an explicit API, and, most importantly, code isolation. This allows packages to be used broadly as small components in larger applications, minimizing the need to re-solve problems (like date-time management) or providing abstracted, simplified interfaces to more complicated low-level APIs (like DOM manipulation, e.g., React).

Their ease of distribution is enabled by npm’s hosted registry and the accompanying command-line program npm, which can take in a unique name and install all the required pieces to get that code running in your app. I’ll also note that npm isn’t the only choice: yarn, which was released by Facebook a few years ago, and other package installers like npmd, npm-install, and ied offer some optimizations and alternatives.

As of this writing, there are nearly 900,000 packages on the npm registry, and they were downloaded more than 9,000,000,000 times in the last week. With those kinds of numbers, there is likely a solution to whatever problem you’re having, if not several. And that’s just fine: having more choice is good. This system is what has made the internet the world-connecting and problem-solving platform it is.

Do we really need more npm packages?

A fair question: like I said above, there are hundreds of thousands of packages solving any number of problems. What makes your solution unique? Honestly, it doesn’t need to be: having all that choice is good.

Don’t worry about whether you are solving anyone’s problem but your own. If you’ve solved a unique problem you encountered while building something, it’s probably worthwhile to share your solution. Some incredibly simple packages are so small but useful that they wind up being used by virtually everyone, and there’s a fun story about how a small package (an 11-line function) named left-pad became so ubiquitous that it broke the internet.

What makes a good npm package?

Simply put, usefulness.

npm exists to allow packages of code to be distributed and consumed easily. If a problem you have solved is solvable in isolation, it can be turned into a package and re-used by others encountering the same or similar problems.