Exploring a New Computing Model to Enhance Health Care Analysis

Naresh Sundar Rajan; Jeya Balaji Balasubramanian; Jaime Bland and Jonas De Almeida | December 10, 2020

This post was written by guest contributors.

To battle the COVID-19 pandemic, health professionals need immediate access to relevant health data in order to inform clinical decision making, conduct timely research, understand trends, and guide response efforts.

The pandemic, though, highlighted a known shortcoming of our nation’s health data infrastructure. Decision makers, like health care providers, have had to rely on information reporting services outside the health care system to obtain real-time epidemiological data, and even those data showed only a limited picture.

Before the pandemic started, researchers at the National Institutes of Health (NIH) began to study how our nation’s health information exchanges (HIEs) could solve this problem. HIEs already perform the challenging task of aggregating, harmonizing, and reporting comprehensive data. The challenge for NIH was to find an approach in which these data could be shared while adhering to existing privacy laws and protections.

If unlocked, the potential insight derived from these data could provide researchers and data analysts with new discoveries that could improve the care, treatment, and prevention of everything from cancer and heart disease to COVID-19.

Our efforts also aligned with a larger goal of the Office of the National Coordinator for Health IT: discovering improved ways to facilitate the electronic exchange of health information.

Pilot Project Serves as Proof of Concept

In cooperation with the Nebraska Health Information Initiative (NEHII) and the National Cancer Institute (NCI), NIH launched a pilot program earlier this year built on federated learning, a machine learning technique that can leverage existing HIE data while protecting privacy. In federated learning, an analytical model is shared among different groups, each of which analyzes its own data and shares the results without sharing the raw data itself.

The pilot aimed to expand clinical insights regarding COVID-19 using the existing ways health data are stored and protected in the US. In our pilot, each organization connected to a central cloud ecosystem known as the Multi-state Federated Architecture for Shared Analytics, better known as MuFASA.

MuFASA serves as the central hub for the federated learning model. It sends analytic models to the HIEs, which then analyze their own data and send the results back to MuFASA. Once MuFASA has collected these results, it can combine them into a deeper analysis.
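The round trip between a hub and its sites can be illustrated with a minimal sketch. This is not MuFASA's actual implementation, which this post does not describe. It assumes a simple logistic regression model and federated averaging (a common aggregation scheme, chosen here only for illustration), and every function name and the synthetic data are hypothetical.

```python
import math
import random

def local_update(weights, records, lr=0.1, epochs=5):
    """Runs at a site (for example, an HIE): refine the shared weights
    using only that site's own records. Raw records never leave here."""
    w = list(weights)
    for _ in range(epochs):
        for features, label in records:
            z = sum(wi * xi for wi, xi in zip(w, features))
            pred = 1.0 / (1.0 + math.exp(-z))
            for i, xi in enumerate(features):
                w[i] -= lr * (pred - label) * xi
    # Only the updated weights and a record count are returned.
    return w, len(records)

def aggregate(updates):
    """Runs at the hub: average the site weights, weighted by how many
    records each site used (federated averaging, an assumed scheme)."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]

def federated_round(global_weights, sites):
    """One back-and-forth: send the model out, collect results, combine."""
    updates = [local_update(global_weights, records) for records in sites]
    return aggregate(updates)

def make_site(n, seed):
    """Synthetic stand-in for one site's data: a bias term plus one
    feature, with label 1 when the feature is positive."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        x = rng.uniform(-1, 1)
        records.append(([1.0, x], 1 if x > 0 else 0))
    return records

# Three hypothetical sites of different sizes.
sites = [make_site(50, seed=1), make_site(80, seed=2), make_site(30, seed=3)]
weights = [0.0, 0.0]
for _ in range(10):
    weights = federated_round(weights, sites)
```

In this sketch the hub only ever sees model weights and record counts, which mirrors the idea that each site keeps control of its own data. A real deployment would need additional safeguards that are outside the scope of this example.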

By using this back-and-forth process, HIEs do not give unfettered control of their data to an outside party. This ensures that patient privacy remains intact, while researchers can still leverage data to study public health. NCI and NEHII tested the federated learning model to create three predictive models centered on COVID-19:

  • Vulnerability Index for COVID, which helps manage the outbreak by assigning a score to any individual in the general population indicating their susceptibility to developing a COVID infection.
  • Diagnostic Index for COVID, which helps with rapid testing and with finding new risk factors for COVID.
  • Prognostic Index for COVID, which helps with the clinical management of the disease by predicting disease severity for an individual.

Without federated learning, these models would not give health care leaders data deep enough to draw strong conclusions that go beyond known risk factors and dig into social determinants of health like income, race, and education.

Looking to the Future

These discoveries serve as an exciting starting point. Further testing, research, and analysis will determine if the federated learning structure and MuFASA can provide researchers and decision-makers with more detailed insights into health care challenges.

In theory, researchers will be able to use this kind of federated learning approach to learn more about people at risk for cancer, the efficacy of certain drugs, and a whole host of other concerns. The real power will come from answering questions we did not originally think to ask.

Federated learning can power analytical models while preserving data governance and protecting patient privacy within HIEs, a critical element of interoperability. We have already seen a tremendous impact and believe we can truly leverage advanced technologies like federated learning for the betterment of health care.

NIH continues to do fascinating research on this topic. For those interested in learning more, we recommend this paper on facilitating multi-institutional collaborations without sharing patient data, and this report on how federated learning of electronic health records improves mortality prediction in patients hospitalized with COVID-19.