Healthcare’s Data Deluge Requires Optimized IT System Performance

By James D’Arezzo

The healthcare system generates about a zettabyte (a trillion gigabytes) of data each year, with sources including electronic health records (EHRs), diagnostics, genetics, wearable devices and much more.

While this data can help improve our health, reduce healthcare costs and predict diseases and epidemics, the technology used to process and analyze it is a major factor in its value.

In many industries, including healthcare, IT systems aren't always given the attention they deserve. Often, no one notices how valuable the technology is until it stops working as expected.

According to a recent report from International Data Corporation, the volume of data processed in the overall healthcare sector is projected to increase at a compound annual growth rate of 36 percent through 2025, significantly faster than in other data-intensive industries such as manufacturing (30 percent projected CAGR), financial services (26 percent) and media and entertainment (25 percent).

Healthcare faces many challenges, but one that cannot be ignored is information technology. Without adequate technology to handle this growing tsunami of often-complex data, medical professionals and scientists can’t do their jobs. And without that, we all pay the price.

Electronic Health Records

Over the last 30 years, healthcare organizations have moved toward digital patient records, with 96 percent of U.S. hospitals and 78 percent of physicians' offices now using EHRs, according to the National Academy of Medicine. A recent report from market research firm Kalorama Information states that the EHR market topped $31.5 billion in 2018, up 6 percent from 2017.

Ten years ago, Congress passed the Health Information Technology for Economic and Clinical Health (HITECH) Act and invested $40 billion in health IT implementation.

The adoption of EHRs is supposed to be a solution, but instead it is straining an overburdened healthcare IT infrastructure. This is largely because of the lack of interoperability among the more than 700 EHR providers. Healthcare organizations, primarily hospitals and physicians’ offices, end up with duplicate EHR data that requires extensive (not to mention non-productive) search and retrieval, which degrades IT system performance.

Precision Medicine

Recent developments in data collection and analysis are paving the way to a new era in healthcare. Today, patient data is generated at a level orders of magnitude higher than that of even a decade ago. Through advanced predictive analytics, this information has the potential to save lives through the diagnosis, treatment and prevention of disease at a highly personalized level.

According to BIS Research, precision medicine, a $43.6 billion global market in 2016, is expected to reach $141.7 billion by 2026.

But all of this personalization means massive amounts of data.

The biomedical journal BMC Medicine, commenting on the January Precision Medicine World Conference, stated that “the results from this type of research will likely offer countless opportunities for future clinical decision-making. In order to implement appropriately, however, the challenges associated with large datasets need to be resolved.” A January report from the California Precision Medicine Advisory Committee similarly found that “precision medicine will require significant investments in data storage, infrastructure and security systems in the coming years to achieve its full potential.”

Prediction of Outbreaks

Healthcare data networks have been slow to predict, understand and contain the recent measles outbreak. Experts see healthcare predictive analytics lagging far behind fields like climatology and marketing. Predictive analytics and modeling require massive amounts of data from many sources, with thousands of variables. Trying to process this important data on an overburdened IT system is risky, potentially causing major system slowdowns and crashes.

In a Journal of Infectious Diseases special supplement, “Big Data for Infectious Disease Surveillance and Modeling,” researchers indicated that there are significant challenges to overcome in acquiring and analyzing important real-time data that could offer significant insight into current and future disease outbreaks.

More Data, More Problems

Processing and analyzing big data depend on the overall system's input/output (I/O) performance, also known as throughput. Data analytics requires a computer system to access multiple, widespread databases, pulling information together through millions of I/O operations. The system's analytic capability depends on the efficiency of those operations, which in turn depends on the efficiency of the computer's operating environment.
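
To make the idea of throughput concrete, the short Python sketch below times how quickly a system can read a single data file and reports the result in megabytes per second. It is only an illustration, not a benchmark of any particular healthcare system; the file name and chunk size are assumptions chosen for the example.

    # Minimal sketch (not the author's tooling): measure read throughput of a
    # data file by timing how many bytes per second the system can deliver.
    # The file path and chunk size are illustrative assumptions.
    import time

    CHUNK_SIZE = 1024 * 1024          # read in 1 MB chunks
    DATA_FILE = "patient_records.db"  # hypothetical local data file

    def read_throughput(path: str) -> float:
        """Return sustained read throughput in MB per second."""
        total_bytes = 0
        start = time.perf_counter()
        with open(path, "rb") as f:
            while True:
                chunk = f.read(CHUNK_SIZE)
                if not chunk:
                    break
                total_bytes += len(chunk)
        elapsed = time.perf_counter() - start
        return (total_bytes / (1024 * 1024)) / elapsed

    if __name__ == "__main__":
        print(f"Read throughput: {read_throughput(DATA_FILE):.1f} MB/s")

Running a measurement like this on a database volume before and after tuning gives a rough sense of how much I/O headroom an analytics workload actually has.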

In the Windows environment especially, I/O performance degrades over time. This degradation, which can cut the system's overall throughput by 50 percent or more, occurs in any storage environment. Windows loses optimum performance because of server inefficiencies in the handoff of data to storage. Because the storage layer has been logically separated from the compute layer and more systems are virtualized, Windows handles I/O logically rather than physically, breaking reads and writes down to their lowest common denominator.

This creates tiny, fractured, random I/O, resulting in a "noisy" environment that slows application performance. Left untreated, the problem only worsens over time.
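
The penalty from small, fragmented I/O can be demonstrated with another standard-library Python sketch: it writes the same 64 MB of data twice, once as a small number of large blocks and once as thousands of 4 KB blocks, forcing each write to storage so the per-operation overhead shows up in the timing. The file names and sizes are illustrative assumptions; actual numbers depend on the storage hardware and operating system.

    # Minimal sketch, using only the standard library, of the effect described
    # above: moving the same amount of data as thousands of tiny writes is far
    # slower than moving it as a smaller number of large writes. File names and
    # sizes are illustrative assumptions, not measurements from any product.
    import os
    import time

    TOTAL_BYTES = 64 * 1024 * 1024   # 64 MB of data in both cases

    def timed_write(path: str, block_size: int) -> float:
        """Write TOTAL_BYTES to path in blocks of block_size; return seconds."""
        block = b"\0" * block_size
        start = time.perf_counter()
        with open(path, "wb") as f:
            for _ in range(TOTAL_BYTES // block_size):
                f.write(block)
                f.flush()
                os.fsync(f.fileno())     # force each write to reach storage
        return time.perf_counter() - start

    if __name__ == "__main__":
        large = timed_write("large_io.tmp", 1024 * 1024)  # 64 x 1 MB writes
        small = timed_write("small_io.tmp", 4 * 1024)     # 16,384 x 4 KB writes
        print(f"Large-block run: {large:.2f}s")
        print(f"Small-block run: {small:.2f}s")
        os.remove("large_io.tmp")
        os.remove("small_io.tmp")

The fsync call makes each write behave like a separate physical I/O; on most systems the small-block run takes many times longer, which is the same kind of penalty an application pays when its reads and writes are broken into tiny fragments.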

Solutions

Even experienced IT professionals often assume that new hardware will solve these problems. Because data is so essential to healthcare organizations, they are tempted to throw money at the problem in the form of expensive new hardware.

While additional hardware can temporarily mask this degradation, targeted software can improve system throughput by 30 to 50 percent or more and should be part of the IT toolkit for any large-scale healthcare organization.

About the Author

James D’Arezzo is CEO of Condusiv Technologies, a global provider of software-only storage performance solutions for virtual and physical server environments.