Why We Invest in healthcare.ai
We started healthcare.ai in late 2016 to bring machine learning (ML) to the healthcare masses. As we release version 2.0 of the software (on April 20th), it’s worth stepping back to fully understand why we invest in this open-source project, which is freely available to all. Why would a for-profit firm spend time investing in this public good?
Since the 2009 HITECH Act incentivized EHR adoption, data has become far more abundant in healthcare. Despite all that’s gone wrong in US healthcare, the fact that healthcare data is now largely digitized is something to celebrate. The need for data warehouses, which integrate, cleanse, and standardize the wealth of disparate data sources for analytic use, rose commensurately over the same period. This analytic world is the realm of reports and dashboards; the realm of tracking, comparing, slicing, and dicing. This is the realm of business intelligence.
While this realm of boardroom analytics is helpful when analyzing clinics, departments, cohorts, and aggregate measures, it’s difficult to translate lessons from such tools into optimal in-the-moment decision-making for individual patients. And business intelligence isn’t to blame! Front-line staff are simply overloaded with information. In medicine, it’s estimated that seven thousand journal articles are published each month. Not only is that clearly too much for even sub-specialists to keep up with, but hospital idiosyncrasies and unique patient demographics also make it difficult to know what might apply to individual patient care decisions.
Enter machine learning. It leverages local data—on past patient attributes and outcomes—to optimize clinical or operational decision-making about the patient currently seeking treatment. When presented simply and actionably, it relieves the clinician’s cognitive burden by suggesting optimal treatments based on what has worked best for similar patients in the past.
While the technology and software to produce this kind of ML-driven decision support have existed for 15+ years, only recently has healthcare gained the technical and cultural prerequisites for widespread acceptance. And while ML has received considerable buzz, it’s still not widely used in healthcare. Why is that?
- Data scientists and PhDs, the long-time ML gatekeepers, are still relatively rare in most health systems.
- While researchers have long been able to create models that explain how variables relate, using ML-based models for in-the-moment decision support is substantially more difficult. The evidence is the many papers that describe interesting models that never make it into daily hospital use.
- Analytic directors and health system professionals often don’t know what’s possible because there hasn’t been a great source of practical healthcare ML education.
For these reasons, we took this work public. To truly realize the potential of data for optimizing healthcare outcomes, ML has to become ubiquitous in the average health system, across not only clinical but also operational and financial realms. Despite the buzz, we didn’t feel that ML was on that practical trajectory in healthcare, which is why we started and open-sourced healthcare.ai. It advances and supports healthcare ML in several ways:
- Focus your efforts via open-source software. This software brings ML to the masses via a simple interface, wise decisions made under the hood, and clear use case examples. Our goal was not only to put ML in reach of BI developers, analysts, and software engineers, but also to make the lives of data scientists easier. With each new model, you don’t have to worry about what preprocessing should be done, how to tune the model, what metrics are appropriate, or how to deploy the model. This gives you more time to focus on important details like prioritizing the right business question, finding optimal features, defining metrics for project success, and nailing the workflow integration (so your insight realizes wide adoption).
- Easy integration for streamlined deployment. Once you do have solid answers to those questions, healthcare.ai makes it easier to put models into batch or real-time production environments, so that your modeling work drives daily care decisions and isn’t just an academic project. We do this via clear ways to document, save, and reuse saved models, with examples that don’t require a PhD.
- Simplified maintenance through stable code. Relying on such software, rather than writing your own scripts, lets you put much less ad-hoc code (which is dangerous) in production overall. The unit testing, version control, and continuous integration that stand behind this software make supporting and maintaining many models actually feasible.
This 2.0 release, which will be detailed further in a technical post later this week, doubles down on the above principles. As we’ve put models in production around CLABSIs, readmissions, no-shows, propensity to pay, etc., we’ve learned plenty from our work at 15+ health systems and have put the lessons back into the software. Broadly, 2.0 provides a clearer interface (that allows you to train a model with two lines of code), provides for even more accurate models, adds more software and statistical rigor under the hood, and automates many ML tasks that you may not want to spend time on. As always, this is done with practical examples and use cases in mind. Rather than folks working in silos, the goal is to work together via open code and community support. The tools and this endeavor are useless if we don’t show you how they can help your efforts in healthcare. To that end, here are the software docs, the past broadcasts on practical use cases, and the community where you can find continuing support.