Data Science at Medic

Post authored by Matt Cane, Medic’s Data Scientist.

It’s an exciting time at Medic, as we’re in the early stages of our journey into data science. We feel data science has tremendous potential to help us better understand the challenges facing health workers, and to allow us to design tools to make them more efficient and more effective in their jobs. But we’re also aware that there’s no guarantee that any data tools we build will improve on the expertise health workers already possess, and that we need to remain cognizant of the potential for algorithms to reinforce existing stereotypes.

In this post we’ll outline our vision for data science, including how we’ll look to leverage health workers’ existing expertise and how we plan to mitigate some of the risks associated with algorithmic decision making tools.

Our Vision For Data Science

At its core, our vision for data science is to put data to work for those who are most marginalized, helping health workers and health systems deliver high-quality, equitable, targeted, and timely healthcare for everyone. Medic’s users collect an enormous amount of ground-level data each day using our app. We hope to use this data to build new tools that help health workers and their supervisors deliver high-quality care as efficiently as possible, and to identify those who might be at risk of experiencing negative health outcomes at any given time. We feel that data can help us make certain that everyone has access to the services of health systems, and that it can enable health workers to provide more proactive care and to identify patients in need of assistance in a more timely manner.

While our list of goals is large, our plan is to start small and iterate, ensuring that we take stock at each stage of what’s working and what isn’t. We feel that data science can make a big difference in how health workers do their jobs, but we also know that it’s not as simple as throwing data into an algorithm and blindly trusting the outputs. We need to take the time to learn how health workers make care-related decisions now, and to consider whether any tools we build might disrupt practices that are working well already.

Centering Health Workers’ Expertise

Too often it seems like machine learning models are viewed as an automatic method of improving decision making, simply because they appear to remove all human judgement from the equation. However, the health workers who use our app have a wealth of expertise about their work and their communities, much of which is difficult (if not impossible) to encode in an algorithm or model. Each one of them has their own “internal” models for prioritizing patients and delivering care, and it would be extremely naive of us to think that an algorithm would automatically be able to improve on their learned expertise.

Instead, we’d like to look for clues in data that shed light on how health workers are already prioritizing patients, and then dig deep into the data to surface additional information that they may not know or otherwise have access to. In this approach, the insights that the application offers them complement what they’ve gained through their training and experience.

This is critical to our approach to data science, because if we don’t give health workers information that complements their existing expertise, we could destroy their trust in the app, or override their local expertise. And if their trust in the application or their own abilities begins to deteriorate, so might the quality of the care provided to the community, resulting in more harm than good.

Thinking Ahead About What Could Go Wrong

There are countless examples across many fields of machine learning algorithms going wrong, often serving less as an unbiased arbiter and more as a reinforcer of existing stereotypes. Because discrimination is often heavily baked into historical data, algorithms built on that data run the risk of encoding that discrimination into a supposedly “unbiased” method. The criminal justice system offers numerous cases of sentencing or parole algorithms drawing upon historical data to generate predictions that in many cases only increase existing inequalities.

That’s why one of the first questions we asked ourselves was “What could go wrong?” We drew up a list of the many different ways that any data tools we developed could make health workers’ jobs more difficult, or how they might lead to worse outcomes for the families they care for.

We came up with a range of potential negative outcomes, including the reinforcement of historical and inaccurate biases, the risk of overwhelming health workers with information that they’re unable to use to make effective decisions, or even the possibility of demotivating health workers if they felt an algorithm was taking away their autonomy to treat patients based on their experiences. We plan to refer back to this list with every data tool we choose to implement, in order to ensure that we mitigate each risk as much as possible at the design stage before anything we build goes live.

Looking Ahead

While there are many challenges in ensuring that we build data tools in a responsible way, we feel it is our duty to continue to deliver the best technology and scientific tools to support health workers and the communities they serve. We firmly believe that with careful design, including a focus on iterative rounds of human feedback, we can use data science to help make the health workers using our application more efficient and effective, and to ensure that the families that they work with on a daily basis receive the best care that can be delivered.

Although we’re only at the start of our data science journey, we aim to embark upon it with a commitment to both transparency and collaboration. With that in mind, I’d like to invite anyone who is working on similar tools or techniques to reach out on the CHT Forum.
