Fairness and Usefulness
The impact of using artificial intelligence (AI) to guide patient care or operational processes is an interplay of the AI model’s output, the decision-making protocol based on that output, and the capacity of the stakeholders involved to take the necessary subsequent action. Estimating the effects of this interplay before deployment, and studying it in real time afterwards, are essential to bridge the chasm between AI model development and achievable benefit.1
To bridge these gaps, we developed a framework for estimating usefulness (JAMIA) and made it broadly available as a Python library for usefulness simulations of machine learning models in healthcare (JBI). We also developed a way to assess fairness in terms of the consequences of using a model to guide care (BMJ Informatics). Finally, to demonstrate application in practice, we conducted a fairness audit (Frontiers in Digital Health), which required 115 person-hours over 8-10 months.
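To make the idea of a usefulness simulation concrete, the sketch below models the interplay described above: a model's scores, a threshold-based decision protocol, and a care team's limited capacity to act. It is a minimal, hypothetical illustration under assumed cohort parameters and intervention effects; the function names and numbers are not the published library's API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical cohort: true risk of an adverse event and a noisy model score.
n_patients = 10_000
true_risk = rng.beta(2, 18, size=n_patients)                         # ~10% base rate (assumed)
event = rng.random(n_patients) < true_risk                           # event if left untreated
score = np.clip(true_risk + rng.normal(0, 0.08, n_patients), 0, 1)   # imperfect model output

def simulate_usefulness(threshold, daily_capacity, relative_risk_reduction=0.4,
                        patients_per_day=100):
    """Estimate events averted when the care team can act on only a limited
    number of flagged patients per day (capacity constraint)."""
    averted = 0.0
    for start in range(0, n_patients, patients_per_day):
        idx = np.arange(start, min(start + patients_per_day, n_patients))
        flagged = idx[score[idx] >= threshold]
        # Capacity constraint: intervene on the highest-scoring flagged patients only.
        acted_on = flagged[np.argsort(-score[flagged])][:daily_capacity]
        # Intervention averts a fraction of events among treated patients (assumed effect).
        averted += relative_risk_reduction * event[acted_on].sum()
    return averted

# The same model can be useful or not depending on the decision protocol
# (threshold) and the stakeholders' capacity to act on its output.
for threshold in (0.1, 0.2, 0.3):
    for capacity in (5, 20):
        print(f"threshold={threshold:.1f} capacity={capacity:2d} "
              f"-> expected events averted ~ {simulate_usefulness(threshold, capacity):.0f}")
```

Running the sweep shows why usefulness cannot be read off a model's accuracy alone: a lower threshold flags more true positives, but the gain disappears when daily capacity is the binding constraint.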
We have combined these efforts into a mechanism for identifying fair, useful, and reliable AI models (FURM). The assessment comprises an ethical review to identify potential value mismatches, simulations to estimate usefulness, financial projections to assess sustainability, and analyses to determine IT feasibility, design a deployment strategy, and recommend a prospective monitoring and evaluation plan. Our novel contributions (usefulness estimates by simulation, a process for ethical assessment, and financial projections to quantify sustainability), along with their underlying methods and open-source tools, are available for other healthcare systems to conduct actionable evaluations of candidate AI solutions.2
1 https://hai.stanford.edu/news/how-do-we-ensure-healthcare-ai-useful
2 https://furm.stanford.edu/