Mike Mastanduno, Data Scientist
Health Catalyst

# Model evaluation using ROC Curves

Before we can get to the curve itself, we need a few definitions. Let’s say we’ve built a machine learning model to predict the likelihood of 30-day readmission in a set of patients. The model gives a probability (between 0 and 1) for each person of how likely they are to be readmitted. Thirty days later, once we know who actually came back, we can score those predictions. The True Positive Rate (TPR) is the proportion of actual readmissions that the model correctly predicted would be readmitted. The False Positive Rate (FPR) is the proportion of patients who were not readmitted but whom the model predicted would be.

In order to make these black-and-white predictions, we must pick a decision boundary somewhere between 0 and 1. Remember, the model gives a probability, not a definitive answer. If we were to choose 0.9, we would say that everyone with a readmission probability above 0.9 is a readmission and everyone below is not. We could then calculate the TPR and FPR, and have a measure of how well our model performed at a decision boundary of 0.9. As you might have guessed, the decision boundary is a sticky spot. If we were to choose 0.8 instead to increase the TPR, it would typically come at the expense of a larger FPR. The three quantities (TPR, FPR, and decision boundary) are tied to one another in a way that makes models hard to interpret and discuss.
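To make the trade-off concrete, here is a minimal sketch in plain Python. The ten patients, their probabilities, and the helper name `tpr_fpr` are all made up for illustration; they are not from a real model or dataset. The sketch assumes a patient is predicted as a readmission when their probability is strictly above the decision boundary.

```python
def tpr_fpr(probs, readmitted, threshold):
    """Return (TPR, FPR) for a given decision boundary.

    probs      -- predicted readmission probabilities, one per patient
    readmitted -- 1 if the patient was actually readmitted, else 0
    threshold  -- decision boundary; probabilities above it count as
                  predicted readmissions
    Assumes at least one readmitted and one non-readmitted patient.
    """
    tp = fp = fn = tn = 0
    for p, actual in zip(probs, readmitted):
        predicted = p > threshold
        if predicted and actual:
            tp += 1   # predicted readmission, was readmitted
        elif predicted and not actual:
            fp += 1   # predicted readmission, was not readmitted
        elif not predicted and actual:
            fn += 1   # missed readmission
        else:
            tn += 1   # correctly predicted no readmission
    tpr = tp / (tp + fn)  # share of actual readmissions we caught
    fpr = fp / (fp + tn)  # share of non-readmissions we flagged
    return tpr, fpr


# Hypothetical data: 4 patients were readmitted, 6 were not.
probs      = [0.95, 0.85, 0.70, 0.40, 0.92, 0.83, 0.60, 0.30, 0.20, 0.05]
readmitted = [1,    1,    1,    1,    0,    0,    0,    0,    0,    0]

for threshold in (0.9, 0.8):
    tpr, fpr = tpr_fpr(probs, readmitted, threshold)
    print(f"boundary={threshold}: TPR={tpr:.2f}, FPR={fpr:.2f}")
```

With this toy data, moving the boundary from 0.9 to 0.8 raises the TPR from 0.25 to 0.50, but the FPR also climbs from about 0.17 to about 0.33.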