In fact, there is no single value for this that is universally applicable. What it comes down to is the cost of a missed diagnosis in the screening scenario. In clinical practice, recall for the initial screening of most serious diseases is held above 85%, and in extreme scenarios it can even be pushed to 99%.
To be honest, recall is just the proportion of true positive cases the model manages to catch, and in a screening scenario the value is entirely a cost calculation: if missing a case is expensive enough, set it high; if false positives are expensive enough, set it low. For diseases like AIDS and rabies, where a case missed at initial screening is extremely lethal or highly contagious, the classification threshold of a logistic regression risk model may be lowered all the way to 0.1: anyone with even a slight positive signal is recalled for antibody and nucleic-acid retesting. Recall can then exceed 98%, and nobody cares if specificity drops to 40%; after all, a missed true positive not only loses the patient the best window for intervention but may also fuel further community spread, and the cost of retesting is nothing next to life and health. The logic is similar to security screening for high-risk flights: better to open a few extra bags than to let a single risk point through.
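The trade-off above (lower threshold, higher recall, lower specificity) is easy to see on synthetic data. This is only an illustrative sketch with scikit-learn; the dataset and its 5% positive prevalence are made up, and only the 0.1 cutoff comes from the discussion:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a screening dataset: roughly 5% positive prevalence.
X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
probs = model.predict_proba(X_te)[:, 1]

for threshold in (0.5, 0.1):
    pred = (probs >= threshold).astype(int)
    recall = pred[y_te == 1].mean()              # fraction of true positives caught
    specificity = (pred[y_te == 0] == 0).mean()  # fraction of negatives cleared
    print(f"threshold={threshold}: recall={recall:.2f}, specificity={specificity:.2f}")
```

Dropping the threshold from 0.5 to 0.1 can only move recall up and specificity down; the only question is by how much, which depends on how well separated the two classes are.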
But the conclusion is completely different in another scenario. Take routine physical examinations of ordinary healthy people, where logistic regression is used to screen thyroid nodules for malignancy risk. If recall were set to 95%, nearly 30% of patients with benign nodules would be sent for needle biopsy. Biopsy is an invasive procedure and causes ordinary people a great deal of anxiety. Here recall is instead kept in the 80%-85% range: the low-risk cases that slip through tend to progress very slowly, and annual routine checkups will still catch them in time. The cost of a missed diagnosis is low, so there is no need to put most healthy people through that just to catch a few more very early cases.
There is real controversy over this value in the industry right now. Last year our team worked with a local maternal and child health hospital on a logistic regression screening model for rare metabolic diseases in newborns. We initially set recall to 90% following common screening practice, and the clinical team called back immediately: rare diseases are hard to detect in the first place, and a child missed at initial screening may take three or four years to get diagnosed, by which point the golden window for intervention is gone and the damage is lifelong; even if 90 extra false positives per 1,000 newborns get recalled, each can be cleared with one more tube of blood for mass spectrometry, which is worth it by any calculation. Statisticians at the CDC disagreed: many first-time parents are already anxious, a recheck call convinces them their child must be seriously ill, and some break down emotionally; there have even been extreme cases of pregnant women rashly inducing labor after a false-positive notification, so the secondary harm of false positives cannot be ignored. The two sides went back and forth for almost a month and finally settled on region-differentiated targets: recall was set to 95% in areas with well-informed parents and a complete follow-up system, and held at 85% in less developed areas, trying to balance the risks on both sides.
One more point: logistic regression's advantages here are high interpretability and easy threshold adjustment. As long as the dataset itself is not too poor, any recall you want can be reached simply by lowering the classification threshold; there is no question of "the model not being capable of it", only of how much specificity you are willing to give up. So, all in all, this value is never decided by algorithm engineers staring at metrics. It is settled by clinicians, disease-control staff, and even patient representatives sitting down together and working through the actual scenario; the right value is whichever one balances the risk of missed diagnosis against the cost of screening.
