A causal perspective on dataset bias in machine learning for medical imaging – Nature.com

Char, D. S., Shah, N. H. & Magnus, D. Implementing machine learning in health care – addressing ethical challenges. N. Engl. J. Med. 378, 981–983 (2018).

Obermeyer, Z., Powers, B., Vogeli, C. & Mullainathan, S. Dissecting racial bias in an algorithm used to manage the health of populations. Science 366, 447–453 (2019).

Wiens, J. et al. Do no harm: a roadmap for responsible machine learning for health care. Nat. Med. 25, 1337–1340 (2019).

Buolamwini, J. & Gebru, T. Gender shades: intersectional accuracy disparities in commercial gender classification. In Proc. 1st Conference on Fairness, Accountability and Transparency (eds Friedler, S. A. & Wilson, C.) 77–91 (PMLR, 2018).

Beede, E. et al. A human-centered evaluation of a deep learning system deployed in clinics for the detection of diabetic retinopathy. In Proc. 2020 CHI Conference on Human Factors in Computing Systems 1–12 (Association for Computing Machinery, 2020).

Seyyed-Kalantari, L., Liu, G., McDermott, M., Chen, I. Y. & Ghassemi, M. CheXclusion: fairness gaps in deep chest X-ray classifiers. Pacific Symp. Biocomput. 26, 232–243 (World Scientific, 2021).

Seyyed-Kalantari, L., Zhang, H., McDermott, M. B., Chen, I. Y. & Ghassemi, M. Underdiagnosis bias of artificial intelligence algorithms applied to chest radiographs in under-served patient populations. Nat. Med. 27, 2176–2182 (2021).

Mamary, A. J. et al. Race and gender disparities are evident in COPD underdiagnoses across all severities of measured airflow obstruction. Chronic Obstruct. Pulmon. Dis. 5, 177–184 (2018).

Oakden-Rayner, L., Dunnmon, J., Carneiro, G. & Ré, C. Hidden stratification causes clinically meaningful failures in machine learning for medical imaging. Proc. ACM Conf. Health Infer. Learn. 2020, 151–159 (2020).

Gianfrancesco, M. A., Tamang, S., Yazdany, J. & Schmajuk, G. Potential biases in machine learning algorithms using electronic health record data. JAMA Intern. Med. 178, 1544–1547 (2018).

Larrazabal, A. J., Nieto, N., Peterson, V., Milone, D. H. & Ferrante, E. Gender imbalance in medical imaging datasets produces biased classifiers for computer-aided diagnosis. Proc. Natl Acad. Sci. USA 117, 12592–12594 (2020).

Wang, Z. et al. Towards fairness in visual recognition: effective strategies for bias mitigation. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 8916–8925 (IEEE, 2020).

Zietlow, D. et al. Leveling down in computer vision: pareto inefficiencies in fair deep classifiers. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 10410–10421 (IEEE, 2022).

Alvi, M., Zisserman, A. & Nellaaker, C. Turning a blind eye: explicit removal of biases and variation from deep neural network embeddings. In Proc. European Conference on Computer Vision Workshops 556–572 (Springer, 2018).

Kim, B., Kim, H., Kim, K., Kim, S. & Kim, J. Learning not to learn: training deep neural networks with biased data. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 9012–9020 (IEEE, 2019).

Madras, D., Creager, E., Pitassi, T. & Zemel, R. Learning adversarially fair and transferable representations. In International Conference on Machine Learning 3384–3393 (PMLR, 2018).

Edwards, H. & Storkey, A. Censoring representations with an adversary. In International Conference on Learning Representations (eds Bengio, Y. & LeCun, Y.) (2016).

Ramaswamy, V. V., Kim, S. S. Y. & Russakovsky, O. Fair attribute classification through latent space de-biasing. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 9301–9310 (IEEE, 2021).

Wang, M., Deng, W., Hu, J., Tao, X. & Huang, Y. Racial faces in the wild: reducing racial bias by information maximization adaptation network. In Proc. IEEE/CVF International Conference on Computer Vision 692–702 (IEEE, 2019).

Hendricks, L. A., Burns, K., Saenko, K., Darrell, T. & Rohrbach, A. Women also snowboard: overcoming bias in captioning models. In Computer Vision – ECCV 2018 Vol. 11207 (eds Ferrari, V. et al.) 793–811 (Springer, 2018).

Li, Y. & Vasconcelos, N. REPAIR: removing representation bias by dataset resampling. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition 9564–9573 (IEEE, 2019).

Quadrianto, N., Sharmanska, V. & Thomas, O. Discovering fair representations in the data domain. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition 8219–8228 (IEEE, 2019).

Wang, T., Zhao, J., Yatskar, M., Chang, K.-W. & Ordonez, V. Balanced datasets are not enough: estimating and mitigating gender bias in deep image representations. In 2019 IEEE/CVF International Conference on Computer Vision 5309–5318 (IEEE, 2019).

Corbett-Davies, S. & Goel, S. The measure and mismeasure of fairness: a critical review of fair machine learning. Preprint at https://arxiv.org/abs/1808.00023 (2018).

Friedler, S. A. et al. A comparative study of fairness-enhancing interventions in machine learning. In Proc. Conference on Fairness, Accountability, and Transparency 329–338 (Association for Computing Machinery, 2019).

Zong, Y., Yang, Y. & Hospedales, T. MEDFAIR: benchmarking fairness for medical imaging. In International Conference on Learning Representations (eds Kim, B., Nickel, M., Wang, M., Chen, N. F. & Marivate, V.) (2023).

Castro, D. C., Walker, I. & Glocker, B. Causality matters in medical imaging. Nat. Commun. 11, 3673 (2020).

Subbaswamy, A. & Saria, S. From development to deployment: dataset shift, causality, and shift-stable models in health AI. Biostatistics 21, 345–352 (2020).

Subbaswamy, A. & Saria, S. Counterfactual normalization: proactively addressing dataset shift using causal mechanisms. In 34th Conference on Uncertainty in Artificial Intelligence 2018 947–957 (Association For Uncertainty in Artificial Intelligence, 2018).

Subbaswamy, A., Schulam, P. & Saria, S. Preventing failures due to dataset shift: learning predictive models that transport. In Proc. Twenty-Second International Conference on Artificial Intelligence and Statistics 3118–3127 (PMLR, 2019).

Huang, B. et al. Behind distribution shift: mining driving forces of changes and causal arrows. Proc. IEEE Int. Conf. Data Mining 2017, 913–918 (2017).

Yue, Z., Sun, Q., Hua, X.-S. & Zhang, H. Transporting causal mechanisms for unsupervised domain adaptation. In Proc. IEEE/CVF International Conference on Computer Vision 2021 8599–8608 (IEEE, 2021).

Zhang, K., Gong, M. & Schoelkopf, B. Multi-source domain adaptation: a causal view. In Proceedings of the AAAI Conference on Artificial Intelligence 29, 3150–3157 (AAAI Press, Palo Alto, CA, 2015).

Magliacane, S. et al. Domain adaptation by using causal inference to predict invariant conditional distributions. In Proc. 32nd International Conference on Neural Information Processing Systems 10869–10879 (Curran Associates Inc., 2018).

Chen, R. J. et al. Algorithmic fairness in artificial intelligence for medicine and healthcare. Nat. Biomed. Eng. 7, 719–742 (2023).

Vapnik, V. An overview of statistical learning theory. IEEE Trans. Neur. Netw. 10, 988–999 (1999).

Peters, J., Janzing, D. & Schölkopf, B. Elements of Causal Inference: Foundations and Learning Algorithms (MIT Press, 2017).

Pearl, J. Causality: Models, Reasoning, and Inference 2nd edn (Cambridge Univ. Press, 2011).

Schölkopf, B. et al. On causal and anticausal learning. In Proc. 29th International Conference on Machine Learning 459–466 (Omnipress, 2012).

Verma, T. & Pearl, J. Causal networks: semantics and expressiveness. In Proc. Fourth Annual Conference on Uncertainty in Artificial Intelligence 69–78 (North-Holland Publishing Co., 1990).

Pearl, J. & Dechter, R. Identifying independencies in causal graphs with feedback. In Proc. Twelfth International Conference on Uncertainty in Artificial Intelligence 420–426 (Morgan Kaufmann Publishers Inc., 1996).

Glocker, B., Jones, C., Bernhardt, M. & Winzeck, S. Algorithmic encoding of protected characteristics in chest X-ray disease detection models. eBioMedicine 89, 104467 (2023).

Gichoya, J. W. et al. AI recognition of patient race in medical imaging: a modelling study. Lancet Digit. Health 4, e406–e414 (2022).

Jones, C., Roschewitz, M. & Glocker, B. The role of subgroup separability in group-fair medical image classification. In Medical Image Computing and Computer Assisted Intervention 2023 179–188 (Springer Nature, 2023).

Mccradden, M. et al. What's fair is fair? Presenting JustEFAB, an ethical framework for operationalizing medical ethics and social justice in the integration of clinical machine learning: JustEFAB. In Proc. 2023 ACM Conference on Fairness, Accountability, and Transparency 1505–1519 (Association for Computing Machinery, 2023).

Chiappa, S. Path-specific counterfactual fairness. In Proceedings of the AAAI Conference on Artificial Intelligence 33, 7801–7808 (AAAI Press, Palo Alto, CA, 2019).

Friedler, S. A., Scheidegger, C. & Venkatasubramanian, S. On the (im)possibility of fairness. Preprint at https://arxiv.org/abs/1609.07236 (2016).

Wachter, S., Mittelstadt, B. & Russell, C. Bias preservation in machine learning: the legality of fairness metrics under EU non-discrimination law. West Virginia Law Review 123, 735–790 (2021).

Hardt, M., Price, E. & Srebro, N. Equality of opportunity in supervised learning. In Advances in Neural Information Processing Systems (eds Lee, D. et al.) 29, 3323–3331 (Curran Associates, 2016).

Zemel, R., Wu, Y., Swersky, K., Pitassi, T. & Dwork, C. Learning fair representations. In Proc. 30th International Conference on Machine Learning 325–333 (PMLR, 2013).

Dutta, S. et al. Is there a trade-off between fairness and accuracy? A perspective using mismatched hypothesis testing. In Proc. 37th International Conference on Machine Learning 2803–2813 (PMLR, 2020).

Wick, M., Panda, S. & Tristan, J.-B. Unlocking fairness: a trade-off revisited. In Advances in Neural Information Processing Systems Vol. 32 (Curran Associates, Inc., 2019).

Plecko, D. & Bareinboim, E. Causal fairness analysis. Preprint at https://arxiv.org/abs/2207.11385 (2022).

Mao, C. et al. Causal transportability for visual recognition. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 7521–7531 (IEEE, 2022).

Pearl, J. & Bareinboim, E. Transportability of causal and statistical relations: a formal approach. In Proceedings of the AAAI Conference on Artificial Intelligence 25, 247–254 (AAAI Press, Palo Alto, CA, 2011).

Jiang, Y. & Veitch, V. Invariant and transportable representations for anti-causal domain shifts. Adv. Neur. Inf. Process. Syst. 35, 20782–20794 (2022).

Wolpert, D. & Macready, W. No free lunch theorems for optimization. IEEE Trans. Evol. Comput. 1, 67–82 (1997).

Holland, P. W. Statistics and causal inference. J. Am. Stat. Assoc. 81, 945–960 (1986).

Schrouff, J. et al. Diagnosing failures of fairness transfer across distribution shift in real-world medical settings. In Advances in Neural Information Processing Systems 3, 19304–19318 (Curran Associates, 2022).

Bernhardt, M., Jones, C. & Glocker, B. Potential sources of dataset bias complicate investigation of underdiagnosis by machine learning algorithms. Nat. Med. 28, 1157–1158 (2022).

Szczepura, A. Access to health care for ethnic minority populations. Postgrad. Med. J. 81, 141–147 (2005).

Richardson, L. D. & Norris, M. Access to health and health care: how race and ethnicity matter. Mt Sinai J. Med. 77, 166–177 (2010).

Niccoli, T. & Partridge, L. Ageing as a risk factor for disease. Curr. Biol. 22, R741–752 (2012).

Riedel, B. C., Thompson, P. M. & Brinton, R. D. Age, APOE and sex: triad of risk of Alzheimer's disease. J. Steroid Biochem. Molec. Biol. 160, 134–147 (2016).

Dwork, C., Immorlica, N., Kalai, A. T. & Leiserson, M. Decoupled classifiers for group-fair and efficient machine learning. In Proc. 1st Conference on Fairness, Accountability and Transparency Vol. 81 (eds Friedler, S. A. & Wilson, C.) 119–133 (PMLR, 2018).

Boyko, E. J. & Alderman, B. W. The use of risk factors in medical diagnosis: opportunities and cautions. J. Clin. Epidemiol. 43, 851–858 (1990).

Iglehart, J. K. Health insurers and medical-imaging policy – a work in progress. N. Engl. J. Med. 360, 1030–1037 (2009).

Iglehart, J. K. The new era of medical imaging – progress and pitfalls. N. Engl. J. Med. 354, 2822–2828 (2006).

Irvin, J. et al. CheXpert: a large chest radiograph dataset with uncertainty labels and expert comparison. Proc. AAAI Conf. Artif. Intell. 33, 590–597 (2019).

Johnson, A. E. W. et al. MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports. Sci. Data 6, 317 (2019).

Jiang, H. & Nachum, O. Identifying and correcting label bias in machine learning. In Proc. Twenty Third International Conference on Artificial Intelligence and Statistics 702–712 (PMLR, 2020).

Gebru, T. et al. Datasheets for datasets. Commun. ACM 64, 86–92 (2021).
