Investigating Oversampling Techniques for Fair Machine Learning Models

Please use this identifier to cite or link to this item: https://rfos.fon.bg.ac.rs/handle/123456789/2263

Title:	Investigating Oversampling Techniques for Fair Machine Learning Models
Authors:	Rančić, S.; Radovanović, Sandro; Delibašić, Boris
Keywords:	SMOTE;Oversampling;Machine learning;Data preprocessing;Algorithmic fairness
Issue Date:	2021
Publisher:	Springer Science and Business Media Deutschland GmbH
Abstract:	Applying machine learning in real-world applications can have various implications for companies, and for individuals as well. Besides lower costs, faster decisions, and higher decision accuracy, automating decisions can lead to unethical and illegal consequences. More specifically, predictions can systematically discriminate against a certain group of people, mainly because of dataset bias. In this paper, we investigate oversampling of instances as a way to improve fairness. We tried several strategies and two techniques, namely SMOTE and random oversampling. Besides traditional oversampling techniques, we also tried oversampling instances based on sensitive attributes (e.g., gender or race). We demonstrate on real-world datasets (Adult and COMPAS) that oversampling techniques increase fairness without a large decrease in predictive accuracy. Oversampling improved fairness by up to 15% and AUPRC by up to 3%, with a loss in AUC of 2%.
URI:	https://rfos.fon.bg.ac.rs/handle/123456789/2263
ISSN:	1865-1348
Appears in Collections:	Radovi istraživača / Researchers’ publications
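
The abstract mentions random oversampling of instances based on sensitive attributes, but this record does not describe the exact strategy. As a rough illustration only, the sketch below shows one plausible reading: rows are grouped by (sensitive attribute value, label) and each group is randomly duplicated up to the size of the largest group. The function name `oversample_by_group`, the column names, and the toy data are all hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def oversample_by_group(rows, sensitive_key, label_key, seed=0):
    """Randomly duplicate rows until every (sensitive value, label) group
    is as large as the largest such group.

    Assumption: this grouping is one possible strategy, not necessarily
    the one evaluated in the paper.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[(row[sensitive_key], row[label_key])].append(row)
    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        # Sample with replacement to fill the gap up to the target size.
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# Toy data made up for illustration; not drawn from Adult or COMPAS.
toy_rows = (
    [{"gender": "M", "income_high": 1}] * 6
    + [{"gender": "M", "income_high": 0}] * 4
    + [{"gender": "F", "income_high": 1}] * 1
    + [{"gender": "F", "income_high": 0}] * 5
)
balanced_rows = oversample_by_group(toy_rows, "gender", "income_high")
```

Random oversampling only duplicates existing rows. SMOTE, the other technique named in the abstract, instead creates synthetic instances by interpolating between a minority instance and its nearest neighbours, which this sketch does not attempt.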
