MIT Researchers Develop Method to Enhance Machine Learning Fairness
Brief news summary
MIT researchers have created a novel method to improve the fairness and accuracy of machine-learning models by addressing dataset biases that often leave certain groups underrepresented. Such biases can lead to serious errors, for example misdiagnoses when models trained primarily on male patient data are applied to female patients. Traditional remedies often involve removing large segments of data, which can degrade model performance. Led by Kimia Hamidieh, the MIT team developed a technique that selectively removes the biased data points that harm minority groups while preserving overall model accuracy. The method can also identify hidden biases in unlabeled datasets, enhancing fairness in critical sectors such as healthcare, and it complements existing fairness strategies. The approach focuses on reducing "worst-group error," the error a model makes on the minority subgroup where it performs worst. Using a data-attribution technique called TRAK, the team identifies and removes the data points driving inaccurate predictions, then retrains the model without changing its architecture. This flexibility matters for many model types, especially when subgroup labels are not well defined. The new method outperformed existing techniques on three datasets, achieving higher worst-group accuracy while removing far fewer data points. Supported by the National Science Foundation and DARPA, the research is a significant step toward fair and reliable machine-learning models, and the team is working to refine the technique for practical use.

Machine-learning models often underperform for minority groups because of imbalanced training datasets, which can lead to incorrect predictions. For example, a model trained primarily on data from male patients may not accurately predict treatment for female patients.
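The "worst-group error" the researchers target is simply the error rate on the subgroup where the model does worst, so overall accuracy can look high while one group fares badly. As a brief illustration of the metric itself (not the MIT team's method), it can be computed like this:

```python
import numpy as np

def worst_group_accuracy(y_true, y_pred, groups):
    """Return the lowest per-subgroup accuracy, plus all per-group accuracies.

    y_true, y_pred: label and prediction arrays.
    groups: subgroup id for each example (e.g., 0 = male patients, 1 = female patients).
    """
    per_group = {}
    for g in np.unique(groups):
        mask = groups == g
        per_group[int(g)] = float(np.mean(y_true[mask] == y_pred[mask]))
    # "Worst-group error" is 1 minus this minimum accuracy.
    return min(per_group.values()), per_group
```

For instance, a model that is perfect on a majority group but right only two times out of three on a minority group has a worst-group accuracy of about 0.67, even though its overall accuracy is much higher.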
To address this, engineers sometimes balance datasets by removing data points, but this can harm overall model performance. Researchers from MIT have developed a method that selectively removes only the data points that most contribute to a model's poor performance on minority groups, maintaining overall accuracy while improving fairness. The technique can also reveal hidden biases in datasets that lack labels, which is useful because unlabeled data is far more common than labeled data.
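The article gives no implementation details, so the following is only a rough sketch of the remove-and-retrain idea it describes. The per-example harmfulness scores are assumed to be precomputed (in the MIT work they come from the TRAK data-attribution method), and a simple nearest-centroid classifier stands in for an arbitrary model:

```python
import numpy as np

def nearest_centroid_fit(X, y):
    # Fit a minimal nearest-centroid classifier (stand-in for any model).
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def nearest_centroid_predict(model, X):
    classes, centroids = model
    # Squared distance from each example to each class centroid.
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[np.argmin(d, axis=1)]

def remove_and_retrain(X, y, influence, k):
    """Drop the k training points with the highest (most harmful) influence
    scores, then retrain from scratch on the remaining data.

    `influence` is an assumed, precomputed per-example harm score; the actual
    scoring method in the MIT work (TRAK) is not reproduced here.
    """
    keep = np.argsort(influence)[: len(y) - k]  # indices of the least harmful points
    return nearest_centroid_fit(X[keep], y[keep]), keep
```

The key property mirrored here is that the model's architecture is untouched: only the training set changes, which is why the approach can be applied to many model types.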
The method has outperformed existing approaches, removing fewer samples while achieving higher worst-group accuracy. It offers an accessible way to improve model fairness without altering the model's architecture, making it a potentially useful tool for practitioners. The researchers aim to further validate and refine the approach, supporting the development of fairer and more reliable models. This research is supported by the National Science Foundation and the U.S. Defense Advanced Research Projects Agency.