What Are Biases in Machine Learning?

Lesson Details:
June 29, 2020

I: Introduction

A: Bias is a common problem in machine learning. In this article, I will explain what bias is, walk through two examples, and discuss how it affects our society.

II: Body

A: What are biases?

Bias occurs when a machine learning model treats different groups of a population differently. For instance, a model may make reliable predictions for one group but not for another, which can lead to discrimination.

One example is a model trained to predict whether a person will contract a disease. The model learns from data on patients who have been diagnosed with the disease and patients who haven’t, and it performs well on that population. However, if we apply the same model to people with brown skin, it may predict that they are more likely to contract the disease than white people with similar symptoms. This happens when the training data came mostly from white patients, so the model has not learned enough about patients with brown skin.
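One simple way to spot this kind of gap is to compare how often a model predicts the positive outcome for each group. The sketch below is only an illustration: the function name `positive_rate_by_group`, the group labels, and the toy predictions are all made up for this example and do not come from any real medical dataset.

```python
from collections import defaultdict


def positive_rate_by_group(groups, predictions):
    """Return the fraction of positive (1) predictions for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, pred in zip(groups, predictions):
        totals[group] += 1
        positives[group] += pred
    return {g: positives[g] / totals[g] for g in totals}


# Hypothetical toy data: patients with similar symptoms, and the
# model's prediction for each (1 = "will contract the disease").
groups = ["white", "white", "white", "white",
          "brown", "brown", "brown", "brown"]
predictions = [1, 0, 0, 0, 1, 1, 1, 0]

rates = positive_rate_by_group(groups, predictions)
gap = abs(rates["white"] - rates["brown"])

print(rates)
print("gap between groups:", gap)
```

A large gap between groups with similar symptoms does not prove the model is biased, but it is a signal that the model's behavior on each group deserves a closer look.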

Another type of bias is called under-representation bias. It arises when some groups appear rarely, or not at all, in the training data. For example, a model trained on data from a fashion website such as http://www.polyvore.com/ might learn that women like pink clothes and fishtail braid hairstyles, because that is what its data shows. If we then applied the same model to a different population, such as LinkedIn profiles, it could conclude that men also like pink clothes and fishtail braids, simply because it has seen too little data about them. Predictions built on such skewed data could lead to discrimination in fashion and hairstyle recommendations.
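Under-representation can often be checked before any training happens, by comparing each group's share of the training data with its share of the population the model will serve. The following is a minimal sketch under assumed inputs: the helper names, the 0.1 tolerance, the group labels, and the reference shares are hypothetical and chosen only for illustration.

```python
from collections import Counter


def group_shares(labels):
    """Return each group's share of the dataset as a fraction of all rows."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


def under_represented(labels, reference_shares, tolerance=0.1):
    """List groups whose share in the data falls more than `tolerance`
    below their share in the reference population."""
    shares = group_shares(labels)
    return sorted(
        group
        for group, ref in reference_shares.items()
        if shares.get(group, 0.0) < ref - tolerance
    )


# Hypothetical toy data: 9 of 10 training rows come from women.
training_groups = ["women"] * 9 + ["men"] * 1
# Assumed shares in the population the model will be applied to.
reference = {"women": 0.5, "men": 0.5}

flagged = under_represented(training_groups, reference)
print(flagged)
```

The reference shares have to come from somewhere outside the training set, such as census figures or the platform's own user statistics, and choosing them is itself a judgment call.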

III: Conclusion

A: These biases can be reduced by training models on larger, more representative datasets and by cross-checking their predictions against real-world data.
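Cross-checking against real-world data can be as simple as measuring accuracy separately for each group on a labeled sample the model has never seen. The sketch below assumes such a sample exists; the function name `accuracy_by_group`, the group names "a" and "b", and all values are invented for this example.

```python
from collections import defaultdict


def accuracy_by_group(groups, y_true, y_pred):
    """Return the fraction of correct predictions for each group."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, truth, pred in zip(groups, y_true, y_pred):
        totals[group] += 1
        if truth == pred:
            correct[group] += 1
    return {g: correct[g] / totals[g] for g in totals}


# Hypothetical held-out sample with known outcomes.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 1]

accuracy = accuracy_by_group(groups, y_true, y_pred)
print(accuracy)
```

A model that looks accurate overall can still perform poorly on one group, which is why the per-group breakdown is more informative here than a single accuracy number.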
