What is Machine Learning?

I get this question a lot, so I figured I would take some time to explain it, especially for people who are not programmers. Machine learning is how we teach a machine, in this case a computer, to think for itself by learning from the data available to it. You can think of what it learns as a big collection of if/then statements. So all day long, you’re providing data to the computer about what you like, what you do, how you do it, etc. And someone told the computer that if you do X, you’ll also do Y, or if you like X, you’ll also like Y. So you can imagine how many companies are clamoring to know what the Xs and Ys are.
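
To make that concrete, here is a tiny sketch in Python of a rule being learned rather than written by hand. The products and purchase histories are made up for illustration; the point is only that the if/then comes out of whatever data happens to be there.

```python
from collections import Counter

# Hypothetical purchase histories, invented for illustration.
histories = [
    {"running shoes", "water bottle"},
    {"running shoes", "water bottle", "headphones"},
    {"running shoes", "water bottle"},
    {"coffee maker", "mug"},
]

def learn_rule(item, histories):
    """Find the item most often bought alongside `item` in the data."""
    together = Counter()
    for basket in histories:
        if item in basket:
            together.update(basket - {item})
    if not together:
        return None
    partner, _ = together.most_common(1)[0]
    return partner

# The learned if/then statement: if you buy running shoes, you'll also buy...
print(learn_rule("running shoes", histories))  # -> water bottle
```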

We confirm or correct these algorithms when we do, or don’t do, the behavior they predicted. In data science, put simply, the percentage of correct predictions is called accuracy. So a programmer sets an accuracy threshold high enough to treat the prediction as true, or valid.
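
As a rough sketch, that number is nothing more than a comparison of what was predicted against what people actually did. The predictions and behaviors below are made up for illustration.

```python
# Accuracy: the share of predictions that matched what people actually did.
# Both lists below are invented for illustration.
predicted = ["will buy", "will buy", "won't buy", "will buy", "won't buy"]
actual    = ["will buy", "won't buy", "won't buy", "will buy", "will buy"]

correct = sum(p == a for p, a in zip(predicted, actual))
accuracy = correct / len(actual)
print(f"{accuracy:.0%} of predictions were correct")  # -> 60% of predictions were correct
```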

There are so many dilemmas with this. So many…I mean, a tremendous amount. But I am only going to focus on two. I am most interested in how racism shows up in these systems, the types of information bias that occur, and why.

The first is the data. I think Simone Browne does an amazing job of explaining this, but I’ll do my best here. We don’t collect data on everything, and the data we do collect is biased. Programmers like to say that data is unbiased, and they especially don’t like when people imply that they are racist. But again, we have to move beyond an understanding of racism that only includes white hoods and segregated buses. Racism is systemic, meaning we all participate in the systems, and some of us have the power to create those systems, whether we mean to or not.

In one of my courses last year, we talked about a study that aimed to look at the impact of violence on students in the Southside of Chicago. But, except for instances where the person was already dead, the only crimes that were recorded were those that the police exacerbated. They also did not record white collar crime or “violent” crime in the northern parts of Chicago. On average, Black students in Chicago travel thirty minutes more to get to school compared to their White counterparts, and Black students pass four more schools than their White counterparts on the way to their schools. We also know that Black people on the Southside are surveilled more than anyone else and we only surveil them for certain crimes. You’ve probably heard the stories of police literally turning off their body cameras during certain interactions. So the questions become: Are you looking at data for police crime? And which neighborhood? Which school? Why those crimes?

This means that when you train the computer to look for patterns, it will only find the patterns in the data that is available to it.
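
Here is a small sketch of that point with a deliberately lopsided, invented set of records: the “pattern” the computer finds is really a pattern in what got recorded.

```python
# If the only incidents that get recorded come from the neighborhoods
# that are most heavily surveilled, any "pattern" found in this table
# is a pattern in the surveillance, not in the behavior.
# The records below are invented for illustration.
records = [
    {"neighborhood": "South Side", "recorded_incident": True},
    {"neighborhood": "South Side", "recorded_incident": True},
    {"neighborhood": "South Side", "recorded_incident": True},
    # The north side had incidents too, but almost none were recorded,
    # so they barely exist in the data the computer gets to see.
    {"neighborhood": "North Side", "recorded_incident": True},
]

def incident_share(records):
    counts = {}
    for r in records:
        counts[r["neighborhood"]] = counts.get(r["neighborhood"], 0) + 1
    total = sum(counts.values())
    return {n: c / total for n, c in counts.items()}

print(incident_share(records))  # -> {'South Side': 0.75, 'North Side': 0.25}
```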

In the work that I do, this all means that we have to collect our own data. For example, there are some universities from which I have never seen a rejection letter. When I make the algorithms, I’m setting things up to look for the patterns that get me to that same 100%, not the 5% you see at most universities. It’s not an accident that we continue to be successful and maintain that 100%. We set it up that way. But I had to include data that no one else includes, such as access to a qualified Calculus teacher. When the algorithm doesn’t find what it wants on that data point, it knows to look elsewhere, and because I’ve collected the data, it knows what to do.
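
Here is a rough sketch of that “look elsewhere” idea. The Calculus data point comes from the example above; the other features, the weights, and the student record are invented for illustration.

```python
# A sketch of looking elsewhere when a data point is missing: if a
# student's record has no value for a feature (here, access to a
# qualified Calculus teacher), score on the other evidence instead of
# silently treating the gap as a zero.
# The features, weights, and student record are invented for illustration.
signals = {
    "qualified_calculus_teacher": 0.4,
    "strong_writing_sample": 0.3,
    "sustained_extracurricular": 0.3,
}

student = {
    # No Calculus data was collected for this student at all.
    "strong_writing_sample": 1,
    "sustained_extracurricular": 1,
}

def readiness_score(student, signals):
    """Score only on the signals we actually have data for,
    re-weighting so missing data doesn't drag the score down."""
    available = {k: w for k, w in signals.items() if k in student}
    if not available:
        return None  # nothing to go on
    total_weight = sum(available.values())
    return sum(student[k] * w for k, w in available.items()) / total_weight

print(readiness_score(student, signals))  # -> 1.0
```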

That brings me to the other problem, and that’s with accuracy. In one of my programming classes, a student asked what accuracy is good enough. The professor confirmed something I read in one of the textbooks: anything above 50%. Then the student asked, how else would you know if what you did was correct? The professor said to ask an expert. The student asked, how do you know who is an expert? The professor said, whoever agrees with the 50%. All of you went to school and you know that 50% isn’t good. If I ordered food and they came back with half a burger and half my fries, you wouldn’t say, “that seems right.” But somehow, that is the bar in data science.
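
To see why “anything above 50%” is such a low bar, here is a sketch of the obvious baseline for a yes/no prediction: flipping a coin, which needs no data at all. The labels are randomly generated for illustration.

```python
import random

# For a yes/no prediction, a coin flip already gets you to roughly 50%
# accuracy with no data and no learning whatsoever.
random.seed(0)

actual = [random.choice(["yes", "no"]) for _ in range(10_000)]
coin_flips = [random.choice(["yes", "no"]) for _ in actual]

correct = sum(p == a for p, a in zip(coin_flips, actual))
print(f"coin-flip accuracy: {correct / len(actual):.1%}")  # roughly 50%
```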

Now consider all the things that they are predicting. It isn’t just what you’ll put in your Amazon cart. They’re predicting for medical care, education, employment, EVERYTHING. And they know they’ll only be right a little more than 50% of the time. Now think about those complaints you hear about the high death rates for Black women during childbirth, or about why you didn’t get a call back for that job when you had fantastic qualifications. This is a combination of how the data is collected and where the accuracy bar is set. In the case of high death rates during childbirth, we don’t collect data when death rates are low. So when the algorithm is wrong and the machine wants to look somewhere else, it has nowhere to look.


Consider the language used when George Floyd was murdered, prior to the release of the cell phone tape. It was accidental, he already had a health condition, it was in the commission of a crime, and the emergency response was swift. Consider what you now know about what happened. Even with that characterization, if all police crimes against Black people are listed as “accidental,” then we also interpret that data as accidents. This is especially significant because there really isn’t anything we can do about things that are actually accidents, so we just keep repeating the same cycle.

This actually brings me to a third problem. I know I promised two, but this is like 2B. “Correct” means that it can be validated. So when we say 5% of Black students are admitted to X university, 5% is correct. We’re using historical data to make that prediction. The algorithm has nowhere else to look. When we say 70% of Black women survive childbirth, 70% is correct because it can be validated. The computer adapts, and now the if/then statement becomes: if you are Black, you have a 70% chance of surviving childbirth. We attach the outcome to the available data, in this case race, because that is how it was set up. It becomes eugenic, as though Black women are biologically capable of having successful births only 70% of the time. But we know that this was created, meaning the reason Black women die at higher rates during childbirth is that they are more likely to be without health insurance, or more likely to give birth in hospitals that will force a C-section on them, or doctors don’t believe them when they say they’re in pain, and so on. That’s not genetic, that’s systemic. But the hospital will create a care plan to meet that 70%, so the pattern continues.
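
Here is a sketch of how the outcome gets attached to whatever data is available. Every record and number below is invented for illustration; the point is that when race is the only field in the table, the if/then rule can only be written in terms of race, and when a systemic factor is recorded, the pattern attaches to it instead.

```python
from collections import defaultdict

# When race is the only field recorded, a survival "prediction" can only
# be written as: if race is X, then the chance of surviving is Y.
# When a systemic factor is recorded too, the pattern attaches to it instead.
# Every record and rate below is invented for illustration.
records = [
    {"race": "Black", "has_insurance": False, "survived": 0},
    {"race": "Black", "has_insurance": True,  "survived": 1},
    {"race": "White", "has_insurance": True,  "survived": 1},
    {"race": "White", "has_insurance": True,  "survived": 1},
]

def rate_by(records, field):
    """Group outcomes by one field and report the survival rate per group."""
    groups = defaultdict(list)
    for r in records:
        groups[r[field]].append(r["survived"])
    return {k: sum(v) / len(v) for k, v in groups.items()}

# With only race available, the outcome gets attached to race.
print(rate_by(records, "race"))           # -> {'Black': 0.5, 'White': 1.0}

# With the systemic factor recorded, the pattern attaches there instead.
print(rate_by(records, "has_insurance"))  # -> {False: 0.0, True: 1.0}
```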

The same goes for college. The admissions rates don’t have to be what they are, but we make it so by not providing access to resources, or by believing people when they say college isn’t for us, or by accepting a six-year graduation rate as the new norm. None of these things HAVE to be true. We make them true. A really great example of this is the college rankings. A college ranks high, more students apply to it, the college rejects more of them, and the admissions-rate measure holds firm. We made that happen. Meanwhile, most of the things that go into college rankings actually have very little to do with the typical student experience.
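
Here is a toy version of that rankings loop, with invented starting numbers, just to show the mechanics.

```python
# A toy version of the rankings loop: a high rank draws more applicants,
# the college admits roughly the same number of students, so the admit
# rate falls, which keeps the rank high. The starting numbers and the
# 10% growth assumption are invented for illustration.
applicants = 20_000
seats = 2_000

for year in range(1, 6):
    admit_rate = seats / applicants
    print(f"year {year}: {applicants:>6,} applicants, admit rate {admit_rate:.1%}")
    # The low admit rate helps the ranking, which draws more applicants.
    applicants = int(applicants * 1.10)
```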

Consider all the things we make true because we believe they’re already true. And click on something else.
