600k Images Removed from ImageNet After Art Project Exposes Racist Bias (hyperallergic.com)
229 points by aaronbrethorst on Sept 23, 2019 | 289 comments


Superficial judgement is kind of where intelligence starts.

It's really only people where you can't tell what it does/is from the outside. Cars, trees, animals, mountains... everything else, if it looks a way it acts that way. Early AI will probably have just as much trouble with this as people have historically.

I really wish people would start viewing Racism as a willingness to let that primitive part of the mind be in command, rather than a binary attribute you either have or don't. Like, nobody is 100% not racist. There will always be slip ups, over-simplifications, snap judgements, subconscious or not.


>It's really only people where you can't tell what it does/is from the outside.

Thing is, that's not even true in that it doesn't fully acknowledge the problem. You often CAN tell information from people's outward appearance, albeit probabilistically. Therein lies the problem: you can very easily train an algorithm to be maximally right according to your cost function, but end up biased because the underlying distributions aren't the same between groups.

The issue is that as a society we've (mostly) decided that unfairly classifying someone based on correlated but non-causal characteristics is wrong, EVEN in the extreme that you're right more often than you're wrong if you make that assumption.
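Edit: a rough sketch of that failure mode in Python (all data synthetic, all numbers made up). Even when a non-causal group attribute adds nothing beyond a shared "signal" feature, the accuracy-maximizing decision rule ends up leaning on each group's base rate, and innocent members of the higher-base-rate group get flagged more often:

    # Synthetic sketch: an accuracy-maximizing rule exploits differing base rates.
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)
    n = 100_000
    group = rng.integers(0, 2, n)                          # non-causal attribute
    prior = np.where(group == 0, 0.05, 0.20)               # differing base rates
    y = rng.random(n) < prior                              # true label
    signal = np.where(y, 2.0, 0.0) + rng.normal(0, 1, n)   # equally informative for both groups

    # Accuracy-maximizing (Bayes) rule: predict positive iff P(y=1 | signal, group) > 0.5
    lik1 = norm.pdf(signal, loc=2.0, scale=1.0)
    lik0 = norm.pdf(signal, loc=0.0, scale=1.0)
    posterior = prior * lik1 / (prior * lik1 + (1 - prior) * lik0)
    pred = posterior > 0.5

    for g in (0, 1):
        innocent = (group == g) & ~y
        print(f"group {g}: false-positive rate = {pred[innocent].mean():.3f}")
    # Roughly 0.007 for group 0 vs 0.045 for group 1: the rule is "maximally right"
    # overall, yet it is far harsher on innocent members of the higher-base-rate group.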


>You often CAN tell information from people's outward appearance, albeit probabilistically. Therein lies the problem: you can very easily train an algorithm to be maximally right according to your cost function, but end up biased because the underlying distributions aren't the same between groups.

In fact, an AI trained on aggregate data to probabilistically infer characteristics about individuals is _literally a stereotyping machine_.

If people are upset that their stereotyping machine stereotypes people, they probably didn't fully understand what they were doing when they built it, because this is not a design flaw -- it's the design.


Maybe people are upset that stereotyping machines are being given harder powers to make consequential decisions about individuals.


> Maybe people are upset that stereotyping machines are being given harder powers to make consequential decisions about individuals.

Then these people should argue against AI instead of swinging the racism cudgel.


I would respectfully disagree. The builder of the AI should be trained in recognizing that his discrimination machine can be used for good and for bad. If the creation shows racist tendencies, it's an outcome of the machine but a function of the (lack of) quality in the modeller. If the end result is racism, I would like to be able to point to the creator of the AI (a human) not a piece of software.

More concretely: government AI shouldn't use things like names, zipcode demographics (at least those strongly tied to the characteristics for which we think discrimination = racism), or pictures of humans in their models. Why? Because it's pretty much impossible to control your model for racist tendencies once you start there. It's up to the ethics of the creator of the model to point that out and simply not do it. If you do, and all people whose name starts with an M (for Mohammed) get put in a different category, racist is the right term IMHO.


> The builder of the AI should be trained in recognizing that his discrimination machine can be used for good and for bad.

Quite a tautology.


It could work where a large number of AI's are constructed. A small subset of these AI's--those that can only be used for good--are used as a training set. A number of AI's that can be used for bad are added to the training set. The AI builder is exposed to this training set for a period of time, and on each exposure he is rewarded if he correctly categorizes each AI by its ability to be used for good or bad. After the AI builder demonstrates an ability to properly differentiate the AI's that can only be used for good from those that can be used for good or for bad, he is set loose on constructing a new AI, after which he is compelled to render (and publicize) a judgement on its potential use for good or bad. Alternatively, the builder can also be tasked with choosing only-good AI's from a larger mixed set of good/bad AI's.


He's not wrong though.


It could just be a path forward. In America a lot of the time we can't get any progress unless it's a racism issue. Things like gerrymandering or marijuana legalization get their first pushes because the most blatant group that suffers are POCs. The fact that it harms everyone is a little lost on most people, but in the end the racism cudgel can be effective for positive societal change. In this case, we can use it to get rid of automatically being identified for crimes or whatever by an AI.


AI is very definitely not procedurally fair unless you can explain all behaviors. Which is why designers generally don't know what they built.


That is true but be careful with the terminology! A system trained using aggregate data to probabilistically infer characteristics isn't artificial intelligence. If the system could find causations, then there would be grounds for calling it intelligent. But finding correlations, that's just number crunching.


See also Stucchio's impossibility theorem: it is impossible for a given algorithm to be procedurally fair, representationally fair, and utilitarian at the same time:

Video: https://www.youtube.com/watch?v=Zn7oWIhFffs

Slides: https://www.chrisstucchio.com/pubs/slides/crunchconf_2018/sl...


The general point is that you have to robustly compromise and satisfice all the goals. People tend to be rather good at it when taken as a group. (Any particular person may be bad at a given subset of all problems.)

It is a kind of optimality condition on all three goals.

The robustness additionally means that should conditions change, the algorithm usually will become better, not worse, and should a degradation still happen, it will be graceful and not catastrophic.

Designing robust solutions in this space is a hard and open problem in ML, and especially with ANNs. Most systems have really bad problems with it even when debiased.


Hello - do you have a good reference to this area? I bang on about similar ideas whenever allowed, but haven't found good support in the literature.

Not that I mind that too much!


Good presentation


>The issue is that as a society we've (mostly) decided that unfairly classifying someone based on correlated but non-causal characteristics is wrong, EVEN in the extreme that you're right more often than you're wrong if you make that assumption.

This is likely due to an acknowledgment of the limits of human models to account for the full context surrounding correlated-but-non-causal classifications, such that conclusions drawn from them can have unforeseen or highly detrimental ramifications.

Speaking to race in America specifically, the schema through which we judge people are highly susceptible to bias from the white supremacist bent of historical education and general discourse. This is how you end up with cycles like those within the justice system (pushed in part by sentencing software), wherein black defendants are assumed to have a higher likelihood of re-offending, therefore increasing the likelihood of any given black defendant not receiving bond or having a lengthy sentence if convicted. After all, blackness correlates with recidivism. Lying outside this correlative relationship are the likely causal relationships of longer stays in jail and lack of access to employment opportunities, which disproportionately affect black people, causing higher rates of recidivism, regardless of race.


You have other measures than prison.

You can still have enhanced vigilance without enhanced annoyance and mistakes.

There's often a superior choice lurking that nobody is thinking about, sometimes expensive, sometimes not, seemingly unrelated to such optimization. This is why ML is not intelligence, it cannot find new solutions you're not already looking for.

The main problem of the judicial and police system is it tries to be procedurally fair and still fails at it anyway.


I'd counter that in many cases it doesn't even try to be fair. It privileges the ability to craft an argument over bare facts, which immediately privileges those who can afford professional representation. At the core of the fear of a surveillance state isn't simply the loss of privacy (which in and of itself could be worth the accuracy it would bring to judicial proceedings), but the fact that it would just bolster the ability of skilled narrative-builders to pull the most advantageous facts out of context and twist them to their whims.


> we've (mostly) decided that unfairly classifying someone based on correlated but non-causal characteristics is wrong, EVEN in the extreme that you're right more often than you're wrong if you make that assumption.

Ironically, the only places where it's legally prohibited or frowned upon to use these heuristic techniques are situations that people have arbitrarily (heuristically or conveniently) decided.

For example: It's "not fair" to hire someone because they're white (and consequently have a higher chance of being wealthy and hence a higher chance of being educated.)

But it's "fair" to choose a love partner based on their height, their waist-to-hip ratio, their weight (and hence having a higher chance of giving birth to healthy offspring, better physical protection, etc.).

Maybe it's hypocritical, and I don't know if that's a good thing or not. Maybe being hypocritical helps us survive.


The hiring vs mating issue is perhaps simpler than you picture. US federal discrimination law only applies to companies with more than 15 people. If we regularly married 15+ people at a time, we might very well put legal restrictions on your mating choices. The more personal the decision, the more agency you get.


wow, interesting. Why does it only apply to companies with more than 15 people? Is it the idea that you're more likely to have family help (and only family being willing to help) when your company (more small business than traditional startup) is this small?


If you are starting a small company you either pick people you already know (whoever those might be) or maybe a few random experts with very specific skillsets. There is no place for you to actively discriminate against someone who would maybe have been a better pick, just because you didn't like their skin color or gender... Or if you still do, it comes at your own loss.


> if you still [discriminate] it comes at your own loss.

This applies to companies of any size.


A job contract is naturally a relationship between two entities: employee and employer.

Sure, some people in some countries have tried (sometimes successfully) to undermine this principle, but it's akin to forcing people to marry in groups.


While (modern? ideal?) marriage is a peer-to-peer relationship, the relationship between employee and employer is unbalanced, more like pet-owner.

Societal restrictions on contracts are a way to balance this. Virtually all societies have some form of this, outside utopian ultra-liberal hellholes.


That's not very arbitrary. If you're hiring someone, you're always in a position of power. If you're dating someone, there's no power differential (or if there is, that's a problem all by itself).


How are you "always in a position of power" when hiring someone? That's only true when there's more supply than demand, and it's the opposite in markets where the candidates get multiple high-quality offers to choose from.


Because you are paying that person money and have the ability to fire them. In the US you're also probably providing their health care.

I get what you're saying, but no one moves jobs every week. The sunk costs of switching employment are significant for the employee, less so for the company.


They provide you valuable work in exchange for that... Internally you are probably imagining some big corp that can pick from 100s of replaceable workers.


If you provide health care, you always have power over that person. Less so if they are relatively healthy - but any condition, theirs or a family member's, means that the person has no real choice but to do enough good work to keep the job. Even if they hate the company. Even if you treat them poorly. They still must work for you. This is even more true if you have hundreds of people that will replace them and your health coverage is good enough - or at least, better than the opposition's.

To a lesser degree, the same goes with vacation time and other benefits. At least in the US, anyway. This is why having some of this stuff coded into law and decoupled from employment takes some power away from employers.


Ideally there are multiple companies which you could choose from... And some might really need your specific skillset.


In practice, that's how it works, particularly at the lower end of the economic spectrum. If that wasn't the case then the concept of a minimum wage wouldn't be necessary - the market would take care of it.

Of course, this is a point of view and not everyone agrees with me, but to me it appears that for a chunk of the population the available jobs do not pay well enough to meet a certain standard of living.


> it appears that for a chunk of the population the available jobs do not pay well enough to meet a certain standard of living.

That is definitely true, but it's also pretty much the exact opposite of "always".


> If you're hiring someone, you're always in a position of power.

What? Can you explain the reasoning behind this statement?


If the candidate had more power, she would set up interviews and force the employers to impress her.

Of course, that scenario happens only in extreme edge cases. Even in a booming economy with a shortage of workers, John Doe is not going to be pursued aggressively to fill the Senior Marketing Manager role.

This makes intuitive sense: employers have a ton of money, so people come to them.

Right now I have a client who needs to hire truck drivers and can't do it fast enough. I asked him what he'd done to make his company the most attractive (pay, technology, perks, etc.). He said he's done nothing.


> John Doe is not going to be pursued aggressively to fill the Senior Marketing Manager role.

That's exactly what recruiters and headhunters do. Aggressively nails it.


Recruiters and headhunters don't necessarily seek people to fill a role. They seek people to fill their batch of application forms to send to those actually doing the hiring.


Recruiters don't replace the interview process, where the dynamic is that the applicant is the interviewee. They only change the way the applicant discovers the job.

Most people will never be headhunted.


If candidates had more power, companies would create a whole section of the company to try to find and hire talent. Companies would literally pay to find candidates.


By that same logic, it's unfair for attractive people to choose who they date, because they're in a position of power.


There are two possible situations:

A candidate has multiple job offers, and decides which one to take.

The company interviews multiple people for a single position, rejecting the others.

There are only a small number of sectors where the first situation is reasonably possible; a lot of us on here are incredibly fortunate that engineering happens to be one of them. For the majority of the job market (by volume of people rather than volume of money), it takes people attempt after attempt to get a job. They don't get to choose between multiple offers, they have to take the first thing that will allow them to pay the rent, and then they have to hold on to it.


And yet from the point of view of the one who everyone is discriminating against, the feeling is pretty similar: Everyone rejects me and there's nothing I can do about it.


Turning it around though. It's entirely fair to hire someone because they are well educated, which presumably means you're disproportionately hiring white people.

Europe has the concept of indirect discrimination, which could make that illegal, certainly things less central to the role could amount to indirect discrimination.


This is overstated:

"are situations that people have arbitrarily (heuristically or conveniently) decided"

The Holocaust was not convenient, even to those who were for it. Slavery was convenient for those who benefited from it, but not those who suffered under it. Over the last 200 years racial categories have led to many millions of deaths. This is not merely a matter of convenience.


The holocaust and slavery fall in the realm of physical violence, or at least coercion.

Violence and coercion don't logically follow from racial differentiation. One may point out the differences between populations of different races, but that wouldn't justify attacking any individual from those populations.

My comment is framed inside that basic (and obvious) principle. Choosing a partner or an employee is not a violent or coercive act.


> The issue is that as a society we've (mostly) decided that unfairly classifying someone based on correlated but non-causal characteristics is wrong, EVEN in the extreme that you're right more often than you're wrong if you make that assumption.

That sounds like the wrong basis for calling it extreme. It's not at all extreme to say that classifying an interview candidate based on correlated but non-causal characteristics is wrong, regardless of the statistical significance of those correlations.


I just mean that I'd guess most stereotypes have a correlation nowhere near 0.50, but it'd be wrong to use them to screen candidates even if it was 0.50 or greater. I used the word "extreme" to convey the sense that such a scenario of stereotype accuracy is very unlikely.


Most stereotypes are lagging indicators that do not tell us about the future, as the history of Fenty Beauty makes clear.


I don't believe Fenty Beauty has been discussed on Hacker News before seeing as how it's a line of cosmetics, so maybe you could expand on what you know about its history that others may not have been following in as much detail?


In 2017, Fenty Beauty found an underserved market (dark-skinned women and high-quality makeup) and made a killing selling them what they wanted. If you relied solely on history and stereotypes, you would believe nobody could make $500 million that way.

https://www.fool.com/investing/2019/05/15/rihannas-fenty-is-...


Thank you very much!


> EVEN in the extreme that you're right more often than you're wrong if you make that assumption.

And this is actually the rational thing to do. Reason: There are two potential errors involved here: labeling an innocent person a criminal (false positive) and labeling a criminal as innocent (false negative). The key is to realize that the cost of these two errors is not the same. For instance, treating an innocent person as a criminal could be much more expensive for the person and society than not detecting a criminal with a given classifier. For that very reason, we have the presumption of innocence as a principle in law. As a consequence, you don't want to select a classifier based on just the rate of errors overall, but you want to incorporate some kind of loss function that minimizes the cost for individuals and society. Under that loss function, the best classifier may actually be wrong more often than some other classifier.
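Edit: a small sketch of that point (hypothetical costs, synthetic scores). With an asymmetric loss, the threshold that minimizes expected cost is not the one that minimizes the raw error rate; it shifts toward fewer false positives:

    # Sketch: choosing a decision threshold under an asymmetric loss.
    # All costs and score distributions here are made up for illustration.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 50_000
    y = rng.random(n) < 0.30                              # 30% true positives
    score = np.where(y, 2.0, 0.0) + rng.normal(0, 1, n)   # noisy classifier score

    COST_FP = 5.0   # cost of labeling an innocent person a criminal
    COST_FN = 1.0   # cost of failing to flag an actual criminal

    def error_rate(t):
        return np.mean((score > t) != y)

    def expected_cost(t):
        pred = score > t
        return (COST_FP * np.sum(pred & ~y) + COST_FN * np.sum(~pred & y)) / n

    ts = np.linspace(-1.0, 5.0, 121)
    t_err = ts[np.argmin([error_rate(t) for t in ts])]
    t_cost = ts[np.argmin([expected_cost(t) for t in ts])]
    print(f"error-minimizing threshold: {t_err:.2f}")     # ~1.4
    print(f"cost-minimizing threshold:  {t_cost:.2f}")    # ~2.2, i.e. fewer false positives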


I'd just flag that "EVEN in the extreme that you're right more often than you're wrong" is not a meaningful line (if someone wanted to apply it) because the fact is that it's not the number of wrong calls or right calls that matter here - it's which ones and when. For example the project under discussion classifies a photo of my mother as a young woman as a "flibbertigibbet", which is amusing, as a way to tease her now (she is a 70 year old ex prison and probation officer and needs no protection from teasing).

However, had this been done to my daughter and the same result obtained (in fact the result was "blue" but there you go) by someone assessing her potential as a candidate for university - well there's damage.


The fact is that we live in a society


In many cases, being logically correct is more important than being politically correct.

For example: It may be politically correct to drive down a bad neighborhood in the middle of the night, but it isn't logically correct.


True, but poorly stated. Which is to say, you aren't wrong, but you're missing the indications of nuance which are really, really important to the personal liberty of not being defined by superficial traits.


It is logically correct to do that when it's your neighborhood and you're driving home.

But note the implicit bias in your own comment: you assume yourself and all the readers of the comment are not people who live in bad neighborhoods.


>But note the implicit bias in your own comment: you assume yourself and all the readers of the comment are not people who live in bad neighborhoods.

that assumption results simply from the need for 'bad neighborhood' to be a negative scoring action.

if you oversimplify it to get rid of 'implicit bias' (which I don't agree exists in the example), the results turn into near-meaningless babble.

"For example: It may be politically correct to do something that ignores statistical dangers in favor of the promise of human goodness, which may result in the possibility for more personal endangerment than other choices, but it isn't logically sound to ignore such statistics for the hope of a less biased personal experience."

The example requires the person driving to be detached from the bad neighborhood that they have a choice to drive through. How that isn't an obvious requirement for the example to have merit is beyond me.


That is not technically even true. If you live in a bad neighborhood it's still probabilistically better to drive home thru a good neighborhood than thru another bad neighborhood.


While there may be some bias, it is as much yours. Even people who live in bad neighborhoods avoid driving through them when possible.


Did you ever live in a really bad neighborhood? I did. Out of two bus stops I always used the one that required crossing two streets with no crosswalks. The other required walking right through the middle of my lovely vicinity, with a 50% probability of giving something up to ‘charity’.


Well, it's especially in those situations that you need institutional controls to forbid the logically correct choice.

After all, you don't really need a law telling companies they aren't allowed to hire infants as senior officers - it's already not in the company's interest to do so.

However, when there is a logically correct but politically incorrect decision that the company could make, it is now that you need laws to prevent the company from taking that choice.

Of course, as the weight of an institution's decisions goes down, so does the need to police its actions. In particular, it is rarely necessary to prevent an individual person from acting on their biases.

Applying this to your example, if we had an AI that should suggest your best route home, and it avoided a short route through a bad neighborhood, that is likely ok. However, if a municipality used the same AI to decide where to prioritize changing street lights, that should be prohibited.


> The issue is that as a society we've (mostly) decided that unfairly classifying someone based on correlated but non-causal characteristics is wrong, EVEN in the extreme that you're right more often than you're wrong if you make that assumption.

Sorry, but it's not a decision. Science has found repeatedly that using outward characteristics does not work as a good classification measure. Society simply enforces not making bad judgements. See:

Phrenology, Racism -- "appearance implies certain things about someone's character and intelligence" (https://en.wikipedia.org/wiki/Phrenology)

Sex -- "A person's gender presentation determines their gender or sex" (https://en.wikipedia.org/wiki/Transgender_history), "A person's genitalia determines their chromosomes", "A person's chromosomes determine their sex and their ability to give birth" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2190741/).

Wealth -- "That person is wearing bad clothing/living an unassuming life therefore they must be poor" (https://www.npr.org/2018/12/29/680883772/social-worker-led-f...)

Intelligence -- "This person scores badly on intelligence tests therefore they must not be worthwhile" (https://en.wikipedia.org/wiki/Richard_Feynman, https://www.psychologytoday.com/gb/blog/sudden-genius/201101...), "This person appears to be unskilled or incapable of many tasks therefore they must be totally unskilled" (http://en.wikipedia.org/wiki/Savant_syndrome)


> Science has found repeatedly that using outward characteristics does not work as a good classification measure.

Science has found that, on average, it works [1]. Of course there will be many cases where it fails - that's how statistical inference works. Whether that makes it a 'good' measure, by whatever standard, is a different question. But there is no doubt information can be inferred from appearance.

[1] http://www.spsp.org/news-center/blog/stereotype-accuracy-res...


>Sorry, but it's not a decision. Science has found repeatedly that using outward characteristics does not work as a good classification measure.

Lol, what? Generically, that statement is almost certainly false more often than true in general in science. But I believe you are restricting yourself to social psychology?

You then presented a long list of examples taken from the tails of certain distributions to refute an argument that said distribution exists and has an average? I didn't even name any particular distributions. Your thinking appears flawed and emotionally driven, and most unfortunately, that's the type of thinking that will lead you to building biased systems.

Here's the point you missed the first time around: There are going to be outwardly visible characteristics that ARE correlated with some factor of interest, to the extent that training a machine learning algorithm based on a cost function that uses predictive accuracy alone WILL result in a system that assigns what society would consider an inappropriate importance placed on non-causal but correlated parameters.

Here's a real world example that might help you understand why this is important: (Data taken from: https://en.wikipedia.org/wiki/Incarceration_in_the_United_St...) Because blacks are over-represented in the US criminal justice system (40% of the prison population vs 13% of the population) and because part of what defines "black" is the outward appearance of certain facial features, a facial-recognition algorithm which is trained to recognize criminals, with a cost function based on prediction accuracy alone, and facial features as input parameters is going to have false positives that over-represent blacks. Does that sound like something you want? Because denying the underlying distributions is going to lead to exactly that.

It's very important to consider this when you develop a training set, for fuck's sake. It might work something like this: Take 100 innocent people's faces at random. (On average it will have only 13 blacks.) Then take 100 random criminal faces from inmates. (On average it will have 40 blacks.)

Then mix up the groups into your training set and assign a prediction score 1 or 0 depending on whether or not your classifier has correctly predicted whether or not a face was in the criminal group. Then, based on no other feature than race, your neural net can get better performance based solely on guessing more often that black people are criminals. That's not a good thing.

Do you get it now? The likelihood of being falsely identified as a criminal is greater based on the non-causal but correlated variable of being black. And this has happened several times already! You can't keep pushing this narrative that neglects the underlying statistics because of your beliefs, or people will keep making racist systems.
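Edit: to make the arithmetic concrete, here is a sketch using only the percentages quoted above (13% black among the 100 innocent faces, 40% among the 100 criminal faces, and no feature other than race):

    # The crudest version of the rule a classifier can learn from that training
    # set: "predict criminal iff the face is black." It beats coin-flipping on
    # accuracy, and every one of its false positives is black.
    innocent_black, criminal_black = 13, 40     # expected counts per 100 faces

    true_pos  = criminal_black                  # 40 criminals correctly flagged
    false_pos = innocent_black                  # 13 innocents wrongly flagged
    true_neg  = 100 - innocent_black            # 87 innocents correctly cleared
    false_neg = 100 - criminal_black            # 60 criminals missed

    accuracy  = (true_pos + true_neg) / 200     # 0.635, better than the 0.5 baseline
    fpr_black = false_pos / innocent_black      # 1.0: every innocent black face flagged
    fpr_white = 0 / (100 - innocent_black)      # 0.0: no innocent white face flagged
    print(accuracy, fpr_black, fpr_white)

A real network would of course blend race with other features, but the gradient pointing in that direction is already sitting in the data.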


It seems that some tasks are just inherently racist (and sexist, etc.), and we should be able to identify them before someone inflicts it on society.

Trying to identify potential for criminality based on looks may well work, by identifying and using the underlying racial biases. And if those systems are used, where people identified by these systems are more likely to be identified as criminals and investigated, we end up with a feedback loop and, over time, the racial biases in a society that uses the systems will get worse. More black criminals will be caught, as they are more likely to be suspected as criminals, making the racial biases worse for blacks, while the opposite happens for white faces.

Similar social evolution would happen if you pre-screened job candidates, over time magnifying existing gender and racial biases. I've seen in some cultures it is required to add photographs to job applications, but I think it a good thing that this practice is discouraged in western countries.


So, I disagree with the "inherently racist" portion of your argument. You can evaluate the classifiers in a race independent manner, i.e. take a look at the metrics stratified across race. I'm going to proceed here assuming that the classification task is possible.

Say, in your example of criminality, if black people were predicted as criminal more often by the classifier, it isn't racist if it accurately reflects the base distribution. There's a line to be crossed here, and to me it's crossed when the application starts to significantly (where the boundary lies here is up for debate) and directly affect the non-criminal portion. I'd say that if the misclassification error is similar across racial lines, then there is no issue.

Additionally, I don't quite agree with the "making racial biases worse" argument either. The way I see it, we already use racial heuristics in law enforcement. With automated, replicable tasks, we can at least quantify the degree of bias and correct for such.
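Edit: a sketch of what that quantification could look like (made-up arrays; the group column is used only for auditing, never as a model input):

    # Audit a trained classifier by stratifying its error rates across groups.
    import numpy as np

    def stratified_rates(y_true, y_pred, group):
        """Per-group false-positive and false-negative rates."""
        out = {}
        for g in np.unique(group):
            m = group == g
            neg, pos = m & (y_true == 0), m & (y_true == 1)
            fpr = y_pred[neg].mean() if neg.any() else float("nan")
            fnr = (1 - y_pred[pos]).mean() if pos.any() else float("nan")
            out[g] = {"fpr": fpr, "fnr": fnr, "n": int(m.sum())}
        return out

    # Example with made-up data: an 85%-accurate predictor.
    rng = np.random.default_rng(2)
    y_true = rng.integers(0, 2, 1000)
    group = rng.integers(0, 2, 1000)
    y_pred = np.where(rng.random(1000) < 0.85, y_true, 1 - y_true)

    for g, r in stratified_rates(y_true, y_pred, group).items():
        print(f"group {g}: FPR={r['fpr']:.2f}  FNR={r['fnr']:.2f}  n={r['n']}")

If the per-group FPR/FNR diverge substantially, the "similar misclassification error across racial lines" condition fails, and you can see by how much.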


The main question remains: what price do you want to pay for procedural fairness, why is it even a major goal?

Most people would probably opt for utility mixed with representational fairness of some degree even when it means law applies to some degree differently to groups and special cases.

Justice is not fairness. It has an institution called motive. (Which is often overly simplified or ignored.)

It incorporates elements of fairness. And it is hard to train. Not everyone can be Solomon, for example.


This is truly where the danger lies. You can say that you're trying to make rulesets that are accurate to the world we live in, without acknowledging that you fall heir to, and may in fact be producing this generation's version of, historical policy decisions that contributed to or outright created racial division and disparities in the first place. Both then and now there exists an appeal to empiricism that takes the current state as the natural order, and not one manufactured by past decisions which aimed specifically at a particular, and not altogether organic, conclusion.


I haven't seen a single use case, or a proposition, to use AI classification on such a raw task as classifying whether someone is a criminal or not just by their photo. Any sane person would see that such an endeavor would by itself present so many problems in a society, long before we even get to the racist bias of a statistical distribution. So what is the danger really?


Yeah I didn't pull that out of a hat. The specifics were arbitrary, but I chose that as a real example of systems that people already tried building. THAT'S why understanding Bayesian statistics and underlying distributions is super important to consider when you might inadvertently create biased systems.


Proposition? Does this count as a proposition?:

https://www.newscientist.com/article/2114900-concerns-as-fac...


Ok yes well that one is absolutely crazy and scary. Regardless of whether there are race biases in the database.


That is less relevant than the goal, which is to have a better society while not violating human rights, for a very wide definition of human.

In short, being humane. If a certain degree of racism or stereotyping is necessary for that goal (definitely not 100%, but something low), then so be it.

Currently multiple social systems in the USA are considered much too racist.


You have such poor use of sources to underline your claims it is almost incredible.

What underlines it is seemingly a lack of understanding of statistics, because most of the "examples", if they can even be called that, are from the realm of uncommon outlier cases. That's not what statistics mostly operates on - it operates on common cases that cover the largest part of a distribution, not outlier cases on the ends of a distribution.

In every case you gave an example of some uncommon exception, as if that would somehow be an argument for the rule existing?


You're saying correlation is not causation, a very well known truism. Surely you can't be seriously saying that society or anything else effectively prevents people from equating the two?


> I really wish people would start viewing Racism as a willingness to let that primitive part of the mind be in command, rather than a binary attribute you either have or don't.

This. Some childhood development studies have borne out that children treat their own 'ingroup' preferentially over those of an 'outgroup' [1]. This is before they really are "impressionable" as I understand it, so it shows something of an inbuilt mechanism. Though they made this distinction:

“Racism connotes hostility and that’s not what we studied. What the study does show is that babies use basic distinctions, including race, to start to cleave the world apart by groups of what they are and aren’t a part of.” [2]

1 - https://www.frontiersin.org/articles/10.3389/fpsyg.2018.0175...

2 - https://www.telegraph.co.uk/news/science/science-news/107705...


I am completely against this reclassification, because I think it is crude and wrong. Racism isn't equivalent to prejudice, and the distinction they made is pretty reasonable. Racists think of themselves as superior due to their race. I like to keep these issues separate, and the lizard brain theory is just laughable.

At least some of the people conducting the studies on children are guilty of child abuse, for they tried to correct the behavior of infants with their infantile theories.


"preferentially treat their own 'ingroup' preferentially"

It might make interesting reading to learn about people who don't have an ingroup or who don't prefer their ingroup.

Or then again, it might just be depressing.


From what I remember from pop-psychology books, people create their in-groups on the spot when the situation allows them to also identify an out-group.

This was the classical experiment I remember: https://en.wikipedia.org/wiki/Realistic_conflict_theory#Robb....


People who don’t have an ingroup are called psychopaths. I know at least one person who (claims to have) reasoned themselves into something resembling utilitarianism from that starting point but that’s a minority end point to not having a capacity for group loyalty.


White liberals have strongly negative in-group bias: https://www.tabletmag.com/wp-content/uploads/2019/06/AA2.jpg


I can't gather any information from that chart alone. It lacks all context, which is important.


What we're rapidly heading towards is someone being smart enough to implement an AI that detects if something is racist, and then everyone else who has an AI that is trained on a random-sampling dataset from reality, which we all know will have Uncomfortable Conclusions (tm), will have to filter their AI through the censorship to be sure that it doesn't accidentally notice any crimethink.

https://en.wikipedia.org/wiki/Thoughtcrime#Crimestop

We basically need this in software so that AI researchers can stop getting lambasted with claims of racism


That's needlessly confrontational.


[flagged]


So you didn't learn the lesson of the 20th century dystopian future predictions, and you'd rather implement your own and learn the hard way?


well, they're not going to think for the minorities and SOMEONE is going to have to be the bigger person here.

Empathy is an important part of resolution, as much as that sucks.


Unconscious Bias training tries to teach that, but I've always found the framing and delivery to be counter-productive.

They give everyone tests which reveal that primitive part of your brain. Essentially they want to shock you by forcing you to fail. It is almost like they are trying to shame people, which is a terrible way to teach.


Well yeah, step one is showing that there is a problem. Bias training has to do something for the not insignificant number of people that don’t believe that they have a noticeable bias. If you actually want the training to be effective you have to destroy the “but I can’t be racist because...”

There’s nothing to be ashamed of when it comes to having bias: women think other women are less competent than their male coworkers, trans women struggle with thinking of other trans women as men, black men find other black men threatening. Your first thought is the one you’ve been conditioned to think and is broadly speaking shared by everyone. It’s when you don’t stop to have the second thought — your own thought that’s the problem.

“No, that’s silly. I’ve seen her work — she absolutely knows what she’s talking about.”

“She’s a woman. Full stop.”

“That guy is just minding his own business and given zero actual signs of being a threat.”


How about the fact that things like IAT might be junk science: https://digest.bps.org.uk/2018/12/05/psychologys-favourite-t...


What I'm talking about is the difference between telling people they're inherently flawed ("you have unconcious biases you'll never defeat") instead of that they have a choice to make (”racism as a willingness to let that primitive part of the mind be in command").


What would you say is a pedagogically better way to help people consciously compensate for that part of their brain, and especially what's the best way to make someone aware of it without shaming them?


There are people who are aware of this and are training AI that are allowed to become racist if that’s the direction they take.

You won’t hear much about it though because: 1) it’s a sensitive topic that is easily misunderstood by those who aren’t researchers, and 2) when everyone else is trying to scrub racism out of their models for politically correct reasons, having a “racist” AI can actually produce a competitive advantage in some industries, and it can be a difficult advantage for competitors to match if they don’t allow for natural racism to emerge.

In short, it’s not a big deal, humans have some amount of racism whether they admit it or not. What matters is that you treat people equally and without prejudice, regardless what you may think of their race. Judgements about a group and judgements about an individual are two different things.

Racist results don’t even need to be about negative things, it could be as simple as “a person of this race prefers this kind of food over that one”


That is most definitely not racist though. Once we go there, we're basically trying to force ourselves to take a worldview that does not accurately represent reality.

If my classifier predicts that an Asian guy would probably eat rice, if it's accurate, is it racist?

The issue I'd have here isn't racism, it's when race dominates other features resulting in, in this case, poor recommendation quality/variety for Asians who prefer non-rice dishes or vice versa.


> Cars, trees, animals, mountains... everything else, if it looks a way it acts that way.

Well there’s all kinds of mimicry in nature to exploit exactly this assumption.


Not to mention that there are different kinds of racism we lump together under one label. Someone can think a certain race is, say, bad at math and still be racist even without disliking that race. Someone can also hate a race without believing they have some inherent inferiority, and that's racist. Similarly, you could simply decide to support your own race at the expense of others out of a sense of loyalty without even disliking other races and that's racism too.


It is incorrect that inanimate objects are handled in a bias-free way. For instance, yellow bananas are (by human labelers, and consequently by AIs) labeled 'bananas', while green bananas are labeled 'green bananas' (similar to how a male doctor may be labeled 'doctor' while a female doctor may be labeled 'female doctor'). Even aside from labeling, the choice of data itself may be prone to bias, for example most pictures in an image set may be taken from a camera at human height pointed horizontally, etc. The biases that are racist are simply a subset of the manifold human biases that pervade data and consequently pervade AIs. There is no unbiased algorithm, all are hopelessly contaminated by humans.


>>(similar to how a male doctor may be labeled 'doctor' while a female doctor may be labeled 'female doctor')

Isn't that just a bug in the English language though? English very rarely employs variations of the same word based on gender - but that's not true of many other languages. If the labeling was done in, say, Polish or German, suddenly it wouldn't be biased at all: a male doctor would be labeled Lekarz/Arzt while a female one would be labeled Lekarka/Ärztin - it's just what it is, no bias here.


As someone who has lived as quite an extreme minority (only 4 people out of 1500 at my high school looked even remotely like me), it's hard to imagine a world where these small slip ups don't happen at all. I wouldn't want to live in that world. I'm guilty as well, to some extent.

Our minds are literally categorical machines--in order to fight entropy we find stable states through classification of the world into categories. So literally any form of thought is a form of discrimination between many categories. Every word can be thought of as a category.


Sure but people are shipping this stuff in production without questioning how good it is or even what kind of problems it might have.


>There will always be slip ups, over-simplifications, snap judgements, subconscious or not.

Absolutely. The mark of intelligence is to recognize them and correct for it.

And the people who just permanently slip up, and never correct themselves, and somehow always make the same snap judgments? 100% racist.


I don't think anyone has achieved 100% un-racism, or 100% racism.


I think it’s unfortunate that the word “racism” is used both for evil things like apartheid or Jim Crow, and for unconscious bias. Especially when the bias isn’t even to view certain races negatively, just differently.

I mean it doesn’t seem like people tended to label some races with negative labels, just that they labeled their race at all, and didn’t do that for white people. At least in the example.

I don’t think it is useful to lump these two concepts together.


A car could look fast, but be slow, or have latent problems, software on the car's computer could be fooling emissions tests... Looking at a tree tells you nothing about photosynthesis, tree pheromones, deep relationships with symbiotic insects or bacteria...

You can't understand very much at all just based on how things look. That holds for humans and inhumans alike.


This is a perfect example of Tactical Nihilism. You're basically denying the existence of a massive amount of reproducible psychometrics and anthropological research because it makes you uncomfortable, and you're declaring that heuristic analysis of available, salient facts is not worthwhile because "you can't understand very much at all based on how things look."

When an arborist "looks" at a tree, they identify the kind of tree it is, and that connects to all of the research they know about that species of tree. It's reasonable to suppose that a new instance of a known kind of tree is going to have properties in common with all the other known instances.


I think you're reading into my comment things that aren't there. Pointing out that even apparently simple things contain hidden depths is not an example of nihilism, tactical or otherwise. Likewise, it's not a denial of research - it's an acknowledgement of the research that uncovered those hidden depths.

Your point about how a specialist can connect their knowledge to what they see is true, but not relevant to the point at hand. If I was your therapist, and had studied and understood you thoroughly, I might be able to assess your mental state at a glance in most scenarios. That doesn't deny you have a detailed inner life anymore than that an arborist might generally understand a tree does not deny the complex life of the tree.


This is a perfect example of an incredibly hostile overreaction backed up by a Proper Noun.


The claim wasn't that people are harder to understand than trees based on appearance - it was that people are harder to understand than anything based on appearance. It seems like a poorly articulated reference to people having an interior mind - but so do animals. Can you identify a friendly or sick or lazy dog just by looking? No.


> Can you identify a friendly or sick or lazy dog just by looking? No.

Yes you can do that. To put it extremely simply: define what "friendly", "sick", and "lazy" means in terms of behavior, then observe the dog's behaviors.


A sleeping dog still possesses those characteristics, but has no behaviors.


A dog sleeping amidst lots of stimuli known to excite dogs is exhibiting a lazy, sick or perhaps even unfriendly behavior.

Edit: sleeping is a behavior.


You can say exactly the same thing about a person, which seems to support my point.


I must have misunderstood, upon second read you seem to say something similar to, "nothing is as it appears" no?


Mmm, I think I was going for "lots of things cannot be understood completely by appearance, it is not unique to humans"


> Can you identify a friendly or sick or lazy dog just by looking? No.

?? Obviously you can. Why do you think you can't?


they're talking about the times when you can't.

you can't identify every friendly dog just by looking at it, you'll identify dogs that are also acting friendly and doing things people have found preceded positive experiences.

you can't identify every sick dog just by looking at it, otherwise the vet would never find additional issues.

and so on


This is a case of "When people thought the Earth was flat, they were wrong. When people thought the Earth was spherical, they were wrong. But if you think that thinking the Earth is spherical is just as wrong as thinking the Earth is flat, then your view is wronger than both of them put together. "

Yes, it's wrong to assume that you can tell everything about that dog by looking at it - but it's so much more incomparably wrong to consider that you can tell nothing about friendly or sick dogs based on looking at them!

Observing the behaviors and dog breeds that correlate with previous friendly and unfriendly experiences is a very useful, somewhat reliable predictor of how likely this particular dog is to be (un)friendly. Sure, if you know this particular dog, then that should supersede any group information, but if not, then that's all the information you have; it is useful information that correlates with (future) reality that matters to you, and so it's prudent to use it. Appearing "nonjudgmental" by throwing away information and judgment is simply stupid, and bound to get you bitten, at least in the metaphorical way.


If we are talking about observing behaviors, then the original assertion that you can't tell anything about humans by looking is so silly that it wouldn't have been made. We are not talking about observing behaviors, but appearance.


You can tell a lot about humans by their appearance.


Yes, I agree. I originally commented to disagree with this claim someone made above > It's really only people where you can't tell what it does/is from the outside. Cars, trees, animals, mountains... everything else, if it looks a way it acts that way.


> Yes, it's wrong to assume that you can tell everything about that dog by looking at it - but it's so much more uncomparably wrong to consider that you can tell nothing about friendly or sick dogs based on looking at them!

but nobody in this entire thread was saying that, and my contribution was an additional attempt to further point out why the other sibling responses were missing this

At this point I'm totally content in talking past each other on this topic, as I honestly don't know why it's not clear that the subset which is undetectable by mere observation is statistically significant.


Empiricism is basically the art of inference based on outward appearance in different contexts.

I’d argue we learn everything based on appearance. You have to do more than just observe things superficially and take context into account, but for something to be measurable it has to have some sort of outward appearance, whether that be direct, as in something you can see with your naked eye, or indirect, as in something we need to measure through some other instrument.


You can, probabilistically. If you randomly sample cars from all the street-legal models in the US (for example), and sort them by how fast they are based on appearance only, you would not be 100% correct, but you'd probably do much better than if you claim that "You can't understand very much at all just based on how things look" and sort them randomly (the prior being that each car is equally likely to be anywhere in the distribution).



Ultimately we know nothing, so one can always nit-pick any statement into an oversimplified oblivion.

There are also cakes that look like cars, model sets which have miniature mountains, on and on. But in typical nature, it works well enough to breed and see what comes next.


Tactical Nihilism sounds a lot like solipsism. "We can't ever not make generalizations, so we shouldn't even try to make statements."


[flagged]


I have a few thoughts about that.

1. If you look at the consensus in the field of psychology, human beings are probably not that good at making the kind of assessment you're talking about with any degree of accuracy. For instance, once you form a belief that there is a difference in competence between people of different races, you will become heavily biased to attend to examples which confirm your belief, and you will be biased to disregard examples which contradict your belief. Humans are just not perfect bayesian reasoning machines.

2. In the social contract of western, liberal society, we have broadly agreed that people should not be judged by their immutable characteristics. So in a sense, yes you are not allowed to "pattern match by race", according to the standing rules of society.

3. This rule helps protect us from the dangers of racial thinking. For instance, a belief about the relative merit or competence of one race can easily jump the gap from being descriptive to prescriptive when one goes around treating people by different standards based on their group identity. There is so much variance within every demographic group that even if there were some statistically significant difference between groups, you would be doing a massive injustice to so many of the individuals in that group to judge them by the "representative member".


> broadly agreed

I would say ruthlessly coerced.


edit: I mean, hypothetically, if you were to think black people are less competent on average than white people, why would that be the case?


Not saying it's one way or the other, but hypothetically someone could be using some IQ statistics, even if there are issues regarding data collection:

https://en.wikipedia.org/wiki/Race_and_intelligence

Yes, obviously these statistics would not definitively tell anyone if there is any difference in actual genetic races, or if the factors are socioeconomic, but for someone looking for job candidates it would actually make no difference, since IQ is not generally easily changed or trained in a developed adult.

Of course, they could also just be using their own personal experience and claim that they are basing their beliefs on it.


Hmm. Are there truly no racist IQ studies which control for the net worth of their participants?


He's not really thinking that. He is just concerned about society being too politically correct about specific things. I.e. you could probably completely discriminate against someone by height, hair color, ... but not by race specifically, for some reason.


Good example - we do have statistical data that proves beyond doubt that men have more automotive accidents than women. Yet in the EU it's illegal for insurance companies to charge men more for car insurance than they would charge a woman for the same insurance, because apparently that's sexist. But somehow no one has any problems with charging a younger driver more than an older driver, even if they both have identical amount of experience, because data supports that younger drivers have more accidents regardless of their experience behind the wheel - and yet that's not ageist and isn't banned.


I doubt that there are so many "older drivers" with comparable levels of experience to "younger drivers" that such an evaluation of risk could be beyond doubt. Are you sure such data exists?


Absolutely. A 25 year old driver who passed their test a day before will pay a much, much smaller premium than a 20 year old driver who also just passed their test (even if the 20 year old driver already has 2 years of experience and no accidents, their insurance will still be more expensive). I obviously don't have access to the actuarial data used by the insurance companies, but it's very easy to check this with many online insurance comparators.


Why are you so sure that such data exists if you have never seen it?

I'm already slightly inclined to suspend my disbelief, because I know a little bit about developmental psychology that sort of corroborates what you're claiming about the ageism thing, however, if you want to start discriminating based on race, you have to be ready to bear a massive burden of proof when you make racist claims. Would you be so blindly accepting if I threatened your life, liberty or property for no reason other than the color of your skin? Wouldn't you want conclusive evidence, quality controlled and beyond a reasonable doubt?


The data the parent is claiming to exist is discrimination based on age, which is indeed rather simple to confirm.

>however, if you want to start discriminating based on race, you have to be ready to bear a massive burden of proof when you make racist claims. Would you be so blindly accepting if I threatened your life, liberty or property for no reason other than the color of your skin? Wouldn't you want conclusive evidence, quality controlled and beyond a reasonable doubt?

Would you not say the same about age?


Yes, I would, but I think we're veering too far off the original topic of race.


>>Why are you so sure that such data exists if you have never seen it?

Because when you ring up an insurance company and ask why (as a 20-year-old) your insurance is higher than that of a 25-year-old (with also zero driving experience), you will be told that it's because of your age. Unless they are lying to you of course, which is hard to prove or disprove.

>> however, if you want to start discriminating based on race

Wait, what? That took a weird turn?


>that took a weird turn

Well, you turned first. Here we are all talking about racism, and then you claim age inversely affects driver risk. In context, I thought you were somehow trying to convince me that circular reasoning and appeals to authority ought to dispel any doubts I have about the validity of racism, but it sounds like you were trying to say something else, and I could be doing a better job of understanding what you are trying to say.

>unless they are lying

They may just be ignorant. Maybe someone else lied to them and they are just parroting the lie. Remember, the people who price insurance exist in a world where polygraphs are admissible evidence in a U.S. court of law; misconceptions are everywhere and I think we must be cautious, lest our generalizations be hasty.


That's a very good and interesting example, thank you.


In the U.S. we already tried to institutionalize racial discrimination. From reading the history, I think that "too much political correctness" is preferable to "separatist paramilitary groups and city police departments having shooting wars in my neighborhood."


Problem is the same brand of PC is being exported to places without that history, because of the power shift it causes.

Also see: invoking emergency military powers during peacetime (as reaction to overblown external threats) to suppress political dissent.


What is a specific example of a PC behavior which you don't appreciate? I want to understand why you are concerned.


I disagree, and the dichotomy is a false one, since suppressing the issues just leaves people to take justice into their own hands in the way you described.


The dichotomy is real, and you are comparing apples and oranges.

Yes, there are white separatist paramilitaries operating within the United States who have cropped up in reaction to affirmative action and social justice, but it won't blow up the same way the black panthers did. It's apples and oranges, because today's U.S. white separatist paramilitary groups are not targets of the U.S. government. The FBI isn't assassinating members of the Aryan Brotherhood nor are politicians passing gun control laws to curb the flow of weapons into their territories. The U.S. government is not actively persecuting white supremacists. There is no COINTELPRO for white supremacists.

There is no escalation here, no conflict. Nazis get a free lunch in the U.S.


>I really wish people would start viewing Racism as a willingness to let that primitive part of the mind be in command

If this were the case, babies would be racist by default, but they are not. Racism is taught.


That... what?

"And when one user uploaded an image of the Democratic presidential candidates Andrew Yang and Joe Biden, Yang who is Asian American, was tagged as “Buddhist” (he is not) while Biden was simply labeled as “grinner.”"

They are not removing images that categorize blacks as blacks. They are removing images that are incorrect.


Wait, no, that is not how racism starts.

Racism starts out of preferential treatment of some people. Most racist people have a "root event", where the other party didn't get condemned. It may even have happened several times, with various consequences (rape, molestation, repeated racketeering, etc).

Then they report the rape/molestation/racket to the police. The police don't act, because they suspect the complainant is racist. This has the opposite effect to the one desired: it doesn't condemn the criminal, and it puts the burden on the victim.

Then they seek security in their lives. If the probability of experiencing rape, molestation, racketeering or other crime is high enough that the person has been confronted with it in their own life, then the best first approximation for judging whether someone might be a criminal is whether he's free or in jail. That's in well-functioning societies. In non-functioning societies, where criminals are not in jail, the second approximation for protecting yourself is secondary indicators, inferred from grossly racist statistics. Here racism is born.

It also explains why racist people often still have friends from the group they are supposed to mistrust: they had some opportunity to assess those individuals' probity and trustworthiness. That's why mixing people by compulsory rules works. But mixing people is a poor palliative. It doesn't solve the underlying problem of a non-functioning society, so it doesn't make people less racist, despite giving the appearance that people work together.

Racism is often practiced with regret by those who exert it, but it is their second-best approximation for seeking security. Racism is the result of a non-functioning society, which is caused by being more lenient on criminals of a chosen category. I'm pretty sure it is possible to engineer racism by letting a made-up group get away with crimes.


Ehm, care to cite some sources for your sweeping and arbitrary claims, especially regarding "root events"?


I thought about it, but if I cite sources, I will be citing racist people. Not something people enjoy reading.

But you can find the same discourse from most of the world's famous racists, at least the ones currently living.

If I have to cite a source, Tommy Robinson (here come the downvotes) constantly talks about (and breaks) the omerta surrounding the trials he reported on, which is how the rape gangs could operate for so long with so little hindrance. Go through the list of people recently suppressed from Youtube and you'll find the same (unaddressed) logic.

Concerning root events, they are often private, so you have to know the person personally. But go ask people around you when they became racist; they'll often tell you about one specific event. It's interesting to go interview an enemy.


So based on anecdotes? Aren't you doing the same thing you are accusing them of?


These "root events" might just as well be a justification for pre-existing racism.


The AI isn't biased, the curators were.

The people who curated the first training set used subjective words like 'attractive' to tag the images, which means the AI tagged all images it deemed 'attractive' according to the people who made the training set. As this is a very biased and homogeneous group, the AI turned out biased. Maybe if they randomly sampled millions of people from all countries to create the training data, then they could effectively train an AI to guess what YOU might find attractive. However, even then I somehow doubt it. Beauty is in the eye of the beholder. We don't consciously know the rules of what we are attracted to, nor would an AI gain secret insight just because you supplied it with enough words and images.

If they'd stuck with simple classifications like 'black' 'white' 'man' 'woman' then they would have less subjective judgement values about the original training set.


> The AI isn't biased, the curators were.

No, it's not "The dataset is biased, the AI isn't" it's "The dataset is biased, THEREFORE the AI is biased".


Garbage in, garbage out


Data is data and cannot make value judgments, so I'm not sure how it can be racist. If the data reflects how racist people label things, it still is not racist data; it is data that is perfectly valid for what it is: how racists label things. Removing the images from ImageNet seems absurd.


In practice, "data" means a set of observations collected by humans - who have inherent biases that influence the collection.

I'm not talking specifically about cultural biases like racial stereotypes etc. Confirmation bias is a thing, there's nothing stopping a researcher from making those observations that confirm their favoured theory and contradict all others.

Then of course there is sampling error. Just because you have a set of data that you collected "at random" doesn't mean that this dataset is representative of the population you are interested in. Let alone the fact that it's very hard to collect a truly random set of observations about processes that we don't understand to begin with.

The kind of data you're describing is an ideal, a principle that we all aspire to. It's far from the reality in practice.


I mean, it's very common colloquially to describe non sentient things as racist because they're based on either purposely or obliviously racist ideas and stereotypes.


Do you hear yourself?! ImageNet was not based on either purposely or obliviously racist ideas and stereotypes. If it were, sure, I would have some patience for the claim that it was racist. But it was not.


Exactly! They got subjective results because they went beyond facts.

For example, it could have been interesting if they tagged each person with their actual religion or propensity to "grin"


Going beyond facts is an important component of “being human”, so in that regard it makes the AI seem more intelligent. The problem is the AI is 100% honest with what it thinks, unlike a human.


Not sure you guys understand AI and ML. Neither of these actually "think" for instance. By way of example, the only thing this AI really does, at base, is classify things into categories that the curator told the AI to classify them into via the dataset. I mean, that's pretty much it. There is no bias. There is no lack of bias. It's just blindly doing what the curator told it to do. Don't mistake that for "thinking". That's more AGI, which is not likely to happen in the lifetime of anyone reading this post.


I agree that going beyond the facts is a good thing when humans are doing critical thinking and when being careful and transparent about their doing so.

However this was creating a dataset for classification. Something that specifically should not go beyond the facts. (The basis of the model is the strength of the facts it's built upon)


In your opinion, are qualitative labels like "attractive" or group membership such as someone's skin color or ethnicity within the domain of facts or not?

I.e. is the issue in the fact that the particular annotators were subjective and annotated some particular facts wrong (and the labels for skin color could be filled from, for example, census data which is self-reported) or that these whole type of labels shouldn't be attempted to be made as they're not facts?

If the latter, what do you think about the categories like "adult" or "sports car" that are also part of ImageNet; can we draw an unambiguous factual boundary between images of adults and teenagers, or "normal" cars and sports cars?


Except only including facts can still reinforce unfair bias. For example, it's true that there are more men than women in software engineering. Whether someone is a software engineer or not is a fact. If you have a "representative" dataset with only facts, then it's possible that an AI would have a higher chance of labeling men as software engineers than women, simply because it begins to associate masculine facial features with software engineering.

In my eyes, this result would reinforce unfair bias, and a thus well-designed AI should avoid this (i.e. with all else equal, a well-designed AI should suggest the label "software engineer" at the same rate for both men and women).
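
To make "the same rate for both men and women" concrete: that criterion is usually called demographic parity, and it's easy to measure. A minimal sketch in Python (clf, X_test and gender are hypothetical stand-ins for a fitted classifier, its test features, and a held-out group attribute):

    import numpy as np

    def label_rates_by_group(clf, X_test, group):
        # Fraction of samples in each group that get the "software engineer" label.
        preds = clf.predict(X_test)  # 1 = "software engineer", 0 = not
        return {g: preds[group == g].mean() for g in np.unique(group)}

    # label_rates_by_group(clf, X_test, gender)
    # e.g. {'female': 0.12, 'male': 0.31} -> the label is suggested far more often
    # for men; demographic parity would want these two rates to be close.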


If it's true that there are more male software engineers, then why is it wrong for the AI to "learn" that?

If the AI did start classifying masculine features biased towards software engineers, then the AI has learnt the above fact, and thus can be used to make predictions.

The moral standpoint that there shouldn't be more male software engineers than female engineers is a personal and subjective ideal, and if you lament bias, then why isn't this kind of bias given the same treatment?


The moral standpoint isn't that there shouldn't be more (or less) male software engineers.

The moral standpoint is that there shouldn't be an AICandidateFilter|HumanPrejudicialInterviewer that only coincidentally appears to beat a coin flip because it has learned non-causal correlations, which it then uses to filter out qualified, stereotype-defying human candidates because they don't look stereotypical enough on the axes that the dataset - which almost inevitably has a status-quo bias - suggests are relevant.


So, it depends on what you want to do here. If the task is just "predict whether the person is a software engineer", I'd say go ahead, bias it away. Here, anything that boosts accuracy is fair game to me.

But if the task is, say, pre-screening candidates, this becomes a more ethically/morally tricky question. If and only if sex is not a predictive factor for engineer quality, you would expect to see similar classifier performance for male and female samples. Given that assumption, significant (hah) divergence from equal performance would be something to correct.

Of course there are other issues to handle, such as the unbalanced state of the dataset and so on.
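
A rough sketch of the check described above (y_true, y_pred and sex are hypothetical arrays from your own evaluation set): if sex really carries no signal about engineer quality, the per-group accuracy should come out similar, and a large gap is the divergence to correct.

    import numpy as np

    def per_group_accuracy(y_true, y_pred, group):
        # Accuracy of the pre-screening model, computed separately per group.
        out = {}
        for g in np.unique(group):
            mask = group == g
            out[g] = (y_pred[mask] == y_true[mask]).mean()
        return out

    # per_group_accuracy(y_true, y_pred, sex)
    # A gap like {'male': 0.88, 'female': 0.71} would be the kind of significant
    # divergence worth correcting before using the model for pre-screening.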


It is wrong because there is no causal relationship between the two so none can be inferred.


[flagged]


You are making a logic error. When there is no causal connection between two items, it is entirely possible that there is still a correlation that allows you to say something about populations. But you will never be able to say something about an individual. And that is where all these arguments founder: we put population information in, in order to engineer features that then allow us to make decisions about individuals. For the cases where feature engineering can dig up causal connections this works wonders; for the cases where it does not, or gives you apparent connections that are not really there, you end up with problems.


'black' and 'white' aren't simple at all.


>'black' and 'white' aren't simple at all.

well seems to be very simple - either there is a hat or there isn't

https://twitter.com/Dk3Kbball/status/1174115660219072512

That illustrates the depth of the issue - while directly racist data can possibly be removed, there are many proxy/correlated attributes (as any insurance/mortgage/etc. company knows), and finding correlations is the core nature of machine learning systems (at least ML as it is currently known to humans).


Tangentially related, this reminds me of the woman who had her picture photoshopped by people from different countries to make her look beautiful in their eyes:

https://www.buzzfeed.com/ashleyperez/global-beauty-standards


I think it's important context that "had her" means "she initiated", rather than "was imposed upon her":

"Make me beautiful," she said, hoping to bring to light how standards of beauty differ across various cultures"

She specifically asked for some kind of alteration, so leaving the picture unaltered was implicitly not an option. Furthermore, this assumes a random (but distributed) selection of Fiverr artists represents worldwide beauty standards.


You'd probably get a better result by adding in the base culture/race/whatever other cultural identifier of the tagger for these subjective tags.

The definition of beauty varies across culture, and while it varies from person to person, there are some aggregates that might be informative.


> The AI isn't biased, the curators were.

This is pretty close to "does the Chinese room know Chinese". https://en.wikipedia.org/wiki/Chinese_room Going too deep down that path is fun for philosophy, but not super useful...


> If they'd stuck with simple classifications like 'black' 'white' 'man' 'woman'

Interestingly, I was just thinking about the "black and white" issue today. A while ago I watched an interview on Youtube with a young woman who was born in Japan. Her parents were American and growing up she knew she was different, but basically it boiled down to "the English teacher's daughter", or "part of the foreigner family". But she didn't speak English very well and her parents didn't speak Japanese very well, so she identified much more strongly with her friends in the area than with her family.

When she was 12, her family moved back to the US. When she went to school the new children were encouraged to say what nationality they were. One person said they were Canadian. One said they were Mexican. When it came to her turn she said, "I'm Japanese". One child in her class corrected her: "No you aren't. You are black." Of course, this was a source of considerable confusion for her.

One of the things that's kind of weird about Japan is that some Japanese people have very dark complexions -- darker than what would be called "black" in the US. Some have very light complexions. In my opinion, considerably more "white" than I am (and most people would call me "white" I think). In fact, after I moved to Japan, I realised that I wasn't white at all. I'm pink. I mean, I'm super pink. I seriously never noticed it until I spent 5 or so years living in a place where nobody else was pink. I literally avoid wearing red now because it makes me look like a tomato.

When I was about 3, my best friend was black. My grandmother asked me, "Do you notice anything different about your friend? Their hair or something?" I didn't understand the question. All of my friends had different hair. Then she said, "Can you tell that he is black and you are white? Or do you not think about it that way?" My grandmother was just curious, but this question completely blew me away. From that time, I realised that people didn't each have their own skin color. Instead, they were categorised and my friend was different than me. I think my friend noticed that I looked at him differently (though not necessarily badly). We stopped being friends for a long time. Somewhat strangely, I just recently realised that he was one of my best friends in high school... I never actually made the connection that my friend at 3 and my friend in high school were the same person until recently. I wonder if he ever realised.

But anyway, there really isn't a classification like "black" and "white". When I have a tan, my skin is darker than my wife's. But she is very tan and so her skin looks brown. When I compare my skin to hers, my skin is still red, even though it is dark. But if I were to compare my skin to an indigenous American, I think their skin will be more red than mine when tan -- and less pink when not. And my wife, when compared to someone indigenous from Africa, has more of a curry brown than a deep chocolate brown.

As I was saying, Japanese people don't call dark Japanese people "black". They don't treat them as a different race. Neither are very pale Japanese people "white", even though they may say "your skin is very white". They aren't different. What I found interesting about the young girl who discovered that she was black was that she wasn't "black" in Japan. Or, at least, not "black" in the way we use the term in America. Japan does not have that cultural history (it's got enough of its own baggage, thank you very much ;-) ).

If a computer were to compare skin tones objectively, it would simply tell you the color. If it decided to classify in terms of "black" and "white", it would be classifying based on cultural labels, not color.


Perhaps instead of using human categories of race, we should ask the neural nets to provide those categorisations?

It would be based on a larger selection of individuals (non-local), based purely on appearance, and unbiased by human preconceptions (we are really good at processing faces/facial features).


I'm curious. What benefit would that kind of categorisation bring?


The problem seems less about racist bias specifically and more about unbelievably dumb tags generally in the first place.

"Stunner"? "Looker"? "Mantrap"? Or even trying to tag people's images with categories like Buddhist, grinner, or microeconomist?

What were they thinking?! Clearly these tags were never curated in any remotely responsible way -- for quality, for sensitivity, or just usefulness at all -- and I'm shocked they were ever intended for academic or research use. No wonder image recognition gets a bad rap, with input data like this.


There are ~32,000 tags. No surveillance system is using ImageNet tags to classify people into Buddhist or not-Buddhist. Most researchers ignore these tags and focus on the 1,000 classes, and know that 32k-class performance is not good (and these artists have no intention of making it work at all). What they are attempting with this Art Project is as much research as it is activism. Note that "mantrap" is defined in its synset as "a trap for catching trespassers", and that you are bound to find weird stuff among over 30k categories (imagine what you could say with the 32,000 most popular words in French...).

This is a photo in question: https://memepedia.ru/wp-content/uploads/2019/09/imagenet-1.p...

This is the route the network took:

person, individual, someone, somebody, mortal, soul (6978) > female, female person (150) > woman, adult female (129) > smasher, stunner, knockout, beauty, ravisher, sweetheart, peach, lulu, looker, mantrap, dish (0)

So it was (politically) correct on the first three categories, and the last one was either a crapshoot (and she could also have gotten to the subcategory of "prostitute" > "streetwalker, street girl, hooker, hustler, floozy, floozie, slattern") or she really is posing in a common "beautiful woman"-way. (The global description for this route is "A very attractive or seductive looking woman" and often triggers for females with tilted heads and lip curls).

You can turn any faces dataset into a labeled face color dataset, so if a black person being subclassified as "negro" is problematic bias or encoded racism, then all such datasets are suspect. Noisy labeled data is the norm, not some horrible exception to be avoided at all costs.
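
If you want to see for yourself where labels like "mantrap" come from, the ImageNet categories are plain WordNet synsets, and you can inspect them with NLTK. A small sketch (which senses and glosses you get depends on the WordNet release NLTK ships with):

    # Requires: pip install nltk, then nltk.download('wordnet') once.
    from nltk.corpus import wordnet as wn

    for syn in wn.synsets('mantrap', pos=wn.NOUN):
        print(syn.name(), '-', syn.definition())
        # Print the hypernym chains; the route quoted above
        # (person > female > woman > smasher/stunner/...) is the tail of one of them.
        for path in syn.hypernym_paths():
            print('   ' + ' > '.join(s.lemma_names()[0] for s in path))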


The fact that we have a machine learning algorithm that labels _humans_ as smashers, prostitutes, or convicts is a problem, irrespective of whatever technical justification can be made for it.


But who artificially created that problem? The artists. There is no marketing company that has intelligent billboards scanning the public for prostitutes. There are no researchers seriously using CV to classify convicts (at least not in the West, and not with ImageNet). That could be a malicious usage problem. This is either a non-problem or Armageddon for all ML CV datasets, because you certainly can use most CV datasets to train a crappy classifier to output offensive labels. If I train a people photo tagger using a dataset built for combating monkey poaching in Africa, then who is at fault? Certainly not the researchers who published that dataset with the idea that the data would be used with common sense and scientific rigor, not adversarially -- to make a political point attacking the very existence of that data. The "exposed" bias is trivial.

It is the technical justification that should be all that matters for a canonical academic dataset. Science does its best to be apolitical, but then politics ("red bull drinking white men train racist and sexist classifiers") is forced upon it, and we can't really have a productive conversation about bias and ethics anymore.

AI needs common sense knowledge of the world to improve. Censorship so that science does not offend our sensibilities would only make it so that Google Image Search (a machine learning algorithm) does not return any images of people when you search for "prostitute". Heck, the AI would never learn the difference between a male and a female prostitute. Destruction of accessible knowledge so we (aka: people on Twitter who think AI is the Terminator, or the director of the internet) don't get offended by some primitive ML model-as-art-project forced to make errors or awkward classifications. That's a sentence the academic ML community could do entirely without. No benchmarks or duckface selfies would be hurt. No unfortunate third-world souls hired to scan 20 million+ internet-crawled images for wrongthink, only for the machine to do the unsupervised learning in a hug box, not a black box. Oi mate, you got a loicense fer that label?

Just wait until the activists find out who wrote the first 100 8's added to MNIST. Nobody but MIT would be associated with her, if they found out what she did.


> What were they thinking?! [...]

In addition to their statement, linked in a thread below [0], what they were thinking is pretty out in the open in the abstract of the original paper:

"We introduce here a new database called “ImageNet”, a largescale ontology of images built upon the backbone of the WordNet structure. ImageNet aims to populate the majority of the 80,000 synsets of WordNet with an average of 500-1000 clean and full resolution images. " [1]

That paper is now 10 years old, and WordNet itself had its first release somewhere in the 1980s. A lot happens in research over a decade, and much of the research into dataset biases is relatively recent in that sense. It can hardly be blamed on the researchers that people use their datasets in production merely because "they're large and out there in the open". As their statement indicates, they're actively working on the issues that have arisen during the last few years.

Most research datasets are fit for a very specific purpose, to me the larger problem here is two-fold: "AI" fearmongering on one side and startups/corporations on the other side lacking the necessary skills to select and filter their training data. Hopefully that'll be a matter of the past once the hyped data science field reaches maturity.

edit: WordNet seems to include said terms to this day; [2] for example still lists "mantrap" associated with "beauty" in the synset for "a very attractive or seductive looking woman". Although I'm unsure how they should handle these cases - both are words, and that's one definition used in practice, even if we find it reprehensible. They seem to have removed the n-word, but that's about the only removed instance I can find in a cursory search of sensitive or insulting English words. Maybe they should clarify sensitive annotations like they do with vulgarity.

[0] https://news.ycombinator.com/item?id=21054770

[1] http://www.image-net.org/papers/imagenet_cvpr09.pdf

[2] http://wordnetweb.princeton.edu/perl/webwn?s=beauty


But removing the images from ImageNet, an inanimate set of images that cannot make value judgements, will surely fix this problem...

Our culture is broken and nobody is willing to stand up to insanity or even admit to it. People are too happy to do completely insane and senseless things if it saves them from the social lynchings.


Not sure about "mantrap", but I could imagine "stunner" and "looker" in the pages of a magazine - a red-carpet celeb photo captioned "Jane Doe looking stunning! And with her hubby - what a looker!".

I guess the problem with things like "microeconomist" is that other tags such as "fireman" might be valid (if a picture contained clues about vocation, for example). Furthermore, rather than condemn the label, we might consider not demonising a neural net and instead taking its results at face value; in this case, an indication that most labelled microeconomists in the dataset are suited white men.


Buddhist seems pretty easy to categorize? If you see someone standing in front of a Buddhist temple wearing a Kāṣāya (had to google the actual word: the orange robes), they're pretty likely to be a Buddhist, right?


Agreed. These are not descriptive labels you can infer from an image.

They may be, however, labels generated by users from a context.


Their statement: http://image-net.org/update-sep-17-2019

> Each synset is classified as either “unsafe” (offensive regardless of context), “sensitive” (offensive depending on context), or “safe”. “Unsafe” synsets are inherently offensive, such as those with profanity, those that correspond to racial or gender slurs, or those that correspond to negative characterizations of people (for example, racist). “Sensitive” synsets are not inherently offensive, but they may cause offense when applied inappropriately, such as the classification of people based on sexual orientation and religion.

> So far out of 2,832 synsets within the person subtree we’ve identified 438 “unsafe” synsets and 1,155 “sensitive” synsets. The remaining 1,239 synsets are temporarily deemed “safe.”

They've also completely disabled Imagenet downloads while they remedy this.


Great, I can see some of this being useless (some parts of the unsafe dataset), but if they cull the "sensitive" portion, this may induce performance regressions.

I need to find an ImageNet archive now.


It doesn't really touch the 1000 class "Imagenet" that's commonly used in computer vision.


Some of the classes are subcategories of ILSVRC classes in the WordNet hierarchy. So by removing images of persons in categories that are considered inappropriate, the resulting classifier will end up less likely to recognize those as images of people at all. I'm not sure whether that's a better outcome.


Ah, then it's fine... probably.


This should be the link to the article.


Really the bias isn't the true problem here so much as the lack of epistemology in how it is being used. By definition, associations work by blind correlations - expecting anything other than stereotyping is a foolish misuse of the tool. It will be wrong for outliers because it tries to get it right for most cases regardless of cause - like the skin cancer detectors that saw rulers as a sign of cancer because most of the reference images of cancer had a ruler in them.


Exactly. The large amount of faith currently placed in probabilistic models that do not have common sense ways of eliminating factors which are extremely unlikely to be causal (like the presence of a ruler causing cancer) disturbs me. There is something that humans do that we have not quite figured out how to teach computers yet, at least as far as I can tell, which is to get them to evaluate whether their model is not just compatible with their observations, but also with other models and prior knowledge about the objects being observed.

I think we'll get there at some point, and I'm not exposed to the most cutting edge AI research, but it seems like AI is currently very overhyped and deeply flawed for many of the applications people would like to use it for.


Because these algorithms don't know that they are classifying cancer. The label they see is just 1 or 0. For all they know, based on their inputs, you may want to classify ruler/non-ruler images.

To achieve what you want, semantic structure must be used as labels instead of just categorical labels.

Assuming we have a sane AI that now knows it's looking for cancer, knows what that means (from digesting medical textbooks, papers and generic text corpora), can detect rulers, and knows the two are not causally linked from ruler to cancer, we could make the model output "dataset diagnostics", like "Warning! The cancer label in this dataset is implausibly correlated with the visual presence of a ruler". Or "Warning: 99% of your hotdog images show a human hand. Evaluation on this dataset will ignore errors on hotdog images without hands!"

Context does matter though. If there's an orange fluff on a tree trunk, the AI is right to look at the environment and infer it's a squirrel.
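
The "dataset diagnostics" warning described above can actually be approximated today without any semantic understanding, as long as you already have (or can detect) the auxiliary attribute. A toy sketch, with hypothetical has_ruler and is_cancer boolean arrays, one entry per training image:

    import numpy as np

    def spurious_correlation_warning(attr, label, attr_name, label_name, threshold=0.5):
        attr = np.asarray(attr, dtype=float)
        label = np.asarray(label, dtype=float)
        r = np.corrcoef(attr, label)[0, 1]  # Pearson correlation of the two indicators
        if abs(r) > threshold:
            print(f"Warning: '{label_name}' is implausibly correlated with "
                  f"'{attr_name}' (r = {r:.2f}); the model may learn the artifact "
                  f"instead of the thing you care about.")
        return r

    # spurious_correlation_warning(has_ruler, is_cancer, 'ruler visible', 'cancer')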


It's because of another human trait: paranoia.

And between the Dem/Trump back-and-forth, racial paranoia is strong in America.

The hidden motive is "what if ImageNet is used to (approve loans|decide parole|pre-populate killer police drone biases)!", but the way this is done damns any emerging tech that is less than perfect.

If I heard loaded handguns were given to toddlers, I'd attack the practice of giving handguns to toddlers - not the existence of toddlers!

That said, between tech hubris, "self-regulation", and the toxic partisan culture war, I'm not sure what could change. A real AI guideline, rather than a vague pop-sci-fi document, or tech policing (dictating appropriate usage only)? Preferably incubated in a country where nuanced discussion can be had without the flamewars...


Has this been confirmed? Not doubting the story, but the article doesn't provide a source.

Edit: Might be this, which was a week ago / before the Roulette blew up: http://image-net.org/update-sep-17-2019

> We are in the process of preparing a new version of ImageNet by removing all the synsets identified as “unsafe” and “sensitive” along with their associated images. This will result in the removal of 600,040 images, leaving 577,244 images in the remaining “safe” person synsets.


If they delete half the data, the probability that the remaining data sets are rubbish is quite high. You also wouldn't want someone hand-picking the sets in the first place.


Bias in AI (whatever the source or nature of the problem) is a real issue that needs addressing, but taking the relevant training data off of ImageNet seems like a perfect example of papering over a problem to avoid really confronting it. We will need to find ways to make AI programs that can see beyond the biases (about humans or otherwise) that will always exist to some degree in real-world data.

If ImageNet contains bias that leads to embarrassing results - that's fine! That gives us a readily available toy instance of the problem to study. Taking that away could actively harm anti-bias research.


I strongly prefer they keep the bias. It shows plainly that these models have zero intelligence, they're just pattern matching over specific datasets.

I don't want the appearance of fairness (introduced by human dataset curation) to be mistaken for "intelligence".

Keeping the bias would hopefully cause people to think more critically about why such bias exists in the model in the first place.


Disagree that it proves zero intelligence. Intelligent humans also have bias.


I'm not saying that intelligence and bias are orthogonal. That discussion requires a much deeper consideration of human cognition and psychology :)

I'm just saying that model bias is a very easy thing to explain (usually, data imbalance).

You can also fiddle with the label ratios to change the bias - which is also a good way of showing that the models aren't really intelligent.


Bias: prejudice in favor of or against one thing, person, or group compared with another, usually in a way considered to be unfair.

If a data set is flawed, it should be fixed, but when ML finds objective patterns our culture finds subjectively unpalatable and we choose to "fix" them, we fall prey to and re-enact the same grade of self-delusion exhibited by, for example, "the church" in the dark ages. Computer science is already low in terms of accountability & rigor compared to other fields without these kinds of suggestions.


You use the phrase 'objective pattern' here, but that conflates causation and correlation.

For example: black men in the US are more likely than white men to have criminal records, but this in no way means that black men are "objectively" more criminal.


Let me catch up with the double-think. They commit more crime on average but they are not more criminal? What kind of olympics-grade mental gymnastics are these?


You're rewording what they said. If any group of people is more policed than another group of people, said group is more likely to have a criminal record. Doesn't mean they're more "criminal" than any other group but more of a reflection on the current state of the criminal justice system.

To paraphrase Warren Buffett: "If a cop follows you for 500 miles, they're going to find a reason to give you a ticket."


Could be true for trivial offenses like a ticket, but if we talk about real crime that's not really an excuse. I'd be more inclined to think black people commit more crime because of socioeconomic factors, as poverty correlates with crime.


...which, even if true, would circle back to 'being black' being a correlation to but not causative of criminal records.


It really depends on how you define crime. Usually crime and prosecution is defined in such a way as to impact the lower classes more than the upper classes.


Murder is rather easy to define: if there is a dead body with holes in its head, there is a murder. The tragedy is that murder is far more prevalent in the black community, partly because of underpolicing.


I wouldn't be so fast with that assertion.

If your life is a cesspit and you can count on the authorities to be part of the problem, where does murder end and self defense begin?

See the song "I shot the sheriff." I'm most familiar with the Eric Clapton version, but googling it recently suggests to me it was originally written by Bob Marley.

https://en.m.wikipedia.org/wiki/I_Shot_the_Sheriff


This is a rather naive understanding. For example, there were nearly 3000 deaths in the 9/11 attacks, and about 4800 soldiers have died in the subsequent war with Iraq.

There are plenty of dead bodies with holes in their heads, but it's not clear how one accounts for crime among those 7800 dead.


"The offending rate for African Americans was almost 8 times higher than European Americans, and the victim rate 6 times higher. Most homicides were intraracial, with 84% of European Americans victims killed by European Americans, and 93% of African Americans victims were killed by African Americans.". https://en.wikipedia.org/wiki/Race_and_crime_in_the_United_S....

How does overpolicing explain both arrest for murder rate and victim of murder rate have a significantly higher prevalence within black community?


To start with, 'more likely to have a criminal record' and 'more likely to have committed a crime' are not actually the same thing.


In Memphis, TN, black people commit more murders than about everyone else. They don't even report many of them since (a) tourism and (b) it's already most news stories. Most whites in poverty turn to drugs, do petty crimes, higher than national average suicide rate in some places, etc. Blacks, recently Latinos (esp cartels), were unique in doing all kinds of violence in their own communities plus others at random. Whites usually use violence on others externally to benefit their own group, esp via military. Most people writing comments like yours try to explain the difference with all kinds of stuff about oppression, environments, etc pinning blame in one place. Yet, the whites so oppressed they're doing drugs and/or killing themselves aren't murdering other people in their neighborhoods due to their oppressive environment nearly as much. The hard data argues a difference that's not from white people since they'd have been doing it first if it was.

I have a feeling most of you went to white schools or lived in white areas instead of black ones in bad places. Most in my area that went to the latter, including black folks, know where this comes from. I already described it here:

https://news.ycombinator.com/item?id=20660278

You folks need to stop just dismissing all blame on the black side as "cops must just always make stuff up or police them harder." There's plenty of that worth calling out. However, the biggest hole in your argument is murder in high numbers. You can watch folks that don't kill people or who get shot all day in whatever selective way you want. In the end for poor neighborhoods, the black ones you watch will have committed many more murders hitting many more black victims than the white ones.

And, if gangs are involved, the dealers and killers will often be told they're in for life. They'll be perpetuating it in a way that has nothing to do with white people. And you should be calling them out hard on that if you really care about thugs getting arrested by cops and/or protecting their victims, who are mostly black. Instead, it's whites (doing damage of some kind) to blacks non-stop in your comments or the media, versus the thug culture of blacks murdering black people all the time. And, it seems, something like that with machismo and the cartels in Latino areas. I have less experience with them, though. I didn't get to, since we had to leave the neighborhood after a small confrontation led to one of them putting a contract on our heads that was canceled (maybe) when we moved.

So, I'm calling bullshit unless you're saying there's both racist over-policing by cops (mostly but not all white) and thug culture coming from blacks creating killers that take out tons of black victims. Something similar for Latino areas. Then it fits the data.


If we say the dataset is flawed, it is important that we pin down why and how exactly it is flawed. By what methods exactly are we going to source "not-flawed" data?


> "the church" in the dark ages

This isn't a thing and you should read more about what you think happened in this period of western history.


I said nothing of Western history and am familiar with the subject. Also, this is completely unrelated to my point so you will receive no further responses on it.


It's the foundation of your point.


They could store it somewhere non-public for research purposes, but taking it off the website was the right call.


"For example, the program defined one white woman as a “stunner, looker, mantrap,” and “dish,” describing her as “a very attractive or seductive looking woman.” Many people of color have noted an obvious racist bias to their results. Jamal Jordan, a journalist at the New York Times, explained on Twitter that each of his uploaded photographs returned tags like “Black, Black African, Negroid, or Negro."

They claim this is AI, but earlier in the article, it states that mechanical turk was used. Mechanical Turk is basically just people getting paid pennies or fractions of a penny to tag these photos.

Many people on Mechanical Turk are from countries where English is not their first language and they don't have the same idea of racism as we do here in the US, which would explain the racist tags mentioned in the article.

This doesn't really show us anything about AI and racial bias; it shows more that other countries still aren't up to our standards of what we consider decent.


The AI connection is that these are the types of datasets that feed AI systems. It shows that you have to be careful when curating your dataset, so that you don't introduce further bias.

I just hope the masses don't glom onto this, start shrieking that "AI is racist", and attempt to take AI completely out of the picture.


As someone who spent years in those villages, I can tell you that is exactly how they view the world. The only posturing comparable to that of the so-called enlightened anti-racist comments that I have ever heard or read about was Marie Antoinette saying "Let them eat cake". She ended up without her head.


edit: I conflated ImageNet with the art exhibitors; it is the former who are culling the images as a result of public reaction, not the latter.

This is a really bizarre project. I had seen some really offensive race-based labels, but I thought revealing the ugliness of the system was part of the point of this project?

But besides that, the results just seemed completely scattershot; I half expect the artists/exhibitors to reveal that x% of the results were randomized. Last week, I tried it myself after seeing another Asian user display results that were entirely Asian slurs (e.g. gook, slant-eyed). I uploaded my own very Asian-looking photo and got "prophetess", along with very vague labels, such as "person" and "individual".

Maybe the exhibitors cleaned the data/results by the time I tried it, but I used it just a few hours after seeing the other Asian user's results, so I'm doubtful that her tweet/complaints were enough on their own to change up the dataset that same day.


> I thought revealing the ugliness of the system was part of the point of this project?

It 100% is. From the link to the artist's website:

"Things get strange: A photograph of a woman smiling in a bikini is labeled a “slattern, slut, slovenly woman, trollop.” A young man drinking beer is categorized as an “alcoholic, alky, dipsomaniac, boozer, lush, soaker, souse.” A child wearing sunglasses is classified as a “failure, loser, non-starter, unsuccessful person.” You’re looking at the “person” category in a dataset called ImageNet, one of the most widely used training sets for machine learning."


I was just about to edit and correct myself; it is ImageNet who has made the decision to delete the offensive images, after the reaction to the exhibitors' work. It's too bad the exhibitors didn't make their own mirror/cache of the dataset. Judging from some tweets I saw, I think this project really helped people to understand how much of current artificial intelligence is human-driven. It's not a sentient computer deeming you to be a "slant-eye", it's a bunch of random Internet users. (not that this makes you feel better about the world, but at least the hate's coming from an expected source)


While interesting, this is not surprising. One of the most commonly utilized datasets for learning ML is the Boston Housing dataset - https://www.kaggle.com/c/boston-housing

In it there's a problematic feature tagged simply as "black" and it is defined as the proportion of blacks by town.

Any pricing model that is built off of this dataset is inherently racially biased because the data has been collected and the feature tagged - but what's the alternative? Not to collect the information? Or collect it but completely ignore this feature?
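
For the curious, it's a couple of lines with pandas to look at (and drop) that feature. The file name below is a placeholder for whatever CSV you grab from Kaggle, and the column is named "black" per the comment above (it appears as "B" in some copies), with "medv" as the usual price target:

    import pandas as pd

    df = pd.read_csv("boston_housing.csv")  # placeholder path
    print(df["black"].describe())           # the race-derived feature
    print(df.corr()["medv"]["black"])       # how strongly it co-varies with the price target

    # Dropping it is trivial, but as the replies below note, that alone doesn't
    # remove the bias if other columns act as proxies for it.
    df_no_race = df.drop(columns=["black"])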


Sensitive features should not be collected or used in applications where they introduce bias.

For example, a medical screening NN may find race to be a valuable feature for the prediction of illness; but a health insurance assessor should not.


Due to spurious correlations, it could still be helpful for the insurance assessor so that they can use bias mitigation techniques. Otherwise, it might learn something about zip code or something else that leads to a similar outcome as having race as an input variable. Just removing a sensitive variable does not suffice for preventing unwanted bias.
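
One rough way to test this point: after removing the sensitive column, check whether the remaining features can still predict it. A sketch with scikit-learn (the DataFrame df and the "race" column name are hypothetical, and the AUC score assumes a binary attribute):

    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    X = df.drop(columns=["race"])   # the "non-sensitive" features actually fed to the model
    y = df["race"]                  # the sensitive attribute you removed
    auc = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                          scoring="roc_auc", cv=5).mean()
    print(f"Sensitive attribute recoverable from remaining features: AUC ~ {auc:.2f}")
    # An AUC well above 0.5 means proxies (zip code, etc.) are present, so simply
    # deleting the column did not make the model blind to race.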


Then you are confusing two kinds of bias: one is data bias and the other is racial bias. If you remove the data bias, you are by definition introducing a racial bias by imposing your will on reality.


AIs are like every other computer program - perfectly naive. If you train them on bad data, you get bad results. Garbage in, garbage out.


Of course, but the issue is also data availability. It is not easy to get large sets of data to be used in training AI. Imagenet was a great attempt to provide some useful data.


My comment was not intended to be a ding on Imagenet. Data comes from somewhere, and no source is without bias because no human is without bias.


What happens when we train them on an incredibly wide set of randomly-sampled data, which should show no built-in bias, and uncomfortable conclusions emerge? Do we censor the AI, castigate someone else in the chain for the problem, or just deal with reality as it presents itself?


In this example, where would you get the tags for the photos from, if not people who are biased?


Why assume that the people are "biased"? If the people are labelling things the way that people generally do, the software will return labels that are useful. Any kind of filtering needed to censor this for people's personal comfort should be implemented on top.


yes, when I’m preparing my prestigious project with my Ivy League peers I make sure not to sanitize or vet my inputs so as not to bias the results of my programs or publication success. truly it is remarkable that the training set we chose to use is racist but no single person who could have stopped this or be held responsible is


If your data set was compiled by a single person, well, easy enough.

If your data set was compiled by thousands of mechanical turk workers, well, you got a lot of people to blame a little bit. As it goes, "everybody is a little racist sometimes..." and apparently that shows on a big enough data set.


amazing, how when even more people are involved with selecting and tagging the input, it becomes even less likely any one person was to blame.


I love articles and papers like this. They keep illustrating that the people writing them do not get it:

Guess who classified the blonde white woman as "a dish" on Mechanical Turk? Do you think it was Billy Bob from the swamps of Louisiana who makes $12/hour? Because $12/hour is impossible to make classifying images. Or do you think it was a dirt-poor Indonesian or Filipino or Chinese or Indian, for whom that's the best job he or she could possibly get, and that's exactly how they view the world - and there are over three billion of them?


Your characterization of these nationals aside, what makes you think any of them even know what “dish” means in this context?

Sincerely, a Malaysian Chinese who’s seeing this slang for the first time.


"Dish" is almost an archaic term at this point. So either someone's pretty old grandparents are working for MT, or people are translating a term from another language using an old dictionary.


It is colonial (British) English. I heard it again recently and it took me a few moments before I recalled it from my childhood. Apparently now "snack" is used the same way.

I would guess the classifiers went to school in India or Pakistan.


How were the mechanical turk workers asked to annotate?


Since the amount of speculation in this overall thread is pretty high, I suggest simply having a look at section 3 of [0]. It describes the construction of the MTurk task and how they quantified the confidence of answers they got.

[0] http://www.image-net.org/papers/imagenet_cvpr09.pdf


That's the key question here, isn't it?

If faces are like other images in the database, then according to this article turkers were presented a word and a group of images, and asked to click all the images that showed said word. https://qz.com/1034972/the-data-that-changed-the-direction-o...


ImageNet's creators were likely incredibly naive in how they approached and fixed all of this.

from the article "Jamal Jordan, a journalist at the New York Times, explained on Twitter that each of his uploaded photographs returned tags like “Black, Black African, Negroid, or Negro.”"

Which indicates that the vocabulary they used has four words to describe a black person's skin color. This severely complicates things and leaves all kinds of unintended biases on the table before even getting to the human element. Though I doubt they used a dictionary at all, because 'Black African' isn't really a word, and it isn't found in a few dictionaries I tried online - if the article is accurate.

They should have curated the vocabulary first, value judgements like 'attractive' should be removed because it means something different for everyone judging the images. Synonyms should be collapsed into one to prevent weighted biases, etc.


I'm actually quite surprised that they didn't collapse the synonyms though.

Though value judgements should not be removed. They should have been separated and placed together with some tagger metadata as they might actually be informative then.


In NLP (specifically word vectorization, a la word2vec) there's a famous test of whether or not your training has worked properly: you take the vector for "king", subtract the vector for "man", and add the vector for "woman". If your model is properly trained, you should end up with a vector close to "queen" or "princess."

I wonder if similar things can be done to address specific (i.e. racial or gender) biases in computer vision.
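
For anyone who wants to try it, the test is a couple of lines with gensim and a publicly hosted embedding (the model name comes from the gensim-data catalog; the exact neighbors vary by corpus):

    import gensim.downloader as api

    wv = api.load("glove-wiki-gigaword-100")  # small pre-trained word vectors
    print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
    # "queen" normally tops the list; running the same call with other word
    # triples is exactly how embedding-bias probes work.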


There is a similar word-embedding test that definitely rustles people's jimmies:

Doctor - Man + Woman = ?

What normally comes out is Nurse. What "they" think should come out is Doctor!

By "they" I mean people that get upset by this.


Yeah, besides the fact that this compositionality is relatively unique to word2vec, research on the biases pre-trained models express is pretty available. Linked a few below for those interested. Most of the issues are down to the same phenomenon discussed here in the context of ImageNet, the input texts were biased and the algorithm learned said bias.

[0] https://arxiv.org/abs/1607.06520 "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings "

[1] http://proceedings.mlr.press/v97/brunet19a/brunet19a.pdf "Understanding the Origins of Bias in Word Embeddings"

[2] http://matthewkenney.site/biases.html "Google word2vec biases"


The longer piece this one is based on is much more comprehensive in both discussion and concrete examples: https://www.excavating.ai/


Why take down the images? It seems really that one should take down entire classifications, i.e. remove the tags.


ImageNet is a mapping from tags to images, so removing the tags means removing the images.

An image of a "bad person" wasn't tagged by someone looking at said image and deciding that "bad person" was the best possible description. It was generated by searching Google Images for "bad person" and removing obviously incorrect results (e.g. when there's nobody in the image).

Researchers have been using it to learn the inverse mapping from images to tags with some success, but in its construction the dataset is not naturally suited for that task.


As I understood it, and as confirmed by Wikipedia (for whatever that's worth), the images were hand-annotated.


Only the second step "removing obviously incorrect results" involved human annotators.


Because the woke scolds have spoken and nobody wants to stand up to them. So better to do patently insane things than risk becoming collateral damage in a social lynching.


I think the whole training of such an AI was just badly designed. Aren't all first-impression judgements prejudices, just as all racist biases are? If I say a person is attractive or inspiring based purely on their looks, that is a positive judgement - a bias based on appearance and a few traits I subconsciously perceive and perhaps cannot even explain - and the same goes for the negative ones, and I guess for some people this association also extends to race. Basically, an AI was trained to map the prejudices of its inputs, which means creating a snapshot of the subconscious average of whoever is behind the Mechanical Turk. Everyone has prejudices of all sorts which usually are not manifested externally, but in a classification process like the Mechanical Turk one they can surface much more readily.


The article mentions https://imagenet-roulette.paglen.com/ but doesn't link to it. Just put in a photo and see what category of person the algorithm sorts them in to.


One of my favorites is searching google images for "immigrant" versus "expat". Guess which one is whiter!


I don’t quite get it, why is “Buddhist” more objectionable than “grinner”? Surely there must be more egregious examples than that if they decided to pull 600,000 images? And if they find the tag “Negro” offensive can’t they just replace it with “black”, “of African origin” or whatever the currently allowed term is? Tagging a black person as black does not seem very objectionable in a country that provides racial breakdowns for official statistics.


When white people are labeled for their other qualities, and non-white people are labeled for their (perceived) ethnicity, that’s racist. Which is to say, unfair. Not nice.


Were those two examples labeled by the same people? If there really is systematic bias why don’t they mention some statistics? For example, 90% of white people lacked a race tag while 80% of black people had one?

Also, it doesn’t seem like there has been any evil intent here, nor did I see anything about pernicious consequences. It seems a bit overblown to accuse people of racism over something that is unintended and theoretical. Just update the tags.


Shouldn't everyone just be labeled for everything? Seems like we should just add "white" to the tags of white people.


I think the ironic thing here is the algorithm is intentionally designed to make a superficial judgement on the image it’s presented with.

If it was trained to identify “human” or otherwise categorize the things in the picture, that’s exactly what it would output.

Next you can train it to attempt to guess basic traits like gender or ethnicity; of course, this can only be done based on the RGB values of a 2D array of pixels. Interestingly, the NN will not merely be using skin color but building probabilistic weightings based on any statistically significant features. For added controversy, it's probably even possible to invert parts of the network to suss out how it's weighting various facial features toward different labelings.
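
A common way to do that "inversion" is a gradient saliency map. A PyTorch sketch (model and img are hypothetical: any trained image classifier and a normalized 1x3xHxW input tensor):

    import torch

    img = img.clone().requires_grad_(True)
    scores = model(img)                       # class logits, shape (1, num_classes)
    scores[0, scores.argmax()].backward()     # gradient of the top class w.r.t. the pixels
    saliency = img.grad.abs().max(dim=1)[0]   # per-pixel importance, shape (1, H, W)
    # Bright regions of `saliency` show which facial features the network is
    # weighting most heavily for that label.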

Lastly the images could be labeled for things like profession of the subject. A good intentioned effort to perhaps detect things like a lab coat could mean a doctor or scientist was followed by oblivious Turkers with predictably poor results.

The problem of course is not with the images but with the particular labeling hierarchy, and with allowing opinionated labels as well as labels which may be true for the particular subject but which have no distinguishing visual features to actually codify ("portrait of a macroeconomist") - in other words, garbage in, garbage out. ImageNet calls this the "imageability" of the label.

Calling this racism of course is entirely inapt, because there is no judgement being made whatsoever. Even if the sampling method was well designed and the labeling factually accurate, the system would still produce output which could be considered offensive. Again, because the entire point of the algorithm is to generate statistical assumptions based on a single image.

My conclusion is that some superficial judgments are algorithmically useful and hopefully less controversial. “White male human, ~55 yrs old, 180lbs”. Even things like analyzing clothing and guessing where the picture was taken. Iff the clothing is a uniform, identifying the profession (police, fire, paramedic)

But you have to know where this goes off the rails. Bad enough to label indistinct portraits with the subject’s profession, let’s not do inane things like labeling them with how subjectively attractive the labeler thinks the person is, their economic status, maybe even a 1-10 scale of how threatening they look or if they look like a criminal or not! </facepalm>


The "app", mentioned in the article, but not linked, is hosted here (until Friday), if you want to try it out: https://imagenet-roulette.paglen.com/

Apparently I'm either an insurance agent (in a grey t-shirt) or a surgeon (in a red hoodie).


“This exhibition shows how these images are part of a long tradition of capturing people’s images without their consent, in order to classify, segment, and often stereotype them in ways that evokes colonial projects of the past,” Paglen told the Art Newspaper.

Odd that the images were removed for being categorized as the project intended.


wouldn't it make more sense to add more images, not remove existing data?


If I understand correctly, they are removing incorrect tags, not the data. It's just news because some of the incorrect tags are racist bias. If they removed a bunch of cats tagged as dogs, you wouldn't hear about it.


IIUC the database is intended to identify the actual contents of images, not collect data about stereotypes. Even keeping the categorized face images in a separate database wouldn't be that useful, since turkers are not a random sample.


I think it's temporary. Sensible defaults to protect users who don't understand potential weaknesses of the model. Also, if someone does train an AI on a bunch of photos and ends up with a biased result, it won't be ImageNet's maintainers' liability.


It seems to be largely the tags that are biased. Long term, I'm sure people want big unbiased databases of images of people, but it will take a lot of time and effort to build them and ensure they aren't biased.


For the artist, ImageNet’s problems are inherent to any kind of classification system. If AI learns from humans, the rationale goes, then it will inherit all the same biases that humans have. Training Humans simply exposes how technology’s air of objectivity is more façade than reality.

An AI being accused of bias tends to really mean it works. Removing bias from AI, I've noticed, requires hardcoded 'fixes' rather than refactoring of the algorithms, and in my view the result becomes yet another human-curated classification system, no longer AI.
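
For what it's worth, the hardcoded fixes in question are often nothing more than a post-hoc filter over the label vocabulary. A toy sketch, with made-up label names, of what that ends up looking like:

    # Toy illustration of a hardcoded post-hoc fix: the model's ranked labels
    # are filtered against a human-curated blocklist before anything is shown.
    # The label names below are invented for illustration.
    BLOCKED_LABELS = {"offender", "slob", "swot"}

    def filter_predictions(ranked_labels):
        """Drop blocked labels, keeping the rest of the ranking intact."""
        return [label for label in ranked_labels if label not in BLOCKED_LABELS]

    print(filter_predictions(["offender", "grinner", "beard"]))
    # -> ['grinner', 'beard']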


Your point takes a narrow view of correctness and also shows why you need to take a systemic view of the issue.

The problem is not that the AI is "inaccurate". The problem is the second order effect: when you build systems on top of this AI's predictions, you cement the input social biases into future systems in a way that is very hard to remediate.

The real problem is how to avoid accidentally creating systems that amplify existing social biases.

I'm very specifically using the term "social bias" to distinguish from "bias" as a term in ML, because they are very different problems.
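
The amplification is easy to see in a deliberately crude toy model: suppose each generation's training data is collected via the previous model's outputs, and the collection over-selects whichever group the model has already seen more of (modelled here, arbitrarily, as weighting by the square of the current share). A small initial skew then compounds:

    # Deliberately crude feedback-loop sketch: the next dataset over-samples
    # whatever the current dataset already over-represents, so a small skew
    # away from the true 50/50 split compounds. All numbers are illustrative.
    dataset_share = {"A": 0.6, "B": 0.4}  # true population split is 50/50

    for generation in range(5):
        # Arbitrary amplification rule: selection weight grows faster than the
        # share itself (here, share squared).
        weights = {group: share ** 2 for group, share in dataset_share.items()}
        total = sum(weights.values())
        dataset_share = {group: w / total for group, w in weights.items()}
        print(f"generation {generation}: group A is {dataset_share['A']:.2f} of the data")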


This comment reminded me of a really interesting article about bias in AI: https://www.chrisstucchio.com/blog/2016/alien_intelligences_...


I wonder if the article author knew what he was hinting at when he chose the example of the art project calling someone a "dish"???

https://www.youtube.com/watch?v=HEX7xsYF1nA

(And now I'm gonna be listening to 90s electro-disco all day...)


That's newthink at a level I find hard to comprehend. Censoring datasets instead of improving them.

"We have too many pictures of white people, remove them!"

"We don't have enough pictures of non-white people, add them!"

I'd have gone for the latter, and have let the set be biased until fixed.


I uploaded my picture, and it was tagged "beard".

Which is fine, I have a beard. But then I read the definition:

> beard: a person who diverts suspicion from someone (especially a woman who accompanies a male homosexual in order to conceal his homosexuality)

:-/


Any way to download those images before they get deleted?


Everybody seems to be focusing on whether the image recognition is racist or biased. I think there is a much more fundamental problem: the image classification simply does not work for a lot of images of people!

For example, search for images of scuba divers: https://upload.wikimedia.org/wikipedia/commons/9/94/Buzo.jpg is labeled choreographer; https://dtmag.com/wp-content/uploads/2015/03/scuba-diver-105... is labeled picador (the horseman who pricks the bull with a lance early in the bullfight to goad the bull and make it keep its head low).

How about searching for images of dancers? https://eugeneballet.org/wp-content/uploads/2018/10/Alessand... is a nonsmoker; https://www.ballet.org.uk/wp-content/uploads/2017/09/ENB_Eme... is a speedskater; https://www.ballet.org.uk/wp-content/uploads/2018/10/WEB-ENB... is a plyer; https://rachelneville.com/wp-content/uploads/2018/11/10.14.1... is a mediatrix (a woman who is a mediator).

How about images of lumberjacks? https://alaskashoreexcursions.com/media/ecom/prodxl/Lumberja... is a skinhead; https://static.tvtropes.org/pmwiki/pub/images/lumberjack_591... is a beard (a person who diverts suspicion from someone, especially a woman who accompanies a male homosexual in order to conceal his homosexuality); http://cdn.shopify.com/s/files/1/0234/5963/products/I4A0032-... is a flight attendant; https://previews.123rf.com/images/rasstock/rasstock1411/rass... is an asserter, declarer, affirmer, asseverator, avower (someone who claims to speak the truth).

How about teachers? https://media.edutopia.org/styles/responsive_2880px_16x9/s3/... is a shot putter (an athlete who competes in the shot put); in https://c0.dq1.me/uploads/article/54231/student-classroom-te... the kid is a nonsmoker and the teacher is a psycholinguist (a person, usually a psychologist but sometimes a linguist, who studies the psychological basis of human language); https://media.gannett-cdn.com/29906170001/29906170001_578035... is a girl, miss, missy, young lady, young woman, fille (a young woman); https://media.self.com/photos/5aa9743e19b7c01d73149d50/4:3/w... (almost the same picture as the previous one) is now a sociologist!

How about searching for pilot images? https://news.delta.com/sites/default/files/Propel%20Embedded... is a parrot (a copycat who does not understand the words or acts being imitated); https://pilotpatrick.com/wp-content/uploads/2017/10/workday_... is a beekeeper, apiarist, apiculturist (a farmer who keeps bees for their honey)!!! https://imagesvc.meredithcorp.io/v3/mm/image?url=https%3A%2F... is a boatbuilder!!

And some random ones: https://media.spiked-online.com/website/images/2019/08/06154... is a sister/nun; https://www.english-heritage.org.uk/siteassets/home/visit/in... is a deacon, Protestant deacon (a Protestant layman who assists the minister); https://minervasowls.org/wp-content/uploads/2018/04/Romans-M... are identified as morris dancers.


The more you impose completeness of morality into a data set, the more inconsistent your results will be.

Gödel arbitrage will be very profitable in the future.


Of course it was removed. 4chan was having way too much fun with image classification recently :D


They will be hated for it, but they do provide a really important service here, even if it is just pulling down the pants of people trying to classify the world.


Reminds me of the early 2000s, when people would say "but he's gay" as if that had anything to do with their profession or aspirations in life or the topic at hand, as if that specific piece of metadata defined them more than the other pieces of metadata.

This AI seems to be doing that to things it has determined are black people: it applies a bunch of synonyms for black people, some English, some Spanish, some phenotypes, some of them terms that have fallen out of favor in some parts of the world but not in others, while completely eliminating other metadata.

I'm not sure ImageNet's response of removing "sensitive" adjectives is capable of fixing this. Mechanical Turk workers, English-speaking, Spanish-speaking, academics, all using terms that aren't universally agreed upon?

That doesn't really address what is happening.


[flagged]


Perhaps when we have real artificial intelligence, whatever that is. Right now we have pattern matchers, and it is not so much the 'truth' as it is stereotypes.



