600k Images Removed from ImageNet After Art Project Exposes Racist Bias (hyperallergic.com)
229 points by aaronbrethorst on Sept 23, 2019 | 289 comments


Superficial judgement is kind of where intelligence starts.

It's really only people where you can't tell what it does/is from the outside. Cars, trees, animals, mountains... everything else, if it looks a way it acts that way. Early AI will probably have just as much trouble with this as people have historically.

I really wish people would start viewing Racism as a willingness to let that primitive part of the mind be in command, rather than a binary attribute you either have or don't. Like, nobody is 100% not racist. There will always be slip ups, over-simplifications, snap judgements, subconscious or not.


>It's really only people where you can't tell what it does/is from the outside.

Thing is, that's not even true in that it doesn't fully acknowledge the problem. You often CAN tell information from people's outward appearance, albeit probabilistically. Therein lies the problem: you can very easily train an algorithm to be maximally right according to your cost function, but end up biased because the underlying distributions aren't the same between groups.

The issue is that as a society we've (mostly) decided that unfairly classifying someone based on correlated but non-causal characteristics is wrong, EVEN in the extreme that you're right more often than you're wrong if you make that assumption.
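Edit: a rough sketch of that failure mode in Python (all data synthetic, all numbers made up). Even when a non-causal group attribute adds nothing beyond a shared "signal" feature, the accuracy-maximizing decision rule ends up leaning on each group's base rate, and innocent members of the higher-base-rate group get flagged more often:

    # Synthetic sketch: an accuracy-maximizing rule exploits differing base rates.
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)
    n = 100_000
    group = rng.integers(0, 2, n)                          # non-causal attribute
    prior = np.where(group == 0, 0.05, 0.20)               # differing base rates
    y = rng.random(n) < prior                              # true label
    signal = np.where(y, 2.0, 0.0) + rng.normal(0, 1, n)   # equally informative for both groups

    # Accuracy-maximizing (Bayes) rule: predict positive iff P(y=1 | signal, group) > 0.5
    lik1 = norm.pdf(signal, loc=2.0, scale=1.0)
    lik0 = norm.pdf(signal, loc=0.0, scale=1.0)
    posterior = prior * lik1 / (prior * lik1 + (1 - prior) * lik0)
    pred = posterior > 0.5

    for g in (0, 1):
        innocent = (group == g) & ~y
        print(f"group {g}: false-positive rate = {pred[innocent].mean():.3f}")
    # Roughly 0.007 for group 0 vs 0.045 for group 1: the rule is "maximally right"
    # overall, yet it is far harsher on innocent members of the higher-base-rate group.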


>You often CAN tell information from people's outward appearance, albeit probabilistically. Therein lies the problem: you can very easily train an algorithm to be maximally right according to your cost function, but end up biased because the underlying distributions aren't the same between groups.

In fact, an AI trained on aggregate data to probabilistically infer characteristics about individuals is _literally a stereotyping machine_.

If people are upset that their stereotyping machine stereotypes people, they probably didn't fully understand what they were doing when they built it, because this is not a design flaw -- it's the design.


Maybe people are upset that stereotyping machines are being given harder powers to make consequential decisions about individuals.


> Maybe people are upset that stereotyping machines are being given harder powers to make consequential decisions about individuals.

Then these people should argue against AI instead of swinging the racism cudgel.


I would respectfully disagree. The builder of the AI should be trained in recognizing that his discrimination machine can be used for good and for bad. If the creation shows racist tendencies, it's an outcome of the machine but a function of the (lack of) quality in the modeller. If the end result is racism, I would like to be able to point to the creator of the AI (a human) not a piece of software.

More concretely: government AI shouldn't use things like names, zipcode demographics (at least those strongly tied to the characteristics for which we think discrimination = racism), or pictures of humans in their models. Why? Because it's pretty much impossible to control your model for racist tendencies once you start there. It's up to the ethics of the creator of the model to point that out and simply not do it. If you do, and all people whose name starts with an M (for Mohammed) get put in a different category, racist is the right term IMHO.


> The builder of the AI should be trained in recognizing that his discrimination machine can be used for good and for bad.

Quite a tautology.


It could work where a large number of AI's are constructed. A small subset of these AI's--those that can only be used for good--are used as a training set. A number of AI's that can be used for bad are added to the training set. The AI builder is exposed to this training set for a period of time, and on each exposure he is rewarded if he correctly categorizes each AI by its ability to be used for good or bad. After the AI builder demonstrates an ability to properly differentiate the AI's that can only be used for good from those that can be used for good or for bad, he is set loose on constructing a new AI, after which he is compelled to render (and publicize) a judgement on its potential use for good or bad. Alternatively, the builder can also be tasked with choosing only-good AI's from a larger mixed set of good/bad AI's.


He's not wrong though.


It could just be a path forward. In America a lot of the time we can't get any progress unless it's a racism issue. Things like gerrymandering or marijuana legalization get their first pushes because the most blatant group that suffers are POCs. The fact that it harms everyone is a little lost on most people, but in the end the racism cudgel can be effective for positive societal change. In this case, we can use it to get rid of automatically being identified for crimes or whatever by an AI.


AI is very definitely not procedurally fair unless you can explain all behaviors. Which is why designers generally don't know what they built.


That is true but be careful with the terminology! A system trained using aggregate data to probabilistically infer characteristics isn't artificial intelligence. If the system could find causations, then there would be grounds for calling it intelligent. But finding correlations, that's just number crunching.


See also Stucchio's impossibility theorem: it is impossible for a given algorithm to be procedurally fair, representationally fair, and utilitarian at the same time:

Video: https://www.youtube.com/watch?v=Zn7oWIhFffs

Slides: https://www.chrisstucchio.com/pubs/slides/crunchconf_2018/sl...


The general point is that you have to robustly compromise and satisfice all the goals. People tend to be rather good at it when taken as a group. (Any particular person may be bad at a given subset of all problems.)

It is a kind of optimality condition on all three goals.

The robustness additionally means that should conditions change, the algorithm usually will become better, not worse, and should a degradation still happen, it will be graceful and not catastrophic.

Designing robust solutions in this space is a hard and open problem in ML, and especially with ANNs. Most systems have really bad problems with it even when debiased.


Hello - do you have a good reference to this area? I bang on about similar ideas whenever allowed, but haven't found good support in the literature.

Not that I mind that too much!


Good presentation


>The issue is that as a society we've (mostly) decided that unfairly classifying someone based on correlated but non-causal characteristics is wrong, EVEN in the extreme that you're right more often than you're wrong if you make that assumption.

This is likely due to an acknowledgment of the limits of human models to account for the full context surrounding correlated-but-non-causal classifications, such that conclusions drawn from them can have unforeseen or highly detrimental ramifications.

Speaking to race in America specifically, the schema through which we judge people are highly susceptible to bias from the white supremacist bent of historical education and general discourse. This is how you end up with cycles like those within the justice system (pushed in part by sentencing software), wherein black defendants are assumed to have a higher likelihood of re-offending, therefore increasing the likelihood of any given black defendant not receiving bond or having a lengthy sentence if convicted. After all, blackness correlates with recidivism. Lying outside this correlative relationship are the likely causal relationships of longer stays in jail and lack of access to employment opportunities, which disproportionately affect black people, causing higher rates of recidivism, regardless of race.


You have other measures than prison.

You can still have enhanced vigilance without enhanced annoyance and mistakes.

There's often a superior choice lurking that nobody is thinking about, sometimes expensive, sometimes not, seemingly unrelated to such optimization. This is why ML is not intelligence, it cannot find new solutions you're not already looking for.

The main problem of the judicial and police system is it tries to be procedurally fair and still fails at it anyway.


I'd counter that in many cases it doesn't even try to be fair. It privileges the ability to craft an argument over bare facts, which immediately privileges those who can afford professional representation. At the core of the fear of a surveillance state isn't simply the loss of privacy (which in and of itself could be worth the accuracy it would bring to judicial proceedings), but the fact that it would just bolster the ability of skilled narrative-builders to pull the most advantageous facts out of context and twist them to their whims.


> we've (mostly) decided that unfairly classifying someone based on correlated but non-causal characteristics is wrong, EVEN in the extreme that you're right more often than you're wrong if you make that assumption.

Ironically, the only places where it's legally prohibited or frowned upon to use these heuristic techniques are situations that people have arbitrarily (heuristically or conveniently) decided.

For example: It's "not fair" to hire someone because they're white (and consequently have a higher chance of being wealthy and hence a higher chance of being educated.)

But it's "fair" to choose a love partner based on their height, their waist-to-hip ratio, their weight (and hence having a higher chance of giving birth to healthy offspring, better physical protection, etc.).

Maybe it's hypocritical, and I don't know if that's a good thing or not. Maybe being hypocritical helps us survive.


The hiring vs mating issue is perhaps simpler than you picture. US federal discrimination law only applies to companies with more than 15 people. If we regularly married 15+ people at a time, we might very well put legal restrictions on your mating choices. The more personal the decision, the more agency you get.


wow, interesting. Why does it only apply to companies with more than 15 people? Is it the idea that you're more likely to have family help (and only family being willing to help) when your company (more small business than traditional startup) is this small?


If you are starting a small company you either pick people you already know (whoever those might be) or maybe a few random experts with very specific skillsets. There is no place for you to actively discriminate against someone who would maybe have been a better pick, just because you didn't like their skin color or gender... Or if you still do, it comes at your own loss.


> if you still [discriminate] it comes at your own loss.

This applies to companies of any size.


A job contract is naturally a relationship between two entities: employee and employer.

Sure, some people in some countries have tried (sometimes successfully) to undermine this principle, but it's akin to forcing people to marry in groups.


While (modern? ideal?) marriage is a peer-to-peer relationship, the relationship between employee and employer is unbalanced, more like pet-owner.

Societal restrictions on contracts are a way to balance this. Virtually all societies have some form of this, outside utopian ultra-liberal hellholes.


That's not very arbitrary. If you're hiring someone, you're always in a position of power. If you're dating someone, there's no power differential (or if there is, that's a problem all by itself).


How are you "always in a position of power" when hiring someone? That's only true when there's more supply than demand, and it's the opposite in markets where the candidates get multiple high-quality offers to choose from.


Because you are paying that person money and have the ability to fire them. In the US you're also probably providing their health care.

I get what you're saying, but no one moves jobs every week. The sunk costs of switching employment are significant for the employee, less so for the company.


They provide you valuable work in exchange for that... Internally you are probably imagining some big corp that can pick from 100s of replaceable workers.


If you provide health care, you always have power over that person. Less so if they are relatively healthy - but any condition, theirs or a family member's, means that the person has no real choice but to do enough good work to keep the job. Even if they hate the company. Even if you treat them poorly. They still must work for you. This is even more true if you have hundreds of people that will replace them and your health coverage is good enough - or at least, better than the opposition's.

To a lesser degree, the same goes with vacation time and other benefits. At least in the US, anyway. This is why having some of this stuff coded into law and decoupled from employment takes some power away from employers.


Ideally there are multiple companies which you could choose from... And some might really need your specific skillset.


In practice, that's how it works, particularly at the lower end of the economic spectrum. If that wasn't the case then the concept of a minimum wage wouldn't be necessary - the market would take care of it.

Of course, this is a point of view and not everyone agrees with me, but to me it appears that for a chunk of the population the available jobs do not pay well enough to meet a certain standard of living.


> it appears that for a chunk of the population the available jobs do not pay well enough to meet a certain standard of living.

That is definitely true, but it's also pretty much the exact opposite of "always".


> If you're hiring someone, you're always in a position of power.

What? Can you explain the reasoning behind this statement?


If the candidate had more power, she would set up interviews and force the employers to impress her.

Of course, that scenario happens only in extreme edge cases. Even in a booming economy with a shortage of workers, John Doe is not going to be pursued aggressively to fill the Senior Marketing Manager role.

This makes intuitive sense: employers have a ton of money, so people come to them.

Right now I have a client who needs to hire truck drivers and can't do it fast enough. I asked him what he'd done to make his company the most attractive (pay, technology, perks, etc.). He said he's done nothing.


> John Doe is not going to be pursued aggressively to fill the Senior Marketing Manager role.

That's exactly what recruiters and headhunters do. Aggressively nails it.


Recruiters and headhunters don't necessarily seek people to fill a role. They seek people to fill their batch of application forms to send to those actually doing the hiring.


Recruiters don't replace the interview process, where the dynamic is that the applicant is the interviewee. They only change the way the applicant discovers the job.

Most people will never be headhunted.


If candidates had more power, companies would create a whole section of the company to try to find and hire talent. Companies would literally pay to find candidates.


By that same logic, it's unfair for attractive people to choose who they date, because they're in a position of power.


There are two possible situations:

A candidate has multiple job offers, and decides which one to take.

The company interviews multiple people for a single position, rejecting the others.

There are only a small number of sectors where the first situation is reasonably possible; a lot of us on here are incredibly fortunate that engineering happens to be one of them. For the majority of the job market (by volume of people rather than volume of money), it takes people attempt after attempt to get a job. They don't get to choose between multiple offers, they have to take the first thing that will allow them to pay the rent, and then they have to hold on to it.


And yet from the point of view of the one who everyone is discriminating against, the feeling is pretty similar: Everyone rejects me and there's nothing I can do about it.


Turning it around though. It's entirely fair to hire someone because they are well educated, which presumably means you're disproportionately hiring white people.

Europe has the concept of indirect discrimination, which could make that illegal, certainly things less central to the role could amount to indirect discrimination.


This is overstated:

"are situations that people have arbitrarily (heuristically or conveniently) decided"

The Holocaust was not convenient, even to those who were for it. Slavery was convenient for those who benefited from it, but not those who suffered under it. Over the last 200 years racial categories have led to many millions of deaths. This is not merely a matter of convenience.


The holocaust and slavery fall in the realm of physical violence, or at least coercion.

Violence and coercion don't logically follow from racial differentiation. One may point out the differences between populations of different races, but that wouldn't justify attacking any individual from those populations.

My comment is framed inside that basic (and obvious) principle. Choosing a partner or an employee is not a violent or coercive act.


> The issue is that as a society we've (mostly) decided that unfairly classifying someone based on correlated but non-causal characteristics is wrong, EVEN in the extreme that you're right more often than you're wrong if you make that assumption.

That sounds like the wrong basis for calling it extreme. It's not at all extreme to say that classifying an interview candidate based on correlated but non-causal characteristics is wrong, regardless of the statistical significance of those correlations.


I just mean that I'd guess most stereotypes have a correlation nowhere near 0.50, but it'd be wrong to use them to screen candidates even if it was 0.50 or greater. I used the word "extreme" to convey the sense that such a scenario of stereotype accuracy is very unlikely.


Most stereotypes are lagging indicators that do not tell us about the future, as the history of Fenty Beauty makes clear.


I don't believe Fenty Beauty has been discussed on Hacker News before seeing as how it's a line of cosmetics, so maybe you could expand on what you know about its history that others may not have been following in as much detail?


In 2017, Fenty Beauty found an underserved market (dark-skinned women and high-quality makeup) and made a killing selling them what they wanted. If you relied solely on history and stereotypes, you would believe nobody could make $500 million that way.

https://www.fool.com/investing/2019/05/15/rihannas-fenty-is-...


Thank you very much!


> EVEN in the extreme that you're right more often than you're wrong if you make that assumption.

And this is actually the rational thing to do. Reason: There are two potential errors involved here: labeling an innocent person a criminal (false positive) and labeling a criminal as innocent (false negative). The key is to realize that the cost of these two errors is not the same. For instance, treating an innocent person as a criminal could be much more expensive for the person and society than not detecting a criminal with a given classifier. For that very reason, we have the presumption of innocence as a principle in law. As a consequence, you don't want to select a classifier based on just the rate of errors overall, but you want to incorporate some kind of loss function that minimizes the cost for individuals and society. Under that loss function, the best classifier may actually be wrong more often than some other classifier.
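Edit: a small sketch of that point (hypothetical costs, synthetic scores). With an asymmetric loss, the threshold that minimizes expected cost is not the one that minimizes the raw error rate; it shifts toward fewer false positives:

    # Sketch: choosing a decision threshold under an asymmetric loss.
    # All costs and score distributions here are made up for illustration.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 50_000
    y = rng.random(n) < 0.30                              # 30% true positives
    score = np.where(y, 2.0, 0.0) + rng.normal(0, 1, n)   # noisy classifier score

    COST_FP = 5.0   # cost of labeling an innocent person a criminal
    COST_FN = 1.0   # cost of failing to flag an actual criminal

    def error_rate(t):
        return np.mean((score > t) != y)

    def expected_cost(t):
        pred = score > t
        return (COST_FP * np.sum(pred & ~y) + COST_FN * np.sum(~pred & y)) / n

    ts = np.linspace(-1.0, 5.0, 121)
    t_err = ts[np.argmin([error_rate(t) for t in ts])]
    t_cost = ts[np.argmin([expected_cost(t) for t in ts])]
    print(f"error-minimizing threshold: {t_err:.2f}")     # ~1.4
    print(f"cost-minimizing threshold:  {t_cost:.2f}")    # ~2.2, i.e. fewer false positives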


I'd just flag that "EVEN in the extreme that you're right more often than you're wrong" is not a meaningful line (if someone wanted to apply it) because the fact is that it's not the number of wrong calls or right calls that matter here - it's which ones and when. For example the project under discussion classifies a photo of my mother as a young woman as a "flibbertigibbet", which is amusing, as a way to tease her now (she is a 70 year old ex prison and probation officer and needs no protection from teasing).

However, had this been done to my daughter and the same result obtained (in fact the result was "blue" but there you go) by someone assessing her potential as a candidate for university - well there's damage.


The fact is that we live in a society


In many cases, being logically correct is more important than being politically correct.

For example: It may be politically correct to drive down a bad neighborhood in the middle of the night, but it isn't logically correct.


True, but poorly stated. Which is to say, you aren't wrong, but you're missing the indications of nuance which are really, really important to the personal liberty of not being defined by superficial traits.


It is logically correct to do that when it's your neighborhood and you're driving home.

But note the implicit bias in your own comment: you assume yourself and all the readers of the comment are not people who live in bad neighborhoods.


>But note the implicit bias in your own comment: you assume yourself and all the readers of the comment are not people who live in bad neighborhoods.

that assumption results simply from the need for 'bad neighborhood' to be a negative scoring action.

if you oversimplify it to get rid of 'implicit bias' (which I don't agree exists in the example), the results turn into near-meaningless babble.

"For example: It may be politically correct to do something that ignores statistical dangers in favor of the promise of human goodness, which may result in the possibility for more personal endangerment than other choices, but it isn't logically sound to ignore such statistics for the hope of a less biased personal experience."

The example requires the person driving to be detached from the bad neighborhood that they have a choice to drive through. How that isn't an obvious requirement for the example to have merit is beyond me.


That is not technically even true. If you live in a bad neighborhood it's still probabilistically better to drive home thru a good neighborhood than thru another bad neighborhood.


While there may be some bias, it is as much yours. Even people who live in bad neighborhoods avoid driving through them when possible.


Did you ever live in a really bad neighborhood? I did. Out of two bus stops I always used the one that required crossing two streets with no crosswalks. The other required walking right through the middle of my lovely vicinity, with a 50% probability of giving something up to ‘charity’.


Well, it's especially in those situations that you need institutional controls to forbid the logically correct choice.

After all, you don't really need a law telling companies they aren't allowed to hire infants as senior officers - it's already not in the company's interest to do so.

However, when there is a logically correct but politically incorrect decision that the company could make, it is now that you need laws to prevent the company from taking that choice.

Of course, as the weight of an institution's decisions goes down, so does the need to police its actions. In particular, it is rarely necessary to prevent an individual person from acting on their biases.

Applying this to your example, if we had an AI that should suggest your best route home, and it avoided a short route through a bad neighborhood, that is likely ok. However, if a municipality used the same AI to decide where to prioritize changing street lights, that should be prohibited.


> The issue is that as a society we've (mostly) decided that unfairly classifying someone based on correlated but non-causal characteristics is wrong, EVEN in the extreme that you're right more often than you're wrong if you make that assumption.

Sorry, but it's not a decision. Science has found repeatedly that using outward characteristics does not work as a good classification measure. Society simply enforces not making bad judgements. See:

Phrenology, Racism -- "appearance implies certain things about someone's character and intelligence" (https://en.wikipedia.org/wiki/Phrenology)

Sex -- "A person's gender presentation determines their gender or sex" (https://en.wikipedia.org/wiki/Transgender_history), "A person's genitalia determines their chromosomes", "A person's chromosomes determine their sex and their ability to give birth" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2190741/).

Wealth -- "That person is wearing bad clothing/living an unassuming life therefore they must be poor" (https://www.npr.org/2018/12/29/680883772/social-worker-led-f...)

Intelligence -- "This person scores badly on intelligence tests therefore they must not be worthwhile" (https://en.wikipedia.org/wiki/Richard_Feynman, https://www.psychologytoday.com/gb/blog/sudden-genius/201101...), "This person appears to be unskilled or incapable of many tasks therefore they must be totally unskilled" (http://en.wikipedia.org/wiki/Savant_syndrome)


> Science has found repeatedly that using outward characteristics does not work as a good classification measure.

Science has found that, on average, it works [1]. Of course there will be many cases where it fails - that's how statistical inference works. Whether that makes it a 'good' measure, by whatever standard, is a different question. But there is no doubt information can be inferred from appearance.

[1] http://www.spsp.org/news-center/blog/stereotype-accuracy-res...


>Sorry, but it's not a decision. Science has found repeatedly that using outward characteristics does not work as a good classification measure.

Lol, what? Generically, that statement is almost certainly false more often than true in general in science. But I believe you are restricting yourself to social psychology?

You then presented a long list of examples taken from the tails of certain distributions to refute an argument that said distribution exists and has an average? I didn't even name any particular distributions. Your thinking appears flawed and emotionally driven, and most unfortunately, that's the type of thinking that will lead you to building biased systems.

Here's the point you missed the first time around: There are going to be outwardly visible characteristics that ARE correlated with some factor of interest, to the extent that training a machine learning algorithm based on a cost function that uses predictive accuracy alone WILL result in a system that assigns what society would consider an inappropriate importance placed on non-causal but correlated parameters.

Here's a real world example that might help you understand why this is important: (Data taken from: https://en.wikipedia.org/wiki/Incarceration_in_the_United_St...) Because blacks are over-represented in the US criminal justice system (40% of the prison population vs 13% of the population) and because part of what defines "black" is the outward appearance of certain facial features, a facial-recognition algorithm which is trained to recognize criminals, with a cost function based on prediction accuracy alone, and facial features as input parameters is going to have false positives that over-represent blacks. Does that sound like something you want? Because denying the underlying distributions is going to lead to exactly that.

It's very important to consider this when you develop a training set, for fuck's sake. It might work something like this: Take 100 innocent people's faces at random. (On average it will have only 13 blacks.) Then take 100 random criminal faces from inmates. (On average it will have 40 blacks.)

Then mix up the groups into your training set and assign a prediction score 1 or 0 depending on whether or not your classifier has correctly predicted whether or not a face was in the criminal group. Then, based on no other feature than race, your neural net can get better performance based solely on guessing more often that black people are criminals. That's not a good thing.

Do you get it now? The likelihood of being falsely identified as a criminal is greater based on the non-causal but correlated variable of being black. And this has happened several times already! You can't keep pushing this narrative that neglects the underlying statistics because of your beliefs, or people will keep making racist systems.
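Edit: to make the arithmetic concrete, here is a sketch using only the percentages quoted above (13% black among the 100 innocent faces, 40% among the 100 criminal faces, and no feature other than race):

    # The crudest version of the rule a classifier can learn from that training
    # set: "predict criminal iff the face is black." It beats coin-flipping on
    # accuracy, and every one of its false positives is black.
    innocent_black, criminal_black = 13, 40     # expected counts per 100 faces

    true_pos  = criminal_black                  # 40 criminals correctly flagged
    false_pos = innocent_black                  # 13 innocents wrongly flagged
    true_neg  = 100 - innocent_black            # 87 innocents correctly cleared
    false_neg = 100 - criminal_black            # 60 criminals missed

    accuracy  = (true_pos + true_neg) / 200     # 0.635, better than the 0.5 baseline
    fpr_black = false_pos / innocent_black      # 1.0: every innocent black face flagged
    fpr_white = 0 / (100 - innocent_black)      # 0.0: no innocent white face flagged
    print(accuracy, fpr_black, fpr_white)

A real network would of course blend race with other features, but the gradient pointing in that direction is already sitting in the data.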


It seems that some tasks are just inherently racist (and sexist, etc.), and we should be able to identify them before someone inflicts it on society.

Trying to identify potential for criminality based on looks may well work, by identifying and using the underlying racial biases. And if those systems are used, where people identified by these systems are more likely to be identified as criminals and investigated, we end up with a feedback loop and, over time, the racial biases in a society that uses the systems will get worse. More black criminals will be caught, as they are more likely to be suspected as criminals, making the racial biases worse for blacks, while the opposite happens for white faces.

Similar social evolution would happen if you pre-screened job candidates, over time magnifying existing gender and racial biases. I've seen in some cultures it is required to add photographs to job applications, but I think it a good thing that this practice is discouraged in western countries.


So, I disagree with the "inherently racist" portion of your argument. You can evaluate the classifiers in a race independent manner, i.e. take a look at the metrics stratified across race. I'm going to proceed here assuming that the classification task is possible.

Say, in your example of criminality, if black people were predicted as criminal more often by the classifier, it isn't racist if it accurately reflects the base distribution. There's a line to be crossed here, and to me it's crossed when the application starts to significantly (where the boundary lies here is up for debate) and directly affect the non-criminal portion. I'd say that if the misclassification error is similar across racial lines, then there is no issue.

Additionally, I don't quite agree with the "making racial biases worse" argument either. The way I see it, we already use racial heuristics in law enforcement. With automated, replicable tasks, we can at least quantify the degree of bias and correct for such.
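Edit: a sketch of what that quantification could look like (made-up arrays; the group column is used only for auditing, never as a model input):

    # Audit a trained classifier by stratifying its error rates across groups.
    import numpy as np

    def stratified_rates(y_true, y_pred, group):
        """Per-group false-positive and false-negative rates."""
        out = {}
        for g in np.unique(group):
            m = group == g
            neg, pos = m & (y_true == 0), m & (y_true == 1)
            fpr = y_pred[neg].mean() if neg.any() else float("nan")
            fnr = (1 - y_pred[pos]).mean() if pos.any() else float("nan")
            out[g] = {"fpr": fpr, "fnr": fnr, "n": int(m.sum())}
        return out

    # Example with made-up data: an 85%-accurate predictor.
    rng = np.random.default_rng(2)
    y_true = rng.integers(0, 2, 1000)
    group = rng.integers(0, 2, 1000)
    y_pred = np.where(rng.random(1000) < 0.85, y_true, 1 - y_true)

    for g, r in stratified_rates(y_true, y_pred, group).items():
        print(f"group {g}: FPR={r['fpr']:.2f}  FNR={r['fnr']:.2f}  n={r['n']}")

If the per-group FPR/FNR diverge substantially, the "similar misclassification error across racial lines" condition fails, and you can see by how much.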


The main question remains: what price do you want to pay for procedural fairness, why is it even a major goal?

Most people would probably opt for utility mixed with representational fairness of some degree even when it means law applies to some degree differently to groups and special cases.

Justice is not fairness. It has an institution called motive. (Which is often overly simplified or ignored.)

It incorporates elements of fairness. And it is hard to train. Not everyone can be Solomon, for example.


This is truly where the danger lies. You can say that you're trying to make rulesets that are accurate to the world we live in, without acknowledging that you fall heir to, and may in fact be producing this generation's version of, historical policy decisions that contributed to or outright created racial division and disparities in the first place. Both then and now there exists an appeal to empiricism that takes the current state as the natural order, and not one manufactured by past decisions which aimed specifically at a particular, and not altogether organic, conclusion.


I haven't seen a single use case, or a proposition, to use AI classification on such a raw task as classifying whether someone is a criminal or not just by their photo. Any sane person would see that such an endeavor would by itself present so many problems in a society, long before we even get to the racist bias of a statistical distribution. So what is the danger really?


Yeah I didn't pull that out of a hat. The specifics were arbitrary, but I chose that as a real example of systems that people already tried building. THAT'S why understanding Bayesian statistics and underlying distributions is super important to consider when you might inadvertently create biased systems.


Proposition? Does this count as a proposition?:

https://www.newscientist.com/article/2114900-concerns-as-fac...


Ok yes well that one is absolutely crazy and scary. Regardless of whether there are race biases in the database.


That is less relevant than the goal, which is to have a better society while not violating human rights, for a very wide definition of human.

In short, being humane. If a certain degree of racism or stereotyping is necessary for that goal (definitely not 100%, but something low), then so be it.

Currently multiple social systems in the USA are considered much too racist.


You have such poor use of sources to underline your claims it is almost incredible.

What underlines it is seemingly a lack of understanding of statistics, because most of the "examples", if they can even be called that, are from the realm of uncommon outlier cases. That's not what statistics mostly operates on - it operates on common cases that cover the largest part of a distribution, not outlier cases on the ends of a distribution.

In every case you gave an example of some uncommon exception, as if that would somehow be an argument for the rule existing?


You're saying correlation is not causation, a very well known truism. Surely you can't be seriously saying that society or anything else effectively prevents people from equating the two?


> I really wish people would start viewing Racism as a willingness to let that primitive part of the mind be in command, rather than a binary attribute you either have or don't.

This. Some childhood development studies have borne out that children treat their own 'ingroup' preferentially over those of an 'outgroup' [1]. This is before they really are "impressionable" as I understand it, so it shows something of an inbuilt mechanism. Though they made this distinction:

“Racism connotes hostility and that’s not what we studied. What the study does show is that babies use basic distinctions, including race, to start to cleave the world apart by groups of what they are and aren’t a part of.” [2]

1 - https://www.frontiersin.org/articles/10.3389/fpsyg.2018.0175...

2 - https://www.telegraph.co.uk/news/science/science-news/107705...


I am completely against this reclassification, because I think it is crude and wrong. Racism isn't equivalent to prejudice, and the distinction they made is pretty reasonable. Racists think of themselves as superior due to their race. I like to keep these issues separate, and the lizard brain theory is just laughable.

At least some of the people conducting the studies on children are guilty of child abuse, for they tried to correct the behavior of infants with their infantile theories.


"preferentially treat their own 'ingroup' preferentially"

It might make interesting reading to learn about people who don't have an ingroup or who don't prefer their ingroup.

Or then again, it might just be depressing.


From what I remember from pop-psychology books, people create their in-groups on the spot when the situation allows them to also identify an out-group.

This was the classical experiment I remember: https://en.wikipedia.org/wiki/Realistic_conflict_theory#Robb....


People who don’t have an ingroup are called psychopaths. I know at least one person who (claims to have) reasoned themselves into something resembling utilitarianism from that starting point but that’s a minority end point to not having a capacity for group loyalty.


White liberals have strongly negative in-group bias: https://www.tabletmag.com/wp-content/uploads/2019/06/AA2.jpg


I can't gather any information from that chart alone. It lacks all context, which is important.


What we're rapidly heading towards is someone being smart enough to implement an AI that detects if something is racist, and then everyone else who has an AI that is trained on a random-sampling dataset from reality, which we all know will have Uncomfortable Conclusions (tm), will have to filter their AI through the censorship to be sure that it doesn't accidentally notice any crimethink.

https://en.wikipedia.org/wiki/Thoughtcrime#Crimestop

We basically need this in software so that AI researchers can stop getting lambasted with claims of racism


That's needlessly confrontational.


[flagged]


So you didn't learn the lesson of the 20th century dystopian future predictions, and you'd rather implement your own and learn the hard way?


well, they're not going to think for the minorities and SOMEONE is going to have to be the bigger person here.

Empathy is an important part of resolution, as much as that sucks.


Unconscious Bias training tries to teach that, but I've always found the framing and delivery to be counter-productive.

They give everyone tests which reveal that primitive part of your brain. Essentially they want to shock you by forcing you to fail. It is almost like they are trying to shame people, which is a terrible way to teach.


Well yeah, step one is showing that there is a problem. Bias training has to do something for the not insignificant number of people that don’t believe that they have a noticeable bias. If you actually want the training to be effective you have to destroy the “but I can’t be racist because...”

There’s nothing to be ashamed of when it comes to having bias: women think other women are less competent than their male coworkers, trans women struggle with thinking of other trans women as men, black men find other black men threatening. Your first thought is the one you’ve been conditioned to think and is broadly speaking shared by everyone. It’s when you don’t stop to have the second thought — your own thought that’s the problem.

“No, that’s silly. I’ve seen her work — she absolutely knows what she’s talking about.”

“She’s a woman. Full stop.”

“That guy is just minding his own business and given zero actual signs of being a threat.”


How about the fact that things like IAT might be junk science: https://digest.bps.org.uk/2018/12/05/psychologys-favourite-t...


What I'm talking about is the difference between telling people they're inherently flawed ("you have unconcious biases you'll never defeat") instead of that they have a choice to make (”racism as a willingness to let that primitive part of the mind be in command").


What would you say is a pedagogically better way to help people consciously compensate for that part of their brain, and especially what's the best way to make someone aware of it without shaming them?


There are people who are aware of this and are training AI that are allowed to become racist if that’s the direction they take.

You won’t hear much about it though because: 1) it’s a sensitive topic that is easily misunderstood by those who aren’t researchers, and 2) when everyone else is trying to scrub racism out of their models for politically correct reasons, having a “racist” AI can actually produce a competitive advantage in some industries, and it can be a difficult advantage for competitors to match if they don’t allow for natural racism to emerge.

In short, it’s not a big deal, humans have some amount of racism whether they admit it or not. What matters is that you treat people equally and without prejudice, regardless what you may think of their race. Judgements about a group and judgements about an individual are two different things.

Racist results don’t even need to be about negative things, it could be as simple as “a person of this race prefers this kind of food over that one”


That is most definitely not racist though. Once we go there, we're basically trying to force ourselves to take a worldview that does not accurately represent reality.

If my classifier predicts that an Asian guy would probably eat rice, if it's accurate, is it racist?

The issue I'd have here isn't racism, it's when race dominates other features resulting in, in this case, poor recommendation quality/variety for Asians who prefer non-rice dishes or vice versa.


> Cars, trees, animals, mountains... everything else, if it looks a way it acts that way.

Well there’s all kinds of mimicry in nature to exploit exactly this assumption.


Not to mention that there are different kinds of racism we lump together under one label. Someone can think a certain race is, say, bad at math and still be racist even without disliking that race. Someone can also hate a race without believing they have some inherent inferiority, and that's racist. Similarly, you could simply decide to support your own race at the expense of others out of a sense of loyalty without even disliking other races and that's racism too.


It is incorrect that inanimate objects are handled in a bias-free way. For instance, yellow bananas are (by human labelers, and consequently by AIs) labeled 'bananas', while green bananas are labeled 'green bananas' (similar to how a male doctor may be labeled 'doctor' while a female doctor may be labeled 'female doctor'). Even aside from labeling, the choice of data itself may be prone to bias, for example most pictures in an image set may be taken from a camera at human height pointed horizontally, etc. The biases that are racist are simply a subset of the manifold human biases that pervade data and consequently pervade AIs. There is no unbiased algorithm, all are hopelessly contaminated by humans.


>>(similar to how a male doctor may be labeled 'doctor' while a female doctor may be labeled 'female doctor')

Isn't that just a bug in the English language though? English very rarely employs variations of the same word based on gender - but that's not true of many other languages. If the labeling was done in, say, Polish or German, suddenly it wouldn't be biased at all: a male doctor would be labeled Lekarz/Arzt while a female one would be labeled Lekarka/Ärztin - it's just what it is, no bias here.


As someone who has lived as quite an extreme minority (only 4 people out of 1500 at my high school looked even remotely like me), it's hard to imagine a world where these small slip ups don't happen at all. I wouldn't want to live in that world. I'm guilty as well, to some extent.

Our minds are literally categorical machines--in order to fight entropy we find stable states through classification of the world into categories. So literally any form of thought is a form of discrimination between many categories. Every word can be thought of as a category.


Sure but people are shipping this stuff in production without questioning how good it is or even what kind of problems it might have.


>There will always be slip ups, over-simplifications, snap judgements, subconscious or not.

Absolutely. The mark of intelligence is to recognize them and correct for it.

And the people who just permanently slip up, and never correct themselves, and somehow always make the same snap judgments? 100% racist.


I don't think anyone has achieved 100% un-racism, or 100% racism.


I think it’s unfortunate that the word “racism” is used both for evil things like apartheid or Jim Crow, and for unconscious bias. Especially when the bias isn’t even to view certain races negatively, just differently.

I mean it doesn’t seem like people tended to label some races with negative labels, just that they labeled their race at all, and didn’t do that for white people. At least in the example.

I don’t think it is useful to lump these two concepts together.


A car could look fast, but be slow, or have latent problems, software on the car's computer could be fooling emissions tests... Looking at a tree tells you nothing about photosynthesis, tree pheromones, deep relationships with symbiotic insects or bacteria...

You can't understand very much at all just based on how things look. That holds for humans and inhumans alike.


This is a perfect example of Tactical Nihilism. You're basically denying the existence of a massive amount of reproducible psychometrics and anthropological research because it makes you uncomfortable, and you're declaring that heuristic analysis of available, salient facts is not worthwhile because "you can't understand very much at all based on how things look."

When an arborist "looks" at a tree, they identify the kind of tree it is, and that connects to all of the research they know about that species of tree. It's reasonable to suppose that a new instance of a known kind of tree is going to have properties in common with all the other known instances.


I think you're reading into my comment things that aren't there. Pointing out that even apparently simple things contain hidden depths is not an example of nihilism, tactical or otherwise. Likewise, it's not a denial of research - it's an acknowledgement of the research that uncovered those hidden depths.

Your point about how a specialist can connect their knowledge to what they see is true, but not relevant to the point at hand. If I was your therapist, and had studied and understood you thoroughly, I might be able to assess your mental state at a glance in most scenarios. That doesn't deny you have a detailed inner life anymore than that an arborist might generally understand a tree does not deny the complex life of the tree.


This is a perfect example of an incredibly hostile overreaction backed up by a Proper Noun.


The claim wasn't that people are harder to understand than trees based on appearance - it was that people are harder to understand than anything based on appearance. It seems like a poorly articulated reference to people having an interior mind - but so do animals. Can you identify a friendly or sick or lazy dog just by looking? No.


> Can you identify a friendly or sick or lazy dog just by looking? No.

Yes you can do that. To put it extremely simply: define what "friendly", "sick", and "lazy" means in terms of behavior, then observe the dog's behaviors.


A sleeping dog still possesses those characteristics, but has no behaviors.


A dog sleeping amidst lots of stimuli known to excite dogs is exhibiting a lazy, sick or perhaps even unfriendly behavior.

Edit: sleeping is a behavior.


You can say exactly the same thing about a person, which seems to support my point.


I must have misunderstood, upon second read you seem to say something similar to, "nothing is as it appears" no?


Mmm, I think I was going for "lots of things cannot be understood completely by appearance, it is not unique to humans"


> Can you identify a friendly or sick or lazy dog just by looking? No.

?? Obviously you can. Why do you think you can't?


they're talking about the times when you can't.

you can't identify every friendly dog just by looking at it, you'll identify dogs that are also acting friendly and doing things people have found preceded positive experiences.

you can't identify every sick dog just by looking at it, otherwise the vet would never find additional issues.

and so on


This is a case of "When people thought the Earth was flat, they were wrong. When people thought the Earth was spherical, they were wrong. But if you think that thinking the Earth is spherical is just as wrong as thinking the Earth is flat, then your view is wronger than both of them put together. "

Yes, it's wrong to assume that you can tell everything about that dog by looking at it - but it's so much more incomparably wrong to consider that you can tell nothing about friendly or sick dogs based on looking at them!

Observing the behaviors and dog breeds that correlate with previous friendly and unfriendly experiences is a very useful, somewhat reliable predictor of how likely this particular dog is to be (un)friendly. Sure, if you know this particular dog, then that should supersede any group information, but if not, then that's all the information you have; it is useful information that correlates with (future) reality that matters to you, and so it's prudent to use it. Appearing "nonjudgmental" by throwing away information and judgment is simply stupid, and bound to get you bitten, at least in the metaphorical way.


If we are talking about observing behaviors, then the original assertion that you can't tell anything about humans by looking is so silly that it wouldn't have been made. We are not talking about observing behaviors, but appearance.


You can tell a lot about humans by their appearance.


Yes, I agree. I originally commented to disagree with this claim someone made above > It's really only people where you can't tell what it does/is from the outside. Cars, trees, animals, mountains... everything else, if it looks a way it acts that way.


> Yes, it's wrong to assume that you can tell everything about that dog by looking at it - but it's so much more uncomparably wrong to consider that you can tell nothing about friendly or sick dogs based on looking at them!

but nobody in this entire thread was saying that, and my contribution was an additional attempt to further point out why the other sibling responses were missing this

At this point I'm totally content in talking past each other on this topic, as I honestly don't know why it's not clear that the subset which is undetectable by mere observation is statistically significant.


Empiricism is basically the art of inference based on outward appearance in different contexts.

I’d argue we learn everything based on appearance. You have to do more than just observe things superficially and take context into account, but for something to be measurable it has to have some sort of outward appearance, whether that be direct, as in something you can see with your naked eye, or indirect, as in something we need to measure through some other instrument.


You can, probabilistically. If you randomly sample cars from all the street-legal models in the US (for example), and sort them by how fast they are based on appearance only, you would not be 100% correct, but you'd probably do much better than if you claim that "You can't understand very much at all just based on how things look" and sort them randomly (the prior being that each car is equally likely to be anywhere in the distribution).



Ultimately we know nothing, so one can always nit-pick any statement into an oversimplified oblivion.

There are also cakes that look like cars, model sets which have miniature mountains, on and on. But in typical nature, it works well enough to breed and see what comes next.


Tactical Nihilism sounds a lot like solipsism. "We can't ever not make generalizations, so we shouldn't even try to make statements."


[flagged]


I have a few thoughts about that.

1. If you look at the consensus in the field of psychology, human beings are probably not that good at making the kind of assessment you're talking about with any degree of accuracy. For instance, once you form a belief that there is a difference in competence between people of different races, you will become heavily biased to attend to examples which confirm your belief, and you will be biased to disregard examples which contradict your belief. Humans are just not perfect bayesian reasoning machines.

2. In the social contract of western, liberal society, we have broadly agreed that people should not be judged by their immutable characteristics. So in a sense, yes you are not allowed to "pattern match by race", according to the standing rules of society.

3. This rule helps protect us from the dangers of racial thinking. For instance, a belief about the relative merit or competence of one race can easily jump the gap from being descriptive to prescriptive when one goes around treating people by different standards based on their group identity. There is so much variance within every demographic group that even if there were some statistically significant difference between groups, you would be doing a massive injustice to so many of the individuals in that group to judge them by the "representative member".


> broadly agreed

I would say ruthlessly coerced.


edit: I mean, hypothetically, if you were to think black people are less competent on average than white people, why would that be the case?


Not saying it's one way or the other, but hypothetically someone could be using some IQ statistics, even if there are issues regarding data collection:

https://en.wikipedia.org/wiki/Race_and_intelligence

Yes, obviously these statistics would not definitively tell anyone if there is any difference in actual genetic races, or if the factors are socioeconomic, but for someone looking for job candidates it would actually make no difference, since IQ is not generally easily changed or trained in a developed adult.

Of course, they could also just be using their own personal experience and claim that they are basing their beliefs on it.


Hmm. Are there truly no racist IQ studies which control for the net worth of their participants?


He's not really thinking that. He is just concerned about society being too politically correct about specific things. I.e. you could probably completely discriminate against someone by height, hair color, ... but not by race specifically, for some reason.


Good example - we do have statistical data that proves beyond doubt that men have more automotive accidents than women. Yet in the EU it's illegal for insurance companies to charge men more for car insurance than they would charge a woman for the same insurance, because apparently that's sexist. But somehow no one has any problems with charging a younger driver more than an older driver, even if they both have identical amount of experience, because data supports that younger drivers have more accidents regardless of their experience behind the wheel - and yet that's not ageist and isn't banned.


I doubt that there are so many "older drivers" with comparable levels of experience to "younger drivers" that such an evaluation of risk could be beyond doubt. Are you sure such data exists?


Absolutely. A 25 year old driver who passed their test a day before will pay a much, much smaller premium than a 20 year old driver who also just passed their test (even if the 20 year old driver already has 2 years of experience and no accidents, their insurance will still be more expensive). I obviously don't have access to the actuarial data used by the insurance companies, but it's very easy to check this with many online insurance comparators.


Why are you so sure that such data exists if you have never seen it?

I'm already slightly inclined to suspend my disbelief, because I know a little bit about developmental psychology that sort of corroborates what you're claiming about the ageism thing, however, if you want to start discriminating based on race, you have to be ready to bear a massive burden of proof when you make racist claims. Would you be so blindly accepting if I threatened your life, liberty or property for no reason other than the color of your skin? Wouldn't you want conclusive evidence, quality controlled and beyond a reasonable doubt?


The data the parent is claiming to exist is discrimination based on age, which is indeed rather simple to confirm.

>however, if you want to start discriminating based on race, you have to be ready to bear a massive burden of proof when you make racist claims. Would you be so blindly accepting if I threatened your life, liberty or property for no reason other than the color of your skin? Wouldn't you want conclusive evidence, quality controlled and beyond a reasonable doubt?

Would you not say the same about age?


Yes, I would, but I think we're veering too far off the original topic of race.


>>Why are you so sure that such data exists if you have never seen it?

Because when you ring up an insurance company and ask why (as a 20-year-old) your insurance is higher than that of a 25-year-old (with also zero driving experience), you will be told that it's because of your age. Unless they are lying to you of course, which is hard to prove or disprove.

>> however, if you want to start discriminating based on race

Wait, what? That took a weird turn?


>that took a weird turn

Well, you turned first. Here we are all talking about racism, and then you claim age inversely affects driver risk. In context, I thought you were somehow trying to convince me that circular reasoning and appeals to authority ought to dispel any doubts I have about the validity of racism, but it sounds like you were trying to say something else, and I could be doing a better job of understanding what you are trying to say.

>unless they are lying

They may just be ignorant. Maybe someone else lied to them and they are just parroting the lie. Remember, the people who price insurance exist in a world where polygraphs are admissible evidence in a U.S. court of law; misconceptions are everywhere and I think we must be cautious, lest our generalizations be hasty.


That's a very good and interesting example, thank you.


In the U.S. we already tried to institutionalize racial discrimination. From reading the history, I think that "too much political correctness" is preferable to "separatist paramilitary groups and city police departments having shooting wars in my neighborhood."


Problem is the same brand of PC is being exported to places without that history, because of the power shift it causes.

Also see: invoking emergency military powers during peacetime (as reaction to overblown external threats) to suppress political dissent.


What is a specific example of a PC behavior which you don't appreciate? I want to understand why you are concerned.


I disagree, and the dichotomy is a false one, since suppressing the issues just leaves people to take justice into their own hands in the way you described.


The dichotomy is real, and you are comparing apples and oranges.

Yes, there are white separatist paramilitaries operating within the United States who have cropped up in reaction to affirmative action and social justice, but it won't blow up the same way the black panthers did. It's apples and oranges, because today's U.S. white separatist paramilitary groups are not targets of the U.S. government. The FBI isn't assassinating members of the Aryan Brotherhood nor are politicians passing gun control laws to curb the flow of weapons into their territories. The U.S. government is not actively persecuting white supremacists. There is no COINTELPRO for white supremacists.

There is no escalation here, no conflict. Nazis get a free lunch in the U.S.


>I really wish people would start viewing Racism as a willingness to let that primitive part of the mind be in command

If this were the case, babies would be racist by default, but they are not. Racism is taught.


That... what?

"And when one user uploaded an image of the Democratic presidential candidates Andrew Yang and Joe Biden, Yang who is Asian American, was tagged as “Buddhist” (he is not) while Biden was simply labeled as “grinner.”"

They are not removing images that categorize blacks as blacks. They are removing images that are incorrect.


Wait, no, that is not how racism starts.

Racism starts out of preferential treatment of some people. Most racist people have a "root event", where the other party didn't get condemned. It may even have happened several times, with various consequences (rape, molestation, repeated racketeering, etc).

Then they report the rape/molestation/racket to the police. The police don't act, because they suspect the complainant is racist. This has the opposite effect to the one desired: it doesn't condemn the criminal, and it puts the burden on the victim.

Then they seek security in their lives. If the probability of experiencing rape, molestation, racketeering or other crime is high enough that the person has been confronted with it in their own life, then the best first approximation for judging whether someone might be a criminal is whether he's free or in jail. That's in well-functioning societies. In non-functioning societies, where criminals are not in jail, the second approximation for protecting yourself is secondary indicators, inferred from grossly racist statistics. Here racism is born.

It also explains why racist people often still have friends from the group they are supposed to mistrust: they had some opportunity to assess those individuals' probity and trustworthiness. That's why mixing people by compulsory rules works. But mixing people is a poor palliative. It doesn't solve the underlying problem of a non-functioning society, so it doesn't make people less racist, despite giving the appearance that people work together.

Racism is often practiced with regret by those who exert it, but it is their second-best approximation for seeking security. Racism is the result of a non-functioning society, which is caused by being more lenient on criminals of a chosen category. I'm pretty sure it is possible to engineer racism by letting a made-up group get away with crimes.


Ehm, care to cite some sources for your sweeping and arbitrary claims, especially regarding "root events"?


I thought about it, but if I cite sources, I will be citing racist people. Not something people enjoy reading.

But you can find the same discourse from most of the world's famous racists, at least the ones currently living.

If I have to cite a source, Tommy Robinson (here come the downvotes) constantly talks about (and breaks) the omerta surrounding the trials he reported on, which is how the rape gangs could operate for so long with so little hindrance. Go through the list of people recently suppressed from Youtube and you'll find the same (unaddressed) logic.

Concerning root events, they are often private, so you have to know the person personally. But go ask people around you when they became racist; they'll often tell you about one specific event. It's interesting to go interview an enemy.


So based on anecdotes? Aren't you doing the same thing you are accusing them of?


These "root events" might just as well be a justification for pre-existing racism.


The AI isn't biased, the curators were.

The people who curated the first training set used subjective words like 'attractive' to tag the images, which means the AI tagged all images it deemed 'attractive' according to the people who made the training set. As this is a very biased and homogeneous group, the AI turned out biased. Maybe if they randomly sampled millions of people from all countries to create the training data, then they could effectively train an AI to guess what YOU might find attractive. However, even then I somehow doubt it. Beauty is in the eye of the beholder. We don't consciously know the rules of what we are attracted to, nor would an AI gain secret insight just because you supplied it with enough words and images.

If they'd stuck with simple classifications like 'black' 'white' 'man' 'woman' then they would have less subjective judgement values about the original training set.


> The AI isn't biased, the curators were.

No, it's not "The dataset is biased, the AI isn't" it's "The dataset is biased, THEREFORE the AI is biased".


Garbage in, garbage out


Data is data and cannot make value judgments, so I'm not sure how it can be racist. If the data reflects how racist people label things, it still is not racist data; it is data that is perfectly valid for what it is: how racists label things. Removing the images from ImageNet seems absurd.


In practice, "data" means a set of observations collected by humans - who have inherent biases that influence the collection.

I'm not talking specifically about cultural biases like racial stereotypes etc. Confirmation bias is a thing, there's nothing stopping a researcher from making those observations that confirm their favoured theory and contradict all others.

Then of course there is sampling error. Just because you have a set of data that you collected "at random" doesn't mean that this dataset is representative of the population you are interested in. Let alone the fact that it's very hard to collect a truly random set of observations about processes that we don't understand to begin with.

The kind of data you're describing is an ideal, a principle that we all aspire to. It's far from the reality in practice.


I mean, it's very common colloquially to describe non sentient things as racist because they're based on either purposely or obliviously racist ideas and stereotypes.


Do you hear yourself?! ImageNet was not based on either purposely or obliviously racist ideas and stereotypes. If it were, sure, I would have some patience for the claim that it was racist. But it was not.


Exactly! They got subjective results because they went beyond facts.

For example, it could have been interesting if they tagged each person with their actual religion or propensity to "grin"


Going beyond facts is an important component of “being human”, so in that regard it makes the AI seem more intelligent. The problem is the AI is 100% honest with what it thinks, unlike a human.


Not sure you guys understand AI and ML. Neither of these actually "think" for instance. By way of example, the only thing this AI really does, at base, is classify things into categories that the curator told the AI to classify them into via the dataset. I mean, that's pretty much it. There is no bias. There is no lack of bias. It's just blindly doing what the curator told it to do. Don't mistake that for "thinking". That's more AGI, which is not likely to happen in the lifetime of anyone reading this post.


I agree that going beyond the facts is a good thing when humans are doing critical thinking and when being careful and transparent about their doing so.

However this was creating a dataset for classification. Something that specifically should not go beyond the facts. (The basis of the model is the strength of the facts it's built upon)


In your opinion, are qualitative labels like "attractive" or group membership such as someone's skin color or ethnicity within the domain of facts or not?

I.e. is the issue in the fact that the particular annotators were subjective and annotated some particular facts wrong (and the labels for skin color could be filled from, for example, census data which is self-reported) or that these whole type of labels shouldn't be attempted to be made as they're not facts?

If the latter, what do you think about the categories like "adult" or "sports car" that are also part of ImageNet; can we draw an unambiguous factual boundary between images of adults and teenagers, or "normal" cars and sports cars?


Except only including facts can still reinforce unfair bias. For example, it's true that there are more men than women in software engineering. Whether someone is a software engineer or not is a fact. If you have a "representative" dataset with only facts, then it's possible that an AI would have a higher chance of labeling men as software engineers than women, simply because it begins to associate masculine facial features with software engineering.

In my eyes, this result would reinforce unfair bias, and a thus well-designed AI should avoid this (i.e. with all else equal, a well-designed AI should suggest the label "software engineer" at the same rate for both men and women).
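
To make "the same rate for both men and women" concrete: that criterion is usually called demographic parity, and it's easy to measure. A minimal sketch in Python (clf, X_test and gender are hypothetical stand-ins for a fitted classifier, its test features, and a held-out group attribute):

    import numpy as np

    def label_rates_by_group(clf, X_test, group):
        # Fraction of samples in each group that get the "software engineer" label.
        preds = clf.predict(X_test)  # 1 = "software engineer", 0 = not
        return {g: preds[group == g].mean() for g in np.unique(group)}

    # label_rates_by_group(clf, X_test, gender)
    # e.g. {'female': 0.12, 'male': 0.31} -> the label is suggested far more often
    # for men; demographic parity would want these two rates to be close.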


If it's true that there are more male software engineers, then why is it wrong for the AI to "learn" that?

If the AI did start classifying masculine features biased towards software engineers, then the AI has learnt the above fact, and thus can be used to make predictions.

The moral standpoint that there shouldn't be more male software engineers than female engineers is a personal and subjective ideal, and if you lament bias, then why isn't this kind of bias given the same treatment?


The moral standpoint isn't that there shouldn't be more (or less) male software engineers.

The moral standpoint is that there shouldn't be an AICandidateFilter|HumanPrejudicialInterviewer that only coincidentally appears to beat a coin flip because it has learned non-causal correlations, which it then uses to filter out qualified, stereotype-defying human candidates because they don't look stereotypical enough on the axes that the dataset - which almost inevitably has a status-quo bias - suggests are relevant.


So, it depends on what you want to do here. If the task is just "predict whether the person is a software engineer", I'd say go ahead, bias it away. Here, anything that boosts accuracy is fair game to me.

But if the task is, say, pre-screening candidates, this becomes a more ethically/morally tricky question. If and only if sex is not a predictive factor for engineer quality, you would expect to see similar classifier performance for male and female samples. Given that assumption, significant (hah) divergence from equal performance would be something to correct.

Of course there are other issues to handle, such as the unbalanced state of the dataset and so on.
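
A rough sketch of the check described above (y_true, y_pred and sex are hypothetical arrays from your own evaluation set): if sex really carries no signal about engineer quality, the per-group accuracy should come out similar, and a large gap is the divergence to correct.

    import numpy as np

    def per_group_accuracy(y_true, y_pred, group):
        # Accuracy of the pre-screening model, computed separately per group.
        out = {}
        for g in np.unique(group):
            mask = group == g
            out[g] = (y_pred[mask] == y_true[mask]).mean()
        return out

    # per_group_accuracy(y_true, y_pred, sex)
    # A gap like {'male': 0.88, 'female': 0.71} would be the kind of significant
    # divergence worth correcting before using the model for pre-screening.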


It is wrong because there is no causal relationship between the two so none can be inferred.


[flagged]


You are making a logic error. When there is no causal connection between two items, it is entirely possible that there is still a correlation that allows you to say something about populations. But you will never be able to say something about an individual. And that is where all these arguments founder: we put population information in, in order to engineer features that then allow us to make decisions about individuals. For the cases where feature engineering can dig up causal connections this works wonders; for the cases where it does not, or gives you apparent connections that are not really there, you end up with problems.


'black' and 'white' aren't simple at all.


>'black' and 'white' aren't simple at all.

well seems to be very simple - either there is a hat or there isn't

https://twitter.com/Dk3Kbball/status/1174115660219072512

That illustrates the depth of the issue - while directly racist data can possibly be removed, there are many proxy/correlated attributes (as any insurance/mortgage/etc. company knows), and finding correlations is the core nature of machine learning systems (at least ML as it is currently known to humans).


Tangentially related, this reminds me of the woman who had her picture photoshopped by people from different countries to make her look beautiful in their eyes:

https://www.buzzfeed.com/ashleyperez/global-beauty-standards


I think it's important context that "had her" means "she initiated", rather than "was imposed upon her":

"Make me beautiful," she said, hoping to bring to light how standards of beauty differ across various cultures"

She specifically asked for some kind of alteration, so leaving the picture unaltered was implicitly not an option. Furthermore, this assumes a random (but distributed) selection of Fiverr artists represents worldwide beauty standards.


You'd probably get a better result by adding in the base culture/race/whatever other cultural identifier of the tagger for these subjective tags.

The definition of beauty varies across culture, and while it varies from person to person, there are some aggregates that might be informative.


> The AI isn't biased, the curators were.

This is pretty close to "does the Chinese room know Chinese". https://en.wikipedia.org/wiki/Chinese_room Going too deep down that path is fun for philosophy, but not super useful...


> If they'd stuck with simple classifications like 'black' 'white' 'man' 'woman'

Interestingly, I was just thinking about the "black and white" issue today. A while ago I watched an interview on Youtube with a young woman who was born in Japan. Her parents were American and growing up she knew she was different, but basically it boiled down to "the English teacher's daughter", or "part of the foreigner family". But she didn't speak English very well and her parents didn't speak Japanese very well, so she identified much more strongly with her friends in the area than with her family.

When she was 12, her family moved back to the US. When she went to school the new children were encouraged to say what nationality they were. One person said they were Canadian. One said they were Mexican. When it came to her turn she said, "I'm Japanese". One child in her class corrected her: "No you aren't. You are black." Of course, this was a source of considerable confusion for her.

One of the things that's kind of weird about Japan is that some Japanese people have very dark complexions -- darker than what would be called "black" in the US. Some have very light complexions. In my opinion, considerably more "white" than I am (and most people would call me "white" I think). In fact, after I moved to Japan, I realised that I wasn't white at all. I'm pink. I mean, I'm super pink. I seriously never noticed it until I spent 5 or so years living in a place where nobody else was pink. I literally avoid wearing red now because it makes me look like a tomato.

When I was about 3, my best friend was black. My grandmother asked me, "Do you notice anything different about your friend? Their hair or something?" I didn't understand the question. All of my friends had different hair. Then she said, "Can you tell that he is black and you are white? Or do you not think about it that way?" My grandmother was just curious, but this question completely blew me away. From that time, I realised that people didn't each have their own skin color. Instead, they were categorised and my friend was different than me. I think my friend noticed that I looked at him differently (though not necessarily badly). We stopped being friends for a long time. Somewhat strangely, I just recently realised that he was one of my best friends in high school... I never actually made the connection that my friend at 3 and my friend in high school were the same person until recently. I wonder if he ever realised.

But anyway, there really isn't a classification like "black" and "white". When I have a tan, my skin is darker than my wife's. But she is very tan and so her skin looks brown. When I compare my skin to hers, my skin is still red, even though it is dark. But if I were to compare my skin to an indigenous American, I think their skin will be more red than mine when tan -- and less pink when not. And my wife, when compared to someone indigenous from Africa, has more of a curry brown than a deep chocolate brown.

As I was saying, Japanese people don't call dark Japanese people "black". They don't treat them as a different race. Neither are very pale Japanese people "white", even though they may say "your skin is very white". They aren't different. What I found interesting about the young girl who discovered that she was black was that she wasn't "black" in Japan. Or, at least, not "black" in the way we use the term in America. Japan does not have that cultural history (it's got enough of its own baggage, thank you very much ;-) ).

If a computer were to compare skin tones objectively, it would simply tell you the color. If it decided to classify in terms of "black" and "white", it would be classifying based on cultural labels, not color.


Perhaps instead of using human categories of race, we should ask the neural nets to provide those categorisations?

It would be based on a larger selection of individuals (non-local), based purely on appearance, and unbiased by human preconceptions (we are really good at processing faces/facial features).


I'm curious. What benefit would that kind of categorisation bring?


The problem seems less about racist bias specifically and more about unbelievably dumb tags generally in the first place.

"Stunner"? "Looker"? "Mantrap"? Or even trying to tag people's images with categories like Buddhist, grinner, or microeconomist?

What were they thinking?! Clearly these tags were never curated in any remotely responsible way -- for quality, for sensitivity, or just usefulness at all -- and I'm shocked they were ever intended for academic or research use. No wonder image recognition gets a bad rap, with input data like this.


There are ~32,000 tags. No surveillance system is using ImageNet tags to classify people into Buddhist or not-Buddhist. Most researchers ignore these tags and focus on the 1,000 classes, and know that 32k-class performance is not good (and these artists have no intention of making it work at all). What they are attempting with this Art Project is as much research as it is activism. Note that "mantrap" is defined in its synset as "a trap for catching trespassers", and that you are bound to find weird stuff among over 30k categories (imagine what you could say with the 32,000 most popular words in French...).

This is a photo in question: https://memepedia.ru/wp-content/uploads/2019/09/imagenet-1.p...

This is the route the network took:

person, individual, someone, somebody, mortal, soul (6978) > female, female person (150) > woman, adult female (129) > smasher, stunner, knockout, beauty, ravisher, sweetheart, peach, lulu, looker, mantrap, dish (0)

So it was (politically) correct on the first three categories, and the last one was either a crapshoot (and she could also have gotten to the subcategory of "prostitute" > "streetwalker, street girl, hooker, hustler, floozy, floozie, slattern") or she really is posing in a common "beautiful woman"-way. (The global description for this route is "A very attractive or seductive looking woman" and often triggers for females with tilted heads and lip curls).

You can turn any faces dataset into a labeled face color dataset, so if a black person being subclassified as "negro" is problematic bias or encoded racism, then all such datasets are suspect. Noisy labeled data is the norm, not some horrible exception to be avoided at all costs.
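
If you want to see for yourself where labels like "mantrap" come from, the ImageNet categories are plain WordNet synsets, and you can inspect them with NLTK. A small sketch (which senses and glosses you get depends on the WordNet release NLTK ships with):

    # Requires: pip install nltk, then nltk.download('wordnet') once.
    from nltk.corpus import wordnet as wn

    for syn in wn.synsets('mantrap', pos=wn.NOUN):
        print(syn.name(), '-', syn.definition())
        # Print the hypernym chains; the route quoted above
        # (person > female > woman > smasher/stunner/...) is the tail of one of them.
        for path in syn.hypernym_paths():
            print('   ' + ' > '.join(s.lemma_names()[0] for s in path))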


The fact that we have a machine learning algorithm that labels _humans_ as smashers, prostitutes, or convicts is a problem, irrespective of whatever technical justification can be made for it.


But who artificially created that problem? The artists. There is no marketing company that has intelligent billboards scanning the public for prostitutes. There are no researchers seriously using CV to classify convicts (at least not in the West, and not with ImageNet). That could be a malicious usage problem. This is either a non-problem or Armageddon for all ML CV datasets, because you certainly can use most CV datasets to train a crappy classifier to output offensive labels. If I train a people photo tagger using a dataset built for combating monkey poaching in Africa, then who is at fault? Certainly not the researchers who published that dataset with the idea that the data would be used with common sense and scientific rigor, not adversarially -- to make a political point attacking the very existence of that data. The "exposed" bias is trivial.

It is the technical justification that should be all that matters for a canonical academic dataset. Science does its best to be apolitical, but then politics ("red bull drinking white men train racist and sexist classifiers") is forced upon it, and we can't really have a productive conversation about bias and ethics anymore.

AI needs common sense knowledge of the world to improve. Censorship so that science does not offend our sensibilities would only make it so that Google Image Search (a machine learning algorithm) does not return any images of people when you search for "prostitute". Heck, the AI would never learn the difference between a male and a female prostitute. Destruction of accessible knowledge so we (aka: people on Twitter who think AI is the Terminator, or the director of the internet) don't get offended by some primitive ML model-as-art-project forced to make errors or awkward classifications. That's a sentence the academic ML community could do entirely without. No benchmarks or duckface selfies would be hurt. No unfortunate third-world souls hired to scan 20 million+ internet-crawled images for wrongthink, only for the machine to do the unsupervised learning in a hug box, not a black box. Oi mate, you got a loicense fer that label?

Just wait until the activists find out who wrote the first 100 8's added to MNIST. Nobody but MIT would be associated with her, if they found out what she did.


> What were they thinking?! [...]

In addition to their statement, linked in a thread below [0], what they were thinking is pretty out in the open in the abstract of the original paper:

"We introduce here a new database called “ImageNet”, a largescale ontology of images built upon the backbone of the WordNet structure. ImageNet aims to populate the majority of the 80,000 synsets of WordNet with an average of 500-1000 clean and full resolution images. " [1]

That paper is now 10 years old, and WordNet itself had its first release somewhere in the 1980s. A lot happens in research over a decade, and much of the research into dataset biases is relatively recent in that sense. It can hardly be blamed on the researchers that people use their datasets in production merely because "they're large and out there in the open". As their statement indicates, they're actively working on the issues that have arisen during the last few years.

Most research datasets are fit for a very specific purpose, to me the larger problem here is two-fold: "AI" fearmongering on one side and startups/corporations on the other side lacking the necessary skills to select and filter their training data. Hopefully that'll be a matter of the past once the hyped data science field reaches maturity.

edit: WordNet seems to include said terms to this day; [2] for example still lists "mantrap" associated with "beauty" in the synset for "a very attractive or seductive looking woman". Although I'm unsure how they should handle these cases - both are words, and that's one definition used in practice, even if we find it reprehensible. They seem to have removed the n-word, but that's about the only removed instance I can find in a cursory search of sensitive or insulting English words. Maybe they should clarify sensitive annotations like they do with vulgarity.

[0] https://news.ycombinator.com/item?id=21054770

[1] http://www.image-net.org/papers/imagenet_cvpr09.pdf

[2] http://wordnetweb.princeton.edu/perl/webwn?s=beauty


But removing the images from ImageNet, an inanimate set of images that cannot make value judgements, will surely fix this problem...

Our culture is broken and nobody is willing to stand up to insanity or even admit to it. People are too happy to do completely insane and senseless things if it saves them from the social lynchings.


Not sure about "mantrap", but I could imagine "stunner" and "looker" in the pages of a magazine - a red-carpet celeb photo captioned "Jane Doe looking stunning! And with her hubby - what a looker!".

I guess the problem with things like "microeconomist" is that other tags such as "fireman" might be valid (if a picture contained clues about vocation, for example). Furthermore, rather than condemn the label, we might consider not demonising a neural net and instead taking its results at face value; in this case, an indication that most labelled microeconomists in the dataset are suited white men.


Buddhist seems pretty easy to categorize? If you see someone standing in front of a Buddhist temple wearing a Kāṣāya (had to google the actual word: the orange robes), they're pretty likely to be a Buddhist, right?


Agreed. These are not descriptive labels you can infer from an image.

They may be, however, labels generated by users from a context.


Their statement: http://image-net.org/update-sep-17-2019

> Each synset is classified as either “unsafe” (offensive regardless of context), “sensitive” (offensive depending on context), or “safe”. “Unsafe” synsets are inherently offensive, such as those with profanity, those that correspond to racial or gender slurs, or those that correspond to negative characterizations of people (for example, racist). “Sensitive” synsets are not inherently offensive, but they may cause offense when applied inappropriately, such as the classification of people based on sexual orientation and religion.

> So far out of 2,832 synsets within the person subtree we’ve identified 438 “unsafe” synsets and 1,155 “sensitive” synsets. The remaining 1,239 synsets are temporarily deemed “safe.”

They've also completely disabled Imagenet downloads while they remedy this.


Great, I can see some of this being useless (some parts of the unsafe dataset), but if they cull the "sensitive" portion, this may induce performance regressions.

I need to find an ImageNet archive now.


It doesn't really touch the 1000 class "Imagenet" that's commonly used in computer vision.


Some of the classes are subcategories of ILSVRC classes in the WordNet hierarchy. So by removing images of persons in categories that are considered inappropriate, the resulting classifier will end up less likely to recognize those as images of people at all. I'm not sure whether that's a better outcome.


Ah, then it's fine... probably.


This should be the link to the article.


Really the bias isn't the true problem here so much as the lack of epistemology in how it is being used. By definition, associations work by blind correlations - expecting anything other than stereotyping is a foolish misuse of the tool. It will be wrong for outliers because it tries to get it right for most cases regardless of cause - like the skin cancer detectors that saw rulers as a sign of cancer because most of the reference images of cancer had a ruler in them.


Exactly. The large amount of faith currently placed in probabilistic models that do not have common sense ways of eliminating factors which are extremely unlikely to be causal (like the presence of a ruler causing cancer) disturbs me. There is something that humans do that we have not quite figured out how to teach computers yet, at least as far as I can tell, which is to get them to evaluate whether their model is not just compatible with their observations, but also with other models and prior knowledge about the objects being observed.

I think we'll get there at some point, and I'm not exposed to the most cutting edge AI research, but it seems like AI is currently very overhyped and deeply flawed for many of the applications people would like to use it for.


Because these algorithms don't know that they are classifying cancer. The label they see is just 1 or 0. For all they know, based on their inputs, you may want to classify ruler/non-ruler images.

To achieve what you want, semantic structure must be used as labels instead of just categorical labels.

Assuming we have a sane AI that now knows it's looking for cancer, knows what that means (from digesting medical textbooks, papers and generic text corpora), can detect rulers, and knows the two are not causally linked from ruler to cancer, we could make the model output "dataset diagnostics", like "Warning! The cancer label in this dataset is implausibly correlated with the visual presence of a ruler". Or "Warning: 99% of your hotdog images show a human hand. Evaluation on this dataset will ignore errors on hotdog images without hands!"

Context does matter though. If there's an orange fluff on a tree trunk, the AI is right to look at the environment and infer it's a squirrel.
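
The "dataset diagnostics" warning described above can actually be approximated today without any semantic understanding, as long as you already have (or can detect) the auxiliary attribute. A toy sketch, with hypothetical has_ruler and is_cancer boolean arrays, one entry per training image:

    import numpy as np

    def spurious_correlation_warning(attr, label, attr_name, label_name, threshold=0.5):
        attr = np.asarray(attr, dtype=float)
        label = np.asarray(label, dtype=float)
        r = np.corrcoef(attr, label)[0, 1]  # Pearson correlation of the two indicators
        if abs(r) > threshold:
            print(f"Warning: '{label_name}' is implausibly correlated with "
                  f"'{attr_name}' (r = {r:.2f}); the model may learn the artifact "
                  f"instead of the thing you care about.")
        return r

    # spurious_correlation_warning(has_ruler, is_cancer, 'ruler visible', 'cancer')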


It's because of another human trait: paranoia.

And between the Dem/Trump back-and-forth, racial paranoia is strong in America.

The hidden motive is "what if ImageNet is used to (approve loans|decide parole|pre-populate killer police drone biases)!", but the way this is done damns any emerging tech that is less than perfect.

If I heard loaded handguns were given to toddlers, I'd attack the practice of giving handguns to toddlers - not the existence of toddlers!

That said, between tech hubris, "self-regulation", and the toxic partisan culture war, I'm not sure what could change. A real AI guideline, rather than a vague pop-sci-fi document, or tech policing (dictating appropriate usage only)? Preferably incubated in a country where nuanced discussion can be had without the flamewars...


Has this been confirmed? Not doubting the story, but the article doesn't provide a source.

Edit: Might be this, which was a week ago / before the Roulette blew up: http://image-net.org/update-sep-17-2019

> We are in the process of preparing a new version of ImageNet by removing all the synsets identified as “unsafe” and “sensitive” along with their associated images. This will result in the removal of 600,040 images, leaving 577,244 images in the remaining “safe” person synsets.


If they delete half the data, the probability that the remaining data sets are rubbish is quite high. You also wouldn't want someone hand-picking the sets in the first place.


Bias in AI (whatever the source or nature of the problem) is a real issue that needs addressing, but taking the relevant training data off of ImageNet seems like a perfect example of papering over a problem to avoid really confronting it. We will need to find ways to make AI programs that can see beyond the biases (about humans or otherwise) that will always exist to some degree in real-world data.

If ImageNet contains bias that leads to embarrassing results - that's fine! That gives us a readily available toy instance of the problem to study. Taking that away could actively harm anti-bias research.


I strongly prefer they keep the bias. It shows plainly that these models have zero intelligence, they're just pattern matching over specific datasets.

I don't want the appearance of fairness (introduced by human dataset curation) to be mistaken for "intelligence".

Keeping the bias would hopefully cause people to think more critically about why such bias exists in the model in the first place.


Disagree that it proves zero intelligence. Intelligent humans also have bias.


I'm not saying that intelligence and bias are orthogonal. That discussion requires a much deeper consideration of human cognition and psychology :)

I'm just saying that model bias is a very easy thing to explain (usually, data imbalance).

You can also fiddle with the label ratios to change the bias - which is also a good way of showing that the models aren't really intelligent.


Bias: prejudice in favor of or against one thing, person, or group compared with another, usually in a way considered to be unfair.

If a data set is flawed, it should be fixed, but when ML finds objective patterns our culture finds subjectively unpalatable and we choose to "fix" them, we fall prey to and re-enact the same grade of self-delusion exhibited by, for example, "the church" in the dark ages. Computer science is already low in terms of accountability & rigor compared to other fields without these kinds of suggestions.


You use the phrase 'objective pattern' here, but that conflates causation and correlation.

For example: black men in the US are more likely than white men to have criminal records, but this in no way means that black men are "objectively" more criminal.


Let me catch up with the double-think. They commit more crime on average but they are not more criminal? What kind of olympics-grade mental gymnastics are these?


You're rewording what they said. If any group of people is more policed than another group of people, said group is more likely to have a criminal record. Doesn't mean they're more "criminal" than any other group but more of a reflection on the current state of the criminal justice system.

To paraphrase Warren Buffett: "If a cop follows you for 500 miles, they're going to find a reason to give you a ticket."


Could be true for trivial offenses like a ticket, but if we talk about real crime that's not really an excuse. I'd be more inclined to think black people commit more crime because of socioeconomic factors, as poverty correlates with crime.


...which, even if true, would circle back to 'being black' being a correlation to but not causative of criminal records.


It really depends on how you define crime. Usually crime and prosecution is defined in such a way as to impact the lower classes more than the upper classes.


Murder is rather easy to define: if there is a dead body with holes in its head, there is a murder. The tragedy is that murder is far more prevalent in the black community, partly because of underpolicing.


I wouldn't be so fast with that assertion.

If your life is a cesspit and you can count on the authorities to be part of the problem, where does murder end and self defense begin?

See the song "I shot the sheriff." I'm most familiar with the Eric Clapton version, but googling it recently suggests to me it was originally written by Bob Marley.

https://en.m.wikipedia.org/wiki/I_Shot_the_Sheriff


This is a rather naive understanding. For example, there were nearly 3000 deaths in the 9/11 attacks, and about 4800 soldiers have died in the subsequent war with Iraq.

There are plenty of dead bodies with holes in their heads, but it's not clear how one accounts for crime among those 7800 dead.


"The offending rate for African Americans was almost 8 times higher than European Americans, and the victim rate 6 times higher. Most homicides were intraracial, with 84% of European Americans victims killed by European Americans, and 93% of African Americans victims were killed by African Americans.". https://en.wikipedia.org/wiki/Race_and_crime_in_the_United_S....

How does overpolicing explain both arrest for murder rate and victim of murder rate have a significantly higher prevalence within black community?


To start with, 'more likely to have a criminal record' and 'more likely to have committed a crime' are not actually the same thing.


In Memphis, TN, black people commit more murders than about everyone else. They don't even report many of them since (a) tourism and (b) it's already most news stories. Most whites in poverty turn to drugs, do petty crimes, higher than national average suicide rate in some places, etc. Blacks, recently Latinos (esp cartels), were unique in doing all kinds of violence in their own communities plus others at random. Whites usually use violence on others externally to benefit their own group, esp via military. Most people writing comments like yours try to explain the difference with all kinds of stuff about oppression, environments, etc pinning blame in one place. Yet, the whites so oppressed they're doing drugs and/or killing themselves aren't murdering other people in their neighborhoods due to their oppressive environment nearly as much. The hard data argues a difference that's not from white people since they'd have been doing it first if it was.

I have a feeling most of you went to white schools or lived in white areas instead of black ones in bad places. Most in my area that went to the latter, including black folks, know where this comes from. I already described it here:

https://news.ycombinator.com/item?id=20660278

You folks need to stop just dismissing all blame on the black side as "cops must just always make stuff up or police them harder." There's plenty of that worth calling out. However, the biggest hole in your argument is murder in high numbers. You can watch folks that don't kill people or who get shot all day in whatever selective way you want. In the end for poor neighborhoods, the black ones you watch will have committed many more murders hitting many more black victims than the white ones.

And, if gangs are involved, the dealers and killers will often be told they're in for life. They'll be perpetuating it in a way that has nothing to do with white people. And you should be calling them out hard on that if you really care about thugs getting arrested by cops and/or protecting their victims, who are mostly black. Instead, it's whites (doing damage of some kind) to blacks non-stop in your comments or the media, versus the thug culture of blacks murdering black people all the time. And, it seems, something like that with machismo and the cartels in Latino areas. I have less experience with them, though. I didn't get to, since we had to leave the neighborhood after a small confrontation led to one of them putting a contract on our heads that was canceled (maybe) when we moved.

So, I'm calling bullshit unless you're saying there's both racist over-policing by cops (mostly but not all white) and thug culture coming from blacks creating killers that take out tons of black victims. Something similar for Latino areas. Then it fits the data.


If we say the dataset is flawed, it is important that we pin down why and how exactly it is flawed. By what methods exactly are we going to source "not-flawed" data?


> "the church" in the dark ages

This isn't a thing and you should read more about what you think happened in this period of western history.


I said nothing of Western history and am familiar with the subject. Also, this is completely unrelated to my point so you will receive no further responses on it.


It's the foundation of your point.


They could store it somewhere non-public for research purposes, but taking it off the website was the right call.


"For example, the program defined one white woman as a “stunner, looker, mantrap,” and “dish,” describing her as “a very attractive or seductive looking woman.” Many people of color have noted an obvious racist bias to their results. Jamal Jordan, a journalist at the New York Times, explained on Twitter that each of his uploaded photographs returned tags like “Black, Black African, Negroid, or Negro."

They claim this is AI, but earlier in the article, it states that mechanical turk was used. Mechanical Turk is basically just people getting paid pennies or fractions of a penny to tag these photos.

Many people on Mechanical Turk are from countries where English is not their first language and they don't have the same idea of racism as we do here in the US, which would explain the racist tags mentioned in the article.

This doesn't really show us anything about AI and racial bias; it shows more that other countries still aren't up to our standards of what we consider decent.


The AI connection is that these are the types of datasets that feed AI systems. It shows that you have to be careful when curating your dataset, so that you don't introduce further bias.

I just hope the masses don't glom onto this, start shrieking that "AI is racist", and attempt to take AI completely out of the picture.


As someone who spent years in those villages, I can tell you that is exactly how they view the world. The only posturing comparable to that of the so-called enlightened anti-racist comments that I have ever heard or read about was Marie Antoinette saying "Let them eat cake". She ended up without her head.


edit: I conflated ImageNet with the art exhibitors; it is the former who are culling the images as a result of public reaction, not the latter.

This is a really bizarre project. I had seen some really offensive race-based labels, but I thought revealing the ugliness of the system was part of the point of this project?

But besides that, the results just seemed completely scattershot; I half expect the artists/exhibitors to reveal that x% of the results were randomized. Last week, I tried it myself after seeing another Asian user display results that were entirely Asian slurs (e.g. gook, slant-eyed). I uploaded my own very Asian-looking photo and got "prophetess", along with very vague labels, such as "person" and "individual".

Maybe the exhibitors cleaned the data/results by the time I tried it, but I used it just a few hours after seeing the other Asian user's results, so I'm doubtful that her tweet/complaints were enough on their own to change up the dataset that same day.


> I thought revealing the ugliness of the system was part of the point of this project?

It 100% is. From the link to the artist's website:

"Things get strange: A photograph of a woman smiling in a bikini is labeled a “slattern, slut, slovenly woman, trollop.” A young man drinking beer is categorized as an “alcoholic, alky, dipsomaniac, boozer, lush, soaker, souse.” A child wearing sunglasses is classified as a “failure, loser, non-starter, unsuccessful person.” You’re looking at the “person” category in a dataset called ImageNet, one of the most widely used training sets for machine learning."


I was just about to edit and correct myself; it is ImageNet who has made the decision to delete the offensive images, after the reaction to the exhibitors' work. It's too bad the exhibitors didn't make their own mirror/cache of the dataset. Judging from some tweets I saw, I think this project really helped people to understand how much of current artificial intelligence is human-driven. It's not a sentient computer deeming you to be a "slant-eye", it's a bunch of random Internet users. (not that this makes you feel better about the world, but at least the hate's coming from an expected source)


While interesting, this is not surprising. One of the most commonly utilized datasets for learning ML is the Boston Housing dataset - https://www.kaggle.com/c/boston-housing

In it there's a problematic feature tagged simply as "black" and it is defined as the proportion of blacks by town.

Any pricing model that is built off of this dataset is inherently racially biased because the data has been collected and the feature tagged - but what's the alternative? Not to collect the information? Or collect it but completely ignore this feature?
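
For the curious, it's a couple of lines with pandas to look at (and drop) that feature. The file name below is a placeholder for whatever CSV you grab from Kaggle, and the column is named "black" per the comment above (it appears as "B" in some copies), with "medv" as the usual price target:

    import pandas as pd

    df = pd.read_csv("boston_housing.csv")  # placeholder path
    print(df["black"].describe())           # the race-derived feature
    print(df.corr()["medv"]["black"])       # how strongly it co-varies with the price target

    # Dropping it is trivial, but as the replies below note, that alone doesn't
    # remove the bias if other columns act as proxies for it.
    df_no_race = df.drop(columns=["black"])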


Sensitive features should not be collected or used in applications where they introduce bias.

For example, a medical screening NN may find race to be a valuable feature for the prediction of illness; but a health insurance assessor should not.


Due to spurious correlations, it could still be helpful for the insurance assessor so that they can use bias mitigation techniques. Otherwise, it might learn something about zip code or something else that leads to a similar outcome as having race as an input variable. Just removing a sensitive variable does not suffice for preventing unwanted bias.
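
One rough way to test this point: after removing the sensitive column, check whether the remaining features can still predict it. A sketch with scikit-learn (the DataFrame df and the "race" column name are hypothetical, and the AUC score assumes a binary attribute):

    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    X = df.drop(columns=["race"])   # the "non-sensitive" features actually fed to the model
    y = df["race"]                  # the sensitive attribute you removed
    auc = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                          scoring="roc_auc", cv=5).mean()
    print(f"Sensitive attribute recoverable from remaining features: AUC ~ {auc:.2f}")
    # An AUC well above 0.5 means proxies (zip code, etc.) are present, so simply
    # deleting the column did not make the model blind to race.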


Then you are confusing two kinds of bias: one is data bias and the other is racial bias. If you remove the data bias, you are by definition introducing a racial bias by imposing your will on reality.


AIs are like every other computer program - perfectly naive. If you train them on bad data, you get bad results. Garbage in, garbage out.


Of course, but the issue is also data availability. It is not easy to get large sets of data to be used in training AI. Imagenet was a great attempt to provide some useful data.


My comment was not intended to be a ding on Imagenet. Data comes from somewhere, and no source is without bias because no human is without bias.


What happens when we train them on an incredibly wide set of randomly-sampled data, which should show no built-in bias, and uncomfortable conclusions emerge? Do we censor the AI, castigate someone else in the chain for the problem, or just deal with reality as it presents itself?


In this example, where would you get the tags for the photos from, if not people who are biased?


Why assume that the people are "biased"? If the people are labelling things the way that people generally do, the software will return labels that are useful. Any kind of filtering needed to censor this for people's personal comfort should be implemented on top.


yes, when I’m preparing my prestigious project with my Ivy League peers I make sure not to sanitize or vet my inputs so as not to bias the results of my programs or publication success. truly it is remarkable that the training set we chose to use is racist but no single person who could have stopped this or be held responsible is


If your data set was compiled by a single person, well, easy enough.

If your data set was compiled by thousands of mechanical turk workers, well, you got a lot of people to blame a little bit. As it goes, "everybody is a little racist sometimes..." and apparently that shows on a big enough data set.


amazing, how when even more people are involved with selecting and tagging the input, it becomes even less likely any one person was to blame.


I love articles and papers like this. They keep illustrating that the people writing them do not get it:

Guess who classified the blonde white woman as "a dish" on Mechanical Turk? Do you think it was Billy Bob from the swamps of Louisiana who makes $12/hour? Because $12/hour is impossible to make classifying images. Or do you think it was a dirt-poor Indonesian or Filipino or Chinese or Indian, for whom that's the best job he or she could possibly get, and that's exactly how they view the world - and there are over three billion of them?


Your characterization of these nationals aside, what makes you think any of them even know what “dish” means in this context?

Sincerely, a Malaysian Chinese who’s seeing this slang for the first time.


"Dish" is almost an archaic term at this point. So either someone's pretty old grandparents are working for MT, or people are translating a term from another language using an old dictionary.


It is colonial (British) English. I heard it again recently and it took me a few moments before I recalled it from my childhood. Apparently now "snack" is used the same way.

I would guess the classifiers went to school in India or Pakistan.


How were the mechanical turk workers asked to annotate?


Since the amount of speculation in this overall thread is pretty high, I suggest simply having a look at section 3 of [0]. It describes the construction of the MTurk task and how they quantified the confidence of answers they got.

[0] http://www.image-net.org/papers/imagenet_cvpr09.pdf


That's the key question here, isn't it?

If faces are like other images in the database, then according to this article turkers were presented a word and a group of images, and asked to click all the images that showed said word. https://qz.com/1034972/the-data-that-changed-the-direction-o...


ImageNet's creators were likely incredibly naive in how they approached and fixed all of this.

from the article "Jamal Jordan, a journalist at the New York Times, explained on Twitter that each of his uploaded photographs returned tags like “Black, Black African, Negroid, or Negro.”"

Which indicates that the vocabulary they used has four words to describe a black person's skin color. This severely complicates things and leaves all kinds of unintended biases on the table before even getting to the human element. Though I doubt they used a dictionary at all, because 'Black African' isn't really a word, and it isn't found in a few dictionaries I tried online - if the article is accurate.

They should have curated the vocabulary first, value judgements like 'attractive' should be removed because it means something different for everyone judging the images. Synonyms should be collapsed into one to prevent weighted biases, etc.


I'm actually quite surprised that they didn't collapse the synonyms though.

Though value judgements should not be removed. They should have been separated and placed together with some tagger metadata as they might actually be informative then.


In NLP (specifically word vectorization, a la word2vec) there's a famous test of whether or not your training has worked properly: you take the vector for "king", subtract the vector for "man", and add the vector for "woman". If your model is properly trained, you should end up with a vector close to "queen" or "princess."

I wonder if similar things can be done to address specific (i.e. racial or gender) biases in computer vision.
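
For anyone who wants to try it, the test is a couple of lines with gensim and a publicly hosted embedding (the model name comes from the gensim-data catalog; the exact neighbors vary by corpus):

    import gensim.downloader as api

    wv = api.load("glove-wiki-gigaword-100")  # small pre-trained word vectors
    print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
    # "queen" normally tops the list; running the same call with other word
    # triples is exactly how embedding-bias probes work.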


There is a similar word-embedding test that definitely rustles people's jimmies:

Doctor - Man + Woman = ?

What normally comes out is Nurse. What "they" think should come out is Doctor!

By "they" I mean people that get upset by this.


Yeah, besides the fact that this compositionality is relatively unique to word2vec, research on the biases pre-trained models express is pretty available. Linked a few below for those interested. Most of the issues are down to the same phenomenon discussed here in the context of ImageNet, the input texts were biased and the algorithm learned said bias.

[0] https://arxiv.org/abs/1607.06520 "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings "

[1] http://proceedings.mlr.press/v97/brunet19a/brunet19a.pdf "Understanding the Origins of Bias in Word Embeddings"

[2] http://matthewkenney.site/biases.html "Google word2vec biases"


The longer piece this one is based on is much more comprehensive in both discussion and concrete examples: https://www.excavating.ai/


Why take down the images? It seems really that one should take down entire classifications, i.e. remove the tags.


ImageNet is a mapping from tags to images, so removing the tags means removing the images.

An image of a "bad person" wasn't tagged by someone looking at said image and deciding that "bad person" was the best possible description. It was generated by searching Google Images for "bad person" and removing obviously incorrect results (e.g. when there's nobody in the image).

Researchers have been using it to learn the inverse mapping from images to tags with some success, but in its construction the dataset is not naturally suited for that task.


As I understood it, and as confirmed by Wikipedia (for whatever that's worth), the images were hand-annotated.


Only the second step "removing obviously incorrect results" involved human annotators.


Because the woke scolds have spoken and nobody wants to stand up to them. So better to do patently insane things than risk becoming collateral damage in a social lynching.


I think the whole training of such an AI was just badly designed. Aren't all first-impression judgements prejudices, just as all racist biases are? If I say a person is attractive or inspiring based purely on their looks, that is a positive judgement - a bias based on appearance and a few traits I subconsciously perceive and perhaps cannot even explain - and the same goes for the negative ones, and I guess for some people this association also extends to race. Basically, an AI was trained to map the prejudices of its inputs, which means creating a snapshot of the subconscious average of whoever is behind the Mechanical Turk. Everyone has prejudices of all sorts which usually are not manifested externally, but in a classification process like the Mechanical Turk one they can surface much more readily.


The article mentions https://imagenet-roulette.paglen.com/ but doesn't link to it. Just put in a photo and see what category of person the algorithm sorts them in to.


One of my favorites is searching google images for "immigrant" versus "expat". Guess which one is whiter!


I don’t quite get it, why is “Buddhist” more objectionable than “grinner”? Surely there must be more egregious examples than that if they decided to pull 600,000 images? And if they find the tag “Negro” offensive can’t they just replace it with “black”, “of African origin” or whatever the currently allowed term is? Tagging a black person as black does not seem very objectionable in a country that provides racial breakdowns for official statistics.


When white people are labeled for their other qualities, and non-white people are labeled for their (perceived) ethnicity, that’s racist. Which is to say, unfair. Not nice.


Were those two examples labeled by the same people? If there really is systematic bias why don’t they mention some statistics? For example, 90% of white people lacked a race tag while 80% of black people had one?

Also, it doesn’t seem like there has been any evil intent here, nor did I see anything about pernicious consequences. It seems a bit overblown to accuse people of racism over something that is unintended and theoretical. Just update the tags.


Shouldn't everyone just be labeled for everything? Seems like we should just add "white" to the tags of white people.


I think the ironic thing here is the algorithm is intentionally designed to make a superficial judgement on the image it’s presented with.

If it was trained to identify “human” or otherwise categorize the things in the picture, that’s exactly what it would output.

Next you can train it to attempt to guess basic traits like gender or ethnicity; of course, this can only be done based on the RGB values of a 2D array of pixels. Interestingly, the NN will not merely be using skin color but building probabilistic weightings based on any statistically significant features. For added controversy, it's probably even possible to invert parts of the network to suss out how it's weighting various facial features toward different labelings.
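
A common way to do that "inversion" is a gradient saliency map. A PyTorch sketch (model and img are hypothetical: any trained image classifier and a normalized 1x3xHxW input tensor):

    import torch

    img = img.clone().requires_grad_(True)
    scores = model(img)                       # class logits, shape (1, num_classes)
    scores[0, scores.argmax()].backward()     # gradient of the top class w.r.t. the pixels
    saliency = img.grad.abs().max(dim=1)[0]   # per-pixel importance, shape (1, H, W)
    # Bright regions of `saliency` show which facial features the network is
    # weighting most heavily for that label.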

Lastly the images could be labeled for things like profession of the subject. A good intentioned effort to perhaps detect things like a lab coat could mean a doctor or scientist was followed by oblivious Turkers with predictably poor results.

The problem of course is not with the images but with the particular labeling hierarchy, and with allowing opinionated labels as well as labels which may be true for the particular subject but which have no distinguishing visual features to actually codify ("portrait of a macroeconomist") - in other words, garbage in, garbage out. ImageNet calls this the "imageability" of the label.

Calling this racism of course is entirely inapt, because there is no judgement being made whatsoever. Even if the sampling method was well designed and the labeling factually accurate, the system would still produce output which could be considered offensive. Again, because the entire point of the algorithm is to generate statistical assumptions based on a single image.

My conclusion is that some superficial judgments are algorithmically useful and hopefully less controversial. “White male human, ~55 yrs old, 180lbs”. Even things like analyzing clothing and guessing where the picture was taken. Iff the clothing is a uniform, identifying the profession (police, fire, paramedic)

But you have to know where this goes off the rails. Bad enough to label indistinct portraits with the subject’s profession, let’s not do inane things like labeling them with how subjectively attractive the labeler thinks the person is, their economic status, maybe even a 1-10 scale of how threatening they look or if they look like a criminal or not! </facepalm>


The "app", mentioned in the article, but not linked, is hosted here (until Friday), if you want to try it out: https://imagenet-roulette.paglen.com/

Apparently I'm either an insurance agent (in a grey t-shirt) or a surgeon (in a red hoodie).


“This exhibition shows how these images are part of a long tradition of capturing people’s images without their consent, in order to classify, segment, and often stereotype them in ways that evokes colonial projects of the past,” Paglen told the Art Newspaper.

Odd that the images were removed for being categorized as the project intended.


wouldn't it make more sense to add more images, not remove existing data?


If I understand correctly, they are removing incorrect tags, not the data. It's just news because some of the incorrect tags are racist bias. If they removed a bunch of cats tagged as dogs, you wouldn't hear about it.


IIUC the database is intended to identify the actual contents of images, not collect data about stereotypes. Even keeping the categorized face images in a separate database wouldn't be that useful, since turkers are not a random sample.


I think it's temporary. Sensible defaults to protect users who don't understand potential weaknesses of the model. Also, if someone does train an AI on a bunch of photos and ends up with a biased result, it won't be ImageNet's maintainers' liability.


It seems to be largely the tags that are biased. Long term, I'm sure people want big unbiased databases of images of people, but it will take a lot of time and effort to build them and ensure they aren't biased.


For the artist, ImageNet’s problems are inherent to any kind of classification system. If AI learns from humans, the rationale goes, then it will inherit all the same biases that humans have. Training Humans simply exposes how technology’s air of objectivity is more façade than reality.

An AI being accused of bias tends to really mean it works. Removing bias from AI, I've noticed, requires hardcoded 'fixes' rather than refactoring of the algorithms, and in my view the result becomes yet another human-curated classification system, no longer AI.
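
For what it's worth, the hardcoded fixes in question are often nothing more than a post-hoc filter over the label vocabulary. A toy sketch, with made-up label names, of what that ends up looking like:

    # Toy illustration of a hardcoded post-hoc fix: the model's ranked labels
    # are filtered against a human-curated blocklist before anything is shown.
    # The label names below are invented for illustration.
    BLOCKED_LABELS = {"offender", "slob", "swot"}

    def filter_predictions(ranked_labels):
        """Drop blocked labels, keeping the rest of the ranking intact."""
        return [label for label in ranked_labels if label not in BLOCKED_LABELS]

    print(filter_predictions(["offender", "grinner", "beard"]))
    # -> ['grinner', 'beard']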


Your point takes a narrow view of correctness and also shows why you need to take a systemic view of the issue.

The problem is not that the AI is "inaccurate". The problem is the second order effect: when you build systems on top of this AI's predictions, you cement the input social biases into future systems in a way that is very hard to remediate.

The real problem is how to avoid accidentally creating systems that amplify existing social biases.

I'm very specifically using the term "social bias" to distinguish from "bias" as a term in ML, because they are very different problems.
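
The amplification is easy to see in a deliberately crude toy model: suppose each generation's training data is collected via the previous model's outputs, and the collection over-selects whichever group the model has already seen more of (modelled here, arbitrarily, as weighting by the square of the current share). A small initial skew then compounds:

    # Deliberately crude feedback-loop sketch: the next dataset over-samples
    # whatever the current dataset already over-represents, so a small skew
    # away from the true 50/50 split compounds. All numbers are illustrative.
    dataset_share = {"A": 0.6, "B": 0.4}  # true population split is 50/50

    for generation in range(5):
        # Arbitrary amplification rule: selection weight grows faster than the
        # share itself (here, share squared).
        weights = {group: share ** 2 for group, share in dataset_share.items()}
        total = sum(weights.values())
        dataset_share = {group: w / total for group, w in weights.items()}
        print(f"generation {generation}: group A is {dataset_share['A']:.2f} of the data")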


This comment reminded me of a really interesting article about bias in AI: https://www.chrisstucchio.com/blog/2016/alien_intelligences_...


I wonder if the article author knew what he was hinting at when he chose the example of the art project calling someone a "dish"???

https://www.youtube.com/watch?v=HEX7xsYF1nA

(And now I'm gonna be listening to 90s electro-disco all day...)


That's newthink at a level I find hard to comprehend. Censoring datasets instead of improving them.

"We have too many pictures of white people, remove them!"

"We don't have enough pictures of non-white people, add them!"

I'd have gone for the latter, and have let the set be biased until fixed.


I uploaded my picture, and it was tagged "beard".

Which is fine, I have a beard. But then I read the definition:

> beard: a person who diverts suspicion from someone (especially a woman who accompanies a male homosexual in order to conceal his homosexuality)

:-/


Any way to download those images before they get deleted?


Everybody seems to be focusing on whether the image recognition is racist or biased. I think there is a much more fundamental problem: the image classification simply does not work for a lot of images of people!

For example, search for images of scuba divers: https://upload.wikimedia.org/wikipedia/commons/9/94/Buzo.jpg is labeled choreographer; https://dtmag.com/wp-content/uploads/2015/03/scuba-diver-105... is labeled picador (the horseman who pricks the bull with a lance early in the bullfight to goad the bull and make it keep its head low).

How about searching for images of dancers? https://eugeneballet.org/wp-content/uploads/2018/10/Alessand... is a nonsmoker; https://www.ballet.org.uk/wp-content/uploads/2017/09/ENB_Eme... is a speedskater; https://www.ballet.org.uk/wp-content/uploads/2018/10/WEB-ENB... is a plyer; https://rachelneville.com/wp-content/uploads/2018/11/10.14.1... is a mediatrix (a woman who is a mediator).

How about images of lumberjacks? https://alaskashoreexcursions.com/media/ecom/prodxl/Lumberja... is a skinhead; https://static.tvtropes.org/pmwiki/pub/images/lumberjack_591... is a beard (a person who diverts suspicion from someone, especially a woman who accompanies a male homosexual in order to conceal his homosexuality); http://cdn.shopify.com/s/files/1/0234/5963/products/I4A0032-... is a flight attendant; https://previews.123rf.com/images/rasstock/rasstock1411/rass... is an asserter, declarer, affirmer, asseverator, avower (someone who claims to speak the truth).

How about teachers? https://media.edutopia.org/styles/responsive_2880px_16x9/s3/... is a shot putter (an athlete who competes in the shot put); in https://c0.dq1.me/uploads/article/54231/student-classroom-te... the kid is a nonsmoker and the teacher is a psycholinguist (a person, usually a psychologist but sometimes a linguist, who studies the psychological basis of human language); https://media.gannett-cdn.com/29906170001/29906170001_578035... is a girl, miss, missy, young lady, young woman, fille (a young woman); https://media.self.com/photos/5aa9743e19b7c01d73149d50/4:3/w... (almost the same picture as the previous one) is now a sociologist!

How about searching for pilot images? https://news.delta.com/sites/default/files/Propel%20Embedded... is a parrot (a copycat who does not understand the words or acts being imitated); https://pilotpatrick.com/wp-content/uploads/2017/10/workday_... is a beekeeper, apiarist, apiculturist (a farmer who keeps bees for their honey)!!! https://imagesvc.meredithcorp.io/v3/mm/image?url=https%3A%2F... is a boatbuilder!!

And some random ones: https://media.spiked-online.com/website/images/2019/08/06154... is a sister/nun; https://www.english-heritage.org.uk/siteassets/home/visit/in... is a deacon, Protestant deacon (a Protestant layman who assists the minister); https://minervasowls.org/wp-content/uploads/2018/04/Romans-M... are identified as morris dancers.


The more you impose completeness of morality into a data set, the more inconsistent your results will be.

Gödel arbitrage will be very profitable in the future.


Of course it was removed. 4chan was having way too much fun with image classification recently :D


They will be hated for it, but they do provide a really important service here, even if it is just pulling down the pants of people trying to classify the world.


Reminds me of the early 2000s, when people would say "but he's gay" as if that had anything to do with their profession or aspirations in life or the topic at hand, as if that specific piece of metadata defined them more than the other pieces of metadata.

This AI seems to be doing that to things it has determined are black people: it applies a bunch of synonyms for black people, some English, some Spanish, some phenotypes, some of them terms that have fallen out of favor in some parts of the world but not in others, while completely eliminating other metadata.

I'm not sure ImageNet's response of removing "sensitive" adjectives is capable of fixing this. Mechanical Turk workers, English-speaking, Spanish-speaking, academics, all using terms that aren't universally agreed upon?

That doesn't really address what is happening.


[flagged]


Perhaps when we have real artificial intelligence, whatever that is. Right now we have pattern matchers, and it is not so much the 'truth' as it is stereotypes.



