
Problems with data:

- as a representation of something else, it can be incorrect (meaning error)

- as a domain of decision input, it can be misleading (sampling error)

- for questions of any significant complexity, it's the only way to scale decision-making capacity

- in an economy where actors differ in scale and information asymmetry can be leveraged to financial advantage, data gathering is incentivized even or especially when it contributes to coercive transactions, violating transaction invariants and reducing the competitive parity that disciplines the market

- it gives agents the illusion that they understand, leading to overconfident actions

How does knowledge differ from data in these respects?

- Knowledge is validated by sharing. Facts known only to one or few are not considered known.

- Knowledge can only be shared after it's embedded into the overall meaning of a culture.

- Knowledge can only scale to well-understood, well-remembered past events that are simple enough to be compared with other such events.

Most people make personal decisions based on knowledge. A few people can make assessments/decisions based on data (though many decide from knowledge and then justify the decision with data). Organizations have to reduce knowledge to data to distribute authority and avoid bureaucratic capture.

People are more valuable when knowledge is more valuable, but knowledge only really has an operational advantage when value lies more in conserving states or staying small than in producing new ones or going big.



The "problems with data" listed above are not really problems with data. I feel that's what Rich Hickey was alluding to in that discussion (and no, I didn't feel that he and Alan Kay were talking past each other).

> as a representation of something else, it can be incorrect (meaning error)

- So here, you're saying that "knowledge" may be incorrect. "Sun observed at this position in the sky at various times of day" is data, whereas "Sun moves around the Earth" is (wrong) knowledge. Yes, data can contain errors (e.g. incorrect measurements). But Rich Hickey was saying that the fact that data doesn't carry the "interpretation" with it is a feature, not a bug!

> as a domain of decision input, it can be misleading (sampling error)

- Right. But at least it gives you the tools to validate the decision process and identify errors or potential weaknesses. If you bundle the interpreter with the data and expose only the decision, any error in the interpreter automatically invalidates all the data, and it becomes hard to tell whether you're looking at a sampling error, an interpretation error, or simply an error in the original measurements.

> it gives agents the illusion that they understand, leading to overconfident actions

- On the contrary, KNOWLEDGE does that.
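
To make the "data without the interpreter" point concrete, here's a rough Python sketch. Everything in it is made up for illustration: the measurements, the names `OBSERVATIONS`, `geocentric`, `heliocentric` and `peak_hour`. The only thing it tries to show is that when raw observations are kept apart from the interpretation, a wrong interpreter can be swapped out and the same data still stands.

```python
from typing import Callable, List, Tuple

# (hour of day, sun elevation in degrees); hypothetical values, not real measurements.
Observation = Tuple[int, float]

OBSERVATIONS: List[Observation] = [
    (6, 0.0), (9, 35.0), (12, 60.0), (15, 35.0), (18, 0.0),
]


def geocentric(obs: List[Observation]) -> str:
    """One interpretation of the observations (the wrong one)."""
    return "the Sun moves around the Earth"


def heliocentric(obs: List[Observation]) -> str:
    """A later interpretation of the very same observations."""
    return "the Earth rotates, so the Sun appears to move"


def interpret(obs: List[Observation],
              interpreter: Callable[[List[Observation]], str]) -> str:
    """Apply an interpreter to the data without modifying the data."""
    return interpreter(obs)


def peak_hour(obs: List[Observation]) -> int:
    """A check that depends only on the raw data, not on any interpretation."""
    return max(obs, key=lambda o: o[1])[0]


first_reading = interpret(OBSERVATIONS, geocentric)
second_reading = interpret(OBSERVATIONS, heliocentric)
```

If only `first_reading` had been stored, replacing the interpreter would leave nothing to re-check. With `OBSERVATIONS` kept as plain data, `peak_hour` and any future interpreter can still be run against it.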


I like this rebuttal. It disentangles data from interpretation and knowledge. This distinction helps us solve problems associated with data and is a core tenet of science and problem solving.

Generating more data while not jumping to conclusions is how we avoid getting stuck in misconceptions or plain ignorance.



