I’ve been wondering how Creative Commons applies in ‘big data’-ish use cases....

edent · on July 15, 2019

Those are reasonable questions. At work, we release lots of data under OGL (Open Government Licence) which is CC compatible.

For my personal stuff, if you'd like a different licence, I'm happy for you to pay me for a more restrictive one. But if you build an ML model using my open data, I expect that model to be released under a similar licence.

goblin89 · on July 15, 2019

Didn’t know about OGL, it does look suitable for this purpose.

To (partially) answer myself: contrary to what I implied, CC-BY does cover this base if (for example) the creator of the dataset accepts a note in the product’s “About” documentation as sufficient attribution.