Context-dependent, or "reified", assertions are a pain point for sure. I come from the perspective of cultural heritage data, where context is king. Which expert made this attribution for this painting? Who owned it _when_? According to which archival document? etc.
Almost all the engineering problems cited in the original post are still there, but graph-based models remain the least painful way of doing this, particularly when trying to share data between institutions. Example: https://linked.art/model/assertion/
The OP mentions property graphs as a way around this problem. They can be seen as natural extensions of RDF quads, which in turn build on plain RDF triples (subject / predicate / object) by adding a fourth element naming the graph - i.e. the context - in which a statement holds.
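To make that concrete, here's a minimal sketch in TriG (the quad serialization of RDF), with entirely made-up URIs - the real linked.art model is JSON-LD and considerably richer:

```trig
@prefix ex: <http://example.org/> .

# Bare triple: the claim with no context attached
ex:painting42  ex:attributedTo  ex:rembrandt .

# Quad: the same claim, scoped to a named graph that stands for the assertion
ex:attribution7 {
  ex:painting42  ex:attributedTo  ex:rembrandt .
}

# Statements *about* that assertion (who made it, on what evidence)
ex:attribution7  ex:assertedBy   ex:someExpert ;
                 ex:citesSource  ex:archivalDoc1915 .
```

Roughly speaking, a property graph gets at the same thing by hanging key/value properties directly off the edge instead of introducing a named graph.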
I find that a fascinating reaction given how rapidly %>% has been taken up across a large segment of the R universe, to great excitement! Personally, I find it far MORE legible than endlessly-nested function calls.
It results in code that more closely resembles the executed order of operations (e.g. filter -> mutate -> group -> summarize). Context is also key: it's most often used for data processing pipelines in specific analytical scripts or literate-code documents, less so when defining generalizable/testable functions in packages (again, just a personal perspective - YMMV of course).
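A rough sketch of what I mean, with a toy `sales` data frame invented on the spot:

```r
library(dplyr)

# invented example data
sales <- data.frame(
  year    = c(2022, 2023, 2023, 2023),
  region  = c("NL", "NL", "BE", "BE"),
  revenue = c(100, 120, 80, 90),
  cost    = c(60, 70, 50, 55)
)

# Nested calls: read inside-out, last step written first
summarize(
  group_by(
    mutate(
      filter(sales, year == 2023),
      margin = revenue - cost
    ),
    region
  ),
  avg_margin = mean(margin)
)

# Piped: reads top-to-bottom, in the order the steps actually run
sales %>%
  filter(year == 2023) %>%
  mutate(margin = revenue - cost) %>%
  group_by(region) %>%
  summarize(avg_margin = mean(margin))
```

Both produce the same grouped summary; the difference is purely in how the reader traces the steps.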
You nailed it. dplyr is better the further you are from doing heavy-duty data analysis or creating production code. If you're writing some simple transforms to put data into a report, fine - someone is probably going to want to look at that at some point, and it's much, much easier to understand. But for anything else I stick with data.table.
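For comparison, the same toy aggregation from the sketch above in data.table - terser and very fast on large tables, though the `DT[i, j, by]` form takes some getting used to:

```r
library(data.table)

# same invented data, as a data.table
sales <- data.table(
  year    = c(2022, 2023, 2023, 2023),
  region  = c("NL", "NL", "BE", "BE"),
  revenue = c(100, 120, 80, 90),
  cost    = c(60, 70, 50, 55)
)

# filter (i), compute and aggregate (j), grouped (by) in one expression
sales[year == 2023,
      .(avg_margin = mean(revenue - cost)),
      by = region]
```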
A related aside: while forgeries - deliberate imitations made to mislead and deceive - are exciting, they represent only a tiny fraction of art attribution questions. In practice, attribution is more about distinguishing between artists working in the same period than about someone trying to fool the eye at several centuries' remove.
For example, the Rembrandt Research Project infamously set out to separate genuine from fake Rembrandt paintings in his corpus of known works, under the false assumption that there would be a lot of 18th-, 19th-, and 20th-century forgeries. In fact, most of the "non-Rembrandt" cases they found were not later imitations but works by his own students or contemporaries - or works co-produced by Rembrandt and another hand. The result - dismantling the project's original assumption - proved revolutionary for our understanding of studio practice in the period, even though the project turned up very few "forgeries" as such.
I'm working on pulling the images now, like I did for the Rijksmuseum CC0 dump. FWIW a good place to host that torrent is the Internet Archive - it's great for discoverability.