Digital Humanities

Data Visualization

Posted by Emily Crockett

This week’s class on data visualization was very exciting for me, as I think data is the bee’s knees. That being said, I understand many of my classmates’ hesitation towards it, particularly with a title like “When A Machine Learning Algorithm Studied Fine Art Paintings, It Saw Things Art Historians Had Never Noticed.” Coincidentally, I had already read the paper that this blog post summarizes in Information Retrieval, a required class for my Information Science master’s. I think the biggest difference between the blog post and the actual paper is the importance of search, which the blog post minimizes. In fact, the research area the Saleh paper belongs to is Content-Based Image Retrieval (CBIR). CBIR refers to searching by what is actually in the image rather than by any textual description attached to it.

In the field of image retrieval, there are two important measures: recall and precision. Recall asks, of all the known matches to a search query, how many the search algorithm returned. Precision asks what share of the returned results are actually relevant. If 10 articles are returned but only 4 of them are relevant to the query, that is a 40% precision rate, which is not particularly good, whereas if only two articles are returned but both are relevant, it is a 100% precision rate. Generally, as recall goes up, precision goes down. With this in mind, the study was rather successful, because their algorithm was able to return correct images. The idea that the algorithm “saw things art historians had never noticed” is reductive. While the authors do mention that no art historian had pointed out this “connection,” it isn’t a particularly big part of their research. Ultimately they’re interested in increasing both recall and precision for art historians using search engines, and in their future work they’re interested in searching by the principles of art, which may be more useful for artists.
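
To make the recall and precision definitions above concrete, here is a minimal Python sketch. The image names, the eight "known matches," and the two searches are all made up for illustration; they are not from the paper. The first search mirrors the 10-returned, 4-relevant example and the second mirrors the 2-returned, 2-relevant one.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a single query.

    returned: items the search engine gave back
    relevant: items known to be correct matches for the query
    """
    returned = set(returned)
    relevant = set(relevant)
    hits = returned & relevant  # returned items that are actually relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection: 8 images are known matches for the query.
relevant = {f"img{i}" for i in range(1, 9)}

# Search A casts a wide net: 10 results, only 4 of them relevant.
search_a = ["img1", "img2", "img3", "img4"] + [f"other{i}" for i in range(6)]

# Search B is cautious: 2 results, both relevant.
search_b = ["img1", "img2"]

p_a, r_a = precision_recall(search_a, relevant)
p_b, r_b = precision_recall(search_b, relevant)

print(f"Search A: precision={p_a:.0%}, recall={r_a:.0%}")
print(f"Search B: precision={p_b:.0%}, recall={r_b:.0%}")
```

In this made-up case the wide search finds more of the known matches but with lower precision, while the cautious search is perfectly precise but misses most of the matches, which is the trade-off described above.
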
Of course, the question that comes up with all of these projects is who cares: what does this actually do for the discipline? While I can’t answer that definitively, I do think it would be interesting as a tool for pieces of art with an unknown artist or provenance. For example, if someone at a flea market finds a piece of art, maybe they’d be able to take a picture of it (in the same way that people scan book barcodes to determine their worth at a thrift shop) and come away with a possible artist. It may even be able to uncover some hidden histories. By this I mean that perhaps one could use the tool to find similar paintings and ultimately determine that they’re by an unknown woman artist. Or, in cases where we know that the wife in a husband/wife team didn’t get appropriate credit, perhaps we’d be able to determine specifically which pieces the wife worked on, and maybe how much of each piece. Another coincidence this week was the idea of bag of words and topic modeling, both of which are very important concepts in information retrieval. As we mentioned in class, it’s unclear exactly how these might work for art history, but it could be interesting to use them to query the journals or letters of an artist.
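
For anyone who hasn’t run into bag of words before, here is a rough Python sketch of the idea: ignore word order and just count how often each word appears. The sentence is invented for illustration and is not from any real artist’s letter, and the `bag_of_words` helper is my own naming, not a library function.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Count word occurrences, ignoring order, case, and punctuation."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


# Hypothetical line from an artist's letter (made up for this example).
letter = "The light here is yellow, and the yellow fields are full of light."

bag = bag_of_words(letter)
print(bag.most_common(3))
```

Comparing counts like these across many letters or journal entries is the starting point for both keyword search and topic modeling.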

In terms of data visualization, it’s interesting how seeing something visually can bring out new information. The Tableau dashboard I made below compares the Tate’s acquisitions by year, split up by the gender of the artist. Seeing it visually, it’s very clear that 1975 was a big year for acquisitions overall, and while works by male and female artists seem to follow similar curves, there’s an incredible difference in the number of each. However, none of this is really shocking information, so once again we are brought back to the question of who cares and what this does for the discipline. Again, I can’t answer this, but I hope that as people improve the tools and the data, we’ll be able to do something really interesting that can move the discipline forward.
