The Data Science of Art

My larger theory is that data=art and art=data, and that Artificial Intelligence will be nothing more or less than a continued exercise in the artistic-historical integration of new mediums and forms. However, this post isn't another rehash of those ideas. This one is about the data of art.

Here I offer some insights from computationally analyzing art (my own analysis, plus pointers to others' analyses). Quite a few excellent and very detailed data science analyses of art have come out recently, because more and more collections are being digitized and their data faceted. Here's a fantastic write-up of MoMA's collection. Michael Trott offers a very detailed analysis of ratios and other hard facets of a couple hundred years of visual art. Last year Google made waves with Deep Dream, which was really an analysis of how neural networks work. Turn that analysis on its head and you get art creation (my first point above... the lines between art and data are more than blurry!)

On to some pictures depicting data about pictures!

A cluster analysis of 534 of my visual works

While an artist has intimate knowledge of their own work, it is likely highly biased knowledge. We often become blind to our habits and tend to forget our average work. Doing a large-scale, unemotional analysis of all the art is enlightening.

The cluster analysis above was created using machine learning to cluster images by similar features (color, composition, size, subject matter, etc.). I did everything in Mathematica 11/Wolfram Language. A cursory view shows density in deep reds and faces/portraits. My work tends to be more varied as I move into dry media (pencil, pastels) and remove the color (or the algorithms simply don't find as much similarity). And, of course, the obvious note here is that this is just a quick view, without attempting to optimize the machine learning or feed it cleaned-up images (color correction, etc.). What else do you notice about the general shape of my work? (I am currently running a larger analysis on 4500 images instead of just the 534 here; I'm curious whether things will look different.)
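For readers who want to poke at the idea without Mathematica, here is a minimal Python sketch of clustering images by a feature. It is not what I ran: it assumes a single crude feature (each image's average color) and plain k-means, and the names `mean_color`, `kmeans` and the toy "images" are invented purely for illustration.

```python
import math

def mean_color(pixels):
    """Average (r, g, b) of a list of RGB tuples: a crude stand-in for richer features."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))

def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing center.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its closest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        # Move each center to the mean of the points assigned to it.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centers[j] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centers

# Hypothetical toy data: each "image" is just a couple of RGB pixels.
toy_images = {
    "red_portrait_1": [(200, 20, 30), (180, 10, 25)],
    "red_portrait_2": [(210, 30, 40), (190, 25, 35)],
    "pencil_sketch_1": [(120, 120, 120), (140, 140, 140)],
    "pencil_sketch_2": [(100, 100, 100), (130, 130, 130)],
}
names = list(toy_images)
features = [mean_color(toy_images[name]) for name in names]
labels, centers = kmeans(features, k=2)

clusters = {}
for name, label in zip(names, labels):
    clusters.setdefault(label, []).append(name)
print(clusters)
```

With real scans you would swap `mean_color` for something richer (composition, subject matter and so on), which is where most of the interesting choices live.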

Drilling in a bit, we find some curiosities. Below we see a couple of deeper dives into particular images, where we break down the image's geometry, the detection of objects/subjects and a little bit of color analysis. What I find most interesting is just how often similar compositions show up across varied subject matter. Not terribly surprising, as it's all just marks on a page, but it is interesting just how biased I am toward certain geometries.

I tend to love strong crossing lines, or lines that are directly vertical or horizontal. In my own art history studies I find this isn't necessarily optimal for keeping a viewer's attention. Often these geometries take the eye right off the page.

Below is a more detailed view of a few images and their 4 closest neighboring images/works. Pretty weird in a lot of ways! I am heartened to see that over an 18-month period of work I haven't particularly sunk into a local maximum or minimum of ideas. I can see many older works in a cluster with newer works, as old themes/ideas resurface and are renewed, often without active thought. Another study I should do is to sort these by the source of the picture, as I can tell that photos taken on phones and posted to Instagram often distort what was really there (not a bad thing, just something to analyze to measure effects/artifacts).
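The "4 closest neighbors" idea is easy to sketch once every work has a feature vector. The Python below is only an illustration under assumptions: the three-number vectors are made up, `nearest_neighbors` is a hypothetical helper, and plain Euclidean distance stands in for whatever similarity measure the real tooling uses.

```python
import math

def nearest_neighbors(features, query, k=4):
    """Return the k names whose feature vectors are closest (Euclidean) to the query's."""
    target = features[query]
    others = [name for name in features if name != query]
    others.sort(key=lambda name: math.dist(features[name], target))
    return others[:k]

# Hypothetical feature vectors, one per work.
toy_features = {
    "work_a": (0.90, 0.10, 0.20),
    "work_b": (0.85, 0.15, 0.25),
    "work_c": (0.80, 0.20, 0.20),
    "work_d": (0.50, 0.50, 0.50),
    "work_e": (0.45, 0.55, 0.50),
    "work_f": (0.10, 0.10, 0.90),
}
neighbors = nearest_neighbors(toy_features, "work_a", k=4)
print(neighbors)
```

Adding a date to each work would make it simple to check how often an old piece turns up next to a new one.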


I'm going to continue drilling into all this and do a full color spectrum analysis, histograms of subject matter, categorization of mediums and materials, etc. The point is not to find a formula but instead to learn, become aware and enjoy. I do not find data forensics to be limiting or mechanical. You can see in the above that even the math is fairly interpretative and loose: the object identification is often off, faces are often missed, images are misgrouped... OR are they?
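As a taste of what a color spectrum pass could look like, here is a small hedged sketch in Python using only the standard library. It buckets pixels by hue; the bin count, the 0.1 saturation cutoff for "grey" and the sample pixels are arbitrary assumptions, and `hue_histogram` is a made-up name.

```python
import colorsys
from collections import Counter

def hue_histogram(pixels, bins=6):
    """Count pixels per hue bin; near-grey pixels go to a separate 'grey' bucket."""
    counts = Counter()
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        if s < 0.1:
            # Too little saturation for the hue to mean much.
            counts["grey"] += 1
        else:
            # Hue is in [0, 1); reds straddle the wrap-around, so they can
            # land in either the first or the last bin.
            counts[min(int(h * bins), bins - 1)] += 1
    return counts

# Hypothetical pixels: two deep reds, one mid grey, one blue.
toy_pixels = [(200, 20, 30), (210, 30, 40), (120, 120, 120), (20, 40, 200)]
histogram = hue_histogram(toy_pixels)
print(histogram)
```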

One of the recurring themes in my art exploration is a simple question: "what art entertains a computer/algorithm?" Or, more provocatively: what if to art is to be human, which is really just data science....