Overview of the project
The task was to create a classifier for - obviously - a Santa! We know there are children who stay awake all night to spot who brought them presents. But our solution is for the smart ones - they can safely go to sleep and let our classifier do the rest.
Our Data Scientist used TensorFlow 2 for that. Here’s the architecture, just for the record:
Orest used a ResNet50 body and added several layers on top of it.
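For those who'd like to try something similar, here's a minimal sketch of what such a setup could look like in Keras. It is not Orest's actual code: the input size, the sizes of the added layers, the dropout rate, the frozen body and the `build_santa_classifier` name are all our assumptions, made up for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_santa_classifier(weights="imagenet", input_shape=(224, 224, 3)):
    """ResNet50 body plus a small classification head (illustrative sizes)."""
    # ResNet50 without its original ImageNet classification layers.
    body = tf.keras.applications.ResNet50(
        include_top=False, weights=weights, input_shape=input_shape
    )
    body.trainable = False  # assumption: keep the pretrained body frozen

    inputs = tf.keras.Input(shape=input_shape)
    x = body(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(128, activation="relu")(x)  # assumed head size
    x = layers.Dropout(0.3)(x)  # assumed dropout rate
    # Two outputs: class 0 = "just a human", class 1 = "Santa".
    outputs = layers.Dense(2, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling `build_santa_classifier()` downloads the ImageNet weights on first use; passing `weights=None` gives a randomly initialised body instead. Images would normally go through `tf.keras.applications.resnet50.preprocess_input` before being fed to the model.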
Creation of the dataset
Building our own dataset was the next thing to do. For everyday, typical images it was easy peasy - we took them from here: https://www-old.emt.tugraz.at/~pinz/data/GRAZ_01/ (persons.zip).
One thing to watch out for is image licensing. It is crucial to check the licenses of the images beforehand.
And here’s one funny thing that showed why manually checking the training data pays off. Let’s see what the “Santa scraper” found: https://theconversation.com/...
And that's just two examples.
We’re not saying it’s wrong. We’re just saying: “Align your data to your goal”. If your goal is to find ANY Saint Nicholas (literally) or a Santa with ANY skin color, then it’s OK to keep it that way. Or actually: it would be if you had thousands of images of an actual Saint Nicholas, not just this one. Otherwise, you will probably confuse the network with just this one picture.
All in all, it took 704 train & 78 val images and 2 min 45 s on CPU to train the TensorFlow model to 96.15% accuracy on the test set after 6 epochs. Quite impressive for a mini 4-day project for our young trainee data scientist. And I think the most interesting part is yet to come.
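A quick sanity check on those numbers, as a small sketch. It rests on one assumption that the text above does not confirm: that the reported accuracy was measured on the 78 validation images.

```python
train_images, val_images = 704, 78

total_images = train_images + val_images
val_share = val_images / total_images

# Which counts of correct predictions out of 78 round to 96.15%?
# (Assumes the accuracy was measured on the 78-image split.)
matching_counts = [
    k for k in range(val_images + 1)
    if f"{100 * k / val_images:.2f}" == "96.15"
]

print(total_images)         # 782
print(f"{val_share:.1%}")   # 10.0%
print(matching_counts)      # [75]
```

So, under that assumption, the split is roughly 90/10 and the accuracy corresponds to 75 correct predictions out of 78, i.e. three misclassified images.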
Explainability of the model
Another thing to do was to develop a tool that would show us a heat map: which parts of an image exactly made the model decide so?
There’s a quick and easy solution for that, and it’s called SHAP.
Some of you might already know it - its mechanics are based on Shapley values, which in turn come from game theory. It’s a great tool for explaining model decisions. It works with Keras and TensorFlow; we use it for XGBoost models as well. That’s why I advised my colleague to give it a try.
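If Shapley values are new to you, here's a toy sketch of the underlying idea: each "player" (here, an image feature) gets its average marginal contribution over all the orders in which players could join. The features, scores and the bonus are invented for illustration, and this brute-force version is not how the SHAP library computes things for images, where it relies on approximations.

```python
from itertools import permutations

# Hypothetical per-feature scores for a toy "is it Santa?" game.
BASE_SCORES = {"beard": 0.2, "red_hat": 0.3, "sack": 0.1}


def coalition_value(features):
    """Toy payoff: sum of base scores, plus a bonus for beard + red hat together."""
    value = sum(BASE_SCORES[f] for f in features)
    if {"beard", "red_hat"} <= set(features):
        value += 0.2
    return value


def shapley_values(players, value_fn):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for player in order:
            extended = coalition | {player}
            totals[player] += value_fn(extended) - value_fn(coalition)
            coalition = extended
    return {p: total / len(orderings) for p, total in totals.items()}


phi = shapley_values(list(BASE_SCORES), coalition_value)
for feature, contribution in phi.items():
    print(f"{feature}: {contribution:.2f}")
```

In this toy game the beard ends up with 0.30, the red hat with 0.40 and the sack with 0.10: the 0.2 bonus is split evenly between the two features that create it, and the contributions add up to the payoff of the full set (0.80).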
3 hours later and bam! The visualizations are ready. Wanna see?
The image on the left is our input to the network. The other two images are quite tricky. The blue dots are the ones that say “it’s not that class”, while the red ones say “it is this class”. And here’s the tricky part: the image in the middle shows us which pixels spoke for and which against classifying the image as class 0 (here: “it’s just a human, not a Santa”). The image on the right shows which pixels spoke for and which against classifying it as class 1 (here: “it’s a Santa”).
So, as you can see, red and blue mean something completely different in those two images.
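To make the colour convention concrete, here's a tiny sketch with made-up numbers. The 2x2 "attribution maps" and the `colour_for` helper are purely hypothetical; real maps have one value per pixel and come from the explainer.

```python
def colour_for(attribution):
    """Map a signed attribution to the colour convention described above."""
    if attribution > 0:
        return "red"   # evidence for the panel's class
    if attribution < 0:
        return "blue"  # evidence against the panel's class
    return "none"      # no influence either way


# Hypothetical 2x2 attribution maps, one per class (values are made up).
attributions = {
    "class 0 (human)": [[0.02, -0.05], [0.00, -0.01]],
    "class 1 (Santa)": [[-0.01, 0.06], [0.00, 0.03]],
}

colour_maps = {
    name: [[colour_for(v) for v in row] for row in grid]
    for name, grid in attributions.items()
}
for name, grid in colour_maps.items():
    print(name, grid)
```

The same pixel (top right in this toy case) shows up blue in the class 0 panel and red in the class 1 panel, which is why each panel has to be read with its own class in mind.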
It is actually funny to see that the head in the middle attracted attention from the model in the case of both classes. It’s just that in the case of class 1 (“It’s a Santa”) there were many more of those dots. We could guess now: is it the beard? Maybe the slightly greyish color? We would have to analyze it a bit longer to get precise answers. And I promise we’ll do the homework by next Christmas.
Btw, I am super happy to have been a mentor for this project. And I hope Orest is happy to join us too - and I hope that he didn’t have to pull an all-nighter to get those results:)