As humans, we have a visual system that allows us to see (extract and understand) shapes, colours and contours. So why do we see every image as a different image? How do we know, for example, that a box in an image is, in reality, a box? And how do we know what a plane is or what a bird is?