Unsupervised deep learning identifies semantic disentanglement in single inferotemporal face patch neurons
Our brain has an amazing ability to process visual information. We can take one glance at a complex scene, and within milliseconds be able to parse it into objects and their attributes, like colour or size, and use this information to describe the scene in simple language. Underlying this seemingly effortless ability is a complex computation performed by our visual cortex, which involves taking millions of neural impulses transmitted from the retina and transforming them into a more meaningful form that can be mapped to the simple language description. In order to fully understand how this process works in the brain, we need to figure out both how the semantically meaningful information is represented in the firing of neurons at the end of the visual processing hierarchy, and how such a representation may be learnt from largely untaught experience.