Video Abstract

Recent theories shed new light on the relationship between Artificial Intelligence, images and language, and they could have a strong influence on art.

According to The Platonic Representation Hypothesis, a paper by MIT researchers Minyoung Huh, Brian Cheung, Tongzhou Wang and Phillip Isola, Artificial Intelligence models, including language models, are converging: they represent data in increasingly similar ways.

Beneath the many human languages now translated by Artificial Intelligence, and beneath the images used by AI models, we are detecting a kind of universal “world of ideas,” reminiscent of Plato’s Hyperuranion.

As models encounter ever-larger datasets and a broadening range of applications, they require a representation that captures the fundamental properties common to all types of data, whether text or images.

The Platonic representation hypothesis thus holds that both image (img) and text (text) are projections of an underlying common reality, what Plato called the “idea.”

Thus, it seems that in AI, vision and text are converging.
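One way to make this idea of convergence concrete is to check whether two representations agree on which items are close to each other. The sketch below is only an illustration under stated assumptions, not the method of the paper's authors: the "world" is a set of random latent points, the "image" and "text" encoders are stand-in random linear projections, and the names `knn_indices`, `mutual_knn_alignment` and `random_projection` are hypothetical helpers introduced here.

```python
import random

def knn_indices(points, k):
    """For each point, return the set of indices of its k nearest neighbours (Euclidean)."""
    neighbours = []
    for i, p in enumerate(points):
        dists = []
        for j, q in enumerate(points):
            if i != j:
                d = sum((a - b) ** 2 for a, b in zip(p, q))
                dists.append((d, j))
        dists.sort()
        neighbours.append({j for _, j in dists[:k]})
    return neighbours

def mutual_knn_alignment(feats_a, feats_b, k=5):
    """Average fraction of k-nearest neighbours shared by two feature sets (0 to 1)."""
    nn_a = knn_indices(feats_a, k)
    nn_b = knn_indices(feats_b, k)
    return sum(len(a & b) for a, b in zip(nn_a, nn_b)) / (k * len(feats_a))

def random_projection(latents, out_dim, rng, noise=0.05):
    """Project latent points through a random linear map plus small Gaussian noise."""
    in_dim = len(latents[0])
    weights = [[rng.gauss(0, 1) for _ in range(in_dim)] for _ in range(out_dim)]
    return [
        [sum(w * z for w, z in zip(row, latent)) + rng.gauss(0, noise) for row in weights]
        for latent in latents
    ]

rng = random.Random(0)
# Assumption: 100 underlying "ideas", each a point in a 4-dimensional latent space.
world = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(100)]
image_feats = random_projection(world, 16, rng)  # stand-in for an image encoder
text_feats = random_projection(world, 12, rng)   # stand-in for a text encoder
# Control: features that do not come from the shared world at all.
unrelated = [[rng.gauss(0, 1) for _ in range(12)] for _ in range(100)]

shared = mutual_knn_alignment(image_feats, text_feats)
baseline = mutual_knn_alignment(image_feats, unrelated)
print(f"shared world: {shared:.2f}, unrelated: {baseline:.2f}")
```

In this toy setting, the two projections of the same world should share far more neighbours than the unrelated control does, which is the kind of signal one would look for when asking whether vision and text representations are converging.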

Computer vision captures many images of a real-world object through sensors. Each image is then classified with a word, for example a geometric figure (a cone) or an animal species (a cat). That word can be expressed in multiple languages (cat, chat…), but it retains its convergent reference to the images.

This hypothetical convergent representation is called a “Platonic representation” in reference to Plato’s Allegory of the Cave and his idea of an ideal reality underlying our sensations. The training data for our algorithms would then be shadows cast on the wall of the Platonic cave. From this data, once received, collected and processed, the models develop increasingly accurate and universal representations of the real world outside the cave.

Furthermore, Dravid and colleagues discovered “Rosetta neurons” in AI neural networks: units that are activated by the same pattern across a range of vision models, and that therefore act as bridges between independently trained models. Such neurons form a common dictionary, discovered independently by all the models.
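The idea of units in different networks that respond to the same pattern can be illustrated by correlating unit activations over a shared set of inputs. The following is a minimal sketch under assumptions, not the procedure used by Dravid: the activations are synthetic, and `pearson`, `best_match` and `mutual_matches` are hypothetical helper names introduced for illustration.

```python
import random

def pearson(x, y):
    """Pearson correlation between two equally long activation sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def best_match(units_a, units_b):
    """For each unit in A, the index of the most correlated unit in B."""
    return [
        max(range(len(units_b)), key=lambda j: pearson(u, units_b[j]))
        for u in units_a
    ]

def mutual_matches(units_a, units_b):
    """Pairs (i, j) where unit i of A and unit j of B pick each other as best match."""
    a_to_b = best_match(units_a, units_b)
    b_to_a = best_match(units_b, units_a)
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]

rng = random.Random(1)
n_inputs = 200

# Assumption: three underlying patterns, each with a response value per input.
concepts = [[rng.gauss(0, 1) for _ in range(n_inputs)] for _ in range(3)]

def noisy(signal, scale=0.2):
    """A unit that follows a pattern, with some model-specific noise."""
    return [s + rng.gauss(0, scale) for s in signal]

# Two hypothetical models encode the same patterns, but in a different unit order.
model_a = [noisy(concepts[0]), noisy(concepts[1]), noisy(concepts[2])]
model_b = [noisy(concepts[2]), noisy(concepts[0]), noisy(concepts[1])]

pairs = mutual_matches(model_a, model_b)
print(pairs)
```

Each pair links a unit of the first model to the unit of the second model that follows the same underlying pattern, so the list of pairs plays the role of a small shared dictionary between the two models.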

As artificial intelligence models grow in depth and complexity, they acquire a greater capacity for abstraction. This allows them to capture the concepts and patterns underlying the data while filtering out noise and outliers, and thus to arrive at a representation that is more universal and potentially closer to the real world.

The Platonic Representation Hypothesis posits the existence of a universal “world of ideas” shared by all human languages and common to words and images. A “hyperuranium” is emerging from Artificial Intelligence that can inspire new art forms of visual writing.

The image is released under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. Artwork by Gualtiero and Roberto Carraro – Homo Extensus. Please credit the authors and link to the original page.

Insights