The Cartesian Theater of AI

The Cartesian Theater is an intuition pump proposed by philosopher Daniel Dennett to describe a false theory of consciousness. On this theory, the mind contains a stage watched by a homunculus, a smaller version of the self, that sees what passes on the stage, hears the sounds, and controls the rest of the mind in response to what it observes. But what makes the smaller self conscious? The question leads to an infinite regress. Our understanding of mind is that there is no such theater: no single place in the mind where consciousness is located.

I want to offer another thought experiment based on the concept of a Cartesian Theater.

Imagine a theater containing a screen with a camera pointed at it, and a speaker with a microphone pointed at it.

However, instead of connecting the camera directly to the screen, which would create an infinite feedback loop, insert an AI into the loop.

The images captured by the camera are sent to a neural network that generates speech, which is sent to the speaker. The mic sends the audio to a second neural network that generates images on the screen.
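For readers who think in code, the wiring can be sketched roughly as follows. This is only an illustration under loud assumptions: the two functions image_to_speech and speech_to_image are hypothetical placeholders doing trivial arithmetic, not real neural networks, and images and audio are reduced to short lists of numbers.

```python
from typing import List, Tuple

Image = List[float]  # hypothetical stand-in for a frame of pixel intensities
Audio = List[float]  # hypothetical stand-in for a buffer of audio samples


def image_to_speech(image: Image) -> Audio:
    """Placeholder for the network that turns camera frames into speech."""
    mean = sum(image) / len(image)
    return [mean * 0.5 for _ in range(4)]


def speech_to_image(audio: Audio) -> Image:
    """Placeholder for the network that turns microphone audio into images."""
    peak = max(audio)
    return [peak + 0.1 for _ in range(4)]


def run_theater(first_frame: Image, steps: int) -> List[Tuple[Image, Audio]]:
    """Run the loop: screen -> camera -> network -> speaker -> mic -> network -> screen."""
    history = []
    screen = first_frame
    for _ in range(steps):
        speaker = image_to_speech(screen)  # the camera sees the screen
        history.append((screen, speaker))  # what an observer in the theater would witness
        screen = speech_to_image(speaker)  # the mic hears the speaker
    return history
```

Note that in this sketch the screen and speaker exist only as variables holding numbers, so nothing is ever displayed or played aloud.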

You, as a kind of homunculus, can sit in the theater of this large “brain” and observe the realism of the images and sounds, as if observing another’s consciousness.

If the homunculus is removed, are the images and sounds still real? Are they still interacting with each other inside the “brain”?

If you abstract away the theater and only information is passed between the neural networks, is the realism of the images and sounds still present?

How much of the realism of physical qualities such as color or sound is carried within channels of information flow?


