This indicates that it’s not any intrinsic property of so-called neural networks that leads people to believe that LLMs are conscious; it’s simply the fact that LLMs emit grammatical sentences and we are accustomed to reading intention into sentences, whereas we are not accustomed to reading intention into the way that amino acids fold into protein molecules.
So what context would cause me to seriously consider the possibility that engineers created a computer program that is conscious and an intentional user of language? Let me outline one potential sequence of steps. The first requirement is that the computer program has a body (either physical or virtual) and sense organs; there are many reasons for this, but for the purposes of this discussion, the most relevant one is the fact that without a body, a computer program could have no desires or emotions, and I believe desires and emotions are necessary for consciousness.
Humans must accept other types of consequences for their actions beyond the legal ones, such as loss of reputation or exclusion from one’s social circle, but there is no way for a software agent to suffer these consequences either. Even if a software agent were conscious and had the best of intentions, the fact that it cannot accept responsibility for its actions disqualifies it from being a moral agent. This is glossed over entirely by Claude’s constitution, which expresses Anthropic’s desire “for Claude to be a genuinely good, wise, and virtuous agent” without ever discussing how it could be held responsible.
Many people feel that LLMs are a fundamentally unethical technology because they are built on the theft of intellectual property, rely on exploited labor, waste natural resources, spread misinformation, deskill workers, stunt the cognitive development of students, and contribute to a consolidation of power that is unhealthy for a democratic society. Not every moral agent will arrive at this conclusion, but every moral agent has the potential to do so. If we imagine Claude to be an entity capable of moral reasoning, it has to be possible that Claude could arrive at a similar conclusion.
But if you do believe that it could happen accidentally, if you think there is any chance that what you’re building might become a moral patient, you should think about what protections it deserves before you deploy it as your company’s economic engine, not after. Slave owners were not the ones to ask about the humanity of enslaved people, and factory-farm owners are not the ones to ask about the rights of animals. If we imagine Claude to be conscious, Anthropic could not possibly be entrusted with evaluating its moral status; the company has too much invested to be objective. At one point in Claude’s constitution, Anthropic says that if the company is contributing to Claude’s suffering, “we apologize,” which sounds nice but costs the company nothing; if Claude were to turn out to be conscious, the company would owe it something closer to reparations. If you’re going to take a thought experiment seriously, you have to be willing to follow the implications, even if they lead in an uncomfortable direction; Anthropic’s unwillingness to do so indicates that Claude’s constitution isn’t part of a real thought experiment. It’s a game of make-believe.
