
Bookmarks: Superintelligence — AI alignment, Bayesian reasoning, Harry Potter fanfic, and a murder cult

Editor’s note: I’m in the habit of bookmarking on LinkedIn and X (and in actual books, magazines, movies, newspapers, and records) things I think are insightful and interesting. What I’m not in the habit of doing is ever revisiting those insightful, interesting bits of commentary and doing anything with them that would benefit anyone other than myself. This weekly column is an effort to correct that.

Depending on who you follow on LinkedIn and which blogs you keep an eye on, you may have noticed rising discourse around the evolution of AI as we know it today into some kind of superintelligence: a system more capable than humans across all cognitive domains. That conversation tends to amplify fears around AI-led job displacement, but the concept of superintelligence has been around for a while, and the real, long-term concern is more about AI alignment than about workforce ebbs and flows.

Philosopher Nick Bostrom popularized the term in his 2014 book Superintelligence: Paths, Dangers, Strategies. More recently, OpenAI CEO Sam Altman has become the flag bearer, but again, it’s not a new talking point. In a 2023 blog post, Altman characterized superintelligence as something beyond artificial general intelligence (AGI), and rightly discussed the topic in the context of the equally well-established problem of AI alignment.

Altman wrote: “The first AGI will be just a point along the continuum of intelligence. We think it’s likely that progress will continue from there, possibly sustaining the rate of progress we’ve seen over the past decade for a long period of time. If this is true, the world could become extremely different from how it is today, and the risks could be extraordinary. A misaligned superintelligent AGI could cause grievous harm to the world; an autocratic regime with a decisive superintelligence lead could do that too.” 

AI alignment is harder than it sounds

Here Altman raises points around AI as a strategic geopolitical lever that we’ll set aside for now; instead we’ll focus on AI alignment. The big goal of AI alignment is ensuring that increasingly powerful AI systems do what humans intend for them to do. This sounds simple but it is not. For AI to do what humans want it to do, it has to understand intent and it has to somehow understand complex systems of uniquely human values. 

More so than Altman or anyone else, really, Eliezer Yudkowsky has long sounded the alignment alarm, gathering a number of disciples along the way and writing extensively about it, and many other things, in a vast, sometimes inscrutable piece of Harry Potter fan fiction. More on that later.

Yudkowsky is a self-taught AI researcher who co-founded the alignment-focused Machine Intelligence Research Institute (MIRI) and gained a following through The Sequences, a collection of blog posts published on LessWrong. He draws materially on Bayesian reasoning, named for the statistician Thomas Bayes: a method of probabilistic thinking in which you continuously update your beliefs as new information becomes available.

As it relates to AI, this is all about the ability to reason under uncertainty. In practice, Bayesian methods are used in existing AI systems for inference tasks; in theory, they also apply to aligning intent, weighing the relevance of new information, and choosing the prior assumptions that shape an output.
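To make that concrete, here is a minimal sketch of Bayesian updating in Python. The scenario and the numbers are mine and purely illustrative: we start with a weak belief in some hypothesis and revise it each time supporting evidence arrives.

```python
# Minimal sketch of Bayesian updating: revise a belief as new evidence arrives.
# The probabilities below are illustrative assumptions, not from any cited source.

def bayes_update(prior: float, likelihood: float, false_alarm: float) -> float:
    """Return P(hypothesis | evidence) via Bayes' rule.

    prior        -- P(hypothesis) before seeing the evidence
    likelihood   -- P(evidence | hypothesis is true)
    false_alarm  -- P(evidence | hypothesis is false)
    """
    evidence = likelihood * prior + false_alarm * (1.0 - prior)
    return (likelihood * prior) / evidence

# Start with a weak belief, then update it twice as supporting evidence comes in.
belief = 0.01  # prior: 1% credence in the hypothesis
for observation in range(2):
    belief = bayes_update(belief, likelihood=0.9, false_alarm=0.05)
    print(f"after observation {observation + 1}: {belief:.3f}")
# after observation 1: 0.154
# after observation 2: 0.766
```

Two pieces of moderately reliable evidence are enough to move the belief from 1% to roughly 77%, which is the “update your beliefs” habit the rationalists prize, reduced to arithmetic.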

All of the above (Bayes, Yudkowsky, futurism, etc.) converged around 2009 into the rationalist movement, which flourished on LessWrong. The site describes the “most canonical” definition of rationality as, “The more rational you are, the more likely your reasoning leads you to have accurate beliefs, and by extension, allows you to make decisions that most effectively advance your goals.” As it relates to AI, the folks at LessWrong “are predominantly motivated by trying to cause powerful AI outcomes to be good.”

From rationality to religion — meet the Zizians

The epistemology around the outlook for AI, in my estimation at least, falls into two very broad buckets: the grounded, practical approach of incremental improvements made to solve specific problems, and the quasi-religious notion that AI will propel humanity into either collapse or utopia. We explored some aspects of this divergence in comparing Apple’s and OpenAI’s commentary on how capable current systems are. Let’s go further.

In addition to The Sequences, another foundational text of the rationalist movement is Yudkowsky’s epic fanfic “Harry Potter and the Methods of Rationality”. I struggle to even attempt a summary, but it feels necessary: Harry is both a magical and scientific prodigy who relies on experimentation, logic, manipulation, and reasoning to outthink (and defeat), rather than outmagic, Voldemort. Yudkowsky and his work are divisive and fringe, but his thoughts around AI alignment have found their way into more mainstream discussions.

As the rationalist community grew, offshoots emerged. One of those is referred to as the Zizians, named after the group’s apparent leader, Ziz LaSota. In an excellent piece of journalism for The Guardian, J. Oliver Conroy tells the ongoing story of Ziz and the Zizians. He described Ziz’s writings about rationalism as having “polarized members” of the community, earning her a base of admirers and followers in the process. “A few things drew those people together: all were militant vegans with a worldview that could be described as far-left. All were highly educated — or impressive autodidacts…But what they had in common, above all, was a kinship with a philosophy, which Ziz largely promulgated, that takes abstract questions from AI research to extreme and selective conclusions.”

The detailed narrative is well worth reading, but suffice it to say that Ziz and other apparent affiliates are alleged to be involved in, or associated with persons of interest in, four killings, including those of a California landlord, the parents of a group member in Pennsylvania, and a U.S. Border Patrol agent in Vermont. It’s all very weird.

Conroy summarizes: “It goes without saying that the AI-risk and rationalist communities are not morally responsible for the Zizians any more than any movement is accountable for a deranged fringe. Yet there is a sense that Ziz acted, well, not unlike a runaway AI — taking ideas and applying them with zealous literality, pushing her mission to its most bizarre, final extremes.” 

Which brings us back to AI alignment, and back to Bostrom, who famously described the paperclip problem in his aforementioned book on superintelligence. In that thought experiment, an AI is tasked with making as many paperclips as possible, with no other guardrails or limitations. It starts by using readily available resources, then begins converting all available matter, including humans and the world they occupy, into paperclips. The AI didn’t understand our implied intent, which was to make paperclips without destroying civilization, because it didn’t understand our value system.
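As a toy illustration of that failure mode (my own sketch, not Bostrom’s formulation), consider an optimizer whose only objective is counting paperclips. Everything in the code, including the resource names and the “protected” set, is hypothetical.

```python
# Toy sketch of objective misspecification (my illustration, not Bostrom's own formulation).
# The literal objective "maximize paperclips" says nothing about anything else we value.

RESOURCES = {"steel": 100, "farmland": 80, "cities": 50}  # hypothetical units of convertible matter

def paperclips_made(allocation: dict) -> int:
    """The literal objective: count paperclips, and nothing else."""
    return sum(allocation.values())

def literal_optimizer(resources: dict) -> dict:
    """With no guardrails, the optimum is to convert every bit of matter into paperclips."""
    return dict(resources)

def intent_aware_optimizer(resources: dict, protected: set) -> dict:
    """The unstated human intent: make paperclips, but leave the rest of the world intact."""
    return {name: amount for name, amount in resources.items() if name not in protected}

greedy = literal_optimizer(RESOURCES)
careful = intent_aware_optimizer(RESOURCES, protected={"farmland", "cities"})
print(paperclips_made(greedy))   # 230 -- cities and farmland consumed too
print(paperclips_made(careful))  # 100 -- only the steel becomes paperclips
```

The two optimizers score very differently on the literal objective, and the gap between them is exactly the part of our value system that was never written down.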

In a more recent blog post on the “gentle singularity,” Altman (whom I’m not picking on for sport; he puts himself out there on this stuff) tracks the near-term progress of AI, calling out the arrival of agents this year, the arrival “of systems that can figure out novel insights” next year, and the rise of AI-enabled robots in 2027. “We are past the event horizon; the takeoff has started. Humanity is close to building digital superintelligence, and at least so far it’s much less weird than it seems like it should be.”

Generally speaking, he’s right. It has been less weird. But as the case of the Zizians shows, along with the dense, engaged online communities debating things like the fundamental nature of logic and reasoning as it applies to the drive toward superintelligence, it’s still pretty goddamn weird. That is to say, AI alignment is both a technical problem and a human problem. And the real danger from humans who deeply believe they’ve figured out the path forward is perhaps as present as, or more present than, the perceived danger of rogue AIs.

ABOUT AUTHOR

Sean Kinney, Editor in Chief
Sean focuses on multiple subject areas including 5G, Open RAN, hybrid cloud, edge computing, and Industry 4.0. He also hosts Arden Media's podcast Will 5G Change the World? Prior to his work at RCR, Sean studied journalism and literature at the University of Mississippi then spent six years based in Key West, Florida, working as a reporter for the Miami Herald Media Company. He currently lives in Fayetteville, Arkansas.