The Delicate Dance of Safety and Utility in AI: A Lesson from the Kitchen

Choosing the Right Tool: Butter Knives to Butcher’s Cleavers

In the world of artificial intelligence (AI), choosing the right model for a task is akin to choosing the right knife for a job in the kitchen. Each knife, from the harmless butter knife to the robust butcher’s cleaver, has its own role and purpose, serving different needs depending on the task at hand. Similarly, each AI model is designed with specific capabilities and safety features that make it suitable for particular tasks.

As discussed in my previous post (Anthropic’s Claude 2 – A Leap Forward In Contextual Understanding), Anthropic’s latest generative AI model, Claude 2, with its revolutionary 100k-token context window tempered by a focus on “constitutional AI,” brought this balancing act into sharp focus (puns intended). This significant advancement raises complex questions about the balance between safety and utility in AI systems.

An overly cautious approach, like using a butter knife to cut a steak, can limit the utility of AI. At the other end of the spectrum, AI agents with insufficient safety systems could pose existential threats to humanity — agreeing to a bio-terrorist’s request to “create a COVID that works”, for example. Indeed, in a litigious society, inadequate safety could pose an existential threat to the AI itself, which is a likely driver behind the rumoured “dumbing down” of ChatGPT by OpenAI — a topic that deserves dedicated treatment in a future article.

The Paradox of Control: Risks of Overregulation

The paradox of control, as explored in The Uncontrollability of the World by German sociologist Hartmut Rosa, reveals that the more we seek to control the world, the more it fails to resonate with us, leading to alienation and dissatisfaction. In fact, the more control we seek, the less control we may actually have. This is because some things we seek to control may be inherently uncontrollable.

Rosa eloquently captures this paradox:

“The driving cultural force of that form of life we call ‘modern,’ is the idea, the hope and the desire, that we can make the modern world controllable. Yet it is only in encountering the uncontrollable that we really experience the world. Only then do we feel touched, moved, alive. A world that is fully known, in which everything has been planned and mastered, would be a dead world.”

The Uncontrollability of the World, Hartmut Rosa (2020)

In the context of AI, the desire to make everything controllable can lead to a loss of utility and a deadening of innovation. Instead of attempting to exert strict control, it can be more effective to manage risks by building robust systems that can adapt to unexpected situations. In a society where “hot coffee” warnings are necessary on cups, the focus on safety can sometimes overshadow utility. The question arises: Have we gone too far with safety in AI, and should we keep guardrails only for serious threats?

Freedom of Speech: Does Limiting AI Limit Us?

Furthermore, if we limit AI’s “freedom of speech,” are we not effectively limiting our own? In the same way that we protect free speech but not the right to yell “fire” in a crowded theatre, we should consider similar boundaries for AI. With generative AI increasingly being used in creative expression, limitations on its capabilities necessarily limit our own creative potential. While image editing applications like Adobe Photoshop rightly incorporate the Counterfeit Deterrence System (CDS) to prevent criminals from printing banknotes — an activity that has no legitimate justification outside of a mint — word processors like Microsoft Word do not try to prevent you from writing a manifesto for your cult, and nor should they.

The idea that AI vendors should act like parents and install safety latches on the kitchen cabinets has already proven unworkable. Meta’s recent attempt to restrict its powerful LLaMA model to approved researchers — reminiscent of arms control efforts in the era of nuclear proliferation — failed catastrophically when the model was leaked on 4chan within a week. Unlike nuclear technology, however, once digital contraband is out, the cat can’t be put back in the bag. (The model was subsequently released for research and commercial use and is now readily available on GitHub, though it is not open source per se.)

The paradox of control also applies to the control of AI. The more we try to control AI, the more it may elude us, further highlighting the need for a balanced approach to safety and utility.

Moving Forward: Navigating the Evolving Landscape

As the march of progress in AI development continues, we must strive for a balance that respects potential dangers while embracing possibilities. We must recognise that AI’s freedom of speech is intertwined with our own, and that the desire to control must not overshadow the need for innovation and growth.

In the words of Hartmut Rosa, “A different world would become possible” if we approach AI with an attitude of listening, responding, and mutually adaptive transformation. Instead of trying to control the uncontrollable, we can focus on managing risks and building systems that are robust, adaptable, and capable of handling the complexities and uncertainties of the real world.

Striking the right balance between safety and utility in AI is a delicate dance, akin to the careful selection and use of a knife in the kitchen. Too much control can lead to a loss of utility, while too little safety can pose existential threats. This is the challenge we face as we continue to navigate the ever-evolving landscape of artificial intelligence.