Like a Challenge? Try Aligning AI With Human Values

Matt Thompsett Press Release

Better still, try agreeing a common set of human values that are robust enough to be recognised and safeguarded by every nation state, political faction, interest group, religion and by every human being on the planet. Good luck.

Let’s take a step back. AI is here and AI is not going away. The advent of super-intelligent AI systems heralds a step change in human existence beyond our imagination. It is not difficult to conceive of a world where, after a relatively short period of accelerated development, AI will run the show for us. Fully automated and self-governing production, distribution and consumption systems for food, energy, consumer packaged goods (CPG) and just about everything else are a near-term reality.

AI will master the systems through which we deliver and consume services such as communications, medical care, transportation, housing and education. Likewise, AI will be at the core of our systems of record, our decision making and our governance (I hesitate to say government!).

So we will inexorably cede control of our vast and complex ‘empires’ to super-intelligent AI systems; nothing is more inevitable. It is this inevitability which has motivated a ‘who’s who’ of scientists, humanitarians, legislators, entrepreneurs, researchers, filmmakers and others to tackle the dangers associated with bringing AI into the world. This ‘think tank’ is the platform for the concerns of eminent thought leaders such as Stephen Hawking, Elon Musk and Demis Hassabis.

The threat of weaponised AI systems destroying humanity has been served up to us through numerous films, books and news articles. However, the classic ‘Terminator’ plot is only one of a broad spectrum of scenarios in which AI will pose a threat to our planet, our ecosystems and indeed human life.

The process of ensuring that AI is responsibly deployed for the benefit of mankind is one that we must undertake as early as possible, because the adoption of AI has already begun. An accelerated deployment of AI without established and rigorous principles for design and purpose would be a reckless course to take.

Fortunately, thought leaders worldwide are taking the initiative and bringing the discussion to the table. One such ‘discussion’ took place at the 2017 Asilomar Conference. In conjunction with this conference, the Asilomar AI Principles were developed: principles whose purpose is to establish a framework for developing beneficial AI.

There are 23 Asilomar AI Principles. Guided by these principles, AI ‘should’ offer incredible opportunities to help and empower people, all people, whilst safeguarding the planet and its precious ecosystems and resources.

To me, the most important and fundamental of the 23 principles is the tenth:

‘Value Alignment: Highly autonomous AI systems should be designed so that their goals and behaviors can be assured to align with human values throughout their operation.’ (Asilomar Principle 10)

This principle presents a significant challenge, mastery of which will determine the future of our race and the planet. The problem at the core of this principle is that human values are not universally shared, are not simple to express and are widely open to interpretation. Human values vary massively according to factors such as economic stability, culture, religion and morals.

For example, let’s imagine two super-intelligent AI systems running food production facilities. One of our facilities is in affluent Europe and the other in a third world economic region. The focus of each facility is based entirely on the values attached to food production, and here is the ‘conflict’. The facility in Europe would produce healthy, well-balanced, low-salt, high-quality products at a relatively high price point. The mission is to acquire market share and make the shareholders happy. Equally, the facility would run with a low (or zero) carbon footprint and minimal environmental impact, with sustainability at its core.

Our third world facility would have an entirely different focus. The challenge is about food security, not stock valuation. The facility would produce basic products at a low price point, with little or no consideration of environmental impact. The mission is to feed people, so quality would be less important than supply. This is not a criticism of the third world; it is the reality that affluence enables a different set of values to be embraced. It is a lot easier to be altruistic when all your basic needs are fulfilled.

In this example, it is immediately apparent that divergent values are inevitable. We see this all over the world. For example, over-fishing is a major problem being addressed in the ‘first world’, but it is a very different problem if fishing is the only way to feed your family. The destruction of the rain forest is globally opposed, but would you oppose it if cultivating palm for palm oil was the only way you could guarantee survival?
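To make the point concrete, here is a minimal Python sketch of the two facilities. It is an illustration only: the plan names, the scores and the weightings are all invented for this example, and a weighted sum is simply one assumed way of expressing ‘values’ as numbers. The same decision rule, fed with two different sets of value weightings, picks two different production plans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductionPlan:
    """A hypothetical production plan, scored 0..1 on each attribute."""
    name: str
    nutrition: float
    affordability: float
    sustainability: float
    volume: float


def score(plan: ProductionPlan, weights: dict[str, float]) -> float:
    """Weighted sum of the plan's attributes under one set of values."""
    return sum(weight * getattr(plan, attr) for attr, weight in weights.items())


def choose_plan(plans: list[ProductionPlan], weights: dict[str, float]) -> ProductionPlan:
    """Pick the plan that scores highest under the given values."""
    return max(plans, key=lambda plan: score(plan, weights))


# Invented figures, for illustration only.
PLANS = [
    ProductionPlan("premium_low_carbon", nutrition=0.9, affordability=0.3,
                   sustainability=0.9, volume=0.4),
    ProductionPlan("basic_high_volume", nutrition=0.5, affordability=0.9,
                   sustainability=0.2, volume=0.9),
]

# Assumed weightings: what each facility's 'values' might look like as numbers.
AFFLUENT_WEIGHTS = {"nutrition": 0.4, "affordability": 0.1,
                    "sustainability": 0.4, "volume": 0.1}
FOOD_SECURITY_WEIGHTS = {"nutrition": 0.1, "affordability": 0.4,
                         "sustainability": 0.0, "volume": 0.5}

if __name__ == "__main__":
    print(choose_plan(PLANS, AFFLUENT_WEIGHTS).name)
    print(choose_plan(PLANS, FOOD_SECURITY_WEIGHTS).name)
```

Neither system is ‘wrong’ here. Each faithfully optimises the values it was given, which is precisely why the choice of those values matters so much.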

Let’s take a different value system; let’s call it the ‘shared commandments’. These commandments exist, in one form or another, in every religion and culture, and have done since we chose upright as a means of getting about. An easy one to accept at face value is ‘don’t kill people’, but that’s not strictly true and needs qualification. Generally, it is accepted to be ‘don’t kill people unless you have to…’

In some cultures it’s acceptable (even expected) that you kill someone who has broken with tradition and brought shame on your family, which is an intolerable crime in other cultures. Military peace-keepers have a thorny dilemma with this one: is it acceptable to use lethal force knowing that there will be ‘collateral’ casualties? These are serious and complex decisions. So the real question is: what are the circumstances that make killing acceptable? That’s the sort of question (albeit a very unpleasant one) that AI has to be provided with the answers to.

So how do human beings make complex decisions, especially where values conflict? We use a system that involves three ‘layers’ of values working in harmony. The first layer is often described as our ‘ID’ and encapsulates the basic needs of existence: feeding, fornication and survival. The second is more complex and has its origin in how we developed to maximise our chances of survival: social and cultural influences. This second layer is often referred to as the ‘Super Ego’. To explain, the Super Ego represents the learning that our society and culture impose on the ‘raw material’, the child. For example, a baby will grab food from another’s plate but scream blue murder when the tables are turned. Society teaches us that it’s not right to eat from someone else’s plate and that we should share. Children are taught how to behave and interact according to a complex set of rules that depends on the culture and social group to which they belong. These rules are based on human values which differ widely from society to society and from individual to individual.

However, our basic instincts and learned behaviour are often in conflict and have to be mediated. For example, our basic instinct is that our survival is paramount, yet our learned behaviour is that we should protect the vulnerable, often at risk to ourselves. So how does a fire officer decide whether or not to rush into a burning building to rescue a stranger while putting their own life at risk? For these complex mediations we need a third layer that enables real-time balancing of the basic demands of the ID and the socio-cultural influence of the Super Ego – it’s called the Ego. The Ego manages the ‘here and now’ conflict between the other two states.

As an example, we have a basic instinct to satisfy our hunger. When hungry and confronted by a table laden with delicious food, we are instantly prompted by our ID state to ‘dive in’. Our socio-cultural influence (in conflict) tells us that we should wait to be invited, let other people go first, eat in moderation, not be a pig, etc. The Ego will resolve the conflict by assessing the context of the dilemma, weighing up the risk of embarrassment against the desire to self-satisfy and making a decision that determines our behaviour.
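The dinner-table dilemma can be caricatured in a few lines of Python. This is a deliberately crude sketch under my own assumptions: the function name, the 0-to-1 scales and the multiplication of embarrassment by social pressure are all invented for illustration, and nobody should mistake it for a validated psychological model.

```python
def ego_decides(hunger: float, embarrassment_risk: float, social_pressure: float) -> str:
    """Toy mediation between instinct and learned restraint.

    All inputs are assumed to lie between 0 and 1:
      hunger             - strength of the instinctive drive to eat
      embarrassment_risk - how badly 'diving in' would be judged
      social_pressure    - how much the current context enforces the rules
    """
    instinctive_drive = hunger
    learned_restraint = embarrassment_risk * social_pressure
    return "dive in" if instinctive_drive > learned_restraint else "wait"


if __name__ == "__main__":
    # Same hunger, same manners, different context.
    print(ego_decides(hunger=0.6, embarrassment_risk=0.9, social_pressure=1.0))  # formal dinner
    print(ego_decides(hunger=0.6, embarrassment_risk=0.9, social_pressure=0.1))  # alone at home
```

Even in this toy, the decision flips purely on context. The real process has countless more inputs, and the ‘learned restraint’ term is exactly where values differ from culture to culture.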

This balancing of the ID and Super Ego is continuous, complex and extremely difficult to ‘map’ using simple models, and yet it is the very process that enables society to function. Unfortunately, there are countless examples where circumstances strip us of our socio-cultural influence and we revert to satisfying basic instincts, usually to the detriment of others. Likewise, when human values differ, the process leads to conflict that consumes not just individuals, but interest groups and nations. For example, if there is such a utopia as common human values, why are there irresolvable conflicts over such issues as euthanasia, abortion rights, diversity and immigration?

Developing an AI system that can take on the complexity of this human decision-making system is a formidable task, and an impossible one unless human values can be baselined. Otherwise, the values embedded in the AI system that form the conditional ‘reference’ will be those of the creators, whose values might well (probably will) conflict with those of other ‘interest’ groups.

Finally, principle 11 of the Asilomar manifesto states:

‘Human Values: AI systems should be designed and operated so as to be compatible with ideals of human dignity, rights, freedoms, and cultural diversity.’ (Asilomar Principle 11)

The word ‘compatible’ implies, again, that some sort of harmonious pre-state exists. This is absolutely not the case, despite the fact that the majority of UN members are signatories to the International Bill of Human Rights (at its core the Universal Declaration of Human Rights), which passed into international law in 1976.

Whilst a great leap for humanity, the International Bill of Human Rights (IBHR) and its offspring UN covenants have not been without problems and remain conflicted and challenged. For example, religions find that its covenants conflict with religious law; likewise, state interests may not be best served by adherence to the tenets of the IBHR. Hence, many countries abstained from ratifying the IBHR, many countries regularly breach its covenants, and abuses of the fundamental rights it seeks to protect are frequent.

Whilst principle 11 extols the virtue of aligning AI systems with human rights, the lack of a common set of ‘values’ creates an insurmountable challenge.

My hypothesis is that the adoption and deployment of AI systems is here, now and certain, but determining the human values for AI to align with will remain stubbornly out of reach until we can agree a baseline of human values, globally.