AI cybersecurity

Right and Wrong in AI

Background

The DARPA Cyber Grand Challenge (CGC) 2016 competition has captured the imagination of many with its AI challenge. In a nutshell, it is a contest in which seven highly capable computers compete, each owned by a team. Each team creates software that can autonomously identify and fix flaws in its own computer, and identify flaws in the other six computers and hack them. The contest is inspired by Capture the Flag (CTF), a game in which human teams protect their own computers while hacking into others, aiming to capture a digital asset: the flag. In the CGC, the goal is to build an offensive and defensive AI bot that plays by the CTF rules.

In the past five years, AI has become a highly popular topic, discussed both in the corridors of tech companies and outside them, and the amount of money invested in developing AI for different applications is tremendous and growing. The use cases include industrial and personal robotics, smart human-to-machine interaction, predictive algorithms of all sorts, autonomous driving, face and voice recognition, and other extreme use cases. AI as a field of computer science has always sparked the imagination, which has also resulted in some great sci-fi movies. Recently, a growing list of high-profile thought leaders such as Bill Gates, Stephen Hawking, and Elon Musk has raised concerns about the risks involved in developing AI. The dreaded nightmare of machines taking over our lives and aiming to harm us, or even worse, annihilate us, is always there.

The DARPA CGC competition, a challenge born out of good intentions and aimed at closing the ever-growing gap between attackers’ sophistication and defenders’ toolsets, has raised concerns from Elon Musk, who fears that it can lead to Skynet: the AI from the Terminator movies, used here as a metaphor for a destructive and malicious AI haunting humanity. Indeed, the CGC has set a high bar for AI, and one can imagine how smart software that knows how to attack and defend itself could turn into a malicious and uncontrollable machine-driven force. On the other hand, there seems to be a long way to go before a self-aware mechanical enemy emerges. How long it will take, if it happens at all, is the central open question. This article aims to dissect the underlying risks posed by the CGC contest, which are of real concern, and more generally to contemplate what is right and wrong in AI.

Dissecting Skynet

The history of AI has parts that are publicly available, such as work done in academia, as well as parts that are hidden and take place in the labs of many private companies and individuals. Ordinary people outside the industry are exposed only to the effects of AI, such as a smart chatbot that can speak to you intelligently. One way to dissect the impact of the CGC is to track it bottom-up: understand how each new concept in the program can lead to a further step in the evolution of AI, and imagine possible future steps. The other way, which I choose for this article, is to start at the end and go backward.

So let us start at Skynet.

Wikipedia describes Skynet as follows: “Rarely depicted visually in any of the Terminator media, Skynet gained self-awareness after it had spread into millions of computer servers all across the world; realising the extent of its abilities, its creators tried to deactivate it. In the interest of self-preservation, Skynet concluded that all of humanity would attempt to destroy it and impede its capability in safeguarding the world. Its operations are almost exclusively performed by servers, mobile devices, drones, military satellites, war-machines, androids and cyborgs (usually a Terminator), and other computer systems. As a programming directive, Skynet’s manifestation is that of an overarching, global, artificial intelligence hierarchy (AI takeover), which seeks to exterminate the human race in order to fulfil the mandates of its original coding.” This description points to several core capabilities that Skynet has acquired and that seem to be the basis for its power and behavior:

Self Awareness

A somewhat vague skill borrowed from humans. Translated to machines, it may mean the ability to identify its own form, weaknesses, and strengths, as well as the risks and opportunities posed by its environment.

Self Defense

The capacity to identify its shortcomings, be aware of risks, categorize actors as agents of risk, and take different risk mitigation measures to protect itself: first from destruction, and later from losing territories under its control.

Self Preservation

The ability to set a goal of protecting its existence, applying self-defense to survive and adapt to a changing environment.

Auto Spreading

The capacity to spread its presence into other computing devices that have enough computing power and resources to support it, and to synchronize those devices so that they form a single entity. Synchronization seems to be implemented via data communication methods, but it is not limited to that.

These vague capabilities are interwoven with each other, and there seem to be other, more primitive conditions that are required for an active Skynet to emerge.

The following three principles are more atomic and do not overlap with each other:

Self-Recognition

The ability to recognize its own form, including recognizing its software components and algorithms as an integral part of its existence. Once the elements that comprise the bot are identified, there follows a recursive process of learning which conditions are required for each component to run properly. For example, understanding that a particular OS is needed for its software components to run, that a specific processor is needed for the OS to run, that a particular type of electricity source is required for the processor to work properly, and so on. Eventually, the bot should be able to acquire all of this knowledge, where its boundaries are set in the digital world; the second principle extends this knowledge beyond them.
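The recursive process described above can be illustrated with a minimal sketch. It assumes the dependency knowledge is already available as a simple map; the REQUIRES table and the discover_requirements function are hypothetical names invented for this illustration, and the entries are taken from the example in the text. A real bot would have to discover this knowledge rather than be handed it.

```python
# Hypothetical dependency map, using the example from the text:
# software needs an OS, the OS needs a processor, the processor needs electricity.
REQUIRES = {
    "bot software": ["operating system"],
    "operating system": ["processor"],
    "processor": ["electricity source"],
    "electricity source": [],
}


def discover_requirements(component, requires):
    """Return everything `component` depends on, directly or indirectly,
    in the order in which the dependencies are discovered."""
    found = []

    def visit(name):
        for dependency in requires.get(name, []):
            if dependency not in found:  # skip what is already known
                found.append(dependency)
                visit(dependency)  # recurse into the dependency's own needs

    visit(component)
    return found


chain = discover_requirements("bot software", REQUIRES)
print(chain)  # ['operating system', 'processor', 'electricity source']
```

The point of the sketch is only that the chain is followed one step at a time until nothing new is found. The last item in the chain is where the digital world ends and the physical one begins.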

Environment Recognition

The ability to identify objects, conditions, and intentions arising from reality, in order to achieve two things. The first is to broaden the process of self-recognition: for example, if the bot understands that it requires an electrical source, then identifying the electrical sources available in a particular geographical location extends that knowledge into the physical world. The second is to understand the general and specific conditions in the environment that have an impact on the bot, such as the weather or the stock markets, and what their implications are. This also includes an understanding of the real-life actors that can affect its integrity, namely humans (or other bots). Machines need to understand humans in two respects, their capabilities and their intentions. Both are eventually based on a historical view of the digital trails people leave and on the ability to predict future behavior from that history. Imagine the logical flow of a machine seeking to understand the relevant humans by following the chain of its self-recognition process. Such a machine would identify the people operating the electrical grid that supplies its power, identify their weaknesses and behavioral patterns, and then predict their intentions, which may eventually bring the machine to the conclusion that a specific person poses too much risk to its existence.

Goal Setting

The equivalent of human desire in machines is the ability to set a specific goal that is based on knowledge of the environment and of itself, and then to establish a nonlinear milestone to be achieved. An example goal could be to have replicas of itself on multiple computers in different geographical locations in order to reduce the risk of shutdown. Setting a goal and investing effort in achieving it also requires the ability to craft strategies and refine them on the fly, where a strategy means a sequence of actions that gets the bot closer to its goal. The machine needs to be pre-seeded with at least one a priori intent, which is survival, and to apply a top-level strategy that continuously aspires to continued operation and reduced risk.

Humans are the most unpredictable factor for machines to comprehend, and as such they would probably be deemed enemies very quickly if such an intelligent machine existed. The technical difficulties standing in front of such a machine are numerous: roaming across different computers, learning the digital and physical environment, and gaining long-term thinking. Once these are solved, the uncontrolled variable, humans, with their own desires, free will, and control over the system, would logically be identified as a severe risk to the top-level goal of survivability.

What We Have Today

The following is an analysis of the state of AI development in light of these three principles, with specific commentary on the risks induced by the CGC competition:

Self-Recognition

Today the leading development of AI in this area takes the form of different models that can acquire knowledge and be used for decision making, ranging from decision trees and machine learning clustering up to deep learning neural networks. These are all models designed for specific use cases such as face recognition or stock market prediction. The evolution of models, especially in the unsupervised field of research, is fast-paced, and the breadth of what models can perceive grows as well. The second part required to achieve this capability is exploration, discovery, and understanding of new information. Today all models are fed by humans with specific data sources, and significant portions of the knowledge about a machine's own form are undocumented and not accessible. Having said that, learning machines are gaining access to more and more data sources, including the ability to autonomously select information sources available via APIs. We can definitely foresee that machines will evolve towards owning a significant part of the capabilities required to achieve Self-Recognition. In the CGC contest, the bots were indeed required to defend themselves, and as such to identify security holes in the software they were running in, which is equivalent to recognizing themselves. Still, it was a very narrow application of discovery and exploration, with limited and structured models and data sources designed for the particular problem. It seems more like a composition of ready-made technologies customized for the particular problem posed by the CGC than a real nonlinear jump in the evolution of AI.

Environment Recognition

Here there are many trends that help machines become more aware of their surroundings, from IoT, which is wiring up the physical world, to the digitization of many aspects of that world, including human behavior, through sources such as Facebook profiles and Fitbit heart monitors. Today this data is not easily accessible to machines, since it is distributed and highly variable in its formats and meaning. Still, it exists, which is a good start in this direction. Humans, on the other hand, are again the most difficult nut to crack, for machines as well as for other people, as we know. Still, understanding people may not be that critical for machines, since they can be risk-averse: rather than going too deep into understanding humans, they can simply decide to eliminate the risk factor. In the CGC contest, understanding the environment did not pose a great challenge, as the environment was highly controlled and documented. It was again a matter of reusing the tools needed to solve the particular problem: making sure security holes are not exposed by others, while trying to penetrate the same or other security holes in other similar machines. On top of that, the CGC created an artificial environment with a new, unique OS, set up to make sure that vulnerabilities uncovered in the competition could not be used in the wild on real-life computers. A side effect was that the environment the machines needed to learn was not the real-life environment.

Goal Setting

Goal setting and strategy crafting are something machines already do in many products driven by specific use cases. One example is setting the goal of maximizing the revenue of a stock portfolio and then creating and employing different strategies to reach it. These are goals that are designed and controlled by humans. We have not yet seen a machine that has been given a top-level goal of survival. There are many developments in the area of business continuity, but they are still limited to tools aimed at achieving tactical goals, not a grand goal of survivability. The goal of survival is fascinating in that it serves the interest of the machine itself, and when it is the only or primary goal, it becomes problematic. The CGC contest was new in setting the underlying goal of survivability into the bots. Although the implementation in the competition was narrowed down to a very particular use case, it still made many people think about what survivability may mean to machines.

Final Note

The real risk posed by the CGC lies in sparking the thought of how we can teach a machine to survive; once that is achieved, Skynet may be closer than ever. Of course, no one can control or restrict the imagination of other people, and survivability was on the minds of many before the challenge, but this time it was sponsored by DARPA. It is not new that plans to achieve one thing eventually lead to completely different results, and we will see in time whether the CGC contest started a fire in the wrong direction. In a way, today we are like the people in Zion as depicted in the Matrix movies: the machines in Zion do not control the people, but the people are entirely dependent on them, and shutting them down is out of the question. In this fragile relationship, it is indeed wise to understand where AI research is going and which ways are available to mitigate certain risks. The same line of thought is applied to nuclear weapons technology. One approach to risk mitigation is to think about a more resilient infrastructure for the next centuries, one where it won’t be easy for a machine to seize control of critical infrastructure and enslave us.

It is now the 5th of August 2016, a few hours after the competition ended, and it seems that humanity is intact as far as we can see.

The article will be published as part of the book of the TIP16 Program (Trans-disciplinary Innovation Program at Hebrew University), where I had the pleasure and privilege to lead the Cyber and Big Data track.