Chat bots are everywhere. It feels like the early days of mobile apps, when you either knew someone who was building an app or knew many others planning to do so. Chat bots have their magic: a frictionless interface that lets you chat naturally, the main difference being that on the other side sits a machine and not a person. Still, someone as old as I am has to wonder: are chat bots the end game of human-machine interaction, or just another evolutionary step on a long path?
How Did We Get Here?

I've noticed chat bots for quite a while, and they piqued my curiosity about the possible use cases as well as the underlying architecture. What interests me more are the ambitions of Facebook and the other AI superpowers toward them. And chat bots are indeed the next step in human-machine communications. We all know where the history began: at first we had to communicate via a command line interface limited by a very strict vocabulary of commands, an interface reserved for the computer geeks alone. The next evolutionary step was the big wave of graphical user interfaces, ugly ones at first, but improved later in significant leaps that made the user experience as smooth as possible, though still bounded by the options and actions available in a specific context of a particular application. Alongside graphical user interfaces we were introduced to search-like interfaces, which mix graphical elements with a command-line input that allows extensive textual interaction; here the GUI serves primarily as a navigation tool. Then other human-machine interfaces were introduced, each evolving on its own track: the voice interface, the gesture interface (usually hands), and the VR interface. Each of these interaction paradigms uses different human senses and body parts to communicate with the machine, and the machine can understand you to a certain extent and communicate back. And now we have chat bots, and there is something different about them. In a way, it is the first time you can express yourself freely via text and the machine will understand your intentions and desires. That's the premise. It does not mean every chat bot can respond to every request, since chat bots are confined to the logic programmed into them, but from a language-barrier point of view a new peak has been reached. So are we now experiencing the end of the road for human-machine interactions?
Last week I met an extraordinary woman named Zohar Urian (the lucky Hebrew readers can enjoy her super smart blog about creativity, innovation, marketing, and lots of other cool stuff), and she said that voice would be next, which makes a lot of sense. Voice has less friction than typing, its popularity in messaging is only growing, and the technology is almost at the point where a machine can understand free speech. Zohar's sentence echoed in my brain and made me dig deeper into the anatomy of the evolution of human-machine interfaces.
The Evolution of Human-Machine Interfaces

The progress of human-machine interaction follows evolutionary patterns. Every new paradigm builds on capabilities from the previous one, and eventually survival of the fittest plays a significant role: the winning capabilities survive and evolve. Thinking about it, it is very natural to grow this way, since the human factor in this evolution is the dominating one. Every change in this evolution can be decomposed into four dominating factors:
- The brain, or the intelligence within the machine: the intelligence that contains the logic available to the human, and the capabilities that define the semantics and boundaries of communication.
- The communication protocol provided by the machine, such as the ability to decipher audio into words and sentences, which enables voice interaction.
- The way the human communicates with the machine, tightly coupled to the machine's communication protocol but playing the complementary role.
- The human brain.
Command Line, 1st Generation
The first interface, used to send restricted commands to the computer by typing on a textual screen.
- Machine Brain: Dumb; restricted to a set of commands and a selection of options per system state
- Machine Protocol: Textual
- Human Protocol: Finger typing
- Human Brain: Smart
Graphical User Interfaces
A 2D interface controlled by a mouse and a keyboard, allowing text input and selection of actions and options.
- Machine Brain: Dumb; restricted to a set of commands and a selection of options per system state
- Machine Protocol: 2D positioning and textual
- Human Protocol: 2D hand movement and finger actions, as well as finger typing
- Human Brain: Smart
Adaptive Graphical User Interfaces
Same as the previous one, though here the GUI is more flexible in its possible input, thanks in part to situational awareness of the human context (location, for example).
- Machine Brain: Getting smarter; able to offer different sets of options based on profiling of the user's characteristics, but still limited to a set of options and to 2D positioning and textual inputs
- Machine Protocol: 2D positioning and textual
- Human Protocol: 2D hand movement and finger actions, as well as finger typing
- Human Brain: Smart
Voice Interface, 1st Generation
The ability to identify content represented as audio and translate it into commands and input.
- Machine Brain: Dumb; restricted to a set of commands and a selection of options per system state
- Machine Protocol: Listening to audio and matching content within the audio track
- Human Protocol: A restricted set of voice commands
- Human Brain: Smart
Gesture Interface
The ability to identify physical movements and translate them into commands and selection of options.
- Machine Brain: Dumb; restricted to a set of commands and a selection of options per system state
- Machine Protocol: Visual reception and content matching within the video track
- Human Protocol: Physical movement of specific body parts in a certain manner
- Human Brain: Smart
Virtual Reality
A 3D interface with the ability to identify a full range of body gestures and translate them into commands.
- Machine Brain: A bit smarter, but still restricted to selection from a set of options per system state
- Machine Protocol: Movement reception via sensors attached to the body, and projection of peripheral video
- Human Protocol: Physical movement of specific body parts in free form
- Human Brain: Smart
AI Chat Bots
A natural-language detection capability that can identify the rules of human language within supplied text and translate them into commands and input.
- Machine Brain: Smarter and more flexible thanks to AI capabilities, but still restricted to a selection of options and capabilities within a certain domain
- Machine Protocol: Textual
- Human Protocol: Free-form finger typing
- Human Brain: Smart
Voice Interface, 2nd Generation
Same as the previous one, but combining a voice interface with natural language processing.
- Machine Brain: Same as the previous one
- Machine Protocol: Identification of language patterns and constructs in audio content, and translation into text
- Human Protocol: Free speech
- Human Brain: Smart
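The four-factor decomposition above is regular enough to capture as data. Here is a minimal sketch in Python (the class and field names are my own, not from any formal model) showing two of the generations and the one invariant that holds across all of them:

```python
from dataclasses import dataclass

@dataclass
class InterfaceGeneration:
    """One step in the evolution of human-machine interfaces,
    decomposed into the four factors described above."""
    name: str
    machine_brain: str        # intelligence available on the machine side
    machine_protocol: str     # how the machine receives communication
    human_protocol: str       # how the human expresses communication
    human_brain: str = "Smart"  # the constant in every generation

GENERATIONS = [
    InterfaceGeneration(
        "Command Line, 1st Generation",
        machine_brain="Dumb; restricted to a set of commands",
        machine_protocol="Textual",
        human_protocol="Finger typing",
    ),
    InterfaceGeneration(
        "AI Chat Bots",
        machine_brain="Smarter and flexible thanks to AI, within a domain",
        machine_protocol="Textual",
        human_protocol="Free-form finger typing",
    ),
]

# The observation that falls out of the model: the human brain never
# changes, while the other three factors evolve to match it.
assert all(g.human_brain == "Smart" for g in GENERATIONS)
```

Notice also that the machine protocol of the first and last entries is identical ("Textual"); the entire difference between a 1970s command line and a modern chat bot lives in the machine-brain factor, which is the point the observations below make.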
Observations

Several phenomena and observations emerge from this semi-structured analysis:
- Combining communication protocols, such as voice and VR, will extend the range of communication between humans and machines even without changing anything in the machine brain.
- Over time, more and more human senses and physical interactions become available for computers to understand, which extends the boundaries of communication. To date, smell and touch have not gone mainstream; I am pretty sure we will see them in the near future.
- The human brain always stays the same, and the rest of the chain always strives to match its capabilities. The chain can be viewed as a funnel that limits the human brain from fully expressing itself digitally, and over time the funnel gets wider.
- An interesting question is whether, at some point, the human brain itself will get stronger once communication with machines has no boundaries and AI is stronger.
- We have not yet witnessed a serious leap that removed one of the elements in the chain, which is what I would call a revolutionary step (everything so far has behaved in an evolutionary manner). Perhaps the identification of brain waves, translated in real time into a protocol a machine can understand, will be such a leap, removing the need to translate thoughts into some intermediate medium.
- As the machine brain becomes smarter with each evolutionary step, the magnitude of expression grows; so there is progress even without creating a more expressive communication protocol.
- From a communications point of view, chat bots are in a way a jump back to the initial command-line protocol, though the smartness of today's machine brains makes them a different thing entirely. So it is really about the progress of AI, and not about chat bots. I may have missed some interfaces; apologies, I am not an expert in that area :)