October 21, 2015
People are used to effortlessly talking, asking questions, and generally exploiting the uncanny expressiveness of human language. We barely need to think to understand spoken words, and parsing meaning only gets difficult when we are unable to empathize (sarcasm, humor, etc.). Despite their power, we don’t enjoy the same ease of use with personal computers. Four decades after Apple and Microsoft first introduced most of us to them, special training in giving computers instructions is still in high demand. Computers are very intelligent in the information-processing and memory sense, but intuition, the grounding of common sense understanding and probabilities in the real world that makes it so easy to work with other humans, is something they are only now beginning to develop.
It’s easy to program what we consciously think about. We explicitly follow a thought process when we play chess or do math, making these tasks (relatively) straightforward to instruct a computer to do. The skills we have failed to teach our machines are those we don’t need to think about. Skills like walking, understanding speech, and recognizing objects are intuitive and automatic to us, requiring little deliberate thought. How do we program something like speech understanding when we don’t think about it ourselves?
Well, outside of computers we rarely need to. From an early age, we take advantage of our ability to “learn by doing” - that is, to gradually change our behavior towards a particular goal, and thereby acquire new skills with no formal explanation at all, only experimentation. Emulating this learning process in computers is difficult, but new computing techniques are letting us do it. By building large statistical functions with many parameters and training them through the same kind of experimentation, with tight feedback loops, on the computing machinery otherwise used to build simulations (the GPUs made for video games), we are making real progress.
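To make that idea concrete, here is a deliberately tiny sketch of such a feedback loop: a “model” with just two parameters that is never given explicit instructions, only repeated trial and correction. It is purely illustrative and assumes nothing about any particular product; real systems use vastly larger functions and GPU-accelerated libraries.

```python
import random

# A tiny "model": predict y from x using two parameters (weight, bias),
# starting from random guesses.
w, b = random.random(), random.random()

# The behavior to acquire (never stated to the model directly): y = 3x + 1.
examples = [(x, 3 * x + 1) for x in range(-10, 11)]

learning_rate = 0.01
for step in range(1000):
    x, target = random.choice(examples)   # try something
    prediction = w * x + b
    error = prediction - target           # the feedback signal
    # Nudge each parameter slightly in the direction that reduces the error.
    w -= learning_rate * error * x
    b -= learning_rate * error

print(f"learned w={w:.2f}, b={b:.2f}")    # approaches w=3, b=1
```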
Why is intuition so critical to the ideal interface? Well, interfaces generally get easier to use by emulating those aspects of our interactions with the real world that we find intuitive. Through improved 2D and 3D graphics (the advent of the graphical user interface), touch/gesture sensing (the iPhone, iPad, Kinect), text understanding (Google) and even speech recognition/synthesis (Google Now, Cortana, Siri), we’ve been doing our best to scalably emulate the world we’re familiar with. Sure, people can learn new things in order to use a tool, but making the tool’s initial interface familiar means they can leverage their learning ability on far more interesting problems.
Generally, our interactions with the virtual world should be consistent with our expectations from the physical one. When we’re exploring, creating, inventing, or engineering, we don’t always know exactly what we’re looking for, so the process itself needs to give birth to the end product. Easy interfaces for exploration take advantage of our natural spatial awareness. Sculpting a piece of clay would be extremely difficult if you couldn’t directly mould the intermediate result and instead had to write all of your instructions out at once. Great interfaces, then, have strong physical metaphors: tools that can be grabbed and manipulated the way we intuitively understand, and a constant view of the end product as you work. All interactive graphical user interfaces are built on this principle, from advanced graphics and 3D modeling software to Excel spreadsheets.
When we communicate or seek information, however, we know exactly what we want, so the interface we expect is completely different. Only the intent matters, not the process, so we just speak, or type, with the completely open range of expression that language offers. These thoughts are so easy to express with language because we not only expect an intuitive experience for ourselves, but expect the interface itself to have some intuition of its own; we take advantage of implicit experience and intuition on both sides to share complex ideas in very few words.
This is something we're confident computers will eventually understand, but technological progress is often incremental, and only happens when people really care. New approaches are rarely as good as the tried-and-true standard at first, which disincentivizes radical innovation. And so inventors often avoid solving the problems people consider mission critical. They head for the promised land of entertainment and convenience instead, where novelty has intrinsic value and unreliability is tolerated. Video game consoles were a household item long before the conventional “personal computer”, and such forms of entertainment continue to push the envelope of what is technically possible (e.g. virtual reality). Eventually, the best of these entertainment and convenience technologies find their way into consumer habit and make the cut for real utility, but that happens much more slowly than it would if the problems were tackled directly.
People are demanding of interfaces for good reason; a magical interface that only works 50% of the time may seem like an improvement, but once you account for switching costs and habit learning it’s actually an enormous burden. The cases where simple automated solutions fail are the hardest to crack, but a solution that handles those edge cases elegantly and works reliably earns a world of value and trust. To jointly optimize for utility and technological progress, to build the human computer faster than anyone else, we need some sort of buffer. We need a system that guarantees consistent performance despite constant internal improvement, and in scheduling meetings via email, that buffer is our team of specially trained, software-augmented Claras.
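One way to picture that buffer, sketched below under assumed names and thresholds (this is an illustration of the general pattern, not Clara’s actual system): automation answers only when it is confident, and everything else is routed to a trained person, so the user sees consistent behavior even as the automated layer improves underneath.

```python
from dataclasses import dataclass

@dataclass
class Suggestion:
    reply: str
    confidence: float  # 0.0 to 1.0, the automated layer's own estimate

CONFIDENCE_THRESHOLD = 0.95  # hypothetical cutoff for going fully automatic

def automated_suggestion(email_text: str) -> Suggestion:
    # Stand-in for the statistical model; in this sketch it always defers.
    return Suggestion(reply="(draft reply)", confidence=0.0)

def human_review(email_text: str, draft: Suggestion) -> str:
    # Stand-in for the software-augmented human step.
    return "(reply edited and approved by a person)"

def handle_request(email_text: str) -> str:
    suggestion = automated_suggestion(email_text)
    if suggestion.confidence >= CONFIDENCE_THRESHOLD:
        return suggestion.reply                   # fast path: software alone
    return human_review(email_text, suggestion)   # buffer: a person finalizes it

print(handle_request("Can we meet Tuesday afternoon?"))
```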
Our progress in developing machines has always been twofold: we improve not only their power but also our interfaces with them. While greater power can make the tools more effective, interface improvements allow everyone to do more with the same amount of power: experienced users become more effective, and novices who previously couldn’t use the tool at all now can. The emergence of personal computing came from channeling progress in computational power into such an interface, one that was both more powerful and accessible to far more people. That process continues to this day, and is nowhere near its limits.
What are we working towards then? We are working towards an interface that you can depend on, an interface that doesn’t always need explicit instruction, one that takes advantage of what it knows about you, asks for what it doesn’t, and lets you work at a higher level of abstraction. An interface with intuition like ours, in both response and understanding, so we can apply our limited but wonderful deliberate thinking ability to higher level tasks. An interface that anyone, from a child who has never touched a computer to a computer science PhD, can take advantage of — one that is human.