Recently Stephen Downes linked to an article on the various levels of sophistication of AI personal assistants. He added that while all efforts are currently at the third of those five levels, he sees a role in education for such assistance only once level 4 or higher is available (which is not now the case).
AI assistants maturity levels
Those five levels mentioned in the article are:
1. Notification bots and canned pre-programmed responses
2. Simple dialogues and FAQ style responses. All questions and answers pre-written, lots of ‘if then’ statements in the underlying code / decision tree
3. More flexible dialogue, recognising turns in conversations
4. Responses are shaped based on retained context and preferences stored about the person in the conversation
5. An AI assistant can monitor and manage a range of other assistants set to do different tasks or parts of them
I fully appreciate how difficult it is to generate natural sounding/reading conversation on the fly when a machine interacts with a person. But what stands out to me in the list above, and in the difficulties surrounding it, is something completely different. First, the issues mentioned are centered on processing natural language as a generic thing to solve ‘first’. Second, while the article refers to an AI-based assistant, the approach is that of a generic assistant put to use in 1-on-1 situations (a number of them in parallel), whereas the human expectation at the other end is that of receiving personal assistance. It’s the search for the AI equivalent of a help desk and call center person. There is nothing inherently personal in such assistance, it’s merely assistance provided 1-on-1. It’s a mode of delivery, not a description of the qualitative nature of the assistance as such.
Flip the perspective to personal
If we start reasoning from the perspective of the person receiving assistance, the picture changes dramatically. I mostly don’t want to interact with the AI shop assistants or help desk algorithms of each separate service or website. I would want to have my own software-driven assistant that then goes to interact with those websites. I as a customer have no need or wish to automate the employees of the shops / services I use, I want to reduce my own friction in making choices and putting those choices into action. I want a different qualitative nature of the assistance provided, not a 1-on-1 delivery mode.
That’s what a real PA does too: it is someone assisting a single person, a proxy employed by that person, not by whomever the PA interacts with on the assisted person’s behalf.
What is mentioned above only at level 4, retained context and preferences of the person being assisted, then becomes the very starting point. Context and preferences are then the default inputs. A great PA over time knows the person assisted deeply and anticipates friction to take care of.
This allows the lower levels in the list above, 1 and 2, the bots and preprogrammed canned responses and actions, to be a lot more useful. Because apart from our personal preferences and the contexts each of us operates in, the things we actually do based on those preferences and contexts are mostly very much the same. Most people use a handful of the same functions for the same purpose at the same time of day on their smart speakers for instance, which is a tell. We mostly have the same practices and routines, which shift slowly with time. We mostly choose the same thing in comparable circumstances etc.
Building narrow band personal assistants
A lot of the tasks I’d like assistance with can be well described in terms of ‘standard operating procedures’, and can be split up into atomic tasks. Atomic tasks you can string together.
My preferences and contextual deliberations for a practice or task can be captured in a narrow set of parameters that can serve as input for those operating procedures / tasks.
Put those two things together and you have the equivalent of a function that you pass a few parameters. Basically you have code.
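To make that concrete, here is a minimal sketch in Python of a 'standard operating procedure' built from atomic tasks that all receive the same small set of personal parameters. Every name in it (ProjectPreferences, plan_folders, plan_tasks, run_procedure, the folder and task names) is a hypothetical illustration, not an existing tool, and the tasks only produce a plan as text rather than touching the file system.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ProjectPreferences:
    """Hypothetical parameters capturing one person's preferences for a new project."""
    name: str
    folders: tuple[str, ...] = ("notes", "admin", "deliverables")
    first_tasks: tuple[str, ...] = ("Write project brief", "Plan first review")


# An atomic task takes the preferences plus the plan so far and returns the extended plan.
Task = Callable[[ProjectPreferences, list[str]], list[str]]


def plan_folders(prefs: ProjectPreferences, plan: list[str]) -> list[str]:
    """Atomic task: describe the folder structure to create."""
    return plan + [f"mkdir {prefs.name}/{folder}" for folder in prefs.folders]


def plan_tasks(prefs: ProjectPreferences, plan: list[str]) -> list[str]:
    """Atomic task: describe the first standard tasks to add."""
    return plan + [f"add task: {task}" for task in prefs.first_tasks]


def run_procedure(prefs: ProjectPreferences, tasks: list[Task]) -> list[str]:
    """String the atomic tasks together, passing the same preferences to each."""
    plan: list[str] = []
    for task in tasks:
        plan = task(prefs, plan)
    return plan


# The 'standard operating procedure' is just an ordered list of atomic tasks.
NEW_PROJECT: list[Task] = [plan_folders, plan_tasks]

if __name__ == "__main__":
    for step in run_procedure(ProjectPreferences(name="demo"), NEW_PROJECT):
        print(step)
```

Changing a habit then means changing a parameter or reordering the list, not rewriting the procedure.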
Then we’re back to automating specific tasks and setting the right types of alerts.
Things like: when I have a train trip scheduled in the morning, I want an automatic check for disruptions on my route when I wake up, and at regular intervals until 20 minutes before the train leaves (which is when I get ready to leave for the railway station). I want my laptop to open a specific workspace set-up if I open it before 7 am, and a different one when I re-open it between 08:30 and 09:00. When planning a plane trip, I want an assistant that asks me about the considerations in my schedule: what would be a reasonable time to arrive at the airport for departure, and when I need to be back. I want it to already know my preferences, given various event times and time zone differences, for spending a night at the destination before or after a commitment. It should search for a hotel with filter rules based on my standard preferences (location relative to the event location and public transport, quality, price range), or simpler yet, rebook a hotel from a list of previous good experiences after checking that e.g. the price hasn’t gone up too much. It should know my preference for direct flights and for specific airlines (and specific airlines in the case of certain clients), etc. Although travel in general isn’t a priority now, obviously. When I start a new project I want an assistant to ask a handful of questions, and then arrange the right folder structure, create some populated core notes, plan review moments, and populate task lists with the first standard tasks. I only need to know the rain radar forecast for my daughter’s school start and finish times, and for appointments where my preferred transport mode is bicycle. For the half dozen most used voice commands I might consider Mycroft on a local system, foregoing the silos. Keeping track of daily habits, asking me daily reflection questions. Etc.
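Two of these examples, the train check and the laptop workspace, come down to simple time rules. The sketch below shows one way they could look in Python. The 15 minute interval, the workspace names and both function names are my assumptions for illustration; only the 20 minute cut-off and the 7 am and 08:30-09:00 windows come from the description above. The actual disruption lookup and workspace switching are left out.

```python
from datetime import datetime, time, timedelta
from typing import Optional


def disruption_check_times(
    wake_up: datetime,
    departure: datetime,
    interval: timedelta = timedelta(minutes=15),     # assumed interval
    stop_before: timedelta = timedelta(minutes=20),  # cut-off from the text
) -> list[datetime]:
    """Return the moments to check the route: at wake-up, then at regular
    intervals, with no check later than `stop_before` the departure."""
    last_allowed = departure - stop_before
    times = []
    current = wake_up
    while current <= last_allowed:
        times.append(current)
        current += interval
    return times


def workspace_for(opened_at: time) -> Optional[str]:
    """Pick a workspace set-up based on when the laptop is opened.
    The workspace names are hypothetical placeholders."""
    if opened_at < time(7, 0):
        return "early-morning"
    if time(8, 30) <= opened_at <= time(9, 0):
        return "workday-start"
    return None  # no special set-up outside those windows


if __name__ == "__main__":
    wake = datetime(2021, 3, 1, 6, 30)
    train = datetime(2021, 3, 1, 8, 0)
    for moment in disruption_check_times(wake, train):
        print("check route at", moment.strftime("%H:%M"))
    print(workspace_for(time(6, 45)))
```

Both functions only decide when and what; a scheduler or login hook of your choice would call them and do the actual work.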
While all this sounds difficult when you would want to create this as generic functionality, it is in fact much simpler in the case of building it for one specific individual. And it won’t need mature AI natural conversation, merely a pleasantly toned interaction surface that triggers otherwise hard coded automated tasks and scripts. The range of tasks might be diverse, but the range of responses and preferences to take into account is narrow, as it only needs to pertain to me. It’s a narrow band digital assistant, it’s the small tech version.
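Such an interaction surface can be very thin. As a rough sketch, assuming a handful of keywords is enough to tell my routines apart, a lookup table that maps a keyword to a hard coded task already does the job. The keywords, replies and function names below are made up for illustration, and each function stands in for a real script.

```python
import re
from typing import Callable


def check_trains() -> str:
    return "Checking your route for disruptions."


def rain_forecast() -> str:
    return "Looking up the rain radar for the school run."


def daily_reflection() -> str:
    return "What went well today?"


# Hypothetical keyword-to-task table: the whole 'conversation model'.
COMMANDS: dict[str, Callable[[], str]] = {
    "train": check_trains,
    "rain": rain_forecast,
    "reflect": daily_reflection,
}


def respond(utterance: str) -> str:
    """Trigger the first hard coded task whose keyword appears as a word
    in the utterance, or answer with a friendly fallback."""
    words = re.findall(r"[a-z]+", utterance.lower())
    for keyword, action in COMMANDS.items():
        if keyword in words:
            return action()
    return "I don't have a routine for that yet."


if __name__ == "__main__":
    print(respond("Is my train on time?"))
    print(respond("Will it rain this afternoon?"))
    print(respond("Tell me a joke"))
```

There is no language understanding here at all, which is the point: because the table only has to cover one person's routines, exact keywords go a long way.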
Aazai
For some years I’ve dubbed bringing together the set of individual automation tasks I use into one interaction flow, as a personal digital assistant, ‘Aazai’ (a combination of my initials A.A.Z. with AI, where the AI isn’t AI of course but merely has the same intention as what is being attempted with AI generally). While it mostly doesn’t exist as a single thing yet, it is the slow emergence of something that reduces digital friction throughout my day, and shifts functionality with my shifts in habits and goals. It is a strung-together set of automated things arranged in the shape of my currently preferred processes, which allows me to reduce the time spent consciously adhering to a process. Something that is personal and local first, and the atomic parts of which can be shared or are themselves re-used from existing open source material.