The User Interface for generative AI matters
Why the way customers encounter generative AI makes a real difference in adoption.
I was one of the very early adopters of DALL-E 2, before it was released to the public. I played with it and got bored very quickly, and I haven’t been back many times; it struck me as a powerful curiosity. When Stable Diffusion came out, it struck me as an even better delivery model (open source, local) and I played with it some as well (Stable Diffusion requires more effort). At this point in the technology’s introduction, I’ve digested how powerful it is and how much of a game-changer generative AI will become (huge). But one thing stuck out from these early forays:
Sometimes when we tout new technology, we are pretty sloppy about HOW we introduce it to an audience. I suspect the same thing plagues other instances of ‘generative AI’ like GitHub’s Copilot and the Copy.ai and Jasper.ai products.
But first, let’s briefly examine why I got bored with DALL-E 2. I started by entering text to generate a new image.
The wait for image generation was long. The waiting state shown was okay, but it did nothing to keep me calm while I waited.
The image created was usually not what I had in my head (can’t blame DALL-E, it’s not a mind reader). So I needed to adjust either the generative words or the image.
Alas, there was no real help in making my words better: for example, real-time previews of what new descriptions might produce before generating the image I was after, or even basic suggestions on how to change my own input description.
And there were no image adjustment tools. What if it got things mostly right and I just needed to adjust one little thing? I was forced to take it all or leave it.
And this was all in the first 5 minutes of use!
These are all basic usability issues that were ignored to get this amazing new tool out. But being amazing wasn’t enough. The lack of usability killed my awe and my adoption.
The lesson here is: generative AI techniques are just technology. They don’t become a product until we can craft the right product experience around them, one that is usable by most people. This will make all the difference. The interface we use to do this will be a matter of judgment, of research and trial and error.
This is why ChatGPT was such a success. The chatbot interface was very accessible. The chat history they added afterward is even better as a way to keep track of your evolving use of it. The underlying GPT-3 technology had been around for a while, but the interface made all the difference to mass adoption (as well as the free price point, but that’s a whole other story).
A couple of mental models to keep in mind:
Generative AI is a ‘fast start’. It gets people off ‘blank paper’. This is huge. But what we are doing is a substitution. In non-generative-AI products (most products), I start with a blank (or mostly blank) document and build something (Google Docs, etc.), or I start from some kind of template. I work on it till it’s finished. Let’s say that takes 60 minutes.
Workflow 1 = blank or template + hands on keyboard = finished in 60 minutes
If I have generative AI, the equation changes:
Workflow 2 = fast start + tweaking (hands on keyboard) = finished in X minutes.
If X is not about 30 minutes or less, then customers will bail. In fact, it ought to be 10 minutes or less for people to feel like it’s amazing. That would represent a 6x compression of their work, which is what a perceptible productivity gain is about.
So if you don’t offer flexible tweaking options, or the ‘fast start’ is not within 5% of perfection (to reduce the amount of tweaking needed), you will push X closer to 60 minutes or even beyond, and the purpose of ‘workflow compression’ will be defeated. Your use of AI will be pointless.
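To put numbers on that compression, here is a minimal sketch in Python. It assumes the 60-minute baseline from the mental model above; the specific fast-start and tweak times in the examples are illustrative assumptions, not measurements.

```python
# Minimal sketch of the 'workflow compression' arithmetic described above.
# The 60-minute baseline and the ~6x "feels amazing" target come from the
# article; the sample times below are illustrative assumptions.

BASELINE_MINUTES = 60  # Workflow 1: blank page or template, hands on keyboard


def compression(fast_start_minutes: float, tweak_minutes: float) -> float:
    """How many times faster Workflow 2 is than the 60-minute baseline."""
    x = fast_start_minutes + tweak_minutes  # Workflow 2: fast start + tweaking
    return BASELINE_MINUTES / x


# A good fast start that needs little tweaking clears the ~6x bar:
print(compression(fast_start_minutes=2, tweak_minutes=8))   # 60 / 10 = 6.0x

# A fast start that still needs heavy tweaking defeats the purpose:
print(compression(fast_start_minutes=2, tweak_minutes=55))  # 60 / 57 ≈ 1.05x
```

The point of the sketch is that the fast start itself is cheap either way; it is the tweaking term that decides whether X lands near 10 minutes or drifts back toward the original 60.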
The User Interface and User Experience for generative AI approaches will matter tremendously to its more general adoption and success.
Totally agree. Generative AI is a mind-blowing technology, but UX/UI will be the key to driving it to the mainstream, and ChatGPT is doing an amazing job at it.
Also saw your hint on its free pricing - I believe that OpenAI just ran an incredibly successful PLG experiment!
On your DALL-E 2 point about tweaking the image the AI renders, power users are always ahead of the curve - here's someone already trying to fix the problem: https://twitter.com/wbuchw/status/1563162131024920576?s=20&t=X9jlT6ydY4mb8cyNdOI2pQ
Loved this article and agree with it!