Give users an AI assistant they can ask to navigate them to the right screen or section of the application?
In a previous job, we built our AI assistant so that it could operate our UI in the front-end and it was very powerful.
like a cmd+k type deal or something different? we do have cmd+k navigation to everywhere currently + global search, but i worry that less sophisticated users might not use it.
No, I mean a copilot-style AI assistant the user can chat with to describe what they want to do; the assistant can either operate and navigate the UI to the right place, or perhaps even shortcut the steps by asking the user questions to gather the inputs for whatever they're trying to accomplish.
An example:
- user intent is to update an attribute for a component, part number A21445
- user can click a chat bubble icon in lower right and chat to the assistant
- user describes their intent - "help me update the description for part number A21445"
- system replies that it will open the right screen, then opens the part/component editing UI with the right part loaded and the cursor positioned in the description field, and the assistant stays open for further assistance; or,
- system replies that it found the part and can update the description directly, shows the user the current description, and asks "what description do you want?"
- user enters updated description
- system echoes the change back and asks the user to confirm it is correct
- user confirms the change is correct
- part/component description is updated without even opening the UI
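To make that concrete, the two paths could boil down to something like the following pseudo-tools. This is a minimal sketch; every name here (ui, parts_api and their methods) is made up purely to show the shape, not any real API:

    def navigate_to_part_editor(ui, part_number: str, focus_field: str) -> str:
        """Path 1: open the editing screen with the part loaded and the cursor placed."""
        ui.open_view("part_editor", record_id=part_number)
        ui.focus(field=focus_field)
        return f"Opened the editor for {part_number}; the {focus_field} field is ready to edit."

    def update_part_description(parts_api, part_number: str, new_description: str) -> str:
        """Path 2: make the change directly, after the user confirms, without opening the UI."""
        current = parts_api.get_part(part_number)["description"]
        parts_api.update_part(part_number, {"description": new_description})
        return f"Updated {part_number}: '{current}' -> '{new_description}'."

The assistant picks path 1 when the user wants to end up in the screen, and path 2 when it can finish the whole task conversationally.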
FWIW, it's great that you have cmd-K, and I've also seen those kinds of search boxes gain more smarts, like letting you type "part:A21445" to jump directly to a specific UI.
I just suggested the above as we learned some interesting user experiences became possible when our AI assistant had the ability to control our UI directly on behalf of a specific user.
An example in the app I worked in (a web based data pipeline tool):
- "Hey assistant, can you help me add some SQL transformation logic to dataflow ABC, to process the customer data?"
- system uses metadata and knowledge of the UI to open the right dataflow, pick the right editor so the user can add a SQL query in the right place, and maybe even autogenerate the initial SQL query - all from the app's main home page, via a side-panel chat assistant.
- net result feels like talking to the assistant to operate the app, almost no clicks required.
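The tool layer behind that looked roughly like this (names and fields are illustrative, not our actual schema):

    TOOLS = [
        {
            "name": "open_dataflow",
            "description": "Open an existing dataflow by name in the editor.",
            "parameters": {"dataflow_name": "string"},
        },
        {
            "name": "add_transform_step",
            "description": "Add a transformation step (e.g. a SQL query) to the open dataflow.",
            "parameters": {"step_type": "sql | python | join", "initial_code": "string"},
        },
    ]
    # For the request above, the assistant resolves "dataflow ABC" from metadata,
    # calls open_dataflow, then add_transform_step with a generated starter query.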
I hope that makes sense.
This is a power-user’s nightmare.
Instead of arranging things in a logical hierarchy and enabling quick navigation through keystrokes, you just toss out making the UI make sense because the chatbot can solve it all.
Your job is to make a good product. AI-as-interface is just slapping a layer of randomized language parsing in between the user and a bad UI.
I’m not actually saying no chatbots. What I am saying is using a chatbot to solve a usability problem is a massive middle finger to anyone who might have to use your product for 8 hours a day. Make it good, then make it idiot-proof. Don’t make it for idiots only.
I agree with you.
I used to work in green screen text based UIs from the 80s (TUI).
Power users didn't need search or anything else - they memorized the keystrokes to navigate the text UI and could just type key combinations to blaze through the UI faster than it could render on the screen.
I've never really seen anyone able to replicate that UI in a browser based or GUI desktop app to be honest.
Power users are a different use case, although the end goal is to remove the need for clicks.
I don't believe chatbots/AI assistants are a panacea, definitely encourage the architects of this new ERP platform to weigh the pros and cons.
That said, two jobs back I worked for a major manufacturing company that used the old desktop GUI Oracle EBS ERP. To automate repetitive workflows they were trying to implement UIPath (RPA automation that drives the UI for the user) on top of the GUI.
That's what led me to believe that if the ERP application's functionality is discoverable by an AI assistant, the assistant can be used to automate or navigate on behalf of the user, or as part of complex workflows.
That can be done later after the basics are addressed - my only advice would be to just consider it sooner, even if you don't build it first.
It's a little easier to think through how one might simplify the workflows and design automation into the core of the product (via the UI, APIs, etc.) earlier rather than later.
But in general, focus first on the users' needs and the different roles/personas - just don't completely ignore new types of automation/workflow opportunities (e.g. AI assistants/chatbots).
My opinion only.
If the product is API based then it's possible to use a TUI library and make an app that someone can keyboard their way through and have it talk to the mothership through the APIs.
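Even something this small would go a long way - a keyboard-driven client that talks to the mothership's API instead of the browser UI (the base URL, endpoints, and commands here are invented for illustration):

    import requests

    BASE = "https://erp.example.com/api"  # hypothetical mothership API

    COMMANDS = {
        "p": ("part lookup", lambda q: requests.get(f"{BASE}/parts/{q}").json()),
        "s": ("stock check", lambda q: requests.get(f"{BASE}/stock", params={"sku": q}).json()),
    }

    def main():
        while True:
            key = input("[p]art, [s]tock, [q]uit > ").strip().lower()
            if key == "q":
                break
            if key in COMMANDS:
                label, call = COMMANDS[key]
                print(label, "->", call(input("id: ").strip()))

    if __name__ == "__main__":
        main()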
Chatbots would be great for discoverability, especially if you have a ton of docs that nobody seems to read.
Ai slop to navigate the ai slop docs? Ai seven layer dip.
really appreciate you taking the time to write this!
we've started trying to work through adding agents like this: https://x.com/barbinbrad/status/1903047303180464586
the trouble is that there are 1000s of possible mutations -- and the quality of an agent tends to diminish with the number of "tools" you give it. i need to figure out the right abstraction for this.
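one shape i'm considering (all names made up) is routing in two steps, so the model never sees more than one domain's mutations per turn:

    MUTATIONS_BY_DOMAIN = {
        "parts":     ["createPart", "updatePart", "archivePart"],
        "inventory": ["transferStock", "adjustQuantity"],
        "sales":     ["createSalesOrder", "cancelSalesOrder"],
        # ...hundreds more, but never all visible to the model at once
    }

    def tools_for_turn(domain: str) -> list[str]:
        """Return only the mutations the agent may see after a first routing step."""
        return MUTATIONS_BY_DOMAIN.get(domain, [])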
I pray you focus on your core product and don’t fall into an agentification rabbit hole.
If you do want everything to be automatable take a page from Blender and give every action a key binding plus a Python method, so Python scripts can take the same actions a human would, but as function calls instead of clicks. Then maybe maybe maybe you can have a text field that allows natural language to transform to an action, but please god stay away from chat interfaces.
Rhino CAD is another interesting interface to look at, there’s a million buttons and menus but there’s also a text field at the top of the viewport where you can just type a command if you already know the name instead of rummaging through submenus. Kind of a CLI within the GUI.
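The shape I mean is roughly this kind of action registry, where a keybinding, a script, a typed command name, or (much later, if you must) a natural-language layer all end up calling the same function. Purely illustrative, not Blender's or Rhino's actual API:

    ACTIONS = {}

    def action(name, keybinding=None):
        """Register a function as a named, keyboard-bindable, scriptable action."""
        def register(fn):
            ACTIONS[name] = {"fn": fn, "key": keybinding}
            return fn
        return register

    @action("part.update_description", keybinding="ctrl+shift+d")
    def update_description(part_number: str, text: str):
        print(f"updating {part_number} description to: {text}")

    # a script or command field invokes the same code path by name:
    ACTIONS["part.update_description"]["fn"]("A21445", "corrected spelling")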
I somewhat agree with you, especially the idea that one could identify a common abstraction that an LLM could later piggyback on top of.
Genuine question though - have you implemented an AI assistant/chat interface recently using LLMs on top of a UI?
I agree it can be a rabbit hole, but I just got through doing it on an app and there were definitely some things it really made way simpler and some complex scenarios that I'm not sure could have been done any more simply.
I built a chat interface in 2017 (this was with chatscript dialog trees with hole-filling and semantic search) that was ostensibly to prevent our data scientists from redundant work, ie, before they spent all day writing a SQL script, describe the job of the script and see if one already exists. The chatbot would then ask for the parameters required for a script, run the job, and then present a CSV of the returned data.
As we collected user feedback and refined the UX, we got closer and closer to an option tree that could be better represented by a drop down menu. It was kind of depressing, but I learned that the actual job of that R & D wasn't to come up with a superintelligent chatbot that replaced data scientists, it was to come up with the infrastructure that would allow data scientists to put their python scripts in a common repository to allow re-use without re-installing locally and screwing around with pyenvs.
Anyway, I'm also traumatized by my involvement with a YC startup that actually had a very good (if ENRONish) product around peer-to-peer energy futures trading, which completely fell apart when investors demanded they make it "AI".
Cool! Yeah, that's the kind of UI/UX I meant.
I agree about finding the right abstraction, and it's tough to strike the balance. In our data pipeline app, what we did was expose the app's key core functionality so the assistant could use it, and we implemented a handful of basic agents out of the box, including one default agent that could shell out work to the others. We also made it easy, as an extension point, for users to add a new agent that used the core functionality/tools, just by defining the agent in a markdown file.
We found it useful to start small, with the critical use cases that saved the most time, while still thinking in terms of building blocks.
Because the AI assistant's responses come back and are processed in the UI, we found we could give the LLM our UI docs as well as knowledge about UI element IDs, etc., so it could respond with input commands that would drive the UI.
This way, we could provide the LLM with a prompt that included context like: what page/view the user is on, what their intent is, what tools are available in general, what sub-agents are available for specialized tasks, and so on.
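Roughly (field names simplified and reconstructed from memory), the per-turn context and the structured reply looked like this:

    context = {
        "current_view": "home",
        "user_message": "add SQL transformation logic to dataflow ABC",
        "ui_elements": {"dataflow_search": "#df-search", "add_step_button": "#btn-add-step"},
        "tools": ["open_dataflow", "add_transform_step"],
        "sub_agents": ["sql_author"],
    }

    ui_commands = [  # what the front-end replays to drive the UI
        {"action": "click", "target": "#df-search"},
        {"action": "type",  "target": "#df-search", "text": "ABC"},
        {"action": "click", "target": "#btn-add-step"},
    ]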
Please don't let my suggestions sway you away from core progress in your app (take them with a grain of salt). But it's great you're already experimenting - keep your eyes open for use cases where it clearly accelerates a workflow.
Another HNer mentioned people not reading docs - that's a low-hanging-fruit use case we had too: "how do I use this view?", "what does this field mean?", or retrieving information from other parts of the app without having to navigate away. It can save having to find answers in a doc or navigate elsewhere.
Edit: perhaps a useful exercise - imagine a workflow of "talking to the app to achieve a task" as a way to explore.
"Hey ERP, open the part entry screen for part 12345"
"Hey ERP, can you update the description for part 12345 to correct the spelling error?"
"Hey ERP, how many of widget XYZ are in stock? If there are enough in stock, can you transfer quantity 10 from warehouse A to B?"
"Hey ERP, how do I cancel a sales order?"
"Hey ERP, how does this screen work?"
I think if you break these down, you'll find common abstractions that map to features, API endpoints, user interface sequences and interactions, triggering workflows, looking things up in docs, etc.
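As a purely illustrative breakdown of the phrases above into primitives:

    INTENT_PRIMITIVES = {
        "open the part entry screen for part 12345":  ["navigate(view, record)"],
        "update the description for part 12345":      ["lookup(record)", "mutate(field, value)", "confirm()"],
        "transfer quantity 10 from warehouse A to B":  ["query(stock)", "check(threshold)", "mutate(transfer)", "confirm()"],
        "how do I cancel a sales order?":              ["search(docs)"],
        "how does this screen work?":                  ["describe(current_view)"],
    }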
Please share feedback on keyboard shortcuts from real users. Shops in India now tend to use touchscreens or have staff fiddle with laptops with tiny touchpads, but lots of shops use ERPs (Tally) with a 90s UI that doesn't need a mouse.