I’ve been thinking lately that email is a good interface for certain modes of AI assistant interaction, namely “research” tasks that are asynchronous and take a relatively long time. Email is universal, asynchronous, uses open standards, supports structured metadata, etc.

This is how I initially pitched an AI assistant in my last shop.

It is a lot cheaper to leverage existing user interfaces & tools (i.e., Outlook) than it is to build new UIs and then train users on them.

Also, an email that comes back a minute later feels fast. A chat that types at the same speed feels slow.

I'm building something similar. See my comment to the OP above:

https://threadwise.app

If you want to get ahead of the curve, look into the Agent-to-Agent protocol Google just introduced. I'm currently using my own custom AI agent assistant to perform life tasks. If I could integrate better tooling/agents like yours into my own assistant system, that'd be awesome.

It's kind of like: sure, I could manage my own emails, or I could offload this to someone who does it better. If you do it better and it's affordable, I'm in.

We are on that starship to the future right now and I love it.

I've built adaptive agent swarms using email, mailing lists and FTP servers.

If you don't need the lowest possible latency for your work and you're happy to have threads die, then it's better than any bespoke solution you can build without an army of engineers to keep it chugging along.

What's even better is that you can see all the context, and use the same command plane as the agents to tell them what they are doing wrong.

Yep, went down a rabbit hole trying to build a company around this. It’s the perfect UI.

text + attachments into the system, text + attachments out
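
To make "text + attachments in, text + attachments out" concrete, here is a minimal sketch using only Python's standard `email` package. The function names `extract_request` and `build_reply` are made up for illustration, and the step that actually produces the answer (the LLM call) is deliberately left out.

```python
from email.message import EmailMessage


def extract_request(msg: EmailMessage):
    """Return (plain-text body, [(filename, bytes), ...]) from an incoming message."""
    body = msg.get_body(preferencelist=("plain",))
    # Compare against None explicitly: a Message's truthiness depends on its header count.
    text = body.get_content() if body is not None else ""
    files = [
        (part.get_filename(), part.get_payload(decode=True))
        for part in msg.iter_attachments()
    ]
    return text, files


def build_reply(incoming: EmailMessage, answer: str, attachments) -> EmailMessage:
    """Build a threaded reply carrying the answer text plus any output files."""
    reply = EmailMessage()
    reply["From"] = str(incoming["To"])
    reply["To"] = str(incoming["From"])
    subject = str(incoming["Subject"] or "")
    reply["Subject"] = subject if subject.lower().startswith("re:") else f"Re: {subject}"
    if incoming["Message-ID"]:
        # These two headers are what mail clients use to keep the thread together.
        reply["In-Reply-To"] = str(incoming["Message-ID"])
        reply["References"] = str(incoming["Message-ID"])
    reply.set_content(answer)
    for filename, data in attachments:
        reply.add_attachment(
            data, maintype="application", subtype="octet-stream", filename=filename
        )
    return reply
```

Sending and receiving (SMTP, IMAP, or a provider API) is a separate concern; the point is only that the whole request/response contract fits in two small functions.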

Well, it’s funny. This is essentially how I deal with many professionals in my life.

My finance guy, tax attorney, other attorneys. Send emails, get emails, occasionally a blind status update from them.

Sure, we have phone calls, sometimes get together for lunch.

But mostly it’s just emails.

> trying to build a company around this

I am still very open to this one. An email-based, artificial coworker is so obviously the right way to penetrate virtually every B2B market in existence.

I don't even really want to touch the technology aspects. Writing code that integrates with an LLM provider and a mailbox in E365 or Gmail is boring. The schema is a grand total of ten tables if we're being pedantic about things.

Working with prospects and turning them into customers is a way more interesting problem. I hunger for tangible use cases that are actually compatible with this shiny new LLM tooling. We all know they're out there, and email is probably the lowest friction way to get them applied to most businesses.

> Working with prospects and turning them into customers is a way more interesting problem.

Agreed. That's also the hardest part, and where most value is created.

How does email support structured metadata? Are you talking about X headers?
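
If X headers are what was meant, a rough sketch of carrying metadata that way with Python's standard `email` package might look like this. The `X-Agent-` prefix and the helper names are invented for the example, and there is no guarantee that every relay or gateway between sender and receiver preserves custom headers.

```python
from email import message_from_bytes
from email.message import EmailMessage
from email.policy import default

PREFIX = "X-Agent-"  # hypothetical prefix, not any standard


def tag_message(msg: EmailMessage, metadata: dict) -> None:
    """Attach each metadata entry as a custom header, e.g. X-Agent-Task-Id."""
    for key, value in metadata.items():
        msg[f"{PREFIX}{key}"] = str(value)


def read_tags(msg: EmailMessage) -> dict:
    """Collect all custom headers with our prefix back into a dict."""
    return {
        name[len(PREFIX):]: str(value)
        for name, value in msg.items()
        if name.lower().startswith(PREFIX.lower())
    }


msg = EmailMessage()
msg["From"] = "agent@example.com"
msg["To"] = "alice@example.com"
msg["Subject"] = "Research results"
msg.set_content("Results are attached.")
tag_message(msg, {"Task-Id": "42", "Status": "done"})

# Round-trip through the wire format to show the headers survive serialization.
parsed = message_from_bytes(msg.as_bytes(), policy=default)
tags = read_tags(parsed)
```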

I have a couple of companies that force me to send them data via email. They have an email template that you have to conform to, and they can parse it. Mainly just very rudimentary line breaks and a 'LineItem: content' format. But JSON in the body should be fine as well. Given the way email programs strip or modify HTML at times, I would be leery of XML.
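
A parser for that kind of 'LineItem: content' template can be very small. This is only a sketch under assumptions: the field names and the sample message are made up, and a real template would need rules for repeated keys and multi-line values.

```python
from email import message_from_string
from email.policy import default


def parse_key_value_body(raw_message: str) -> dict:
    """Parse 'Key: value' lines from the plain-text body of a raw email."""
    msg = message_from_string(raw_message, policy=default)
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body is not None else ""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons themselves.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


RAW = """\
From: vendor@example.com
To: orders@example.com
Subject: Order update

LineItem: 3 x widget
ShipTo: Warehouse 7
Note: deliver before 10:30
"""

fields = parse_key_value_body(RAW)
```

Lines without a colon are ignored, so a greeting or signature around the template does not break parsing, though a signature line that happens to contain a colon would be picked up as a field.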

Maybe they're thinking of XML.

Email is decent for intermural communication. If it's intramural and you control both the sender and receiver, MQTT or ntfy are likely better communication channels since they increase flexibility and lower complexity, IMO.

Not if I want it able to have conversations with people, they don't.

I could see installing or implementing a custom client if there were some functionality that'd enable, but "support a conversation among two speakers" is something computers have done since well before I was born. If the wheel fits, why reinvent it?

If you're having conversations with people, then you don't control both ends and email is fine for that. Email is suboptimal for communicating between services/applications under your full control.

Consider the use case from the article: this is a family management support or "AI butler" application. So I control the end with the LLM on it, which I administer - but not necessarily the other, which is anyone in my family, not just me. So unless I want to try to make everyone use my weird custom AI messaging app like I aspire to Bay Area thought-cult leadership, I'm going to meet people where they are and SMTP's cheaper than SMS.

If I'm building myself a toy, then sure, I can implement whatever I want for a client, if that's where I get my jollies. React Native isn't hard but it is often annoying, and the fun for me in this project would be all in the conversation with the agent per se. Whatever doesn't get me to that as fast as possible is just getting in my way, you know?

And too, if this does turn out to be something that actually works well for me, then I'm going to want to integrate it with my phone's voice assistant, and at that point an app is required anyway - but if I start with a protocol and an app that that assistant already knows how to interact with, then again I have an essentially free if admittedly very imperfect prototype.

Under the hood, is your AI butler one service or many? It would be not-great for your weather or family-event-calendar-management components to communicate with each other or the orchestrator via email.

Receiving an email from the AI butler rescheduling or relocating a planned outdoor family event because rain is expected would be excellent; using IMAP to wire the subcomponents together would not.

Who suggested using email in the service layer? I mean, you're not wrong, but this feels like you handed me a banana and then said I should have picked a better hammer.

We're talking about a conversation that has a human on at least one end, so email makes sense. For conversations involving no humans, of course there are much better stores and protocols if something like an asynchronous world-writable queue is what we want.

"Number of humans in the conversation" wasn't the distinction you initially established, I believe, but I wonder if it's closer to the one you had in mind.