This made me think: what if my little utility assistant program, similar to your Stevens, had access to a mailbox?
I've got a little utility program that I can tell to get the weather or run common commands unique to my system. It's handy, and I can even cron it to run things regularly, if I'd like.
If it had its own email box, I could send it information, and it could use AI to parse that info and possibly send an email back, or a new message. Now I've got something really useful. It would parse the email, add it to whatever internal store it has, and delete the message, without screwing up my own email box.
Thanks for the insight.
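For what it's worth, the parse-store-delete loop described above might look something like this minimal Python sketch. It uses only the standard library (`imaplib` and `email`). The host, credentials, and the `store` list are placeholders, and the "use AI to parse" step is left out, so `extract_note` just pulls out the sender, subject, and plain-text body.

```python
import imaplib
from email import message_from_bytes, policy


def extract_note(raw: bytes) -> dict:
    """Reduce one raw message to the fields the assistant would store."""
    msg = message_from_bytes(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content().strip() if body is not None else ""
    return {
        "sender": str(msg.get("From", "")),
        "subject": str(msg.get("Subject", "")),
        "text": text,
    }


def drain_mailbox(host: str, user: str, password: str, store: list) -> None:
    """Read every message in the assistant's own inbox, store it, then delete it."""
    with imaplib.IMAP4_SSL(host) as conn:
        conn.login(user, password)
        conn.select("INBOX")
        _, found = conn.search(None, "ALL")
        for num in found[0].split():
            _, parts = conn.fetch(num, "(RFC822)")
            store.append(extract_note(parts[0][1]))  # parts[0][1] is the raw message
            conn.store(num, "+FLAGS", "\\Deleted")   # mark for deletion
        conn.expunge()                               # actually remove marked messages
```

Because the assistant has a dedicated mailbox, deleting everything it has processed is harmless; pointing `drain_mailbox` at a personal inbox would not be.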
I’ve been thinking lately that email is a good interface for certain modes of AI assistant interaction, namely “research” tasks that are asynchronous and take a relatively long time. Email is universal, asynchronous, uses open standards, supports structured metadata, etc.
This is how I initially pitched an AI assistant in my last shop.
It is a lot cheaper to leverage existing user interfaces & tools (i.e., Outlook) than it is to build new UIs and then train users on them.
Also, an email that comes back a minute later feels fast. A chat that types at the same speed feels slow.
I'm building something similar. See my comment to the OP above:
https://threadwise.app
If you want to get ahead of the curve, look into the Agent-to-Agent protocol Google just introduced. I'm currently using my own custom AI agent assistant to perform life tasks. If I could integrate better tooling/agents like yours into my own assistant system, that'd be awesome.
It's kind of like: sure, I could manage my own emails, or I could offload this to someone who does it better. If you do it better and it's affordable, I'm in.
We are on that starship to the future right now and I love it.
I've built adaptive agent swarms using email, mailing lists and FTP servers.
If you don't need to have the lowest possible latency for your work and you're happy to have threads die then it's better than any bespoke solution you can build without an army of engineers to keep it chugging along.
What's even better is that you can see all the context, and use the same command plane as the agents to tell them what they are doing wrong.
yep went down a rabbit hole trying to build a company around this. it’s the perfect UI
text + attachments into the system, text + attachments out
Well, it’s funny. This is essentially how I deal with many professionals in my life.
My finance guy, tax attorney, other attorneys. Send emails, get emails, occasionally a blind status update from them.
Sure, we have phone calls, sometimes get together for lunch.
But mostly it’s just emails.
> trying to build a company around this
I am still very open to this one. An email-based, artificial coworker is so obviously the right way to penetrate virtually every B2B market in existence.
I don't even really want to touch the technology aspects. Writing code that integrates with an LLM provider and a mailbox in E365 or Gmail is boring. The schema is a grand total of ten tables if we're being pedantic about things.
Working with prospects and turning them into customers is a way more interesting problem. I hunger for tangible use cases that are actually compatible with this shiny new LLM tooling. We all know they're out there, and email is probably the lowest friction way to get them applied to most businesses.
> Working with prospects and turning them into customers is a way more interesting problem.
Agreed. That's also the hardest part, and where most value is created.
How does email support structured metadata? Are you talking about X headers?
I have a couple of companies that force me to send them data via email. They have an email template that you have to conform to, and they can parse it. Mainly just very rudimentary line breaks and a 'LineItem: content' format. But JSON in the body should be fine as well. Given the way email programs strip or modify HTML at times, I would be leery of XML.
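A parser for that kind of 'LineItem: content' template can be very small. Here is a rough Python sketch, assuming one field per line and a colon as the separator; the field names in the usage example are made up for illustration and are not any particular company's template.

```python
def parse_template(body: str) -> list[tuple[str, str]]:
    """Extract (key, value) pairs from lines shaped like 'Key: value'."""
    fields = []
    for line in body.splitlines():
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            continue  # skip blank lines and anything that is not 'Key: value'
        fields.append((key.strip(), value.strip()))
    return fields
```

Returning a list of pairs rather than a dict keeps repeated keys, so an email with several `LineItem:` lines survives intact. A real template would probably also need a rule for free-text lines that happen to contain a colon.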
Maybe they're thinking of XML.
Email is decent for intermural communication. If it's intramural and you control both the sender and receiver, MQTT or ntfy are likely better communication channels since they increase flexibility and lower complexity, IMO.
Not if I want it able to have conversations with people, they don't.
I could see installing or implementing a custom client if there were some functionality that'd enable, but "support a conversation among two speakers" is something computers have done since well before I was born. If the wheel fits, why reinvent it?
If you're having conversations with people, then you don't control both ends and email is fine for that. Email is suboptimal for communicating between services/applications under your full control.
Consider the use case from the article: this is a family management support or "AI butler" application. So I control the end with the LLM on it, which I administer - but not necessarily the other, which is anyone in my family, not just me. So unless I want to try to make everyone use my weird custom AI messaging app like I aspire to Bay Area thought-cult leadership, I'm going to meet people where they are and SMTP's cheaper than SMS.
If I'm building myself a toy, then sure, I can implement whatever I want for a client, if that's where I get my jollies. React Native isn't hard but it is often annoying, and the fun for me in this project would be all in the conversation with the agent per se. Whatever doesn't get me to that as fast as possible is just getting in my way, you know?
And too, if this does turn out to be something that actually works well for me, then I'm going to want to integrate it with my phone's voice assistant, and at that point an app is required anyway - but if I start with a protocol and an app that that assistant already knows how to interact with, then again I have an essentially free if admittedly very imperfect prototype.
Under the hood, is your AI butler one service or many? It would be not-great for your weather or family-event-calendar-management components to communicate with each other or the orchestrator via email.
Receiving an email from the AI butler rescheduling or relocating a planned outdoor family event because rain is expected would be excellent; using IMAP to wire the subcomponents together would not.
Who suggested using email in the service layer? I mean, you're not wrong, but this feels like you handed me a banana and then said I should have picked a better hammer.
We're talking about a conversation that has a human on at least one end, so email makes sense. For conversations involving no humans, of course there are much better stores and protocols if something like an asynchronous world-writable queue is what we want.
"Number of humans in the conversation" wasn't the distinction you initially established, I believe, but I wonder if it's closer to the one you had in mind.
This was the attack vector of an AI CTF hosted by Microsoft last year. I built an agent to assess, structure, and perform the attacks autonomously and found that even with some common guardrails in place the system was vulnerable to data exfiltration. My agent was able to successfully complete 18 of the challenges... Here is the write-up after the finals.
https://msrc.microsoft.com/blog/2025/03/announcing-the-winne...
For Gmail, there's also an amazing thing where you can hook it up with Pub/Sub. So now it's push, not pull. Your server gets little Pub/Sub webhooks for any change within milliseconds (you can filter server side or client side for specific filters).
This is amazing; you can do all sorts of automations. You can feed it to an LLM and have it immediately tag it (or archive it). For important emails (I have a specific label I add, where if the person responds, it's very important and I want to know immediately) you can hook into Twilio and it calls me. Costs like 20 cents a month.
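To make the push flow a bit more concrete, here is a small Python sketch of decoding one of those webhooks. It assumes the usual Pub/Sub push envelope (a JSON body with `message.data` holding base64 text) and, as far as I understand the Gmail push documentation, a decoded payload containing `emailAddress` and `historyId`. The function name is my own, and the exact shapes are worth checking against the current docs.

```python
import base64
import json


def decode_gmail_push(envelope: dict) -> tuple[str, str]:
    """Return (email_address, history_id) from a Pub/Sub push request body."""
    data = envelope["message"]["data"]
    padded = data + "=" * (-len(data) % 4)  # tolerate stripped base64 padding
    payload = json.loads(base64.urlsafe_b64decode(padded))
    return payload["emailAddress"], str(payload["historyId"])
```

The notification only says that something changed. The handler would still have to ask the Gmail API what changed since the last history ID it saw before handing anything to an LLM or to Twilio.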
I made an AI assistant Telegram bot running on my Mac that runs commands for me. I'll tell it "Run ncdu in the root dir and tell me what's taking up all my disk space" or something and it converts that to bash and runs it via os.system. It shows me the command it created, plus the output.
Extremely insecure, but kinda fun.
I turned it off because I'm not that crazy but I'm sure I could make a safer version of it.
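One possible direction for a "safer version", sketched in Python: skip the shell entirely and only run programs from a short allowlist. This is an illustration, not the bot's actual code; the `ALLOWED` set and both function names are hypothetical, and an allowlist reduces the risk rather than removing it.

```python
import shlex
import subprocess

# Hypothetical allowlist of read-only programs; adjust to taste.
ALLOWED = {"ls", "df", "du", "uptime"}


def check_command(command: str) -> list[str]:
    """Split the generated command and refuse anything not on the allowlist."""
    argv = shlex.split(command)
    if not argv or argv[0] not in ALLOWED:
        raise PermissionError(f"command not allowed: {command!r}")
    return argv


def run_checked(command: str, timeout: float = 30.0) -> str:
    """Run an allowed command without a shell and return its combined output."""
    argv = check_command(command)
    # Passing a list (no shell=True) means ';', '&&' and '|' are never interpreted.
    result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return result.stdout + result.stderr
```

With no shell involved, `ls; rm -rf ~` is rejected because the first token is `ls;`, and `ls && rm x` just asks `ls` to list files named `&&`, `rm`, and `x`. Arguments are still not vetted, so the allowlist should stick to programs that cannot modify anything regardless of flags.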
Easy fix, just pipe the commands to a 2nd LLM and ask "will this command delete my home directory (y/n)"
This project has a pattern just like that to handle the inbound USPS information:
https://www.val.town/x/geoffreylitt/stevensDemo/code/importe...
I think it would be pretty easy to extend to support other types of inbound email.
Also I work for Val Town, happy to answer any questions.
yeah i actually do handle inbound email! just forgot to include that code in the shared version. the telegram inbound handler shows the rough pattern.
is there a reason you went with telegram and not slack or discord? i was thinking that it could open up a broader channel for communicating with your assistant. i understand you're also just building more of a poc, but curious if you'd thought about that. great work btw :)
Mailgun (and I'm sure many other services like it) can accept emails and POST their content to a URL of your choice.
I use that for journaling: I made a little system that sends me an email every day; I respond to it and the response is then sent to a page that stores it into a db.
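The storing half of a setup like that can be tiny. Below is a Python sketch that takes the form fields from an inbound-email POST and writes one journal row to SQLite. The field names (`sender`, `subject`, `stripped-text`, `body-plain`) are what I recall Mailgun's inbound routes sending and should be checked against their docs; the table layout and function name are invented for the example.

```python
import sqlite3


def store_reply(db: sqlite3.Connection, form: dict[str, str]) -> int:
    """Insert one journal entry from the posted form fields; return its row id."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS journal ("
        "id INTEGER PRIMARY KEY, sender TEXT, subject TEXT, entry TEXT)"
    )
    # Prefer the reply text with quoted history removed, if the service provides it.
    entry = form.get("stripped-text") or form.get("body-plain", "")
    cur = db.execute(
        "INSERT INTO journal (sender, subject, entry) VALUES (?, ?, ?)",
        (form.get("sender", ""), form.get("subject", ""), entry),
    )
    db.commit()
    return cur.lastrowid
```

Whatever web framework receives the POST would pass its parsed form data to `store_reply`. A real endpoint should also verify that the request came from the mail service before trusting it.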
+1 for Mailgun. My only gripe with it is that they detect and block bot activity on their frontend. So if you have end to end (e2e) integration tests built with something like Puppeteer, you can't have them log into Mailgun and check the inbox table's HTML to see that an email was sent. So you have to write some sort of plugin manually - perhaps as a testing endpoint on your website that only appears in debug mode - that interacts with their API.
This might not seem like much of a big deal. But as we transition to more of these #nocode automated tools, the idea of having to know how programming works in order to interact with an API will start to seem archaic. I'd compare it to how esoteric the terminal looked after someone saw a GUI like the one used by Apple's Macintosh back in the 1980s.
I looked forward to this day back in the early 2000s when APIs started arriving, but felt even then that something was fishy. I would have preferred that sites had a style-free request format that returned XML or even JSON generated from HTML, rather than having to use a separate API. I have this sense that the way we do it today with a split backend/frontend, distributed state, duplicated validation, etc has been a monumental waste of time.
> I use that for journaling: I made a little system that sends me an email every day; I respond to it and the response is then sent to a page that stores it into a db.
Yes. I know note-taking and journaling posts are frequent on HN, but I've thought that this is the best way to go: it's universal from any client and very expandable. It's just not generically scalable for all users, but for the HN reader-types, it'd be perfect.
CloudMailin [0] is also great for parsing incoming email and doing stuff with it (e.g., forward to a webhook / POST target, outbound capabilities, etc.)
I've found it to be very reliable with a detailed dashboard to track individual transactions, plus they give you 10,000 emails a month for free.
Not an employee, just a big fan!
[0] https://www.cloudmailin.com
*Update*: I tried writing a little Python code to read from and write to a mailbox. Reading worked great, but the email I sent disappeared into some filter or spam folder somewhere. I've got to figure out where it went, but this is what some people warned about: you can't fully trust a messaging protocol (email in this case) when you don't control the servers. Messages can disappear.
I read that [Mailgun](https://www.mailgun.com/) might improve this. Haven't tried it yet.
Other alternatives for messages that I haven't tried. My requirement is to be able to send messages and send/receive on my mobile device. I do not want to write a mobile app.
* [Telegram](https://telegram.org/) (OP's system) with [bots](https://core.telegram.org/bots)
* [MQTT](https://mqtt.org/) with server
* [Notify (ntfy.sh)](https://ntfy.sh/)
* Email (ubiquitous)
Also, to [simonw](https://news.ycombinator.com/user?id=simonw)'s point, LLM calls are cheap now, especially with something as low in tokens as this. And links don't format in HN markdown. I did the work to include them, so they're staying in.
Try https://unfetch.com (I built it). It can handle both inbound and outbound emails.
I'm building something similar and related to the other comments below! It's not production ready but it will hopefully be in a couple of weeks. You guys can sign up for free and I will upgrade you to the premium tier manually (premium cannot be bought yet anyway) in exchange for some feedback:
https://threadwise.app
Ages ago, I proposed that the best CMS for a company would be one which used e-mail as the front-end:
- all attachments are stripped out and stored on a server in a hierarchical structure based on sender/recipient/subject line
- all discussions are archived based on similar criteria, and can be reviewed (EDIT: and edited like a wiki)
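As a rough illustration of the first bullet, here is a Python sketch of mapping an attachment to a sender/recipient/subject folder. The slug rules, the handling of "Re:"/"Fwd:" prefixes, and the function names are all my own assumptions about how such a layout could work.

```python
import re
from pathlib import PurePosixPath


def slug(text: str) -> str:
    """Lowercase text and collapse anything non-alphanumeric into single dashes."""
    cleaned = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return cleaned or "untitled"


def thread_subject(subject: str) -> str:
    """Drop leading 'Re:' / 'Fwd:' prefixes so replies land in the same folder."""
    return re.sub(r"^\s*((re|fwd?)\s*:\s*)+", "", subject, flags=re.IGNORECASE)


def attachment_path(sender: str, recipient: str, subject: str, filename: str) -> PurePosixPath:
    """Build sender/recipient/subject/filename, ignoring any directories in filename."""
    safe_name = PurePosixPath(filename).name  # blocks '../' style traversal
    return PurePosixPath(slug(sender), slug(recipient), slug(thread_subject(subject)), safe_name)
```

Normalizing the subject means a whole reply chain shares one folder, which is roughly what "archived based on similar criteria" would need for the discussions as well.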
My one concern there would be edits: a CMS needs to support easily making edits to content (fixing typos etc) - editing existing posts via email sounds like it would be pretty fiddly.
The idea is that content comes in via e-mail, is stored in some sort of tagged structure, and is then edited like a wiki.
Ha! I had the exact same idea! I still think it would be nice.
I built an AI agent using n8n and email that does exactly this. It works great, and I was surprised I hadn't seen anyone else kicking the idea around.
Probably my favorite use case is I can shoot it shopping receipts and it'll roughly parse them and dump the line item and cost into a spreadsheet before uploading it to paperless-ngx.
Sounds useful, but why do you need an AI agent to do that?
"I can shoot it shopping receipts and it'll roughly parse them and dump the line item and cost into a spreadsheet" - very difficult to do that without using a vision LLM.