For the last couple of weeks I have been building dwata, and I am submitting it today to the Google Gemini Hackathon.
https://github.com/brainless/dwata
dwata is built on the idea of multiple, task-specific agents. Right now it has only one agent, which can be run on an email to generate regex patterns for extracting financial data. This enables high-performance data extraction from emails (and, in the future, documents) without sending each email to an LLM.
dwata starts with an email scan that tests simple keywords and regex patterns, groups emails by sender, sorts the groups by number of emails per sender (highest first), and filters out groups where the emails do not appear to come from a template (typical transaction emails are generated from templates). This part is deterministic Rust code. dwata can then use the regex builder AI agent to take one email from a group and build a regex pattern that extracts the financial data: who sent it (optional), how much, to whom (optional), on which date, and a reference ID (optional).
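The scan-and-group step is plain deterministic Rust. Here is a minimal sketch of what that could look like; the `Email` struct and the "looks templated" heuristic are illustrative assumptions, not dwata's actual code:

```rust
use std::collections::HashMap;

// Hypothetical minimal email record; dwata's real structure differs.
struct Email {
    sender: String,
    body: String,
}

/// Group emails by sender, sort groups by size (largest first), and keep
/// only groups that look like they come from a template. The heuristic
/// here (body lengths stay within a narrow band) is only an illustration.
fn candidate_groups(emails: &[Email]) -> Vec<(&str, Vec<&Email>)> {
    let mut by_sender: HashMap<&str, Vec<&Email>> = HashMap::new();
    for email in emails {
        by_sender.entry(email.sender.as_str()).or_default().push(email);
    }

    let mut groups: Vec<(&str, Vec<&Email>)> = by_sender.into_iter().collect();
    // Largest groups first: these are the best candidates for one shared regex.
    groups.sort_by(|a, b| b.1.len().cmp(&a.1.len()));

    groups
        .into_iter()
        .filter(|(_, group)| group.len() > 1 && looks_templated(group))
        .collect()
}

/// Crude "same template" check: all bodies have roughly the same length.
fn looks_templated(group: &[&Email]) -> bool {
    let lengths: Vec<usize> = group.iter().map(|e| e.body.len()).collect();
    let min = *lengths.iter().min().unwrap_or(&0);
    let max = *lengths.iter().max().unwrap_or(&0);
    max > 0 && (max - min) * 100 / max < 20 // lengths within ~20% of each other
}
```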
The generated patterns are saved to a local DB and run against the email group (by sender) that was used to generate the regex. That gives a very high-performance, AI-enabled financial data extractor.
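Applying a saved pattern is then just regex matching over that sender's group. A rough sketch, assuming the `regex` crate and hypothetical capture-group names (`amount`, `date`, `reference`) that the agent would be asked to emit:

```rust
use regex::Regex;

// Hypothetical extracted record; the optional field mirrors the optional
// parts (sender, recipient, reference ID) described above.
#[derive(Debug)]
struct Transaction {
    amount: String,
    date: String,
    reference: Option<String>,
}

/// Run one LLM-generated pattern over every email body in a sender group.
/// The pattern is expected to use named capture groups such as
/// (?P<amount>...), (?P<date>...) and optionally (?P<reference>...).
fn extract_transactions(pattern: &str, bodies: &[String]) -> Vec<Transaction> {
    let re = match Regex::new(pattern) {
        Ok(re) => re,
        Err(_) => return Vec::new(), // reject patterns that fail to compile
    };

    bodies
        .iter()
        .filter_map(|body| {
            let caps = re.captures(body)?;
            Some(Transaction {
                amount: caps.name("amount")?.as_str().to_string(),
                date: caps.name("date")?.as_str().to_string(),
                reference: caps.name("reference").map(|m| m.as_str().to_string()),
            })
        })
        .collect()
}
```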
Soon I will focus on events, places, people, tasks, health, and other data. All data storage and processing is local. I am testing exclusively with Google Gemini 3 Flash Preview, but dwata should run well on small LLMs, up to about 20B parameters.
I am preparing for launch and the builds are not ready yet, but if you want to try it, you can compile it yourself (Rust and npm tooling needed). You will also need the sources for nocodo (https://github.com/brainless/nocodo).