Hi HN, I’ve been working on https://datadef.io, a tool to help data teams (engineers, architects, project managers) make sense of their data universe.
The problem:
- Data models (dbt, SQL, warehouses) often grow into a tangled mess of tables, joins, and undocumented assumptions.
- Lineage is either scattered across tools or missing entirely.
- Documentation is usually an afterthought (and gets outdated fast).
Datadef.io aims to fix that by providing:
- Interactive canvas to map tables, relationships, and indicators.
- Automatic lineage visualization to trace dependencies.
- Metadata management: define table/column-level details, ownership, and KPIs.
- AI-generated documentation that stays in sync with your models.
- Export/share features so asset managers, analysts, and other teams don’t get lost in spreadsheets or PDFs.
It’s still early, and I’d love feedback from the HN community. In particular:
What’s missing for you in lineage/metadata/documentation tools?
How would you want to integrate a tool like this into your workflow (dbt, Databricks, Power BI, etc.)?
I’d really appreciate your thoughts, feature requests, and criticism.
Thanks!
My apologies if I missed this when looking at your product site, but how is the ongoing cost structured? Also, is this OSS with some sort of stated license model, or purely proprietary software?
It looks like a really great idea to package all these evolving best practice concepts into one product!
Appreciate that. Right now it’s proprietary and free; there’s no ongoing cost. I don’t have a pricing model locked in yet. The rough idea is to eventually go freemium (keep the core stuff free, add a paid tier for heavier/advanced use). Still figuring out what makes sense, and I’ll be upfront if/when that changes.
The main motivation really was what you mentioned: so many “best practices” live in blog posts or scattered docs, but almost no one actually packages them into something usable. I’m just trying to pull those ideas together in a way that saves people from reinventing the wheel. Still early days, so I’m curious to see if it resonates beyond my own itch.
When I try to sign in with GitHub I get the ‘this isn’t the page you’re looking for’ GitHub 404 error.
Ah, thanks for flagging this. I’ve actually seen that 404 myself a couple of times, but it’s pretty rare and usually goes away on retry. Looks like something flaky in the GitHub OAuth flow. I need to dig in and investigate properly.
No worries.
As someone who is actually in the process of procuring something like Informatica/Alation, I was really keen to have a play with your tool. As described, it sounds like a good starting point for some of our less capable teams.
I REALLY struggled to make lineage work though. Is there some kind of trick?
Also, while logged in (I used Google in the end) I couldn’t go back to the front splash page; it just kept showing me my project page. I wanted to go back and look at the marketing!