Yeah, fair pushback, and yes the intro was AI-assisted. Marketing is not my strength, nor am I a native English speaker. I built this in about a month with heavy LLM tooling and the seed comment is part of that. I'm not going to pretend otherwise.
The code is what it is. `cargo test --workspace` runs across 19 crates. CI on 5 platforms (macOS ARM/Intel, Linux x86/ARM, Windows). JSON output schemas are codegen-checked in CI so docs can't drift from the binary.
If you want to skip the marketing copy and look at engine reasoning instead: PR #240 (audit trail), #241 (column classification + masking), #270 (failed-source surfacing in discover).
I'd rather hear "the code is bad" than "the post sounds AI-written".
> I'd rather hear "the code is bad" than "the post sounds AI-written".
Of course you would. Reading through and judging the quality of AI output is the largest amount of effort in a world where you can get everything else by prompting. Please internalize this: If you want to be respected you will have to put in effort yourself. There is no way around this.
I truly appreciate your feedback and it's definitely a lesson learned for me. As I said to cmrdporcupine, "The only reason for passing my replies through AI was just because it's my first time posting here and opening a side-project of mine publicly."
All the engine architecture decisions are mine though, and this project came about to solve a real problem in a data pipeline that serves multiple clients, connectors, producers, etc.
I'm late to the party, but there is a dilemma that seems to be facing poor English speakers and writers that I think is a bit imagined, and using LLMs to cover your weakness with the language hurts the public perception of you. There was an article posted last year about East African contractors who were used to do the RLHF post-training for early frontier models, saying LLM speak was really just the way Africans speak English. I don't think that's entirely correct, because frankly, after several decades of working with international teams, I think it's the way a lot of non-Anglophone English speakers speak.
It comes across as both childish and overly formal at the same time. Affected, too excited. It's the guy that says "Hi name!" on Slack and then waits for you to respond, instead of just saying what they actually wanted to say. You're responding to everyone here with some variant of "thank you, I appreciate it." That just isn't the way people speak to each other in normal conversation. It's the way a consultant speaks to you when you're being told you're being laid off and your own manager is too cowardly to deliver the news personally. Sandwich the real point between effusion and praise, when all we actually want is the real point. It feels patronizing, like we're being spoken down to. It's the way politicians and CEOs speak, every word prepared by committee, nothing genuine.
It's all the worse knowing this isn't even you and we're being patronized and spoken down to by a marketing bot you're delegating communication to.
Yes, I can see you're a bit late indeed. Several replies ago I already acknowledged that I used an LLM for the initial post and initial replies.
I use AI every day, for different purposes: fixing my English when I feel I need to, brainstorming, automating tasks, getting ideas for dinner, asking things I don't know about raising a 1 year old boy, heck, even finance stuff. And to be honest, I'm quite pleased about it and see no shame in it. I'm no salesperson, nor do I have a grasp of marketing, nor am I used to promoting my work. With this thread on HN (which was a suggestion by an LLM! :D), I just wanted to share what I built and let others use it, regardless of typing each character or not, or using an LLM, or using smoke signals.
But one thing I cannot avoid is being polite and friendly, because that is who I am when speaking in my native language or in English with my peers. So saying "Hey" and "Hi", "I appreciate" and "thank you" is part of my day to day.
I didn't know about the article you mentioned but thanks for letting me know. :)
This comment itself is likely written by AI by the sounds of it. It may be worth your time writing it out in your own words in your native language and then finding a competent translation tool to translate your words.
Not sure why you are downvoted here.
A lot of side projects, hobby projects, etc. are all using AI tools now. Also for marketing, every sales/marketing firm is using AI. So why criticize this guy in particular?
AI is pervasive, the train has left the station. So that is not a reason to criticize this project. There might be other reasons, I'm not sure, but not that an AI was used.
Because "Yeah, fair pushback" is AI smell. Either everything this person does is passed through an AI from code to blogs to even their HN comments and submissions; or they use AI so much they're starting to talk like it colloquially. Either way no one has time for that.
"Yeah, fair pushback"
Really hard to tell. Because that used to be a common phrase that real people would use.
So now I have to change my own language in order to not appear like I'm an AI? We are getting in a weird place where Humans have to act/sound increasingly 'odd', to appear not 'perfect' like an AI.
It's really not hard to tell. It's the "How do you do fellow kids" of AI-isms. The presence of "fair pushback" and a single em dash reads as 99% AI generated as far as I am concerned.
Yes, if you don't want to sound like you're cargo culting AI, you do have to change the way you talk because people aren't going to care otherwise. At the very least just because it's boring. That's always been the nature of slang and lingo.
"not hard to tell"
Or, with all of the AI slop, you think you are detecting all AI, and don't realize how much AI-written stuff goes unnoticed. There is a wide variety of tools now, with different degrees of output quality.
https://ifunny.co/picture/it-s-been-forever-it-s-been-foreve...
I'm fine with work that uses AI. I use AI every day. I'm not fine with AI slop and it's very easy to tell what is slop and what's not, the same way it's easy to distinguish a selfie from a museum quality photograph. Are some selfies works of art? Few and far between, so you'd be forgiven if you dismiss all selfie-looking photographs as not worth your time.
It's really a weird world now.
I do think the author is doing a disservice to themselves by writing the post and comments using an LLM, even if the code is mostly agent built. People can tell right away, all the LLM shibboleths are there... it feels cheap. Just write naturally and then Google Translate, don't let the LLM speak on your behalf.
What's going to distinguish projects that are built this way is the ability to explain, document, support, and maintain said projects over the long term. That will be the crucible. Gone are the days of "build it and they will come", and I feel a bit sad about that.
It's so easy to let the code grow under you beyond what you have the capacity to do the above for.
I've got the same thing going on. Eschewing paid work and grinding 16, 17 hours a day boiling the sea to build the whole universe from scratch (also a database, but of a different sort than this project) integrating all my favourite DB research papers and ideas that I've accumulated over the last 30 years. Outperforms postgres 2-4x or more, has a battery of correctness tests, Lean proofs, benchmarks, etc. etc.
But frankly I'd be nervous to share. Especially here. I don't even know where it ends up. Not least because if I'm doing it, so are 50 other people, probably.
I totally acknowledge that. The only reason for passing my replies through AI was just because it's my first time posting here and opening a side-project of mine publicly.
All the engine architecture decisions are mine though, and this project came about to solve a real problem I currently have at work with a zero-touch data pipeline leveraging Fivetran, Dagster, dbt and Databricks. This is a data pipeline that serves multiple agencies and data producers who work with data from more than 300 clients and multiple connectors.
Rocky was essentially built out of all the time I spent awake at night thinking about these problems and how they could be addressed differently, given that dbt does not suit this particular use case well.
I decided to open Rocky to the public for free for two simple reasons. The 1st is that it might help others, and I fulfill my ego of having built something other people like and use. The 2nd is that I'm the solo maintainer. A project can only get proper traction if more people contribute to it.