SQLite is a great example of a single factor mattering more than everything else combined. A database contained in a single file is such a good idea that it outweighs a poorly designed storage layer, poorly designed column formats, and a terrible SQL implementation.
If craftsmanship is measured by the long tail of good choices that give something a polished and pristine feel, then SQLite was built with none of it. And yet, it's by far the best initial choice for every project that needs a database. Most projects will never need to switch to anything more.
This seems like an unnecessarily negative comment. I've been a user of SQLite for over 20 years now (time flies!), and what you're calling lack of polish, I would chalk up to Dr. Hipp's conscientiousness about maintaining compatibility over the long term. So much so that the Library of Congress recommends it for long-term preservation of data.
Long-term compatibility (i.e. prioritizing the needs of users vs chasing inevitably changing ideas about what feels polished or pristine), near-fanatical dedication to testing and quality, and sustained improvement over decades: these are the actual signs of true craftsmanship in an engineering project.
(plus, I don't agree with you that the storage layer, column format, or SQL implementation are bad).
> I would chalk up to Dr. Hipp's conscientiousness about maintaining compatibility over the long term.
I agree. I am not suggesting that the SQLite team doesn't know how to make the technology better. Just that they aren't/haven't. Backwards compatibility is a good reason not to.
My original comment was contrasting craftsmanship and utility, since both are somewhat prized on HN, but they aren't the same thing at all. Look at a system like WireGuard. A huge number of small decisions went into making it as simple and secure as it is. When most developers are confronted with similar decisions, they perform almost randomly and accumulate complexity over the long tail of decisions ("it doesn't matter, just pick a way"). With WireGuard, every design decision reliably drove toward simplicity ("it does matter, choose carefully").
I don't think they ever hesitate to make sqlite better. It's just that they have a different definition of "better" than you.
> contrasting craftsmanship and utility, since both are somewhat prized on HN
I'd say they're prized everywhere, though "craftsmanship" is really subjective. And the HN I usually see seems to have more of a meta of "criticize anything someone tries to build, and rave about IQ", tbh ;)
SQLite works and I don't have to think (too much) about why it works. That is IMO a true hallmark of solid engineering.
> If craftsmanship is measured by the long tail of good choices that give something a polished and pristine feel, then SQLite was built with none of it.
It apparently has an extensive and thorough test suite. That's an excellent design choice that tons of other projects could learn from, and is probably a key element of its success.
Sometimes a poorly-designed thing that is excellently-documented and thoroughly-tested is better than a brilliantly-designed thing that is lacking in those. In fact, unless the number of users of the thing is 1 (the creator), the former is likely a better option across all possible use-cases.
Perhaps we could generalize this by stating that determinism > Pareto-optimality.
Digital Equipment Corporation sold a SQL database known as Rdb that could also run as a single file.
It was the first database to introduce a cost-based optimizer, and ran under both VMS and Digital UNIX.
Oracle bought it, and VMS versions are still supported.
https://www.oracle.com/database/technologies/related/rdb.htm...
https://en.m.wikipedia.org/wiki/Oracle_Rdb
(My employer is still using the VMS version.)
Oh! RDB was the first database I worked with. I forgot all about it. I do remember refactoring the data layer so that it also worked with Berkeley DB, which is also owned by Oracle now. Or maybe it was the other way around? There was no SQL involved in that particular application so it was just a K/V store. Working with a local data file was the primary design goal, no client/server stuff was even on the radar. SQLite would have been perfect if it had existed.
It was designed to be a DB for Tcl at a time when that language didn't have typed objects. Its SQL implementation reflects that. Where are the grand Python, or Perl, or JS DBs?
It actually does have typed values, it is just that the schema didn't constrain the value types stored in each column; until relatively recently the column type was mostly just documentation. However, now it has STRICT tables, which do constrain the value types of columns. And for a lot longer you've been able to implement the same thing manually using CHECK constraints, which is a bit verbose if you are writing the schema by hand, but much less of a problem if it is being generated from ORM model classes/etc.
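For illustration, a minimal sketch of both approaches using Python's stdlib sqlite3 (table and column names are made up; STRICT assumes a Python build linked against SQLite 3.37 or newer):

    import sqlite3

    con = sqlite3.connect(":memory:")

    # STRICT (SQLite 3.37+) rejects values that can't be losslessly
    # converted to the declared column type.
    con.execute("CREATE TABLE t_strict (n INTEGER) STRICT")
    try:
        con.execute("INSERT INTO t_strict VALUES ('twelve')")
    except sqlite3.IntegrityError as e:
        print("strict rejected:", e)

    # The older, manual equivalent: a CHECK constraint on typeof().
    con.execute("CREATE TABLE t_check (n INTEGER CHECK (typeof(n) = 'integer'))")
    con.execute("INSERT INTO t_check VALUES (12)")  # fine
    try:
        con.execute("INSERT INTO t_check VALUES ('twelve')")  # stays TEXT, fails the CHECK
    except sqlite3.IntegrityError as e:
        print("check rejected:", e)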
>> It was designed to be a DB for Tcl at a time when that language didn't have typed objects. Its SQL implementation reflects that.
> It actually does have typed values
Now. As the article points out, they were not part of the initial design, because of the Tcl heritage.
AFAIK it has always had typed values. Don’t confuse column types (which constrain a column to containing only values of a specified type) with value types (which enable it to treat the string “12” and the integer 12 and the floating point 12.0 as three distinct values)
Tcl has value types. Tcl 7.x and earlier only had one data type, the string, so adding two integers required two string-to-int conversions followed by an int-to-string conversion. In 1997, Tcl 8.x was released, which internally has distinct value types (int, string, etc.), although it retains the outward appearance of "everything-is-a-string" for backward compatibility. So SQLite's Tcl heritage included distinguishing different types of values, as is done in post-1997 Tcl.
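You can see the value-type distinction directly with typeof(); a quick sketch using Python's stdlib sqlite3:

    import sqlite3

    con = sqlite3.connect(":memory:")
    # Three distinct values, three distinct runtime types; no table needed.
    print(con.execute("SELECT typeof('12'), typeof(12), typeof(12.0)").fetchone())
    # -> ('text', 'integer', 'real')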
Tcl still only has one data type, which is the string. It has different internal representations to allow representing integers etc more efficiently, but since that is not part of the formal semantics of the language, those representations are not data types.
At the level of its C API (which is the level at which it integrates with SQLite), it indeed has multiple data types; it even has a structure to represent them, Tcl_ObjType: https://www.tcl-lang.org/man/tcl8.4/TclLib/ObjectType.htm
I've never used it, but Perl has included support for Berkeley DB in its stdlib since forever. But SQLite maps to Perl just fine.
ZODB https://zodb.org/en/latest/
> poorly designed storage layer, poorly designed column formats, and a terrible SQL implementation
Is this opinion shared by others?
Dr. Hipp has said several times that nobody expected a weakly-typed database to achieve the pervasiveness that is observed with SQLite.
At the same time, strict tables address some of the concern of those coming from conventional databases.
Dates and times are a core problem for SQLite not seen elsewhere as far as I know, but this does evade UTC and constantly shifting regional time. My OS gets timezone updates every few months, and avoiding that showed foresight.
Default conformance with Postel's Law is SQLite's stance, and it does seem to work with the ANSI standard.
> Dr. Hipp has said several times that nobody expected a weakly-typed database to achieve the pervasiveness that is observed with SQLite.
I don't remember ever saying that. Rather, see https://sqlite.org/flextypegood.html for a detailed explanation of why I think flexible typing ("weak typing" is a pejorative and inaccurate label) is a useful and innovative feature, not a limitation or a bug. I am surprised at how successful SQLite has become, but if anything, the flexible typing system is a partial explanation for that success, not a cause of puzzlement.
Did I misinterpret the experts' assertion of impossibility?
"I had this crazy idea that I’m going to build a database engine that does not have a server, that talks directly to disk, and ignores the data types, and if you asked any of the experts of the day, they would say, “That’s impossible. That will never work. That’s a stupid idea.” Fortunately, I didn’t know any experts and so I did it anyway, so this sort of thing happens. I think, maybe, just don’t listen to the experts too much and do what makes sense. Solve your problem."
https://corecursive.com/066-sqlite-with-richard-hipp/
> Did I misinterpret the experts' assertion of impossibility?
Misstated, I'd say. You said "nobody" but the actual quote is about the assumed conventional wisdom of the time, which is quite different. And while this was probably inadvertent, you phrased it in a way that almost made it sound like that was Dr. Hipp's original opinion, which, of course, is the opposite of true.
While nobody expected it … it should not be unexpected.
Typically, the lowest common denominator wins mass appeal/usage.
By not having safety checks or even typing enforcement, SQLite actually caters to more use cases, not fewer.
I often forget or mix up which "Law" refers to which observation, and I'm surely not the only one. So:
Postel's Law, also known as the Robustness Principle, is a guideline in software design that states: "be conservative in what you send, be liberal in what you accept."
SQLite probably doesn't do anything with times and dates except punting some functions to the limited libc facilities because including any proper date-time facilities would basically double the footprint of SQLite. Same for encodings and collations.
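For what it's worth, the built-in date helpers it does ship just compute over plain TEXT/REAL/INTEGER values; a small sketch with Python's stdlib sqlite3:

    import sqlite3

    con = sqlite3.connect(":memory:")
    # There is no DATE type: the date functions operate on TEXT (ISO-8601),
    # REAL (Julian day), or INTEGER (unix time) values.
    print(con.execute("SELECT datetime('now')").fetchone()[0])               # UTC timestamp as TEXT
    print(con.execute("SELECT strftime('%s', '2024-03-15')").fetchone()[0])  # unix time, returned as TEXT
    print(con.execute("SELECT date('2024-01-31', '+1 month')").fetchone()[0])
    # -> '2024-03-02': the nonexistent '2024-02-31' is silently normalized,
    # one of the rough edges of these limited facilities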
> Same for encodings and collations.
I think it's one of the reasons DuckDB has seen the popularity that it has.
DuckDB is a columnar database, and columnar DBs are way better for analytics, statistics, etc. That is the main reason for its popularity: the ability to run specific workloads that row-based databases will struggle with or be slower at.
It has nothing to do with the poster's badly formatted complaints about SQLite. By that metric, DuckDB has a ton of issues that even outscale SQLite's.
That's a strange argument. DuckDB is for OLAP and SQLite is for OLTP.
Yeah, but most applications are small. So, at the scale of most applications you can drop in DuckDB with zero change in actual performance. It still has indexes to support highly selective queries because it needs to have functional primary keys.
> outweighs a poorly designed storage layer, poorly designed column formats, and a terrible SQL implementation
You're going to have to expand on that, because I have no idea what you're talking about, nor does anyone else here seem to.
It's a relational database meant primarily for a single user. It's SQL. It works as expected. It's performant. It's astonishingly reliable.
The only obviously questionable design decision I'm aware of is for columns to be able to mix types, but that's more "differently designed" than "poorly designed", and it's actually fantastic for automatically saving space on small integers. And maybe the fact that ALTER TABLE is limited, but there are workarounds and it's not like you'll be doing that much in production anyways.
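To make the mixed-type behavior concrete, a minimal sketch with Python's stdlib sqlite3 (table and column names made up):

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (x)")  # a column with no declared type at all is legal
    con.executemany("INSERT INTO t VALUES (?)",
                    [(1,), ("one",), (1.0,), (None,), (b"\x00",)])
    for value, vtype in con.execute("SELECT x, typeof(x) FROM t"):
        print(value, vtype)
    # 1 integer / one text / 1.0 real / None null / b'\x00' blob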
What are your specific problems with it?
Firebird also fits the bill, I think, but never took off. Firebird even supports client-server deployments.
I think they do a good job with test coverage, compatibility, and sustainable support. Can't say that about almost every other hype database made by a Fortune 500 and shut down 3 years later.
Exactly as in MS Access, Interbase/Firebird, and dBase II.