Hmm this is interesting food for thought, but I mainly disagree with the "Foreign Key Constraints" section:
> A foreign key constraint forces user_id to correspond to an actual row in the users table. If you try to create or update a post with user_id 999, and there is no user with that id, the foreign key constraint will cause the SQL query to fail.
> This sounds great, right? A record pointing at a non-existent user is in an invalid state. Shouldn’t we want it to be impossible to represent invalid states? However, many large tech companies - including the two I’ve worked for, GitHub and Zendesk - deliberately choose not to use foreign key constraints. Why not?
> The main reason is flexibility. In practice, it’s much easier to deal with some illegal states in application logic (like posts with no user attached) than it is to deal with the constraint. With foreign key constraints, you have to delete all related records when a parent record is deleted. That might be okay for users and posts - though it could become a very expensive operation - but what about relationships that are less solid? If a post has a reviewer_id, what happens when that reviewer’s account is deleted? It doesn’t seem right to delete the post, surely. And so on.
So if you need to keep reviews by a user when the user can be deleted, you have a few options:
- Use a foreign key constraint but also allow nullable foreign key values (I believe DBMSes such as postgres and mysql support this)
- Create a record representing an orphaned user and move reviews to it when the user is deleted (generally, the reviews will then show as posted by user "DELETED", though you can also enforce different ways to display deleted user associations)
- Soft delete the users (in this case you also have to exercise caution around what ex-user data may be displayed where)
There are tradeoffs with each one; soft deletion of users would likely run afoul of Right to be Forgotten laws in some places. I'd tend to favour option 1 for anything that references something like a user account where the user may be deleted, but option 2 might be useful in situations as well.
I'd generally recommend against omitting foreign key constraints when it can be assumed databases backing your application will support them.
Sure, losing the FK constraint gives you more "flexibility" (I guess?) but introduces way too many footguns. The database and tooling are a massive help in avoiding so many other bugs.
Sure, maybe one day you decide reviews can have authorship associated with entities which are not users of your system (say you've acquired republication data rights from another review service whose users are not your service's users); in that case you'd need an application-level refactor and migration. You might need to add an author table, where the author can be a user of your service, or a reference to a user of another service . Then users of your service become possible authors, and all reviews need to be migrated to have an FK relation to the author table now rather than the user table.
In any case, the FK doesn't prevent you from changing how your data is structured, and I'd argue it greatly helps you avoid mistakes as you move through the migration.