It seems very strange to define these terms based on how difficult the things they describe are to reproduce.

Let's look at the sibling comment's example of a nuclear bomb. A nuclear bomb is "not simple for anyone to reproduce without significant access" and as citizens we don't "have a say in the security practices used to safeguard it." And international laws have done a relatively good job of keeping nuclear bombs out of the hands of bad actors. Does that make a nuclear bomb a dataset?

Contrast that with data that is easy to reproduce, like, say, the names of the 45 different Presidents of the US. That is obviously a dataset. Yet there is no private information involved; it is all public data. Many people can even produce that list entirely from memory. But having that list on a piece of paper in front of me could still be a helpful tool if I were taking a US history test.