That’s a false dichotomy.
Sensitive research systems thread that needle by giving researchers remote access while the data stays under the control and supervision of the responsible organization: strong internal access controls and data siloing, alongside strict, verified extraction routines. Specifically: limited, project-dedicated DB access; full logging of every data interaction; and immediate lockouts/freezes if something looks off.
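A toy sketch of the application-layer version of that, with made-up names and thresholds (a real secure research environment enforces most of this at the database, identity, and network layers rather than in application code):

    # Hypothetical project-scoped query gateway: logs every data
    # interaction and freezes the account when an extraction looks off.
    import sqlite3
    import logging
    from datetime import datetime, timezone

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("access-audit")

    ROW_LIMIT = 10_000          # assumed per-query extraction ceiling
    FROZEN_RESEARCHERS = set()  # stand-in for a real lockout store


    class ProjectGateway:
        """All researcher queries go through here, never to the DB directly."""

        def __init__(self, db_path: str, project_id: str, researcher: str):
            self.conn = sqlite3.connect(db_path)
            self.project_id = project_id
            self.researcher = researcher

        def query(self, sql: str, params: tuple = ()):
            if self.researcher in FROZEN_RESEARCHERS:
                raise PermissionError("account frozen pending review")

            rows = self.conn.execute(sql, params).fetchall()

            # Full logging of the data interaction.
            log.info("%s project=%s researcher=%s rows=%d sql=%r",
                     datetime.now(timezone.utc).isoformat(),
                     self.project_id, self.researcher, len(rows), sql)

            # Automatic freeze if something feels off (here: bulk extraction).
            if len(rows) > ROW_LIMIT:
                FROZEN_RESEARCHERS.add(self.researcher)
                log.warning("freezing %s: %d rows exceeds limit",
                            self.researcher, len(rows))
                raise PermissionError("extraction over limit; access frozen")

            return rows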
'The Five Safes' is a good framework covering these approaches; it came out of the UK's Office for National Statistics (ONS, if I remember right) about a decade ago.
Data publishing restrictions around health data aren’t reckless. Modern computing and digital permanence mean we have to be extra cautious.
No, this is a real tradeoff.
Any friction you add to the "access the data" process makes it harder for legitimate researchers to get access to, and get benefits from, that data.
So, at what point do stricter data controls start to choke off legitimate research?
We have dozens of data/DB startups; kinda odd that there isn't one (that I've seen) focused on this problem.
Perhaps our future AI overlords will feel it's important to compartmentalise and log data access more aggressively.