I think a logic language is not a great fit, because it deals in absolute truths. Whereas news and non-fiction articles are more someone said something and they are usually reliable except in this particular domain where they can be a bit biased sometimes. Also the people they interviewed may have misremembered. Etc etc.
You could argue that all such things could in principle be modelled in logical formulas but... it would be way more complicated than the stated model.
On the other hand it's also unclear what the logical model actually adds. Usually logical formulas are used because they can be manipulated, and inferences can be drawn. But drawing logical inferences from sorta-kinda true statements can quickly become muddy and lead to false conclusions if you don't take into account the uncertainty of the statements.
The person suggesting it had simple heuristics like "this proposition was asserted by three sources". This has obvious flaws (e.g. I can cite three lying sources). But even on Wikipedia, there is no automatic checking that sources say what's claimed. So it wouldn't be useless despite having obvious flaws.
But anyway, if you have heuristics like this you can make them propositions and do inference with them. Instead of thinking of it as "I've proved this false" or "the citations are correct" perhaps think of it more like a lint that runs against your code base and tells you whether you've done something that falls below expectations.
A more natural way to model it would be something like a Bayesian system where you assign probabilities, build up a hierarchical model, and flow probabilities through the model. But there's something nice about a simpler system that doesn't pretend to do more than it can.
You can certainly build up a collection of probably-true-statements. That makes sense. Encoding them as logical formula makes sense. That's basically what you are describing. But OP wanted to then additionally put those formula into Lean, and that's where I disagree. Because he will likely have inconsistent statements in his collection and then he can prove all sorts of absurdities (principle of explosion).
So IIUC like if an article is covering a debate, 3 sources assert "A", and another 3 assert "~A", then the heuristic "3 sources assert means it's true", gives you a logical setup with "A ^ ~A"?
If so, then yes that's a bug in that heuristic. Like I said in my first comment, what OP is describing is impossible the way they're describing it. So I think in that sense we're in agreement.
But on the other hand, maybe OP will end up hacking together a thing that resembles probabilistic modal logic.
At the time I wrote that comment OP was being downvoted and there were no encouraging responses, which I felt wasn't representative of the math community as I know it. Now there is a good discussion of the problem space and some suggestions to check out different kinds of logic, so I'm glad to see all that going on!
There are many variations, but I’m on mobile rn. (Is this the new Fermat’s last theorem?)
But seriously, the idea of breaking claims into logic and using programming languages to check, compose, and verify claims is kinda the reason logical programming languages like prolog were invented.
Argument mining and automatic fact checking are things and there are actual products that do them.
It seems reasonable to have some of those products implemented in a logic language.
I think a logic language is not a great fit, because it deals in absolute truths. Whereas news and non-fiction articles are more someone said something and they are usually reliable except in this particular domain where they can be a bit biased sometimes. Also the people they interviewed may have misremembered. Etc etc.
You could argue that all such things could in principle be modelled in logical formulas but... it would be way more complicated than the stated model.
On the other hand it's also unclear what the logical model actually adds. Usually logical formulas are used because they can be manipulated, and inferences can be drawn. But drawing logical inferences from sorta-kinda true statements can quickly become muddy and lead to false conclusions if you don't take into account the uncertainty of the statements.
The person suggesting it had simple heuristics like "this proposition was asserted by three sources". This has obvious flaws (e.g. I can cite three lying sources). But even on Wikipedia, there is no automatic checking that sources say what's claimed. So it wouldn't be useless despite having obvious flaws.
But anyway, if you have heuristics like this you can make them propositions and do inference with them. Instead of thinking of it as "I've proved this false" or "the citations are correct" perhaps think of it more like a lint that runs against your code base and tells you whether you've done something that falls below expectations.
A more natural way to model it would be something like a Bayesian system where you assign probabilities, build up a hierarchical model, and flow probabilities through the model. But there's something nice about a simpler system that doesn't pretend to do more than it can.
You can certainly build up a collection of probably-true-statements. That makes sense. Encoding them as logical formula makes sense. That's basically what you are describing. But OP wanted to then additionally put those formula into Lean, and that's where I disagree. Because he will likely have inconsistent statements in his collection and then he can prove all sorts of absurdities (principle of explosion).
So IIUC like if an article is covering a debate, 3 sources assert "A", and another 3 assert "~A", then the heuristic "3 sources assert means it's true", gives you a logical setup with "A ^ ~A"?
If so, then yes that's a bug in that heuristic. Like I said in my first comment, what OP is describing is impossible the way they're describing it. So I think in that sense we're in agreement.
But on the other hand, maybe OP will end up hacking together a thing that resembles probabilistic modal logic.
At the time I wrote that comment OP was being downvoted and there were no encouraging responses, which I felt wasn't representative of the math community as I know it. Now there is a good discussion of the problem space and some suggestions to check out different kinds of logic, so I'm glad to see all that going on!
Plenty of logics with non-binary truth values...
But such system exists, mostly implemented in prolog.
Does it actually?
The first thing this thread made me think of:
https://cyc.com/
It's been around for decades.. I'm sure there are other similar systems.
cool thanks for the link
there is also whole rdf/owl/semantic web ecosystem, with tons of companies/engines/datasets, mostly small.
There are many variations, but I’m on mobile rn. (Is this the new Fermat’s last theorem?)
But seriously, the idea of breaking claims into logic and using programming languages to check, compose, and verify claims is kinda the reason logical programming languages like prolog were invented.