> This is also why PICK can usefully fail. Sometimes none of the model’s candidates is right, and PICK ends with zero survivors. Under the spec-elucidation reading, that outcome means: the commitments you made through classification could not be satisfied by anything the model produced. Better to know than to ship the regex anyway.
Zooming out (but only a little) from the impetus to formalize a commitment to a particular class of result candidate (what the author here is calling "spec elucidation"), we can also imagine this same evolution of concerns being applied in order to cause what we currently term "AI safety" into something more like "AI ethics".
For example, if we can elucidate the specifications for things like peace and justice to ensure that the class of results is formally verified as non-participation in war (or perhaps, further in the future, non-participation in state activities whatsoever), we may be able to throw cold water on all the vitriolic arguments about model capabilities and which need to be banned or delayed lest we accelerate the apocalypse (or whatever is actually on the mind of the ban-this-model constituency).
I like how the author ends tersely with:
> If you have a formal language with the closure properties above — we suspect you would be surprised how many do — we would very much like to hear from you.
That's certainly not me, but I bet it's true that it's somebody.
> ensure that the class of results is formally verified as non-participation in war
There are very few things that cannot be stated as dual use, with one totally benign and one totally screwed up. It's like wanting a hammer to distinguish if it's striking a nail for a roof vs. a nail for an illegal animal pen. That's the wrong application of constraints. The hammer shouldn't care.
The author addresses this point as well:
> This is also why we do not believe PICK becomes less useful as models improve. Better models do not make user intent more articulate — asked for “a regex matching countries of North America”, a more capable model still cannot tell you whether you want the Caribbean included, or where you want to stop heading south. Better models produce better candidates, faster — which shifts user effort precisely toward the work PICK is built to support.
That's not I'm saying tho. I quoted the "non-participation in war" bit. I don't see how any system can ascertain if a prompt asking for an algorithm is dual use or not.
Well, taking the example in the article about a formal verification to determine whether a regex is for a collection of phone numbers or countries, and if a country is specified as "America", to formally verify whether that means North America or all Americas including the Caribbean:
A regex which "properly" determines this in a witness/formal verification model is subject to the same sorts of political distinctions and nuances, right down to subjective interpretation of which nation-states are legitimate, etc, that participation in warfare/predation/injustice are.
If the USA says that Cuba is part of the USA, and the rest of the Western Hemisphere says it's not, then the formal verification is necessarily matching the prompt intention to a social and political milieu, which, while subjective, is still cognizable and subject to introspection and, in some cases, a type of consensus.
It seems similarly possible to formally verify that results do or do not meet similar subjective-but-cognizable criteria: does the result trigger engagement of weapons systems? Does it they implicate borders in a way that diminishes the sovereignty of a people? (which, like the regex example, have subjective but cognizable nuances)? Does the result serve to enrich (a perhaps pre-supplied) list of criminals/cartels/contractors/war profiteers? All of these seem like similar problems to the regex example.
I don't disagree with your high-level contention that powerful dual-use technology can always cut through the good and evil in each of us. But I think that's precisely the point of searching for formally-verifiable spec-elucidation. If it were obvious from the outset whether or not a particular word represents the same of a nation-state widely regarded as sovereign, or whether an action was or was not an act of war, then we'd not need AI (let alone formal verification of a prompt-result match) in the first place.