To be in NP, witness must be verifiable in polynomial time with respect to the size of the original input. In this problem (VAS Reachability), the solution can be `2^2^2^...^K` steps long. Even if that's linear with respect to the witness, it's not polynomial with respect to the set of moves + goal.
Hmm.. I'd love to see a more formal statement of this, because it feels unintuitive.
Notably the question "given a number as input, output as many 1's as that number" is exponential in the input size. Is this problem therefore also strictly NP-hard?
> Hmm.. I'd love to see a more formal statement of this, because it feels unintuitive.
The problem is called "Vector Addition System Reachability", and the proof that it's Ackermann-complete is here: https://arxiv.org/pdf/2104.13866 (It's actually for "Vector Addition Systems with states, but the two are equivalent formalisms. They're also equivalent to Petri nets, which is what got me interested in this problem in the first place!)
> Notably the question "given a number as input, output as many 1's as that number" is exponential in the input size. Is this problem therefore also strictly NP-hard?
(Speaking off the cuff, so there's probably a mistake or two here, computability is hard and subtle!)
Compression features in a lot of NP-hard problems! For example, it's possible to represent some graphs as "boolean circuits", eg a function `f(a, b)` that's true iff nodes `a,b` have an edge between them. This can use exponentially less space, which means a normal NP-complete problem on the graph becomes NEXPTIME-complete if the graph is encoded as a circuit.
IMO this is cheating, which is why I don't like it as an example.
"Given K as input, output K 1's" is not a decision problem because it's not a boolean "yes/no". "Given K as input, does it output K ones" is a decision problem. But IIRC then for `K=3` your input is `(3, 111)` so it's still linear on the input size. I think.
My point here more than anything else is that I find this formulation unsatisfying because it is "easy" to verify that we have a witness, but is exponential only because the size of the witness is exponential in size.
What I'd like as a minimum example for "give me something that is NP-hard but not NP-complete" Is a problem whose input size is O(N), whose witnesses are O(N), but which requires O(e^N) time to validate that the witness is correct. I don't actually know that this is possible.
I disagree with the formulation of "decision problem" here. The problem, properly phrased as a decision problem, is "For a given K, does there exist a string with K 1s".
While it is straightforward to answer in the affirmative, and to produce an algorithm that produces that output (though it takes exponential time). To validate that a solution is in fact a valid solution also takes exponential time, if we are treating the size of the input as the base for the validation of the witness.
In the definition of NP, you get the input in unary, so that gets rid of that exponential. The issue with VAS is that the number of moves required could be extremely large.
There's a similar situation in chess (at least if generalized to N by N). You could assert Black has a forced win from a given position, but verifying the claim requires checking the entire game tree, rather than a succinct certificate. Generalized chess is in fact PSPACE-complete, which is generally believed to be outside of NP.
It needs to be a decision problem (or easily recast as a decision problem). "given a number as input, output as many 1's as that number" doesn't have a yes or no answer. You could try to ask a related question like "given a number as input and a list of 1s, are there as many 1s as the number?" but that has a very large input.
Straightforward to pose as a decision problem -- you're given a sequence of {0,1} representing a base-2 input (N), output a string of {0, 1} such that there are N 1's. You could make it "N 1s followed by N 0s" to avoid being too trivial, but even in the original formulation there are plenty of "wrong" answers for each value of N, and determining whether a witness is correct requires O(2^b) time, where b is the number of bits in N.
That's not a decision problem either. You need to cast it in the form of determining membership in some set of strings (the "language"). In this case the natural choice of language is the set of all strings of the form "{encoding of n}{1111....1 n times}", which is decidable in O(log(n)+n) steps, i.e. linear time.