If you are in the US, have you considered suing them for ignoring robots.txt or for copyright violation? AI companies are currently flush with cash from VCs, and there may be a few big law firms willing to pursue a lawsuit against them on your behalf. AI companies have already lost some copyright cases.
Based on traffic you could tell whether an IP or request pattern is coming from a bot, but how would you reliably tell which company is DDoSing you?
It should be at least theoretically possible. Each IP address is assigned to an organisation running the IP routing prefix, which you can look up easily, and that organisation should have some sort of abuse channel. At the very least, a legal system should be able to compel them to cooperate and give up the information they’re required to have.
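To make the lookup step concrete, here is a rough Python sketch that pulls abuse contacts out of an RDAP record for an IP address. It assumes the public rdap.org redirector is reachable and that the registry publishes an entity with the "abuse" role; the function names `lookup_ip` and `abuse_emails` are made up for illustration. Note that this only tells you who holds the address block (often a cloud or hosting provider), not necessarily which company is running the crawler on it.

```python
import json
import urllib.request


def abuse_emails(rdap):
    """Collect email addresses of entities carrying the 'abuse' role."""
    found = []
    for entity in rdap.get("entities", []):
        if "abuse" in entity.get("roles", []):
            # vcardArray is a jCard: ["vcard", [[name, params, type, value], ...]]
            vcard = entity.get("vcardArray", ["vcard", []])
            for prop in vcard[1]:
                if prop[0] == "email":
                    found.append(prop[3])
        # Entities can be nested, e.g. an abuse contact inside a registrant.
        found.extend(abuse_emails(entity))
    return found


def lookup_ip(ip):
    """Fetch the RDAP record for an IP address (needs network access)."""
    # Assumption: rdap.org redirects to the responsible regional registry.
    url = "https://rdap.org/ip/" + ip
    req = urllib.request.Request(url, headers={"Accept": "application/rdap+json"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

With an address taken from your logs, `abuse_emails(lookup_ip(address))` would return whatever abuse mailboxes the registry lists, which may be an empty list if none are published.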