What about deterministic parsing?

Basically, using templates to extract info from documents that share a recurring structure?
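
To make the idea concrete, here is a rough sketch of what template-based extraction could look like, assuming the documents are already plain text and follow a fixed layout. The field names, the regex patterns, and the sample invoice are all made up for illustration; a real template would be written against your actual documents.

```python
import re

# Hypothetical template: one regex per field, written for a layout
# that is assumed to be identical across documents of this type.
INVOICE_TEMPLATE = {
    "invoice_number": re.compile(r"^Invoice\s*(?:No\.?|#)\s*:?\s*(\S+)", re.MULTILINE),
    "date": re.compile(r"^Date\s*:\s*(\d{4}-\d{2}-\d{2})", re.MULTILINE),
    "total": re.compile(r"^Total\s*:\s*\$?([\d,]+\.\d{2})", re.MULTILINE),
}


def extract_fields(text, template):
    """Apply each pattern in the template; fields that do not match are None."""
    result = {}
    for name, pattern in template.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


# Made-up sample document that follows the assumed layout.
SAMPLE = """Invoice No: INV-0042
Date: 2024-03-15
Customer: Example Corp
Total: $1,234.50
"""

fields = extract_fields(SAMPLE, INVOICE_TEMPLATE)
print(fields)
```

The same input always gives the same output, and a field that does not match comes back as `None` instead of a guess, so layout drift shows up as an obvious failure. The trade-off is that every new layout needs its own template.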