GA’s optimize only combinatorial problems though — where you have discrete set of choices (~genes) for each variable, and therefore do not have a gradient

You can use a GA with continuous parameters and a smooth gradient but it probably isn't the most efficient method in that case.