In my master's thesis we used SUMO to model a small part of our town and hooked it up to the latest and greatest reinforcement learning algorithms to learn traffic light control. Eventually we beat all the built-in conventional algorithms on most metrics: average speed, emissions, etc.

You might be interested in Google's real-world implementation: https://sites.research.google/gr/greenlight/