This Red/Blue submarine problem seems a better fit for agent-based modeling (ABM) than for a Monte Carlo simulation based on Markov processes.
IRL this will be path dependent, since both sides will learn from past actions and the probabilities will keep changing, i.e. the memoryless (Markov) property will not hold.
In an ABM the ships (agents) can move in 2D space, which makes detection easier to model.
Also, there are obviously lots of external factors: supplies of weapons, food, and sailors, ceasefires, surrenders, politics, etc.
All of the above is easier to simulate with an ABM than with Monte Carlo.
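
To make this concrete, here is a minimal sketch of what such an ABM could look like in plain Python. Everything in it is my own assumption for illustration, not part of the original problem: the `Sub` class, the `simulate` function, the random-walk movement, the square grid, the rule that a detected sub is removed, and the numbers (starting detection range 2.0, learning bonus 0.5). The only points it tries to show are agents moving in 2D, detection by distance, and a detection range that depends on each agent's own history.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Sub:
    side: str            # "red" or "blue"
    x: float
    y: float
    detect_range: float  # assumed to grow with each successful detection
    alive: bool = True

    def move(self, rng, size, step=1.0):
        # Random walk in 2D, clamped to the square [0, size] x [0, size].
        angle = rng.uniform(0.0, 2.0 * math.pi)
        self.x = min(max(self.x + step * math.cos(angle), 0.0), size)
        self.y = min(max(self.y + step * math.sin(angle), 0.0), size)


def survivors(subs):
    # Count the subs still alive on each side.
    return {side: sum(1 for s in subs if s.alive and s.side == side)
            for side in ("red", "blue")}


def simulate(n_per_side=5, size=20.0, max_steps=200, seed=0):
    rng = random.Random(seed)
    subs = [Sub(side, rng.uniform(0.0, size), rng.uniform(0.0, size), 2.0)
            for side in ("red", "blue") for _ in range(n_per_side)]
    for _ in range(max_steps):
        for s in subs:
            if s.alive:
                s.move(rng, size)
        for hunter in subs:
            if not hunter.alive:
                continue
            for target in subs:
                if not target.alive or target.side == hunter.side:
                    continue
                dist = math.hypot(hunter.x - target.x, hunter.y - target.y)
                if dist <= hunter.detect_range:
                    target.alive = False        # assumed rule: detected means sunk
                    hunter.detect_range += 0.5  # learning: the path-dependent part
                    break                       # at most one engagement per step
        if min(survivors(subs).values()) == 0:
            break
    return survivors(subs)


if __name__ == "__main__":
    print(simulate(seed=1))
```

Supply limits, ceasefires, or surrender rules would go in as extra agent attributes and extra checks inside the step loop, which is where an ABM tends to be more convenient than a fixed transition matrix.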