if you generate a lot of output text you can approximate the hidden state.