About how many training steps are required to get good output?
I trained for 12,000 steps with 4 layers, and the output is kind of name-like, but after 20 or so generations it hadn't reproduced any actual name from its training data.
Not many. Diminishing returns start before 1,000 steps; past that, you should just add a second or third layer.
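If it helps, here is a rough way to check how many generated samples are exact copies of training names. Keep in mind that 20 generations is a small sample, so zero matches on its own doesn't say much. This is only a sketch: `count_memorized` and the toy lists are made up for illustration and are not from any particular codebase.

```python
def count_memorized(samples, training_names):
    """Return the generated samples that exactly match a training name."""
    # Normalize case and whitespace so "Emma\n" and "emma" count as the same name.
    train = {name.strip().lower() for name in training_names}
    return [s for s in samples if s.strip().lower() in train]


# Toy data, purely illustrative (not from any real run).
training_names = ["emma", "olivia", "ava", "mia"]
samples = ["emla", "Ava", "olivia\n", "kaden"]

matches = count_memorized(samples, training_names)
print(f"{len(matches)}/{len(samples)} samples are exact training names")
```

Running this over a few hundred samples instead of 20 should give a steadier picture of how often the model copies versus invents.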