You can get a consistent voice by providing a sample. And yeah, the timing stuff is what you have to work around: you basically have to chunk your inputs.
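For the chunking part, here's a rough sketch of what I mean. It's just one way to do it, not tied to any particular tool: `chunk_text` and the `max_chars` limit are names and numbers I made up for illustration, so tune the limit to whatever length your model handles before the timing drifts. It splits on sentence endings and packs sentences together until the next one would go over the limit.

```python
import re


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Split text into chunks of at most max_chars, breaking at sentence ends.

    max_chars is an assumed limit for illustration; pick a value that suits
    the model you are actually using.
    """
    # Split after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit gets hard-split,
        # preferring to cut at a space so words stay whole.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            cut = sentence.rfind(" ", 0, max_chars + 1)
            if cut <= 0:
                cut = max_chars  # no usable space, cut mid-word
            chunks.append(sentence[:cut])
            sentence = sentence[cut:].lstrip()
        if not sentence:
            continue
        # Pack sentences together while they still fit in one chunk.
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("One. Two. Three.", max_chars=10):
        print(chunk)
```

Then you'd run each chunk through generation with the same voice sample and stitch the audio back together in order. Breaking at sentence boundaries tends to keep the joins less noticeable than cutting mid-sentence.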