During training, the output of the last hidden layer is used directly to calculate the CTC loss, but during inference a softmax activation function is applied to the output of the last hidden layer before it is sent to the CTC loss function.
tf.nn.ctc_loss applies the softmax internally, so during training it is given the raw, unnormalized outputs. The decoder, on the other hand, expects its input to already have softmax applied to it.
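To make the difference concrete, here is a minimal sketch in plain Python. The `softmax` helper and the `logits` values are made up for illustration and are not taken from any particular model. It only shows the normalization step that has to happen before decoding, not the loss or the decoder themselves.

```python
import math

def softmax(logits):
    """Numerically stable softmax over the class scores of one timestep."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw outputs of the last layer: 3 timesteps, 4 classes.
logits = [
    [2.0, 0.5, -1.0, 0.1],
    [0.2, 3.0, 0.0, -0.5],
    [-1.0, 0.0, 0.3, 2.5],
]

# Training: `logits` would be passed unchanged to the loss
# (e.g. tf.nn.ctc_loss), which normalizes them itself.

# Inference: normalize each timestep first, then hand the
# resulting probabilities to the decoder.
probs = [softmax(step) for step in logits]
```

Each row of `probs` sums to 1, which is the form a decoder that combines probabilities across timesteps would need. Applying softmax before the loss as well would normalize the outputs twice.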
Ok, got it. Thank you.