@tilman_kamp With the forced alignment tool, are there any particular approaches to take with regard to disfluencies (especially fillers like "um", "ah", etc.)?
I’m interested in using it for a source that has sections of audio where fillers occur fairly frequently. I suspect it would be labour-intensive to identify them and adjust the transcripts (which don’t currently reference them; they’re just regular text). I realise the best test is to try it, but do you have any feeling for how resilient it is to that kind of thing in general?