This demo exercises several real-time ASR (speech-to-text) implementations. Use Start File to see how they perform on a stock recording, or use Start Mic to try them with your own voice.
Latency is computed for each partial and final transcript, and the average is displayed. When using a file, Word Error Rate (WER) is also computed against the ground-truth transcript, ignoring punctuation.
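
For reference, WER is conventionally the word-level edit distance (substitutions + deletions + insertions) divided by the number of words in the reference. The sketch below shows one way to compute it with punctuation ignored. It is an illustration, not the demo's actual code: the function names `normalize` and `word_error_rate` are hypothetical, and lowercasing before comparison is an assumption the description above does not confirm.

```python
import string


def normalize(text: str) -> list[str]:
    """Lowercase (an assumption), drop ASCII punctuation, split into words."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = normalize(reference)
    hyp = normalize(hypothesis)
    if not ref:
        raise ValueError("reference transcript has no words")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

With this definition, `word_error_rate("Hello, world!", "hello world")` is 0.0 because the only differences are punctuation and case. WER can exceed 1.0 when the hypothesis contains many inserted words.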