This is a Next.js-based demo of AI.JSX, a JavaScript framework for building applications with Large Language Models. For more information, see the AI.JSX docs or the source code for this app.

This demo exercises several real-time ASR (speech-to-text) implementations. You can see how they perform on a stock recording using Start File, or use Start Mic to try them with your own voice.

Latency is computed for each partial and final transcript, and the average value is displayed. When using a file, Word Error Rate (WER) is computed against the ground truth transcript, ignoring punctuation.
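As a rough illustration of the two metrics described above, here is a minimal TypeScript sketch. It is not taken from this app's source: the function names (normalizeWords, wordErrorRate, averageLatency) are hypothetical, and the normalization rule (lowercase, keep only ASCII letters, digits, and apostrophes) is an assumption about what "ignoring punctuation" might mean for English text. WER is computed in the standard way, as the word-level edit distance between the transcripts divided by the number of words in the ground truth.

```typescript
// Hypothetical helpers for illustration; not the demo's actual implementation.

// Lowercase the text, replace punctuation with spaces, and split into words.
// Assumption: English text, so only ASCII letters, digits, and apostrophes are kept.
function normalizeWords(text: string): string[] {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9'\s]/g, " ")
    .split(/\s+/)
    .filter((w) => w.length > 0);
}

// WER = (substitutions + deletions + insertions) / number of reference words,
// computed with a word-level Levenshtein distance using two rolling rows.
function wordErrorRate(reference: string, hypothesis: string): number {
  const ref = normalizeWords(reference);
  const hyp = normalizeWords(hypothesis);
  if (ref.length === 0) {
    return hyp.length === 0 ? 0 : 1;
  }
  // prev[j] is the distance between the first i-1 reference words
  // and the first j hypothesis words.
  let prev: number[] = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion (reference word missing from hypothesis)
        curr[j - 1] + 1, // insertion (extra word in hypothesis)
        prev[j - 1] + cost // match or substitution
      );
    }
    prev = curr;
  }
  return prev[hyp.length] / ref.length;
}

// Arithmetic mean of latency samples in milliseconds.
// Returns NaN when there are no samples yet.
function averageLatency(samplesMs: number[]): number {
  if (samplesMs.length === 0) {
    return NaN;
  }
  return samplesMs.reduce((sum, ms) => sum + ms, 0) / samplesMs.length;
}
```

In this sketch, wordErrorRate("the quick brown fox", "the quick fox") is 0.25 (one deleted word out of four), and punctuation or case differences alone do not count as errors. An average over zero samples is NaN, which is consistent with a latency readout of "NaN ms" before any transcript has arrived.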

Deepgram

Partial Latency: NaN ms
Final Latency: NaN ms
WER: 0.000

AssemblyAI

Partial Latency: NaN ms
Final Latency: NaN ms
WER: 0.000

Speechmatics

Partial Latency: NaN ms
Final Latency: NaN ms
WER: 0.000

Rev AI

Cost: $0.02/min
Partial Latency: NaN ms
Final Latency: NaN ms
WER: 0.000

Soniox

Partial Latency: NaN ms
Final Latency: NaN ms
WER: 0.000

Gladia

Partial Latency: NaN ms
Final Latency: NaN ms
WER: 0.000