Perplexity has long been a widely used metric for evaluating AI-generated text. For those unfamiliar, perplexity measures how well a language model predicts a given sequence of words. Lower perplexity means the model considers the text more likely. Loosely, you can think of low-perplexity text as text that ‘flows well’ from the model's point of view.
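
To make this concrete, here is a minimal sketch in Python. It assumes the usual definition: perplexity is the exponential of the average negative log-probability the model assigns to each token. The `perplexity` helper and the probability lists are hypothetical and only for illustration; in practice the per-token probabilities would come from a real language model.

```python
import math


def perplexity(token_probs):
    """Perplexity from per-token probabilities p(w_i | w_1..w_{i-1}).

    Illustrative helper (not from any library): takes the probability
    the model assigned to each token that actually appeared.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log-likelihood per token.
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    # Exponentiating turns it back into an "effective number of choices".
    return math.exp(avg_nll)


# Made-up probabilities, purely for illustration.
fluent = [0.5, 0.4, 0.6, 0.5]        # model finds each word fairly likely
surprising = [0.05, 0.01, 0.1, 0.02]  # model finds each word unlikely

print(perplexity(fluent))
print(perplexity(surprising))
```

The first list gives a much lower perplexity than the second. A handy sanity check: if the model assigns probability 0.25 to every token, the perplexity is 4 (up to floating-point rounding), as if the model were choosing uniformly among four options at each step.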