Anti-hallucination support, pause tolerance

Hey!

I'm aware that the engineering behind all this is very complicated and that this would be really hard to achieve, but I thought I would throw it out there nevertheless.

As you helpfully pointed out, ASR (automatic speech recognition) has the amusing quirk of suffering from hallucinations, much as large language models do generally. In ASR, this shows up as nonsensical text like “thank you for watching” being added to the end of a transcription!

I've really come to use speech to text as my daily typing method entirely thanks to your great app!

So I still feel like I'm discovering how I use it, as opposed to ordinary typing.

One thing that I really like to do is to pause for thought while I'm in the middle of dictating.

The problem is that if you leave a long enough gap, you greatly increase the probability of hallucinations, and at a certain point they're unavoidable.

I don't really have any thoughts on what you could try to do to avoid this. If the pause detection is too aggressive then you run the risk of not capturing user text.

But seeing as you've already made this amazing extension, I figure you know a lot more about the engineering than I can speculate about.

Perhaps something like a user-configurable pause detection threshold would be helpful, allowing those who dictate cleanly and those who like to pause for thought to choose a setting that best reflects their style.
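To make the idea a bit more concrete, here is a rough sketch of what I have in mind. I don't know how the app actually processes audio, so everything here is an assumption: the function name `split_on_pauses`, the per-frame energy input, and the `silence_energy` cutoff are all made up for illustration. The idea is that silent stretches at least as long as the user's threshold are dropped before the audio reaches the recognizer, while shorter pauses stay inside the segment.

```python
from typing import List, Tuple


def split_on_pauses(
    frame_energies: List[float],
    frame_ms: int,
    pause_threshold_ms: int,
    silence_energy: float = 0.01,  # hypothetical cutoff, would need tuning
) -> List[Tuple[int, int]]:
    """Return (start_frame, end_frame) speech segments, end exclusive.

    A silent run only closes a segment once it has lasted at least
    pause_threshold_ms. Shorter pauses are kept inside the segment.
    """
    min_silent_frames = max(1, pause_threshold_ms // frame_ms)
    segments: List[Tuple[int, int]] = []
    start = None      # first frame of the segment currently being built
    silent_run = 0    # consecutive silent frames seen inside that segment

    for i, energy in enumerate(frame_energies):
        if energy >= silence_energy:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silent_frames:
                # Close the segment just after the last voiced frame.
                segments.append((start, i - silent_run + 1))
                start = None
                silent_run = 0

    if start is not None:
        # Recording ended mid-segment: trim any trailing silence.
        segments.append((start, len(frame_energies) - silent_run))
    return segments


# Ten 100 ms frames: speech, a short pause, speech, a long pause, speech.
energies = [0.5, 0.5, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.5]
print(split_on_pauses(energies, frame_ms=100, pause_threshold_ms=300))
print(split_on_pauses(energies, frame_ms=100, pause_threshold_ms=200))
```

With the 300 ms setting, the two-frame pause is kept inside the first segment and only the four-frame gap is cut out. With the 200 ms setting, both pauses split the audio. That is the trade-off I mean: a lower threshold removes more silence but chops up the dictation more aggressively.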

Status

In Progress

Board
💡

Feature Request

Date

18 days ago

Author

Daniel Rosehill
