Proposal: Switching from Whisper to the new GPT-4o-transcribe model

As an active user of your Chrome extension, I'd like to suggest an improvement that would significantly enhance transcription accuracy.

1. Current issue

I've been using the extension for audio transcription for over a month. When transcribing Russian speech, it makes numerous errors. Correcting them by hand takes a lot of time and makes the work much less efficient.

2. Solution: Transition to GPT-4o-transcribe

OpenAI has released a new transcription model, GPT-4o-transcribe, which according to its announcement substantially outperforms the Whisper model: https://openai.com/index/introducing-our-next-generation-audio-models/

Key advantages:

  • Significantly improved accuracy (reduced Word Error Rate), tested on more than 100 languages

  • Better recognition of accents and regional speech patterns

  • Increased resilience to background noise during recording

  • Adaptation to varying speech speeds

  • Fewer misrecognitions of complex words

  • Better context understanding and recognition of specific terminology

3. Alternative solution

If the full GPT-4o-transcribe version proves too expensive, you could implement gpt-4o-mini-transcribe. Even this mini version significantly outperforms the current Whisper model in terms of accuracy and reliability.

4. Simple integration

This enhancement requires minimal effort: simply changing the model name used in the API request.
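To illustrate, here is a minimal sketch of what the change might look like. I don't know how the extension actually builds its request, so this assumes it calls OpenAI's `/v1/audio/transcriptions` endpoint directly with `fetch`. The names `buildTranscriptionForm`, `transcribe`, and the file name `recording.webm` are hypothetical. One caveat, as far as I understand the API: the newer models return only `json` or `text`, so if the extension relies on `verbose_json`, `srt`, or `vtt` output from Whisper, that part would need adjusting too.

```typescript
// Hypothetical sketch: the extension's real request code is not shown in this post.
type TranscriptionModel =
  | "whisper-1"
  | "gpt-4o-transcribe"
  | "gpt-4o-mini-transcribe";

const TRANSCRIPTION_URL = "https://api.openai.com/v1/audio/transcriptions";

// Build the multipart body. Only the `model` field changes between models.
function buildTranscriptionForm(
  audio: Blob,
  model: TranscriptionModel,
  language?: string
): FormData {
  const form = new FormData();
  form.append("file", audio, "recording.webm"); // assumed file name
  form.append("model", model);
  form.append("response_format", "json"); // plain JSON with a `text` field
  if (language) {
    form.append("language", language); // optional ISO-639-1 hint, e.g. "ru"
  }
  return form;
}

// Send the audio and return the transcribed text.
async function transcribe(
  audio: Blob,
  apiKey: string,
  model: TranscriptionModel = "gpt-4o-transcribe",
  language?: string
): Promise<string> {
  const response = await fetch(TRANSCRIPTION_URL, {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: buildTranscriptionForm(audio, model, language),
  });
  if (!response.ok) {
    throw new Error(`Transcription failed with status ${response.status}`);
  }
  const data = (await response.json()) as { text: string };
  return data.text;
}
```

With a helper like this, switching to the cheaper model would be a one-argument change, for example `transcribe(audio, apiKey, "gpt-4o-mini-transcribe", "ru")`.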

Status

Completed

Board
💡

Feature Request / Bug Report

Date

11 months ago

Author

Web