Pinned
In Progress
🎉Early Beta Release of BlabbyAI Windows App is Here!
Hey Voice Typers! We’re excited to announce that the early beta version of BlabbyAIWindows app is finally here! Want to Try the Beta Now? Download the Beta Version Your enthusiasm and support have motivated us to bring this to you as soon as possible. However, there’s something important you should know before downloading: ⚠️About the Warning Message The app is now signed with a valid code-signing certificate. However, since it's a new release, Windows may still show a warning message. This is normal—Windows takes some time to fully trust new certificates. If you're comfortable proceeding, you can follow the steps below to bypass the warning. Here’s how to bypass the "unknown publisher" warning: If you encounter any unexpected behavior while using the app, don’t worry! You can easily close the app from the system tray to get things back to normal. We’d love to hear your feedback to make the app even better! Feel free to Request a feature / report a bug in the dedicated feedback portal. https://blabbyai.featurebase.app/ Thanks for being part of this journey! We can’t wait to hear what you think.
BlabbyAI Dev 3 months ago
High Priority
Pinned
In Progress
🎉Early Beta Release of BlabbyAI Windows App is Here!
Hey Voice Typers! We’re excited to announce that the early beta version of BlabbyAIWindows app is finally here! Want to Try the Beta Now? Download the Beta Version Your enthusiasm and support have motivated us to bring this to you as soon as possible. However, there’s something important you should know before downloading: ⚠️About the Warning Message The app is now signed with a valid code-signing certificate. However, since it's a new release, Windows may still show a warning message. This is normal—Windows takes some time to fully trust new certificates. If you're comfortable proceeding, you can follow the steps below to bypass the warning. Here’s how to bypass the "unknown publisher" warning: If you encounter any unexpected behavior while using the app, don’t worry! You can easily close the app from the system tray to get things back to normal. We’d love to hear your feedback to make the app even better! Feel free to Request a feature / report a bug in the dedicated feedback portal. https://blabbyai.featurebase.app/ Thanks for being part of this journey! We can’t wait to hear what you think.
BlabbyAI Dev 3 months ago
High Priority
In Progress
Make standalone app work with Chrome extension.
It's quite often that you copy and paste stuff between different tabs, especially if you work with prompts. And it's quite often that you first want to dictate something and then paste a block of code or just some copied thing. And if you use the stand-alone Windows app, then your dictation will override the current content and the clipboard. In general, it would be nice to have an ability to add like an “exclusion” apps to windows app so Chrome extension could capture control+space hotkey first
Aleksandr Vinogradov 5 days ago
In Progress
Make standalone app work with Chrome extension.
It's quite often that you copy and paste stuff between different tabs, especially if you work with prompts. And it's quite often that you first want to dictate something and then paste a block of code or just some copied thing. And if you use the stand-alone Windows app, then your dictation will override the current content and the clipboard. In general, it would be nice to have an ability to add like an “exclusion” apps to windows app so Chrome extension could capture control+space hotkey first
Aleksandr Vinogradov 5 days ago
In Progress
Yearly pricing option
Hey, Sorry if this is something that I'm just not seeing, or maybe I signed up for the monthly option to test it out. It would be really nice to have the ability to pay for this yearly. For a lot of small business folks like me, even if there’s no monetary incentive to do so, it’s often easier to do yearly subs for good tools just to avoid the hassle of having to capture and account for the receipts every month, etc.
Daniel Rosehill 16 days ago
In Progress
Yearly pricing option
Hey, Sorry if this is something that I'm just not seeing, or maybe I signed up for the monthly option to test it out. It would be really nice to have the ability to pay for this yearly. For a lot of small business folks like me, even if there’s no monetary incentive to do so, it’s often easier to do yearly subs for good tools just to avoid the hassle of having to capture and account for the receipts every month, etc.
Daniel Rosehill 16 days ago
Speech to text prompt library you're welcome to use
Hey! Until you bring out a Linux app, I have to make do with my own very crude Whisper transcription client to fill the gap (there are lots of locally hosted Whisper models but I haven’t found one that uses the API and has direct text input yet). I use this for doing the kind of transcription I do with your tool, but also to capture texts that I then get formatted for use in common situations like creating emails, to-do lists, notes, etc. In the course of making these various prototypes, I've built up a small library of system prompts for what I call text transformation. I'm not sure if this is the official term, but I'm sure that you're familiar with the idea System prompts, which run the dictated text through an LLM to apply some basic transformations I mentioned before that I think it would be really amazing if there was something like a default library in Whisper AI that had the most common of these ready to go so that users could actually not just do direct transcription but also some reformatting or maybe as a secondary functionality. In case you ever do consider moving with the idea, I created an inventory of some of those system prompts earlier this week on GitHub, which you are more than welcome to use for this purpose: https://github.com/danielrosehill/Speech-To-Text-System-Prompt-Library
Daniel Rosehill 18 days ago
Speech to text prompt library you're welcome to use
Hey! Until you bring out a Linux app, I have to make do with my own very crude Whisper transcription client to fill the gap (there are lots of locally hosted Whisper models but I haven’t found one that uses the API and has direct text input yet). I use this for doing the kind of transcription I do with your tool, but also to capture texts that I then get formatted for use in common situations like creating emails, to-do lists, notes, etc. In the course of making these various prototypes, I've built up a small library of system prompts for what I call text transformation. I'm not sure if this is the official term, but I'm sure that you're familiar with the idea System prompts, which run the dictated text through an LLM to apply some basic transformations I mentioned before that I think it would be really amazing if there was something like a default library in Whisper AI that had the most common of these ready to go so that users could actually not just do direct transcription but also some reformatting or maybe as a secondary functionality. In case you ever do consider moving with the idea, I created an inventory of some of those system prompts earlier this week on GitHub, which you are more than welcome to use for this purpose: https://github.com/danielrosehill/Speech-To-Text-System-Prompt-Library
Daniel Rosehill 18 days ago
In Progress
Anti-hallucination support, pause tolerance
Hey! I'm aware that the underlying engineering in getting all this to work is very complicated and that this would be really hard to achieve, but I said I would throw it out nevertheless. As you helpfully pointed out, ASR speech recognition has the amusing quirk of suffering from hallucinations much as large language models do generally. The manifestation in ASDR being that it will add nonsensical language like “thank you for watching” to the end of a transcription! I've really come to use speech to text as my daily typing method entirely thanks to your great app! So I still feel like I'm discovering how I use it as opposed to just text typing. One thing that I really like to do is to pause for thought while I'm in the middle of dictating. The problem is that if you leave long enough of a gap, you greatly increase the probability of hallucinations and at a certain point they're unavoidable. I don't really have any thoughts on what you could try to do to avoid this. If the pause detection is too aggressive then you run the risk of not capturing user text. But seeing as you've already made this amazing extension, I figure you know a lot more about the engineering than I can speculate about. Perhaps something like a user pause detection threshold setting would be helpful and allow those who dictate cleanly and those who like to pause for thought to choose a setting that best reflects their unique style.
Daniel Rosehill 18 days ago
In Progress
Anti-hallucination support, pause tolerance
Hey! I'm aware that the underlying engineering in getting all this to work is very complicated and that this would be really hard to achieve, but I said I would throw it out nevertheless. As you helpfully pointed out, ASR speech recognition has the amusing quirk of suffering from hallucinations much as large language models do generally. The manifestation in ASDR being that it will add nonsensical language like “thank you for watching” to the end of a transcription! I've really come to use speech to text as my daily typing method entirely thanks to your great app! So I still feel like I'm discovering how I use it as opposed to just text typing. One thing that I really like to do is to pause for thought while I'm in the middle of dictating. The problem is that if you leave long enough of a gap, you greatly increase the probability of hallucinations and at a certain point they're unavoidable. I don't really have any thoughts on what you could try to do to avoid this. If the pause detection is too aggressive then you run the risk of not capturing user text. But seeing as you've already made this amazing extension, I figure you know a lot more about the engineering than I can speculate about. Perhaps something like a user pause detection threshold setting would be helpful and allow those who dictate cleanly and those who like to pause for thought to choose a setting that best reflects their unique style.
Daniel Rosehill 18 days ago
In Progress
Keep the PC awake
I noticed that my laptop wasn't going into sleep mode after a period of inactivity. I realized this happened whenever I had Chrome open. So, I tried turning off all the extensions one by one until I figured out that only with this particular extension active, the laptop couldn't go into sleep mode after the idle time. There's probably something in this extension keeping the computer awake, perhaps in the Service Worker. I also have a Chrome extension, so I completely understand that leaving a review wasn't the nicest thing to do to signal this nasty issue. That's why I chose to report it here instead.
Tony Puller 20 days ago
In Progress
Keep the PC awake
I noticed that my laptop wasn't going into sleep mode after a period of inactivity. I realized this happened whenever I had Chrome open. So, I tried turning off all the extensions one by one until I figured out that only with this particular extension active, the laptop couldn't go into sleep mode after the idle time. There's probably something in this extension keeping the computer awake, perhaps in the Service Worker. I also have a Chrome extension, so I completely understand that leaving a review wasn't the nicest thing to do to signal this nasty issue. That's why I chose to report it here instead.
Tony Puller 20 days ago
Add custom spelling globally
[Anonymous user]: I really like the tool. It's a significant productivity boost. What I kind of miss, though, is the ability to ensure the correct spelling of names—like company names, project names, and personal names—across all of my languages. Do I really have to add them to each language individually? That feels quite tedious.
BlabbyAI Dev 28 days ago
Add custom spelling globally
[Anonymous user]: I really like the tool. It's a significant productivity boost. What I kind of miss, though, is the ability to ensure the correct spelling of names—like company names, project names, and personal names—across all of my languages. Do I really have to add them to each language individually? That feels quite tedious.
BlabbyAI Dev 28 days ago
Can't use in Gmail
If you can excuse the jumbled bug report / praise all-in-one! The good: Really delighted to see you introduce the unlimited pricing. This has made my week and it's a great feeling to have this accessible without worrying about running down credits. The bad: Not able to use this in Gmail, perhaps Google are on your tail or maybe it's just a weird Linux related bug but said I would let you know.
Daniel Rosehill 30 days ago
Can't use in Gmail
If you can excuse the jumbled bug report / praise all-in-one! The good: Really delighted to see you introduce the unlimited pricing. This has made my week and it's a great feeling to have this accessible without worrying about running down credits. The bad: Not able to use this in Gmail, perhaps Google are on your tail or maybe it's just a weird Linux related bug but said I would let you know.
Daniel Rosehill 30 days ago
In Progress
Not Auto Pasting into Zoho Desk
I can get this to work in Google Docs, Google Search. But it won’t auto paste the transcription into our Zoho Desk email replies.
Zack Esgar 30 days ago
In Progress
Not Auto Pasting into Zoho Desk
I can get this to work in Google Docs, Google Search. But it won’t auto paste the transcription into our Zoho Desk email replies.
Zack Esgar 30 days ago
Backup And Export Functionality
Hey! I thought that I had captured this here when I was importing my previous list of suggestions. Forgive me if I'm duplicating. I really like the idea of expanding upon a personal dictionary, although I noticed that when I stop speaking for a while and the tool begins hallucinating, it includes my dictionary words. Any chance you could clarify if the entire dictionary is passed on each current and if so whether that adversely affects token consumption? That aside: My only hesitation about adding lots of interesting modes and uploading and developing a personal dictionary is the same as the one that I have with any other SaaS product, app extension or otherwise: I'm fine with creating lots of data in the cloud so long as I can periodically export it, so just in case you vanish off the face of the earth, I know that I'll have something to fall back on. Perhaps something like a simple CSV or JSON export functionality could be implemented to allow the user to periodically grab a backup copy of their personal dictionary for safekeeping.
Daniel Rosehill About 1 month ago
Backup And Export Functionality
Hey! I thought that I had captured this here when I was importing my previous list of suggestions. Forgive me if I'm duplicating. I really like the idea of expanding upon a personal dictionary, although I noticed that when I stop speaking for a while and the tool begins hallucinating, it includes my dictionary words. Any chance you could clarify if the entire dictionary is passed on each current and if so whether that adversely affects token consumption? That aside: My only hesitation about adding lots of interesting modes and uploading and developing a personal dictionary is the same as the one that I have with any other SaaS product, app extension or otherwise: I'm fine with creating lots of data in the cloud so long as I can periodically export it, so just in case you vanish off the face of the earth, I know that I'll have something to fall back on. Perhaps something like a simple CSV or JSON export functionality could be implemented to allow the user to periodically grab a backup copy of their personal dictionary for safekeeping.
Daniel Rosehill About 1 month ago
A feature for post-dictation text reformatting.
There are a number of voice taking apps and productivity tools on the market whose secret sauce is essentially applying a system prompt on top of a dictated text passed through Whisper. From what I've seen, this actually isn't really that hard to do. I use a custom AI frontend and have created dozens of fairly simple system prompts to do everything from converting text into to-do list format through to making it more professional, making it more concise, etc. I know that the current feature set is focused really on dictation and this might be overstepping the boundary into productivity tools, but on the other hand it may actually make sense and save people from requiring multiple components to do this very useful and everyday task. I shared a library of text transformation system prompts on Hugging Face yesterday, which of course are totally open source, and if the idea ever sounds appealing then you are free to use it. https://huggingface.co/datasets/danielrosehill/text-transformation-system-prompts
Daniel Rosehill About 1 month ago
A feature for post-dictation text reformatting.
There are a number of voice taking apps and productivity tools on the market whose secret sauce is essentially applying a system prompt on top of a dictated text passed through Whisper. From what I've seen, this actually isn't really that hard to do. I use a custom AI frontend and have created dozens of fairly simple system prompts to do everything from converting text into to-do list format through to making it more professional, making it more concise, etc. I know that the current feature set is focused really on dictation and this might be overstepping the boundary into productivity tools, but on the other hand it may actually make sense and save people from requiring multiple components to do this very useful and everyday task. I shared a library of text transformation system prompts on Hugging Face yesterday, which of course are totally open source, and if the idea ever sounds appealing then you are free to use it. https://huggingface.co/datasets/danielrosehill/text-transformation-system-prompts
Daniel Rosehill About 1 month ago
The ability to use the app in Chromium and embedded Chrome browsers.
So I am fairly sure that this would be impossible but as usual I thought I would put it out there just as an idea (I guess for those normal people not using Linux, the desktop app will make this irrelevant!): I sometimes use browsers in embedded configurations, things like Electron wrappers or in the current instance Ferdium which is a Workspace OS type tool. Although these are running Chromium under the hood as far as I know, they don't have access to any of the extensions. Hence, while the UIs are very familiar, I greatly miss my favourite productivity tool and have to resort to the awful experience of using a keyboard (how did we ever survive?) Maybe there's some work around or if not hope the Linux app comes to fruition one day!
Daniel Rosehill About 1 month ago
The ability to use the app in Chromium and embedded Chrome browsers.
So I am fairly sure that this would be impossible but as usual I thought I would put it out there just as an idea (I guess for those normal people not using Linux, the desktop app will make this irrelevant!): I sometimes use browsers in embedded configurations, things like Electron wrappers or in the current instance Ferdium which is a Workspace OS type tool. Although these are running Chromium under the hood as far as I know, they don't have access to any of the extensions. Hence, while the UIs are very familiar, I greatly miss my favourite productivity tool and have to resort to the awful experience of using a keyboard (how did we ever survive?) Maybe there's some work around or if not hope the Linux app comes to fruition one day!
Daniel Rosehill About 1 month ago
In Progress
I would like to be able to dictate punctuation such as open parentheses and close parentheses, as well as open quote and close quote.
Daniel Newman About 1 month ago
In Progress
I would like to be able to dictate punctuation such as open parentheses and close parentheses, as well as open quote and close quote.
Daniel Newman About 1 month ago
Upload content while dictatiing
The system is excellent and performs exactly as intended, with impressive text conversion and explanation capabilities. The mods are also working great, providing clear explanations. However, I would suggest adding a feature that allows users to upload content (such as code snippets or text) before interacting with specific mods, like translation or coding mods. This would enhance functionality, especially when working with code where we need to submit it first before requesting modifications. Overall, I'm very satisfied with the well-organized system, reasonable pricing, and will continue using it long-term.
Naif Essa About 2 months ago
Upload content while dictatiing
The system is excellent and performs exactly as intended, with impressive text conversion and explanation capabilities. The mods are also working great, providing clear explanations. However, I would suggest adding a feature that allows users to upload content (such as code snippets or text) before interacting with specific mods, like translation or coding mods. This would enhance functionality, especially when working with code where we need to submit it first before requesting modifications. Overall, I'm very satisfied with the well-organized system, reasonable pricing, and will continue using it long-term.
Naif Essa About 2 months ago
Rollover of unused transcription minutes to the next month
I've got a question. Do you have it set up so that when you renew your subscription next month, any unused transcription minutes from the previous month roll over and get added to the new minutes you pay for this month? In other words, do you carry over unused limits from past months to the next month or not? If not, that would be a fascinating feature to add.
Web 2 months ago
Rollover of unused transcription minutes to the next month
I've got a question. Do you have it set up so that when you renew your subscription next month, any unused transcription minutes from the previous month roll over and get added to the new minutes you pay for this month? In other words, do you carry over unused limits from past months to the next month or not? If not, that would be a fascinating feature to add.
Web 2 months ago
Re-transcribe
Sometimes transcription doesn't go through. So it would be great if I can look up for previous recordings and re-transcribe them.
Issam Alameh 2 months ago
Re-transcribe
Sometimes transcription doesn't go through. So it would be great if I can look up for previous recordings and re-transcribe them.
Issam Alameh 2 months ago
Planned
Microphone level tester
Just being a nuisance again and throwing in a few ideas! From using speech-to-text full-time for a few months now, I'm getting a good handle on what the occasional pitfalls are. One that I've noticed is if your microphone level is accidentally set too low, speech-to-text accuracy naturally declines a lot. Again, as a Linux user, I'm aware that there are peculiarities in this system that most users won't experience. But the one that I've encountered is whatever setting Zoom has that automatically changes the input levels persisting after leaving a Zoom call. Just about any time the Whisper performance seems to be lagging a bit, I check my microphone level and usually it's at something like 50%. I'm guessing that there will be a percentage of people using voice typing for the first time via this extension who will run into all these “rookie errors” and more. I don't know if a decibel meter could be integrated into the extension, but that’s one idea. Another is a warning that might display if the input volume is detected to be below a certain threshold. Something like “you’re too quite! Check your mic levels” (etc)
Daniel Rosehill 2 months ago
Planned
Microphone level tester
Just being a nuisance again and throwing in a few ideas! From using speech-to-text full-time for a few months now, I'm getting a good handle on what the occasional pitfalls are. One that I've noticed is if your microphone level is accidentally set too low, speech-to-text accuracy naturally declines a lot. Again, as a Linux user, I'm aware that there are peculiarities in this system that most users won't experience. But the one that I've encountered is whatever setting Zoom has that automatically changes the input levels persisting after leaving a Zoom call. Just about any time the Whisper performance seems to be lagging a bit, I check my microphone level and usually it's at something like 50%. I'm guessing that there will be a percentage of people using voice typing for the first time via this extension who will run into all these “rookie errors” and more. I don't know if a decibel meter could be integrated into the extension, but that’s one idea. Another is a warning that might display if the input volume is detected to be below a certain threshold. Something like “you’re too quite! Check your mic levels” (etc)
Daniel Rosehill 2 months ago
Support for Deepgram, other STT APIs
Whisper AI is one of those rare technologies that has almost exhausted my ability to generate feature requests because it pretty much just works, which is a great achievement! But in the spirit of sharing ideas and because it's something that I thought about in the early days of testing this stuff out, I wanted to put in the idea that supporting a variety of SGT Cloud APIs might be useful for users. I'm guessing that it would be more hassle than it's worth and it's not really in sync with the Whisper brand (!), but DeepGram's APIs are good, and there are a few others that are useful too. One use case that I would highlight is that I believe a couple of these more niche platforms have explicit support for speaker accents in their API architecture. Users who have very pronounced or more unusual accents might find that the recognition is enhanced with these without having to go through the trouble of fine-tuning a model. Some other providers I’ve come across: Gladia, Speechmatics. As ASR is really heating up, I imagine the list will just keep growing.
Daniel Rosehill 2 months ago
Support for Deepgram, other STT APIs
Whisper AI is one of those rare technologies that has almost exhausted my ability to generate feature requests because it pretty much just works, which is a great achievement! But in the spirit of sharing ideas and because it's something that I thought about in the early days of testing this stuff out, I wanted to put in the idea that supporting a variety of SGT Cloud APIs might be useful for users. I'm guessing that it would be more hassle than it's worth and it's not really in sync with the Whisper brand (!), but DeepGram's APIs are good, and there are a few others that are useful too. One use case that I would highlight is that I believe a couple of these more niche platforms have explicit support for speaker accents in their API architecture. Users who have very pronounced or more unusual accents might find that the recognition is enhanced with these without having to go through the trouble of fine-tuning a model. Some other providers I’ve come across: Gladia, Speechmatics. As ASR is really heating up, I imagine the list will just keep growing.
Daniel Rosehill 2 months ago
Planned
Mac up with Whisper Turbo?
I’d love to see a system-wide Mac app, especially one with a tier offering a one-time purchase for offline “ECO” mode. It could leverage the recently released Whisper Turbo model, which performs exceptionally well on Apple Silicon Macs, allowing for fast and efficient on-device transcription. I’ve been developing with Whisper Turbo, and I’m really impressed with its accuracy and efficiency!
Theo 3 months ago
Planned
Mac up with Whisper Turbo?
I’d love to see a system-wide Mac app, especially one with a tier offering a one-time purchase for offline “ECO” mode. It could leverage the recently released Whisper Turbo model, which performs exceptionally well on Apple Silicon Macs, allowing for fast and efficient on-device transcription. I’ve been developing with Whisper Turbo, and I’m really impressed with its accuracy and efficiency!
Theo 3 months ago