Wandering Adventure Party

screwlisp

@reillypascal I meant that the experience the article reported of racist words being wrongly transcribed sounds bad!

I am a bit confused because it sometimes sounds like whisper.cpp is producing the speech2text.

Whisper.cpp is just the program that applies a chosen model to a chosen audio file, resulting in that model's speech2text.

It is the model that does the speech2text. You can choose different models of different sizes made by different people to call with whisper.cpp.
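To make the program/model split concrete, here is a minimal sketch, not an official recipe. It assumes a locally built whisper.cpp whose command-line binary is called `whisper-cli` and takes `-m` for the model file and `-f` for the audio file. The binary location, the model file names, and `recording.wav` are placeholder assumptions; substitute whatever you actually built and downloaded.

```python
import subprocess

# Assumed location of a locally built whisper.cpp CLI (placeholder path).
BINARY = "./build/bin/whisper-cli"


def build_whisper_command(binary, model, audio):
    """Return the argument list that applies one model to one audio file.

    The program stays the same; only the model path changes when you
    want a different (bigger, smaller, differently sourced) model.
    """
    return [str(binary), "-m", str(model), "-f", str(audio)]


def transcribe(binary, model, audio):
    """Run whisper.cpp locally and return whatever it prints to stdout.

    Assumes the audio is already in a format your build accepts
    (whisper.cpp has traditionally expected 16 kHz WAV input).
    """
    result = subprocess.run(
        build_whisper_command(binary, model, audio),
        capture_output=True,
        text=True,
        check=True,  # raise if the program exits with an error
    )
    return result.stdout


# Same program, same audio, two different models (assumed file names).
for model in ("models/ggml-tiny.en.bin", "models/ggml-base.en.bin"):
    print(" ".join(build_whisper_command(BINARY, model, "recording.wav")))
```

The loop only prints the two commands; calling `transcribe(BINARY, model, "recording.wav")` would actually run one of them, entirely on your own machine.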

screwlisp

@reillypascal oh, that article is really bad. But there is more than one source of speech2text models iirc.

screwlisp

@reillypascal
Just run https://github.com/ggml-org/whisper.cpp locally for your speech2text. I did look into it: you can also compile it into your own locally running Android app (is that still allowed?).

speech2text is medically and quality-of-life important to so many people. I think it might be appropriate to try running local-only speech2text rather than losing it because all modern services have transitioned to objectionable web services; lots of vulnerable people are in that pickle.

screwlisp

@reillypascal like you point out, I would not want people who medically benefit from speech2text to have it taken away because the current deep learning models were made in an unethical way. I think taking away translation is similar. Maybe the workable principle would be to just run them locally and independently (e.g. whisper.cpp and w/e).

screwlisp

@reillypascal I have a general question; I do not know if you want to clarify it. Do you consider speech2text or language translation (probably via LLMs now) to be in the category of generative AI?

Myself, I can barely muscle through German articles, and probably not French ones, with my own brain, but I feel guilty about using LLM translation and reading as a fake polyglot.


screwlisp@gamerplus.org
