Open Source Speech Analytics

hallsofvallhalla
Site Admin
Posts: 12023
Joined: Wed Apr 22, 2009 11:29 pm

Open Source Speech Analytics

Post by hallsofvallhalla »

What I am really looking for is open source software that will detect spoken words in things like movies, videos, etc.
Jackolantern
Posts: 10891
Joined: Wed Jul 01, 2009 11:00 pm

Re: Open Source Speech Analytics

Post by Jackolantern »

I know you have probably seen this already (it was one of the top results on Google), but there is CMU Sphinx.

However, voice analytics in movies and other media may be a problem, since background noise can badly confuse the engine. One of the best implementations I have ever seen is YouTube's closed captioning, which will try to generate a script from the video if one isn't uploaded. That has Google behind it, and it is probably only right about 70% of the time (nowhere near acceptable for any serious voice application).
The indelible lord of tl;dr
hallsofvallhalla
Site Admin
Posts: 12023
Joined: Wed Apr 22, 2009 11:29 pm

Re: Open Source Speech Analytics

Post by hallsofvallhalla »

What I want to do is create a bad words filter for YouTube. As far as I know, one does not exist.

Basically, it would be a program that buffers about 3 seconds of the video so it can listen to the audio before anything plays. It listens for bad words, bleeps or mutes them, and then plays the video and sound.
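To make the buffering idea concrete, here is a rough sketch of the look-ahead delay in Python. Everything in it is an assumption for illustration: the chunk size, the `play_with_delay` name, and especially the `detect` callback, which stands in for whatever speech engine ends up doing the listening (it is not the API of CMU Sphinx or any other real library). Audio goes into a queue, the detector hears each chunk as it arrives, and a chunk only comes out the other end once about 3 seconds of newer audio are queued behind it.

```python
from collections import deque

BUFFER_SECONDS = 3.0   # look-ahead delay from the post above
CHUNK_SECONDS = 0.5    # assumed chunk length, chosen for illustration


def overlaps(a_start, a_end, b_start, b_end):
    """True if the two time ranges share any time."""
    return a_start < b_end and b_start < a_end


def play_with_delay(chunks, detect):
    """Yield (start, samples) after a look-ahead delay, silencing flagged audio.

    chunks: iterable of (start_time, samples), CHUNK_SECONDS of audio each.
    detect: hypothetical callback that returns a list of (start, end) spans
            of bad words it recognized after hearing the given chunk.
    """
    buffer = deque()
    flagged = []  # every bad-word span reported so far
    max_chunks = int(BUFFER_SECONDS / CHUNK_SECONDS)

    def release(chunk):
        start, samples = chunk
        end = start + CHUNK_SECONDS
        if any(overlaps(start, end, s, e) for s, e in flagged):
            samples = [0] * len(samples)  # mute by zeroing the samples
        return start, samples

    for chunk in chunks:
        flagged.extend(detect(chunk))  # listen before anything is played
        buffer.append(chunk)
        if len(buffer) > max_chunks:
            yield release(buffer.popleft())
    while buffer:  # drain what is left once the stream ends
        yield release(buffer.popleft())


# Fake 4-second stream and a fake detector that reports one bad word
# (1.2s to 1.6s) only after it has heard the chunk starting at 1.5s.
chunks = [(i * CHUNK_SECONDS, [1, 1]) for i in range(8)]


def fake_detect(chunk):
    return [(1.2, 1.6)] if chunk[0] == 1.5 else []


played = list(play_with_delay(chunks, fake_detect))
muted = [start for start, samples in played if not any(samples)]
print(muted)
```

The delay is what makes muting possible at all: the detector reports the word late, but the chunks that contain it are still sitting in the queue, so they can be silenced before they are played.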
Xaos
Posts: 940
Joined: Wed Jan 11, 2012 4:01 am

Re: Open Source Speech Analytics

Post by Xaos »

Not sure how it would all come together, but you could scan the closed captioning for bad words, find where each one falls in the video, and then mute or play a sound at that time. I'm not sure that would be any easier than normal voice recognition, though.
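Something like this is what I mean, as a minimal Python sketch. It assumes the captions are already available as timed cues (start, end, text), which may or may not be how the player exposes them, and the `Cue` type, `find_mute_spans` name, and placeholder word list are all made up for the example.

```python
import re
from typing import NamedTuple


class Cue(NamedTuple):
    start: float  # seconds into the video
    end: float
    text: str


# Placeholder word list, purely for illustration.
BLOCKLIST = {"darn", "heck"}


def find_mute_spans(cues, blocklist=BLOCKLIST):
    """Return the (start, end) span of every cue that contains a blocked word."""
    spans = []
    for cue in cues:
        # Lowercase and split into words so punctuation does not hide a match.
        words = re.findall(r"[a-z']+", cue.text.lower())
        if any(word in blocklist for word in words):
            spans.append((cue.start, cue.end))
    return spans


cues = [
    Cue(0.0, 2.5, "Well, hello there."),
    Cue(2.5, 5.0, "Darn, I dropped it."),
    Cue(5.0, 8.0, "What the heck was that?"),
]
print(find_mute_spans(cues))
```

The catch is that a cue only tells you when the whole line is on screen, not when the word itself is spoken, so this would mute the full line rather than just the word.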
Jackolantern
Posts: 10891
Joined: Wed Jul 01, 2009 11:00 pm

Re: Open Source Speech Analytics

Post by Jackolantern »

That is true. I doubt you could find anything open source that could do a better job than what YouTube already offers.

However, since as far as I know you would have to do this external to YT, I wonder if the captioning data would be sent along with the video or if it is only available in the native player.

EDIT: I guess it wouldn't have to be external if this could somehow be made into a Chrome plugin.