halfbakery
Those hard of hearing often rely on being able to read a speaker's lips. Often the speaker is not aware of the listener's handicap and moves too far away, turns away from the listener, covers their mouth while wiping their face, speaks very softly with little lip movement, etc. I would like to see
a cell-phone-like device (possibly an iPhone app) that would convert speech to lip movements on an animated character's face on a cell-phone-like screen. The character's lips would parrot the lip movements of the speaker in detail. The device could also act as a hearing aid and feed sound to the listener's earphone. The device/app could also translate frequencies the listener can't hear into frequencies the listener can hear. See the annotation to jutta's frequency-translator idea on the attached link.
Translate audio into vibrations [Sunstone, Jul 08 2010]
Pretty good. [wagster, Jul 09 2010]
[hippo, Jul 09 2010]
Comedy lip movement device
Not exactly what I'm thinking of but the idea is in the vicinity [Sunstone, Jun 28 2011]
Lip reading technology could help people solve crimes by deciphering what people are saying on security cameras [Sunstone, Mar 25 2016]
[Toto Anders, Mar 26 2016]
||I wonder if professionally rendered CG faces (e.g. those in the movie "Avatar") can be lip-read...
||Interestingly, I've just started using lipsync software (Lipsync MX - link) after my previous animation project went massively overtime due to the huge amount of lip-sync involved, which is one of the most time-consuming parts of character animation.
||Lipsync software analyses speech for phonemes, the various sounds our mouths can make, and converts them to visemes, the shapes our mouths make when we produce those sounds. There's pretty much one shape for each vowel, one shape covering most of the consonants, and a few extras, such as "L" and a closed mouth for non-speaking moments.
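The phoneme-to-viseme conversion described above can be sketched as a simple lookup plus a pass that merges consecutive identical shapes. The phoneme symbols (ARPAbet-style) and the grouping into viseme names below are illustrative assumptions, not the actual tables used by Lipsync MX or any other product.

```python
# Hypothetical phoneme -> viseme table: roughly one shape per vowel,
# most consonants collapsed into a few shapes, plus "extras" like L.
PHONEME_TO_VISEME = {
    "AA": "open", "AE": "open", "EH": "open",          # open-mouth vowels
    "IY": "smile",                                     # spread lips
    "UW": "round", "OW": "round",                      # rounded lips
    "P": "closed", "B": "closed", "M": "closed",       # lips pressed together
    "F": "lip-teeth", "V": "lip-teeth",                # lower lip to upper teeth
    "L": "tongue-up",                                  # the "L" extra shape
    "S": "teeth", "T": "teeth", "D": "teeth",
    "K": "teeth", "G": "teeth",
}

def phonemes_to_visemes(phonemes):
    """Map a phoneme sequence to viseme frames, merging repeated shapes."""
    frames = []
    for p in phonemes:
        # unknown/silent phonemes fall back to a resting mouth
        v = PHONEME_TO_VISEME.get(p, "rest")
        if not frames or frames[-1] != v:
            frames.append(v)
    return frames
```

For example, an ARPAbet-style "hello" (`HH EH L OW`) yields a rest shape (HH is unmapped here), then open, tongue-up, and round, while `P B M` collapses into a single closed-lips frame.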
||It's far from perfect, and the finished result always needs editing before use. Analysis on a decent computer is fairly fast and could easily run in real time, but I'm not sure an iPhone will be powerful enough. Another problem is speech quality: I'm recording with a decent mic in a quiet room, and call quality will hamper the process. Also, there will always be a slight delay, as the technology will only ever be able to display the last phoneme spoken, not the one currently being spoken.
||So - it's possible, but we might have to wait until phones get better processors and the analysis algorithms get better (but with Apple or Google money...). The processing delay can be largely fixed by delaying the audio to match the display.
||Will there be subtitles ?
||Subtle t's would be too difficult to animate.
||This has been done in Flash (RoboText or something similar) with a fairly low processor load. I doubt you could lip-read it, though. But I'm sure it's possible with current mobile CPUs; it should certainly be less intensive than the 30fps video encoding that phones already do in real time.
||This I like! And would use... That's it then! Bun [+]
||There are several heroic efforts of the pre-animation age to create similar effects with talking penises <link>. To emulate the effect there would have to be a choice of "faces" in this app's user interface.