h a l f b a k e r y

Smile Detector

Using signal characteristics to detect smiles in spoken speech

(-2)

Have you ever been on the phone with somebody (or anywhere else you can't see their face) and could still *hear* the smile in their voice? I have.

My half-baked idea is that there is an actual difference in how we modulate our words (via mouth shape or some other factor influenced by wearing a BIG smile - maybe specific muscle usage?), and that with proper modelling of that smiling speech, a detector can be made.
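A rough sketch of how such a detector might start, in Python. The assumption here (not established anywhere in this thread) is that a smile spreads the lips and shortens the vocal tract, which shifts energy in the voice toward higher frequencies, so a frame whose spectral centroid sits above the speaker's neutral baseline could serve as a crude cue. The function names, the 10% margin, and the synthetic test tones are all made up for illustration; real speech would need proper formant tracking.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins 0..N//2 (slow, but fine for short frames)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        mags.append(abs(acc))
    return mags


def spectral_centroid(samples, sample_rate):
    """Magnitude-weighted mean frequency of the frame, in Hz."""
    mags = magnitude_spectrum(samples)
    total = sum(mags)
    if total == 0:
        return 0.0
    bin_hz = sample_rate / len(samples)
    return sum(k * bin_hz * m for k, m in enumerate(mags)) / total


def sounds_brighter(frame, baseline_frame, sample_rate, margin=1.1):
    """Hypothetical smile cue: centroid exceeds the baseline's by `margin`."""
    return (spectral_centroid(frame, sample_rate)
            > margin * spectral_centroid(baseline_frame, sample_rate))


def tone(freq_hz, sample_rate=8000, n=400):
    """Synthetic stand-in for a voiced frame (a pure sine, not real speech)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n)]


if __name__ == "__main__":
    neutral = tone(200)
    candidate = tone(400)
    print(sounds_brighter(candidate, neutral, 8000))
```

A single threshold on one feature would be fooled by anything that brightens the voice (a different vowel, a different phone line), so at best this is one input to a model trained on labelled smiling and non-smiling speech.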

The uses of this? Well, you've got me there. In all normal situations, there would be another human to "hear" the smile. It would need to be something where speech is monitored by a computer. Perhaps it would help with automated transcription, or maybe the countries using Echelon to monitor the communications of each other's citizens would use it to know whether people are smiling when they say something. (Not that I support anything Echelon-related.)

—	cameron, Oct 22 2001

AfroAssault's Speak in AOL idea http://www.halfbake...ea/Speak_20in_20AOL
[snarfyguy, Oct 22 2001, last modified Oct 04 2004]

Techniques for the Phonetic Description of Emotional Speech http://www.qub.ac.u...ings/pdfs/roach.pdf
System for human transcription of paralinguistic speech features. [pottedstu, Oct 22 2001, last modified Oct 21 2004]

Prosel http://www.elantts....produits/prosel.htm
Software extracts prosody (pauses, tone) from a speech sample and applies it to synthesised speech, but works at a low level. [pottedstu, Oct 22 2001, last modified Oct 21 2004]

Emotional speech synthesis http://www.sfc.keio...keiida/cocosda.html
Considers whether it's possible to add emotion to a speech synthesiser like Stephen Hawking's. [pottedstu, Oct 22 2001, last modified Oct 21 2004]

Recognition of emotions http://www.unige.ch...pdf/icphs_95(4).pdf
Compares acoustic analysis with listener identification to see how well emotions can be detected. [pottedstu, Oct 22 2001, last modified Oct 21 2004]

Expression of emotions http://www.unige.ch...pdf/icphs_95(3).pdf
More technical, but some interesting points. [pottedstu, Oct 22 2001, last modified Oct 21 2004]

Sounds like AfroAssault's Speak in AOL idea (see link)

—	snarfyguy, Oct 22 2001

I went and read AfroAssault's Speak in AOL idea. I don't think it is remotely close to my idea. I don't want to detect trendy emoticon-talk - I want to detect *real* smiles in real speech (whether or not the speaker is doing AOL-speak).

AfroAssault's idea is to augment language to describe emoticons. Mine is to detect emotion (not emoticons) in normal spoken language.

—	cameron, Oct 22 2001

Baked: Videophone

—	stupop, Oct 22 2001

This does seem pretty baked. There's a lot of work going on at the moment in analysing prosody - those features of speech such as intonation, speed and rhythm which communicate emotional states. These characteristics are generated both unconsciously by physiological changes in the speaker (excitement making you speak faster), and consciously by rules learnt and shared between speakers (e.g. putting on a sarcastic tone of voice).
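To make "intonation" concrete: one of the simplest prosodic measurements is the pitch (fundamental frequency) of a short stretch of speech. Here is a minimal sketch using plain autocorrelation, which is a textbook technique and not necessarily what any of the linked systems use. The function name, the 75-400 Hz search range, and the synthetic test tone are assumptions for illustration.

```python
import math


def estimate_pitch_hz(samples, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate fundamental frequency from the strongest autocorrelation lag.

    Only lags corresponding to fmin..fmax are searched. Returns 0.0 when no
    lag gives a positive correlation (e.g. silence).
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(samples) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0


def sine_frame(freq_hz, sample_rate=8000, n=800):
    """Synthetic stand-in for a voiced frame (a pure sine, not real speech)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n)]


if __name__ == "__main__":
    print(estimate_pitch_hz(sine_frame(200), 8000))
```

Running this over successive frames gives a pitch contour; its range and slope, together with speaking rate and pause lengths, are the kind of features a prosody-based emotion detector would start from.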

The interest isn't purely intellectual: it also matters for very low bit rate speech coding, where it's important to carry not just the phonetic content of speech (the words spoken) but also paralinguistic data about the speaker's age, sex, and emotional state.

I posted a few links: Prosel is a software package for speech generation (text-to-speech) that adds emotion to speech by analysing a similar speech sample and extracting information on tone of voice, etc. There are also a number of scientific papers analysing the factors involved in expressing emotion through speech. The last two come from the Geneva Emotion Research Group, and give some examples of computer analysis of emotion. They find that some emotions are easier to recognise than others, whether by listeners or by audio processing. However, it's still early days.

There's also a neurological/mental disorder where patients are unable to recognise the emotional content of speech, called aprosodia. This suggests that a particular area of the brain is responsible for decoding the emotional content of speech, separate from identifying the words used and extracting semantic content.

—	pottedstu, Oct 22 2001

stu, that's some pretty cool stuff. thanks!

—	cameron, Nov 10 2002

How about detecting whether a pianist was smiling when they played a particular passage?

—	bristolz, Nov 10 2002
