Automatically Assessing Personality from Speech
- Florian Metze | Carnegie Mellon University
In this talk, we present results on applying a personality assessment paradigm to speech input, and compare human and automatic performance on this task. We cue a professional speaker to produce speech using different personality profiles and encode the resulting vocal personality impressions in terms of the Big Five NEO-FFI personality traits. We then have human raters, who do not know the speaker, estimate the five factors. We analyze the recordings using signal-based acoustic and prosodic methods and observe high consistency between the acted personalities, the raters’ assessments, and initial automatic classification results. We further validate the application of our paradigm to speech input and extend it towards text-independent speech. We show that human labelers can consistently label speech data generated across multiple recording sessions with respect to personality. We also investigate which of the five scales in the NEO-FFI scheme can be assessed from speech, and how manipulating one scale influences the perception of another. Finally, we present a top-down clustering of human labels of personality traits derived from speech, which will be useful in future experiments on automatic classification of personality traits. This is a first step towards handling personality traits in speech, a capability we envision being used in future voice-based communication between humans and machines.
Speaker Details
Florian Metze received his PhD from Universität Karlsruhe (TH) in 2005, for his work on speech recognition using articulatory features. Before joining Carnegie Mellon University as an Assistant Professor in 2009, he spent 3 years at Deutsche Telekom Laboratories, where he worked on the introduction of speech-to-text technology for business intelligence and improving user experience, expert search in corporate blogs, multi-modal user interfaces, and speech meta-data extraction. He is also a Project Management Professional (PMI, 2008). He is currently working on rapidly building speech recognizers for low-resource languages and on audio analysis of consumer-grade video material.
Jeff Running
Series: Microsoft Research Talks