Abstract

A maximum a posteriori (MAP) framework for computing pitch tracks and voicing decisions is presented. The proposed algorithm builds a time-pitch energy distribution based on predictable energy, which improves on the normalized cross-correlation. A large database is used to evaluate the algorithm’s performance against two standard solutions, using glottal closure instants (GCI) obtained from electroglottogram (EGG) signals as a reference. The new MAP algorithm exhibits higher pitch accuracy and better voiced/unvoiced discrimination.