In this paper, we present a new approach to joint
state and parameter estimation for a target-directed, nonlinear
dynamic system model with switching states. The model, which
was recently proposed for representing speech dynamics, is also
called the hidden dynamic model (HDM). The model parameters
subject to statistical estimation consist of the target vector and
the system matrix (also called the “time-constants”), as well as
the parameters characterizing the nonlinear mapping from the
hidden state to the observation. These latter parameters are
implemented in the current work as the weights of a three-layer
feedforward multilayer perceptron (MLP) network. The new estimation
approach presented in this paper is based on the extended
Kalman filter (EKF), and its performance is compared with the
more traditional approach based on the expectation-maximization
(EM) algorithm. We present extensive simulation results for the
proposed EKF-based algorithm and the EM algorithm under
conditions typical of HDM use in speech modeling. The results
demonstrate superior convergence of the EKF-based algorithm
compared with the EM algorithm, but the EKF-based algorithm
incurs an excessive computational load when used to train the
MLP weights. In all cases, the simulated model output converges
to the given observation sequence. However, only in the case
where the MLP weights or the target vector are assumed known
do the time-constant parameters converge to their true values.
We also show that the MLP weights never converge to their true
values, thus demonstrating the many-to-one mapping property
of the feedforward MLP. We conclude from these simulation
experiments that for the system to be identifiable, restrictions on
the parameter space are needed.
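
The many-to-one property noted above can be illustrated with a small sketch. It is not the implementation used in this work, and it rests on several assumptions not stated in the abstract: a noise-free target-directed state update of the form x(k+1) = Phi x(k) + (I - Phi) T with a diagonal system matrix Phi, a tanh hidden layer, a linear output layer, and hypothetical toy parameter values. Two different MLP weight sets, related by swapping two hidden units and flipping the sign of a third, produce the same output along the whole state trajectory, so the observations alone cannot distinguish them.

```python
import math

def mlp(x, W1, b1, W2, b2):
    """Three-layer MLP: input -> tanh hidden layer -> linear output (assumed form)."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(W2, b2)]

def step(x, phi, target):
    """Noise-free target-directed update with diagonal phi (assumed form):
    x <- phi * x + (1 - phi) * target, applied per state dimension."""
    return [p * xi + (1.0 - p) * t for p, xi, t in zip(phi, x, target)]

# Hypothetical toy parameters, chosen only for illustration.
phi = [0.9, 0.8]            # diagonal of the system matrix ("time-constants")
target = [1.0, -0.5]        # target vector
W1 = [[0.5, -1.0], [1.5, 0.3], [-0.7, 0.8]]   # input-to-hidden weights
b1 = [0.1, -0.2, 0.05]                        # hidden biases
W2 = [[1.0, -2.0, 0.5]]                       # hidden-to-output weights
b2 = [0.3]                                    # output bias

# A different weight set that realizes the same function:
# hidden units 0 and 1 are swapped, and hidden unit 2 has its sign flipped
# (tanh is odd, so negating its input and its output weight cancels out).
W1_alt = [W1[1], W1[0], [-w for w in W1[2]]]
b1_alt = [b1[1], b1[0], -b1[2]]
W2_alt = [[W2[0][1], W2[0][0], -W2[0][2]]]

x = [0.0, 0.0]
max_gap = 0.0
for _ in range(200):
    x = step(x, phi, target)
    y = mlp(x, W1, b1, W2, b2)
    y_alt = mlp(x, W1_alt, b1_alt, W2_alt, b2)
    max_gap = max(max_gap, abs(y[0] - y_alt[0]))

print("final state:", x)
print("largest output gap between the two weight sets:", max_gap)
```

In this sketch the state approaches the target vector while the two weight sets remain indistinguishable at the output, up to floating-point rounding. This is one simple way to see why fitting the observation sequence does not by itself pin down the MLP weights.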