Simultaneous Feature Selection and Parameter Estimation for Hidden Markov Models
Adams, Stephen, Systems Engineering - School of Engineering and Applied Science, University of Virginia
Beling, Peter, Department of Systems and Information Engineering, University of Virginia
Prior knowledge about a system is crucial for accurate modeling. Bayesian parameter estimation, specifically the use of informative prior distributions, offers one method for conveying to the learning algorithm prior knowledge that may not be present in a data set. This dissertation focuses primarily on the problem of feature selection for hidden Markov models with respect to the test cost of the individual features. Test costs include the financial cost of acquiring a feature, the difficulty of collecting it, and the time required to do so. We propose a feature saliency hidden Markov model (FSHMM) that simultaneously selects features and estimates model parameters. We assume that the number of states is known and use the expectation-maximization (EM) algorithm for parameter estimation. Informative prior distributions convey the test costs to the learning algorithm. Three formulations of the FSHMM are derived: a maximum likelihood formulation using no priors, and two maximum a posteriori (MAP) formulations using informative priors. These are compared to an existing formulation that uses non-informative priors and variational Bayesian methods for parameter estimation. The proposed formulations are extended to numerous conditional feature distributions, including the gamma and Poisson distributions, and to a semi-Markov model. The FSHMM is tested on synthetic data, a tool-wear data set, an activity-recognition data set, and an event-detection data set. We find that the MAP formulation with a truncated exponential prior on the feature saliencies generally outperforms the other FSHMM formulations and conventional feature selection techniques, both in predictive performance and in the feature subsets it selects.
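A common way to realize feature saliency in an HMM is to model each feature's emission as a mixture of a state-dependent ("relevant") density and a state-independent ("irrelevant") density, weighted by a per-feature saliency. The sketch below illustrates that emission likelihood with Gaussian components for both parts; the function names, parameter shapes, and the Gaussian choice for the irrelevant component are illustrative assumptions, not the dissertation's exact formulation.

```python
import numpy as np

def fs_emission_loglik(X, mu, sigma, mu_irr, sigma_irr, rho):
    """Feature-saliency emission log-likelihood (illustrative sketch).

    X:        (T, L) observation sequence with L features.
    mu, sigma:(J, L) state-dependent Gaussian parameters for J states.
    mu_irr, sigma_irr: (L,) state-independent ("irrelevant") Gaussian params.
    rho:      (L,) feature saliencies in [0, 1].

    Per feature l and state j, the density is the mixture
        p(x_l | j) = rho_l * N(x_l; mu_jl, sigma_jl)
                     + (1 - rho_l) * N(x_l; mu_irr_l, sigma_irr_l),
    and features are assumed conditionally independent given the state.
    Returns a (T, J) array of log-likelihoods.
    """
    def norm_pdf(x, m, s):
        return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

    T, L = X.shape
    J = mu.shape[0]
    ll = np.zeros((T, J))
    for j in range(J):
        rel = norm_pdf(X, mu[j], sigma[j])    # (T, L) state-dependent part
        irr = norm_pdf(X, mu_irr, sigma_irr)  # (T, L) state-independent part
        ll[:, j] = np.log(rho * rel + (1.0 - rho) * irr).sum(axis=1)
    return ll
```

Two sanity checks follow directly from the mixture form: with all saliencies at 0 the emission no longer depends on the state, so every state scores identically; with all saliencies at 1 it reduces to an ordinary state-dependent Gaussian emission.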
PhD (Doctor of Philosophy)
informative priors, feature selection, hidden Markov models, parameter estimation
All rights reserved (no additional license for public reuse)