Searching Optimal Solutions for Sequence-to-Sequence Models

Author:
Du, Wanyu, Computer Science - School of Engineering and Applied Science, University of Virginia
Ji, Yangfeng, EN-Comp Science Dept, University of Virginia

Sequence-to-sequence generation applications take source texts as input and automatically generate new texts that satisfy specific target requirements, such as paraphrasing, translating into another language, or answering a question. These applications pose two key challenges: first, how to encode source texts into informative representations that preserve rich semantic information; second, how to generate target texts that read like human-written text. In this thesis, I develop probabilistic models that use variational autoencoders to encode informative context representations from source texts, and I investigate different learning algorithms for training models that generate better target texts.

For learning context representations with variational autoencoders, I identify a limitation of applying them to sequence-to-sequence models: the standard normal prior is likely to trap the variational posterior in a local optimum, which prevents the model from learning rich context representations. I therefore propose to adapt the attention mechanism and learn empirical priors, which help the model escape the local optimum and learn better context representations.
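
To make the role of the prior concrete, the following is a minimal sketch of the KL term that a variational autoencoder objective typically adds when the prior is standard normal. It assumes a diagonal Gaussian posterior, which the abstract does not state, and the function name kl_to_standard_normal is hypothetical. One common reading of the local optimum described above is that the posterior collapses onto the prior, where this term is exactly zero and the latent code carries no information about the source text; that reading is also an assumption here.

```python
import math


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for one latent vector.

    mu and logvar are equal-length lists of floats (hypothetical inputs;
    in a real model they would come from the encoder).
    """
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar)
    )


# Posterior identical to the prior: the penalty vanishes.
collapsed = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])

# Posterior that encodes something about the input: the penalty is positive.
informative = kl_to_standard_normal([1.0, -1.0], [0.0, 0.0])
```

Because the penalty is smallest when the posterior ignores the input, an optimizer can settle there early in training. Replacing the fixed prior with a learned one changes which posteriors are penalized.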

For sequence-to-sequence learning algorithms, I present an empirical study of different algorithms (e.g., REINFORCE and DAgger) that analyzes how they can reduce the training-inference discrepancy when training sequence-to-sequence models. I apply these algorithms to a state-of-the-art model on paraphrase generation tasks and find that DAgger consistently improves performance.
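
The training-inference discrepancy arises because a model trained only on gold prefixes must condition on its own predictions at inference time. The sketch below illustrates a simplified DAgger-style data collection step under stated assumptions: the expert label at each step is taken to be the reference token at the same position, and the names dagger_rollin, predict_next, and beta are hypothetical. It is not the thesis's implementation.

```python
import random


def dagger_rollin(reference, predict_next, beta, rng):
    """Collect (prefix, expert_token) training pairs for one target sequence.

    reference:    list of gold target tokens
    predict_next: hypothetical callable mapping a prefix to the learner's next token
    beta:         probability of following the reference at each step
    rng:          a random.Random instance, passed in for reproducibility
    """
    prefix, pairs = [], []
    for gold in reference:
        # The expert label for the current state is always the gold token.
        pairs.append((list(prefix), gold))
        # Roll in: extend the prefix with the gold token or the learner's guess.
        if rng.random() < beta:
            prefix.append(gold)
        else:
            prefix.append(predict_next(prefix))
    return pairs


def toy_learner(prefix):
    # Stand-in for a trained decoder: always predicts the same token.
    return "the"


reference = ["a", "cat", "sat"]
teacher_forced = dagger_rollin(reference, toy_learner, 1.0, random.Random(0))
learner_driven = dagger_rollin(reference, toy_learner, 0.0, random.Random(0))
```

With beta set to 1.0 the prefixes are the gold prefixes, which matches ordinary teacher forcing. With beta set to 0.0 every prefix comes from the learner, so the model is trained on the states it would actually visit at inference time.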

MS (Master of Science)
sequence-to-sequence generation, variational autoencoder, imitation learning, reinforcement learning
All rights reserved (no additional license for public reuse)
Issued Date: