Examining the Performance of Different Contextual Representations in a Canonical Language Model

Jacques, Brandon, Psychology - Graduate School of Arts and Sciences, University of Virginia
Sederberg, Per, AS-Psychology, University of Virginia

Driven by advances in computer engineering, model architectures, and
training methods, the field of Natural Language Processing (NLP) has
reached new heights of performance. However, these models still rely on
short-term buffers to represent the context in which a word is experienced,
and so have lagged behind recent developments in the understanding of
human memory. Such finite representations of context ignore the fact that
words can be predictive of each other across temporal scales of up to
hundreds of words (H. W. Lin & Tegmark, 2017). In this paper, we leverage recent
developments in the understanding of memory to augment the performance
of a canonical NLP model with a compressed representation of context that
contains many time-scales of information. We show that the Timing from
Inverse Laplace Transform (TILT) representation, a neurally plausible
way of compressing history utilizing leaky integrators, can function as a
drop-in replacement for a buffer representation in a canonical language
model to increase performance without adding computational complexity
or increasing the size of the overall model.
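
As a rough illustration of the leaky-integrator idea mentioned above, the following minimal sketch keeps a small bank of units that each decay at a different rate, so a fixed-size state retains traces of the input at several time-scales. It is not the thesis's implementation: the function names, the geometric spacing of time constants, and the parameter values are assumptions chosen for illustration, and the inverse Laplace transform step that gives TILT its name is not shown.

```python
import math

def make_decay_rates(n_units, tau_min, tau_max):
    """Return decay rates s = 1/tau for geometrically spaced time constants.

    The geometric spacing is an illustrative assumption, not taken from the thesis.
    """
    ratio = (tau_max / tau_min) ** (1.0 / (n_units - 1))
    return [1.0 / (tau_min * ratio ** i) for i in range(n_units)]

def step(state, rates, x, dt=1.0):
    """Advance every leaky integrator one time step: decay, then add the input."""
    return [f * math.exp(-s * dt) + x for f, s in zip(state, rates)]

# Four units with time constants of roughly 1, 10, 100, and 1000 steps.
rates = make_decay_rates(n_units=4, tau_min=1.0, tau_max=1000.0)
state = [0.0] * len(rates)

# Present a single impulse, then 50 steps with no input.
state = step(state, rates, 1.0)
for _ in range(50):
    state = step(state, rates, 0.0)

# Fast units have forgotten the impulse; slow units still carry a trace of it.
print(state)
```

The state has one value per unit regardless of how long the input sequence is, which is the sense in which such a representation can cover long histories without growing in size.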

MA (Master of Arts)
Artificial Neural Networks, Statistical Language Modeling, Compressed Memory, Neurally Plausible Representation
Issued Date: