André Martins Ph.D. Defense

Date: Friday, May 11th, 2012
Hour: 9 a.m. at CMU // 2 p.m. at IST/UTL (videoconference room)
Place: CMU - room GHC 6501 // IST/UTL - CMU Portugal Videoconference room at V0.15

Student: André Filipe Torres Martins, dual-degree Ph.D. student in Language Technology
Advisors: Mário Figueiredo (IST/UTL), Noah Smith (CMU), Pedro Aguiar (IST/UTL), Eric Xing (CMU)
Schools: Instituto Superior Técnico of the Universidade Técnica de Lisboa (IST/UTL) and Carnegie Mellon University (CMU)  

Title: Advances in Structured Prediction for Natural Language Processing

Abstract:
This thesis proposes new models and algorithms for structured output prediction, with an emphasis on natural language processing applications. We advance on two fronts: the inference problem, whose aim is to make a prediction given a model, and the learning problem, where the model is trained from data.

For inference, we make a paradigm shift by considering rich models with global features and constraints, representable as constrained graphical models. We introduce a new approximate decoder that ignores global effects caused by the cycles of the graph. This methodology is then applied to syntactic analysis of text, yielding a new framework which we call “turbo parsing,” with state-of-the-art results.

For learning, we consider a family of loss functions encompassing conditional random fields, support vector machines and the structured perceptron, for which we provide new online algorithms that dispense with learning rate hyperparameters. We then focus on the regularizer, which we use for promoting structured sparsity and for learning structured predictors with multiple kernels. We introduce online proximal-gradient algorithms that can explore large feature spaces efficiently, with minimal memory consumption. The result is a new framework for feature template selection yielding compact and accurate models.

More information at CMU: http://www.cs.cmu.edu/~afm/Home_files/thesis.pdf