We study approaches for obtaining convex relaxations of global optimization problems containing multilinear functions. Specifically, we compare the concave and convex envelopes of these functions with the relaxations obtained by the standard approach due to McCormick, which reformulates the problem to contain only bilinear terms and then relaxes each term independently. We show that for a multilinear function consisting of a single product term, these relaxations are equivalent if the bounds on all variables are symmetric around zero. We then review and extend some results on conditions under which the concave envelope of a multilinear function can be written as a sum of the concave envelopes of its individual terms. Finally, for bilinear functions we prove that the difference between the concave overestimator and convex underestimator obtained from the McCormick relaxation approach is always within a constant of the difference between the concave and convex envelopes. These results, together with the numerical examples we provide, offer insight into how to construct strong relaxations of multilinear functions.
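For reference, the standard McCormick relaxation of a single bilinear term w = xy on a box consists of four linear inequalities; this is textbook material (McCormick, 1976), reproduced here only to make the term-by-term construction concrete:

```latex
% McCormick envelope of the bilinear term w = xy
% on the box x \in [x^L, x^U], y \in [y^L, y^U]
\begin{aligned}
w &\ge x^L y + x y^L - x^L y^L, \\
w &\ge x^U y + x y^U - x^U y^U, \\
w &\le x^U y + x y^L - x^U y^L, \\
w &\le x^L y + x y^U - x^L y^U.
\end{aligned}
```

For a single bilinear term these inequalities describe the convex and concave envelopes exactly (Al-Khayyal and Falk, 1983); gaps relative to the envelopes arise only when products of three or more variables are decomposed term by term.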
Different flavors of transfer learning have shown tremendous impact in advancing research and applications of machine learning. In this work we study a particular family of transfer learning in which the target domain is mapped to the source domain. Specifically, we map Natural Language Understanding (NLU) problems to Question Answering (QA) problems, and we show that in low-data regimes this approach offers significant improvements over other approaches to NLU. Moreover, we show that these gains can be further increased through sequential transfer learning across NLU problems from different domains. Our approach can reduce the amount of data required to reach the same performance by up to a factor of 10.
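As an illustration of this mapping (the abstract does not give the exact templates, so the question wordings and field names below are assumptions), a slot-filling NLU example can be recast as extractive QA roughly as follows:

```python
# Hypothetical sketch of the NLU -> QA mapping described above.
# The question templates are illustrative assumptions, not the
# authors' actual prompts.

def slot_filling_as_qa(utterance: str, slot: str) -> dict:
    """Recast a slot-filling example as extractive QA: the utterance
    becomes the context, and the slot is queried with a question."""
    questions = {
        "destination": "Where does the user want to go?",
        "time": "What time is mentioned?",
    }
    return {"question": questions[slot], "context": utterance}

example = slot_filling_as_qa("book a taxi to the airport at 5pm", "destination")
# A QA model fine-tuned on SQuAD-style data can then predict the answer
# span "the airport", which is read back as the slot value.
print(example)
```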
The quality of automatic speech recognition (ASR) is critical to dialogue systems, as ASR errors propagate to and directly impact downstream tasks such as language understanding (LU). In this paper, we propose multi-task neural approaches that perform contextual language correction on ASR outputs jointly with LU, improving the performance of both tasks simultaneously. To measure the effectiveness of this approach we use a public benchmark, the second Dialogue State Tracking Challenge (DSTC2) corpus. As baselines, we train task-specific statistical language models (SLMs) and fine-tune a state-of-the-art Generative Pre-Training (GPT) language model to re-rank the n-best ASR hypotheses, followed by a model that identifies the dialogue act and slots. i) We further train ranker models, using GPT and hierarchical CNN-RNN models with discriminative losses, to detect the best output among the n-best hypotheses, and extend these rankers to first select the best ASR output and then identify the dialogue act and slots in an end-to-end fashion. ii) We also propose a novel joint ASR error correction and LU model, a word confusion pointer network (WCN-Ptr) with multi-head self-attention on top, which consumes the word confusion networks populated from the n-best hypotheses. We show that the error rates of an off-the-shelf ASR system and the following LU system can be reduced significantly, by 14% relative, with joint models trained on small amounts of in-domain data.
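A minimal sketch of the LM-reranking baseline, using an off-the-shelf GPT-2 from the Hugging Face transformers library as a stand-in for the paper's fine-tuned GPT ranker (the model choice and scoring details here are assumptions):

```python
# Score each ASR hypothesis with a language model and keep the one
# with the lowest average negative log-likelihood.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def lm_score(text: str) -> float:
    """Average per-token negative log-likelihood (lower is better)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean NLL over the sequence
    return loss.item()

n_best = [
    "i would like a cheap restaurant in the north",
    "i would like a cheep restaurant in the north",
    "i would like it sheep restaurant in the north",
]
print(min(n_best, key=lm_score))  # picks the most fluent hypothesis
```

A discriminatively trained ranker like the one described in the paper would instead be trained to prefer the hypothesis with the lowest word error rate against the reference transcript, rather than the most fluent one.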
This paper proposes a novel end-to-end architecture for task-oriented dialogue systems. It is based on a simple, practical, and very effective sequence-to-sequence approach in which the language understanding and state tracking tasks are modeled jointly with a structured copy-augmented sequential decoder and a multi-label decoder for each slot; the policy engine and language generation tasks are then modeled jointly on top of these. The copy-augmented sequential decoder handles new or unknown values in the conversation, while the multi-label decoder combined with the sequential decoder ensures the explicit assignment of values to slots. On the generation side, slot binary classifiers are used to improve performance. The architecture scales to real-world scenarios and is shown through an empirical evaluation to achieve state-of-the-art performance on both the Cambridge Restaurant dataset and the Stanford in-car assistant dataset.
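The abstract does not specify the decoder internals; assuming a standard pointer-generator-style copy mechanism, a single decoding step could mix generation and copy distributions roughly as follows (all tensor names are hypothetical):

```python
# Generic copy-augmented decoding step (pointer-generator style);
# a sketch under assumptions, not the paper's exact formulation.
import torch
import torch.nn.functional as F

def copy_augmented_step(vocab_logits: torch.Tensor,   # (vocab_size,)
                        attn_scores: torch.Tensor,    # (src_len,)
                        src_token_ids: torch.Tensor,  # (src_len,) long
                        p_gen_logit: torch.Tensor,    # scalar
                        vocab_size: int) -> torch.Tensor:
    """Mix a generation distribution over the vocabulary with a copy
    distribution over source positions, so unseen slot values can be
    copied verbatim from the user utterance."""
    p_gen = torch.sigmoid(p_gen_logit)          # gate in (0, 1)
    gen_dist = F.softmax(vocab_logits, dim=-1)
    copy_attn = F.softmax(attn_scores, dim=-1)
    # Add copy probability mass at the vocabulary ids of the source tokens
    # (extended-vocabulary handling for true OOVs is omitted for brevity).
    copy_dist = torch.zeros(vocab_size).scatter_add(0, src_token_ids, copy_attn)
    return p_gen * gen_dist + (1 - p_gen) * copy_dist
```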