Pointing Out SQL Queries From Text

  • Chenglong Wang
  • Marc Brockschmidt
  • Rishabh Singh

MSR-TR-2017-45

The digitization of data has made datasets available to millions of users in the form of relational databases and spreadsheet tables. However, a majority of these users come from diverse backgrounds and lack the programming expertise to query and analyze such tables. We present a system that allows for querying data tables using natural language questions, where the system translates the question into an executable SQL query. We use a deep sequence-to-sequence model in which the decoder uses a simple type system of SQL expressions to structure the output prediction. Based on the type, the decoder either copies an output token from the input question using an attention-based copying mechanism or generates it from a fixed vocabulary. We also introduce a value-based loss function that transforms a distribution over locations to copy from into a distribution over the set of input tokens to improve training of our model. We evaluate our model on the recently released WikiSQL dataset and show that our model, trained using only supervised learning, significantly outperforms the current state-of-the-art Seq2SQL model that uses reinforcement learning.
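The value-based loss mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the distribution over copy locations is turned into a distribution over input tokens by summing the probabilities of all positions that hold the same token; the exact formulation in the paper may differ. The function names, the example question, and the attention scores below are hypothetical and are not taken from the PointSQL code.

```python
import math
from collections import defaultdict


def softmax(scores):
    """Turn raw attention scores into a distribution over input positions."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def value_distribution(tokens, location_probs):
    """Assumed aggregation: sum the probabilities of positions sharing a token."""
    dist = defaultdict(float)
    for token, prob in zip(tokens, location_probs):
        dist[token] += prob
    return dict(dist)


def location_based_loss(attention_scores, target_index):
    """Negative log-likelihood of copying from one specific position."""
    return -math.log(softmax(attention_scores)[target_index])


def value_based_loss(tokens, attention_scores, target_token):
    """Negative log-likelihood of copying the target token from any position."""
    dist = value_distribution(tokens, softmax(attention_scores))
    if target_token not in dist:
        raise ValueError("target token does not occur in the input")
    return -math.log(dist[target_token])


# Hypothetical question in which the token "city" occurs at positions 1 and 6.
tokens = ["which", "city", "is", "larger", "than", "the", "city", "of", "york"]
scores = [0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0]

probs = softmax(scores)
dist = value_distribution(tokens, probs)
loc_loss = location_based_loss(scores, target_index=1)
val_loss = value_based_loss(tokens, scores, target_token="city")
print(f"location-based loss: {loc_loss:.4f}")
print(f"value-based loss:    {val_loss:.4f}")
```

In this sketch the location-based loss penalizes the attention mass placed on the second occurrence of "city", even though copying from either position yields the same output token. The value-based variant credits both positions, so its loss is never larger than the location-based one for the same target.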

Publication Downloads

Pseudo-Task MAML

August 7, 2018

This is PointSQL, the source code for the papers "Natural Language to Structured Query Generation via Meta-Learning" and "Pointing Out SQL Queries From Text" from Microsoft Research. We present the setup for the WikiSQL experiments.