We study the weakly supervised question answering problem. Weakly supervised question answering aims to learn how questions should be answered directly from question-answer pairs, without gold solutions or evidence, which makes question answering models much easier to scale to many domains. However, in the weak supervision setup, a question typically involves many candidate solutions, and spurious candidates hurt the performance of question answering models. In this paper, we present an effective method for learning a question answering model under weak supervision. Specifically, to reduce the spuriousness of the candidate solutions used for training, we design several simple yet effective scoring functions to rank them. Despite its simplicity, this ranking process significantly improves the quality of the training data, leaving fewer spurious candidates. Then, unlike previous approaches that either treat all candidates equally during training or select only the candidate with the largest likelihood in each iteration, we formulate this problem as a multi-task learning problem by weighting the losses computed from the top-k candidates. Experimental results show that our method can outperform previous approaches on both semantic parsing and machine reading comprehension tasks.
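
The two steps described above, ranking candidate solutions and then weighting the losses of the top-k candidates, can be illustrated with a minimal sketch. Everything named here is hypothetical: `rank_candidates`, `weighted_topk_loss`, and the choice of a softmax over ranking scores as the loss weights are assumptions made for illustration only, since the abstract does not specify the scoring functions or the weighting scheme.

```python
import math
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def rank_candidates(candidates: Sequence[T],
                    score_fn: Callable[[T], float],
                    k: int) -> List[T]:
    """Return the k highest-scoring candidates (hypothetical helper)."""
    return sorted(candidates, key=score_fn, reverse=True)[:k]


def softmax(xs: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a non-empty sequence."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_topk_loss(log_likelihoods: Sequence[float],
                       scores: Sequence[float]) -> float:
    """Weighted sum of per-candidate negative log-likelihoods.

    ASSUMPTION: weights are a softmax over the ranking scores. The source
    only states that losses from the top-k candidates are weighted.
    """
    weights = softmax(scores)
    return sum(w * -ll for w, ll in zip(weights, log_likelihoods))
```

In this sketch, setting all scores equal recovers the "treat all candidates equally" baseline, while a very peaked score distribution approaches the "select only one candidate" baseline; the weighted formulation sits between the two.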
