"Question and Answer Test-Train Overlap in Open-Domain Question Answering Datasets" is now on arXiv! Do you use NaturalQuestions, TriviaQA, or WebQuestions? It turns out that 60% of test set answers also appear in the train set. More surprisingly, 30% of test questions have a close paraphrase in the train set. We look at what this means for models. Annotations and code are available here.
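
To make the answer-overlap figure concrete, here is a minimal sketch of how such a number could be measured. It is not the code released with the paper. The function names, the SQuAD-style normalization (lowercasing, dropping punctuation and articles), and the rule that a test question counts as overlapping when any of its reference answers appears among the train answers are all assumptions made for illustration, and the toy data is invented.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumed normalization; the paper's exact matching rule may differ.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def answer_overlap_fraction(train_answers, test_answers):
    """Fraction of test questions with at least one answer seen in train.

    Both arguments hold one entry per question; each entry is a list of
    acceptable answer strings for that question.
    """
    seen = {normalize_answer(a) for answers in train_answers for a in answers}
    if not test_answers:
        return 0.0
    hits = sum(
        any(normalize_answer(a) in seen for a in answers)
        for answers in test_answers
    )
    return hits / len(test_answers)


# Toy data, invented for illustration only.
train = [["Paris"], ["The Beatles", "Beatles"], ["1969"]]
test = [["paris"], ["Rolling Stones"], ["the beatles"], ["Neil Armstrong"]]
print(answer_overlap_fraction(train, test))
```

On the toy data, two of the four test questions have an answer that also appears in the train set. Measuring question paraphrase overlap is harder, since it depends on a judgment about whether two questions mean the same thing, so it is not sketched here.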