Publications

. Task-aware Retrieval with Instructions. Preprint, 2022.

Preprint PDF

. EditEval: An Instruction-Based Benchmark for Text Improvements. Preprint, 2022.

Preprint PDF

. PEER: A Collaborative Language Model. Preprint, 2022.

Preprint PDF

. Atlas: Few-shot Learning with Retrieval-Augmented Language Models. Preprint, 2022.

Preprint PDF Dataset Project Website

. Improving Wikipedia Verifiability with AI. Preprint, 2022.

PDF Code

. Domain-matched Pre-training Tasks for Dense Retrieval. Findings of NAACL 2022, 2022.

PDF

. Challenges in Generalization in Open Domain Question Answering. Findings of NAACL 2022, 2022.

PDF

. Boosted Dense Retriever. NAACL 2022, 2022.

PDF

. Autoregressive Search Engines: Generating Substrings as Document Identifiers. NeurIPS 2022, 2022.

Preprint PDF Code

. Reasoning over Public and Private Data in Retrieval-Based Systems. arXiv Preprint, 2022.

Preprint PDF

. The Web Is Your Oyster -- Knowledge-Intensive NLP against a Very Large Web Corpus. arXiv Preprint, 2021.

Preprint PDF

. Salient Phrase Aware Dense Retrieval: Can a Dense Retriever Imitate a Sparse One? arXiv Preprint, 2021.

Preprint PDF

. A Few More Examples May Be Worth Billions of Parameters. Findings of EMNLP 2022, 2021.

Preprint PDF

. PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them. TACL 2021, 2021.

Preprint PDF Dataset Project Website

. Pretrained Language Models for Biomedical and Clinical Tasks: Understanding and Extending the State-of-the-Art. In ClinicalNLP 2020 @ EMNLP 2020, 2020.

PDF Code Project

. Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval. ICLR 2021, 2020.

Preprint PDF

. KILT: a Benchmark for Knowledge Intensive Language Tasks. NAACL 2021, 2020.

Preprint PDF Code

. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. NeurIPS 2020, 2020.

Preprint PDF

. How Context Affects Language Models' Factual Predictions. Best Paper, AKBC 2020, 2020.

Preprint PDF

. Dense Passage Retrieval for Open-Domain Question Answering. EMNLP 2020, 2020.

Preprint PDF Code

. Unsupervised Question Decomposition for Question Answering. EMNLP 2020, 2020.

Preprint PDF Code

. MLQA: Evaluating Cross-lingual Extractive Question Answering. In ACL 2020, 2019.

PDF Code Dataset

. Language Models as Knowledge Bases? EMNLP 2019, 2019.

PDF Code Dataset Project

. Unsupervised Question Answering by Cloze Translation. In ACL 2019, 2019.

PDF Code Dataset Project

. Interpretation of Natural Language Rules in Conversational Machine Reading. In EMNLP 2018, 2018.

PDF Code Dataset Website

. Understanding and predicting disease relationships through similarity fusion. Bioinformatics 2018, 2018.

PDF Code Dataset DOI