More Publications

(2021). PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them. arXiv preprint.


(2020). Pretrained Language Models for Biomedical and Clinical Tasks: Understanding and Extending the State-of-the-Art. In ClinicalNLP 2020 @ EMNLP 2020.


(2020). Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval. arXiv preprint.


Recent Posts


Earlier this year I led a collaboration between Cray Supercomputers, Digital Catapult and Bloomsbury AI (my previous employer). This …

I just got back from EMNLP in Brussels. We were presenting our dataset paper ShARC (a blog post about ShARC will be coming soon). The …


Here are some projects I’m involved with:


LAMA is a probe for analyzing factual and commonsense knowledge in language models.


Code, data and models to run Unsupervised Question Answering data generation on your own documents.


Cape is a software solution that makes it SUPER easy to integrate Machine Reading into other software.