Every once in a while I hope to write a post about what I’m reading / researching / distracted by. Blog upload frequency highly unpredictable!
Earlier this year I led a collaboration between Cray Supercomputers, Digital Catapult and Bloomsbury AI (my previous employer). This post is an informal report of how we used Cray’s compute resources to both boost the accuracy of machine reading models and speed up their training. With parallel training, we were able to break accuracy records on the TriviaQA Wiki task, without any change in model architecture.
If you’re wondering how to scale up and parallelize your network training, there are excellent tools like Horovod that make it easy with almost no code changes required.
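To give a rough feel for what these tools do under the hood, here is a minimal sketch of synchronous data-parallel training in plain Python. It is not Horovod's API and not the setup we ran on Cray: the workers are simulated in a single process, and the names `gradient`, `allreduce_mean` and `train` are made up for illustration. Each worker computes a gradient on its own shard of the data, the gradients are averaged (the "allreduce" step), and every worker applies the same update.

```python
# Sketch of synchronous data-parallel SGD. Workers are simulated in one
# process; a real tool would run them on separate devices or machines.

def gradient(w, shard):
    """Mean-squared-error gradient of the model y = w * x over one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def allreduce_mean(values):
    """Stand-in for a cross-worker allreduce: average one value per worker."""
    return sum(values) / len(values)

def train(data, num_workers, lr=0.01, steps=200):
    # Split the data into equal-sized shards, one per hypothetical worker.
    shards = [data[i::num_workers] for i in range(num_workers)]
    w = 0.0
    for _ in range(steps):
        # Each worker computes a gradient on its own shard only.
        local_grads = [gradient(w, shard) for shard in shards]
        # All workers then apply the same averaged gradient.
        w -= lr * allreduce_mean(local_grads)
    return w

# Toy data generated with a true weight of 3.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w_single = train(data, num_workers=1)
w_parallel = train(data, num_workers=4)
```

With equal-sized shards, the averaged gradient matches the full-batch gradient, so `w_single` and `w_parallel` end up at essentially the same value. The benefit in practice is that each worker only has to process its own shard per step.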
I just got back from EMNLP in Brussels, where we presented our dataset paper ShARC (a blog post about ShARC will be coming soon). The scale and breadth of the conference were really something, with so many smart people doing amazing things. It was also great to meet, network and talk research with all kinds of academics in NLP. We’ve got some exciting projects planned already, and I’m really just starting out.
Welcome to my first real blog post. Read more about what it’s all for here. As a reminder, this is mainly a tool for me to organise my time and thoughts. These posts are not going to be infallible pieces of academic writing (they’re not papers and shouldn’t be judged as such!), but friendly, constructive feedback is welcome! I also expect to amend these pieces from time to time.