Brian DuSell
デュセル・ブライアン
I am currently on the academic job market!
I am a computer science researcher who specializes in natural language processing, artificial intelligence, and formal languages. I am interested in the structure of human and computer languages, and what the difficulty of processing this structure implies about the limitations of modern AI technology, including large language models (LLMs). I am particularly interested in approaches to surmounting these limitations that take into account syntactic structure and connections to formal language theory.
I am currently a postdoc at ETH Zürich in Ryan Cotterell's lab, where I have been working on relating neural network architectures to formal language classes and on tokenization algorithms, among other things. Before coming to ETH, I completed my PhD at the University of Notre Dame, where I was advised by David Chiang. My dissertation proposed new neural network architectures capable of handling ambiguous syntactic structure, a common phenomenon in human language. I taught the Theory of Computing course at Notre Dame in the spring semester of 2022. I also earned my Bachelor's degree in Computer Science at Notre Dame, and I worked in industry for a few years before returning for grad school.
Publications
From Language Models over Tokens to Language Models over Characters
Tim Vieira, Ben LeBrun, Mario Giulianelli, Juan Luis Gastaldi, Brian DuSell, John Terilla, Timothy J. O'Donnell, and Ryan Cotterell
arXiv preprint
code arXiv
@misc{vieira-etal-2024-language,
  title = "From Language Models over Tokens to Language Models over Characters",
  author = "Vieira, Tim and LeBrun, Ben and Giulianelli, Mario and Gastaldi, Juan Luis and DuSell, Brian and Terilla, John and O'Donnell, Timothy J. and Cotterell, Ryan",
  year = "2024",
  month = dec,
  url = "https://arxiv.org/abs/2412.03719",
  doi = "10.48550/arXiv.2412.03719",
  note = "{arXiv}:2412.03719"
}
Training Neural Networks as Recognizers of Formal Languages
Alexandra Butoi, Ghazal Khalighinejad, Anej Svete, Josef Valvoda, Ryan Cotterell, and Brian DuSell
arXiv preprint
code arXiv
@misc{butoi-etal-2024-training,
  title = "Training Neural Networks as Recognizers of Formal Languages",
  author = "Butoi, Alexandra and Khalighinejad, Ghazal and Svete, Anej and Valvoda, Josef and Cotterell, Ryan and DuSell, Brian",
  year = "2024",
  month = nov,
  url = "https://arxiv.org/abs/2411.07107",
  doi = "10.48550/arXiv.2411.07107",
  note = "{arXiv}:2411.07107"
}
On the Proper Treatment of Tokenization in Psycholinguistics
Mario Giulianelli, Luca Malagutti, Juan Luis Gastaldi, Brian DuSell, Tim Vieira, and Ryan Cotterell
EMNLP 2024
code arXiv ACL
@inproceedings{giulianelli-etal-2024-proper,
  title = "On the Proper Treatment of Tokenization in Psycholinguistics",
  author = "Giulianelli, Mario and Malagutti, Luca and Gastaldi, Juan Luis and DuSell, Brian and Vieira, Tim and Cotterell, Ryan",
  booktitle = "Proc. EMNLP",
  pages = "18556--18572",
  year = "2024",
  month = nov,
  address = "Miami, Florida",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/2024.emnlp-main.1032/"
}
The Foundations of Tokenization: Statistical and Computational Concerns
Juan Luis Gastaldi, John Terilla, Luca Malagutti, Brian DuSell, Tim Vieira, and Ryan Cotterell
arXiv preprint
arXiv
@misc{gastaldi-etal-2024-foundations,
  title = "The Foundations of Tokenization: Statistical and Computational Concerns",
  author = "Gastaldi, Juan Luis and Terilla, John and Malagutti, Luca and DuSell, Brian and Vieira, Tim and Cotterell, Ryan",
  year = "2024",
  month = jul,
  url = "https://arxiv.org/abs/2407.11606",
  doi = "10.48550/arXiv.2407.11606",
  note = "{arXiv}:2407.11606"
}
PILA: A Historical-Linguistic Dataset of Proto-Italic and Latin
Stephen Bothwell, Brian DuSell, David Chiang, and Brian Krostenko
LREC-COLING 2024
code arXiv ACL
@inproceedings{bothwell-etal-2024-pila,
  title = "{PILA}: A Historical-Linguistic Dataset of {P}roto-{I}talic and {L}atin",
  author = "Bothwell, Stephen and DuSell, Brian and Chiang, David and Krostenko, Brian",
  booktitle = "Proc. LREC-COLING",
  pages = "12749--12760",
  year = "2024",
  month = may,
  address = "Turin, Italy",
  publisher = "ELRA and ICCL",
  url = "https://aclanthology.org/2024.lrec-main.1116/"
}
Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns
Brian DuSell and David Chiang
ICLR 2024 🏆 Spotlight paper!
video slides poster code arXiv OpenReview
@inproceedings{dusell-chiang-2024-stack,
  title = "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns",
  author = "DuSell, Brian and Chiang, David",
  booktitle = "Proc. ICLR",
  year = "2024",
  month = may,
  address = "Vienna, Austria",
  url = "https://openreview.net/forum?id=XVhm3X8Fum"
}
Nondeterministic Stacks in Neural Networks
Brian DuSell
PhD Dissertation (2023)
arXiv CurateND
@phdthesis{dusell-2023-nondeterministic,
  title = "Nondeterministic Stacks in Neural Networks",
  author = "DuSell, Brian",
  school = "University of Notre Dame",
  year = "2023",
  month = apr,
  url = "https://curate.nd.edu/show/jh343r10k4d",
  doi = "10.7274/jh343r10k4d"
}
The Surprising Computational Power of Nondeterministic Stack RNNs
Brian DuSell and David Chiang
ICLR 2023
code arXiv OpenReview
@inproceedings{dusell-chiang-2023-surprising,
  title = "The Surprising Computational Power of Nondeterministic Stack {RNN}s",
  author = "DuSell, Brian and Chiang, David",
  booktitle = "Proc. ICLR",
  year = "2023",
  month = may,
  address = "Kigali, Rwanda",
  url = "https://openreview.net/forum?id=o58JtGDs6y"
}
Algorithms for Weighted Pushdown Automata
Alexandra Butoi, Brian DuSell, Tim Vieira, Ryan Cotterell, and David Chiang
EMNLP 2022
code arXiv ACL
@inproceedings{butoi-etal-2022-algorithms,
  title = "Algorithms for Weighted Pushdown Automata",
  author = "Butoi, Alexandra and DuSell, Brian and Vieira, Tim and Cotterell, Ryan and Chiang, David",
  booktitle = "Proc. EMNLP",
  pages = "9669--9680",
  year = "2022",
  month = dec,
  address = "Abu Dhabi, UAE",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/2022.emnlp-main.656/",
  doi = "10.18653/v1/2022.emnlp-main.656"
}
Learning Hierarchical Structures with Differentiable Nondeterministic Stacks
Brian DuSell and David Chiang
ICLR 2022 🏆 Spotlight paper!
code arXiv OpenReview
@inproceedings{dusell-chiang-2022-learning,
  title = "Learning Hierarchical Structures with Differentiable Nondeterministic Stacks",
  author = "DuSell, Brian and Chiang, David",
  booktitle = "Proc. ICLR",
  year = "2022",
  month = apr,
  address = "Online",
  url = "https://openreview.net/forum?id=5LXw_QplBiF"
}
Learning Context-Free Languages with Nondeterministic Stack RNNs
Brian DuSell and David Chiang
CoNLL 2020
code arXiv ACL
@inproceedings{dusell-chiang-2020-learning,
  title = "Learning Context-Free Languages with Nondeterministic Stack {RNN}s",
  author = "DuSell, Brian and Chiang, David",
  booktitle = "Proc. CoNLL",
  pages = "507--519",
  year = "2020",
  month = nov,
  address = "Online",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/2020.conll-1.41/",
  doi = "10.18653/v1/2020.conll-1.41"
}
Efficiency through Auto-Sizing: Notre Dame NLP's Submission to the WNGT 2019 Efficiency Task
Kenton Murray, Brian DuSell, and David Chiang
Workshop on Neural Generation and Translation 2019
arXiv ACL
@inproceedings{murray-etal-2019-efficiency,
  title = "Efficiency through Auto-Sizing: {N}otre {D}ame {NLP}{'}s Submission to the {WNGT} 2019 Efficiency Task",
  author = "Murray, Kenton and DuSell, Brian and Chiang, David",
  booktitle = "Proc. Workshop on Neural Generation and Translation",
  pages = "297--301",
  year = "2019",
  month = nov,
  address = "Hong Kong",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/D19-5634/",
  doi = "10.18653/v1/D19-5634"
}
News
- Our paper "Training Neural Networks as Recognizers of Formal Languages" (Alexandra Butoi, Ghazal Khalighinejad, Anej Svete, Josef Valvoda, Ryan Cotterell, and Brian DuSell) is available as a preprint on arXiv.
- Our paper "On the Proper Treatment of Tokenization in Psycholinguistics" (Mario Giulianelli, Luca Malagutti, Juan Luis Gastaldi, Brian DuSell, Tim Vieira, and Ryan Cotterell) was presented at EMNLP 2024.
- Our paper "The Foundations of Tokenization: Statistical and Computational Concerns" (Juan Luis Gastaldi, John Terilla, Luca Malagutti, Brian DuSell, Tim Vieira, and Ryan Cotterell) is available as a preprint on arXiv.
- I attended LREC-COLING 2024 in Turin.
- I attended ICLR 2024 in Vienna.
- I gave a talk on Stack Attention to FLaNN. The recording is available here, and the slides are available here.
- I gave a talk on Stack Attention at the ZurichNLP Meetup.
- Our paper "PILA: A Historical-Linguistic Dataset of Proto-Italic and Latin" (Stephen Bothwell, Brian DuSell, David Chiang, and Brian Krostenko) has been accepted at LREC-COLING 2024.
- My paper with David Chiang, "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns," has been accepted as a spotlight paper at ICLR 2024. See you in Vienna!
- I've started as a postdoc at ETH Zürich in Rycolab!
Experience
ETH Zürich
2023-present: Postdoc
Supervisor: Ryan Cotterell

University of Notre Dame
2016-2023: M.S. and Ph.D., Computer Science
Dissertation: Nondeterministic Stacks in Neural Networks
Advisor: David Chiang

Amazon Web Services
Jun-Sep 2020 and 2021: Applied Scientist Intern
Team: Amazon Translate

University of Notre Dame
2009-2013: B.S. in Computer Science, magna cum laude
Software
Neural Network Recognizers
PyTorch code for training RNNs, LSTMs, and transformers as recognizers of formal languages. Supports multi-task learning and implements efficient generation of both positive and negative examples.
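To illustrate the basic recipe (a minimal sketch in plain PyTorch, not this repository's actual API; the model and data helper below are hypothetical), a recognizer is just a binary classifier over strings, here trained on the language of balanced brackets:

import random

import torch
import torch.nn as nn

VOCAB = {'(': 0, ')': 1}

def sample_string(max_len=20):
    # Sample a random bracket string; label it 1 if it is balanced.
    s = [random.choice('()') for _ in range(random.randint(1, max_len))]
    depth, ok = 0, True
    for c in s:
        depth += 1 if c == '(' else -1
        ok = ok and depth >= 0
    return s, float(ok and depth == 0)

class LSTMRecognizer(nn.Module):
    def __init__(self, vocab_size=2, hidden_size=32):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self.lstm = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.output = nn.Linear(hidden_size, 1)

    def forward(self, tokens):
        # tokens: (batch, time) tensor of symbol ids.
        hidden_states, _ = self.lstm(self.embedding(tokens))
        # Predict accept/reject from the final hidden state.
        return self.output(hidden_states[:, -1, :]).squeeze(-1)

model = LSTMRecognizer()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
for step in range(1000):
    s, label = sample_string()
    x = torch.tensor([[VOCAB[c] for c in s]])
    y = torch.tensor([label])
    loss = loss_fn(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()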
Rau
Language modeling and sequence-to-sequence pipeline for PyTorch.
Stack Attention
PyTorch implementation of stack attention, including a full machine translation pipeline.
Nondeterministic Stack RNN
PyTorch implementation of our Nondeterministic Stack RNN model, as well as other Stack RNN models.
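For intuition about what a differentiable stack is, here is a minimal sketch of the classic superposition-style stack update (one of the simpler stack RNN variants, not the nondeterministic construction this repository implements), in plain PyTorch:

import torch

def superposition_stack_step(stack, push_vec, action_probs):
    # stack: (depth, dim) tensor; push_vec: (dim,) vector to push;
    # action_probs: probabilities of (push, pop, no-op) summing to 1.
    p_push, p_pop, p_noop = action_probs
    # Outcome if we push: everything shifts down one slot.
    pushed = torch.cat([push_vec.unsqueeze(0), stack[:-1]], dim=0)
    # Outcome if we pop: everything shifts up one slot.
    popped = torch.cat([stack[1:], torch.zeros_like(stack[:1])], dim=0)
    # The new stack is the probability-weighted superposition of outcomes.
    return p_push * pushed + p_pop * popped + p_noop * stack

stack = torch.zeros(8, 4)                     # depth 8, element size 4
probs = torch.softmax(torch.randn(3), dim=0)  # would come from the RNN controller
stack = superposition_stack_step(stack, torch.randn(4), probs)
reading = stack[0]                            # differentiable "top of stack"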
Semiring Einsum
Python package for efficiently performing einsum operations in different semirings in PyTorch.
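The idea in a nutshell (a rough sketch in plain PyTorch, not this package's API): a semiring swaps out the addition and multiplication inside einsum. In the log semiring, multiplication becomes addition and summation becomes logsumexp, which lets probabilities be multiplied and marginalized stably in log space:

import torch

def log_einsum_ik_kj(A, B):
    # 'ik,kj->ij' in the log semiring: broadcast to (i, k, j), add
    # (log-space multiplication), then logsumexp over k (log-space sum).
    return torch.logsumexp(A.unsqueeze(2) + B.unsqueeze(0), dim=1)

A = torch.randn(3, 4).log_softmax(dim=1)  # log-probabilities
B = torch.randn(4, 5).log_softmax(dim=1)
C = log_einsum_ik_kj(A, B)
# Agrees with ordinary einsum carried out in probability space:
assert torch.allclose(C.exp(), torch.einsum('ik,kj->ij', A.exp(), B.exp()), atol=1e-5)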
Jishosen
Online Japanese-English dictionary.
See more on my GitHub page.
Hey, you scrolled down this far!
Despite my interest in languages, I am monolingual. But I dabble in ancient Greek, Latin, 日本語, 中文, and now that I live in Switzerland, a little bit of Deutsch.
Fun fact: my grandfather, D. Lee DuSell, was an artist and sculptor, and you can check out his work on his website, made by yours truly.