Brian DuSell

/ˈbɹaɪən duˈsɛl/

杜亦然
デュセル・ブライアン

I am a computer science researcher who specializes in natural language processing, artificial intelligence, and formal languages. I am interested in the structure of human and computer languages, and what the difficulty of processing this structure implies about the limitations of modern AI technology, including large language models (LLMs). I am particularly interested in approaches to surmounting these limitations that take into account syntactic structure and connections to formal language theory.

I am currently a postdoc at ETH Zürich in Ryan Cotterell's lab, where I have been working on relating neural network architectures to formal language classes and on tokenization algorithms, among other things. Before coming to ETH, I completed my PhD at the University of Notre Dame, where I was advised by David Chiang. My dissertation proposed new neural network architectures that are capable of handling ambiguous syntactic structure, which is a common phenomenon in human language. I taught the Theory of Computing course at Notre Dame in the spring semester of 2022. I also received my Bachelor's degree in Computer Science from Notre Dame, and I worked in industry for a few years before coming back for grad school.

Publications

Information Locality as an Inductive Bias for Neural Language Models
Taiga Someya, Anej Svete, Brian DuSell, Timothy J. O'Donnell, Mario Giulianelli, and Ryan Cotterell
ACL 2025
code arXiv

@inproceedings{someya-etal-2025-information,
    title = "Information Locality as an Inductive Bias for Neural Language Models",
    author = "Someya, Taiga and Svete, Anej and DuSell, Brian and O'Donnell, Timothy J. and Giulianelli, Mario and Cotterell, Ryan",
    booktitle = "Proc. ACL",
    year = "2025",
    month = jul # "--" # aug,
    address = "Vienna, Austria",
    url = "https://arxiv.org/abs/2506.05136"
}

From Language Models over Tokens to Language Models over Characters
Tim Vieira, Ben LeBrun, Mario Giulianelli, Juan Luis Gastaldi, Brian DuSell, John Terilla, Timothy J. O'Donnell, and Ryan Cotterell
ICML 2025 🏆 Spotlight paper!
code arXiv

@inproceedings{vieira-etal-2025-language,
    title = "From Language Models over Tokens to Language Models over Characters",
    author = "Vieira, Tim and LeBrun, Ben and Giulianelli, Mario and Gastaldi, Juan Luis and DuSell, Brian and Terilla, John and O'Donnell, Timothy J. and Cotterell, Ryan",
    booktitle = "Forty-second International Conference on Machine Learning",
    year = "2025",
    month = jul,
    address = "Vancouver, Canada",
    url = "https://arxiv.org/abs/2412.03719"
}

Language Models over Canonical Byte-Pair Encodings
Tim Vieira, Tianyu Liu, Clemente Pasti, Yahya Emara, Brian DuSell, Benjamin LeBrun, Mario Giulianelli, Juan Luis Gastaldi, Timothy J. O'Donnell, and Ryan Cotterell
ICML 2025
code arXiv

@inproceedings{vieira-etal-2025-canonical,
    title = "Language Models over Canonical Byte-Pair Encodings",
    author = "Vieira, Tim and Liu, Tianyu and Pasti, Clemente and Emara, Yahya and DuSell, Brian and LeBrun, Benjamin and Giulianelli, Mario and Gastaldi, Juan Luis and O'Donnell, Timothy J. and Cotterell, Ryan",
    booktitle = "Forty-second International Conference on Machine Learning",
    year = "2025",
    month = jul,
    address = "Vancouver, Canada",
    url = "https://arxiv.org/abs/2506.07956"
}

Training Neural Networks as Recognizers of Formal Languages
Alexandra Butoi, Ghazal Khalighinejad, Anej Svete, Josef Valvoda, Ryan Cotterell, and Brian DuSell
ICLR 2025
code arXiv OpenReview

@inproceedings{butoi-etal-2025-training,
    title = "Training Neural Networks as Recognizers of Formal Languages",
    author = "Butoi, Alexandra and Khalighinejad, Ghazal and Svete, Anej and Valvoda, Josef and Cotterell, Ryan and DuSell, Brian",
    booktitle = "The Thirteenth International Conference on Learning Representations",
    year = "2025",
    month = apr,
    address = "Singapore",
    url = "https://openreview.net/forum?id=aWLQTbfFgV"
}

The Foundations of Tokenization: Statistical and Computational Concerns
Juan Luis Gastaldi, John Terilla, Luca Malagutti, Brian DuSell, Tim Vieira, and Ryan Cotterell
ICLR 2025
arXiv OpenReview

@inproceedings{gastaldi-etal-2025-foundations,
    title = "The Foundations of Tokenization: Statistical and Computational Concerns",
    author = "Gastaldi, Juan Luis and Terilla, John and Malagutti, Luca and DuSell, Brian and Vieira, Tim and Cotterell, Ryan",
    booktitle = "The Thirteenth International Conference on Learning Representations",
    year = "2025",
    month = apr,
    address = "Singapore",
    url = "https://openreview.net/forum?id=B5iOSxM2I0"
}

On the Proper Treatment of Tokenization in Psycholinguistics
Mario Giulianelli, Luca Malagutti, Juan Luis Gastaldi, Brian DuSell, Tim Vieira, and Ryan Cotterell
EMNLP 2024
code arXiv ACL

@inproceedings{giulianelli-etal-2024-proper,
    title = "On the Proper Treatment of Tokenization in Psycholinguistics",
    author = "Giulianelli, Mario and Malagutti, Luca and Gastaldi, Juan Luis and DuSell, Brian and Vieira, Tim and Cotterell, Ryan",
    booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
    pages = "18556--18572",
    year = "2024",
    month = nov,
    address = "Miami, Florida",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.emnlp-main.1032/"
}

PILA: A Historical-Linguistic Dataset of Proto-Italic and Latin
Stephen Bothwell, Brian DuSell, David Chiang, and Brian Krostenko
LREC-COLING 2024
code arXiv ACL

@inproceedings{bothwell-etal-2024-pila,
    title = "{PILA}: A Historical-Linguistic Dataset of {P}roto-{I}talic and {L}atin",
    author = "Bothwell, Stephen and DuSell, Brian and Chiang, David and Krostenko, Brian",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
    pages = "12749--12760",
    year = "2024",
    month = may,
    address = "Turin, Italy",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.lrec-main.1116/"
}

Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns
Brian DuSell and David Chiang
ICLR 2024 🏆 Spotlight paper!
video slides poster code arXiv OpenReview

@inproceedings{dusell-chiang-2024-stack,
    title = "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns",
    author = "DuSell, Brian and Chiang, David",
    booktitle = "The Twelfth International Conference on Learning Representations",
    year = "2024",
    month = may,
    address = "Vienna, Austria",
    url = "https://openreview.net/forum?id=XVhm3X8Fum"
}

Nondeterministic Stacks in Neural Networks
Brian DuSell
PhD Dissertation (2023)
arXiv CurateND

@phdthesis{dusell-2023-nondeterministic,
    title = "Nondeterministic Stacks in Neural Networks",
    author = "DuSell, Brian",
    school = "University of Notre Dame",
    year = "2023",
    month = apr,
    url = "https://curate.nd.edu/show/jh343r10k4d",
    doi = "10.7274/jh343r10k4d"
}

The Surprising Computational Power of Nondeterministic Stack RNNs
Brian DuSell and David Chiang
ICLR 2023
code arXiv OpenReview

@inproceedings{dusell-chiang-2023-surprising,
    title = "The Surprising Computational Power of Nondeterministic Stack {RNN}s",
    author = "DuSell, Brian and Chiang, David",
    booktitle = "The Eleventh International Conference on Learning Representations",
    year = "2023",
    month = may,
    address = "Kigali, Rwanda",
    url = "https://openreview.net/forum?id=o58JtGDs6y"
}

Algorithms for Weighted Pushdown Automata
Alexandra Butoi, Brian DuSell, Tim Vieira, Ryan Cotterell, and David Chiang
EMNLP 2022
code arXiv ACL

@inproceedings{butoi-etal-2022-algorithms,
    title = "Algorithms for Weighted Pushdown Automata",
    author = "Butoi, Alexandra and DuSell, Brian and Vieira, Tim and Cotterell, Ryan and Chiang, David",
    booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing",
    pages = "9669--9680",
    year = "2022",
    month = dec,
    address = "Abu Dhabi, UAE",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.emnlp-main.656/",
    doi = "10.18653/v1/2022.emnlp-main.656"
}

Learning Hierarchical Structures with Differentiable Nondeterministic Stacks
Brian DuSell and David Chiang
ICLR 2022 🏆 Spotlight paper!
code arXiv OpenReview

@inproceedings{dusell-chiang-2022-learning,
    title = "Learning Hierarchical Structures with Differentiable Nondeterministic Stacks",
    author = "DuSell, Brian and Chiang, David",
    booktitle = "International Conference on Learning Representations",
    year = "2022",
    month = apr,
    address = "Online",
    url = "https://openreview.net/forum?id=5LXw_QplBiF"
}

Learning Context-Free Languages with Nondeterministic Stack RNNs
Brian DuSell and David Chiang
CoNLL 2020
code arXiv ACL

@inproceedings{dusell-chiang-2020-learning,
    title = "Learning Context-Free Languages with Nondeterministic Stack {RNN}s",
    author = "DuSell, Brian and Chiang, David",
    booktitle = "Proceedings of the 24th Conference on Computational Natural Language Learning",
    pages = "507--519",
    year = "2020",
    month = nov,
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.conll-1.41/",
    doi = "10.18653/v1/2020.conll-1.41"
}

Efficiency through Auto-Sizing: Notre Dame NLP's Submission to the WNGT 2019 Efficiency Task
Kenton Murray, Brian DuSell, and David Chiang
WNGT 2019
arXiv ACL

@inproceedings{murray-etal-2019-efficiency,
    title = "Efficiency through Auto-Sizing: {N}otre {D}ame {NLP}{'}s Submission to the {WNGT} 2019 Efficiency Task",
    author = "Murray, Kenton and DuSell, Brian and Chiang, David",
    booktitle = "Proceedings of the 3rd Workshop on Neural Generation and Translation",
    pages = "297--301",
    year = "2019",
    month = nov,
    address = "Hong Kong",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5634/",
    doi = "10.18653/v1/D19-5634"
}

Recorded Talks

Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns
Apr 22, 2024
FLaNN
video slides

Nondeterministic Stacks in Neural Networks
Oct 17, 2022
FLaNN
video

News

May 2025 Our paper "Information Locality as an Inductive Bias for Neural Language Models" (Taiga Someya, Anej Svete, Brian DuSell, Timothy J. O'Donnell, Mario Giulianelli, and Ryan Cotterell) has been accepted at ACL 2025.
May 2025 Two papers accepted at ICML 2025.
- "From Language Models over Tokens to Language Models over Characters" (Tim Vieira, Ben LeBrun, Mario Giulianelli, Juan Luis Gastaldi, Brian DuSell, John Terilla, Timothy J. O'Donnell, and Ryan Cotterell)
- "Language Models over Canonical Byte-Pair Encodings" (Tim Vieira, Tianyu Liu, Clemente Pasti, Yahya Emara, Brian DuSell, Benjamin LeBrun, Mario Giulianelli, Juan Luis Gastaldi, Timothy J. O'Donnell, and Ryan Cotterell)
Jan 22, 2025 Two papers accepted at ICLR 2025!
- "Training Neural Networks as Recognizers of Formal Languages" (Alexandra Butoi, Ghazal Khalighinejad, Anej Svete, Josef Valvoda, Ryan Cotterell, and Brian DuSell)
- "The Foundations of Tokenization: Statistical and Computational Concerns" (Juan Luis Gastaldi, John Terilla, Luca Malagutti, Brian DuSell, Tim Vieira, and Ryan Cotterell)
Jan 21-22, 2025 I presented our work "Training Neural Networks as Recognizers of Formal Languages" with Alexandra Butoi at a workshop with Michael Hahn's group at Saarland University.
Nov 2024 Our paper "On the Proper Treatment of Tokenization in Psycholinguistics" (Mario Giulianelli, Luca Malagutti, Juan Luis Gastaldi, Brian DuSell, Tim Vieira, and Ryan Cotterell) was presented at EMNLP 2024.

Experience

ETH Zürich
2023-present
Postdoc
Supervisor: Ryan Cotterell

University of Notre Dame
2016-2023
M.S. and Ph.D., Computer Science
Dissertation: Nondeterministic Stacks in Neural Networks
Advisor: David Chiang

Amazon Web Services
Jun-Sep 2020 and 2021
Applied Scientist Intern
Team: Amazon Translate

University of Notre Dame
2009-2013
B.S. in Computer Science, magna cum laude

Software

Neural Network Recognizers
PyTorch code for training RNNs, LSTMs, and transformers as recognizers of formal languages. Supports multi-task learning and implements efficient generation of both positive and negative examples.

Rau
Language modeling and sequence-to-sequence pipeline for PyTorch.

Stack Attention
PyTorch implementation of stack attention, including a full machine translation pipeline.

Nondeterministic Stack RNN
PyTorch implementation of our Nondeterministic Stack RNN model, as well as other Stack RNN models.

Semiring Einsum
Python package for efficiently performing einsum operations in different semirings in PyTorch.

Jishosen
Online Japanese-English dictionary.

See more on my GitHub page.

Hey, you scrolled down this far!

Despite my interest in languages, I am monolingual. But I dabble in ancient Greek, Latin, 日本語, 中文, and now that I live in Switzerland, a little bit of Deutsch.

Fun fact: my grandfather, D. Lee DuSell, was an artist and sculptor, and you can check out his work on his website, made by yours truly.
