Brian DuSell

/ˈbɹaɪən duˈsɛl/
杜亦然
デュセル・ブライアン

I am currently on the academic job market!

I am a computer science researcher who specializes in natural language processing, artificial intelligence, and formal languages. I am interested in the structure of human and computer languages, and what the difficulty of processing this structure implies about the limitations of modern AI technology, including large language models (LLMs). I am particularly interested in approaches to surmounting these limitations that take into account syntactic structure and connections to formal language theory.

I am currently a postdoc at ETH Zürich in Ryan Cotterell's lab, where I have been working on relating neural network architectures to formal language classes and on tokenization algorithms, among other things. Before coming to ETH, I completed my PhD at the University of Notre Dame, where I was advised by David Chiang. My dissertation proposed new neural network architectures capable of handling ambiguous syntactic structure, a common phenomenon in human language. I taught the Theory of Computing course at Notre Dame in the spring semester of 2022. I also earned my Bachelor's degree in Computer Science at Notre Dame and worked in industry for a few years before returning for grad school.

Publications

  • From Language Models over Tokens to Language Models over Characters
    Tim Vieira, Ben LeBrun, Mario Giulianelli, Juan Luis Gastaldi, Brian DuSell, John Terilla, Timothy J. O'Donnell, and Ryan Cotterell
    arXiv preprint
    code arXiv
    @misc{vieira-etal-2024-language,
        title = "From Language Models over Tokens to Language Models over Characters",
        author = "Vieira, Tim and LeBrun, Ben and Giulianelli, Mario and Gastaldi, Juan Luis and DuSell, Brian and Terilla, John and O'Donnell, Timothy J. and Cotterell, Ryan",
        year = "2024",
        month = dec,
        url = "https://arxiv.org/abs/2412.03719",
        doi = "10.48550/arXiv.2412.03719",
        note = "{arXiv}:2412.03719"
    }
  • Training Neural Networks as Recognizers of Formal Languages
    Alexandra Butoi, Ghazal Khalighinejad, Anej Svete, Josef Valvoda, Ryan Cotterell, and Brian DuSell
    arXiv preprint
    code arXiv
    @misc{butoi-etal-2024-training,
        title = "Training Neural Networks as Recognizers of Formal Languages",
        author = "Butoi, Alexandra and Khalighinejad, Ghazal and Svete, Anej and Valvoda, Josef and Cotterell, Ryan and DuSell, Brian",
        year = "2024",
        month = nov,
        url = "https://arxiv.org/abs/2411.07107",
        doi = "10.48550/arXiv.2411.07107",
        note = "{arXiv}:2411.07107"
    }
  • On the Proper Treatment of Tokenization in Psycholinguistics
    Mario Giulianelli, Luca Malagutti, Juan Luis Gastaldi, Brian DuSell, Tim Vieira, and Ryan Cotterell
    EMNLP 2024
    code arXiv ACL
    @inproceedings{giulianelli-etal-2024-proper,
        title = "On the Proper Treatment of Tokenization in Psycholinguistics",
        author = "Giulianelli, Mario and Malagutti, Luca and Gastaldi, Juan Luis and DuSell, Brian and Vieira, Tim and Cotterell, Ryan",
        booktitle = "Proc. EMNLP",
        pages = "18556--18572",
        year = "2024",
        month = nov,
        address = "Miami, Florida",
        publisher = "Association for Computational Linguistics",
        url = "https://aclanthology.org/2024.emnlp-main.1032/"
    }
  • The Foundations of Tokenization: Statistical and Computational Concerns
    Juan Luis Gastaldi, John Terilla, Luca Malagutti, Brian DuSell, Tim Vieira, and Ryan Cotterell
    arXiv preprint
    arXiv
    @misc{gastaldi-etal-2024-foundations,
        title = "The Foundations of Tokenization: Statistical and Computational Concerns",
        author = "Gastaldi, Juan Luis and Terilla, John and Malagutti, Luca and DuSell, Brian and Vieira, Tim and Cotterell, Ryan",
        year = "2024",
        month = jul,
        url = "https://arxiv.org/abs/2407.11606",
        doi = "10.48550/arXiv.2407.11606",
        note = "{arXiv}:2407.11606"
    }
  • PILA: A Historical-Linguistic Dataset of Proto-Italic and Latin
    Stephen Bothwell, Brian DuSell, David Chiang, and Brian Krostenko
    LREC-COLING 2024
    code arXiv ACL
    @inproceedings{bothwell-etal-2024-pila,
        title = "{PILA}: A Historical-Linguistic Dataset of {P}roto-{I}talic and {L}atin",
        author = "Bothwell, Stephen and DuSell, Brian and Chiang, David and Krostenko, Brian",
        booktitle = "Proc. LREC-COLING",
        pages = "12749--12760",
        year = "2024",
        month = may,
        address = "Turin, Italy",
        publisher = "ELRA and ICCL",
        url = "https://aclanthology.org/2024.lrec-main.1116/"
    }
  • Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns
    Brian DuSell and David Chiang
    ICLR 2024 🏆 Spotlight paper!
    video slides poster code arXiv OpenReview
    @inproceedings{dusell-chiang-2024-stack,
        title = "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns",
        author = "DuSell, Brian and Chiang, David",
        booktitle = "Proc. ICLR",
        year = "2024",
        month = may,
        address = "Vienna, Austria",
        url = "https://openreview.net/forum?id=XVhm3X8Fum"
    }
  • Nondeterministic Stacks in Neural Networks
    Brian DuSell
    PhD Dissertation (2023)
    arXiv CurateND
    @phdthesis{dusell-2023-nondeterministic,
        title = "Nondeterministic Stacks in Neural Networks",
        author = "DuSell, Brian",
        school = "University of Notre Dame",
        year = "2023",
        month = apr,
        url = "https://curate.nd.edu/show/jh343r10k4d",
        doi = "10.7274/jh343r10k4d"
    }
  • The Surprising Computational Power of Nondeterministic Stack RNNs
    Brian DuSell and David Chiang
    ICLR 2023
    code arXiv OpenReview
    @inproceedings{dusell-chiang-2023-surprising,
        title = "The Surprising Computational Power of Nondeterministic Stack {RNN}s",
        author = "DuSell, Brian and Chiang, David",
        booktitle = "Proc. ICLR",
        year = "2023",
        month = may,
        address = "Kigali, Rwanda",
        url = "https://openreview.net/forum?id=o58JtGDs6y"
    }
  • Algorithms for Weighted Pushdown Automata
    Alexandra Butoi, Brian DuSell, Tim Vieira, Ryan Cotterell, and David Chiang
    EMNLP 2022
    code arXiv ACL
    @inproceedings{butoi-etal-2022-algorithms,
        title = "Algorithms for Weighted Pushdown Automata",
        author = "Butoi, Alexandra and DuSell, Brian and Vieira, Tim and Cotterell, Ryan and Chiang, David",
        booktitle = "Proc. EMNLP",
        pages = "9669--9680",
        year = "2022",
        month = dec,
        address = "Abu Dhabi, UAE",
        publisher = "Association for Computational Linguistics",
        url = "https://aclanthology.org/2022.emnlp-main.656/",
        doi = "10.18653/v1/2022.emnlp-main.656"
    }
  • Learning Hierarchical Structures with Differentiable Nondeterministic Stacks
    Brian DuSell and David Chiang
    ICLR 2022 🏆 Spotlight paper!
    code arXiv OpenReview
    @inproceedings{dusell-chiang-2022-learning,
        title = "Learning Hierarchical Structures with Differentiable Nondeterministic Stacks",
        author = "DuSell, Brian and Chiang, David",
        booktitle = "Proc. ICLR",
        year = "2022",
        month = apr,
        address = "Online",
        url = "https://openreview.net/forum?id=5LXw_QplBiF"
    }
  • Learning Context-Free Languages with Nondeterministic Stack RNNs
    Brian DuSell and David Chiang
    CoNLL 2020
    code arXiv ACL
    @inproceedings{dusell-chiang-2020-learning,
        title = "Learning Context-Free Languages with Nondeterministic Stack {RNN}s",
        author = "DuSell, Brian and Chiang, David",
        booktitle = "Proc. CoNLL",
        pages = "507--519",
        year = "2020",
        month = nov,
        address = "Online",
        publisher = "Association for Computational Linguistics",
        url = "https://aclanthology.org/2020.conll-1.41/",
        doi = "10.18653/v1/2020.conll-1.41"
    }
  • Efficiency through Auto-Sizing: Notre Dame NLP's Submission to the WNGT 2019 Efficiency Task
    Kenton Murray, Brian DuSell, and David Chiang
    Workshop on Neural Generation and Translation 2019
    arXiv ACL
    @inproceedings{murray-etal-2019-efficiency,
        title = "Efficiency through Auto-Sizing: {N}otre {D}ame {NLP}{'}s Submission to the {WNGT} 2019 Efficiency Task",
        author = "Murray, Kenton and DuSell, Brian and Chiang, David",
        booktitle = "Proc. Workshop on Neural Generation and Translation",
        pages = "297--301",
        year = "2019",
        month = nov,
        address = "Hong Kong",
        publisher = "Association for Computational Linguistics",
        url = "https://aclanthology.org/D19-5634/",
        doi = "10.18653/v1/D19-5634"
    }
Recorded Talks

  • Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns
    FLaNN
    video slides
  • Nondeterministic Stacks in Neural Networks
    FLaNN
    video

News

Experience

  • ETH Zürich
    2023-present
    Postdoc
    Supervisor: Ryan Cotterell
  • University of Notre Dame
    2016-2023
    M.S. and Ph.D., Computer Science
    Dissertation: Nondeterministic Stacks in Neural Networks
    Advisor: David Chiang
  • Amazon Web Services
    Jun-Sep 2020 and 2021
    Applied Scientist Intern
    Team: Amazon Translate
  • University of Notre Dame
    2009-2013
    B.S. in Computer Science, magna cum laude

Software

  • Neural Network Recognizers
    PyTorch code for training RNNs, LSTMs, and transformers as recognizers of formal languages. Supports multi-task learning and implements efficient generation of both positive and negative examples. (A minimal illustrative sketch of the basic idea appears after this list.)
  • Rau
    Language modeling and sequence-to-sequence pipeline for PyTorch.
  • Stack Attention
    PyTorch implementation of stack attention, including a full machine translation pipeline.
  • Nondeterministic Stack RNN
    PyTorch implementation of our Nondeterministic Stack RNN model, as well as other Stack RNN models.
  • Semiring Einsum
    Python package for efficiently performing einsum operations in different semirings in PyTorch. (A small conceptual sketch of semiring einsum appears after this list.)
  • Jishosen
    Online Japanese-English dictionary.
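
To give a flavor of what the first project above is about, here is a minimal, self-contained sketch of training a small LSTM to accept or reject strings of the toy language aⁿbⁿ. It is only an illustration of the general idea; the language, model size, training loop, and all names below are assumptions for this sketch, not the repository's actual code or API.

    # Illustrative sketch only: train a small LSTM as a recognizer of the formal
    # language a^n b^n, with randomly generated positive and negative examples.
    import random
    import torch
    import torch.nn as nn

    VOCAB = {'a': 0, 'b': 1}

    def sample_example(max_n=10):
        """Return (symbols, label): a positive example a^n b^n, or a negative one
        obtained by flipping a single symbol (which always unbalances the counts)."""
        n = random.randint(1, max_n)
        s = ['a'] * n + ['b'] * n
        if random.random() < 0.5:
            return s, 1
        i = random.randrange(len(s))
        s[i] = 'a' if s[i] == 'b' else 'b'
        return s, 0

    class LSTMRecognizer(nn.Module):
        def __init__(self, vocab_size=2, d=32):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d)
            self.lstm = nn.LSTM(d, d, batch_first=True)
            self.out = nn.Linear(d, 1)

        def forward(self, x):
            h, _ = self.lstm(self.embed(x))
            # Decide accept/reject from the final hidden state.
            return self.out(h[:, -1, :]).squeeze(-1)

    model = LSTMRecognizer()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.BCEWithLogitsLoss()

    for step in range(1000):
        symbols, label = sample_example()
        x = torch.tensor([[VOCAB[c] for c in symbols]])
        y = torch.tensor([float(label)])
        loss = loss_fn(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()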
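
Similarly, the idea behind semiring einsum can be sketched in a few lines of plain PyTorch: the same contraction is evaluated in the ordinary real semiring (+, ×), in the log semiring (logsumexp, +), and in the Viterbi semiring (max, +). This is only a conceptual illustration; the function names below are hypothetical and not the package's actual API.

    # Conceptual sketch only: the same einsum contraction in three semirings.
    import torch

    def real_matmul(A, B):
        # Ordinary einsum: sum over j of A[i, j] * B[j, k].
        return torch.einsum('ij,jk->ik', A, B)

    def log_matmul(logA, logB):
        # Log semiring: "multiply" becomes add, "sum" becomes logsumexp.
        # Useful for combining log-probabilities without underflow.
        return torch.logsumexp(logA.unsqueeze(-1) + logB.unsqueeze(0), dim=1)

    def viterbi_matmul(logA, logB):
        # Viterbi (max, +) semiring: keep only the best term instead of summing.
        return (logA.unsqueeze(-1) + logB.unsqueeze(0)).amax(dim=1)

    A = torch.rand(3, 4)
    B = torch.rand(4, 5)
    # The real-semiring and log-semiring results agree up to floating-point error.
    assert torch.allclose(real_matmul(A, B).log(), log_matmul(A.log(), B.log()), atol=1e-5)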

See more on my GitHub page.

Hey, you scrolled down this far!

Despite my interest in languages, I am monolingual. But I dabble in Ancient Greek, Latin, 日本語 (Japanese), 中文 (Chinese), and, now that I live in Switzerland, a little bit of Deutsch (German).

Fun fact: my grandfather, D. Lee DuSell, was an artist and sculptor, and you can check out his work on his website, made by yours truly.