Culture and Computation Lab
Welcome! The Culture and Computation Lab is a research group at Cornell University working at the intersection of the arts, social sciences, and engineering. Our work builds on research in cultural analytics, natural language processing, digital humanities, legal studies, and related fields.
Read more about our work here, and see our members here.
| 2025-11-21 |
Causal Effect of Character Gender on Readers’ Preferences Federica Bologna, Ian Lundberg, Matthew Wilkens Anthology of Computers and the Humanities, 2025 |
| 2025-11-21 |
Zero-shot Methods for Historical Text Restoration Kiara M. H. Liu, Martin Mueller, Matthew Wilkens Anthology of Computers and the Humanities, 2025 |
| 2025-11-21 |
Castles, Battlefields, and Continents: A Dataset of Maps from Literature Axel Bax, David Mimno, Matthew Wilkens Anthology of Computers and the Humanities, 2025 |
| 2025-10-12 |
LONGQAEVAL: Designing Reliable Evaluations of Long-Form Clinical QA under Resource Constraints Federica Bologna, Tiffany Pan, Matthew Wilkens, Yue Guo, Lucy Lu Wang arXiv.org, 2025 |
| 2025-10-10 |
NarraBench: A Comprehensive Framework for Narrative Benchmarking Sil Hamilton, Matthew Wilkens, Andrew Piper arXiv.org, 2025 |
| 2025-10-08 |
Agent Bain vs. Agent McKinsey: A New Text-to-SQL Benchmark for the Business Domain Yue Li, Ran Tao, Derek Hommel, Yusuf Denizay Dönder, Sungyong Chang, David Mimno, Unso Eun Seo Jo arXiv.org, 2025 |
| 2025-08-02 |
Show or Tell? Modeling the evolution of request-making in Human-LLM conversations Shengqi Zhu, Jeffrey M. Rzeszotarski, David Mimno arXiv.org, 2025 |
| 2025-07-26 |
Are You There God? Lightweight Narrative Annotation of Christian Fiction with LMs Rebecca M. M. Hicke, Brian Haggard, Mia Ferrante, Rayhan Khanna, David Mimno Anthology of Computers and the Humanities, 2025 |
| 2025-05-20 |
Too Long, Didn't Model: Decomposing LLM Long-Context Understanding With Novels Sil Hamilton, Rebecca M. M. Hicke, Matthew Wilkens, David Mimno arXiv.org, 2025 |
| 2025-05-20 |
Cheaper, Better, Faster, Stronger: Robust Text-to-SQL without Chain-of-Thought or Fine-Tuning Yusuf Denizay Dönder, Derek Hommel, Andrea Wen-Yi Wang, David Mimno, Unso Eun Seo Jo arXiv.org, 2025 |
| 2025-04-25 |
Shengqi Zhu, Jeffrey M. Rzeszotarski, David Mimno CHI Extended Abstracts, 2025 |
| 2025-04-08 |
The Zero Body Problem: Probing LLM Use of Sensory Language Rebecca M. M. Hicke, Sil Hamilton, David Mimno arXiv.org, 2025 |
| 2025-04-02 |
Tasks and Roles in Legal AI: Data Curation, Annotation, and Verification Allison Koenecke, Edward H. Stiglitz, David Mimno, Matthew Wilkens arXiv.org, 2025 |
| 2025-03-31 |
Do Chinese models speak Chinese languages? Andrea Wen-Yi Wang, Unso Eun Seo Jo, David Mimno arXiv.org, 2025 |
| 2025-03-31 |
Endometriosis Communities on Reddit: Quantitative Analysis Federica Bologna, Rosamond Thalken, Kristen Pepin, Matthew Wilkens Journal of Medical Internet Research, 2025 |
| 2025-02-26 |
A City of Millions: Mapping Literary Social Networks At Scale Sil Hamilton, Rebecca M. M. Hicke, David Mimno, Matthew Wilkens Proceedings of the 5th International Conference on Natural Language Processing for Digital Humanities, 2025 |
| 2025-02-26 |
Provocations from the Humanities for Generative AI Research Lauren F. Klein, Meredith Martin, André Brock, Maria Antoniak, Melanie Walsh, Jessica Marie Johnson, Lauren Tilton, David Mimno arXiv.org, 2025 |
| 2025-02-05 |
Looking for the Inner Music: Probing LLMs' Understanding of Literary Style Rebecca M. M. Hicke, David M. Mimno Computational Humanities Research, 2025 |
| 2025-01-01 |
Lost in Space: Optimizing Tokens for Grammar-Constrained Decoding Sil Hamilton, David Mimno arXiv.org, 2025 |
| 2024-10-16 |
Context is Key(NMF): Modelling Topical Information Dynamics in Chinese Diaspora Media R. Kristensen-McLachlan, Rebecca M. M. Hicke, Márton Kardos, M. Thunø Workshop on Computational Humanities Research, 2024 |
| 2024-10-13 |
Quilt: Custom UIs for Linking Unstructured Documents to Structured Datasets Pragya Kallanagoudar, Chithra Anand, Rolando Garcia, Rebecca M. M. Hicke, Aditya G. Parameswaran, Eunice Jun, Sarah E. Chasins ACM Symposium on User Interface Software and Technology, 2024 |
| 2024-10-11 |
SCIENCE IS EXPLORATION: Computational Frontiers for Conceptual Metaphor Theory Rebecca M. M. Hicke, R. Kristensen-McLachlan Workshop on Computational Humanities Research, 2024 |
| 2024-10-09 |
Large Language Models in Qualitative Research: Uses, Tensions, and Intentions Hope Schroeder, Marianne Aubin Le Quéré, Casey Randazzo, David Mimno, S. Schoenebeck International Conference on Human Factors in Computing Systems, 2024 |
| 2024-09-29 |
Judicial self fashioning: Rhetorical performance in Supreme Court opinions Rosamond Thalken, David M. Mimno, Matthew Wilkens Discourse Studies, 2024 |
| 2024-09-17 |
Says Who? Effective Zero-Shot Annotation of Focalization Rebecca M. M. Hicke, Yuri Bizzoni, P. Moreira, R. Kristensen-McLachlan Anthology of Computers and the Humanities, 2024 |
| 2024-08-01 |
Kristen Pepin, Federica Bologna, Rosamond Thalken, Matthew Wilkens Journal of minimally invasive gynecology, 2024 |
| 2024-07-17 |
Andrea Wen-Yi Wang, Kathryn Adamson, Nathalie Greenfield, Rachel Goldberg, Sandra Babcock, David Mimno, Allison Koenecke AAAI/ACM Conference on AI, Ethics, and Society, 2024 |
| 2024-07-12 |
How Chinese are Chinese Language Models? The Puzzling Lack of Language Policy in China's LLMs Andrea Wen-Yi Wang, Unso Eun Seo Jo, Lu Jia Lin, David Mimno arXiv.org, 2024 |
| 2024-07-02 |
What We Talk About When We Talk About LMs: Implicit Paradigm Shifts and the Ship of Language Models Shengqi Zhu, Jeffrey M. Rzeszotarski North American Chapter of the Association for Computational Linguistics, 2024 |
| 2024-05-11 |
LLMs as Research Tools: Applications and Evaluations in HCI Data Work Marianne Aubin Le Quéré, Hope Schroeder, Casey Randazzo, Jie Gao, Ziv Epstein, S. Perrault, David Mimno, Louise Barkhuus, Hanlin Li CHI Extended Abstracts, 2024 |
| 2024-02-29 |
Endometriosis Online Communities: A Quantitative Analysis Federica Bologna, Rosamond Thalken, Kristen Pepin, Matthew Wilkens medRxiv, 2024 |
| 2024-01-31 |
[Lions: 1] and [Tigers: 2] and [Bears: 3], Oh My! Literary Coreference Annotation with LLMs Rebecca M. M. Hicke, David M. Mimno LATECHCLFL, 2024 |
| 2024-01-14 |
The Afterlives of Shakespeare and Company in Online Social Readership Maria Antoniak, David M. Mimno, Rosamond Thalken, Melanie Walsh, Matthew Wilkens, Gregory Yauney Journal of Cultural Analytics, 2024 |
| 2024-01-01 |
Large Language Models in Qualitative Research: Can We Do the Data Justice? Hope Schroeder, Marianne Aubin Le Quéré, Casey Randazzo, David Mimno, S. Schoenebeck arXiv.org, 2024 |
| 2024-01-01 |
Contextualized Topic Coherence Metrics Hamed Rahimi, David M. Mimno, Jacob Louis Hoover, Hubert Naacke, Camélia Constantin, Bernd Amann Findings, 2024 |
| 2024-01-01 |
Shengqi Zhu, Jeffrey M. Rzeszotarski Findings of the Association for Computational Linguistics ACL 2024 |
| 2023-11-29 |
Hyperpolyglot LLMs: Cross-Lingual Interpretability in Token Embeddings Andrea Wen-Yi Wang, David Mimno Conference on Empirical Methods in Natural Language Processing, 2023 |
| 2023-11-15 |
Data Similarity is Not Enough to Explain Language Model Performance Gregory Yauney, Emily Reif, David M. Mimno Conference on Empirical Methods in Natural Language Processing, 2023 |
| 2023-11-11 |
Report of the 1st Workshop on Generative AI and Law A. F. Cooper, Katherine Lee, James Grimmelmann, Daphne Ippolito, Christopher Callison-Burch, Christopher A. Choquette-Choo, Niloofar Mireshghallah, Miles Brundage, David Mimno, M. Z. Choksi, J. Balkin, Nicholas Carlini, Christopher De Sa, Jonathan Frankle, Deep Ganguli, Bryant Gipson, A. Guadamuz, Swee Leng Harris, Abigail Z. Jacobs, Elizabeth Joh, Gautam Kamath, M. Lemley, Cass Matthews, Christine McLeavey, Corynne Mcsherry, Milad Nasr, Paul Ohm, Adam Roberts, Tom Rubin, Pamela Samuelson, Ludwig Schubert, Kristen Vaccaro, Luis Villa, Felix Wu, Elana Zeide Social Science Research Network, 2023 |
| 2023-10-27 |
Modeling Legal Reasoning: LM Annotation at the Edge of Human Agreement Rosamond Thalken, Edward H. Stiglitz, David M. Mimno, Matthew Wilkens Conference on Empirical Methods in Natural Language Processing, 2023 |
| 2023-10-27 |
T5 meets Tybalt: Author Attribution in Early Modern English Drama Using Large Language Models Rebecca M. M. Hicke, David M. Mimno Workshop on Computational Humanities Research, 2023 |
| 2023-07-07 |
Deep distant reading: The rise of realism in Scandinavian literature as a case study Jens Bjerring-Hansen, Matthew Wilkens Orbis Litterarum, 2023 |
| 2023-07-06 |
Establishing Connectivity and Trust in High Schools During COVID-19 Lisa De Leon, Matthew Wilkens The Scholarship Without Borders Journal, 2023 |
| 2023-05-28 |
More than Classification: A Unified Framework for Event Temporal Relation Extraction Quzhe Huang, Yutong Hu, Shengqi Zhu, Yansong Feng, Chang Liu, Dongyan Zhao Annual Meeting of the Association for Computational Linguistics, 2023 |
| 2023-05-27 |
Grounding Characters and Places in Narrative Text Sandeep Soni, Amanpreet Sihra, Elizabeth F. Evans, Matthew Wilkens, David Bamman Annual Meeting of the Association for Computational Linguistics, 2023 |
| 2023-05-22 |
S. Longpre, Gregory Yauney, Emily Reif, Katherine Lee, Adam Roberts, Barret Zoph, Denny Zhou, Jason Wei, Kevin Robinson, David M. Mimno, Daphne Ippolito North American Chapter of the Association for Computational Linguistics, 2023 |
| 2023-01-23 |
Sensemaking About Contraceptive Methods Across Online Platforms LeAnn McDowall, Maria Antoniak, David M. Mimno International Conference on Web and Social Media, 2023 |
| 2023-01-01 |
Large Language Models and NER: better results with less work Rosamond Thalken, Matthew Wilkens, David M. Mimno Digital Humanities Conference, 2023 |
| 2023-01-01 |
The Chatbot and the Canon: Poetry Memorization in LLMs Lyra D'Souza, David Mimno Workshop on Computational Humanities Research, 2023 |
| 2023-01-01 |
MultiHATHI: A Complete Collection of Multilingual Prose Fiction in the HathiTrust Digital Library S. Hamilton, Andrew Piper Journal of Open Humanities Data, 2023 |
| 2023-01-01 |
Mrs. Dalloway Said She Would Segment the Chapters Herself Peiqi Sui, Lin Wang, S. Hamilton, Thorsten Ries, Kelvin Wong, Stephen Wong WNU, 2023 |
| 2022-10-13 |
The COVID That Wasn’t: Counterfactual Journalism Using GPT S. Hamilton, Andrew Piper LATECHCLFL, 2022 |
| 2022-10-07 |
Breaking BERT: Evaluating and Optimizing Sparsified Attention Siddhartha Brahma, Polina Zablotskaia, David M. Mimno arXiv.org, 2022 |
| 2022-10-05 |
Jacob Eisenstein, D. Andor, Bernd Bohnet, Michael Collins, David M. Mimno arXiv.org, 2022 |
| 2022-10-01 |
Rebecca M. M. Hicke, Maanya Goenka, E. Alexander 2022 IEEE 7th Workshop on Visualization for the Digital Humanities (VIS4DH), 2022 |
| 2022-04-17 |
Does Recommend-Revise Produce Reliable Annotations? An Analysis on Missing Instances in DocRED Quzhe Huang, Shibo Hao, Yuan Ye, Shengqi Zhu, Yansong Feng, Dongyan Zhao Annual Meeting of the Association for Computational Linguistics, 2022 |
| 2021-11-12 |
On-the-fly Rectification for Robust Large-Vocabulary Topic Inference Moontae Lee, Sungjun Cho, Kun Dong, David M. Mimno, D. Bindel International Conference on Machine Learning, 2021 |
| 2021-10-05 |
Federica Bologna, A. Iorio, S. Peroni, Francesco Poggi Quantitative Science Studies, 2021 |
| 2021-09-22 |
‘Tecnologica cosa’: Modeling Storyteller Personalities in Boccaccio’s ‘Decameron’ A. Feder Cooper, Maria Antoniak, Christopher De Sa, Marilyn Migiel, David M. Mimno LATECHCLFL, 2021 |
| 2021-09-15 |
Comparing Text Representations: A Theory-Driven Approach Gregory Yauney, David M. Mimno Conference on Empirical Methods in Natural Language Processing, 2021 |
| 2021-07-09 |
Reply to Linton: Perspectival interference up close Jorge Morales, Axel Bax, C. Firestone Proceedings of the National Academy of Sciences of the United States of America, 2021 |
| 2021-06-30 |
Too isolated, too insular: American Literature and the World Matthew Wilkens Journal of Cultural Analytics, 2021 |
| 2021-06-10 |
Academics evaluating academics: a methodology to inform the review process on top of open citations Federica Bologna, A. Iorio, S. Peroni, Francesco Poggi arXiv.org, 2021 |
| 2021-06-03 |
Exploring Distantly-Labeled Rationales in Neural Network Models Quzhe Huang, Shengqi Zhu, Yansong Feng, Dongyan Zhao Annual Meeting of the Association for Computational Linguistics, 2021 |
| 2021-06-03 |
Three Sentences Are All You Need: Local Path Enhanced Document Relation Extraction Quzhe Huang, Shengqi Zhu, Yansong Feng, Yuan Ye, Yuxuan Lai, Dongyan Zhao Annual Meeting of the Association for Computational Linguistics, 2021 |
| 2021-06-01 |
Alicia Eads, Alexandra Schofield, Fauna Mahootian, David M. Mimno, Rens Wilderom Poetics, 2021 |
| 2021-05-18 |
Federica Bologna, A. Iorio, S. Peroni, Francesco Poggi arXiv.org, 2021 |
| 2021-03-14 |
Federica Bologna, A. Iorio, S. Peroni, Francesco Poggi Scientometrics, 2021 |
| 2021-01-01 |
Towards Hybrid Human-Machine Workflow for Natural Language Generation Heloisa Candello, Stevie Chancellor, S. Ernala, Mitchell Gordon, Huda Khayrallah, Geza Kovacs, Narges Mahyar, David Mimno, Swati Mishra, Carolyn Rosé, Koustuv Saha, Joseph Seering, Qinlan Shen, Alison Smith-Renner, Adam Trischler, Neslihan Iskender, Tim Polzehl, C. Kennington, J. Fails, Katherine Landau, M. S. Pera, Alyssa Lees, Daniel Borkan, Ian Kivlichan, Jorge Nario HCINLP, 2021 |
| 2021-01-01 |
Federica Bologna, A. Iorio, S. Peroni, Francesco Poggi arXiv.org, 2021 |
| 2020-10-30 |
Finding Domain-Specific Grounding in Noisy Visual-Textual Documents Gregory Yauney, Jack Hessel, David M. Mimno Conference on Empirical Methods in Natural Language Processing, 2020 |
| 2020-10-23 |
Topic Modeling with Contextualized Word Representation Clusters Laure Thompson, David M. Mimno arXiv.org, 2020 |
| 2020-07-08 |
Like Two Pis in a Pod: Author Similarity Across Time in the Ancient Greek Corpus G. Storey, David M. Mimno |
| 2020-06-12 |
Sustained representation of perspectival shape Jorge Morales, Axel Bax, C. Firestone Proceedings of the National Academy of Sciences of the United States of America, 2020 |
| 2020-02-05 |
R. Griffin, Maria Antoniak, P. D. Mac, Vladimir N. Kramskiy, S. Waldman, David M. Mimno Frontiers in Neuroscience, 2020 |
| 2020-01-11 |
Making, Preserving, and Curating Born-Digital Literature Anastasia Salter, Marjorie C. Luesebrink, Dene Grigar, Leonardo Flores, Julian Ankney, Nicholas Binford, Kathryn Manis, Ricardo A. Ramirez, Troy Rowden, R. Snyder, Rosamond Thalken, N. Idris |
| 2020-01-01 |
Prior-aware Composition Inference for Spectral Topic Models Moontae Lee, D. Bindel, David M. Mimno International Conference on Artificial Intelligence and Statistics, 2020 |
| 2020-01-01 |
Replication and Computational Literary Studies Christof Schöch, K. Dalen-Oskam, Maria Antoniak, Fotis Jannidis, David M. Mimno Digital Humanities Conference, 2020 |
| 2020-01-01 |
Accessing and Comparing Word Frequency Data Matthew L. Jockers, Rosamond Thalken Text Analysis with R, 2020 |
| 2020-01-01 |
Constructing and Analyzing Short Science Fiction at Scale Laure Thompson, David M. Mimno Digital Humanities Conference, 2020 |
| 2020-01-01 |
First Foray into Text Analysis with R Matthew L. Jockers, Rosamond Thalken Text Analysis with R, 2020 |
| 2020-01-01 |
Network Analysis Finds Shifts in the History of Modern Architecture Gregory Yauney, David M. Mimno Digital Humanities Conference, 2020 |
| 2020-01-01 |
Matthew L. Jockers, Rosamond Thalken Quantitative Methods in the Humanities and Social Sciences, 2020 |
| 2019-11-07 |
Narrative Paths and Negotiation of Power in Birth Stories Maria Antoniak, David M. Mimno, K. Levy Proc. ACM Hum. Comput. Interact., 2019 |
| 2019-11-01 |
Practical Correlated Topic Modeling and Analysis via the Rectified Anchor Word Algorithm Moontae Lee, Sungjun Cho, D. Bindel, David M. Mimno Conference on Empirical Methods in Natural Language Processing, 2019 |
| 2019-07-02 |
How We Do Things With Words: Analyzing Text as Social and Cultural Data D. Nguyen, Maria Liakata, S. Dedeo, Jacob Eisenstein, David Mimno, Rebekah Tromble, J. Winters Frontiers in Artificial Intelligence, 2019 |
| 2019-07-01 |
Boosted negative sampling by quadratically constrained entropy maximization Taygun Kekeç, David M. Mimno, D. Tax Pattern Recognition Letters, 2019 |
| 2019-01-01 |
Unsupervised Discovery of Multimodal Links in Multi-image, Multi-sentence Documents Jack Hessel, Lillian Lee, David M. Mimno Conference on Empirical Methods in Natural Language Processing, 2019 |