Culture and Computation Lab
Welcome! The Culture and Computation Lab is a research group at Cornell University working at the intersection of the arts, social sciences, and engineering. Our work builds on research in cultural analytics, natural language processing, digital humanities, and legal studies, among other fields.
Read more about our work here, and see our members here.
| 2025-10-10 |
NarraBench: A Comprehensive Framework for Narrative Benchmarking Sil Hamilton, Matthew Wilkens, Andrew Piper |
| 2025-08-02 |
Show or Tell? Modeling the evolution of request-making in Human-LLM conversations Shengqi Zhu, Jeffrey M. Rzeszotarski, David Mimno arXiv.org, 2025 |
| 2025-07-26 |
Are You There God? Lightweight Narrative Annotation of Christian Fiction with LMs Rebecca M. M. Hicke, Brian Haggard, Mia Ferrante, Rayhan Khanna, David Mimno arXiv.org, 2025 |
| 2025-05-20 |
Too Long, Didn't Model: Decomposing LLM Long-Context Understanding With Novels Sil Hamilton, Rebecca M. M. Hicke, Matthew Wilkens, David Mimno arXiv.org, 2025 |
| 2025-05-20 |
Cheaper, Better, Faster, Stronger: Robust Text-to-SQL without Chain-of-Thought or Fine-Tuning Yusuf Denizay Donder, Derek Hommel, Andrea Wen-Yi Wang, David Mimno, Unso Eun Seo Jo arXiv.org, 2025 |
| 2025-04-25 |
Shengqi Zhu, Jeffrey M. Rzeszotarski, David Mimno CHI Extended Abstracts, 2025 |
| 2025-04-08 |
The Zero Body Problem: Probing LLM Use of Sensory Language Rebecca M. M. Hicke, Sil Hamilton, David Mimno arXiv.org, 2025 |
| 2025-04-02 |
Tasks and Roles in Legal AI: Data Curation, Annotation, and Verification Allison Koenecke, Edward H. Stiglitz, David Mimno, Matthew Wilkens arXiv.org, 2025 |
| 2025-03-31 |
Endometriosis Communities on Reddit: Quantitative Analysis Federica Bologna, Rosamond Thalken, Kristen Pepin, Matthew Wilkens Journal of Medical Internet Research, 2025 |
| 2025-03-31 |
Do Chinese models speak Chinese languages? Andrea Wen-Yi Wang, Unso Eun Seo Jo, David Mimno arXiv.org, 2025 |
| 2025-02-26 |
A City of Millions: Mapping Literary Social Networks At Scale Sil Hamilton, Rebecca M. M. Hicke, David Mimno, Matthew Wilkens Proceedings of the 5th International Conference on Natural Language Processing for Digital Humanities, 2025 |
| 2025-02-05 |
Looking for the Inner Music: Probing LLMs' Understanding of Literary Style Rebecca M. M. Hicke, David M. Mimno Computational Humanities Research, 2025 |
| 2025-01-01 |
Lost in Space: Optimizing Tokens for Grammar-Constrained Decoding Sil Hamilton, David Mimno arXiv.org, 2025 |
| 2024-10-16 |
Context is Key(NMF): Modelling Topical Information Dynamics in Chinese Diaspora Media R. Kristensen-Mclachlan, Rebecca M. M. Hicke, Márton Kardos, M. Thunø Workshop on Computational Humanities Research, 2024 |
| 2024-10-13 |
Quilt: Custom UIs for Linking Unstructured Documents to Structured Datasets Pragya Kallanagoudar, Chithra Anand, Rolando Garcia, Rebecca M. M. Hicke, Aditya G. Parameswaran, Eunice Jun, Sarah E. Chasins ACM Symposium on User Interface Software and Technology, 2024 |
| 2024-10-11 |
SCIENCE IS EXPLORATION: Computational Frontiers for Conceptual Metaphor Theory Rebecca M. M. Hicke, R. Kristensen-Mclachlan Workshop on Computational Humanities Research, 2024 |
| 2024-09-29 |
Judicial self fashioning: Rhetorical performance in Supreme Court opinions Rosamond Thalken, David M. Mimno, Matthew Wilkens Discourse Studies, 2024 |
| 2024-09-17 |
Says Who? Effective Zero-Shot Annotation of Focalization Rebecca M. M. Hicke, Yuri Bizzoni, P. Moreira, R. Kristensen-Mclachlan arXiv.org, 2024 |
| 2024-08-01 |
Kristen Pepin, Federica Bologna, Rosamond Thalken, Matthew Wilkens Journal of minimally invasive gynecology, 2024 |
| 2024-07-17 |
Andrea Wen-Yi Wang, Kathryn Adamson, Nathalie Greenfield, Rachel Goldberg, Sandra Babcock, David Mimno, Allison Koenecke AAAI/ACM Conference on AI, Ethics, and Society, 2024 |
| 2024-07-12 |
How Chinese are Chinese Language Models? The Puzzling Lack of Language Policy in China's LLMs Andrea Wen-Yi Wang, Unso Eun Seo Jo, Lu Jia Lin, David Mimno arXiv.org, 2024 |
| 2024-07-02 |
What We Talk About When We Talk About LMs: Implicit Paradigm Shifts and the Ship of Language Models Shengqi Zhu, Jeffrey M. Rzeszotarski North American Chapter of the Association for Computational Linguistics, 2024 |
| 2024-02-29 |
Endometriosis Online Communities: A Quantitative Analysis Federica Bologna, Rosamond Thalken, Kristen Pepin, Matthew Wilkens medRxiv, 2024 |
| 2024-01-31 |
[Lions: 1] and [Tigers: 2] and [Bears: 3], Oh My! Literary Coreference Annotation with LLMs Rebecca M. M. Hicke, David M. Mimno LATECHCLFL, 2024 |
| 2024-01-14 |
The Afterlives of Shakespeare and Company in Online Social Readership Maria Antoniak, David M. Mimno, Rosamond Thalken, Melanie Walsh, Matthew Wilkens, Gregory Yauney Journal of Cultural Analytics, 2024 |
| 2024-01-01 |
Contextualized Topic Coherence Metrics Hamed Rahimi, David M. Mimno, Jacob Louis Hoover, Hubert Naacke, Camélia Constantin, Bernd Amann Findings, 2024 |
| 2024-01-01 |
Shengqi Zhu, Jeffrey M. Rzeszotarski Findings of the Association for Computational Linguistics ACL 2024 |
| 2023-11-29 |
Hyperpolyglot LLMs: Cross-Lingual Interpretability in Token Embeddings Andrea Wen-Yi Wang, David Mimno Conference on Empirical Methods in Natural Language Processing, 2023 |
| 2023-11-15 |
Data Similarity is Not Enough to Explain Language Model Performance Gregory Yauney, Emily Reif, David M. Mimno Conference on Empirical Methods in Natural Language Processing, 2023 |
| 2023-10-27 |
Modeling Legal Reasoning: LM Annotation at the Edge of Human Agreement Rosamond Thalken, Edward H. Stiglitz, David M. Mimno, Matthew Wilkens Conference on Empirical Methods in Natural Language Processing, 2023 |
| 2023-10-27 |
T5 meets Tybalt: Author Attribution in Early Modern English Drama Using Large Language Models Rebecca M. M. Hicke, David M. Mimno Workshop on Computational Humanities Research, 2023 |
| 2023-07-07 |
Deep distant reading: The rise of realism in Scandinavian literature as a case study Jens Bjerring-Hansen, Matthew Wilkens Orbis Litterarum, 2023 |
| 2023-07-06 |
Establishing Connectivity and Trust in High Schools During COVID-19 Lisa De Leon, Matthew Wilkens The Scholarship Without Borders Journal, 2023 |
| 2023-05-28 |
More than Classification: A Unified Framework for Event Temporal Relation Extraction Quzhe Huang, Yutong Hu, Shengqi Zhu, Yansong Feng, Chang Liu, Dongyan Zhao Annual Meeting of the Association for Computational Linguistics, 2023 |
| 2023-05-27 |
Grounding Characters and Places in Narrative Text Sandeep Soni, Amanpreet Sihra, Elizabeth F. Evans, Matthew Wilkens, David Bamman Annual Meeting of the Association for Computational Linguistics, 2023 |
| 2023-05-22 |
S. Longpre, Gregory Yauney, Emily Reif, Katherine Lee, Adam Roberts, Barret Zoph, Denny Zhou, Jason Wei, Kevin Robinson, David M. Mimno, Daphne Ippolito North American Chapter of the Association for Computational Linguistics, 2023 |
| 2023-01-23 |
Sensemaking About Contraceptive Methods Across Online Platforms LeAnn McDowall, Maria Antoniak, David M. Mimno International Conference on Web and Social Media, 2023 |
| 2023-01-01 |
MultiHATHI: A Complete Collection of Multilingual Prose Fiction in the HathiTrust Digital Library S. Hamilton, Andrew Piper Journal of Open Humanities Data, 2023 |
| 2023-01-01 |
Mrs. Dalloway Said She Would Segment the Chapters Herself Peiqi Sui, Lin Wang, S. Hamilton, Thorsten Ries, Kelvin Wong, Stephen Wong WNU, 2023 |
| 2023-01-01 |
Large Language Models and NER: better results with less work Rosamond Thalken, Matthew Wilkens, David M. Mimno Digital Humanities Conference, 2023 |
| 2023-01-01 |
The Chatbot and the Canon: Poetry Memorization in LLMs Lyra D'Souza, David Mimno Workshop on Computational Humanities Research, 2023 |
| 2022-10-13 |
The COVID That Wasn’t: Counterfactual Journalism Using GPT S. Hamilton, Andrew Piper LATECHCLFL, 2022 |
| 2022-10-07 |
Breaking BERT: Evaluating and Optimizing Sparsified Attention Siddhartha Brahma, Polina Zablotskaia, David M. Mimno arXiv.org, 2022 |
| 2022-10-05 |
Jacob Eisenstein, D. Andor, Bernd Bohnet, Michael Collins, David M. Mimno arXiv.org, 2022 |
| 2022-10-01 |
Rebecca M. M. Hicke, Maanya Goenka, E. Alexander 2022 IEEE 7th Workshop on Visualization for the Digital Humanities (VIS4DH), 2022 |
| 2022-04-17 |
Does Recommend-Revise Produce Reliable Annotations? An Analysis on Missing Instances in DocRED Quzhe Huang, Shibo Hao, Yuan Ye, Shengqi Zhu, Yansong Feng, Dongyan Zhao Annual Meeting of the Association for Computational Linguistics, 2022 |
| 2021-11-12 |
On-the-fly Rectification for Robust Large-Vocabulary Topic Inference Moontae Lee, Sungjun Cho, Kun Dong, David M. Mimno, D. Bindel International Conference on Machine Learning, 2021 |
| 2021-10-05 |
Federica Bologna, A. Iorio, S. Peroni, Francesco Poggi Quantitative Science Studies, 2021 |
| 2021-09-22 |
‘Tecnologica cosa’: Modeling Storyteller Personalities in Boccaccio’s ‘Decameron’ A. Feder Cooper, Maria Antoniak, Christopher De Sa, Marilyn Migiel, David M. Mimno LATECHCLFL, 2021 |
| 2021-09-15 |
Comparing Text Representations: A Theory-Driven Approach Gregory Yauney, David M. Mimno Conference on Empirical Methods in Natural Language Processing, 2021 |
| 2021-07-09 |
Reply to Linton: Perspectival interference up close Jorge Morales, Axel Bax, C. Firestone Proceedings of the National Academy of Sciences of the United States of America, 2021 |
| 2021-06-30 |
Too isolated, too insular: American Literature and the World Matthew Wilkens Journal of Cultural Analytics, 2021 |
| 2021-06-10 |
Academics evaluating academics: a methodology to inform the review process on top of open citations Federica Bologna, A. Iorio, S. Peroni, Francesco Poggi arXiv.org, 2021 |
| 2021-06-03 |
Exploring Distantly-Labeled Rationales in Neural Network Models Quzhe Huang, Shengqi Zhu, Yansong Feng, Dongyan Zhao Annual Meeting of the Association for Computational Linguistics, 2021 |
| 2021-06-03 |
Three Sentences Are All You Need: Local Path Enhanced Document Relation Extraction Quzhe Huang, Shengqi Zhu, Yansong Feng, Yuan Ye, Yuxuan Lai, Dongyan Zhao Annual Meeting of the Association for Computational Linguistics, 2021 |
| 2021-06-01 |
Alicia Eads, Alexandra Schofield, Fauna Mahootian, David M. Mimno, Rens Wilderom Poetics, 2021 |
| 2021-05-18 |
Federica Bologna, A. Iorio, S. Peroni, Francesco Poggi arXiv.org, 2021 |
| 2021-03-14 |
Federica Bologna, A. Iorio, S. Peroni, Francesco Poggi Scientometrics, 2021 |
| 2021-01-01 |
Federica Bologna, A. Iorio, S. Peroni, Francesco Poggi arXiv.org, 2021 |
| 2020-10-30 |
Finding Domain-Specific Grounding in Noisy Visual-Textual Documents Gregory Yauney, Jack Hessel, David M. Mimno Conference on Empirical Methods in Natural Language Processing, 2020 |
| 2020-10-23 |
Topic Modeling with Contextualized Word Representation Clusters Laure Thompson, David M. Mimno arXiv.org, 2020 |
| 2020-07-08 |
Like Two Pis in a Pod: Author Similarity Across Time in the Ancient Greek Corpus G. Storey, David M. Mimno |
| 2020-06-12 |
Sustained representation of perspectival shape Jorge Morales, Axel Bax, C. Firestone Proceedings of the National Academy of Sciences of the United States of America, 2020 |
| 2020-02-05 |
R. Griffin, Maria Antoniak, P. D. Mac, Vladimir N. Kramskiy, S. Waldman, David M. Mimno Frontiers in Neuroscience, 2020 |
| 2020-01-11 |
Making, Preserving, and Curating Born-Digital Literature Anastasia Salter, Marjorie C. Luesebrink, Dene Grigar, Leonardo Flores, Julian Ankney, Nicholas Binford, Kathryn Manis, Ricardo A. Ramirez, Troy Rowden, R. Snyder, Rosamond Thalken, N. Idris |
| 2020-01-01 |
Matthew L. Jockers, Rosamond Thalken Quantitative Methods in the Humanities and Social Sciences, 2020 |
| 2020-01-01 |
Network Analysis Finds Shifts in the History of Modern Architecture Gregory Yauney, David M. Mimno Digital Humanities Conference, 2020 |
| 2020-01-01 |
Prior-aware Composition Inference for Spectral Topic Models Moontae Lee, D. Bindel, David M. Mimno International Conference on Artificial Intelligence and Statistics, 2020 |
| 2020-01-01 |
First Foray into Text Analysis with R Matthew L. Jockers, Rosamond Thalken Text Analysis with R, 2020 |
| 2020-01-01 |
Replication and Computational Literary Studies Christof Schöch, K. Dalen-Oskam, Maria Antoniak, Fotis Jannidis, David M. Mimno Digital Humanities Conference, 2020 |
| 2020-01-01 |
Accessing and Comparing Word Frequency Data Matthew L. Jockers, Rosamond Thalken Text Analysis with R, 2020 |
| 2020-01-01 |
Constructing and Analyzing Short Science Fiction at Scale Laure Thompson, David M. Mimno Digital Humanities Conference, 2020 |
| 2019-11-07 |
Narrative Paths and Negotiation of Power in Birth Stories Maria Antoniak, David M. Mimno, K. Levy Proc. ACM Hum. Comput. Interact., 2019 |
| 2019-11-01 |
Practical Correlated Topic Modeling and Analysis via the Rectified Anchor Word Algorithm Moontae Lee, Sungjun Cho, D. Bindel, David M. Mimno Conference on Empirical Methods in Natural Language Processing, 2019 |
| 2019-07-02 |
How We Do Things With Words: Analyzing Text as Social and Cultural Data D. Nguyen, Maria Liakata, S. Dedeo, Jacob Eisenstein, David Mimno, Rebekah Tromble, J. Winters Frontiers in Artificial Intelligence, 2019 |
| 2019-07-01 |
Boosted negative sampling by quadratically constrained entropy maximization Taygun Kekeç, David M. Mimno, D. Tax Pattern Recognition Letters, 2019 |
| 2019-01-01 |
Unsupervised Discovery of Multimodal Links in Multi-image, Multi-sentence Documents Jack Hessel, Lillian Lee, David M. Mimno Conference on Empirical Methods in Natural Language Processing, 2019 |