Books

  • Bird, S., Klein, E., & Loper, E. (2009). Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit NLTK.
  • Glass, L., Dickinson, M., Brew, C., & Meurers, D. (2024). Language and Computers (2nd edition) LC.

Papers

1. NLP-humanities

  • Reddy, S., & Knight, K. (2011). What we know about the Voynich manuscript. Proceedings of the 5th ACL-HLT workshop on language technology for cultural heritage, social sciences, and humanities (pp. 78-86). PDF
  • Kao, J., & Jurafsky, D. (2012). A computational analysis of style, affect, and imagery in contemporary poetry. Proceedings of the NAACL-HLT 2012 workshop on computational linguistics for literature (pp. 8-17). PDF
  • Brooke, J., Hammond, A., & Hirst, G. (2015). GutenTag: an NLP-driven tool for digital humanities research in the Project Gutenberg corpus. Proceedings of the fourth workshop on computational linguistics for literature (pp. 42-47). PDF
  • Schofield, A., & Mehr, L. (2016). Gender-distinguishing features in film dialogue. Proceedings of the Fifth Workshop on Computational Linguistics for Literature (pp. 32-39). PDF
  • Öhman, E., Bizzoni, Y., Moreira, P. F., & Nielbo, K. (2024). EmotionArcs: Emotion arcs for 9,000 literary texts. Proceedings of the 8th joint SIGHUM workshop on computational linguistics for cultural heritage, social sciences, humanities and literature (LaTeCH-CLfL 2024) (pp. 51-66). PDF
  • Luo, Y., Jurafsky, D., & Levin, B. (2019). From insanely jealous to insanely delicious: Computational models for the semantic bleaching of English intensifiers. Proceedings of the 1st international workshop on computational approaches to historical language change (pp. 1-13). PDF
  • Tapo, A. A., Coulibaly, N., Diallo, S., Diarra, S., Homan, C. M., Keita, M. K., & Leventhal, M. (2025). GAIfE: Using GenAI to Improve Literacy in Low-resourced Settings. Findings of the Association for Computational Linguistics: NAACL 2025 (pp. 7914-7929). PDF

2. NLP-social sciences

  • Voigt, R., Camp, N. P., Prabhakaran, V., Hamilton, W. L., Hetey, R. C., Griffiths, C. M., … & Eberhardt, J. L. (2017). Language from police body camera footage shows racial disparities in officer respect. Proceedings of the National Academy of Sciences, 114(25), 6521-6526. PDF
  • Horne, B., & Adali, S. (2017). This just in: Fake news packs a lot in title, uses simpler, repetitive content in text body, more similar to satire than real news. Proceedings of the International AAAI Conference on Web and Social Media (Vol. 11, No. 1, pp. 759-766). PDF
  • Demszky, D., Garg, N., Voigt, R., Zou, J., Shapiro, J., Gentzkow, M., & Jurafsky, D. (2019). Analyzing Polarization in Social Media: Method and Application to Tweets on 21 Mass Shootings. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 2970-3005). PDF
  • Luo, Y., Gligorić, K., & Jurafsky, D. (2024). Othering and low status framing of immigrant cuisines in US restaurant reviews and large language models. Proceedings of the International AAAI Conference on Web and Social Media (Vol. 18, pp. 985-998). PDF
  • Liu, Z., Huang, D., Huang, K., Li, Z., & Zhao, J. (2021). FinBERT: A pre-trained financial language representation model for financial text mining. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (pp. 4513-4519). PDF

3. NLP-language studies

  • Sung, H., Csuros, K., & Sung, M. C. (2025). Comparing human and LLM proofreading in L2 writing: Impact on lexical and syntactic features. arXiv preprint arXiv:2506.09021. PDF
  • Bannò, S., Knill, K., & Gales, M. (2025). Exploiting the English Vocabulary Profile for L2 word-level vocabulary assessment with LLMs. arXiv preprint arXiv:2506.02758. PDF
  • Han, J., & Choi, J. D. (2025). Beyond Linear Digital Reading: An LLM-Powered Concept Mapping Approach for Reducing Cognitive Load. PDF
  • Rezayi, S., Ha, L. A., Zhou, Y., Houriet, A., D’Addario, A., Baldwin, P., Harik, P., … & Yaneva, V. (2025). Automated Scoring of Communication Skills in Physician-Patient Interaction: Balancing Performance and Scalability. PDF
  • Schmalz, V. J., & Tack, A. (2025). Can GPTZero’s AI Vocabulary Distinguish Between LLM-Generated and Student-Written Essays?. PDF

4. NLP-investigation on LLMs

  • Bender, E. M., & Koller, A. (2020). Climbing towards NLU: On meaning, form, and understanding in the age of data. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 5185-5198). PDF
  • Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big?🦜. Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (pp. 610-623). PDF
  • Gururangan, S., Card, D., Dreier, S., Gade, E., Wang, L., Wang, Z., … & Smith, N. A. (2022). Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (pp. 2562-2580). PDF
  • Lappin, S. (2024). Assessing the strengths and weaknesses of large language models. Journal of Logic, Language and Information, 33(1), 9-20. PDF

Notes. These are advanced academic papers, so it’s normal not to understand everything. Focus on the key sections and approach them with a growth mindset. You may choose an article that is not on this list, but please talk to me first.