Portfolio item number 1
Short description of portfolio item number 1
Portfolio item number 2
Short description of portfolio item number 2
Published in The 2022 Conference on Empirical Methods in Natural Language Processing, 2022
This paper addresses the challenge of knowledge conflicts that arise when models have access to rich, diverse knowledge sources. We propose methods for recalibrating models to appropriately handle conflicting evidence and reflect uncertainty in their predictions.
Recommended citation: Hung-Ting Chen, Michael J.Q. Zhang, Eunsol Choi. (2022). "Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating Models to Reflect Conflicting Evidence." The 2022 Conference on Empirical Methods in Natural Language Processing.
Download Paper
Published in The 2023 Conference on Empirical Methods in Natural Language Processing, 2023
We study continually improving an extractive question answering (QA) system via human user feedback. We design and deploy an iterative approach in which information-seeking users ask questions, receive model-predicted answers, and provide feedback. We conduct experiments involving thousands of user interactions under diverse setups to broaden the understanding of learning from feedback over time. Our experiments show that extractive QA models improve effectively from user feedback over time across different data regimes, with significant potential for domain adaptation.
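To make the interaction cycle concrete, here is a minimal sketch of the deploy-collect-retrain loop described above. It is an illustration under assumed interfaces (`model.predict`, `collect_feedback`, and `train` are hypothetical stand-ins), not the authors' implementation.

```python
# Minimal sketch of the iterative feedback loop (hypothetical interfaces,
# not the authors' implementation).

def collect_feedback(question, answer):
    """Hypothetical stand-in: in deployment, the information-seeking user
    rates the predicted answer (e.g., correct / partially correct / wrong)."""
    raise NotImplementedError("replace with a real feedback channel")

def feedback_loop(model, user_questions_per_round, train, n_rounds=5):
    """Alternate between serving users and updating the model on their feedback."""
    feedback_log = []
    for round_id in range(n_rounds):
        # Deploy: answer each user question and log the user's feedback.
        for question in user_questions_per_round(round_id):
            answer = model.predict(question)
            rating = collect_feedback(question, answer)
            feedback_log.append((question, answer, rating))
        # Improve: retrain or fine-tune on all feedback gathered so far.
        model = train(model, feedback_log)
    return model
```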
Recommended citation: Ge Gao*, Hung-Ting Chen*, Yoav Artzi, Eunsol Choi. (2023). "Continually Improving Extractive QA via Human Feedback." The 2023 Conference on Empirical Methods in Natural Language Processing.
Download Paper
Published in Conference on Language Modeling, 2024
This paper provides a comprehensive analysis of retrieval augmentation for long-form question answering. We examine how retrieval augmentation affects model performance across different types of questions and identify key factors that determine its effectiveness.
Recommended citation: Hung-Ting Chen, Fangyuan Xu*, Shane A. Arora*, Eunsol Choi. (2024). "Understanding Retrieval Augmentation for Long-Form Question Answering." Conference on Language Modeling 2024.
Download Paper
Published in The 63rd Annual Meeting of the Association for Computational Linguistics, 2025
This paper introduces CaLMQA, a benchmark for evaluating long-form question answering systems on culturally specific questions across 23 languages. We analyze how different models handle cultural nuances and language-specific knowledge.
Recommended citation: Shane Arora*, Marzena Karpinska*, Hung-Ting Chen, Ipsita Bhattacharjee, Mohit Iyyer, Eunsol Choi. (2025). "CaLMQA: Exploring culturally specific long-form question answering across 23 languages." The 63rd Annual Meeting of the Association for Computational Linguistics.
Download Paper
Published in 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics, 2025
We study retrieving a set of documents that covers various perspectives on a complex and contentious question (e.g., will ChatGPT do more harm than good?). We curate a Benchmark for Retrieval Diversity for Subjective questions (BERDS), where each example consists of a question and diverse perspectives associated with the question, sourced from survey questions and debate websites. On this data, retrievers paired with a corpus are evaluated on their ability to surface a document set that contains diverse perspectives. Our framing diverges from most retrieval tasks in that document relevancy cannot be decided by simple string matches to references. Instead, we build a language model-based automatic evaluator that decides whether each retrieved document contains a perspective. This allows us to evaluate three different types of corpora (Wikipedia, a web snapshot, and a corpus constructed on the fly from pages returned by a search engine) paired with retrievers. Retrieving diverse documents remains challenging, with the outputs of existing retrievers covering all perspectives on only 33.74% of the examples. We further study the impact of query expansion and diversity-focused reranking approaches and analyze retriever sycophancy. Together, we lay the foundation for future studies of retrieval diversity for complex queries.
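To make the evaluation protocol concrete, here is a minimal sketch of BERDS-style coverage scoring. It assumes a hypothetical `llm_judge(document, perspective)` function standing in for the paper's language model-based evaluator; this is an illustration, not the authors' released code.

```python
# Minimal sketch of perspective-coverage evaluation (hypothetical judge,
# not the authors' released code).

from typing import Callable, Dict, List


def llm_judge(document: str, perspective: str) -> bool:
    """Hypothetical stand-in for the LM-based evaluator: should return True
    if `document` expresses `perspective`. Replace with a real LLM call
    (e.g., a yes/no prompt over the document-perspective pair)."""
    raise NotImplementedError("plug in an LLM client here")


def covers_all_perspectives(docs: List[str], perspectives: List[str],
                            judge: Callable[[str, str], bool]) -> bool:
    """An example counts as covered only if every perspective appears
    in at least one retrieved document."""
    return all(any(judge(doc, p) for doc in docs) for p in perspectives)


def coverage_rate(examples: List[Dict], retrieve: Callable[[str], List[str]],
                  judge: Callable[[str, str], bool] = llm_judge) -> float:
    """Fraction of examples whose retrieved document set covers every
    perspective (the paper reports 33.74% for existing retrievers)."""
    covered = sum(
        covers_all_perspectives(retrieve(ex["question"]), ex["perspectives"], judge)
        for ex in examples
    )
    return covered / len(examples)
```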
Recommended citation: Hung-Ting Chen, Eunsol Choi. (2025). "Open-World Evaluation for Retrieving Diverse Perspectives." 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics.
Download Paper
Published:
This is a description of your talk, which is a markdown file that can be all markdown-ified like any other post. Yay markdown!
Published:
This is a description of your conference proceedings talk, note the different field in type. You can put anything in this field.
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.