publications

2025

  1. Under review
    MoReBench: Evaluating Procedural and Pluralistic Moral Reasoning in Language Models, More than Outcomes
    Yu Ying Chiu, Michael S. Lee, Rachel Calcott, Brandon Handoko, Paul Font-Reaulx, Paula Rodriguez, Chen Bo Calvin Zhang, Ziwen Han, Udari Madhushani Sehwag, Yash Maurya, Christina Q Knight, Harry R. Lloyd, Florence Bacus, Mantas Mazeika, Bing Liu, Yejin Choi, Mitchell L Gordon, and Sydney Levine
    2025
  2. Under review
    Language Matters: How Do Multilingual Input and Reasoning Paths Affect Large Reasoning Models?
    Zhi Rui Tam, Cheng-Kuang Wu, Yu Ying Chiu, Chieh-Yen Lin, Yun-Nung Chen, and Hung-yi Lee
    2025
  3. Under review
    Will AI Tell Lies to Save Sick Children? Litmus-Testing AI Values Prioritization with AIRiskDilemmas
    Yu Ying Chiu, Zhilin Wang, Sharan Maiya, Yejin Choi, Kyle Fish, Sydney Levine, and Evan Hubinger
    2025

2024

  1. ACL 2025
    CulturalBench: a Robust, Diverse and Challenging Benchmark on Measuring the (Lack of) Cultural Knowledge of LLMs Through Human-AI Red-Teaming
    Yu Ying Chiu, Liwei Jiang, Bill Yuchen Lin, Chan Young Park, Shuyue Stella Li, Sahithya Ravi, Mehar Bhatia, Maria Antoniak, Yulia Tsvetkov, Vered Shwartz, and Yejin Choi
    2024
  2. ICLR 2025 (Spotlight)
    DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life
    Yu Ying Chiu, Liwei Jiang, and Yejin Choi
    In International Conference on Learning Representations 2025 (Spotlight), 2024
  3. Under review
    A Computational Framework for Behavioral Assessment of LLM Therapists
    Yu Ying Chiu*, Ashish Sharma*, Inna Wanyin Lin, and Tim Althoff
    2024
  4. TACL 2024
    Filtered Corpus Training (FiCT) Shows that Language Models can Generalize from Indirect Evidence
    Abhinav Patil, Jaap Jumelet, Yu Ying Chiu, Andy Lapastora, Peter Shen, Lexie Wang, Clevis Willrich, and Shane Steinert-Threlkeld
    In Transactions of the Association for Computational Linguistics 2024, 2024
  5. Under review
    WildHallucinations: Evaluating Long-form Factuality in LLMs with Real-World Entity Queries
    Wenting Zhao, Tanya Goyal, Yu Ying Chiu, Liwei Jiang, Benjamin Newman, Abhilasha Ravichander, Khyathi Chandu, Ronan Le Bras, Claire Cardie, Yuntian Deng, and Yejin Choi
    2024

2023

  1. EMNLP 2023 System Demo
    Humanoid Agents: Platform for Simulating Human-like Generative Agents
    Zhilin Wang*, Yu Ying Chiu*, and Yu Cheung Chiu
    In Empirical Methods in Natural Language Processing 2023 (System Demonstrations), Dec 2023