WU JUNCHAO

Research Interests

  • Regulatable & Trustworthy AI & Safety: LLM-Generated Text Detection; Hallucination Detection & Mitigation; Internal Interpretability of LLMs; Cybersecurity Ability of LLMs
  • Machine Translation: Low-Resource, LLM-based MT (Document-Level; Culture; Idioms)
  • Other Interests: Knowledge Graphs and Ontology, Public Opinion Analysis

News

  • [2025-06-27] One evaluation paper accepted by the CCL 2025 Evaluation Workshop as co-author: Overview of CCL25-Eval Task 4: Factivity Inference Evaluation 2025. We will upload the paper to arXiv soon.
  • [2025-06-13] One evaluation paper accepted by NLPCC 2025 as first author: Overview of the NLPCC 2025 Shared Task 1: LLM-Generated Text Detection. We provide a comprehensive overview of the task, including the dataset, task design, evaluation results, and an in-depth analysis of the submitted solutions.
  • [2025-05-16] Two papers accepted by ACL 2025 Findings as co-author. Congratulations to Shu🌳, Xinyi, and all co-authors!
  • [2025-04-03] Our new preprint, Understanding Aha Moments: From External Observations to Internal Mechanisms, has been released. We reveal that “aha moments” help LRMs solve complex problems by using anthropomorphic language, adjusting uncertainty, and preventing reasoning collapse, while internally balancing self-reflection with reasoning and adapting to task difficulty.
  • [2025-03-04] 🔥 We’re excited to announce a shared task at NLPCC 2025: LLM-Generated Text Detection! This task focuses on Chinese and offers a great opportunity to explore findings that differ from English detection. Join us and create your own detector now! 🚀
  • [2024-11-31] One paper accepted by COLING 2025 as first author: Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore. We introduce a novel black-box zero-shot method for detecting LLM-generated text, called GECScore 🔍. This method is remarkably simple yet effective, based on the observation that LLMs tend to make more grammatical corrections to human-written text than to LLM-generated text.
  • [2024-11-28] One paper accepted by Computational Linguistics (CL) as first author: A Survey on LLM-Generated Text Detection: Necessity, Methods, and Future Directions. This survey offers a comprehensive overview of the latest advancements in LLM-generated text detection, highlighting the urgent need for more robust methods. It reviews mainstream approaches, addresses key challenges, and outlines promising future research directions. The paper serves as both a clear introduction for newcomers and a valuable resource for experts seeking updates in the field.
  • [2024-09-27] One paper accepted by NeurIPS 2024 D&B Track as first author: DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios. We introduce a new benchmark, DetectRL, which covers multiple realistic scenarios, including the use of various prompts, human revisions of LLM-generated text, adversarial spelling errors, and attacks against detectors. It offers real utility to researchers on the topic and to practitioners looking for consistent evaluation methods.

Publications

2025

Is Long-to-Short a Free Lunch? Investigating Inconsistency and Reasoning Efficiency in LRMs

Shu Yang, Junchao Wu, Xuansheng Wu, Derek F. Wong, Ninghao Liu, Di Wang*

arXiv (Under Review) 2025

[PAPER] [CODE]

Overview of CCL25-Eval Task 4: Factivity Inference Evaluation 2025

Guanliang Cong, Junchao Wu, Yang Chen, Tianqi Xun, Derek F. Wong, Bin Li, Yulin Yuan*

CCL Evaluation Workshop 2025

[PAPER] [CODE]

Overview of the NLPCC 2025 Shared Task 1: LLM-Generated Text Detection

Junchao Wu, Runzhe Zhan, Qianli Wang, Yulin Yuan, Lidia S. Chao, Derek F. Wong*

NLPCC 2025

[PAPER] [CODE]

Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore

Junchao Wu, Runzhe Zhan, Derek F. Wong*, Shu Yang, Xuebo Liu, Lidia S. Chao, Min Zhang

COLING 2025

[PAPER] [CODE]

Understanding Aha Moments: From External Observations to Internal Mechanisms

Shu Yang, Junchao Wu, Xin Chen, Yunze Xiao, Xinyi Yang, Derek F. Wong, Di Wang*

arXiv (Under Review) 2025

[PAPER] [CODE]

Rethinking Prompt-based Debiasing in Large Language Models

Xinyi Yang, Runzhe Zhan, Derek F. Wong*, Shu Yang, Junchao Wu, Lidia S. Chao

ACL Findings 2025

[PAPER] [CODE]

Fraud-R1: A Multi-Round Benchmark for Assessing the Robustness of LLM Against Augmented Fraud and Phishing Inducements

Shu Yang, Shenzhe Zhu, Zeyu Wu, Keyu Wang, Junchi Yao, Junchao Wu, Lijie Hu, Mengdi Li, Derek F. Wong, Di Wang*

ACL Findings 2025

[PAPER] [CODE]

2024

A Survey on LLM-Generated Text Detection: Necessity, Methods, and Future Directions

Junchao Wu, Shu Yang, Runzhe Zhan, Yulin Yuan*, Derek F. Wong*, Lidia S. Chao

Computational Linguistics 2024

[PAPER] [CODE]

DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios

Junchao Wu, Runzhe Zhan, Derek F. Wong*, Shu Yang, Xinyi Yang, Yulin Yuan, Lidia S. Chao

NeurIPS 2024

[PAPER] [CODE]

2023

Human-in-the-loop Machine Translation with Large Language Model

Xinyi Yang, Runzhe Zhan, Derek F. Wong*, Junchao Wu, Lidia S. Chao

Proceedings of Machine Translation Summit XIX, Vol. 2: Users Track 2023

[PAPER] [CODE]

2021

“The Canton Canon” Digital Library Based on Knowledge Graph - Taking the Revolutionary Archives of Canton in the Republic of China as an Example

Junchao Wu*, Ying Jiang, Xin Chen, Lingyu Guo, Xiaotong Wei, Xiaoyan Yang

ICEIT (2021 IEEE 10th International Conference on Educational and Information Technology) 2021

[PAPER] [CODE]
Best Oral Presentation Award (5/57)


Services

  • Conference Reviewer: ACL ARR, ICML, ICLR, NeurIPS, CCL, NLPCC
  • Journal Reviewer: Proc. IEEE, IEEE TIFS, ACM TALLIP, ACM TIST
  • Student Volunteer: MT Summit 2023
  • Teaching Assistant
    • Computational Linguistics (MSc) programme (2023 Fall)
    • AHGC7315 Language and Linguistics (2023 Spring)

Experience

  • Alibaba Group - Research Intern (Jul. 2025 - Present)
  • PRADA Lab, KAUST - Visiting Intern (Mar. 2025 - Jun. 2025)

Professional skills

  • English Level: IELTS (6.5), CET-6 (507)
  • Programming Languages: C, Python, Java, JavaScript, SQL, Bash
  • Development Frameworks: SpringBoot, React.js, Flask
  • Deep Learning Tools: Scikit-learn, PyTorch, Fairseq, Transformers
  • DB Tools: Neo4j, MySQL, Oracle DB
  • Other Tools: WebLogic, Apache Ant, Jena, Git

Others

  • 🎓 If you are also interested in natural language processing and machine translation, you can follow my lab and its members: NLP2CT LAB. I love them.
  • Zhan Runzhe is my current advisor, who focuses on machine translation and LLMs. He is a brilliant scientist and one of the nicest people I have ever met.
  • 🌈 Chen Xin is one of my best friends. He is an incoming PhD student at Nanjing University and is also committed to NLP research. He has many interesting dreams.
