WU JUNCHAO

Research Interests

  • Trustworthy AI: LLM-Generated Text Detection, Hallucination Detection, Pre-training Data Detection
  • Large Language Models: LLM Evaluation, Data Selection for LLM Instruction Tuning
  • Machine Translation: Low-Resource, LLM-based MT
  • Other Interests: Knowledge Graphs and Ontology, Public Opinion Analysis

News

  • [2025-03-04] 🔥 We’re excited to announce a shared task at NLPCC 2025: LLM-Generated Text Detection! This task focuses on Chinese and offers a great opportunity to explore findings that differ from English detection. Join us and create your own detector now! 🚀
  • [2024-11-31] One paper accepted by COLING 2025 as first author: Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore. We introduce a novel black-box zero-shot method for detecting LLM-generated text, called GECScore 🔍. The method is simple yet effective, based on the observation that LLMs tend to make more grammatical corrections to human-written text than to LLM-generated text.
  • [2024-11-28] One paper accepted by Computational Linguistics (CL) as first author: A Survey on LLM-Generated Text Detection: Necessity, Methods, and Future Directions. This survey offers a comprehensive overview of the latest advancements in LLM-generated text detection, highlighting the urgent need for more robust methods. It reviews mainstream approaches, addresses key challenges, and outlines promising future research directions. The paper serves as both a clear introduction for newcomers and a valuable resource for experts seeking updates in the field.
  • [2024-09-27] One paper accepted by NeurIPS 2024 D&B Track as first author: DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios. We introduce a new benchmark, DetectRL, which covers multiple realistic scenarios, including the use of various prompts, human revision of LLM-generated text, adversarial spelling errors, and deliberate attacks on detectors. It offers real utility to researchers on the topic and to practitioners looking for consistent evaluation methods.

Publications

2024

Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore

Junchao Wu, Runzhe Zhan, Derek F. Wong*, Shu Yang, Xuebo Liu, Lidia S. Chao, Min Zhang

COLING 2025

[PAPER] [CODE]

A Survey on LLM-Generated Text Detection: Necessity, Methods, and Future Directions

Junchao Wu, Shu Yang, Runzhe Zhan, Yulin Yuan*, Derek F. Wong*, Lidia S. Chao

Computational Linguistics 2024

[PAPER] [CODE]

DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios

Junchao Wu, Runzhe Zhan, Derek F. Wong*, Shu Yang, Xinyi Yang, Yulin Yuan, Lidia S. Chao

NeurIPS 2024 (Datasets and Benchmarks Track)

[PAPER] [CODE]

2023

Human-in-the-loop Machine Translation with Large Language Model

Xinyi Yang, Runzhe Zhan, Derek F. Wong*, Junchao Wu, Lidia S. Chao

Proceedings of Machine Translation Summit XIX, Vol. 2: Users Track 2023

[PAPER] [CODE]

2021

“The Canton Canon” Digital Library Based on Knowledge Graph - Taking the Revolutionary Archives of Canton in the Republic of China as an Example

Junchao Wu*, Ying Jiang, Xin Chen, Lingyu Guo, Xiaotong Wei, Xiaoyan Yang

ICEIT 2021 (2021 IEEE 10th International Conference on Educational and Information Technology)

[PAPER] [CODE]
Best Oral Presentation Award (5/57)


Services

  • Conference Reviewer: ACL ARR, ICML, ICLR, NeurIPS, CCL, NLPCC
  • Journal Reviewer: Proceedings of the IEEE, TALLIP, TIST
  • Student Volunteer: MT Summit 2023
  • Teaching Assistant
    • Computational Linguistics (MSc) programme (2023 Fall)
    • AHGC7315 Language and Linguistics (2023 Spring)

Awards

  • Outstanding Graduate of Beijing Normal University, Zhuhai, 2022
  • The Outstanding Graduation Thesis of Beijing Normal University, Zhuhai, 2022
  • Professional Scholarship of Beijing Normal University, Zhuhai, 2019, 2020, 2021
  • Academic Scholarship of Beijing Normal University, Zhuhai, 2021
  • Outstanding Student Cadre Honor of Beijing Normal University, Zhuhai, 2019, 2020
  • National Second Prize & Guangdong First Prize in the 7th Teddy Cup Data Mining Challenge (Title: Analysis of Safe Driving Behavior of Transport Vehicles), 2019
  • First Prize in China University Students Mathematical Contest in Modeling, Guangdong Division (Title: Credit Decisions of Small, Medium and Micro Enterprises), 2020
  • Best Oral Presentation Award at the 10th IEEE International Conference on Educational and Information Technology, ICEIT 2021
  • Third Prize in the 8th “Challenge Cup” College Students’ Extracurricular Academic Science and Technology Works Competition in Beijing Normal University, Zhuhai, 2020
  • Third Prize in the 2nd Guangdong-Hong Kong-Macau College Student Public Administration Data Analysis Contest, 2022

Professional skills

  • English Level: IELTS (6.5), CET-6 (507)
  • Programming Languages: C, Python, Java, JavaScript, SQL, Bash
  • Development Frameworks: SpringBoot, React.js, Flask
  • Deep Learning Tools: Scikit-learn, PyTorch, Fairseq, Transformers
  • DB Tools: Neo4j, MySQL, Oracle DB
  • Other Tools: WebLogic, Apache Ant, Jena, Git

Others

  • 🎓 If you are also interested in natural language processing and machine translation, you can follow my lab and its members: NLP2CT LAB. I love them.
  • Zhan Runzhe is my current advisor, who focuses on machine translation and LLMs. He is a brilliant scientist and one of the nicest people I have ever met.
  • 🌈 Chen Xin is one of my best friends. He currently works on Fintech and NLP at the University of Southampton and has many interesting dreams.
