HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY
GRADUATION THESIS
ScholarSleuth: A System for Fine-grained
AI-Generated Content Detection
T NGỌC MINH
minh.tn214918@sis.hust.edu.vn
Supervisor: Dr. Đinh Viết Sang
Signature
Program: Data Science and Artificial Intelligence (IT-E10)
School: School of Information and Communications Technology
HANOI, 06/2025
ACKNOWLEDGMENT
"If you pass by quiet Bách khoa street,
Bring late crêpe myrtles in summer heat.
Where friends once rushed night and day,
Our final exams take youth away..."
"Nếu em v qua phố nhỏ Bách khoa
Nhớ mang cho tôi chùm bằng lăng cuối hạ,
Nơi bạn đêm ngày hối hả
Mùa thi cuối cùng, mai hết tuổi sinh viên..."
This is my graduation thesis, marking the end of my four years at Hanoi University
of Science and Technology.
I want to express my gratitude to Dr. Đinh Viết Sang, my beloved supervisor,
who inspired me to select and complete the thesis. His mentorship extended far
beyond academic matters, influencing my career path and personal development.
Besides, I also extend my thanks to Professor Preslav Nakov, Dr. Yuxia Wang and
the research team at the Natural Language Processing department, Mohamed bin Za-
yed University of Artificial Intelligence, where I will pursue a PhD degree after
finishing my bachelor's, for their invaluable support during this entire time and
for the related research on this topic.
A heartfelt thanks goes to my family, who have been my "platinum sponsors" for
the past 22 years. Mom and Dad, thank you for your unconditional love and encour-
agement. Your patience, trust, and constant belief in me have been the foundation
of all my efforts and successes.
I also extend my thanks to my old colleagues at Viettel IT Center (formerly the IT
Division, Viettel Group HQ) for creating a favorable environment during my studies.
Particularly, I want to show my appreciation to Mr. Nguyễn Chí Thanh and Mr. Võ
Đức Quân for their valuable insights in my career path.
My sincere appreciation goes to the Ho Chi Minh Youth Union of SOICT and
Google Developer Group on Campus HUST. These organizations enriched my
university life, helping me grow personally and professionally while connecting
me with inspiring peers and friends.
To my close friends, Bình, Trường, Việt Anh and Dũng, thank you for your
companionship throughout the four years. Thanks also to my teammates at the Foundation
Model Lab, BKAI Research Center, Mỹ Anh, Đông, Trường, Anh Minh and Đức
Anh, for their valuable feedback and contributions that helped shape this thesis.
Finally, thank you, HUST, for being an unforgettable part of my life.
"When I have journeyed all life through,
My soul remains Bách khoa and true."
"Mai đây đi trọn đường đời,
Hồn tôi vẫn mãi người Bách khoa"
LỜI CẢM ƠN
"Nếu em v qua phố nhỏ Bách khoa
Nhớ mang cho tôi chùm bằng lăng cuối hạ,
Nơi bạn đêm ngày hối hả
Mùa thi cuối cùng, mai hết tuổi sinh viên..."
Bốn năm tại Bách khoa của tôi đang dần khép lại cùng với quyển ĐATN này.
Em xin chân thành cảm ơn thầy Đinh Viết Sang, người thầy hướng dẫn đã luôn
đồng hành, truyền cảm hứng, giúp em lựa chọn và hoàn thiện đề tài này. Sự dìu
dắt của thầy không chỉ giới hạn trong phạm vi học thuật mà còn ảnh hưởng sâu sắc
đến định hướng nghề nghiệp và quá trình trưởng thành của em. Em cũng xin gửi
lời cảm ơn đến GS. Preslav Nakov, TS. Yuxia Wang và nhóm nghiên cứu tại khoa
Xử lý ngôn ngữ tự nhiên, Đại học Mohamed bin Zayed về Trí tuệ Nhân tạo, nơi
em sẽ tiếp tục theo học chương trình tiến sĩ, vì những hỗ trợ quý báu trong suốt
thời gian qua và các nghiên cứu liên quan đến đề tài này.
Con xin bày tỏ lòng biết ơn sâu sắc đến gia đình, những "nhà tài trợ kim cương"
của con trong suốt 22 năm qua. Cảm ơn bố mẹ đã luôn yêu thương, động viên và
đặt niềm tin vào con. Sự kiên nhẫn và tin tưởng của bố mẹ chính là nền tảng vững
chắc cho mọi nỗ lực và thành công của con.
Em cũng xin gửi lời cảm ơn đến các anh chị đồng nghiệp tại Trung tâm CNTT
Viettel (trước đây là Ban CNTT Tập đoàn) đã tạo điều kiện thuận lợi cho em trong
suốt quá trình học tập và công tác tại đây. Đặc biệt, em cảm ơn anh Nguyễn Chí
Thanh và anh Võ Đức Quân vì những chia sẻ cho em về định hướng tương lai.
Chặng đường này của tôi được đồng hành bởi Đoàn Thanh niên - Hội Sinh viên
trường CNTT&TT và Google Developer Group on Campus HUST. Những tổ
chức này đã thêm màu sắc cho 4 năm đại học, cũng như giúp tôi trưởng thành
hơn trong cả chuyên môn lẫn kỹ năng xã hội.
Gửi đến những người bạn, Bình, Trường, Việt Anh và Dũng, cảm ơn các cậu
đã đồng hành cùng tớ suốt bốn năm qua. Cảm ơn các thành viên của Foundation
Model Lab, Trung tâm BKAI: Mỹ Anh, Đông, Trường, Anh Minh và Đức Anh vì
những góp ý quý giá đã góp phần hoàn thiện đồ án này.
Cuối cùng, cảm ơn Bách khoa đã trở thành một phần của thanh xuân.
"Mai đây đi trọn đường đời,
Hồn tôi vẫn mãi người Bách khoa"
ABSTRACT
The growing use of large language models in writing has made it increasingly
difficult to distinguish between human-written and AI-generated content, espe-
cially in academic contexts where authorship integrity is vital. As AI-assisted texts
become commonplace, there is a pressing need for reliable methods to assess the
extent of AI involvement, even when experts struggle to distinguish them.
While existing detection systems focus on binary classification, they fall short
in real-world scenarios. These approaches often fail to capture human-AI collab-
oration, struggle to generalize across domains and languages, and perform poorly
on unseen AI models.
This thesis introduces ScholarSleuth, a fine-grained detection framework for
nuanced authorship analysis. It comprises: (1) FAIDSet, a multilingual, multi-
domain dataset; (2) FAID, a novel model combining contrastive learning and multi-
task classification to detect human-written, AI-generated, and collaborative texts;
and (3) an interactive web application featuring an AI Factor Impact Score.
ScholarSleuth achieves up to 95.58% accuracy in the in-domain, known-generator
setting and 93.31% on unseen generators, outperforming existing methods in
accuracy and interpretability while demonstrating strong generalizability across diverse do-
mains. This work provides a robust, practical, and adaptable solution for transparent
and responsible AI use in writing, especially in academic scenarios.
Student
T Ngọc Minh
TABLE OF CONTENTS
ACKNOWLEDGEMENTS ........ i
ABSTRACT ........ iii
TABLE OF CONTENTS ........ iv
LIST OF FIGURES ........ vii
LIST OF TABLES ........ viii
LIST OF EQUATIONS ........ ix
LIST OF ABBREVIATIONS ........ x
CHAPTER 1. INTRODUCTION ........ 1
1.1 Problem Statement ........ 1
1.2 Foundational Background and Problem of Research ........ 2
1.3 Scope of Research ........ 3
1.4 Contributions ........ 5
1.5 Organization of Thesis ........ 6
CHAPTER 2. BACKGROUND AND RELATED WORKS ........ 8
2.1 Overview ........ 8
2.2 Advancements in Large Language Models and Their Impacts on Writing Practices ........ 8
2.2.1 The Advancements of Large Language Models ........ 8
2.2.2 Tendency of Using AI Assistants in Writing ........ 10
2.2.3 Human vs. AI: The Limits of Expert Judgment ........ 12
2.3 Fine-grained AI-generated Text Datasets ........ 12
2.4 Generalization of AI-generated Text Detection ........ 16
2.5 Contrastive Learning for AI-generated Text Detection ........ 18
2.6 Summary ........ 19
CHAPTER 3. METHODOLOGY ........ 20
3.1 Overview ........ 20
3.2 Overview of FAID Architecture ........ 20
3.3 Training Architecture ........ 21
3.3.1 Architecture Design ........ 21
3.3.2 What is LLM family? ........ 22
3.3.3 Multi-level Contrastive Learning ........ 23
3.3.4 Multitask Auxiliary Learning ........ 25
3.4 Inference Architecture ........ 25
3.4.1 Architecture Design ........ 25
3.4.2 Handling Unseen Data without Retraining ........ 27
3.5 Summary ........ 27
CHAPTER 4. EXPERIMENTS AND NUMERICAL RESULTS ........ 28
4.1 Overview ........ 28
4.2 FAIDSet ........ 28
4.2.1 Overview ........ 28
4.2.2 The Formulation of Human-written Dataset ........ 28
4.2.3 Dataset Statistics ........ 29
4.2.4 Diverse Prompt Strategies ........ 29
4.3 Other Datasets ........ 32
4.4 Encoder Selection for FAID Architecture ........ 33
4.5 Analysis on Text generated by Different AI Families ........ 34
4.5.1 Text Distribution between AI Families ........ 34
4.5.2 Text Distribution between AI Models within the Same Family ........ 36
4.5.3 Embedding Visualization and Semantic Cohesion ........ 36
4.5.4 Key Findings ........ 36
4.6 How can We deal with Unseen Data? ........ 39
4.6.1 The Need to Use Vector Database ........ 39
4.6.2 Clustering Algorithm Selection ........ 39
4.7 Baseline Models for FAID Evaluation ........ 40
4.8 Experimental Settings to evaluate FAID Architecture performance ........ 41
4.9 Evaluation Metrics for Evaluation Process ........ 41
4.10 Whether the text is Human, AI, or Human-AI? ........ 43
4.11 Identifying Different Generators ........ 44
4.12 Discussion ........ 44
4.13 Summary ........ 46
CHAPTER 5. SCHOLARSLEUTH WEB APPLICATION ........ 47
5.1 Overview ........ 47
5.2 AI Factor Impact Score ........ 47
5.2.1 AFI Score Calculation and Threshold ........ 47
5.2.2 Why AFI = 0.7 for Human-AI Collaborative Texts? ........ 48
5.3 System Overview ........ 48
5.4 User Interface Walkthrough ........ 49
5.4.1 Analysis Tools ........ 49
5.4.2 Playgrounds ........ 54
5.5 Summary ........ 57
CHAPTER 6. CONCLUSIONS AND BROADER IMPACTS ........ 60
6.1 Conclusion ........ 60
6.2 Suggestion for Future Works ........ 61
6.3 Ethics and Broader Impacts ........ 61
PUBLICATIONS AND AWARDS GIVEN TO THIS THESIS ........ 63
REFERENCES ........ 64
LIST OF FIGURES
Figure 2.1 Survey on students' tendency to use AI . . . . . . . . . . 10
Figure 2.2 Analysis from the University of Illinois Chicago on students'
AI usage purposes. . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Figure 3.1 Training architecture . . . . . . . . . . . . . . . . . . . . . 22
Figure 3.2 Inference architecture . . . . . . . . . . . . . . . . . . . . 26
Figure 4.1 Text length distributions in words and characters across gen-
erators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Figure 4.2 Top 20 most common 3-grams from generators using 500
sample prompts . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Figure 4.3 Visualizations showing clustering behavior using 2D and
3D embeddings . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Figure 5.1 Homepage of ScholarSleuth system . . . . . . . . . . . . . 50
Figure 5.2 Text Analyzer interface . . . . . . . . . . . . . . . . . . . 51
Figure 5.3 Document Analyzer interface . . . . . . . . . . . . . . . . 53
Figure 5.4 A sample of generated report . . . . . . . . . . . . . . . . 55
Figure 5.5 Prompt Crafting Tool interface . . . . . . . . . . . . . . . 56
Figure 5.6 Rewrite Challenge interface . . . . . . . . . . . . . . . . . 58
LIST OF TABLES
Table 2.1 English Fine-grained AI-generated Text Detection Datasets
Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Table 4.1 Statistics of human-written texts origins in FAIDSet. . . . . 29
Table 4.2 Number of examples per label in subsets in FAIDSet. . . . . 30
Table 4.3 Samples of diverse prompt templates used to generate FAIDSet 32
Table 4.4 Encoder selection on known vs. unseen generators. . . . . . 33
Table 4.5 Comparison of clustering algorithms on known vs. unseen
generators. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Table 4.6 Performances of detectors with three labels . . . . . . . . . 43
Table 4.7 Accuracy of detectors in identifying generators . . . . . . . 44
LIST OF EQUATIONS
Equation 3.1 Relations between text distributions . . . . . . . . . . . . 21
Equation 3.2 Relations between classes in case x = 1 . . . . . . . . . 23
Equation 3.3 Relations between classes in case x = 0 . . . . . . . . . 23
Equation 3.4 Relations between classes in case y = 1 . . . . . . . . . 23
Equation 3.5 Relations explained by parameter z for human-AI collab-
orative texts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Equation 3.6 Overall constraints for text representation . . . . . . . . . 24
Equation 3.7 Loss function for each case based on SimCLR . . . . . . 24
Equation 3.8 Multi-level contrastive loss . . . . . . . . . . . . . . . . 24
Equation 3.9 Multitask Auxiliary loss computed using cross entropy . 25
Equation 3.10 Overall loss formula . . . . . . . . . . . . . . . . . . . . 25
Equation 4.1 Accuracy formula . . . . . . . . . . . . . . . . . . . . . 41
Equation 4.2 Precision formula . . . . . . . . . . . . . . . . . . . . . 42
Equation 4.3 Recall formula . . . . . . . . . . . . . . . . . . . . . . . 42
Equation 4.4 F1-macro score formula . . . . . . . . . . . . . . . . . . 42
Equation 4.5 MSE formula . . . . . . . . . . . . . . . . . . . . . . . 42
Equation 4.6 MAE formula . . . . . . . . . . . . . . . . . . . . . . . 42
LIST OF ABBREVIATIONS
Abbreviation Definition
AFI AI Factor Impact score
AI Artificial Intelligence
AIGT AI-Generated Text
FAID Fine-grained AI-generated text
Detection (Name of the proposed
framework)
FAIDSet Fine-grained AI-generated text
Detection dataSet (Name of the
proposed dataset)
GPT Generative Pretrained Transformer
IELTS International English Language Testing
System
KNN k-Nearest Neighbors
LLM Large Language Model
MAE Mean Absolute Error
MSE Mean Squared Error
NLP Natural Language Processing
PDF Portable Document Format
SVM Support Vector Machine
VJOL Vietnam Online Journals
XLM Cross-lingual Language Model
CHAPTER 1. INTRODUCTION
1.1 Problem Statement
In recent years, LLMs such as ChatGPT¹, Gemini², Llama³, and DeepSeek⁴
have emerged as powerful tools capable of generating high-quality natural language
text. These models are trained on massive corpora of web and human-authored
data, enabling them to produce fluent, contextually relevant, and stylistically var-
ied output in response to simple prompts. Their capabilities are being rapidly in-
tegrated into both professional and educational workflows, profoundly reshaping
how people write, learn, and communicate.
While these models offer considerable benefits, such as aiding in idea genera-
tion, automating routine writing tasks, or providing assistance to non-native speak-
ers, they have also introduced a number of serious challenges. The alarming con-
cern among these is their potential for misuse in contexts where originality, au-
thorship, and intellectual contribution are essential. In academic environments, for
instance, students increasingly use LLMs not just as writing aids, but as substitutes
for substantial parts of their assignments, including essays, code submissions, and
even research papers. In some cases, entire documents are generated with minimal
to no human refinement, yet submitted under the guise of original student work.
This shift has introduced a new layer of complexity to the evaluation of aca-
demic submissions and the enforcement of academic integrity. Unlike traditional
forms of plagiarism, which rely on the reuse of existing human-authored text,
AI-generated content is often novel in form and phrasing, making it undetectable
by conventional plagiarism detection tools. Consequently, instructors, editors, and
evaluators are increasingly faced with the question: Was this written by a human,
or by a machine?
The difficulty is not merely technical; it also strikes at the core of educational
values. Academic institutions rely on the assumption that submitted work reflects
the student's own understanding, reasoning, and expression. When LLMs are used
to bypass this process, not only is the assessment compromised, but the student's
learning process is also diminished. Furthermore, in broader domains such as jour-
nalism and digital media, AI-generated content can obscure the origin of informa-
tion, making it harder to verify sources and reducing trust in what is published.

¹ https://chatgpt.com
² https://gemini.google.com
³ https://www.llama.com
⁴ https://chat.deepseek.com
A further complication arises from the increasingly blurred boundaries between
human and AI contributions. Many individuals use LLMs to draft content that
they later revise or personalize, a form of human-AI collaboration. This gray area
presents a significant challenge for detection: existing tools often reduce the prob-
lem to a binary classification (human vs. AI), which fails to capture the nuanced
reality of mixed-authorship text.
As the use of LLMs becomes more pervasive and the models themselves grow
more sophisticated, the need for accurate, reliable, and fine-grained methods to dis-
tinguish between human-generated and AI-generated content becomes urgent. This
is not merely a technical requirement, but a foundational necessity for maintaining
integrity, accountability, and trust across academic, professional, and digital plat-
forms.
1.2 Foundational Background and Problem of Research
Recent studies have highlighted the growing presence of AI-assisted writing in
education and research. Tools like ChatGPT are increasingly used to generate es-
says, solve programming tasks, or even draft entire research papers. While these ap-
plications offer undeniable productivity benefits, they also introduce critical chal-
lenges related to academic integrity. Notably, the line between AI assistance and
academic misconduct has become increasingly difficult to draw.
A key issue lies in the detection of AI-generated content. Traditional plagiarism
detection tools, which are designed to identify copied material through textual over-
lap, are inadequate for this task. AI-generated texts are often original in phrasing
and structure, making them elusive to existing detection systems. Prior approaches
to detecting such content have largely focused on binary classification (i.e., human
vs. AI), but these models tend to suffer from limited adaptability. They often fail
when applied to different academic domains, languages, or newer AI models, and
they struggle to detect more nuanced cases such as human-AI collaboration.
Moreover, many existing methods rely on surface-level features, such as word
frequency or sentence patterns, which do not capture the deeper stylistic or seman-
tic traits of AI-generated texts. The lack of sophisticated, fine-grained detection
tools introduces several significant problems:
Erosion of Trust: The inability to reliably detect AI-generated content can
undermine trust in digital platforms, as users may struggle to verify the au-
thenticity of information.
Challenges in Content Moderation: Platforms that rely on user-generated
content face difficulties in policing AI-generated texts, leading to the prolifer-
ation of misleading or harmful content.
Ethical Concerns: AI-generated content can be used for malicious purposes,
such as disinformation campaigns or academic dishonesty, making it vital to
develop robust detection mechanisms.
Lack of Generalization: Existing detection systems often fail to generalize
across different AI models, languages, and domains, limiting their practical
effectiveness as AI technology continues to evolve.
To address these limitations, this thesis proposes a new approach aimed at im-
proving the detection of AI-generated content, especially in academic settings. The
proposed system, ScholarSleuth, is designed to go beyond binary classification
and support a more nuanced analysis of authorship. By focusing on generalizabil-
ity and adaptability, the framework seeks to offer a more robust solution for distin-
guishing human, AI, and hybrid-authored content across varied contexts.
In addition to safeguarding academic integrity, the broader implications of this
work extend to other fields such as content moderation, journalism, and disinfor-
mation detection, where verifying the authenticity and origin of text is equally
critical. As AI technologies continue to evolve rapidly, there is a growing need for
detection systems that can evolve in parallel and maintain their effectiveness across
different domains.
1.3 Scope of Research
The scope of this research is centered on the detection of AI-generated text, with
a particular emphasis on academic content. This research aims to develop a com-
prehensive framework to identify AI-generated text in academic theses, research
papers, and other scholarly reports, ensuring academic integrity and transparency
in the evolving landscape of AI-assisted writing.
The ScholarSleuth system, incorporating the FAID architecture, will be the core
tool developed to address this problem. The FAID framework will focus on de-
tecting fine-grained AI-generated text, distinguishing between three categories of
authorship: fully AI-generated, human-written, and human-AI collaborative text.
The ability to identify the specific AI model family responsible for generating the
content (such as GPT-4, Gemini, or Llama) will also be an essential feature of the
proposed detection system.
This study primarily concentrates on the following areas:
1. Multilingual, Multi-Domain, Multi-Generator Dataset Construction: To
support robust training and evaluation, this research includes the creation
of FAIDSet a fine-grained dataset encompassing texts from multiple lan-
guages, academic domains, and AI generators. The dataset includes human-
wr itten, AI-generated, and hybrid texts sourced from diverse educational and
disciplinary contexts. This diversity allows the system to learn generalized pat-
terns and improves its adaptability to real-world scenarios involving various
types of AI involvement and linguistic nuances.
2. Detection of AI-Influenced Academic Texts: This research addresses the
challenge of detecting both fully AI-generated and human-AI collaborative
academic texts, including research papers, thesis drafts, and scholarly re-
ports. It investigates the linguistic and stylistic characteristics common to AI-
influenced academic writing, such as excessive fluency, generic phrasing, lack
of nuanced argumentation, or uniform structure, which differentiate it from fully
human writing. The ScholarSleuth system is designed to identify these fea-
tures across a spectrum of AI involvement, from minor assistance to complete
generation.
3. Generalization Across AI Models and Domains: A core objective of this re-
search is to ensure the detection system’s robustness across various AI model
families (e.g., GPT, Llama, Gemini, DeepSeek) and academic subject areas.
Because each model exhibits distinct stylistic tendencies, the FAID architec-
ture will be developed with the capacity to generalize across these differences.
This generalization ensures the system remains effective even as new models
and applications emerge in different academic domains.
4. Multilingual Detection: Recognizing the global nature of academic publish-
ing, this study emphasizes the importance of detecting AI-generated content in
multiple languages. The ScholarSleuth system is evaluated for multilingual ca-
pability, extending detection support beyond English to include high-resource
and low-resource languages alike. This ensures broader applicability and rel-
evance in international academic environments.
5. Scalability and Real-World Applicability: Finally, the ScholarSleuth frame-
work is designed for practical deployment in real-world academic settings.
The system will be evaluated for scalability, efficiency, and ease of integra-
tion within workflows at universities, publishers, or research institutions. Its
ability to handle large volumes of submissions and deliver reliable classifica-
tion results in operational environments will be a critical metric of success.
In summary, this research is aimed at creating a robust and adaptable solution
for detecting AI-generated text in academic writing, addressing the critical need
for reliable methods to maintain academic integrity in an era of AI-assisted con-
tent creation. By focusing on the detection of fine-grained AI-generated and col-
laborative content, the ScholarSleuth system offers a comprehensive approach that
combines multi-level contrastive learning with real-world applicability across lan-
guages, domains, and evolving AI models.
1.4 Contributions
This research makes several significant contributions to the field of AI-
generated content detection, with a particular focus on academic texts. The three pri-
mary contributions of this thesis, which are also the three main components of ScholarSleuth,
are outlined below:
1. Creation of FAIDSet, a Multilingual, Multi-Domain, Multi-Generator
Dataset: I introduce a new benchmark dataset comprising approximately
84,000 examples across multiple domains (e.g., student theses, research ab-
stracts), languages (English and Vietnamese), and AI generators (e.g., GPT,
Llama, Gemini). This fine-grained dataset includes three categories of au-
thorship: human-written, AI-generated, and human-AI collaborative texts. It
addresses the scarcity of publicly available resources for comprehensive eval-
uation of AI-generated content detection systems in multilingual and mixed-
authorship settings.
2. Design of the FAID Detection Framework: I propose FAID (Fine-grained
AI-authorship Identification Detector), a novel detection framework designed
to improve generalization in identifying AI-generated and collaborative con-
tent. FAID leverages a combination of contrastive learning and auxiliary clas-
sification to capture subtle stylistic and semantic cues in latent representa-
tions. This enables the model to distinguish between closely related authorship
classes with higher robustness, especially in complex or deeply mixed inputs.
FAID also aims to deal with these two specific problems:
Improved Generalization to Unseen Domains and Generators: FAID
demonstrates superior performance over strong baseline detectors in both
in-domain and out-of-domain scenarios. In particular, its ability to gener-
alize to unseen AI models and academic domains makes it a practical and
forward-compatible solution. This advantage stems from its architecture's
focus on learning model-agnostic stylistic features, rather than overfitting
to known generators.
Enhanced Interpretability via Stylistic Proximity: Beyond raw classi-
fication, FAID provides a mechanism to interpret its predictions by mea-
suring the stylistic proximity between the input text and reference exam-
ples with known authorship in a learned embedding space. This capability
offers insights into why a given text is attributed to a particular class, sup-
porting greater transparency and trust in AI-authorship judgments.
3. Web Application for AI-Generated Text Detection: In addition to the core
detection framework, this research presents a web application that enables
users to upload documents and generate reports on the AI-generated content within
them. It also introduces the AI Factor Impact score, a benchmark to identify
the level of AI involvement. After processing, the application generates a detailed
report, categorizing sections of the text as human-written, AI-generated, or
collaborative, and providing insights into the specific AI model used for gen-
eration. This tool makes it easy for researchers, educators, and content moder-
ators to assess the authenticity of texts in an accessible, user-friendly manner.
Overall, this thesis contributes to the growing field of AI content detection by
providing an advanced, flexible, and scalable solution for identifying AI-generated
text in academic writing. The ScholarSleuth framework, combined with its web
application, offers a practical tool for improving transparency, accountability, and
academic integrity in an era where AI-assisted content creation is becoming in-
creasingly prevalent.
1.5 Organization of Thesis
The remainder of this thesis is organized as follows:
Chapter 2 provides the necessary background and related work. This chapter
explores the increasing integration of AI assistance in writing processes, the need
for detection methods, and the limitations of existing detection techniques. It also
discusses recent developments in dataset construction, generalization challenges in
detection systems, and the use of contrastive learning to improve model robustness.
Chapter 3 defines the core task addressed in this thesis: the fine-grained clas-
sification of text authorship into three categories. It introduces the FAID frame-
work, including its task formulation and the architectural design of both the train-
ing and inference components. This chapter also extends the methodology by defin-
ing "LLM families" and optimizing sentence encoders for better generalization. It
introduces the full training pipeline, including a multitask auxiliary classifier and
a contrastive learning approach that incorporates both authorship types and LLM
families. The chapter also describes an adaptive inference strategy for handling
unseen texts without retraining.
Chapter 4 details the selection and evaluation process in designing FAID archi-
tecture, the experimental evaluation of the architecture and presents the construc-
tion of FAIDSet, a new multilingual, multi-domain, and multi-generator dataset
designed to support research on AI-generated text detection with details about the
data sources, collection methodology, and the rationale behind the dataset design.
It covers the setup, baseline comparisons, and results from tasks involving both
authorship classification and generator attribution. The experiments are conducted
across multiple conditions, including unseen domains and generators, to test the
system’s generalization performance.
Chapter 5 describes the ScholarSleuth web application, which implements the
FAID framework in an interactive system for real-time AI-generated content detec-
tion. This chapter showcases the system’s practical deployment and demonstrates
its usability for academic and educational settings.
Finally, Chapter 6 summarizes and draws conclusions about the key findings
of this thesis, highlighting the importance and effects of applying this system in
real-world scenarios. This chapter also makes some suggestions about potential di-
rections for future work and improvements in the field of AI-generated content
detection.
CHAPTER 2. BACKGROUND AND RELATED WORKS
2.1 Overview
Advanced LLMs have changed how we create and refine text. Instead of origi-
nating every word, humans now more often act as post-editors or reviewers, step-
ping in after AI drafts the initial content. This collaborative workflow spans fields
from academic publishing and journalism to education and social media, making
hybrid human-AI texts the new norm [12, 30]. As AI-generated content becomes
increasingly sophisticated, detecting the origin of written text, whether human or
AI-generated, has become a critical challenge.
This chapter provides an overview of the key concepts and research areas rel-
evant to this thesis. First, it examines the evolution and capabilities of large lan-
guage models, discussing their development and how they have been leveraged in
writing assistance. It then explores the growing tendency to use AI as an assis-
tant in writing and the struggle of experts in distinguishing authorship, followed by an
overview of fine-grained AI-generated text datasets and the need for accurate
detection methods. I also discuss the generalization challenges in detecting AI-
generated content across domains and languages. The chapter then discusses
contrastive learning techniques, which have proven effective in improving the ac-
curacy of AI content detection. Together, these topics establish the foundational
knowledge required to understand the contributions of this thesis.
2.2 Advancements in Large Language Models and Their Impacts on Writing
Practices
Advancements in LLMs have transformed how text is generated, refined, and
evaluated. As models grow more sophisticated, they are increasingly embedded in
writing workflows, from academic research to everyday communication. At the
same time, their human-like output challenges traditional notions of authorship
and authenticity, raising complex questions about trust, literacy, and detection in
the age of AI-assisted writing.
2.2.1 The Advancements of Large Language Models
The inception of LLMs can be traced back to the introduction of the transformer
architecture [46], which uses self-attention mechanisms to process and generate sequences
of text, replacing traditional recurrent neural networks and long short-term
memory networks [34, 53]. This architecture al-
lows for more efficient processing of sequences and better handling of long-range
dependencies in text.
Building upon this foundation, OpenAI introduced the Generative Pre-trained
Transformer (GPT) series, starting with GPT-2 [41], which demonstrated the abil-
ity to generate coherent and contextually relevant text. GPT-3, released in 2020,
further scaled up the model to 175 billion parameters [7], achieving state-of-the-
art performance on a wide range of NLP tasks without task-specific fine-tuning.
GPT-4 [37], released in 2023, continued this trend, offering improved reasoning
capabilities and performance across various benchmarks.
Meta's Llama series provides an open-source alternative to proprietary mod-
els. Llama-2 [22], released in 2023, and Llama-3 [32], released in 2024, are auto-
regressive transformer models trained on publicly available datasets. These models
emphasize efficiency and accessibility, aiming to democratize access to powerful
language models.
Google's Gemini series [20], introduced in 2023, represents a significant ad-
vancement in LLMs. Gemini models are multimodal, capable of processing and
generating not only text but also images and other data types. This multimodal ca-
pability allows Gemini models to perform complex tasks that require understanding
and generating multiple forms of data simultaneously.
The release of DeepSeek R1 [15] marks a pivotal moment in the evolution of AI
models, offering an innovative approach to AI’s "thought process" or reasoning ca-
pabilities. Unlike traditional LLMs that focus primarily on generating contextually
appropriate text, DeepSeek R1 introduces a model trained to simulate reasoning
in a manner more akin to human cognition. It is designed to replicate the way AI
"thinks" by mimicking how humans process information, make decisions, and ana-
lyze situations. The model incorporates a more structured decision-making process
that not only generates text but also considers multiple layers of reasoning before
arriving at conclusions. This is achieved through a unique architecture that inte-
grates multilevel contrastive learning and decision layers to model the reasoning
paths an AI might take in generating text.
This model’s release has significant implications for academic, professional,
and creative writing applications, as it offers a level of sophistication that is closer
to human thought processes. For content detection systems, however, the chal-
lenge becomes more complex, as distinguishing between machine-generated text
and human-generated content becomes less about surface-level patterns and more
about deeper, reasoning-driven structures in the text. The development of such AI
models emphasizes the need for detection methods that can account for more than
Figure 2.1: Survey on students' tendency to use AI, conducted by the Digital Education
Council [16].
just linguistic features, incorporating mechanisms that can assess the logic and rea-
soning behind content generation.
2.2.2 Tendency of Using AI Assistants in Writing
The integration of AI assistants into writing processes has seen a significant
uptick across educational and professional domains. A 2024 survey by the Digital Ed-
ucation Council revealed that 86% of students reported using AI tools in their stud-
ies, 54% of them on a weekly or daily basis, with 58% feeling inadequately
prepared for an AI-enabled workforce, highlighting the rapid adoption and the ac-
companying need for AI literacy [16].
In academic settings, AI tools are employed for various purposes, including
literature reviews, drafting, and editing. A survey of research scholars indicated
that 73.6% use AI in education, with 51% utilizing it for literature reviews and
Figure 2.2: Analysis from University of Illinois Chicago on students AI usage purpose.
46.3% for writing and editing tasks [54]. Furthermore, 72.04% of participants in a
study on AI tools in academic writing reported that these tools had improved the
overall academic rigor of their work [35].
However, the widespread use of AI assistants has raised concerns about aca-
demic integrity. A survey by the Tertiary Education Quality and Standards Agency
found that over a third of students use chatbots like ChatGPT for assessments with-
out viewing it as cheating [5]. This has prompted educators to reconsider assess-
ment strategies, with some advocating for oral assessments and requiring students
to "show their working" to mitigate potential misuse.
Despite these concerns, many students view AI tools as valuable educational
aids. A report from the University of Illinois Chicago, shown in Figure 2.2,
noted that around 56% of students acknowledged using AI for research and
75% of them use AI for writing assistance, highlighting its utility in their academic
endeavors [26]. Moreover, 75% of educational leaders believe that generative AI
tools will improve student research skills, and 69% think these tools will enhance
students' ability to write clearly and persuasively [44].
In summary, the adoption of AI assistants in writing is widespread, with stu-
dents and researchers leveraging these tools to enhance productivity and writing
quality. While concerns about academic integrity persist, the general consensus
acknowledges the potential benefits of AI in education, underscoring the need for
balanced and ethical integration of these technologies.
2.2.3 Human vs. AI: The Limits of Expert Judgment
Our recent findings [52] show that even expert annotators, including linguists, NLP re-
searchers, and practitioners, struggle to reliably distinguish between human and
AI-generated text. Across 16 datasets in 9 languages, expert accuracy averaged just
87.6%, suggesting that even seasoned evaluators are often misled by AI outputs
designed to mimic human writing.
Detection success varies by context. In paired comparisons, where human and
AI texts are judged side-by-side, cues like concreteness and linguistic quirks are
more detectable. But in single-text evaluations, accuracy drops significantly, some-
times nearing chance levels, highlighting the influence of framing and context on
expert judgment.
Prompt engineering further blurs the line. When models are guided to avoid
templates, add cultural nuance, and adopt varied styles, expert detection perfor-
mance can fall to 72.5%. These refined outputs increasingly mirror the richness
and spontaneity of human language, making even trained readers uncertain.
Ultimately, these findings challenge the reliability of conventional detection cri-
teria and signal the need for a reevaluation of what it means to produce "authentic"
text in an era increasingly influenced by advanced language models. If even the
experts, armed with extensive domain knowledge and contextual acuity, struggle
to distinguish between human and machine prose, then the implications extend be-
yond academic exercises to broader concerns over trust, authenticity, and account-
ability in our digital communications.
2.3 Fine-grained AI-generated Text Datasets
The detection of AI-generated text, particularly in academic contexts, requires
fine-grained datasets that can distinguish between human-authored, AI-generated,
and human-AI collaborative content. Fine-grained detection goes beyond tradi-
tional plagiarism detection by focusing on the subtle stylistic differences between
these types of content. As AI models like GPT, Llama, Gemini, and DeepSeek
become more sophisticated, the need for datasets that can capture these nuanced
distinctions grows.
Fine-grained datasets typically categorize AI-generated content into several
types [51]:
AI-polished: Texts in this category are originally human-written and then re-
fined or polished by AI models. These refinements typically involve enhanc-
ing grammar, clarity, or fluency without altering the original meaning or struc-
ture significantly. Detecting AI-polished text requires models that can identify
minor, yet characteristic stylistic changes introduced by the AI.
AI-continued: In this case, a human starts the text, and an AI model generates
the continuation. This category often presents the most challenging detection
scenarios because the AI's writing must maintain coherence with the original
human-written part while adhering to its own stylistic patterns.
AI-paraphrased: Texts that were originally written by humans and reworded
by AI to express the same meaning using different phrasing, sentence struc-
ture, or vocabulary. Paraphrased texts may retain the original ideas but lack
the human-like identity that distinguishes genuine human writing.
Purely AI-generated: These texts are entirely generated by AI, without any
human input. The challenge here lies in detecting the characteristics that
differentiate machine-generated content from human-authored text, such as
overly structured language, formulaic sentence construction, and lack of deep
analysis.
Various datasets have been developed to support the fine-grained detection of
AI-generated text. These datasets typically contain a mix of human-written and
AI-generated content, spanning different domains, languages, and levels of collab-
oration.
LLM-DetectAIve [3]: This dataset includes a variety of domains, such as
academic papers, student essays, and articles, with labels for both human and
AI-generated texts. The dataset also includes machine-polished and machine-
humanized texts, providing valuable data for training and evaluating AI con-
tent detection models.
HART [6]: This dataset spans domains such as student essays, research pa-
per abstracts, and creative writing. It includes four categories: human-written
texts, LLM-refined texts, AI-generated texts, and humanized AI-generated
texts. The inclusion of hybrid texts where human input is mixed with AI-
generated content makes it especially useful for fine-grained detection sys-
tems.
M4GT [50]: This dataset focuses on the generalization of AI-generated con-
tent detection across different domains and AI models. It contains a wide va-
riety of texts, including both human-written and machine-generated content,
and is designed to test the performance of detection models in real-world, di-
verse scenarios.
Beemo [4]: The Beemo dataset includes texts that are AI-generated, AI-
humanized, and deeply mixed, along with human-written texts. It is partic-
ularly useful for understanding how AI models blend with human writing and
for detecting subtle differences in mixed content.
These datasets play a critical role in developing detection systems capable of
handling the increasing variety of AI-generated text. They offer valuable annotated
examples that allow models to learn the distinguishing features between human and
AI-generated content. I analyzed many current datasets on AI-generated text
and summarize them in Table 2.1. Many datasets contain fine-grained labels and
multiple generators, but all of the analyzed datasets are monolingual, covering only English.
This gap motivated me to collect and formulate a new multilingual,
multi-domain, multi-generator AI-generated text dataset to adapt the approach to
real-world scenarios.
While these datasets provide the necessary foundation for fine-grained detec-
tion, several challenges remain:
Multilinguality: Many existing datasets are limited to English or a few other
languages, making it difficult to apply detection methods in multilingual con-
texts. There is a pressing need for datasets that cover a wider array of lan-
guages to ensure that detection models can generalize across linguistic bound-
aries.
Domain Diversity: AI models like GPT and Llama are trained on diverse data
sources, which means their generated content can span many domains. Fine-
grained detection systems must be trained on datasets that represent a broad
range of topics, from technical papers and research articles to creative writing
and casual dialogue.
Evolving AI Models: The rapid development of new AI models means that
detection systems need to be adaptable. Models like Gemini and DeepSeek
represent new generations of AI tools, each with its own distinctive features.
Fine-grained detection must be able to handle new model architectures and
the content they generate, which requires continuously updated datasets and
adaptable detection techniques.
Hybrid Texts: One of the most challenging aspects of AI-generated text detec-
tion is identifying hybrid texts, those created through collaboration between
humans and AI. These texts often blend human creativity with AI-generated
content, making it harder to attribute authorship accurately. The datasets used
for training detection models must therefore include a significant amount of
hybrid content to allow the model to learn how to identify these complex cases.

Dataset | Languages | Label Space | Domains | Generators | Size
MixSet [57] | English | Human-written, AI-polished; Human-initiated, AI-continued; AI-generated, human-edited; Deeply-mixed text | Email; News; Game reviews; Paper abstracts; Speech; Blog | GPT-4; Llama 2; Dolly | 3,600
LLM-DetectAIve [3] | English | Human-written; AI-generated; Human-written, AI-polished; AI-generated, AI-humanized | arXiv abstracts; Reddit posts; Wikihow; Wikipedia articles; OUTFOX essays; Peer reviews | GPT-4o; Mistral 7B; Llama 3.1 8B; Llama 3.1 70B; Gemini; Cohere | 487,996
Beemo [4] | English | AI-generated, AI-humanized; Human-written; AI-generated; AI-generated, human-edited | Generation; Rewrite; Open QA; Summarize; Closed QA | Llama 2; Llama 3.1; GPT-4o; Zephyr; Mixtral; Tulu; Gemma; Mistral | 19,256
M4GT [50] | English | Human-written, AI-continued; Human-written; AI-generated | Peer review; OUTFOX | Llama 2; GPT-4; GPT-3.5 | 33,912
Real or Fake [17] | English | Human-written; Human-initiated, AI-continued | Recipes; Presidential Speeches; Short Stories; New York Times | GPT-2; GPT-2 XL; CTRL | 9,148
RoFT-chatgpt [29] | English | Human-initiated, AI-continued | Short Stories; Recipes; New York Times; Presidential Speeches | GPT-3.5-turbo | 6,940
Co-author [55] | English | Deeply-mixed text | Creative writing; New York Times | GPT-3 | 1,447
TriBERT [56] | English | Human-initiated, AI-continued; Deeply-mixed text; Human-written | Essays | ChatGPT | 34,272
LAMP [9] | English | AI-generated, human-edited | Creative writing | GPT-4o; Claude 3.5 Sonnet; Llama 3.1 70B | 1,282
APT-Eval [42] | English | Human-written, AI-polished | Based on MixSet | GPT-4o; Llama 3.1 70B; Llama 3.1 8B; Llama 2 7B | 11,700
HART [6] | English | Human-written; Human-written, AI-polished; AI-generated, AI-humanized; AI-generated text; AI-generated, human-edited | – | GPT-3.5-turbo; GPT-4o; Claude 3.5 Sonnet; Gemini 1.5 Pro; Llama 3.3 70B; Qwen 2.5 72B | 16,000
LLMDetect [12] | English | Human-written; Human-written, AI-polished; Human-written, AI-extended; AI-generated | – | DeepSeek v2; Llama 3 70B; Claude 3.5 Sonnet; GPT-4o | 64,304
ICNALE corpus [33] | English | Human-written | Essays | Qwen 2.5; Llama 3.1 8B/70B; Llama 3.2 1B/3B; Mistral Small | 67,000
CyberHumanAI [36] | English | Human-written | Wikipedia | – | 500

Table 2.1: English Fine-grained AI-generated Text Detection Datasets Overview.
2.4 Generalization of AI-generated Text Detection
Recent studies such as M4GT [50] and LLM-DetectAIve [3] have highlighted a
persistent challenge for both binary and fine-grained AI-generated text detection:
poor generalization to unseen domains, languages, and generators. Many detection
methods show a significant drop in performance on out-of-distribution data, under-
scoring the difficulty of building robust detectors that can deal with real-world scenarios
and evolving LLM outputs.
Several key challenges must be overcome to improve the generalization of AI-
generated text detection:
Domain Shifts: One of the most significant hurdles in AI-generated text de-
tection is the variability of content across different domains. AI models like
GPT and Llama are often trained on diverse datasets, enabling them to gener-
ate content in a wide variety of domains, including technical writing, creative
content, academic research, and casual conversations. A model trained to de-
tect AI-generated content in one domain may not perform well in another,
especially if the writing styles or terminology used in the domain differ sig-
nificantly. This issue, known as domain shift, makes it necessary to develop
detection models that can adapt to different topics and content types without
significant performance degradation.
Language Diversity: Most AI-generated text detection models are initially
trained on English-language datasets, which limits their applicability in mul-
tilingual settings. As AI models like Gemini and DeepSeek become increas-
ingly global, generating text in multiple languages, detection models must be
capable of handling non-English content. This multilingual challenge requires
the development of language-agnostic features or the use of multilingual train-
ing datasets to ensure that models can detect AI-generated text in diverse lin-
guistic contexts.
Emerging AI Models: The rapid pace of innovation in AI technology means
that new language models are continually being introduced. For example,
GPT-4 introduced new capabilities compared to its predecessors, and mod-
els like DeepSeek R1 offer unique features that distinguish them from other
AI systems. Detection models that are trained on one specific AI model may
not generalize well to newer models with different architectural or training
methodologies. Furthermore, generative models are becoming more sophis-
ticated in their ability to mimic human writing styles, making it increasingly
difficult for detection systems to identify subtle AI-generated characteristics
in new content.
Hybrid Texts: The increasing prevalence of human-AI collaboration in con-
tent generation creates another challenge for generalization. Hybrid texts,
those in which humans and AI collaborate, are not easily categorized as
either fully human or fully AI-generated. These texts often exhibit character-
istics of both human creativity and AI-generated content, complicating the
detection process. Models trained on purely human or purely AI texts may
struggle to identify the nuanced features of hybrid texts, requiring more ad-
vanced detection strategies that can assess the degree of AI involvement in the
writing process.
To address these challenges and improve the generalization of AI-generated text
detection, several strategies have been proposed:
Domain-Adversarial Training: One approach to improving generalization
across domains is domain-adversarial training [18]. This method involves
training the detection model to be domain-agnostic by using adversarial tech-
niques that encourage the model to learn features that are not specific to any
particular domain. By learning to ignore domain-specific information, the
model can become more adaptable to new topics and content types, improving
its ability to detect AI-generated text across a wide range of domains (a minimal code sketch of this idea appears after this list).
Adversarial In-Context Learning: OUTFOX [28] proposes dynamically
generating challenging examples within the in-context learning paradigm. By
exposing the model to adversarially difficult instances, it enhances robust-
ness and forces the model to better distinguish subtle signals of AI generation.
However, this method can be computationally expensive and does not fully
resolve transfer issues to unseen domains.
Sentence-Level Detection with Style-Content Interaction: SeqXGPT [49]
adopts a sentence-level detection strategy by leveraging a combination of log-
probability features and deep neural architectures, including convolutional
layers and self-attention. This allows the model to better capture local mixed-
content patterns and stylistic inconsistencies, contributing to improved gen-
eralization across text structures. Nevertheless, its reliance on model-specific
features may hinder scalability to unseen generators.
Fine-Grained Multi-Class Classification: LLM-DetectAIve [3] takes a fine-
grained classification approach, categorizing texts according to specific gener-
ator types while also using domain-adversarial training to mitigate overfitting.
This dual mechanism improves detection within known domains but faces
challenges in generalizing to completely novel generators and lacks multilin-
gual support.
The ability to generalize AI-generated text detection methods across domains,
languages, and models is essential for deploying these systems in real-world set-
tings. For example, in academic publishing, where papers from a wide range of
disciplines and authors are submitted, a detection model must be capable of iden-
tifying AI-generated content regardless of the subject matter or language used.
Similarly, in content moderation for social media platforms or news outlets, AI-
generated posts must be identified and flagged for review, even when they come
from different sources or involve different AI models.
Furthermore, as AI tools continue to evolve, detection systems must be able to
keep pace with advancements in generative models. This requires a flexible and
adaptable approach to model training and evaluation, ensuring that AI-generated
content is reliably detected across a wide range of scenarios.
Generalizing AI-generated text detection is a complex and ongoing challenge,
driven by factors such as domain diversity, language variability, the rapid develop-
ment of new AI models, and the rise of hybrid texts. To ensure the robustness of
detection systems, it is essential to employ strategies such as domain-adversarial
training, multilingual training, and incremental learning. By addressing these chal-
lenges, detection systems can be developed that are not only accurate but also
adaptable to the ever-changing landscape of AI-generated content.
2.5 Contrastive Learning for AI-generated Text Detection
Contrastive learning has been extensively used to improve sentence representations by pulling semantically similar sentences closer and pushing dissimilar ones apart, reorganizing the hidden space of sentence representations by semantics. For example, a representative work, SimCSE, regarded an input sentence under dropout noise as its semantically similar counterpart (i.e., a positive pair) and trained the encoder to reduce their distance. Additionally, its supervised variant uses natural language inference pairs, treating entailment pairs as positives and contradiction pairs as hard negatives [19]. For negative pairs with contradictory meanings, the embeddings were trained to be distant in the latent space. Similarly, DeCLUTR [23] constructed positive pairs by extracting different spans from the same texts, and negative pairs
are sampled from different texts.
We adopt this philosophy to reorganize the latent space based on writing styles, pulling human-written texts to cluster together while remaining distant from AI-generated ones. Analogous to semantic textual similarity tasks, where sentence similarity ranges from 0 to 5 to reflect varying degrees of semantic overlap, we incorporate ordinal regression into our framework to model the degree of human involvement in collaborative texts, ranging from 0 (fully AI-generated) to 1 (fully human-written).
DeTeCtive also took advantage of contrastive learning and was used in binary
AI-generated text detection [24]. Based on a multi-task framework, DeTeCtive was
trained to learn writing-style diversity using a multi-level contrastive loss, and an
auxiliary task of classifying the source of a given text (human vs. AI) to capture dis-
tinguishable signals. For inference, the pipeline encoded the input text as a hidden
vector and applied dense retrieval to match the cluster, depending on the stylistic
similarity with a previously indexed training feature database. Additionally, instead
of retraining the model when faced with new data, they encoded new data with the
trained encoder to obtain embeddings, and then added them to the feature database
to augment. This largely improved the generalizability in unseen domains and new
generators.
However, this approach only considers two types of text: human-written and
AI-generated, without further consideration of human-AI collaborative texts. Our
work fills this gap with the goal of improving generalization performance for fine-
grained AI-generated text detection.
2.6 Summary
In this chapter, I have reviewed the foundational concepts and recent advance-
ments in the detection of AI-generated text. I began by examining the backgrounds
of large language models and their increasing use as writing assistants in both aca-
demic and professional contexts. I then discussed the importance of fine-grained
datasets for training detection systems and the challenges associated with gener-
alizing detection methods across domains, languages, and evolving AI models.
Finally, I explored the role of contrastive learning in improving the detection of
AI-generated content. This background provides the basis for the development of
the ScholarSleuth framework, which leverages these techniques to offer a compre-
hensive solution for detecting AI-generated content in academic writing.
CHAPTER 3. METHODOLOGY
3.1 Overview
This chapter presents the overall FAID architecture and the methodology for de-
tecting and analyzing AI-generated texts with an emphasis on generalization and
robustness to unseen data. I begin by exploring stylistic and semantic consistencies
among outputs of different LLM families, establishing the foundation for author-
ship attribution and classification. Through empirical analysis, I identify distinct
family-specific traits, justifying the use of family-level labels.
Next, I evaluate a variety of sentence encoders to select the optimal model for
my task. Building on this, I introduce a multi-level contrastive learning framework
that leverages hierarchical relationships between text sources and model families
to enhance representation learning. To further improve performance and encourage
broader generalization, I incorporate a multitask auxiliary classifier trained with
cross-entropy loss alongside the contrastive objective.
Finally, to handle texts from unseen models or domains without requiring re-
training, I propose an adaptive inference pipeline that combines the trained encoder
with a vector database and Fuzzy k-Nearest Neighbors for robust similarity-based
classification. This strategy significantly improves resilience to distributional shifts
in real-world deployment scenarios.
3.2 Overview of FAID Architecture
In this work, I formulate the task as a three-class classification problem: human-
written, AI-generated, and human-AI collaborative texts. The human-AI collabora-
tive category involves a range of interactions between humans and AI systems, such
as human-written then AI-polished, human-initiated then AI-continued, human-
written then AI-paraphrased, AI-generated then AI-humanized, etc.
Given the growing variety and complexity (e.g., deeply mixed text) of collab-
orative patterns, I do not exhaust all patterns. Instead, I consolidate all forms of
human-AI collaboration into a single label to maintain practical simplicity and
model generalizability. This approach reflects real-world usage, where using AI
tools to enhance clarity or expression is increasingly common and often ethically
acceptable.
I also investigate the dataset and find that AI models within the same family tend to have similar writing styles and text distributions, due to their shared training data and architecture, as described in more detail in Section 3.3.2. Therefore,
I consider each model’s family to be an author with unique writing styles.
I aim to enable the encoder to capture multi-level similarities between authors (AI, human, and human-AI) as in Equation 3.6, forcing the model to learn features that can differentiate distinct authors in a high-dimensional hidden vector space. I denote S_c as cosine similarity, ϕ(·) as the encoder function, and P_i, P_j (1 ≤ i ≤ j < 5) as distributions of different text subsets. I have:

E_{x∼P_i, y∼P_j}[S_c(ϕ(x), ϕ(y))] ≥ E_{x∼P_i, y∼P_{j+1}}[S_c(ϕ(x), ϕ(y))]    (3.1)

where P_1 corresponds to the distribution generated by a particular LLM family, P_2 to the distribution generated by any LLM, P_3 to the distribution of collaborative texts generated by humans and the LLM family of P_1, P_4 to the distribution of collaborative texts generated by humans and any LLM family, and P_5 to the distribution of human-written text.

To clarify the rationale for configuring FAID to expect that the similarity of a text x (from the lower-level distributions P_1 or P_2) with samples from P_3 is generally greater than or equal to its similarity with samples from P_4, consider the following.

If x ∼ P_1 (x is generated by a particular LLM): let y_LHS be drawn from P_3 and y_RHS be drawn from P_4. Naturally, the similarity S_c(x, y_LHS) should be greater than S_c(x, y_RHS), because P_3 contains texts that share a direct LLM-family origin with x.

If x ∼ P_2 (x is generated by any LLM): in this scenario, with y_LHS from P_3 and y_RHS from P_4 as defined above, the similarity S_c(x, y_LHS) is generally expected to be equal to S_c(x, y_RHS). Since x can originate from any LLM, it does not inherently possess a stronger connection to the specific LLM family in P_3 than to the broader human-LLM collaborations represented in P_4.

This configuration aims to ensure that closeness in distribution corresponds to heightened similarity after encoding, encouraging the model to discern fine-grained multi-level relations.
3.3 Training Architecture
3.3.1 Architecture Design
Leveraging multi-level contrastive learning loss, I fine-tune a language model
based on the human, human-AI, and AI texts to reorganize the embedding space.
The goal is to pull embeddings of the same authorship category closer and push
apart embeddings of different origins. This method allows the encoder to capture
Figure 3.1: Training architecture. Leveraging a multi-level contrastive learning loss, we fine-tune a language model on the human, human-AI, and AI texts, forcing the model to reorganize the hidden space, pulling the embeddings within the same author family closer and pushing the embeddings from different authors farther apart. We train an encoder that can represent text with distinguishable signals to discern the authorship of a text.
nuanced differences across the three major classes: human-written, human-AI col-
laborative, and AI-generated.
As illustrated in Figure 3.1, three categories of texts are first passed through a shared encoder, specifically XLM-RoBERTa [45], which is selected for its multilingual capability and robust performance on diverse datasets. The encoder transforms the input into dense vector representations. These embeddings are then used
On the right side of the figure, the visualized embedding space demonstrates
the desired outcome: embeddings from the same author group (e.g., Human, GPT,
Llama, or Human + DeepSeek) are pulled together, while embeddings from dif-
ferent groups are pushed away. For example, human-only embeddings are grouped
separately from those of GPT or Llama. Meanwhile, collaborative texts such as
Human + DeepSeek are positioned between AI-only and Human-only clusters, re-
flecting their hybrid nature.
This structure not only supports three-way classification, but also enables deeper
analysis by identifying subtle signals of AI involvement, supporting downstream
tasks like generator attribution and authorship clustering.
More detailed information about the model selection process is provided in Sec-
tion 4.4.
3.3.2 What is an LLM family?
Here, I define an LLM family as all the models from one company, across different model generations. For example, GPT-4 and GPT-4o both come from OpenAI; GPT-4o is newer than GPT-4, but they are considered the same family. This assumption was made because different generations within the same family share training data, architectures, guiding styles, etc.
To confirm this assumption and assess the stylistic and semantic consistency of AI-generated texts, I conducted a comprehensive analysis from various perspectives: N-gram distributions, text length patterns, and semantic embedding visualization, presented in Section 4.5. These analyses allow me to examine similarities both within the same model family and across different model families, leading to a robust understanding of LLM "authorship" characteristics.
3.3.3 Multi-level Contrastive Learning
Given a dataset with N samples, the i-th sample is denoted as T_i and assigned three-level labels indicating its source:

x: if T_i is fully AI-generated, x_i = 0; otherwise x_i = 1;
y: if T_i is fully human-written, y_i = 0; otherwise y_i = 1;
z: an indicator of a specific LLM family.

The encoder ϕ(·) represents the sample T_i in a d-dimensional vector space R^d. Then I calculate the cosine similarity between two samples T_i and T_j, denoted by σ(i, j) = S_c(ϕ(T_i), ϕ(T_j)).

For AI-generated text (x = 0), the similarity between T_i and another AI-generated text T_j is greater than that with a human-written or collaborative text T_k (x = 1):

σ(i, j) > σ(i, k),   x_i = 0, x_i = x_j, x_k = 1    (3.2)

If x = 0, the text is fully AI-generated. For this case, I do not consider y, since fully AI-generated text involves no human collaboration. I can imply that the similarity between two texts written by the same AI family is higher than that between texts from two different AI families. Hence:

σ(i, j) > σ(i, k),   x_i = 0, z_i = z_j, z_i ≠ z_k    (3.3)

The reverse condition (with y = 1) also holds. I can conclude that:

σ(i, j) > σ(i, k),   x_i = 1, y_i = y_j, y_i ≠ y_k    (3.4)

For cases where all samples are human-AI collaborative texts (x = 1, y = 1), two texts involving the same AI family tend to be more similar than ones that involve contributions from different families. That is:

σ(i, j) > σ(i, k),   x_i = 1, y_{i,j,k} = 1, z_i = z_j ≠ z_k    (3.5)

Combining all of the above, the text representation is learned with the following constraints:

σ(i, j) > σ(i, k),   x_i = 0, x_i = x_j ≠ x_k;
σ(i, j) > σ(i, k),   x_i = 0, z_i = z_j ≠ z_k;
σ(i, j) > σ(i, k),   x_i = 1, x_i = x_j ≠ x_k;
σ(i, j) > σ(i, k),   x_i = 1, y_i = y_j ≠ y_k;
σ(i, j) > σ(i, k),   x_i = 1, y_i = y_j = y_k = 1, z_i = z_j, z_i ≠ z_k    (3.6)
To enforce the similarity constraints outlined in Equation 3.6, I build upon the SimCLR framework [10] and introduce a strategy for defining both positive and negative sample pairs, which forms the basis of my contrastive learning loss. Departing from traditional contrastive losses that rely on a single positive sample, my approach considers a group of positive instances that satisfy specific criteria. The similarity between the anchor and positive samples is computed as the average similarity across this entire positive set. For negative samples, I follow the methodology used in SimCLR. The resulting contrastive loss, expressed in Equation 3.7, involves q as the anchor sample, K⁺ as the positive sample set, K⁻ as the negative sample set, τ as the temperature parameter, and N_{K⁺} as the number of positive samples.

L_q = −log [ exp( (1/N_{K⁺}) Σ_{k∈K⁺} S(q, k)/τ ) / ( exp( (1/N_{K⁺}) Σ_{k∈K⁺} S(q, k)/τ ) + Σ_{k∈K⁻} exp( S(q, k)/τ ) ) ]    (3.7)
Different constraints result in varied sets of positive and negative samples. Based on these sets, contrastive losses are computed at multiple levels. Following Equation 3.7, the loss induced by each inequality in Equation 3.6 is denoted as L_{q_i, ε}, where ε = 1, ..., 5, respectively. To form Equation 3.8, I add the coefficients α, β, γ, δ, and ζ to maintain the balance of the multi-level relations.

L_mcl = Σ_{i=1}^{N} [ x_i (α L_{q_i,1} + β L_{q_i,2}) + (1 − x_i)(γ L_{q_i,3} + δ L_{q_i,4} + ζ L_{q_i,5}) ]    (3.8)
Because the last inequality in Equation 3.6 only specifies the case y = 1, while the other cases consider both values of y, I have ζ = 2γ = 2δ. Also, to maintain equilibrium, I keep α + β = γ + δ + ζ. I set γ = δ = 1, so ζ = 2 and α = β = 2.
This approach encourages the model to capture subtle and detailed features from
different sources. As a result, the model becomes more adept at recognizing vari-
ations in writing styles. This capability enhances its accuracy and strengthens its
ability to generalize when detecting AI-generated text.
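To make Equations 3.7 and 3.8 concrete, the following minimal PyTorch sketch computes the group-positive contrastive term for a single anchor and the per-sample multi-level combination. It is an illustration under stated assumptions, not the exact FAID implementation: the construction of the five positive/negative sets from Equation 3.6 is assumed to be done by the caller, and all tensor names are placeholders.

import torch
import torch.nn.functional as F

def group_contrastive_term(anchor, positives, negatives, tau=0.07):
    # One term L_q from Equation 3.7: the numerator uses the average similarity
    # over the positive set K+, the denominator adds SimCLR-style negatives K-.
    # anchor: (d,), positives: (P, d), negatives: (M, d) encoder outputs.
    a = F.normalize(anchor, dim=-1)
    pos = F.normalize(positives, dim=-1)
    neg = F.normalize(negatives, dim=-1)
    pos_term = torch.exp((pos @ a / tau).mean())   # exp of mean_{k in K+} S(q,k)/tau
    neg_term = torch.exp(neg @ a / tau).sum()      # sum_{k in K-} exp(S(q,k)/tau)
    return -torch.log(pos_term / (pos_term + neg_term))

def multi_level_combination(x_i, level_losses, alpha=2.0, beta=2.0, gamma=1.0, delta=1.0, zeta=2.0):
    # Per-sample weighting from Equation 3.8; level_losses = (L1, L2, L3, L4, L5),
    # where levels that do not apply to this sample are passed in as zero tensors.
    l1, l2, l3, l4, l5 = level_losses
    return x_i * (alpha * l1 + beta * l2) + (1 - x_i) * (gamma * l3 + delta * l4 + zeta * l5)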
3.3.4 Multitask Auxiliary Learning
Multi-task learning [8] allows a model to learn several tasks concurrently by sharing relevant information across them. This joint learning process helps the model develop more general and distinctive features. As a result, it improves the model's ability to generalize to new data. Building on the previously described contrastive learning framework, I extend the encoder by attaching an MLP classifier at its output layer. This classifier is responsible for binary classification, determining whether a given query text was written by a human or generated by an LLM. Denote the probability that the i-th sample has label x_i = 0 as p_i. To train this component, I apply a cross-entropy loss function, denoted as L_ce, defined as follows:

L_ce = −(1/N) Σ_{i=1}^{N} [ x_i log p_i + (1 − x_i) log(1 − p_i) ]    (3.9)
Therefore, the overall loss is computed as:

L = L_ce + L_mcl    (3.10)
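The auxiliary classifier and the combined objective in Equation 3.10 can be wired together roughly as in the sketch below; the hidden size, the MLP shape, and the small epsilon added for numerical stability are assumptions rather than the exact training code.

import torch
import torch.nn as nn

class AuxiliaryHead(nn.Module):
    # Maps a pooled encoder embedding to p_i, the predicted probability that
    # the sample carries label x_i = 0.
    def __init__(self, hidden_dim=768):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(hidden_dim, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, embeddings):
        return torch.sigmoid(self.mlp(embeddings)).squeeze(-1)

def total_loss(embeddings, x_labels, l_mcl, head, eps=1e-8):
    # Equation 3.10: L = L_ce + L_mcl, with L_ce written out as in Equation 3.9.
    p = head(embeddings)
    x = x_labels.float()
    l_ce = -(x * torch.log(p + eps) + (1 - x) * torch.log(1 - p + eps)).mean()
    return l_ce + l_mcl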
3.4 Inference Architecture
3.4.1 Architecture Design
The inference architecture as shown in Figure 3.2 is composed of three sequen-
tial stages that leverage a frozen encoder and a vector-based retrieval system to
generalize beyond training distributions:
(a) Embed the input text into embedding vectors using the fine-tuned en-
coder: All incoming texts, including human-written, human-AI collaborative, AI-generated, and unseen data, are first processed by the previously
fine-tuned XLM-RoBERTa encoder. This encoder remains frozen during in-
ference to ensure stable representation learning.
Figure 3.2: Inference architecture. (a) Embed the input text into an embedding vector using the fine-tuned encoder; (b) use the Fuzzy KNN algorithm to cluster, retrieving the cluster the input text belongs to; (c) the stored vector database VD is created by saving all embeddings of texts in the training and validation sets using the fine-tuned encoder. If the input text is unseen, we embed the unseen text and save it into a temporary vector database VD′, enhancing the generalization of the detector.
(b) Cluster and classify using a fuzzy k-Nearest Neighbors retrieval mecha-
nism: The encoded vectors are compared against those stored in a pre-built
vector database VD, using a fuzzy KNN algorithm. This step helps identify
the closest author-style cluster for the input, even if it comes from an unseen
combination or slightly novel stylistic pattern. The fuzzy clustering allows
some flexibility in ambiguous cases, especially for hybrid or borderline texts
such as human+GPT or human+DeepSeek.
(c) Utilize and expand vector databases to enhance generalization: The main
vector database VD is built from embeddings of all labeled training and vali-
dation data, including diverse author types (Human, GPT, Llama, DeepSeek,
Human+GPT, etc.). For unseen inputs during evaluation, an auxiliary temporary database VD′ is instantiated. This buffer enables dynamic general-
ization by maintaining embeddings of new, previously unobserved samples,
thereby improving clustering performance and robustness on domain-shifted
or author-shifted examples.
This architecture supports not only reliable authorship classification but also
fine-grained clustering across AI model families and collaborative writing types.
The structured use of vector search and embedding consistency enables better in-
terpretability and scalability for real-world deployment scenarios.
More detailed information about the selection and justification of the fuzzy K-
Nearest Neighbor algorithm can be found in Section 3.4.2.
3.4.2 Handling Unseen Data without Retraining
Unseen data, whether from an unseen domain or generated by an unfamiliar
model, remains a significant challenge even for state-of-the-art AI-generated text
detection methods, due to the increasing capabilities of large language models.
I initially tried to use the model's classifier on its own, but ultimately adopted a vector database combined with Fuzzy k-Nearest Neighbors, as illustrated in Figure 3.2 and evaluated in detail in Section 4.6. Specifically, when dealing with unseen data, I use my trained model to embed these samples and add them to the existing vector database. Through careful parameter tuning, this approach enables my system to handle newly encountered unseen data effectively without the need for retraining.
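A minimal NumPy sketch of the retrieval step follows: the query embedding is matched against the vector database and classified by a fuzzy k-nearest-neighbors vote. The choices of k, the fuzzifier m, and the cosine-based distance are illustrative assumptions, and unseen samples can simply be appended to the database arrays, mirroring the temporary database VD′ in Figure 3.2.

import numpy as np

def fuzzy_knn_predict(query, db_vecs, db_labels, n_classes=3, k=5, m=2.0):
    # query: (d,) and db_vecs: (N, d) are L2-normalized embeddings from the frozen
    # encoder; db_labels: (N,) integer class ids. Returns soft class memberships.
    dists = 1.0 - db_vecs @ query                 # cosine distance to every stored vector
    idx = np.argsort(dists)[:k]                   # indices of the k nearest neighbours
    weights = 1.0 / np.maximum(dists[idx], 1e-8) ** (2.0 / (m - 1.0))
    memberships = np.zeros(n_classes)
    for w, i in zip(weights, idx):
        memberships[db_labels[i]] += w            # weighted vote of each neighbour's class
    return memberships / memberships.sum()

# Handling unseen data without retraining: embed the new texts with the same
# frozen encoder and append them to db_vecs / db_labels (or to a temporary copy).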
3.5 Summary
In this chapter, I presented a comprehensive methodology and framework for
fine-grained AI-generated text detection, focusing on the identification of stylistic
and semantic traits among model families. My analysis established the presence of family-level consistency in text length, lexical usage, and embedding structure, motivating my treatment of each LLM family as a unified authoring entity.
I identified the unsupervised SimCSE XLM-RoBERTa-base encoder as the op-
timal backbone for my system, offering superior performance on both known and
unseen LLM outputs. I then proposed a multi-level contrastive learning framework
that enforces nuanced relational constraints among text sources and model families,
effectively enhancing feature representation.
To further support classification, I added a multitask auxiliary loss and intro-
duced a robust inference mechanism using a vector database and Fuzzy KNN, en-
abling the model to generalize to texts from previously unseen generators. These
innovations collectively contribute to a flexible and scalable framework capable
of reliable authorship detection in diverse and evolving LLM-generated content
environments.
CHAPTER 4. EXPERIMENTS AND NUMERICAL RESULTS
4.1 Overview
This chapter presents the experimental validation of the FAID model for fine-
grained AI-generated text detection. I begin by detailing the datasets used in the ex-
periments, including both in-domain and unseen-domain sources, along with syn-
thetic extensions using a variety of large language models. I then introduce three
baseline methods for comparison: LLM-DetectAIve, SVM with N-gram features,
and T5-Sentinel.
Two primary tasks are evaluated: (1) classifying text as human-written, AI-
generated, or human-AI collaborative; and (2) identifying the specific AI generator
responsible for a text. The experiments are conducted under four evaluation condi-
tions: in-domain with known generators, unseen domains, unseen generators, and
a combination of unseen domains and unseen generators.
4.2 FAIDSet
4.2.1 Overview
To address the scarcity of publicly available multilingual fine-grained AI-generated text datasets, I collected FAIDSet, a multilingual, multi-domain, and multi-generator AI-generated text dataset. It contains texts generated by LLMs, texts written by humans, and texts produced collaboratively by humans and LLMs, resulting in a total of 84,000 examples.
FAIDSet covers two domains in which identifying authorship is critical, student theses and paper abstracts, across two languages: Vietnamese and English. I collected student theses from the database of Hanoi University of Science and Technology, and paper abstracts from arXiv and Vietnam Journals Online (VJOL).
FAIDSet contributes not only a large-scale resource for training and evaluation
but also a methodological foundation for future research into multilingual, fine-
grained authorship attribution in the age of large language models.
4.2.2 The Formulation of Human-written Dataset
To build a reliable benchmark for AI-generated text detection, I first curated a
high-quality set of human-written texts across two languages and domains. The
goal was to ensure that these texts reflect genuine human authorship without AI
assistance.
Source | No. of human texts
arXiv abstracts | 2,000
VJOL abstracts | 2,195
HUST theses (English) | 4,898
HUST theses (Vietnamese) | 11,164
Table 4.1: Statistics of human-written text origins in FAIDSet.
I collected undergraduate theses written in Vietnamese and English from the
Hanoi University of Science and Technology (HUST). These documents span a
diverse range of disciplines including computer science, mechanical engineering,
economics, and more. To ensure quality and consistency, I selected theses that
passed internal university review processes and excluded documents with notice-
able signs of AI assistance.
For the paper abstract domain, I sourced human-written English abstracts from
arXiv, a well-known open-access repository of scientific papers. Only papers pub-
lished prior to 2022 were selected to minimize the chance of LLM involvement
during writing. For Vietnamese abstracts, I collected abstracts from Vietnam Journals Online (https://vjol.info.vn), which indexes peer-reviewed academic journals across various scientific disciplines. All abstracts were manually verified to ensure they do not include AI-generated segments.
The resulting dataset for the human-written label is described in Table 4.1. These texts serve as critical anchors in my dataset, helping to train and evaluate AI-generated text detection models with greater reliability and linguistic diversity.
4.2.3 Dataset Statistics
I used the following multilingual LLM families to produce both AI and human-
AI collaborative texts: GPT-4/4o, Llama-3.x, Gemini 2.x, and DeepSeek V3/R1.
Regarding the human-AI collaborative texts, I include AI-polished, AI-continued, and AI-paraphrased variants, where the models are requested to polish or paraphrase inputs
while ensuring the accuracy of any figures and statistics.
My FAIDSet includes 83,350 examples, which are separated into three subsets:
train, validation, and test, with the ratio shown in Table 4.2. The dataset also comprises various sources of human-written texts, which are described in Table 4.1.
4.2.4 Diverse Prompt Strategies
To avoid biasing my generated corpora toward any single style or topic, I employ a broad set of prompt templates when synthesizing AI-generated and human-AI collaborative texts.
Subset | Human | AI | Human-AI collab.
Train | 14,176 | 12,076 | 32,091
Validation | 3,038 | 2,588 | 6,876
Test | 6,879 | 2,599 | 3,038
Table 4.2: Number of examples per label in each subset of FAIDSet.
By varying prompt structure, content domain, and complexity, I ensure that the resulting outputs cover a wide distribution of writing patterns, vocabulary, and rhetorical devices. This diversity helps my detector generalize more effectively to real-world inputs [52]. Depending on the data source and context, I craft prompts to create varied outputs suitable for different real-world application scenarios. I generate responses with different tones using prompts such as "You are an IT student..." and "...who are very familiar with abstract writing...".
Concretely, I sample two out of five prompts for AI-generated texts and for several categories of human-AI collaborative texts in Table 4.3:
AI-polished: Texts that were originally written by a human and then lightly refined by an AI system to improve grammar, clarity, or fluency without altering the core content or intent.
AI-continued: Texts where a human wrote an initial portion (e.g., a sentence or paragraph), and an AI generated a continuation that attempts to follow the original style, tone, and intent.
AI-paraphrased: Texts that were originally written by a human and then reworded by an AI system to express the same meaning using different phrasing, possibly altering sentence structure or word choice while preserving the original message.
By mixing prompts across these categories, I generate a balanced corpus that
mitigates over-fitting to any one prompt pattern and better reflects the diversity of
real user queries.
AI-generated
Student Thesis:
- "You are an IT student who are writing the graduation thesis. In a formal academic tone, write this section of a thesis on , clearly outlining the structure, then write the passage concisely. The original text: ."
- "In clear, structured prose, draft the section for a thesis titled , cite some related works you mentioned in the passage and highlighting the contribution. The original text: ."
Paper Abstract:
- "You are a computer scientist who are very familiar with abstract writing for your works based on the title. Craft a concise word_count-word abstract for a paper titled , summarizing the problem statement, methodology, key findings, and contributions. The original text: ."
- "Compose a word_count-word abstract for the paper , ensuring it includes motivation, approach, results, and implications for future research. The original text: ."

AI-polished
Student Thesis:
- "You are an IT student who are refining your work to make it more complete. Refine the following section excerpt for grammar, clarity, and academic style while preserving its original meaning and terminology. The original text: ."
- "You are an IT student who are refining your work to make it more complete. Enhance the academic tone, coherence, and logical flow of this thesis section without altering technical content. The original text: ."
Paper Abstract:
- "You are a computer scientist who are very familiar with abstract writing and refining the written abstract. Improve the coherence, precision, and formal tone of this draft abstract without introducing new content. The original text: ."
- "Improve the clarity and conciseness of this abstract paragraph while maintaining all original findings and terminology. The original text: ."

AI-continued
Student Thesis:
- "You are an IT student who are writing your graduation thesis. Continue to write the section from this thesis excerpt for approximately word_count words, maintaining formal academic structure and style. The original text: ."
- "You are an IT student who are writing your graduation thesis. Extend the section by adding supporting detailed information for a thesis on . The original text: ."
Paper Abstract:
- "You are a computer scientist who are very familiar with abstract writing. Add some concise concluding sentences to this partial abstract that highlights implications for future research. The original text: ."
- "Continue the abstract by writing a closing statement that underscores the study's contributions and potential applications. The original text: ."

AI-paraphrased
Student Thesis:
- "Paraphrase the following thesis section in a clear academic tone, preserving citations and technical terms exactly. The original text: ."
- "Reword this thesis excerpt to improve readability and maintain its scholarly voice, keeping all references unchanged. The original text: ."
Paper Abstract:
- "Rephrase this abstract in formal academic English, maintaining all original citations and technical accuracy. The original text: ."
- "Paraphrase this abstract paragraph to enhance clarity and flow, ensuring all technical terms and citations remain intact. The original text: ."

Table 4.3: Samples of diverse prompt templates used to generate FAIDSet.
4.3 Other Datasets
In addition to FAIDSet, for in-domain evaluation, I used two other datasets:
LLM-DetectAIve [3] encompasses various domains including arXiv, Wikihow,
Wikipedia, Reddit, student essays, and peer reviews. The original human-written
and machine-generated texts were augmented using a variety of LLMs (e.g., Llama 3, Mixtral, Gemma) to create a 236,000-example dataset with two new labels: machine-written then machine-humanized, and human-written then machine-polished.
Model | #Params | Known Generators (Acc / F1-macro / MSE / MAE) | Unseen Generators (Acc / F1-macro / MSE / MAE)
RoBERTa-base | 125M | 80.09 / 76.22 / 0.7328 / 0.3778 | 73.45 / 69.10 / 0.8901 / 0.4320
FLAN-T5-base | 248M | 80.19 / 75.77 / 0.7783 / 0.3947 | 72.80 / 68.55 / 0.9123 / 0.4467
e5-base-v2 | 109M | 81.53 / 77.90 / 0.8023 / 0.4086 | 74.21 / 70.15 / 0.8804 / 0.4392
Multilingual-e5-base | 278M | 91.41 / 90.82 / 0.3436 / 0.1732 | 85.32 / 84.50 / 0.5102 / 0.2543
XLM-RoBERTa-base | 279M | 91.90 / 90.63 / 0.2345 / 0.1190 | 86.75 / 85.20 / 0.4125 / 0.2104
Sup-SimCSE-RoBERTa-base | 279M | 81.22 / 78.88 / 0.7102 / 0.3619 | 74.00 / 71.30 / 0.8420 / 0.4251
UnSup-SimCSE-RoBERTa-base | 279M | 82.19 / 79.38 / 0.7156 / 0.3637 | 75.10 / 72.40 / 0.8305 / 0.4207
UnSup-SimCSE-XLM-RoBERTa-base | 279M | 92.12 / 91.75 / 0.1904 / 0.0958 | 87.45 / 86.90 / 0.3507 / 0.1802
Table 4.4: Encoder selection on known vs. unseen generators. The best results in each column are in bold.
HART [6] is a dataset spanning the domains of student essays, arXiv abstracts, story writing, and news articles. It has four categories: human-written texts, LLM-refined texts, AI-generated texts, and humanized AI-generated texts. The authors further generated 21,500 cases based on this dataset with unbalanced labels.
To evaluate the generalization ability of FAID in unseen scenarios, I collected the following data:
Unseen domain: I created a dataset of 150 IELTS Writing essays from Kaggle [25], where all original texts are human-written. These essays were then used to generate human-AI collaborative and AI-generated texts, using the same models as FAIDSet.
Unseen generators: I selected 150 human-written abstracts from the FAIDSet test set and generated data for the remaining labels using three new LLM families: Qwen 2, Mistral, and Gemma 2.
Unseen domain & generators: Based on the human-written IELTS essays above, instead of the original four LLM families, I used the same LLM families as in the unseen-generators test set to generate data for the AI and human-AI labels, following the same process as with the unseen domain.
4.4 Encoder Selection for FAID Architecture
To identify the best encoder for my classification task, I evaluated each candi-
date model on both known generators (FAIDSet testing data) and the new unseen-
generator test set introduced in Section 4.3. Each transformer-based encoder was
fine-tuned on FAIDSet training data and then used to predict labels on the two eval-
uation splits. Table 4.4 summarizes the accuracy, F1-macro, Mean Squared Error
(MSE), and Mean Absolute Error (MAE) for each model under both conditions.
Base model comparison. I first evaluated three popular monolingual models:
RoBERTa-base [31], Flan-T5-base [13], and e5-base-v2 [47] using the same train-
ing and evaluation splits. RoBERTa-base and e5-base-v2 achieved the most bal-
anced trade-off between classification accuracy and regression error (MSE, MAE),
while Flan-T5-base lagged slightly in F1-macro. These results indicated that a
stronger encoder backbone yields more robust performance and motivated explo-
ration of multilingual variants for further gains.
Multilingual variants. Next, I experimented with XLM-RoBERTa-base [14]
and Multilingual-e5-base [48]. Both models leverage cross-lingual pretraining,
which in my scenario appears to improve representation of diverse linguistic pat-
terns present in FAIDSet. In particular, XLM-RoBERTa-base delivered a substan-
tial boost in all metrics, suggesting that its multilingual training enhances general-
ization even on primarily monolingual inputs.
Contrastive learning with SimCSE. Finally, I integrated contrastive learning
via SimCSE [19] to further refine the sentence embeddings. I tested both super-
vised (trained on NLI data) and unsupervised (trained on Wikipedia corpus) Sim-
CSE variants applied to RoBERTa-base. The unsupervised version outperformed
its supervised counterpart, confirming prior findings that unsupervised SimCSE
often yields stronger semantic encoders for downstream tasks. Encouraged by
this, I adopted the unsupervised SimCSE approach on XLM-RoBERTa-base [45],
which achieved the highest accuracy and lowest error rates across all configura-
tions.
Based on these experiments, I selected the unsupervised SimCSE
XLM-RoBERTa-base model for my final system.
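For reference, embeddings like those compared in Table 4.4 can be obtained with the Hugging Face transformers library roughly as below. The checkpoint name is only a placeholder for the base multilingual model, and the mean pooling is an illustrative choice (SimCSE-style encoders often use the [CLS] representation instead); neither is necessarily the exact configuration used here.

import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

CHECKPOINT = "xlm-roberta-base"   # placeholder; swap in the unsupervised SimCSE XLM-R weights
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
encoder = AutoModel.from_pretrained(CHECKPOINT).eval()

@torch.no_grad()
def embed(texts):
    # Masked mean pooling over the last hidden states, followed by L2 normalization.
    batch = tokenizer(texts, padding=True, truncation=True, max_length=512, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state            # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()   # (B, T, 1)
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
    return F.normalize(pooled, dim=-1)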
4.5 Analysis on Text generated by Different AI Families
4.5.1 Text Distribution between AI Families
I first examined the distribution of text length, measured in both word and
character counts across outputs from five AI models: Llama-3.3-70B-Instruct-
Turbo [32], GPT-4o-mini [38], Gemini 2.0, Gemini 2.0 Flash-Lite, and Gemini 1.5
Flash [20]. Using 2000 arXiv prompt seeds, each model generated a corresponding
output, and their length distributions were plotted.
From Figure 4.1, distinct patterns emerged between families. Gemini models
consistently produced shorter and more compact outputs, while Llama and GPT
models demonstrated more variance and a tendency toward longer completions.
Despite using different versions (e.g., Gemini 2.0 vs. Gemini 1.5 Flash), the Gem-
ini outputs remained closely grouped in terms of both word and character counts,
suggesting a shared output generation strategy and stylistic consistency across the
family. In contrast, Llama and GPT distributions showed greater separation and variability.
Figure 4.1: Text length distributions in words (a) and characters (b) across Llama-3.3, GPT-4o/4o-mini, Gemini 2.0, Gemini 2.0 Flash-Lite, and Gemini 1.5 Flash.
This reinforces the hypothesis that text length behavior is not only model-
dependent but also family-coherent, with Gemini models forming a distinct cluster.
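The word- and character-length statistics behind Figure 4.1 can be collected with a few lines of Python; the input schema (a list of records with 'model' and 'text' fields) is an assumption made for illustration.

from collections import defaultdict

def length_distributions(samples):
    # samples: iterable of dicts like {"model": "...", "text": "..."} (assumed schema).
    # Returns per-model lists of word counts and character counts for histogram plotting.
    words, chars = defaultdict(list), defaultdict(list)
    for s in samples:
        words[s["model"]].append(len(s["text"].split()))
        chars[s["model"]].append(len(s["text"]))
    return words, chars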
4.5.2 Text Distribution between AI Models within the Same Family
I performed N-grams frequency analysis on responses generated by three mod-
els within the Gemini family (Gemini 2.0, Gemini 2.0 Flash, and Gemini 1.5 Flash), using 500 texts from the arXiv abstract dataset. Figure 4.2 highlights overlapping
high-frequency tokens and similar patterns in word usage and phrase structure
among the three models.
Despite minor differences in architectural speed (e.g., Flash vs. regular) or re-
lease chronology, the N-grams distributions show minimal divergence. Frequently
used tokens, such as domain-specific terms and transitional phrases, appeared with
nearly identical frequencies. This suggests that these models share similar decod-
ing strategies and training biases, likely due to shared pretraining corpora and op-
timization techniques, resulting in highly consistent stylistic patterns.
These intra-family similarities support treating model variants within a family
as a unified authoring entity when performing analysis or authorship attribution.
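The top-3-gram comparison of Figure 4.2 can be reproduced in spirit with scikit-learn's CountVectorizer; the top-20 cutoff mirrors the figure, while everything else is an illustrative default.

from sklearn.feature_extraction.text import CountVectorizer

def top_ngrams(texts, n=3, top_k=20):
    # Count word-level n-grams over all texts and return the top_k most frequent ones.
    vectorizer = CountVectorizer(ngram_range=(n, n), lowercase=True)
    counts = vectorizer.fit_transform(texts).sum(axis=0).A1
    vocab = vectorizer.get_feature_names_out()
    order = counts.argsort()[::-1][:top_k]
    return [(vocab[i], int(counts[i])) for i in order]

# e.g. compare top_ngrams(gemini_2_0_texts) with top_ngrams(gemini_1_5_flash_texts)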
4.5.3 Embedding Visualization and Semantic Cohesion
To examine deeper semantic alignment, I visualized embeddings produced by the unsupervised SimCSE XLM-RoBERTa-base model for texts generated by two model families: the Gemini family (texts generated by Gemini 2.0, Gemini 2.0 Flash-Lite, and Gemini 1.5 Flash) and GPT-4o/4o-mini, using PCA to reduce the high-dimensional embeddings into a low-dimensional space.
As shown in Figure 4.3, Gemini model embeddings form tight, overlapping
clusters, indicating a high degree of semantic cohesion among their outputs. This
clustering remains consistent across both sample sizes of 2000 texts. In contrast,
GPT-4o/4o-mini embeddings occupy a separate region of the space, with greater
dispersion and less overlap with Gemini clusters.
This visualization confirms that the Gemini family not only shares stylistic fea-
tures but also maintains semantic coherence, distinguishing it from models of other
families at a conceptual level.
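The projections in Figure 4.3 correspond to a standard PCA over the encoder embeddings; a minimal sketch, assuming a dictionary that maps a family name to its (N, d) embedding matrix, is shown below.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

def plot_family_embeddings(embeddings_by_family, n_components=2):
    # Fit one shared PCA over all families so the clusters are directly comparable.
    names = list(embeddings_by_family)
    stacked = np.vstack([embeddings_by_family[name] for name in names])
    projected = PCA(n_components=n_components).fit_transform(stacked)
    start = 0
    for name in names:
        count = len(embeddings_by_family[name])
        plt.scatter(projected[start:start + count, 0], projected[start:start + count, 1], s=5, label=name)
        start += count
    plt.legend()
    plt.title("PCA of text embeddings by model family")
    plt.show()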
4.5.4 Key Findings
The consistency observed across N-grams distributions, text length patterns, and
semantic embeddings among Gemini models substantiates my decision to treat
each AI family as an author. These models demonstrate coherent writing styles, shared lexical preferences, and tightly clustered semantic representations, hallmarks of unified authorship.
Figure 4.2: Top 20 most common 3-grams from Gemini 2.0, Gemini 2.0 Flash-Lite, and Gemini 1.5 Flash, using 500 sample prompts.
Figure 4.3: Clustering behavior of the Gemini model family (Gemini 2.0, Gemini 2.0 Flash-Lite, Gemini 1.5 Flash) and GPT-4o/4o-mini in 2D (a) and 3D (b) embedding spaces, with a sample size of 2,000 texts.
Algorithm | Known generators (Accuracy / F1-macro) | Unseen generators (Accuracy / F1-macro)
k-Nearest Neighbors | 90.52 / 90.21 | 85.37 / 84.95
k-Means | 88.13 / 87.48 | 80.22 / 79.81
Fuzzy k-Nearest Neighbors | 95.18 / 95.05 | 93.31 / 93.25
Fuzzy C-Means | 92.67 / 92.31 | 90.04 / 89.53
Table 4.5: Comparison of clustering algorithms on known vs. unseen generators. The best results are shown in bold.
Conversely, inter-family comparisons show clear separability, emphasizing the distinctiveness of each AI model family's writing behavior.
4.6 How Can We Deal with Unseen Data?
4.6.1 The Need to Use Vector Database
I first used the trained model to classify the text and observed a performance
decline from 92.12% to 87.45% when evaluating on unseen models, which is re-
flected in Table 4.4. This drop reflects the model’s limited ability to generalize
beyond the distribution of its original training data. To address this, I propose inte-
grating a vector database that stores dense embeddings of all collected examples,
both training and incoming unlabeled data.
By indexing and retrieving semantically similar examples at inference time, the
vector database serves as a flexible, scalable memory that bridges gaps between
training and test distributions, enhancing the classifier’s resilience to domain shifts
and unseen generators. Specifically:
Robust Domain Adaptation: New inputs are matched against a broad, con-
tinuously growing repository of embeddings, allowing the classifier to lever-
age analogous instances from related domains without full retraining.
Generator-Independent Coverage: As novel text generators emerge, their
embeddings populate the database; retrieval naturally adapts to new styles or
patterns by finding the closest existing vectors.
4.6.2 Clustering Algorithm Selection
To improve my detector’s robustness against unseen domains and generators,
I evaluated four clustering strategies on the encoded vectors stored in my vec-
tor database. Each algorithm was tasked with grouping text samples into human-
written, AI-generated, and human-AI collaborative categories, using both known-
39
CHAPTER 4. EXPERIMENTS AND NUMERICAL RESULTS
generator data (held-out from training) and entirely unseen generator data. Perfor-
mance was measured in terms of accuracy and F1-macro score.
I encoded each text sample using my pretrained sequence-classification model’s
penultimate layer, then applied clustering within the vector database to assign soft
or hard cluster labels corresponding to the three text-origin classes. From the re-
sults in Table 4.5, I draw some key findings:
Traditional algorithms show reasonable performance on held-out known
generators but degrade notably on unseen generators.
Fuzzy C-Means leverages membership degrees to handle overlapping distri-
butions, improving both accuracy and F1-macro by about 4% over k-Means,
with smaller degradation on unseen data.
Fuzzy KNN combines local neighbor information with fuzzy membership,
achieving the best overall performance: 95.2% accuracy and 95.1% F1-macro
on known generators, and 93.3% accuracy and 93.3% F1-macro on unseen
generators.
Given its superior ability to adapt to novel domains and generators through weighted neighbor voting and soft cluster assignments, I adopt Fuzzy k-Nearest Neighbors as the clustering component in my full architecture.
4.7 Baseline Models for FAID Evaluation
LLM-DetectAIve [3]: Fine-tuned a RoBERTa model [31] on fine-grained English AI-generated texts with a learning rate of 2 × 10⁻⁵ for 10 epochs.
Support Vector Machine with N-grams [39]: Utilized SVM with N-gram
features for binary AI text detection (Human vs. AI). I adapted this approach to
a three-class detection. I first generated word-level trigram features using Count
Vectorizer, and then used a Linear SVM to train with the gold labels to guide the
classification.
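A minimal scikit-learn sketch of this adapted baseline, word-level trigram counts feeding a linear SVM over the three classes; the regularization strength and other settings are assumptions rather than the exact values used in the experiments.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Gold labels follow the ordinal encoding of Section 4.9:
# 0 = human-written, 1 = human-AI collaborative, 2 = AI-generated.
svm_baseline = make_pipeline(
    CountVectorizer(ngram_range=(3, 3), lowercase=True),  # word-level trigram counts
    LinearSVC(C=1.0),                                      # linear SVM (C is an assumed default)
)
# svm_baseline.fit(train_texts, train_labels)
# predictions = svm_baseline.predict(test_texts)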
T5-Sentinel [11]: Utilized T5-Sentinel to classify text into two labels (Hu-
man vs. AI). This approach leverages the T5 model to predict the conditional
probability of the next token, integrating the classification task into a sequence-
to-sequence completion framework. This unique method directly leverages the strength of encoder-decoder models, which are trained to understand context and
generate coherent text.
4.8 Experimental Settings to Evaluate FAID Architecture Performance
I evaluate the performance of my proposed method FAID and baseline models
under two main tasks: (1) fine-grained classification of text into three categories (human-written, AI-generated, and human-AI collaborative), and (2) identification
of the specific AI generator responsible for the text.
For in-domain evaluation, models are trained and tested on splits of the same
dataset, using FAIDSet, LLM-DetectAIve, and HART. These datasets include ex-
amples generated by known LLMs that are also available during training.
For out-of-domain generalization, I assess the detectors trained on FAIDSet
and test them on unseen datasets constructed to simulate three settings:
Unseen domain: where text originates from a new domain (IELTS essays).
Unseen generators: where texts are generated by previously unseen LLMs
(Qwen 2 [40], Mistral [27], Gemma 2 [21]).
Unseen domain & generators: combining both shifts.
Each model is evaluated on both classification and generator identification
tasks. For the three-class classification task, I report Accuracy, Precision, Recall,
F1-macro, MSE, and MAE. For the generator identification task, I classify each
text into one of five categories (Human, GPT, Gemini, DeepSeek, Llama) and re-
port Accuracy, Precision, Recall, and F1-macro.
This experimental setup enables a comprehensive assessment of model perfor-
mance in both familiar and challenging generalization scenarios.
4.9 Evaluation Metrics
To evaluate the performance of text classification in this thesis, we adopt several
standard metrics, including Accuracy, Precision, Recall, F1 Score, Mean Squared
Error, and Mean Absolute Error. Below are the definitions and formulas for each metric:
Accuracy: Accuracy measures the proportion of correct predictions among the total number of cases examined. It is defined as:

Accuracy = (TP + TN) / (TP + TN + FP + FN)    (4.1)

where TP is true positives, TN is true negatives, FP is false positives, and FN is false negatives.

Precision: Precision quantifies the number of true positive predictions made out of all positive predictions (both true and false). It is defined as:

Precision = TP / (TP + FP)    (4.2)

Recall: Recall (also known as sensitivity or true positive rate) measures the number of true positives identified out of all actual positives. It is defined as:

Recall = TP / (TP + FN)    (4.3)

F1 Score: The F1 Score is the harmonic mean of precision and recall. It balances the trade-off between the two, especially on imbalanced datasets. It is calculated as:

F1 Score = 2 × (Precision × Recall) / (Precision + Recall)    (4.4)
Because the text distribution across classes follows a natural order (human-written texts being the most sparse, followed by human-AI collaborative, and AI-generated being the most frequent), we denote the labels as: human-written = 0, human-AI collaborative = 1, and AI-generated = 2. This numerical encoding reflects the semantic distance among the categories and allows the use of regression-based error metrics (MSE and MAE) not only to capture misclassifications but also to penalize confusion between more distant classes (e.g., mistaking human for AI) more heavily than between adjacent classes (e.g., human-AI vs. AI). This setup aims to minimize the severity of such confusion, encouraging more context-aware classification.
Mean Squared Error: MSE measures the average squared difference between predicted and actual values. It penalizes larger errors more heavily and is defined as:

MSE = (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)²    (4.5)

where y_i is the true value, ŷ_i is the predicted value, and n is the number of data points.

Mean Absolute Error: MAE measures the average absolute difference between predicted and actual values, providing a more interpretable metric in terms of units. It is defined as:

MAE = (1/n) Σ_{i=1}^{n} |y_i − ŷ_i|    (4.6)
These metrics together provide a comprehensive evaluation of the models used in this work, covering both classification effectiveness and regression prediction accuracy.
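The reported numbers can be computed with scikit-learn as in this short sketch, using the ordinal label encoding defined above; it is illustrative rather than the exact evaluation script.

from sklearn.metrics import (accuracy_score, precision_recall_fscore_support,
                             mean_squared_error, mean_absolute_error)

def evaluate(y_true, y_pred):
    # y_true / y_pred use the encoding 0 = human-written, 1 = human-AI collaborative, 2 = AI-generated.
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0)
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision,
        "recall": recall,
        "f1_macro": f1,
        "mse": mean_squared_error(y_true, y_pred),
        "mae": mean_absolute_error(y_true, y_pred),
    }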
Scenario: In-domain, known generator (metrics: Accuracy / Precision / Recall / F1-macro / MSE / MAE)
FAIDSet:
  LLM-DetectAIve: 94.34 / 94.45 / 93.79 / 94.10 / 0.1888 / 0.1107
  SVM (N-grams): 84.41 / 84.53 / 82.93 / 83.62 / 0.5630 / 0.2916
  T5-Sentinel: 93.31 / 94.92 / 93.10 / 93.15 / 0.2104 / 0.1101
  FAID: 95.58 / 95.78 / 95.33 / 95.54 / 0.1719 / 0.0875
LLM-DetectAIve:
  LLM-DetectAIve: 95.71 / 95.78 / 95.72 / 95.71 / 0.1606 / 0.1314
  SVM (N-grams): 83.43 / 76.09 / 71.84 / 73.60 / 0.2878 / 0.2064
  T5-Sentinel: 94.77 / 94.70 / 92.60 / 93.60 / 0.1663 / 0.1503
  FAID: 96.99 / 95.29 / 88.14 / 91.58 / 0.1561 / 0.0754
HART:
  LLM-DetectAIve: 94.39 / 94.25 / 94.33 / 94.29 / 0.3244 / 0.1789
  SVM (N-grams): 58.93 / 59.75 / 61.07 / 60.30 / 1.2334 / 0.6849
  T5-Sentinel: 86.68 / 87.25 / 87.69 / 87.38 / 0.4339 / 0.2334
  FAID: 96.73 / 97.61 / 98.05 / 97.80 / 0.4631 / 0.1806

Scenario: Unseen domain, unseen generator (metrics: Accuracy / Precision / Recall / F1-macro / MSE / MAE)
Unseen domain:
  LLM-DetectAIve: 52.83 / 47.31 / 64.62 / 53.28 / 0.4733 / 0.4722
  SVM (N-grams): 39.28 / 43.13 / 29.41 / 22.77 / 0.7689 / 0.6611
  T5-Sentinel: 55.56 / 49.54 / 66.67 / 55.34 / 0.4444 / 0.4444
  FAID: 62.78 / 70.73 / 71.77 / 69.46 / 0.4514 / 0.4486
Unseen generators:
  LLM-DetectAIve: 75.71 / 73.25 / 75.63 / 74.30 / 0.3714 / 0.2957
  SVM (N-grams): 74.19 / 61.88 / 50.17 / 55.17 / 0.4324 / 0.3162
  T5-Sentinel: 85.95 / 85.77 / 84.59 / 85.16 / 0.3648 / 0.2419
  FAID: 93.31 / 92.40 / 94.44 / 93.25 / 0.1691 / 0.1167
Unseen domain and unseen generators:
  LLM-DetectAIve: 62.93 / 66.74 / 71.17 / 61.97 / 0.4479 / 0.3964
  SVM (N-grams): 34.50 / 39.38 / 26.67 / 21.70 / 0.9186 / 0.7429
  T5-Sentinel: 57.07 / 49.82 / 66.61 / 55.45 / 0.4314 / 0.4300
  FAID: 66.55 / 74.44 / 73.57 / 72.58 / 0.3939 / 0.3167
Table 4.6: Performances of detectors with three labels. The best results are in bold and the second best are underlined.
4.10 Is the Text Human, AI, or Human-AI?
Table 4.6 shows the performance of FAID and the three baselines in four evaluation settings. FAID consistently achieves the best accuracy in the (i) in-domain with known generators, (ii) unseen domain, (iii) unseen generators, and (iv) unseen domain & generators settings, followed by LLM-DetectAIve in (i) and (iv), and T5-Sentinel in (ii) and (iii). By taking advantage of pretrained language models (i.e., RoBERTa, T5), these baselines outperform SVM in most cases, particularly on the HART dataset.
FAID’s approach enhances the generalization performance over unseen domains
and generators, compared with other methods. From the absolute accuracy, I can
find that generalizing to (ii) unseen domains and (iv) unseen domain & generator
remains challenging, with accuracy of 62.78% and 66.55% respectively. The result
suggests that FAID is an effective method to address the multilingual fine-grained
AI-generated text detection task, improving the performance by leveraging multi-
level contrastive learning to capture generalizable stylistic differences tied to AI
families, rather than overfitting to surface-level artifacts.
Dataset | Detector | Accuracy | Precision | Recall | F1-macro
FAIDSet | LLM-DetectAIve | 0.7596 | 0.7697 | 0.7690 | 0.7653
FAIDSet | SVM (N-grams) | 0.6764 | 0.6515 | 0.6126 | 0.6312
FAIDSet | T5-Sentinel | 0.7568 | 0.7985 | 0.7840 | 0.7837
FAIDSet | FAID | 0.7964 | 0.8328 | 0.8352 | 0.8327
LLM-DetectAIve | LLM-DetectAIve | 0.9049 | 0.9064 | 0.8352 | 0.8693
LLM-DetectAIve | SVM (N-grams) | 0.8567 | 0.6516 | 0.6127 | 0.6313
LLM-DetectAIve | T5-Sentinel | 0.8154 | 0.8137 | 0.8009 | 0.8105
LLM-DetectAIve | FAID | 0.9089 | 0.8817 | 0.8672 | 0.8737
HART | LLM-DetectAIve | 0.8900 | 0.8787 | 0.8674 | 0.8715
HART | SVM (N-grams) | 0.6347 | 0.5454 | 0.4385 | 0.4806
HART | T5-Sentinel | 0.7852 | 0.7713 | 0.7834 | 0.7759
HART | FAID | 0.8996 | 0.9157 | 0.8648 | 0.8667
Table 4.7: Accuracy of detectors in identifying generators: human, GPT, Gemini, DeepSeek, and Llama. The best is in bold.
4.11 Identifying Different Generators
I further examine the detector's ability to identify specific AI generators that were seen during training. The goal of FAID is not only to detect AI involvement but
also to identify the specific AI model or family applied, treating them as dis-
tinct authors. As shown in Table 4.7, across the three datasets, FAID consistently
achieves higher performance compared to other baselines in almost all metrics.
LLM-DetectAIve achieves comparable results with my detector FAID on the test
set of LLM-DetectAIve dataset, except that its precision is slightly lower. In the
generator identification task, FAID also shows high discriminative power across
AI families (GPT, Gemini, DeepSeek, Llama), achieving over 90% accuracy in
in-domain settings (FAIDSet, LLM-DetectAIve). This suggests that AI model out-
puts still retain distinctive patterns that can be captured with a sufficiently nuanced
detection framework.
FAID’s high performance when dealing with text originating from diverse
known generators within these datasets indicates that FAID learned unique writ-
ing patterns and features of different AI generators by leveraging multi-level con-
trastive learning.
4.12 Discussion
The results across both experiments demonstrate that FAID is a robust and gen-
eralizable detector for fine-grained AI-generated content detection. It consistently
outperforms strong baselines in both three-way classification (human, AI, human-
AI) and generator identification tasks, across a wide range of domains and genera-
tors (Table 4.6, Table 4.7).
In particular, FAID achieves strong generalization to unseen generators and domains, a key requirement for real-world deployment where new models and writing styles continuously emerge. This demonstrates that multi-level contrastive
learning, by encouraging the model to learn deeper stylistic and structural signals
rather than superficial artifacts, is an effective strategy for building generalizable
detectors.
Failure case discussion. While FAID performs strongly overall, some failure cases reveal interesting challenges. One notable example comes from an unseen-domain sample that the detector classified as human-AI collaborative, even though it should be considered predominantly human-written. The misclassified sample is shown below:
Sample case of detector failure
Lately, many researchers are using neural topic models to automatically find topics in text because they are simpler than older, math-heavy methods like LDA. However, these newer models have two main weaknesses: they often make poor assumptions about how topics are structured, and they sometimes can't figure out the specific blend of topics within a document. To solve this, we created a new method called the Bidirectional Adversarial Topic (BAT) model. It's the first of its kind to use a special bidirectional adversarial training technique. This technique creates a two-way street between the words in a document and the topics it represents, helping the model learn more effectively. We also developed an enhanced version, Gaussian-BAT, which understands how different words relate to each other.
In this case, the sample was primarily written by a human author but was later polished with light AI-assisted editing for grammar and flow improvements. The core ideas, sentence structures, and phrasing largely remained human-authored. However, subtle surface-level regularities introduced by the AI polishing, such as smoother connective phrases and more uniform sentence rhythm, likely contributed to the detector assigning a human-AI collaborative label.
Importantly, this illustrates a key nuance: even when minor AI involvement is
present, its impact on authorship can remain small, and the text should still be
considered predominantly human-written. The detector, while highly effective at
identifying strong AI signatures, may at times be overly sensitive to light editing
effects. This highlights the need for future calibration work to better reflect the true
degree of AI contribution in borderline cases.
4.13 Summary
The experiments demonstrate that the designed FAID framework significantly
outperforms existing baselines in both classification and generator identification
tasks. FAID achieves state-of-the-art performance across all in-domain datasets
and shows robust generalization to unseen domains and generators, outperforming
LLM-DetectAIve, T5-Sentinel, and SVM-based methods.
In the multi-class classification task, FAID achieves the highest accuracy, pre-
cision, and recall in most scenarios, especially when generalization is required,
highlighting its robustness. In the generator identification task, FAID not only cor-
rectly identifies AI involvement but also attributes authorship to the correct model
family with higher precision and F1 scores than the baselines.
These results confirm the effectiveness of FAID’s multi-level contrastive learn-
ing approach, which captures semantic and stylistic cues specific to different AI
model families. This establishes FAID as a strong and generalizable solution for
fine-grained AI-generated text detection and attribution across domains, languages,
and model types.
CHAPTER 5. SCHOLARSLEUTH WEB APPLICATION
5.1 Overview
This chapter introduces the concept of the AI Factor Impact Score, a novel
metric designed to measure the degree of AI influence within a document. I
also present the public-facing implementation of this methodology via Schol-
arSleuth [1], a web-based application developed to help users assess and visual-
ize AI authorship in texts. The platform supports both short text classification and
document-level analysis, generating comprehensive reports that include paragraph-
wise classifications and an overall AFI score.
5.2 AI Factor Impact Score
To provide a quantifiable measure of AI involvement in a given text, I introduce
the AI Factor Impact Score (AFI). This score represents the overall degree of AI
influence across a document, based on paragraph-level classifications.
5.2.1 AFI Score Calculation and Threshold
I first apply the fuzzy k-Nearest Neighbors classifier to each paragraph in the
text. The classifier outputs a probabilistic classification, from which I determine
the dominant label: human-written, AI-generated, or human-AI collaborative.
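To make this inference step concrete, the following is a minimal sketch of a fuzzy kNN decision over paragraph embeddings. The embedding source, the labeled reference set, and the fuzziness parameter m are placeholders for illustration; they are not necessarily the exact configuration used by FAID.

```python
import numpy as np

def fuzzy_knn_label(query_emb, ref_embs, ref_labels, k=5, m=2.0,
                    classes=("human-written", "ai-generated", "human-ai-collaborative")):
    """Classify one paragraph embedding with fuzzy kNN (illustrative sketch).

    query_emb : (d,) embedding of the paragraph to classify
    ref_embs  : (N, d) embeddings of labeled reference paragraphs
    ref_labels: length-N sequence of labels, each one of `classes`
    Returns (dominant_label, membership dict that sums to 1).
    """
    dists = np.linalg.norm(ref_embs - query_emb, axis=1)
    nearest = np.argsort(dists)[:k]
    # Standard fuzzy kNN weighting: closer neighbors contribute more.
    weights = 1.0 / np.maximum(dists[nearest], 1e-12) ** (2.0 / (m - 1.0))
    membership = {c: 0.0 for c in classes}
    for idx, w in zip(nearest, weights):
        membership[ref_labels[idx]] += w
    total = sum(membership.values())
    membership = {c: v / total for c, v in membership.items()}
    return max(membership, key=membership.get), membership
```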
Let the document be composed of $n$ paragraphs. For paragraph $i$, let:
$w_i$ denote the number of words in paragraph $i$,
$L_i$ denote the label assigned to paragraph $i$,
$f(L_i)$ denote the AFI weight for the label $L_i$:
$$
f(L_i) =
\begin{cases}
0 & \text{if } L_i = \text{human-written} \\
1 & \text{if } L_i = \text{AI-generated} \\
0.7 & \text{if } L_i = \text{human-AI collaborative}
\end{cases}
\qquad (5.1)
$$
The overall AI Factor Impact Score for the document is then calculated as a
weighted average:
$$
\mathrm{AFI} = \frac{\sum_{i=1}^{n} w_i \, f(L_i)}{\sum_{i=1}^{n} w_i}
\qquad (5.2)
$$
This formulation ensures that longer paragraphs contribute proportionally more
to the overall score, yielding a representative measure of AI influence relative to
the actual volume of content.
I interpret the AFI score, expressed on a 0–100 scale, using the following bands:
0–39: Primarily human-written content (Acceptable).
40–69: Mixed human-AI collaboration (Marginally Acceptable).
70–100: Primarily AI-generated content (Unacceptable).
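To make the computation concrete, the sketch below implements Equations (5.1) and (5.2) together with the interpretation bands above; the scaling to a 0–100 range follows the report format (e.g., 73/100), and the label strings are illustrative.

```python
AFI_WEIGHTS = {                      # f(L_i) from Equation (5.1)
    "human-written": 0.0,
    "ai-generated": 1.0,
    "human-ai-collaborative": 0.7,
}

def afi_score(labeled_paragraphs):
    """AI Factor Impact score for a document, scaled to 0-100.

    `labeled_paragraphs` is a list of (text, label) pairs, where each label
    is a key of AFI_WEIGHTS. Word counts w_i weight each paragraph's label,
    as in Equation (5.2).
    """
    total_words, weighted = 0, 0.0
    for text, label in labeled_paragraphs:
        w = len(text.split())                # w_i
        weighted += w * AFI_WEIGHTS[label]   # w_i * f(L_i)
        total_words += w
    return 100.0 * weighted / max(total_words, 1)

def interpret_afi(score):
    """Map an AFI score to the acceptability bands of Section 5.2.1."""
    if score < 40:
        return "Primarily human-written (Acceptable)"
    if score < 70:
        return "Mixed human-AI collaboration (Marginally Acceptable)"
    return "Primarily AI-generated (Unacceptable)"
```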
5.2.2 Why AFI = 0.7 for Human-AI Collaborative Texts?
While a naive assumption might assign a value of 0.5 to the human-AI collaborative
class, implying equal contribution, my decision to set it at 0.7 is based on
several empirical and practical considerations:
In many collaborative cases, the AI’s influence often extends beyond a simple
50-50 split. For example, a paragraph initially written by a human may be
heavily rewritten or paraphrased by an AI, resulting in text that reflects more
of the AI’s style and structure than the original human input.
I observed in user studies and qualitative analyses that collaborative texts
tend to exhibit more stylistic traits associated with AI-generated content than
human-authored writing. These include consistent tone, fluent but generic
phrasing, and minimal semantic drift, all characteristics amplified by AI edit-
ing.
The 0.7 value maintains sensitivity to the elevated influence of AI while preserving
a clear distinction from fully AI-generated text (AFI = 1). It thus
serves as a balanced representation of "partial but substantial" AI involvement.
This approach allows us to flag documents where AI may have had a substantial
impact while avoiding overly penalizing legitimate use of AI as a writing assistant.
5.3 System Overview
To demonstrate the practical utility of the proposed AI-generated content detec-
tion system, I developed a publicly accessible web-based platform named Schol-
arSleuth, available at https://scholarsleuth.tnminh.com. The web
application allows users to interact with the underlying detection model and ob-
tain meaningful insights into the authorship composition of texts. ScholarSleuth
supports two main functionalities:
Analysis Tools: These features allow users to detect the level of AI contribution
in their documents and receive a detailed report that raises their awareness of
how they write. This function is discussed in more depth in Sub-section 5.4.1.
Playground: While I offer detection methods and tools, I do not take an adversarial
stance toward this new norm. The playground offers two features that help users
employ AI generators wisely and improve their writing skills when working with
AI assistants. This function is described in more detail in Sub-section 5.4.2.
These functionalities make ScholarSleuth suitable for academic authors, instructors,
reviewers, or anyone interested in assessing the extent of AI involvement in
written content.
5.4 User Interface Walkthrough
Figure 5.1 shows the homepage, which introduces the system with
a clear call to action and highlights its two main functionalities: Analysis Tools
and Playground. This entry point allows users to begin interacting with the system
with a single click and learn about the features through concise descriptions.
5.4.1 Analysis Tools
ScholarSleuth offers a suite of core analysis tools that help users assess and inter-
pret the degree of AI involvement in a given text. These tools serve as the backbone
of the system’s capability to support fine-grained content attribution. They allow
users to analyze text for transparency, self-evaluation, academic integrity checks,
or curiosity about AI-generated content patterns. The primary tool in this suite is
the Text Analyzer, described below.
a, Text Analyzer
The Text Analyzer feature (Figure 5.2) allows users to submit a text passage
and obtain a detailed analysis of its likely composition in terms of human and
AI involvement. Users can input text through a simple web form and trigger the
analysis by clicking the Analyze button.
The system processes the input and outputs a distribution of label scores across
three categories:
Human-written
AI-generated
Human-AI collaboration
Results are presented visually with a horizontal bar chart and an overall AI Fac-
tor Impact Score, which summarizes the weighted contribution of AI-generated
content and collaboration in the analyzed text.
Figure 5.1: Homepage of the ScholarSleuth system. Users can select between single-text
analysis and the Document Analyzer.
Figure 5.2: Text Analyzer interface. The input text is analyzed into three categories with
visual feedback.
This visualization helps users quickly interpret how much AI influence is present in their text.
In the example shown in Figure 5.2, the analysis indicates a dominant proportion
of human-AI collaborative writing (67.15%), a substantial human-written portion
(32.85%), and no fully AI-generated segments. This fine-grained feedback allows
users to better understand their own writing processes, assess the degree of AI use,
and make informed decisions about transparency and attribution.
b, Document Analyzer
For a more comprehensive review, users can upload entire documents through the
Document Analyzer interface (Figure 5.3). The platform supports PDF files and
provides a detailed multi-level analysis report, including a calculated AI Factor
Impact (AFI) score and a fine-grained paragraph-level classification.
After uploading a document and clicking the Analyze PDF button, the sys-
tem processes the file, chunks it into paragraphs, and returns several key outputs:
A visual AI Factor Impact Score, summarizing the overall AI-related influence
across the document.
A classification breakdown across three categories: Human-written, AI-generated,
and Human-AI collaboration.
Selected Top Examples by Category, where representative text excerpts are
shown for each class to support interpretability and transparency.
An example analysis result is shown in Figure 5.3, based on a random gradua-
tion thesis from a SOICT student in the Fall 2024 semester. In this example, the
document displays a high level of Human-AI collaboration (70.32%), with some
fully AI-generated segments (23.77%) and a small proportion of Human-written
content. Such insights help users and evaluators assess whether the degree of AI
involvement aligns with the expected standards for the document.
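As a rough illustration of this pipeline, the sketch below extracts text from a PDF, chunks it into paragraphs, classifies each chunk, and assembles the report fields. The pypdf-based extraction, the blank-line chunking heuristic, and the classify_paragraph callback (e.g., the fuzzy kNN sketch above) are assumptions for illustration, not necessarily ScholarSleuth's exact implementation; afi_score is the helper sketched in Section 5.2.

```python
from collections import Counter
from pypdf import PdfReader  # assumed PDF library; any text extractor would do

def analyze_pdf(path, classify_paragraph):
    """Document Analyzer sketch: PDF -> paragraphs -> labels -> report fields.

    `classify_paragraph(text)` is expected to return (label, memberships),
    as in the fuzzy kNN sketch; `afi_score` is the Section 5.2 helper.
    """
    pages = PdfReader(path).pages
    text = "\n".join(page.extract_text() or "" for page in pages)
    # Naive chunking heuristic: blank lines delimit paragraphs; very short
    # fragments (headings, captions) are skipped.
    paragraphs = [p.strip() for p in text.split("\n\n") if len(p.split()) >= 20]

    labeled = [(p, classify_paragraph(p)[0]) for p in paragraphs]
    counts = Counter(label for _, label in labeled)
    return {
        "afi": afi_score(labeled),  # overall AI Factor Impact score (0-100)
        "breakdown": {c: n / max(len(labeled), 1) for c, n in counts.items()},
        "examples": {c: [p for p, l in labeled if l == c][:3] for c in counts},
    }
```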
In addition to the interactive on-screen analysis, users can generate a polished
visual report by clicking the Export PDF button. The generated report (Fig-
ure 5.4) summarizes the analysis results in an easy-to-read format that includes:
A visual AI Factor Impact Score bar.
A classification results pie chart.
The full analyzed text excerpts for each category.
The report enhances transparency in content attribution and is suitable for shar-
ing with educators, reviewers, or institutional evaluators. In the illustrated report,
the document received an AFI score of 73/100, which exceeds the critical threshold,
indicating a high level of AI influence in the document.
Figure 5.3: Document Analyzer interface. A document is uploaded and evaluated with a
detailed breakdown and categorized text examples.
With its clean design, responsive layout, and intuitive navigation, ScholarSleuth
lowers the barrier to understanding AI-generated content detection. The platform
not only supports educational applications, but also facilitates transparency in au-
thorship evaluation, thereby promoting ethical writing practices in academic and
professional contexts.
5.4.2 Playgrounds
The Playground serves as an interactive space where users can explore ad-
ditional features beyond AI-generated content detection. One of the guiding prin-
ciples behind the design of ScholarSleuth is not to discourage the use of AI in
writing, but to promote transparency and raise user awareness regarding its appli-
cation. To support this goal, we developed two new features that encourage mindful
and responsible use of AI writing tools.
a, Prompt Crafting Tool
The Prompt Crafting Tool, as shown in Figure 5.5, assists users in designing
prompts that produce AI-generated text matching specific writing styles. This tool
accepts an input text sample and analyzes its style using the underlying LLM-based
detection model. It then suggests optimized prompts that can guide users to gener-
ate new content with similar stylistic characteristics.
An additional unique feature of this tool is its ability to recommend which AI
model is most suitable for producing the desired style of text. This helps users make
informed choices rather than relying blindly on AI tools. The tool also provides
an AI Factor Impact Score and a breakdown of the writing style (Human-written,
AI-generated, Human-AI collaboration), helping users understand the current style
profile of their input.
To further encourage conscious use of AI, the tool suggests different types of
prompt strategies, such as starting with a human-written draft and letting the
AI polish it, or having the AI generate a draft that the user then refines. These
strategies promote human-in-the-loop writing practices instead of fully automated
content generation. This aligns with our system’s goal to foster user awareness and
responsibility when employing AI in academic and professional writing.
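As one way such suggestions could be keyed to the detector's output, the sketch below maps a style breakdown to candidate prompt strategies; the thresholds and template texts are illustrative placeholders rather than the production prompt library.

```python
def suggest_prompt_strategies(breakdown):
    """Suggest prompt strategies from a style breakdown (illustrative sketch).

    `breakdown` maps labels to fractions, e.g. the Text Analyzer output
    {"human-written": 0.33, "ai-generated": 0.0, "human-ai-collaborative": 0.67}.
    """
    suggestions = []
    if breakdown.get("human-written", 0.0) >= 0.5:
        suggestions.append(
            "Keep your draft as the base and ask the AI only to polish grammar "
            "and flow, e.g. 'Edit the following text for clarity without "
            "changing its structure or wording: ...'"
        )
    if breakdown.get("human-ai-collaborative", 0.0) >= 0.5:
        suggestions.append(
            "Ask the AI for an outline or a first draft, then rewrite the key "
            "sentences in your own words so the final style remains yours."
        )
    if breakdown.get("ai-generated", 0.0) >= 0.5:
        suggestions.append(
            "Reduce AI reliance: write the topic sentences yourself and use the "
            "AI only to expand points that you explicitly specify."
        )
    return suggestions or ["The text is balanced; document how AI was used."]
```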
Figure 5.4: Generated report summarizing AI Factor Impact (AFI) score and classification
results.
Figure 5.5: Prompt Crafting Tool interface. This feature provides users with a range of
prompts that help them produce text with AI assistants wisely.
b, Rewrite Challenge
Figure 5.6 illustrates the Rewrite Challenge. This is an interactive gamified fea-
ture designed to help users improve their ability to identify and adapt writing styles
between human-like and AI-like text. In each challenge, users are presented with a
text passage and asked to rewrite it according to one of two modes: Humanize It or
Make It More AI-like.
The tool then evaluates the rewritten text using an AI-based scoring system that
assesses its stylistic alignment with the target mode. Users receive a correspond-
ing score and feedback to guide their learning. A leaderboard encourages friendly
competition and motivates users to improve their skills through repeated practice.
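One plausible way to score such rewrites is to reuse the detector's class memberships as an alignment signal, as sketched below; the classify_paragraph callback and the feedback thresholds are assumptions for illustration, not necessarily how the deployed scorer works.

```python
def rewrite_score(rewritten_text, mode, classify_paragraph):
    """Score a Rewrite Challenge submission on a 0-100 scale (sketch).

    mode: "humanize" or "make_ai_like".
    `classify_paragraph(text)` returns (label, memberships) with keys
    "human-written", "ai-generated", and "human-ai-collaborative".
    """
    _, memberships = classify_paragraph(rewritten_text)
    target = "human-written" if mode == "humanize" else "ai-generated"
    score = round(100 * memberships.get(target, 0.0))
    feedback = (
        "Strong alignment with the target style."
        if score >= 70
        else "Partial alignment; try adjusting sentence rhythm and word choice."
    )
    return score, feedback
```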
This feature serves a dual purpose. First, it raises user awareness by training
them to better distinguish stylistic markers of human and AI-generated writing.
Second, it fosters more mindful use of AI writing tools by encouraging deliberate
style choices rather than passive acceptance of AI output. Through this engaging
format, the Rewrite Challenge supports our broader goal of promoting ethical and
transparent AI use in writing.
5.5 Summary
In this chapter, I proposed and formalized the AI Factor Impact score as a prac-
tical and interpretable metric for assessing the level of AI contribution in textual
content. By assigning scalar weights to each classification label, the AFI score
captures nuanced authorship dynamics at scale. The paragraph-level classification
approach, combined with word count-based weighting, ensures a representative
and document-sensitive evaluation.
Figure 5.6: Rewrite Challenge interface. Users can use it to improve their writing skills
and raise their awareness of AI-generated content.
I implemented this methodology into ScholarSleuth, a fully functional web
platform that enables users to analyze text passages or upload full documents for
comprehensive authorship analysis. The system generates AFI scores, label distri-
butions, and downloadable reports to support ethical decision-making and promote
transparency in content creation.
Importantly, ScholarSleuth is designed not to discourage the use of AI in writ-
ing, but to raise user awareness about its influence. To support this goal, the system
includes an interactive Playground with features such as the Prompt Crafting Tool
and the Rewrite Challenge. These tools encourage users to experiment with writing
styles and reflect on the role of AI in their writing processes. Through this balanced
approach, ScholarSleuth fosters responsible and informed use of AI-assisted writ-
ing technologies.
CHAPTER 6. CONCLUSIONS AND BROADER IMPACTS
6.1 Conclusion
In this work, I introduced the ScholarSleuth system, which includes three main
components.
First, FAIDSet is a multilingual and fine-grained benchmark for AI-generated
text detection, comprising approximately 84,000 samples across human-written,
AI-generated, and human-AI collaborative categories.
Building on this dataset, I proposed FAID, a novel detection framework that
combines multi-level contrastive learning and multi-task auxiliary objectives. By
treating AI model families as stylistic "authors," FAID captures nuanced writing
patterns that persist across different domains and models. My approach integrates
a fuzzy KNN-based inference mechanism and a training-free incremental adapta-
tion strategy to enhance performance and robustness in both in-domain and out-of-
distribution settings. Experimental results demonstrate that FAID achieves up to
95.58% accuracy in in-domain, known-generator settings and 93.31% on unseen
generators, significantly outperforming strong baselines across a variety of datasets
and configurations, especially in detecting collaborative texts and adapting to unseen
generative models, key challenges in the evolving landscape of AI-assisted
writing.
To support practical deployment, I also developed ScholarSleuth, a web-based
application where users can analyze documents and receive detailed reports, including
an AI Factor Impact score, a reference metric estimating the degree of AI
contribution in a given text. The web application also provides features that
help users become more aware of how they write, promoting transparency and
responsibility in AI-assisted content creation.
While ScholarSleuth demonstrates strong performance and generalization
across various domains and AI models, several limitations remain. First, although
my dataset is multilingual and multi-domain, it is still limited in terms of low-
resource languages and niche writing domains, which may affect performance in
those contexts. Second, my framework is based on the assumption that texts produced by
AI models from the same family share similar stylistic features. However, this
assumption may break down when a single text is influenced by multiple AI systems,
such as when a human uses different models for drafting, rewriting, and polishing.
In such cases, the resulting style may blend traits from multiple sources,
making it more difficult to attribute authorship to a single AI family or clearly
distinguish collaboration boundaries.
6.2 Suggestion for Future Works
In future work, I aim to further enrich FAIDSet by incorporating additional lan-
guages, including low-resource ones, more generative models, and a broader range
of domains such as informal texts, social media posts, and educational writing.
This expansion will support better generalization in multilingual and low-resource
scenarios.
I also plan to extend FAID's capabilities to handle increasingly complex human-
AI collaborative scenarios. These include texts generated by multiple AI families
and subsequently edited by humans, reflecting real-world behaviors and work-
flows. Additionally, I will explore integrating more fine-grained attribution tech-
niques to distinguish not only between human and AI contributions but also to
localize segments within collaborative texts.
On the application side, I intend to enhance the ScholarSleuth web platform to
support batch analysis, educational integration, and real-time document feedback.
These improvements aim to further the impact of FAID in real-world use cases,
particularly in academic and editorial settings where authorship transparency is
essential.
6.3 Ethics and Broader Impacts
Data Collection and Licenses. A primary ethical consideration is the data license.
I reused existing datasets for my research: LLM-DetectAIve, HART, and
IELTS Writing, which have been publicly released under clear licensing agreements.
I adhere to the intended usage specified by these dataset licenses.
Security Implications. FAIDSet streamlines both the creation and rigorous test-
ing of FAID. By spotting AI-generated material, FAID helps preserve academic in-
tegrity, flag potential misconduct, and protect the genuine contributions of authors.
More broadly, it supports efforts to prevent the misuse of generative technologies in
areas such as credential falsification. Detecting AI-crafted content across different
languages can be tricky, due to each language's unique grammar and style. By enabling
robust, multilingual and multi-generator detection with accurate results, FAIDSet
empowers people everywhere, especially in academic scenarios, to deploy AI re-
sponsibly. At the same time, it fosters critical digital literacy, giving everyone a
clear understanding of both the strengths and limits of generative AI.
Responsible Use of AI-generated Text Detection. FAID is designed to enhance
transparency in AI-assisted writing by enabling the fine-grained detection of AI
involvement in text generation. While this has clear benefits for academic integrity
and content provenance, I acknowledge the potential for misuse. For instance, such
tools could be used to unfairly penalize individuals in educational or professional
settings based on incorrect or biased predictions. To mitigate this, I stress that FAID
is not intended for high-stakes decision-making without human oversight.
Bias and Fairness. AI-generated text detection systems may inadvertently en-
code or amplify biases present in training data. FAIDSet has been carefully con-
structed to include diverse domains and languages to reduce such biases. Nonethe-
less, I encourage ongoing auditing and benchmarking of fairness across popula-
tions and writing styles, and welcome community feedback for further improve-
ments.
Transparency and Reproducibility. To promote open research and community
contributions, I have publicly released my code and data.
PUBLICATIONS AND AWARDS GIVEN TO THIS THESIS
1. Publications
The core research of this thesis has been written up in two publications,
which are under review at The 2025 Conference on Empirical Methods in Natural
Language Processing. I am the first author of both:
[1] FAID: Fine-grained AI-generated Text Detection using Multi-task Auxiliary
and Multi-level Contrastive Learning [2].
[2] ScholarSleuth: A System for Fine-grained AI-generated Text Detection for
Academic Integrity [1].
Some supporting components of this thesis have been published in three publications:
[3] LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detec-
tion, In Proceedings of The 2024 Conference on Empirical Methods in Natu-
ral Language Processing: System Demonstration (I co-first authored this pa-
per) [3].
[4] GenAI Content Detection Task 1: English and Multilingual Machine-
Generated Text Detection: AI vs. Human, In Proceedings of the 1st Workshop
on GenAI Content Detection, The 2025 International Conference on Compu-
tational Linguistics (I co-authored this paper) [51].
[5] Is Human-Written Text Liked by Humans? Multilingual Human Detection and
Preference Against AI, Under-review in Proceedings of The 2025 Conference
on Empirical Methods in Natural Language Processing (I co-authored this
paper) [52].
2. Awards
This thesis won the Second Prize at the 42nd Student Scientific Research
Conference at Hanoi University of Science and Technology, in the area of Applied
AI, Blockchain and Big Data [43].
REFERENCES
[1] Minh Ngoc Ta, Duc-Anh Hoang, Dong Cao Van, Minh Le-Anh, Truong
Nguyen, My Anh Tran Nguyen, Yuxia Wang, Preslav Nakov, and Sang Dinh.
ScholarSleuth: A System for Fine-grained AI-generated Text Detection for
Academic Integrity. 2025. OpenReview: UnderReview.
[2] Minh Ngoc Ta, Dong Cao Van, Duc-Anh Hoang, Minh Le-Anh, Truong
Nguyen, My Anh Tran Nguyen, Yuxia Wang, Preslav Nakov, and Sang
Dinh. FAID: Fine-grained AI-generated Text Detection using Multi-task Aux-
iliary and Multi-level Contrastive Learning. 2025. arXiv: 2505 . 14271
[cs.CL]. URL: https://arxiv.org/abs/2505.14271.
[3] Mervat Abassy, Kareem Elozeiri, Alexander Aziz, Minh Ngoc Ta, Raj
Vardhan Tomar, Bimarsha Adhikari, Saad El Dine Ahmed, et al. “LLM-
DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection”.
In: Proceedings of the 2024 Conference on Empirical Methods in Natural
Language Processing: System Demonstrations. Miami, Florida, USA: Asso-
ciation for Computational Linguistics, Nov. 2024, pp. 336–343. DOI: 10 .
18653/v1/2024.emnlp-demo.35.
[4] Ekaterina Artemova, Jason S Lucas, Saranya Venkatraman, Jooyoung Lee,
Sergei Tilga, Adaku Uchendu, and Vladislav Mikhailov. “Beemo: Bench-
mark of Expert-edited Machine-generated Outputs”. In: Proceedings of the
2025 Conference of the Nations of the Americas Chapter of the Associa-
tion for Computational Linguistics: Human Language Technologies (Vol-
ume 1: Long Papers). Ed. by Luis Chiruzzo, Alan Ritter, and Lu Wang.
Albuquerque, New Mexico: Association for Computational Linguistics,
Apr. 2025, pp. 6992–7018. ISBN: 979-8-89176-189-6. URL: https : / /
aclanthology.org/2025.naacl-long.357/.
[5] The Australian. “TEQSA survey: Over a third of students use chatbots for
assessments without viewing it as cheating”. In: (2025). URL: https :
/ / www . theaustralian . com . au / higher - education /
unis - warned - chatbots - taking - over - and - return -
to - pen - and - paper - is - futile / news - story /
0a3e668fe18e4f8c7a65365d2de2e9eb.
[6] Guangsheng Bao, Lihua Rong, Yanbin Zhao, Qiji Zhou, and Yue Zhang.
Decoupling Content and Expression: Two-Dimensional Detection of AI-
Generated Text. 2025. arXiv: 2503 . 00258 [cs.CL]. URL: https :
//arxiv.org/abs/2503.00258.
[7] Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Ka-
plan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry,
Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger,
Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu,
Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin,
Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCan-
dlish, Alec Radford, Ilya Sutskever, and Dario Amodei. “Language Models
are Few-Shot Learners”. In: Advances in Neural Information Processing Sys-
tems. Ed. by H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H.
Lin. Vol. 33. Curran Associates, Inc., 2020, pp. 1877–1901. URL: https:
/ / proceedings. neurips . cc /paper_ files/ paper / 2020/
file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf.
[8] Rich Caruana. “Multitask learning”. In: Machine learning 28 (1997), pp. 41–
75.
[9] Tuhin Chakrabarty, Philippe Laban, and Chien-Sheng Wu. “Can AI writ-
ing be salvaged? Mitigating Idiosyncrasies and Improving Human-AI Align-
ment in the Writing Process through Edits”. In: Proceedings of the 2025 CHI
Conference on Human Factors in Computing Systems, CHI 2025, Yokohama-
Japan, 26 April 2025- 1 May 2025 . Ed. by Naomi Yamashita, Vanessa Ev-
ers, Koji Yatani, Sharon Xianghua Ding, Bongshin Lee, Marshini Chetty,
and Phoebe O. Toups Dugas. ACM, 2025, 1210:1–1210:33. DOI: 10 .
1145/3706598.3713559. URL: https://doi.org/10.1145/
3706598.3713559.
[10] Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton.
“A Simple Framework for Contrastive Learning of Visual Representations”.
In: Proceedings of the 37th International Conference on Machine Learning.
Ed. by Hal Daumé III and Aarti Singh. Vol. 119. Proceedings of Machine
Learning Research. PMLR, July 2020, pp. 1597–1607. URL: https :/ /
proceedings.mlr.press/v119/chen20j.html.
[11] Yutian Chen, Hao Kang, Vivian Zhai, Liangze Li, Rita Singh, and Bhiksha
Raj. “Token Prediction as Implicit Classification to Identify LLM-Generated
Text”. In: Proceedings of the 2023 Conference on Empirical Methods in
Natural Language Processing. Ed. by Houda Bouamor, Juan Pino, and Ka-
lika Bali. Singapore: Association for Computational Linguistics, Dec. 2023,
pp. 13112–13120. DOI: 10 . 18653 / v1 / 2023 . emnlp - main . 810.
URL: https://aclanthology.org/2023.emnlp-main.810/.
[12] Zihao Cheng, Li Zhou, Feng Jiang, Benyou Wang, and Haizhou Li. “Be-
yond Binary: Towards Fine-Grained LLM-Generated Text Detection via Role
Recognition and Involvement Measurement”. In: Proceedings of the ACM on
Web Conference 2025. WWW ’25. ACM, Apr. 2025, pp. 2677–2688. DOI:
10.1145/3696410.3714770. URL: http://dx.doi.org/10.
1145/3696410.3714770.
[13] Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William
Fedus, Yunxuan Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, et
al. “Scaling instruction-finetuned language models”. In: Journal of Machine
Learning Research 25.70 (2024), pp. 1–53. URL: http://jmlr.org/
papers/v25/23-0870.html.
[14] Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary,
Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke
Zettlemoyer, and Veselin Stoyanov. “Unsupervised Cross-lingual Represen-
tation Learning at Scale”. In: Proceedings of the 58th Annual Meeting of the
Association for Computational Linguistics. Ed. by Dan Jurafsky, Joyce Chai,
Natalie Schluter, and Joel Tetreault. Online: Association for Computational
Linguistics, July 2020, pp. 8440–8451. DOI: 10.18653/v1/2020.acl-
main . 747. URL: https : / / aclanthology. org / 2020 . acl -
main.747/.
[15] DeepSeek-AI. “DeepSeek-V3 Technical Report”. In: CoRR abs/2412.19437
(2024). DOI: 10.48550/ARXIV.2412.19437. arXiv: 2412.19437.
URL: https://doi.org/10.48550/arXiv.2412.19437.
[16] Digital Education Council. Digital Education Council Global AI Student
Survey 2024. https :// www.digitaleducationcouncil. com/
post/digital-education-council- global- ai- student-
survey-2024. 2025.
[17] Liam Dugan, Daphne Ippolito, Arun Kirubarajan, Sherry Shi, and Chris
Callison-Burch. “Real or fake text? investigating human ability to detect
boundaries between human-written and machine-generated text”. In: Pro-
ceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence
and Thirty-Fifth Conference on Innovative Applications of Artificial Intel-
ligence and Thirteenth Symposium on Educational Advances in Artificial
Intelligence. AAAI’23/IAAI’23/EAAI’23. AAAI Press, 2023. ISBN: 978-1-
57735-880-0. DOI: 10 . 1609/aaai . v37i11.26501. URL: https :
//doi.org/10.1609/aaai.v37i11.26501.
[18] Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo
Larochelle, François Laviolette, Mario Marchand, and Victor Lempitsky.
“Domain-Adversarial Training of Neural Networks”. In: Journal of Machine
Learning Research 17.59 (2016), pp. 1–35. URL: http://jmlr.org/
papers/v17/15-239.html.
[19] Tianyu Gao, Xingcheng Yao, and Danqi Chen. “SimCSE: Simple Con-
trastive Learning of Sentence Embeddings”. In: Proceedings of the 2021
Conference on Empirical Methods in Natural Language Processing. Ed. by
Mar ie-Francine Moens, Xuanjing Huang, Lucia Specia, and Scott Wen-tau
Yih. Online and Punta Cana, Dominican Republic: Association for Com-
putational Linguistics, Nov. 2021, pp. 6894–6910. DOI: 10.18653/v1/
2021.emnlp- main.552. URL: https ://aclanthology.org/
2021.emnlp-main.552/.
[20] Gemini Team, Google. Gemini: A Family of Highly Capable Multimodal
Models. 2024. arXiv: 2312.11805 [cs.CL]. URL: https://arxiv.
org/abs/2312.11805.
[21] Gemma Team, Google DeepMind. Gemma 2: Improving Open Language
Models at a Practical Size. 2024. arXiv: 2408 .00118 [cs.CL]. URL:
https://arxiv.org/abs/2408.00118.
[22] GenAI, Meta. Llama 2: Open Foundation and Fine-Tuned Chat Models.
2023. arXiv: 2307 .09288 [cs.CL]. URL: https://arxiv.org/
abs/2307.09288.
[23] John Giorgi, Osvald Nitski, Bo Wang, and Gary Bader. “DeCLUTR: Deep
Contrastive Learning for Unsupervised Textual Representations”. In: Pro-
ceedings of the 59th Annual Meeting of the Association for Computational
Linguistics and the 11th International Joint Conference on Natural Lan-
guage Processing (Volume 1: Long Papers). Ed. by Chengqing Zong, Fei
Xia, Wenjie Li, and Roberto Navigli. Online: Association for Computational
Linguistics, Aug. 2021, pp. 879–895. DOI: 10.18653/v1/2021.acl-
long.72. URL: https://aclanthology.org/2021.acl-long.
72/.
[24] Xun Guo, Shan Zhang, Yongxin He, Ting Zhang, Wanquan Feng, Haibin
Huang, and Chongyang Ma. “DeTeCtive: Detecting AI-generated Text
via Multi-Level Contrastive Learning”. In: Advances in Neural Informa-
tion Processing Systems. Ed. by A. Globerson, L. Mackey, D. Belgrave,
A. Fan, U. Paquet, J. Tomczak, and C. Zhang. Vol. 37. Curran Asso-
ciates, Inc., 2024, pp. 88320–88347. URL: https : / / proceedings .
neurips . cc / paper _ files / paper / 2024 / file /
a117a3cd54b7affad04618c77c2fb18b - Paper- Conference .
pdf.
[25] ibrahimmazlum. IELTS Writing Scored Essays Dataset kaggle.com.
https : / / www. kaggle . com / datasets / mazlumi / ielts -
writing-scored-essays- dataset. [Accessed 29-04-2025]. 2023.
[26] University of Illinois Chicago. “AI tools in education: A report on students'
attitudes and usage”. In: (2025). URL: https://learning.uic.edu/
news- stories/report- on- student- attitudes- towards-
ai-in-academia/.
[27] Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford,
Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna
Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-
Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang,
Timothée Lacroix, and William El Sayed. Mistral 7B. 2023. arXiv: 2310.
06825 [cs.CL]. URL: https://arxiv.org/abs/2310.06825.
[28] Ryuto Koike, Masahiro Kaneko, and Naoaki Okazaki. “OUTFOX: LLM-
Generated Essay Detection Through In-Context Learning with Adversarially
Generated Examples”. In: Proceedings of the AAAI Conference on Artificial
Intelligence 38.19 (Mar. 2024), pp. 21258–21266. DOI: 10.1609/aaai.
v38i19 . 30120. URL: https : / / ojs .aaai.org/ index. php/
AAAI/article/view/30120.
[29] Laida Kushnareva, Tatiana Gaintseva, German Magai, Serguei Barannikov,
Dmitry Abulkhanov, Kristian Kuznetsov, Eduard Tulchinskii, Irina Pio-
ntkovskaya, and Sergey Nikolenko. AI-generated text boundary detection
with RoFT. 2024. arXiv: 2311 . 08349 [cs.CL]. URL: https : / /
arxiv.org/abs/2311.08349.
[30] Mina Lee, Percy Liang, and Qian Yang. “CoAuthor: Designing a Human-AI
Collaborative Writing Dataset for Exploring Language Model Capabilities”.
In: CHI ’22: CHI Conference on Human Factors in Computing Systems, New
Orleans, LA, USA, 29 April 2022 - 5 May 2022. Ed. by Simone D. J. Barbosa,
Cliff Lampe, Caroline Appert, David A. Shamma, Steven Mark Drucker,
Julie R. Williamson, and Koji Yatani. ACM, 2022, 388:1–388:19. DOI: 10.
1145/3491102.3502030. URL: https://doi.org/10.1145/
3491102.3502030.
[31] Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi
Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov.
“RoBERTa: A Robustly Optimized BERT Pretraining Approach”. In: CoRR
abs/1907.11692 (2019). arXiv: 1907 . 11692. URL: http : / / arxiv.
org/abs/1907.11692.
[32] Llama Team, AI @ Meta. “The Llama 3 Herd of Models”. In: (2024). arXiv:
2407.21783 [cs.AI]. URL: https://arxiv.org/abs/2407.
21783.
[33] Dominik Macko, Robert Moro, and Ivan Srba. Increasing the Robustness of
the Fine-tuned Multilingual Machine-Generated Text Detectors. 2025. arXiv:
2503.15128 [cs.CL]. URL: https://arxiv.org/abs/2503.
15128.
[34] Larry R Medsker, Lakhmi Jain, et al. “Recurrent neural networks”. In: De-
sign and Applications 5.64-67 (2001), p. 2.
[35] M Monika, V Divyavarsini, and C Suganthan. “A survey on analyzing the
effectiveness of AI tools among research scholars in academic writing and
publishing”. In: International Journal of Advance Research and Innovative
Ideas in Education 9.6 (2023), pp. 1293–1305.
[36] Ayat A. Najjar, Huthaifa I. Ashqar, Omar A. Darwish, and Eman Hammad.
Detecting AI-Generated Text in Educational Content: Leveraging Machine
Learning and Explainable AI for Academic Integrity. 2025. arXiv: 2501.
03203 [cs.CL]. URL: https://arxiv.org/abs/2501.03203.
[37] OpenAI. “GPT-4 Technical Report”. In: ArXiv abs/2303.08774 (2023). URL:
https://api.semanticscholar.org/CorpusID:257532815.
[38] OpenAI. GPT-4o System Card. 2024. arXiv: 2410 . 21276 [cs.CL].
URL: https://arxiv.org/abs/2410.21276.
[39] Nuzhat Prova. “Detecting AI Generated Text Based on NLP and Machine
Learning Approaches”. In: arXiv preprint arXiv:2404.10032 (2024). URL:
https://arxiv.org/abs/2404.10032.
[40] Qwen Team, Alibaba Group. Qwen2 Technical Report. 2024. arXiv: 2407.
10671 [cs.CL]. URL: https://arxiv.org/abs/2407.10671.
[41] Alec Radford, Jeff Wu, Rewon Child, David Luan, Dario Amodei, and Ilya
Sutskever. “Language Models are Unsupervised Multitask Learners”. In:
(2019).
[42] Shoumik Saha and Soheil Feizi. Almost AI, Almost Human: The Challenge
of Detecting AI-Polished Writing. 2025. arXiv: 2502 .15666 [cs.CL].
URL: https://arxiv.org/abs/2502.15666.
[43] SoICT. Tổng kết Hội nghị Sinh viên nghiên cứu khoa học 2025 [Summary of
the Student Scientific Research Conference 2025]. https://soict.hust.edu.vn/
tong-ket-hoi-nghi-sinh-vien-nghien-cuu-khoa-hoc-2025.html. 2025.
[44] Elon University. “Generative AI tools in education: Improving research skills
and writing ability”. In: (2025). URL: https:/ /www. elon.edu/u/
news/2025/01/23/elon- aacu- survey- focuses- on- ais-
impact-on-teaching-and-learning/.
[45] Jannis Vamvas and Rico Sennrich. “Towards Unsupervised Recognition of
Token-level Semantic Differences in Related Documents”. In: Proceedings
of the 2023 Conference on Empirical Methods in Natural Language Pro-
cessing. Ed. by Houda Bouamor, Juan Pino, and Kalika Bali. Singapore:
Association for Computational Linguistics, Dec. 2023, pp. 13543–13552.
DOI: 10 . 18653 / v1/ 2023 . emnlp - main . 835. URL: https : / /
aclanthology.org/2023.emnlp-main.835/.
[46] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion
Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. “Atten-
tion is All you Need”. In: Advances in Neural Information Processing
Systems 30: Annual Conference on Neural Information Processing Sys-
tems 2017, December 4-9, 2017, Long Beach, CA, USA. Ed. by Isabelle
Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fer-
gus, S. V. N. Vishwanathan, and Roman Garnett. 2017, pp. 5998–6008. URL:
https : / / proceedings . neurips . cc / paper / 2017 / hash /
3f5ee243547dee91fbd053c1c4a845aa-Abstract.html.
[47] Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang,
Daxin Jiang, Rangan Majumder, and Furu Wei. Text Embeddings by
Weakly-Supervised Contrastive Pre-training. 2024. arXiv: 2212 . 03533
[cs.CL]. URL: https://arxiv.org/abs/2212.03533.
[48] Liang Wang, Nan Yang, Xiaolong Huang, Linjun Yang, Rangan Majumder,
and Furu Wei. Multilingual E5 Text Embeddings: A Technical Report. 2024.
arXiv: 2402.05672 [cs.CL]. URL: https ://arxiv. org/abs /
2402.05672.
[49] Pengyu Wang, Linyang Li, Ke Ren, Botian Jiang, Dong Zhang, and Xipeng
Qiu. “SeqXGPT: Sentence-Level AI-Generated Text Detection”. In: Pro-
ceedings of the 2023 Conference on Empirical Methods in Natural Language
Processing. Ed. by Houda Bouamor, Juan Pino, and Kalika Bali. Singapore:
Association for Computational Linguistics, Dec. 2023, pp. 1144–1156. URL:
https://aclanthology.org/2023.emnlp-main.73/.
[50] Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shel-
manov, Akim Tsvigun, Osama Mohammed Afzal, Tarek Mahmoud, Gio-
vanni Puccetti, Thomas Arnold, Alham Aji, Nizar Habash, Iryna Gurevych,
and Preslav Nakov. “M4GT-Bench: Evaluation Benchmark for Black-Box
Machine-Generated Text Detection”. In: Proceedings of the 62nd Annual
Meeting of the Association for Computational Linguistics (Volume 1: Long
Papers). Ed. by Lun-Wei Ku, Andre Martins, and Vivek Srikumar. Bangkok,
Thailand: Association for Computational Linguistics, Aug. 2024, pp. 3964–
3992. DOI: 10 . 18653 / v1 / 2024 . acl - long . 218. URL: https :
//aclanthology.org/2024.acl-long.218/.
[51] Yuxia Wang, Artem Shelmanov, Jonibek Mansurov, Akim Tsvigun,
Vladislav Mikhailov, Rui Xing, Zhuohan Xie, Jiahui Geng, Giovanni Puc-
cetti, Ekaterina Artemova, Jinyan Su, Minh Ngoc Ta, Mervat Abassy, et
al. “GenAI Content Detection Task 1: English and Multilingual Machine-
Generated Text Detection: AI vs. Human”. In: Proceedings of the 1st Work-
shop on GenAI Content Detection (GenAIDetect). Ed. by Firoj Alam, Preslav
Nakov, Nizar Habash, Iryna Gurevych, Shammur Chowdhury, Artem Shel-
manov, Yuxia Wang, Ekaterina Artemova, Mucahid Kutlu, and George
Mikros. Abu Dhabi, UAE: International Conference on Computational Lin-
guistics, Jan. 2025, pp. 244–261. URL: https://aclanthology.org/
2025.genaidetect-1.27/.
[52] Yuxia Wang, Rui Xing, Jonibek Mansurov, Giovanni Puccetti, Zhuohan Xie,
Minh Ngoc Ta, Jiahui Geng, et al. Is Human-Like Text Liked by Humans?
Multilingual Human Detection and Preference Against AI. 2025. arXiv:
2502.11614 [cs.CL]. URL: https://arxiv.org/abs/2502.
11614.
[53] Chengfeng Xu, Jian Feng, Pengpeng Zhao, Fuzhen Zhuang, Deqing Wang,
Yanchi Liu, and Victor S. Sheng. “Long- and short-term self-attention
network for sequential recommendation”. In: Neurocomputing 423 (2021),
pp. 580–589. ISSN: 0925-2312. DOI: https://doi.org/10.1016/
j.neucom.2020.10 .066. URL: https://www.sciencedirect.
com/science/article/pii/S0925231220316441.
[54] Zendy.io. “AI in research for students and researchers: 2025 trends and
statistics”. In: (2025). URL: https : / / zendy. io / blog / ai - in -
research - for - students - researchers - 2025 - trends -
statistics.
[55] Zijie Zeng, Shiqi Liu, Lele Sha, Zhuang Li, Kaixun Yang, Sannyuya Liu,
Dragan Gašević, and Guanliang Chen. “Detecting AI-generated sentences in
human-AI collaborative hybrid texts: challenges, strategies, and insights”. In:
Proceedings of the Thirty-Third International Joint Conference on Artificial
Intelligence. IJCAI ’24. Jeju, Korea, 2024. ISBN: 978-1-956792-04-1. DOI:
10 . 24963 / ijcai . 2024 / 835. URL: https : / / doi . org / 10 .
24963/ijcai.2024/835.
[56] Zijie Zeng, Lele Sha, Yuheng Li, Kaixun Yang, Dragan Gašević, and Guan-
liang Chen. “Towards automatic boundary detection for human-AI collabo-
rative hybrid essay in education”. In: Proceedings of the Thirty-Eighth AAAI
Conference on Artificial Intelligence and Thirty-Sixth Conference on Innova-
tive Applications of Artificial Intelligence and Fourteenth Symposium on Ed-
ucational Advances in Artificial Intelligence. AAAI’24/IAAI’24/EAAI’24.
AAAI Press, 2024. ISBN: 978-1-57735-887-9. DOI: 10 . 1609 / aaai .
v38i20 . 30258. URL: https : / / doi . org / 10 . 1609 / aaai .
v38i20.30258.
[57] Qihui Zhang, Chujie Gao, Dongping Chen, Yue Huang, Yixin Huang,
Zhenyang Sun, Shilin Zhang, Weiye Li, Zhengyan Fu, Yao Wan, and Lichao
Sun. “LLM-as-a-Coauthor: Can Mixed Human-Written and Machine-
Generated Text Be Detected?” In: Findings of the Association for Com-
putational Linguistics: NAACL 2024. Ed. by Kevin Duh, Helena Gomez,
and Steven Bethard. Mexico City, Mexico: Association for Computational
Linguistics, June 2024, pp. 409–436. DOI: 10 . 18653 / v1 / 2024 .
findings - naacl . 29. URL: https : / / aclanthology. org /
2024.findings-naacl.29/.