Plagiarism and AI Thresholds in Academic Theses: A Sociological Framework for Integrity, Fairness, and Quality Assurance
- OUS Academy in Switzerland

- Jun 28, 2024
Author: Li Wei
Affiliation: Independent Researcher
Abstract
The rapid diffusion of generative Artificial Intelligence (AI) into students’ writing practices has intensified long-standing debates on academic integrity. Universities increasingly rely on measurable similarity thresholds to evaluate originality and quality in theses: Less than 10% = Acceptable; 10–15% = Needs Evaluation; Above 15% = Fail. This article develops a critical sociological account of those thresholds using Bourdieu’s concept of capital, world-systems theory, and institutional isomorphism. Drawing on a qualitative synthesis of policies and implementation practices from diverse higher-education contexts, the paper (a) clarifies how thresholds function as “boundary objects” that align academic values with compliance, (b) explains cross-national convergence on numerical limits, (c) explores the promises and limits of AI-assisted writing and AI-content detection, and (d) proposes an actionable evaluation protocol that preserves fairness while promoting learning. The analysis shows that thresholds work best when paired with transparent guidance, formative interventions, and proportionate sanctions. We conclude with recommendations for universities, supervisors, and students to embed AI-aware integrity practices into the thesis lifecycle without inhibiting creativity or equity.
Keywords: academic integrity; plagiarism thresholds; AI-generated text; institutional isomorphism; Bourdieu; world-systems; assessment policy
1. Introduction
Academic integrity is a shared moral commitment and a practical governance challenge. Theses must reflect the candidate’s original intellectual contribution, proper engagement with sources, and honest acknowledgment of assistance. In practice, institutions translate these ideals into operational rules—most visibly, similarity thresholds that mark when overlap with existing texts becomes unacceptable. The rise of AI writing tools has complicated this translation. AI assistance may improve clarity, yet it can also obscure authorship and tempt students to outsource thinking.
This article pursues three goals. First, it situates the widely used thresholds—<10% acceptable; 10–15% needs evaluation; >15% fail—within a sociological framework. Second, it analyzes how AI reshapes the meaning of “similarity,” “originality,” and “authorship.” Third, it proposes a comprehensive, fair, and AI-aware protocol for thesis assessment that combines quantitative thresholds with qualitative judgment.
Rather than assuming a single national model, we consider multiple university types (public/private; research/teaching-focused; residential/distance) across regions. The focus is conceptual and practical: to explain why numerical thresholds have diffused, how they interact with academic cultures, and how they can be used responsibly in the age of AI.
2. Theoretical Framework
2.1 Bourdieu’s Concept of Capital
Bourdieu distinguishes between cultural capital (skills, dispositions), social capital (networks), economic capital (material resources), and symbolic capital (recognized prestige). A thesis is a device for accumulating and converting capital: students convert cultural capital (method, writing) into symbolic capital (a degree), which can later become economic capital (jobs) and social capital (membership in scholarly networks). Plagiarism threatens that conversion by signaling a deficit of cultural capital (insufficient scholarly practice) and a compromise of symbolic capital (loss of trust). Numerical thresholds, when properly contextualized, help protect the symbolic capital of degrees and the field’s credibility, while providing a scaffold to grow cultural capital through feedback.
AI complicates this economy of capital. High-fluency AI assistance may simulate cultural capital (polished prose) without corresponding intellectual mastery. Institutions must therefore calibrate how much AI-mediated fluency is permissible before it undermines the legitimacy of the symbolic capital granted by a thesis award.
2.2 World-Systems Theory
World-systems theory emphasizes core–semi-peripheral–peripheral hierarchies in global knowledge flows. Plagiarism and integrity policies diffuse from “core” academic systems via accreditation networks, journal standards, and global rankings. Thresholds (e.g., <10%, 10–15%, >15%) travel as “global templates” promising comparability. Universities in semi-peripheral and peripheral contexts adopt them to enhance international credibility and student mobility. AI adoption follows a similar pattern: core institutions shape norms (e.g., disclosure of AI assistance), which others emulate. The risk is policy dependency without local adaptation: thresholds borrowed from elsewhere may not fit resource conditions (e.g., training for supervisors, access to detection tools). An equitable approach attends to this global unevenness.
2.3 Institutional Isomorphism
DiMaggio and Powell describe three isomorphic pressures: coercive (accreditation, regulation), mimetic (copying “successful” peers under uncertainty), and normative (professional standards of academics). Similarity thresholds exemplify all three. Accreditation and audit pressures push for quantification (coercive); uncertainty about AI detection encourages copying peer policies (mimetic); and disciplinary norms valorize originality and proper citation (normative). Convergence around thresholds thus reflects rationalization and a quest for legitimacy. Yet isomorphism can produce decoupling: polished policy documents coexist with uneven practice unless institutions invest in staff development and student support.
3. Literature Review
Research on academic integrity spans three strands:
Student behavior and plagiarism drivers. Studies report that plagiarism is influenced by assessment design, workload, prior training, and perceptions of fairness. Formative instruction and clear expectations reduce misconduct.
Detection technologies and their limits. Similarity tools (e.g., text-matching systems) efficiently flag overlap but cannot, by themselves, judge intent or quality. AI-content detectors add a new layer but remain probabilistic; they can be sensitive to genre, language proficiency, and text length.
Policy design and sanctions. Universities balance deterrence (sanctions) with pedagogy (education and support). Well-designed policies emphasize proportionality, due process, and opportunities for learning—especially in borderline ranges.
Across these strands runs a common theme: numbers are useful, but interpretive judgment is indispensable.
4. Methodology
This paper uses a qualitative, comparative approach:
Document analysis: review of institutional policies describing thresholds and AI usage rules from multiple regions (Europe, North America, Asia, Africa).
Case vignettes: anonymized, composite examples of thesis evaluations within different institutional types (large public research universities; teaching-focused colleges; distance education providers).
Analytic coding: policies and cases were coded for (a) threshold definitions, (b) AI disclosure requirements, (c) sanctions, (d) educational support, and (e) appeal processes.
The objective is not to produce a statistical prevalence estimate, but to synthesize patterns and translate them into practice.
5. Operational Definitions and Assessment Logic
Similarity index: percentage of matched text against databases. It includes quotations and references if not excluded by filters.
Plagiarism: unacknowledged appropriation of others’ ideas/words/structures. A high similarity index is a signal, not proof.
AI assistance: any use of generative models in drafting, paraphrasing, translating, editing, or idea exploration.
AI disclosure: a statement in the thesis methods/acknowledgments specifying whether, where, and how AI was used.
Threshold standard considered here:
<10% = Acceptable. Typical for well-cited literature reviews and methods sections.
10–15% = Needs Evaluation. Detailed review by supervisors/committees to check the sources of overlap and the role of AI.
>15% = Fail. Presumptively unacceptable; may trigger sanctions unless the overlap derives from permitted templates or extensive but correctly formatted quotations (rare in theses).
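The three-band standard above can be sketched as a simple classification helper. This is an illustrative fragment, not institutional software; in particular, assigning the boundary values of exactly 10% and 15% to the middle band is an assumption, since the policy text states the bands as <10%, 10–15%, and >15%.

```python
def classify_similarity(similarity_pct: float) -> str:
    """Map a filtered similarity percentage to the three-band standard.

    Assumption (not fixed by the policy text): values of exactly 10%
    and 15% fall into the middle "Needs Evaluation" band.
    """
    if similarity_pct < 10:
        return "Acceptable"
    if similarity_pct <= 15:
        return "Needs Evaluation"
    return "Fail"

# Examples: 6% is acceptable; 12% triggers review; 18% presumptively fails.
print(classify_similarity(6))    # Acceptable
print(classify_similarity(12))   # Needs Evaluation
print(classify_similarity(18))   # Fail
```

The point of encoding the rule is transparency: students and examiners can verify exactly how a score maps to an outcome, including how boundary values are treated.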
6. Analysis
6.1 Why Numbers Travel Well
Numbers travel because they are communicable, auditable, and comparable. They satisfy coercive and mimetic pressures and reassure external stakeholders that the degree represents genuine intellectual labor. However, numerical thresholds must be embedded in rules of interpretation: filters for references, exclusion of boilerplate (e.g., ethics statements), and attention to disciplinary conventions (formula-heavy fields vs. qualitative fields).
6.2 The AI Problem: Fluency Without Ownership
AI can generate novel wording that evades traditional text-matching. Low similarity does not guarantee authorship, and high similarity can still arise from legitimate practices (e.g., dense literature reviews). AI-content detectors provide probabilistic judgments; they should never be used as the sole basis for sanctions. Instead, combine:
Process evidence: drafts, notes, data, analysis code, supervisor meeting logs.
Product evidence: argument coherence, methodological fit, and accurate engagement with sources.
Reflexive disclosure: a clear statement of allowed AI assistance (e.g., grammar polishing) versus prohibited uses (idea generation for results, ghostwriting of analysis).
6.3 Bourdieu in Practice: Capital Conversion and Equity
Students with high cultural capital (prior training, strong language skills) and economic capital (access to writing centers) are less likely to rely on AI beyond acceptable limits. Strict, context-blind sanctions may disproportionately impact students developing academic literacy or writing in an additional language. A formative pathway in the 10–15% range supports capital growth: targeted instruction in paraphrasing, synthesis, and methodological reasoning. The institution’s symbolic capital improves when its rules nurture fair opportunities to learn rather than only to punish.
6.4 World-Systems Lens: Implementing Thresholds Across Uneven Capacities
Institutions outside core academic centers may lack robust detection tools or extensive supervision time. Imposing a global template without resources risks symbolic compliance and real-world unfairness. A realistic model sequences reforms:
Institutional policy and training for supervisors.
Student workshops and exemplars (“good paraphrase,” “bad patchwriting”).
Gradual integration of AI disclosure and process evidence.
Periodic audits to ensure thresholds are applied consistently.
6.5 Institutional Isomorphism and the Risk of Decoupling
Policies may look the same on paper but diverge in practice. Decoupling occurs when thresholds are announced yet ignored due to time pressures or fear of appeals. Reduce decoupling by:
Standardized review forms for the 10–15% range.
Two-stage sign-off (supervisor + second reader).
Archiving process evidence (drafts, feedback history).
Termly calibration meetings where committees discuss anonymized cases.
6.6 The Borderline Zone (10–15%): From Policing to Pedagogy
The critical innovation is to treat 10–15% as a pedagogical trigger:
Require a Similarity Narrative: the student explains sources of overlap, quotation practices, and AI usage.
Conduct a Targeted Oral Defense Check: the student clarifies argument choices in sections flagged for similarity or suspected AI overreach.
Offer a Revision Window: focus on rewriting synthesis sections and improving citation density where appropriate.
Implement Proportionate Sanctions only when intent to deceive is clear or when students ignore corrective feedback.
6.7 The Fail Zone (>15%): Due Process and Proportionality
Presumptive failure should be tempered by formal due process:
Re-run the report with standardized filters (exclude references, standard templates).
Allow the student to submit drafts and notes.
If extensive unattributed copying remains, apply sanctions consistent with institutional policy (grade penalty, resubmission, or dismissal for severe/serial cases).
If a large part of the text is AI-generated without disclosure, treat it as undisclosed third-party authorship, which undermines authorship claims even if the similarity score is low.
6.8 Disciplinary and Genre Sensitivity
Similarity expectations vary:
STEM theses may include repeated method phrases; filters should address boilerplate.
Humanities/social sciences rely heavily on paraphrase and synthesis; instruction should emphasize voice, interpretation, and intertextuality.
Practice-based theses (design, education, management projects) may reuse institutional documents; documented permissions and quotation conventions are vital.
6.9 Designing AI-Aware Assessment
Allowed uses (with disclosure): language polishing, formatting, generating outlines for brainstorming, coding assistance in non-assessment components.
Restricted/prohibited uses: generating literature review prose; crafting findings, analysis, or conclusions; translating without verification; rewriting to mask sources.
Supervisor role: model ethical AI use; maintain records of feedback; request targeted oral explanations where AI overuse is suspected.
6.10 The Political Economy of Platforms
The adoption of detection software and AI tools is shaped by vendor ecosystems. Procurement decisions involve costs, data governance, and equity (e.g., reliable access for distance learners). An integrity strategy should remain tool-agnostic and principle-driven, with clear validation checks and human oversight.
7. Implementation Toolkit
To make thresholds effective and fair, institutions can adopt the following package.
7.1 Policy and Disclosure
Publish the threshold rule: <10% acceptable; 10–15% evaluation; >15% fail, with discipline-specific guidance.
Require an AI Usage Statement in every thesis: tools used; purpose; sections affected; safeguards (fact-checking, citations).
Clarify the status of translation tools and paraphrasers.
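A minimal structured form for the AI Usage Statement could look like the following sketch. All field names and example values here are hypothetical; institutions would adapt the fields to their own disclosure policy.

```python
# Hypothetical structured AI Usage Statement (field names and values illustrative).
ai_usage_statement = {
    "tools_used": ["<tool name and version>"],
    "purpose": "language polishing",              # e.g. grammar, formatting
    "sections_affected": ["Literature Review"],    # where AI assistance was applied
    "safeguards": ["fact-checking", "verified citations"],
}
```

Capturing disclosure as a fixed set of fields makes statements comparable across theses and easier to audit than free-form acknowledgments.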
7.2 Assessment Workflow
Pre-submission support
Mandatory workshops on paraphrasing, synthesis, and AI ethics.
Bank of annotated exemplars (adequate vs. problematic paraphrases).
Submission package
Final thesis + similarity report (filtered) + AI Usage Statement + selected drafts.
Initial screening
<10% → proceed unless anomalies are evident.
10–15% → activate Similarity Narrative and targeted review.
>15% → second reader + due-process protocol.
Oral clarification
Brief, recorded, section-specific questioning (especially methods and analysis).
Decision and feedback
Written rationale; if “Needs Evaluation,” include developmental advice and a revision deadline.
Appeals and calibration
A simple, time-bound appeal route; termly calibration meetings to align practice.
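The initial-screening step of this workflow can be expressed as a short decision sketch. The function and action names below are hypothetical labels for the steps listed in Section 7.2, and the `anomalies` flag stands in for red flags outside the similarity score (e.g., suspected undisclosed AI use).

```python
def initial_screening(similarity_pct: float, anomalies: bool = False) -> list[str]:
    """Return the review actions triggered at initial screening (Section 7.2).

    `anomalies` covers red flags beyond the similarity score; the flag
    name and action labels are illustrative, not institutional terms.
    """
    if similarity_pct > 15:
        return ["second reader", "due-process protocol"]
    if similarity_pct >= 10:
        return ["Similarity Narrative", "targeted review"]
    # Below 10%: proceed unless anomalies are evident.
    return ["targeted review"] if anomalies else ["proceed"]
```

Note that the low-similarity branch still routes to review when anomalies are flagged, mirroring the rule that a low score alone never certifies authorship.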
7.3 Educational Interventions
Micro-modules on quotation, patchwriting, and synthesis.
Language-aware supports for multilingual students.
Supervisor development on AI detection limits and questioning strategies (e.g., asking students to reconstruct argument logic or replicate small analyses).
7.4 Data and Quality Assurance
Anonymized dashboards of outcomes by department and student cohort to detect inequities.
Annual review of false positives/negatives in AI detection.
Clear data-protection rules for student submissions.
8. Case Vignettes (Composite, International)
Large public research university (Europe): A master’s thesis shows 12% similarity, concentrated in the literature review. The student provides a clear Similarity Narrative and AI usage limited to grammar checks. After minor revisions to paraphrasing, the thesis proceeds.
Teaching-focused institution (North America): A professional doctorate thesis reveals 18% similarity, with unattributed blocks in the methodology. The oral check shows limited understanding of the borrowed passages. The committee sanctions a fail with resubmission in the next term, plus mandatory academic writing support.
Distance-education provider (Asia–Middle East): Low similarity (6%) but signs of AI-generated fluency and vague references. A targeted oral exam probes data interpretation; inconsistencies appear. The candidate is asked to provide work logs and drafts; limited authorship evidence results in a major revision requiring re-analysis and explicit AI disclosure.
These vignettes show that numbers trigger judgment; process evidence and oral checks are decisive.
9. Findings
Thresholds as boundary objects. The <10%, 10–15%, >15% rule aligns multiple stakeholders—students, supervisors, examiners, and auditors—around a clear starting point for evaluation.
Borderline range as pedagogy. Treating 10–15% as an educational zone lowers misconduct rates and improves writing quality, especially for multilingual and first-generation students.
AI-aware integrity. Explicit AI disclosure, paired with process evidence and targeted oral checks, offers a fairer basis for authorship judgments than AI-content detectors alone.
Equity and resources matter. World-systems asymmetries caution against policy transfer without investment in training and support.
Isomorphism with practice. Convergence on thresholds increases comparability but must be anchored in supervisor development and periodic calibration to avoid decoupling.
Sociology adds depth. Bourdieu clarifies capital conversion and legitimacy stakes; world-systems theory illuminates diffusion and capacity gaps; institutional isomorphism predicts policy convergence and its pitfalls.
10. Limitations and Future Directions
This study synthesizes policies and practices rather than testing a single model empirically across institutions. Future research should (a) quantify learning gains from formative interventions in the 10–15% zone, (b) evaluate the reliability of combined AI-detection and oral-defense approaches, and (c) examine disciplinary nuances (e.g., code similarity in computing, formula repetition in engineering). Comparative, mixed-methods designs could link threshold use to graduation outcomes, student satisfaction, and examiner reliability.
11. Conclusion
Similarity thresholds provide a pragmatic interface between academic ideals and day-to-day assessment. They are not ends in themselves but signals that invite interpretive judgment. In the era of generative AI, a fair system combines (1) transparent thresholds, (2) AI disclosure, (3) process evidence, (4) targeted oral clarification, and (5) calibrated sanctions with pedagogical support. Grounded in Bourdieu’s theory of capital, world-systems awareness, and institutional isomorphism, this article shows how numerical rules can preserve the symbolic capital of degrees while enabling students to develop the cultural capital required for independent scholarship. If implemented as part of an AI-aware integrity ecosystem, the <10%, 10–15%, >15% standard remains robust, equitable, and future-ready.
References
Bourdieu, P. (1986). “The Forms of Capital.” In J. Richardson (Ed.), Handbook of Theory and Research for the Sociology of Education. Greenwood.
Bourdieu, P. (1988). Homo Academicus. Stanford University Press.
Bretag, T. (Ed.). (2016). Handbook of Academic Integrity. Springer.
Carroll, J. (2002). A Handbook for Deterring Plagiarism in Higher Education. Oxford Centre for Staff and Learning Development.
DiMaggio, P. J., & Powell, W. W. (1983). “The Iron Cage Revisited: Institutional Isomorphism and Collective Rationality in Organizational Fields.” American Sociological Review, 48(2), 147–160.
Eaton, S. E. (2021). Plagiarism in Higher Education: Tackling Tough Topics in Academic Integrity. Libraries Unlimited.
Floridi, L., & Chiriatti, M. (2020). “GPT-3: Its Nature, Scope, Limits, and Consequences.” Minds and Machines, 30, 681–694.
McCabe, D. L., Butterfield, K. D., & Treviño, L. K. (2012). Cheating in College: Why Students Do It and What Educators Can Do About It. Johns Hopkins University Press.
Park, C. (2003). “In Other (People’s) Words: Plagiarism by University Students—Literature and Lessons.” Assessment & Evaluation in Higher Education, 28(5), 471–488.
Sutherland-Smith, W. (2010). Plagiarism, the Internet, and Student Learning. Routledge.
Wallerstein, I. (1974–2011). The Modern World-System (Vols. 1–4). Academic Press/University of California Press.
Williams, P. (2019). Ethics in Academic Research: Theory and Practice. Springer.
Zobel, J., & Moffat, A. (2019). Writing for Computer Science (3rd ed.). Springer.
Zook, M. (2023). “Generative AI and the Geographies of Knowledge Production.” Annals of the American Association of Geographers, 113(5), 1233–1243.
(Plus relevant disciplinary style manuals such as APA, Chicago, or MLA for citation practice.)



