ChatGPT AI plagiarism detection

Created by OpenAI and released to the public in November 2022 (Zhai, 2022), ChatGPT is a general-purpose conversational chatbot able to converse naturally and openly on a wide variety of topics. It is designed to produce human-like text in response to prompts or conversations – summarizing, translating, paraphrasing, generating content, describing art, composing email messages, social media posts, or essays, and even coding, performing mathematical calculations, writing from a given title, writing in a chosen style, or “writing in a particular tone – humorous, familiar, professional, witty, friendly” (de Vicente-Yagüe-Jara et al., 2023). It is based on GPT-3, a large language model (LLM) – GPT stands for “generative pre-trained transformer” – the most recent advance in generative machine learning (ML) within artificial intelligence (Elbanna & Armstrong, 2023). ChatGPT is far superior to the familiar chatbots used extensively in recent years as conversational agents or virtual assistants. It produces responses by drawing on knowledge and data acquired through training on massive data sets comprising billions of words from online sources such as books, articles, online news, and conversations.

Artificial intelligence has a disruptive effect on higher education. It offers previously unheard-of advantages and opportunities for teachers and students, but it also poses significant challenges due to its limitations and, particularly, the ethical concerns raised by its widespread use in writing assignments, reports, and other academic deliverables (which can result in major plagiarism issues), or in generating answers for various academic and professional examinations. At present, however, it remains a matter of debate whether ChatGPT could successfully pass the Turing test, which is widely employed to assess whether an artificial intelligence is capable of thinking and acting in human-like ways (Turing, 1950).

As for limitations, research has identified a number of restrictions, risks, and pitfalls associated with ChatGPT (Sallam, 2023), including potential legal risks, copyright issues, lack of transparency, misinformation, inaccurate or hallucinated responses, fabricated data, wrong citations, interpretability problems, impersonality, potential biases, plagiarism, incorrect content, incoherence, outdated information, authorship issues, and cybersecurity threats that could lead to unanticipated and unwanted scenarios.

Even though many colleges allow the use of the program provided that students cite the source, the practice of designating ChatGPT as an author does not seem to be accepted – editorials in Science, Nature, and the Lancet have referred to this practice as scientific misconduct. In these circumstances, it is critical to review students’ work and identify any AI-generated content they have included in their papers. The goals include reducing attempts at cheating and helping students realize that learning, exercising creative thought and deeper comprehension, and conducting their own research with integrity and diligence are crucial for their future career path. In short, it is about understanding that “AI is a means, not an end” (Breton, 2021).

There are numerous and varied techniques that can be employed in academic settings to identify content produced by AI. First of all, teachers can look for markers that may indicate the use of AI-generated content in students’ papers (Cain & Buskey, 2023, p. 403), such as wording that does not match the student’s vocabulary; superfluous or repetitive language, especially when referring to fundamental information; content that contains errors or misinformation; insufficient original thought or analysis; strange or awkward wording; false or inaccurate bibliographic references; and poor narrative originality. However, this is no easy task: a study of linguists’ ability to distinguish between ChatGPT/AI and human writing found that “reviewers were largely unsuccessful in identifying AI versus human writing, with an overall positive identification rate of only 38.9%” (Casal & Kessler, 2023).

Secondly, academic teachers can use a large variety of AI content detectors available on the market, capable of assessing whether a text was likely produced by a machine. These tools, however, have limitations of their own, and their detection accuracy remains variable and far from guaranteed.
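Commercial detectors typically rely on model-based measures such as perplexity, which cannot be reproduced here. As a minimal illustrative sketch only – not an actual detector – the functions below compute two surface statistics sometimes cited as weak signals: lexical repetition (machine text can show low lexical diversity) and sentence-length “burstiness” (human writing tends to vary sentence length more). The function names and thresholds are hypothetical, chosen for illustration.

```python
import math
from collections import Counter

def repetition_score(text: str) -> float:
    """Fraction of word tokens that repeat an earlier token.

    A crude proxy for the low lexical diversity that some
    detectors associate with machine-generated text.
    """
    words = [w.lower().strip(".,;:!?\"'") for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    counts = Counter(words)
    repeats = sum(c - 1 for c in counts.values())
    return repeats / len(words)

def burstiness(text: str) -> float:
    """Coefficient of variation of sentence lengths (in words).

    A higher value means more variation between sentences,
    which is sometimes taken as a (weak) sign of human prose.
    """
    sentences = [s for s in
                 text.replace("!", ".").replace("?", ".").split(".")
                 if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    mean = sum(lengths) / len(lengths)
    var = sum((n - mean) ** 2 for n in lengths) / len(lengths)
    return math.sqrt(var) / mean if mean else 0.0
```

Neither statistic is reliable on its own – as the Casal and Kessler (2023) results suggest, even trained readers, let alone simple heuristics, struggle to separate AI from human writing.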

Last but not least, universities can decide to ban students from using ChatGPT, as Sciences Po in Paris, Bangalore’s RV University in India, and the University of Michigan in the US have done. Some institutions, such as the University of Hong Kong, have temporarily prohibited ChatGPT until appropriate guidelines for its usage can be developed, while many have adopted a wait-and-see attitude. Whichever solution is chosen, banning AI chatbots does not seem to be a good idea, since the forbidden fruit is always more tempting and sweeter, especially for students excited about new writing tools that make their work much easier and save them time. Therefore, taking a more positive approach to AI, many universities are attempting to guide students’ usage of AI tools rather than banning them.

Regardless of the attitude teachers have towards the use of AI – total opposition, moderate optimism, enthusiasm, or concern – most university professors recognize that students become more conscientious and judicious in their use of artificial intelligence tools when they understand the benefits of using these tools ethically and, especially, when they become aware of the challenges, limitations, and potential pitfalls associated with them. In this respect, academics, together with students, have started to investigate and experiment with AI content generation across a wide variety of writing tasks (Elkins & Chun, 2020), attempting to push LLMs beyond their comfort zone by imagining situations that differ from those the models encountered in their training data.

According to numerous studies (Crompton & Burke, 2023), the results of students’ experiments have shown the advantages, disadvantages, difficulties, and questionable aspects of AI usage. This has encouraged students to hand in original, self-created assignments instead of relying on AI to complete their homework. Good evidence of this is the conclusion of one student after testing the pros and cons of AI in such an experiment: “Writing is a skill that is able to be cultivated, but only through practice and understanding of your own identity” (Fyfe, 2023).

References

Breton, M. (2021). Europe fit for the Digital Age: Commission proposes new rules and actions for excellence and trust in Artificial Intelligence. link

Cain, C., & Buskey, C. (2023). Artificial intelligence and conversational agent evolution – a cautionary tale of the benefits and pitfalls of advanced technology in education, academic research, and practice. Journal of Information, Communication and Ethics in Society, 21(4), 394-405. DOI

Casal, J. E., & Kessler, M. (2023). Can linguists distinguish between ChatGPT/AI and human writing? A study of research ethics and academic publishing. Research Methods in Applied Linguistics, 2(3). DOI 

Crompton, H., & Burke, D. (2023). Artificial intelligence in higher education: the state of the field. International Journal of Educational Technology in Higher Education, 20(22). DOI

de Vicente-Yagüe-Jara, M. et al. (2023). Writing, creativity, and artificial intelligence. ChatGPT in the university context. Comunicar, 31(77), 45-54. DOI

Elbanna, S., & Armstrong, L. (2023). Exploring the integration of ChatGPT in education: adapting for the future. https://doi.org/10.1108/MSAR-03-2023-0016

Fyfe, P. (2023). How to cheat on your final paper: Assigning AI for student writing. AI & Soc, 38, 1395–1405. DOI

Sallam, M. (2023). ChatGPT utility in health care education, research, and practice: systematic review on the promising perspectives and valid concerns. Healthcare, 11(6), 887.

Turing, A. M. (1950). Computing Machinery and Intelligence. Mind, 59, 433-460. link

Zhai, X. (2022). ChatGPT user experience: implications for education. link

About the author

Voichița Dragomir is the director of the library at the National University of Political Studies and Public Administration (SNSPA) in Bucharest. Her academic background is in electronics and telecommunications. She worked at the Carol I Central University Library in Bucharest for 26 years, more than 8 of them as deputy director, making a major contribution to the library’s automation process. She was the recipient of a 2-month Getty Fellowship at the Mortenson Center at the University of Illinois Urbana-Champaign. She participated in several international projects and professional exchanges in the US and European countries. Voichița holds a PhD in Library Science from the University of Bucharest.

She joined SNSPA in December 2016. From 2018 to 2021 she was manager of the project “Creativity and Excellence in Study, Academic and Social Integration – CESIAS@SNSPA” no. 96/SGU/CI/II, funded by the ROSE Secondary Education Project, with external financial support provided by the World Bank. She is also an associate professor at the Faculty of Management. In 2022 she published the book “Library 2.0 – an apomediator of knowledge, communication and cooperation.” In 2023 she attended the course “Leadership Institute for Academic Librarians” at the Harvard Graduate School of Education, Cambridge, MA.

She is a member of the Academic Advisory Board of EBSCO Information Services. Voichița is deeply involved in the activities of the joint working groups concerning libraries carried out within the European alliance CIVICA, of which SNSPA is a member together with 9 other prestigious European universities; she is co-leader of working group 2.3.2, “Monitor the evolution of CIVICA audiences’ needs, skills and information literacy”.