White Paper: Linguistic Fingerprinting to Flag AI Authorship Cheating
In recent months, alarm has been raised that AI language tools such as ChatGPT could fuel widespread authorship cheating in academic papers submitted by students. The concern is that students might misuse advanced machine learning toolboxes that are designed and trained to generate human-like responses to prompts and questions. To be clear, these tools provide useful information and assistance when used properly. When misused, however, they enable cheating that is both widespread and difficult to detect.
In this regard, students have been found to cheat by claiming credit for work that is actually authored, in whole or in part, by these AI language tools. For example, a student or researcher might use ChatGPT to generate a section of text for a paper and then submit the paper as their own work, without acknowledging that the text was generated by an AI language model. This is grounds for sanction in most academic institutions, up to and including expulsion.
Following the adage that it takes a thief to catch a thief, recently developed services apply AI toolboxes to identify possibly AI-generated text ("AI Detection"). The best known is GPTZero, which uses GPT technology to identify text that may have been generated by ChatGPT. By way of brief background, AI toolboxes generate human-like text by training a neural network-based language model on massive amounts of text data. Text produced by these algorithms may contain patterns not typically found in human writing: repetitive phrasing, uncommon word choices, or a more formulaic structure. AI Detection software reverse engineers the generation process by analyzing the language patterns and style used in the text; for each text segment, it predicts what the following text would be if an AI toolkit had generated it. If there are recurring matches throughout the document, and the AI Detection software recognizes these kinds of non-human patterns, it uses them to predict whether the text is AI-generated.
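The prediction-and-match idea can be illustrated with a toy sketch. The Python snippet below is emphatically not GPTZero's actual algorithm; it merely demonstrates the underlying intuition that machine-generated text tends to be more statistically predictable than human text. It trains a simple character-bigram model on a reference corpus and scores a candidate text's perplexity (lower perplexity means more predictable). The corpus and sample strings are illustrative assumptions.

```python
import math
from collections import Counter

def train_bigram_model(corpus: str):
    """Count character bigrams and unigrams in a reference corpus."""
    bigrams = Counter(zip(corpus, corpus[1:]))
    unigrams = Counter(corpus)
    return bigrams, unigrams

def perplexity(text: str, model, vocab_size: int = 256) -> float:
    """Average per-character perplexity under the bigram model,
    with Laplace smoothing. Lower values = more predictable text."""
    bigrams, unigrams = model
    log_prob, n = 0.0, 0
    for a, b in zip(text, text[1:]):
        p = (bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size)
        log_prob += math.log(p)
        n += 1
    return math.exp(-log_prob / max(n, 1))

# Illustrative reference corpus standing in for "typical AI output".
reference = "the quick brown fox jumps over the lazy dog " * 50
model = train_bigram_model(reference)

predictable = "the quick brown fox jumps over the lazy dog"
novel = "zxq vbnm kjhg fdsa poiuy trewq"

# Text matching the learned patterns scores lower perplexity.
assert perplexity(predictable, model) < perplexity(novel, model)
```

A real detector would use a large neural language model rather than character bigrams, but the decision rule, flagging text whose predictability profile matches the machine's, is the same in spirit.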
However, by definition, these AI Detection tools are playing catch-up. They identify patterns that are unique to the AI toolboxes and apply them to estimate the likelihood that a document was machine-generated. As the AI toolboxes improve their algorithms and insert more randomness that mimics human behavior and writing style, the gap between human and AI-generated text will shrink, making it ever more difficult to assess accurately whether a text was generated by a human or by a machine.
Academic dishonesty is not new; it has been around since the days of the first universities. In the last few years, academic institutions have confronted a new wave of academic dishonesty in the form of contract cheating: situations where students pay someone else to complete academic work on their behalf. These works are written by a human and can be of sufficiently high quality that they are difficult to distinguish from the student's own work. Over the years, several tools were developed to help college ethics officers identify contract cheating. The two leading methodologies are:
• Plagiarism Detection Software: software that compares the text submitted by a student with other sources of text available (such as the internet, academic journals, and databases of previously submitted papers). Some popular plagiarism detection software tools include Turnitin, PlagScan, and iThenticate.
• Style and Language Analysis ("Forensic Linguistics"): forensic linguistic tools that identify differences in writing style and language between the student's own work and work submitted by someone else. These tools can flag instances where the writing style or language used in a student's submission differs significantly from their previous work or from the work of their peers. Similarly, forensic linguistic analysis enables comparison of the student's work with samples of work completed by known contract cheating providers.
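The comparison step at the heart of plagiarism detection can be sketched in a few lines. The snippet below is an illustrative toy, not how Turnitin, PlagScan, or iThenticate actually work: it breaks each document into overlapping word k-grams ("shingles") and measures overlap with Jaccard similarity, a standard baseline for near-duplicate detection. The sample texts are invented for the example.

```python
def shingles(text: str, k: int = 5) -> set:
    """All overlapping word k-grams ('shingles') in a document."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Illustrative documents: one pair shares copied phrasing, one does not.
submission = "the mitochondria is the powerhouse of the cell and drives metabolism"
source     = "the mitochondria is the powerhouse of the cell in most organisms"
unrelated  = "stock prices fell sharply on tuesday amid inflation fears"

# Shared shingles push the score up; disjoint texts score zero.
copied_score = jaccard(shingles(submission), shingles(source))
clean_score  = jaccard(shingles(submission), shingles(unrelated))
assert copied_score > clean_score
```

Production systems add indexing over enormous corpora, fuzzy matching, and paraphrase handling on top of this basic overlap idea.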
Using these tools, college ethics officers have worked to identify instances of contract cheating and to take appropriate action to protect academic integrity. FLINT AI has applied the principles of forensic linguistics to develop a software tool that essentially creates a linguistic fingerprint. Just as each of us has a unique physiological fingerprint, each of us has an individualized style of communication. Shaped by education, community, and environment, as well as personal style and many other variables, every person develops a unique way of writing. By identifying these individualized language patterns, the FLINT software can identify the author of a text. Like a fingerprint used for identification purposes, a linguistic fingerprint is unique to each individual and reflects their distinct writing style, word choice, syntax, and grammar.
FLINT applies methodologies developed by Prof. Robert Leonard of Hofstra University in support of crime investigation units across the nation and internationally, methodologies that have withstood judicial scrutiny in multiple high-profile cases. These methodologies essentially create a linguistic fingerprint and use it to identify anonymous authors of texts, such as in cases of cybercrime or online harassment, or to verify the authorship of a text, such as in cases of plagiarism or academic misconduct. The techniques combine linguistic features, including grammar and linguistic patterns, with statistical analysis and machine learning algorithms to identify the specific features of language use that are characteristic of a particular author.
Some examples of linguistic features that contribute to a linguistic fingerprint include:
• Word choice: the specific words that an author uses, including their frequency and distribution;
• Syntax and grammar: the way that an author structures sentences and uses grammar;
• Punctuation: the way that an author uses punctuation marks, such as commas, semicolons, and dashes, as well as spacing between words and sentences;
• Capitalization: the way an author does, or does not, use capital letters for certain words or within sentences.
When combined, these features form a linguistic fingerprint: a unique pattern of language use that can identify a specific individual. By analyzing these linguistic features, fingerprinting techniques can identify patterns that are characteristic of a particular author and use them to verify that author's authorship of a text.
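The feature list above can be made concrete with a minimal sketch. The following is not FLINT's actual method; it is a generic illustration of the stylometric idea: represent each text as a vector of crude style features (word length, sentence length, punctuation and capitalization rates) and compare vectors with cosine similarity. The feature choices and sample texts are illustrative assumptions; real forensic systems use far richer feature sets and statistical models.

```python
import math
import string

def style_vector(text: str) -> list:
    """Crude stylometric features: average word length, average sentence
    length, and per-character rates of commas, semicolons, dashes,
    and capital letters."""
    words = text.split()
    sentences = [s for s in text.replace('!', '.').replace('?', '.').split('.')
                 if s.strip()]
    n = max(len(text), 1)
    return [
        sum(len(w.strip(string.punctuation)) for w in words) / max(len(words), 1),
        len(words) / max(len(sentences), 1),
        text.count(',') / n,
        text.count(';') / n,
        text.count('-') / n,
        sum(c.isupper() for c in text) / n,
    ]

def cosine(u: list, v: list) -> float:
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Illustrative writing samples: 'claimed' shares the known author's
# short-sentence, comma-heavy style; 'other' does not.
known   = "I believe, quite strongly, that brevity matters; short sentences win."
claimed = "I think, rather often, that clarity matters; plain words help."
other   = ("It is widely acknowledged throughout the relevant literature that "
           "extended expository constructions tend to obscure meaning")

# The stylistically similar text scores closer to the known sample.
assert cosine(style_vector(known), style_vector(claimed)) > \
       cosine(style_vector(known), style_vector(other))
```

In practice an authorship system would build the fingerprint from many verified writing samples per individual and test candidate documents against it statistically, but the compare-feature-vectors core is the same.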
The benefit of a linguistics-based analysis over the AI Detection toolbox is that it is not playing catch-up and does not rely on reverse engineering for detection. Linguistic analysis applies tried and proven methodologies to attack the problem from a different direction; it has been used for decades to determine authorship and has a proven track record of reliability. By creating the linguistic fingerprint, the linguistic toolbox is transparent, far more so than AI Detection analysis, which provides a black-box response. The linguistic fingerprint applies methods that are clearly defined, easier to understand, and transparent to the user; in other words, they are easier to defend.
By definition, AI Detection tools are applied to determine whether a text was created by a human or by an AI toolbox. What happens, however, when the suspected ghostwriter is not an AI toolbox but another human? Or when significant amounts of human-written text are added to the document? The only way to correctly determine whether the author is indeed the person who submitted the document is to create an individual linguistic fingerprint.
Historically, however, linguistic fingerprinting has been a labor-intensive process dependent on human analysis: lengthy, expensive, and therefore relegated to exceptional legal proceedings. Clear benefits of AI Detection are its speed and scalability, the ability to process large amounts of text quickly and efficiently, which reduces the cost per application. FLINT AI has bridged this gap by automating the historically lengthy, human-intensive process of creating the linguistic fingerprint and analyzing documents; the process is now automated and scalable, significantly reducing the cost per application. FLINT AI has essentially developed the first online linguistic fingerprinting technology.
A recent extensive study evaluated the effectiveness of the FLINT linguistic fingerprinting tool in identifying contract cheating, compared with available AI Detection tools, across scenarios where the document in question was authored either by an individual or by an AI toolbox. The FLINT system consistently provided accurate results in over 75% of the test cases. The test consisted of over 1,000 cases, which included texts written by different AI toolkits as well as texts written either by the same individual or by different human individuals. To assure consistency across platforms, the AI-generated texts were all produced from the same prompt, which was developed from the control human text. The test scenarios included texts of varying lengths and also scenarios where human-written text of varying lengths was inserted into the AI-generated text (AI only; 20% human-written text; and 50% human-written text).
As predicted, the AI Detection tool (GPTZero) struggled to provide accurate results when the text was written by a human. In over 90% of these instances, GPTZero incorrectly identified the document as being authored by an AI toolkit. In fact, when the document was authored by a different human author, GPTZero could not assess whether the document was authored by a human or by an AI toolkit. Using linguistic fingerprinting, the FLINT system achieved over 80% accuracy in correctly identifying authorship cheating even when the document in question was authored not by an AI toolkit but by another human.
Similar anticipated trends were found throughout the testing: adding human-written text to the document significantly reduced the accuracy of the AI Detection toolkit, making it harder to identify correctly whether the document was authored by an AI toolkit or by a human. The absence of a linguistic fingerprint capability renders the AI toolkit incapable of determining whether a given individual authored the document, or components thereof. Because it applies linguistic fingerprinting technology, the FLINT system is unaffected by how much human text is injected into the document; its analysis is based purely on whether the given individual authored the document in question.
In conclusion, by creating a linguistic fingerprint, the FLINT system evaluates whether the document was authored by the person who submitted it, as opposed to the rest of the universe, and it is indifferent to whether the 'rest of the universe' is AI or human. This kind of individualized analysis is immune to shifts such as AI toolkits developing better methodologies for introducing randomness that mimics human behavior and writing style. Unless an AI toolkit is expressly trained to copy and imitate the linguistic style of a specific individual, the linguistic fingerprint is the more accurate methodology and the one that will survive the test of time and the progress of AI methods.