Statistics behind cheating

The Central government has said that data analytics conducted by the Indian Institute of Technology (IIT) Madras on the results of the National Eligibility cum Entrance Test (NEET) undergraduate 2024 show no abnormalities, and has concluded that there is very little likelihood of malpractice having taken place.

"The marks distribution follows the bell-shaped curve that is witnessed in any large-scale examination indicating no abnormality...There is an overall increase in the marks obtained by students, specifically in the range of 550 to 720. This increase is seen across the cities and centres. This is attributed to 25% reduction in syllabus. In addition, candidates obtaining such high marks are spread across multiple cities and multiple centres, indicating very less likelihood of malpractice."

Absence of cheating and bell curve

A bell curve in the distribution of marks doesn't necessarily indicate the absence of cheating. The bell curve, or normal distribution, suggests that most students scored around the average mark, with fewer students achieving very high or very low marks.

In large groups, individual instances of cheating might not significantly alter the overall distribution.

About 20 lakh students registered for the exam, and there are about 41,000 government seats. Including private colleges, there are 78,000 seats in total. So only about 3.9% of registered candidates can get a seat.

That is a minuscule share of the whole. If even 4-10% of candidates cheated and took those seats, the overall bell curve would hardly reveal it.
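To illustrate the point, here is a minimal simulation sketch. Every number in it is an assumption chosen for illustration, not NEET data: honest marks are drawn from a normal distribution (mean 300, standard deviation 120, clipped to 0-720), and a hypothetical 1% of candidates have their marks replaced by a value between 650 and 720. The function name `simulate` and its parameters are made up for this example.

```python
import random
import statistics

def simulate(n=200_000, cheat_frac=0.01, seat_frac=0.039, seed=0):
    """Toy model: compare summary statistics with and without cheaters."""
    rng = random.Random(seed)
    # Assumed honest marks: normal(300, 120), clipped to the 0-720 range.
    honest = [min(720.0, max(0.0, rng.gauss(300, 120))) for _ in range(n)]
    scores = list(honest)
    # Assumed cheaters: a random 1% whose marks jump to 650-720.
    cheaters = set(rng.sample(range(n), int(n * cheat_frac)))
    for i in cheaters:
        scores[i] = rng.uniform(650, 720)
    # Seats go to the top 3.9% of observed marks.
    n_seats = int(n * seat_frac)
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:n_seats]
    cheater_seats = sum(1 for i in top if i in cheaters)
    return {
        "mean_honest": statistics.fmean(honest),
        "mean_observed": statistics.fmean(scores),
        "sd_honest": statistics.pstdev(honest),
        "sd_observed": statistics.pstdev(scores),
        "seat_share_cheaters": cheater_seats / n_seats,
    }

if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the mean and standard deviation of the whole distribution move by only a few marks, while about a quarter of the seats go to the simulated cheaters. A real analysis would need the actual marks data, which is exactly what has not been published.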

Furthermore, there is no mention of standard deviation, variance, or the spread of the data, and no comparison with previous years to show whether this year's distribution is flatter or narrower than usual. Changes in variability can also signal deviations from normal patterns of results.

According to the Centre, marks did deviate from historical patterns, but it attributes this to the reduced syllabus rather than to cheating. How can it be so sure?

Mass-scale manipulation argument is vague

The argument regarding mass-scale manipulation is vague. In today's internet and mobile world, it takes just minutes for a leaked paper to spread. If it reaches even 10,000 to 20,000 students, the entire purpose of the examination is defeated, especially considering there are fewer than a lakh medical seats, and even fewer (41,000) government seats.
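The arithmetic behind these figures can be sketched as follows. The seat and candidate counts are the ones quoted in this article. The comparison assumes, as an upper bound, that every student who received a leaked paper would rank high enough to get a seat. The helper `pct` is just a name chosen for this example.

```python
LAKH = 100_000

candidates = 20 * LAKH   # registered candidates, as quoted above
total_seats = 78_000     # government + private seats
govt_seats = 41_000      # government seats only

def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole

if __name__ == "__main__":
    print(f"Seats per candidate: {pct(total_seats, candidates):.1f}%")
    for leaked in (10_000, 20_000):
        print(
            f"{leaked} recipients = "
            f"{pct(leaked, candidates):.1f}% of candidates, "
            f"{pct(leaked, total_seats):.1f}% of all seats, "
            f"{pct(leaked, govt_seats):.1f}% of government seats"
        )
```

A group that is a fraction of a percent of all candidates can still be a large share of the seats, which is why "no mass-scale manipulation" says little about whether admissions were affected.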

What statistics do researchers use to detect cheating?

The Detection of Cheating on E-Exams in Higher Education—The Performance of Several Old and Some New Indicators

Researchers usually use person-fit indices (e.g., the U3 statistic).

Person-fit indices, such as the U3 statistic, are used in educational measurement to detect irregular or unexpected response patterns, which might indicate cheating or malpractice during exams.

A high U3 value suggests that the person's responses deviate significantly from what the model predicts, which might indicate unusual behavior, such as cheating.

So, one has to map the item difficulties of each multiple-choice question and each student’s response for each MCQ. Such an analysis requires data from all answer sheets.
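As a rough illustration of what such an analysis involves, here is a minimal sketch of the U3 statistic in its commonly cited form. It assumes each answer is scored simply as right (1) or wrong (0), which ignores NEET's negative marking and unattempted questions, and it takes item difficulty to be the proportion of all candidates who answered the item correctly. The function name `u3` and the sample numbers are made up for this example.

```python
import math

def u3(responses, p_correct):
    """U3 person-fit index for one candidate.

    responses: list of 0/1 scores, one per item.
    p_correct: proportion of all candidates answering each item correctly
               (must be strictly between 0 and 1).
    Returns a value from 0 (fully expected pattern) to 1 (fully reversed),
    or None when the index is undefined.
    """
    # Order items from easiest to hardest.
    order = sorted(range(len(p_correct)), key=lambda i: p_correct[i], reverse=True)
    w = [math.log(p_correct[i] / (1 - p_correct[i])) for i in order]
    x = [responses[i] for i in order]
    r, k = sum(x), len(x)
    if r == 0 or r == k:
        return None  # all wrong or all right: no pattern to assess
    best = sum(w[:r])        # the r easiest items answered correctly
    worst = sum(w[k - r:])   # the r hardest items answered correctly
    if best == worst:
        return None  # all items equally difficult
    observed = sum(wi * xi for wi, xi in zip(w, x))
    return (best - observed) / (best - worst)

if __name__ == "__main__":
    difficulty = [0.9, 0.8, 0.6, 0.4, 0.2]  # hypothetical item statistics
    print(u3([1, 1, 1, 0, 0], difficulty))  # easy items right, hard items wrong
    print(u3([0, 0, 1, 1, 1], difficulty))  # easy items wrong, hard items right
```

A candidate who gets the hard questions right while missing the easy ones scores close to 1. What counts as "too high" depends on a cutoff that has to be calibrated on the full data set, and a high value is a flag for further scrutiny, not proof of cheating.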

The opaque government

The government is making neither the data nor the analysis code public. How can we trust such an analysis? This lack of transparency creates a low-trust environment. When data and methods are not openly shared, it becomes difficult for independent parties to verify the results or understand the processes used. This opacity fosters suspicion and undermines confidence in the government's findings and decisions. Transparency is crucial for building trust, ensuring accountability, and fostering a sense of reliability and integrity in governmental operations.