Anti-Plagiarism Software Goof: Paper Rejected for Repeat Citations

The scholar was obliged by discipline rules to cite the flagged information repetitively
Jean-François Bonnefon’s paper was rejected by conventional anti-plagiarism software, and the rejection didn’t make any sense. Bonnefon, research director at the Toulouse School of Economics, was informed of “a high level of textual overlap with previous literature” (plagiarism) when he was citing scientists’ affiliations, standard descriptions, and papers cited by others: information he was obliged to cite accurately, according to a standard format.
“It would have taken two [minutes] for a human to realise the bot was acting up,” he wrote on Twitter. “But there is obviously no human in the loop here. We’re letting bots make autonomous decisions to reject scientific papers.”
Reaction to the post by Dr Bonnefon, who is currently a visiting scientist at the Massachusetts Institute of Technology, suggested that his experience was far from unique. “Your field is catching up,” said Sarah Horst, professor of planetary science at Johns Hopkins University, “this happened to me for the first time in 2013.”

Rachel Pells, “Paper rejected after plagiarism detector stumped by references” at Times Higher Education
Here’s Bonnefon’s story on Twitter.
Yesterday, our Analysis feature addressed the fact that machines do not have common sense: “the problem doesn’t seem to go away, even when we throw big data, huge quad-core processing power, and fancy machine learning algorithms at it.” Of course, common sense requires analysis of life experience, which, by definition, machines cannot have. When we apply common sense to the resolution of a problem, we are bringing in information that is outside the “programming” of the event. But machines don’t have any such outside.
Many AI failures are like that:
- An AI-written puff piece for a wannabe celeb describes the subject as “wearing a yellow dress and champagne flute” because there is nothing in the program that pictures the scene for which the words are generated. They are picked on the basis of their occurrence in given positions in similar contexts.
- AI-written news stories are, by definition, formula news. That may work for posting routine sports writeups, for example, but it remains boilerplate copy.
- Machine vision mistakes a teapot for a golf ball because neither object is really a concept for the machine, just a series of images. It is helpless in the face of ambiguity. Self-driving cars may hit people for reasons no one can easily make sense of.
So, as Pells tells us at Times Higher, journal editors say that the episode demonstrates the need for a human in the loop: someone to notice what the software is actually doing now and then.
Further reading on the limits of artificial intelligence:
Superintelligent AI is still a myth. Neither the old classical approaches nor the new data-scientific angle can make any headway on good ol’ common sense. The official Winograd Schema Challenge, organized by Levesque and colleagues to see if AI could learn common sense, was retired officially in 2016 for the embarrassing reason that even the well-funded, bleeding-edge Google Brain team performed poorly on a test set of a few hundred questions.
AI is no match for ambiguity: Many simple sentences confuse AI but not humans (Robert J. Marks)
If you think common sense is easy to acquire, try teaching it to a self-driving car. (Brendan Dixon)