Mind Matters News and Analysis on Natural and Artificial Intelligence

Ask Alexa (and an anonymous crowd answers?)

Amazon is testing a crowdsourcing approach to difficult questions. How did that work out at Wikipedia?

So far, the crowd answers those questions only in a private beta test:

It’s not hard to see why Amazon might want to call in external help to deal with arcane questions. With the Google Assistant, Google can pluck answers to difficult questions from the billions of pages it’s indexed for its search engine. But Amazon has no comparably extensive corpus of material for Alexa queries. Instead, the company licenses data from hundreds of sources that it deems high-quality, and it uses more than a dozen question-and-answer techniques that it ranks according to a machine learning algorithm, which then determines the best response.

Jared Newman, “Exclusive: Amazon will let anyone answer your Alexa questions now” at Fast Company

Newman notes that Amazon is approaching the project with “a surprising degree of optimism,” considering the problems YouTube and Reddit have had with trolls and Wikipedia has had with vandalism. Plus, Amazon itself has had big troubles with fake reviews. Nonetheless:

Amazon also isn’t requiring any kind of citation in its responses. That means Alexa Answers participants can’t easily check where information is coming from before upvoting it, and users will only hear “According to an Amazon customer” as the information source in Alexa’s responses. Barton says people just haven’t been asking for citations so far, but he believes a mass of voting across lots of users is a good enough signal for accuracy anyway.

“We’re certainly open to revisiting it if we get strong feedback from contributors or from end users if they find more attribution beneficial,” he says, “but in general what we find from our community is that they prefer crispness and conciseness in their responses.”

Jared Newman, “Exclusive: Amazon will let anyone answer your Alexa questions now” at Fast Company

Well, if “crispness and conciseness” happen to also be off the mark, will some of the users be content with “two out of three ain’t so bad”? And then what are they getting the rest of us into?

The term crowdsourcing has been traced to Jeff Howe (Wired, 2006). The basic idea is that if a question is put to a large number of volunteers, a right answer will eventually emerge.

Well, maybe. One industry source sums it up:

Using crowdsourcing brings some genuine advantages to companies and individuals looking to complete defined tasks with affordable price. The main attraction so far has been its fairly lower price, compared to the price for hiring a dedicated professional. Also the best thing with the low price is the high number of people who are ready to work for you anytime.

As with all things, there are disadvantages, and interestingly the main disadvantage mirrors the main advantage: cheap labor results less credible product, compared to professionals. You pay professionals for their expertise, experience and dedicated spirit, but you buy labor for completing simple tasks. Any task considered above simple is risky for crowd sourcing.

Darren Stevens, “Crowdsourcing: Pros, Cons, and More” at Hongkiat (November 1, 2018)

Many critical information-gathering tasks are well “above simple” and one problem is crowd bias or group interest creeping in unnoticed.

Wikipedia is a classic example of how crowdsourcing can go wrong. Consider:

● Information can present a point of view whose source is not apparent. From Spiked Online (April 29, 2014): “Celebrities and companies pay PR agencies to edit entries. Controversial topics are often the subject of edit wars that can go on for years and involve scores of editors. Pranksters and jokers may change entries and insert bogus facts. Whole entries about events that never happened may be created. Other entries will disappear without notice. Experts may be banned from editing subjects that they are leading authorities on, because they are cited as primary sources.” The problem isn’t the source; it’s the lack of obvious transparency.

● Crowdsourcing can cause genuinely doubtful claims to appear settled. Columbia University mathematician Peter Woit has criticized Wikipedia entries on the speculative multiverse along these lines: “an outrageously one-sided promotional piece for pseudo-science.” (March 19, 2014) The entry may have improved since then. But it could just as easily take off in a different less-than-useful direction at any time.

● Shouting down experts in favor of a crowd is not a good way to get the best information on an obscure topic. That’s what happened in an interesting, carefully detailed example involving a 19th-century terrorist incident in the United States, the Haymarket Riot. The crowd managed to shout down, via “edits,” one of the few people who actually knew a good deal about the conspirators via his original research. That could happen with any obscure topic on which a crowd has an obstinate opinion.

● Wikipedia’s rules entail that information can be regarded as factual if it is often sourced, irrespective of whether it is true in any other sense.

● Here’s an example of the overall problem: Wikipedia’s own co-founder Larry Sanger has described an article on the intelligent design controversy as “appallingly biased” (8 December 2017). Sanger is not a partisan of design in nature. The problem he identifies is created by the fact that fierce partisans of naturalism and Darwinism are far more likely to be the “crowd” from which the information is “sourced” and to keep others out. As just one example, when paleontologist Gunter Bechly announced that he doubted Darwinism, he was disappeared from Wikipedia, despite continuing to work in the field and classify fossils. So if, for example, you need to understand a controversy over ID at the local school board in more depth, Wikipedia would not be a good choice, despite claims about the value of crowdsourcing.

The obvious problem with crowdsourcing is anonymity and the lack of accountability that goes with it. Individually sourced information is different: If a politician seeking re-election informs you that her opponent’s policies spell disaster, you will naturally consider the source. The car salesman’s new wheels pitch, the realtor’s market assessment, the vet’s advice, the bishop’s letter read from the pulpit, are all clearly sourced and evaluated as such. And all these sources are accountable. We implicitly factor in our experiences with each of these sources in determining our level of assent.

By contrast, crowdsourced information from Alexa on contentious issues could be based ultimately on the internet’s public landfill. Most of it may be right. But how would we know? Crowdsourcing is just not a substitute for a personal search for information that consults a variety of sources.


Further reading: Alexa really does NOT understand us. In a recent test, only 35 percent of the responses to simple questions were judged adequate.


Denyse O'Leary

Denyse O'Leary is a freelance journalist based in Ottawa, Canada. Specializing in faith and science issues, she has published two books on the topic: Faith@Science and By Design or by Chance? She has written for publications such as The Toronto Star, The Globe & Mail, and Canadian Living. She is co-author, with neuroscientist Mario Beauregard, of The Spiritual Brain: A Neuroscientist’s Case for the Existence of the Soul. She received her degree in honors English language and literature.
