Extraction of Information from Crowdsourcing: Experimental Test Employing Bayesian, Maximum Likelihood, and Maximum Entropy Methods

Document pages: 30 pages

Abstract: A crowdsourcing experiment in which viewers (the “crowd”) of a British Broadcasting Corporation (BBC) television show submitted estimates of the number of coins in a tumbler was shown in an antecedent paper (Part 1) to follow a log-normal distribution Λ(m, s²). The coin-estimation experiment is an archetype of a broad class of image analysis and object counting problems suitable for solution by crowdsourcing. The objective of the current paper (Part 2) is to determine the location and scale parameters (m, s) of Λ(m, s²) by both Bayesian and maximum likelihood (ML) methods and to compare the results. One outcome of the analysis is the resolution, by means of Jeffreys’ rule, of questions regarding the appropriate Bayesian prior. It is shown that Bayesian and ML analyses lead to the same expression for the location parameter, but different expressions for the scale parameter, which become identical in the limit of an infinite sample size. A second outcome of the analysis concerns use of the sample mean as the measure of information of the crowd in applications where the distribution of responses is not sought or known. In the coin-estimation experiment, the sample mean was found to differ widely from the mean number of coins calculated from Λ(m, s²). This discordance raises critical questions concerning whether, and under what conditions, the sample mean provides a reliable measure of the information of the crowd. This paper resolves that problem by use of the principle of maximum entropy (PME). The PME yields a set of equations for finding the most probable distribution consistent with given prior information and only that information. If there is no solution to the PME equations for a specified sample mean and sample variance, then the sample mean is an unreliable statistic, since no measure can be assigned to its uncertainty.
Parts 1 and 2 together demonstrate that the information content of crowdsourcing resides in the distribution of responses (very often log-normal in form), which can be obtained empirically or by appropriate modeling.
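
As a rough illustration of the ML step described in the abstract, the sketch below fits the location and scale parameters of a log-normal distribution using the standard ML formulas (mean and root-mean-square deviation of the log responses), then computes both the plain sample mean and the mean implied by the fitted distribution, exp(m + s²/2). The data are synthetic and the values `true_m`, `true_s`, and the sample size are hypothetical choices, not the BBC data. The function names are illustrative only, and the paper's Bayesian expression for the scale parameter is not reproduced here.

```python
import math
import random


def lognormal_ml_fit(samples):
    """Standard ML estimates (m, s) for log-normally distributed samples."""
    logs = [math.log(x) for x in samples]
    n = len(logs)
    m_hat = sum(logs) / n
    # The ML variance estimate divides by n, not n - 1.
    s2_hat = sum((y - m_hat) ** 2 for y in logs) / n
    return m_hat, math.sqrt(s2_hat)


def lognormal_mean(m, s):
    """Mean of a log-normal distribution with parameters (m, s)."""
    return math.exp(m + 0.5 * s * s)


# Synthetic "crowd" responses (assumption: hypothetical parameters, not real data).
rng = random.Random(0)
true_m, true_s = 6.0, 1.0
estimates = [rng.lognormvariate(true_m, true_s) for _ in range(2000)]

m_hat, s_hat = lognormal_ml_fit(estimates)
sample_mean = sum(estimates) / len(estimates)
model_mean = lognormal_mean(m_hat, s_hat)

print(f"ML location m = {m_hat:.3f}, ML scale s = {s_hat:.3f}")
print(f"sample mean = {sample_mean:.1f}")
print(f"mean from fitted log-normal = {model_mean:.1f}")
print(f"median from fitted log-normal = {math.exp(m_hat):.1f}")
```

Because the synthetic data are drawn from an exact log-normal distribution, the two means should be close here. The sketch only shows how the two quantities are computed; it does not reproduce the discordance reported for the coin-estimation experiment.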
