Condorcet's jury theorem
Condorcet's jury theorem is a political science theorem about the relative probability of a given group of individuals arriving at a correct decision. The theorem was first expressed by the Marquis de Condorcet in his 1785 work Essay on the Application of Analysis to the Probability of Majority Decisions.[1]
The assumptions of the theorem are that a group wishes to reach a decision by majority vote. One of the two outcomes of the vote is correct, and each voter has an independent probability p of voting for the correct decision. The theorem asks how many voters we should include in the group. The result depends on whether p is greater than or less than 1/2:
- If p is greater than 1/2 (each voter is more likely to vote correctly), then adding more voters increases the probability that the majority decision is correct. In the limit, the probability that the majority votes correctly approaches 1 as the number of voters increases.
- On the other hand, if p is less than 1/2 (each voter is more likely to vote incorrectly), then adding more voters makes things worse: the optimal jury consists of a single voter.
Since Condorcet, many other researchers have proved various other jury theorems, relaxing some or all of Condorcet's assumptions.
Proofs[2]
    
Proof 1: Calculating the probability that two additional voters change the outcome
    
To avoid the need for a tie-breaking rule, we assume n is odd. Essentially the same argument works for even n if ties are broken by adding a single voter.
Now suppose we start with n voters, and let m of these voters vote correctly.
Consider what happens when we add two more voters (to keep the total number odd). The majority vote changes in only two cases:
- m was one vote too small to get a majority of the n votes, but both new voters voted correctly.
- m was just equal to a majority of the n votes, but both new voters voted incorrectly.
The rest of the time, either the new votes cancel out, only increase the gap, or don't make enough of a difference. So we only care what happens when a single vote (among the first n) separates a correct from an incorrect majority.
Restricting our attention to this case, we can imagine that the first n − 1 votes cancel out and that the deciding vote is cast by the n-th voter. In this case the probability of getting a correct majority is just p. Now suppose we send in the two extra voters. The probability that they change an incorrect majority to a correct majority is $(1-p)p^2$, while the probability that they change a correct majority to an incorrect majority is $p(1-p)^2$. The first of these probabilities is greater than the second if and only if p > 1/2, proving the theorem.
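The argument can be checked numerically. The sketch below is illustrative only: the function names `majority_prob` and `gain_from_two_more` are not from the source, and n is assumed to be odd. It computes the net effect of the two cases in which the extra voters flip the outcome and compares it with the difference between the exact majority probabilities for n + 2 and n voters.

```python
from math import comb

def majority_prob(n, p):
    """Exact probability that more than half of n independent voters are correct."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

def gain_from_two_more(n, p):
    """Net change in the majority probability when two voters join an odd jury of n."""
    m = n // 2  # n = 2m + 1, so a majority needs m + 1 correct votes
    one_short = comb(n, m) * p**m * (1 - p)**(m + 1)        # exactly m correct votes
    just_enough = comb(n, m + 1) * p**(m + 1) * (1 - p)**m  # exactly m + 1 correct votes
    # Both newcomers correct rescues the first case; both incorrect ruins the second.
    return one_short * p**2 - just_enough * (1 - p)**2

for p in (0.4, 0.6):
    direct = majority_prob(7, p) - majority_prob(5, p)
    print(p, round(direct, 6), round(gain_from_two_more(5, p), 6))
```

Under these assumptions the two printed values agree for each p, and the gain is positive only when p > 1/2.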
Proof 2: Calculating the probability that the decision is correct
    
This proof is direct; it just sums up the probabilities of the majorities. Each term of the sum multiplies the number of combinations of a majority by the probability of that majority. Each majority is counted using a combination, n items taken k at a time, where n is the jury size, and k is the size of the majority. Probabilities range from 0 (= the vote is always wrong) to 1 (= always right). Each person decides independently, so the probabilities of their decisions multiply. The probability of each correct decision is p. The probability of an incorrect decision, q, is the complement of p, i.e. 1 − p. The power notation, i.e. $p^x$, is shorthand for x multiplications of p.
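Written out for odd n, the sum described above can be expressed as follows. This is a sketch in standard binomial notation; the index k runs over the possible majority sizes, and the symbol P(n, p) for the probability of a correct majority is the one used in the Asymptotics section.

```latex
P(n, p) = \sum_{k = (n+1)/2}^{n} \binom{n}{k} \, p^{k} \, q^{\,n-k}, \qquad q = 1 - p
```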
Committee or jury accuracies can be easily estimated by using this approach in computer spreadsheets or programs.
As an example, let us take the simplest case of n = 3, p = 0.8. We need to show that 3 people have higher than 0.8 chance of being right. Indeed:
- 0.8 × 0.8 × 0.8 + 0.8 × 0.8 × 0.2 + 0.8 × 0.2 × 0.8 + 0.2 × 0.8 × 0.8 = 0.896.
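The same sum is straightforward to evaluate in a program. The following is a minimal sketch, assuming odd n and independent voters; the function name `majority_prob` is chosen here for illustration. It reproduces the n = 3, p = 0.8 example and then prints the majority probability for a few jury sizes with p = 0.6 and p = 0.4.

```python
from math import comb

def majority_prob(n, p):
    """Probability that more than half of n independent voters are correct (n odd)."""
    q = 1 - p
    majority = n // 2 + 1  # smallest number of correct votes that forms a majority
    return sum(comb(n, k) * p**k * q**(n - k) for k in range(majority, n + 1))

print(round(majority_prob(3, 0.8), 3))  # the worked example: 0.896

for n in (1, 3, 11, 51):
    print(n, round(majority_prob(n, 0.6), 4), round(majority_prob(n, 0.4), 4))
```

With p = 0.6 the printed probabilities rise as n grows, and with p = 0.4 they fall, matching the two cases of the theorem.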
Asymptotics
    
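The probability of a correct majority decision P(n, p), when the individual probability p is close to 1/2, grows linearly in terms of p − 1/2. For n voters, each having probability p of deciding correctly, and for odd n (where there are no possible ties):
$$P(n, p) = \frac{1}{2} + c_1 \left(p - \frac{1}{2}\right) + c_3 \left(p - \frac{1}{2}\right)^3 + O\left(\left(p - \frac{1}{2}\right)^5\right),$$
where
$$c_1 = \binom{n}{\lfloor n/2 \rfloor} \frac{\lfloor n/2 \rfloor + 1}{4^{\lfloor n/2 \rfloor}} = \sqrt{\frac{2n}{\pi}} \left(1 + \frac{1}{4n} + O(n^{-2})\right),$$
and the asymptotic approximation in terms of n is very accurate. The expansion is only in odd powers and $c_3 < 0$ (for n > 1). In simple terms, this says that when the decision is difficult (p close to 1/2), the gain by having n voters grows proportionally to $\sqrt{n}$.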
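The accuracy of the approximation can be inspected directly. The sketch below assumes that the linear coefficient is the slope of P(n, p) at p = 1/2, which for odd n works out to C(n, ⌊n/2⌋)(⌊n/2⌋ + 1)/4^⌊n/2⌋ when the binomial sum is differentiated, and compares it with the approximation √(2n/π)(1 + 1/(4n)). The function names are illustrative, not from the source.

```python
from math import comb, pi, sqrt

def c1_exact(n):
    """Slope of the majority probability at p = 1/2 for an odd number of voters n."""
    m = n // 2
    return comb(n, m) * (m + 1) / 4**m

def c1_asymptotic(n):
    """Large-n approximation of the same slope."""
    return sqrt(2 * n / pi) * (1 + 1 / (4 * n))

for n in (1, 3, 11, 101):
    print(n, round(c1_exact(n), 4), round(c1_asymptotic(n), 4))
```

Even for very small juries the two columns are close, and both grow roughly like the square root of n.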
The theorem in other disciplines
    
The Condorcet jury theorem has been used to conceptualize score integration when several physician readers (radiologists, endoscopists, etc.) independently evaluate images for disease activity. This task arises in central reading performed during clinical trials and has similarities to voting. According to the authors, applying the theorem can translate individual reader scores into a final score in a way that is mathematically sound (by avoiding averaging of ordinal data), mathematically tractable for further analysis, and consistent with the scoring task at hand (based on decisions about the presence or absence of features, a subjective classification task).[3]
The Condorcet jury theorem is also used in ensemble learning in the field of machine learning. An ensemble method combines the predictions of many individual classifiers by majority voting. Assuming that each of the individual classifiers predicts with slightly greater than 50% accuracy and that their predictions are independent, the accuracy of the ensemble's majority vote can be far greater than the accuracy of any individual classifier.
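A small simulation illustrates the effect. It is a sketch under strong assumptions: the classifiers are hypothetical, each returns the true binary label with the same fixed probability, and their errors are fully independent, which trained classifiers sharing data rarely achieve. The function name `ensemble_accuracy` and its parameters are chosen here for illustration.

```python
import random

def ensemble_accuracy(n_classifiers, accuracy, n_samples=2_000, seed=0):
    """Estimate the accuracy of a majority vote over simulated independent classifiers."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        label = rng.randint(0, 1)  # true binary label of this sample
        # Each classifier reports the true label with probability `accuracy`,
        # otherwise the opposite label; the sum counts the votes for class 1.
        votes_for_one = sum(
            label if rng.random() < accuracy else 1 - label
            for _ in range(n_classifiers)
        )
        prediction = 1 if votes_for_one > n_classifiers / 2 else 0
        hits += prediction == label
    return hits / n_samples

for n in (1, 51, 501):  # odd ensemble sizes avoid ties
    print(n, ensemble_accuracy(n, accuracy=0.55))
```

With individual accuracy of 0.55, the estimated ensemble accuracy rises steadily with the number of classifiers, as the theorem predicts when the independence assumption holds.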
Notes
    
- Marquis de Condorcet (1785). Essai sur l'application de l'analyse à la probabilité des décisions rendues à la pluralité des voix (PNG) (in French). Retrieved 2008-03-10.
- Tangian, Andranik (2020). Analytical theory of democracy. Vols. 1 and 2. Cham, Switzerland: Springer. pp. 149–162. ISBN 978-3-030-39690-9.
- Gottlieb, Klaus; Hussain, Fez (2015-02-19). "Voting for Image Scoring and Assessment (VISA) - theory and application of a 2 + 1 reader algorithm to improve accuracy of imaging endpoints in clinical trials". BMC Medical Imaging. 15: 6. doi:10.1186/s12880-015-0049-0. ISSN 1471-2342. PMC 4349725. PMID 25880066.