Modeling Relevance (along Bayesian Lines)
The passage below is taken from R. Lempert, S. Gross & J. Liebman, A MODERN APPROACH TO EVIDENCE 228-232 (West Group, 2000) (edited & footnotes omitted):
... [S]ome of you may have chosen a career in law in part
because of your preference for verbal as opposed to mathematical
thinking, we have found that mathematical models can clarify thinking
about questions of relevance.
The excerpt that follows is taken from an article written by one of the authors [Richard Lempert]. Changes from the original are not noted in this excerpt.
______________________
Mathematics as a language can help clarify those legal rules that involve weighing evidence in an essentially probabilistic fashion. ... Bayes' Theorem [is] helpful in thinking about the meaning of relevance and in analyzing certain of the rules generally associated with this topic. The following discussion assumes that the fact-finder is a jury and, unless otherwise noted, that the issue to be resolved is the defendant's guilt. However, the analysis may readily be generalized to the situation where the fact-finder is a judge and/or a question other than guilt is at issue. The two models are here applied to a simplified situation where the fact-finder must evaluate only one item of indisputably accurate testimony.
A. Bayes' Theorem
First we must attend to Bayes' Theorem. This theorem follows directly from two elementary formulas of probability theory: If A and B are any two propositions, then:
$$P(A \,\&\, B) = P(A \mid B)\,P(B)$$

$$P(A) = P(A \,\&\, B) + P(A \,\&\, \text{not-}B)$$
From these ... equations the following formula may be derived (expressed in terms of "odds" rather than "probability"):
$$O(G \mid E) = \frac{P(E \mid G)}{P(E \mid \text{not-}G)} \times O(G)$$
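One way to reach the odds form is sketched below. It is a reconstruction offered for illustration, not the authors' own derivation, and it assumes only the product rule for probabilities and the usual definition of odds as $O(G) = P(G)/P(\text{not-}G)$.

```latex
\begin{align*}
P(G \mid E)\,P(E) &= P(E \mid G)\,P(G)
  && \text{(product rule applied to } G \,\&\, E \text{ both ways)} \\
P(\text{not-}G \mid E)\,P(E) &= P(E \mid \text{not-}G)\,P(\text{not-}G)
  && \text{(same step for not-}G\text{)} \\
\frac{P(G \mid E)}{P(\text{not-}G \mid E)}
  &= \frac{P(E \mid G)}{P(E \mid \text{not-}G)} \times \frac{P(G)}{P(\text{not-}G)}
  && \text{(divide the first line by the second)} \\
O(G \mid E) &= \frac{P(E \mid G)}{P(E \mid \text{not-}G)} \times O(G)
  && \text{(definition of odds)}
\end{align*}
```

The term $P(E)$ cancels in the division, which is why the odds form never requires the overall probability of the evidence.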
This formula describes the way knowledge of a new item of evidence (E) would influence a Bayesian rational decision maker's evaluation of the odds that a defendant is guilty (G). Since the law assumes that a fact-finder should be rational in this Bayesian logical sense, this is a normative model; that is, the Bayesian equation describes the way the law's ideal juror evaluates new items of evidence. What this equation says is that the odds (O) that a defendant is guilty, given the introduction of a new item of evidence, is equal to (1) the probability that the evidence would be presented to the jury if the defendant is in fact guilty, (2) divided by the probability that same evidence would be presented to the jury if the defendant is in fact not guilty, (3) times the prior odds of the defendant's guilt. The prior odds are the odds that would have been given of the defendant's guilt before receipt of the item of evidence in question.
For example, suppose at some point in a criminal trial the fact-finder believes that the odds are fifty-fifty, or 1:1, that the defendant is guilty. A more familiar way of stating this is that the fact-finder believes that the probability of the defendant's guilt is .50. The evidence next received proves the following: that the perpetrator's blood, shed at the scene of the crime, was type A; that the defendant has type A blood; and that fifty per cent of the suspect population has type A blood. Thus, if the defendant were the perpetrator, the probability that the blood found at the scene would be type A is 1.0. The probability that the blood would be type A if someone else committed the crime is .50, or 1/2, since half of the other possible suspects have type A blood. Plugging these figures into the formula indicates that after receiving the blood evidence a rational decision maker would evaluate the odds of guilt as:
$$O(G \mid E) = \frac{1.0}{.5} \times \frac{1}{1} = 2$$
The new evidence has raised the odds in favor of the defendant's guilt to 2, or 2:1. Another way of stating this result is that the fact-finder's best estimate of the probability that the defendant is guilty is now .67. Evidence that changes an estimated probability of guilt in this fashion is clearly relevant in a criminal trial.
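The blood-type arithmetic can be checked with a short Python sketch. The function names `posterior_odds` and `odds_to_probability` are illustrative choices, not taken from the excerpt, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction


def posterior_odds(p_e_given_g, p_e_given_not_g, prior_odds):
    """Odds form of Bayes' Theorem: O(G|E) = [P(E|G) / P(E|not-G)] x O(G)."""
    return Fraction(p_e_given_g) / Fraction(p_e_given_not_g) * Fraction(prior_odds)


def odds_to_probability(odds):
    """Convert odds in favor of a proposition to a probability: p = o / (1 + o)."""
    return odds / (1 + odds)


# Figures from the text: prior odds 1:1, P(E|G) = 1.0, P(E|not-G) = 1/2.
blood_odds = posterior_odds(1, Fraction(1, 2), Fraction(1, 1))
blood_probability = odds_to_probability(blood_odds)

print(blood_odds)                          # 2
print(round(float(blood_probability), 2))  # 0.67
```

The conversion step is what turns odds of 2:1 into the probability of .67 quoted in the text.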
Consider another case. Assume that the range of possible suspects has been limited to voters in a community so conservative that only one out of ten voters supports the liberal candidate. While a group of conservative jurors drawn from this community might be angered by evidence that the defendant supports the liberal candidate, such a showing would not influence the judgment of an ideal juror. Absent some reason to believe that liberals are more prone to commit the crime in question, the probability that the defendant could have been shown to be a liberal were he guilty is .1, the same as the probability that he could have been shown to be a liberal were he not guilty. Solving the Bayesian equation we find:
$$O(G \mid E) = \frac{.1}{.1} \times O(G)$$
[thus here O(G|E) = O(G)]
The odds of the defendant's guilt remain O(G), the same as they were before the jury learned of the defendant's political affiliation. In these circumstances evidence of the defendant's political affiliation is not relevant.
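The same point can be illustrated in Python. In this sketch the function name `likelihood_ratio` and the three prior odds are assumptions chosen for illustration; only the .1 figures come from the example.

```python
from fractions import Fraction


def likelihood_ratio(p_e_given_g, p_e_given_not_g):
    """Return P(E|G) / P(E|not-G) as an exact fraction."""
    return Fraction(p_e_given_g) / Fraction(p_e_given_not_g)


# One voter in ten supports the liberal candidate, whether guilty or not.
political_lr = likelihood_ratio(Fraction(1, 10), Fraction(1, 10))

# Hypothetical prior odds of guilt (1:4, 1:1, 3:1), not taken from the text.
hypothetical_priors = [Fraction(1, 4), Fraction(1, 1), Fraction(3, 1)]
updated = [political_lr * prior for prior in hypothetical_priors]

print(political_lr)                    # 1
print(updated == hypothetical_priors)  # True
```

Whatever the fact-finder believed beforehand, multiplying by a ratio of one returns the same odds.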
1. Logical Relevance
In both examples the effect of the evidence on the decision maker's final judgment as to guilt turns entirely on the ratio,
$$\frac{P(E \mid G)}{P(E \mid \text{not-}G)}$$
conventionally called the likelihood ratio. In the first example P(E|G) was twice P(E|not-G), and the fact-finder doubled its prior odds of the defendant's guilt. In the second example P(E|G) and P(E|not-G) were the same, so the likelihood ratio was one, and the fact-finder's prior estimate of the defendant's guilt remained unchanged. In terms of the Bayesian model, it will always be the case that the impact of new evidence on the prior odds of guilt, or on any other disputed hypothesis, will be solely a function of the likelihood ratio for that evidence. [Note by Tillers: This is not true if the evidence is itself uncertain.] Where the likelihood ratio for an item of evidence differs from one, that evidence is logically relevant. This is the mathematical equivalent of the statement in FRE 401 that "relevant evidence" is "evidence having any tendency to make the existence of any fact that is of consequence to the determination of the action more probable or less probable than it would be without the evidence." (Emphasis added.) Hence, an item of evidence is logically relevant only when the probability of finding that item of evidence given the truth of some hypothesis at issue in the case differs from the probability of finding the same item of evidence given the falsity of the hypothesis in issue. In a criminal trial, if a particular item of evidence is as likely to be found if the defendant is guilty as it is if he is innocent, the evidence is logically irrelevant on the issue of the defendant's guilt.
As a practical matter courts may be justified in treating evidence as logically irrelevant when the likelihood ratio for that evidence is only slightly different from one, since such evidence will have little effect on the odds that the disputed hypothesis is true. A slight difference in this context must be very small indeed, since a likelihood ratio of 1.5 would lead a fact-finder to increase by fifty percent his prior estimate of the odds in question and a likelihood ratio of 2.0 would result in a doubling of the prior odds.
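To see how much a given likelihood ratio moves a fact-finder, the sketch below converts a prior probability to odds, applies the ratio, and converts back. The starting probability of 1/2 and the ratio 101/100 are assumptions added for illustration; 1.5 and 2.0 are the ratios named in the paragraph above.

```python
from fractions import Fraction


def posterior_probability(prior_probability, likelihood_ratio):
    """Update a probability by converting it to odds, multiplying by the
    likelihood ratio, and converting the result back to a probability."""
    prior_odds = prior_probability / (1 - prior_probability)
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1 + posterior_odds)


# Hypothetical starting point: the fact-finder's prior probability is 1/2.
prior = Fraction(1, 2)

# 3/2 and 2 come from the text; 101/100 is an assumed example of a ratio
# "only slightly different from one".
for lr in (Fraction(101, 100), Fraction(3, 2), Fraction(2)):
    print(lr, float(posterior_probability(prior, lr)))
```

On these assumptions a ratio of 1.5 moves a probability of .50 to .60, while a ratio of 1.01 barely moves it at all.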
It is clear from the model that the likelihood ratio depends entirely on the relative magnitudes of P(E|G) and P(E|not-G) and not on the absolute size of either. [Note by Tillers: Again, Schum has argued that this is NOT true when there are "source uncertainties." In how many cases is it uncertain whether an alleged event that serves as evidence of a matter such as guilt is or is not true?] Thus, evidence that is unlikely to be associated with a guilty defendant will nevertheless be probative of guilt so long as the evidence is more (or less) likely to be associated with someone who is not guilty. Suppose, for example, that in an assault case the state can show both that the defendant is a heroin addict and that one in 500 criminal assailants are heroin addicts. Thus, it is quite unlikely that any given criminal assailant is a heroin addict, but this does not necessarily make the additional evidence exonerative or irrelevant. If the state can also show that only one in 1000 people who never engage in criminal assaults are heroin addicts, knowing the defendant is an addict should lead a fact-finder to double her prior odds that the defendant was the assailant. Conversely, if there was one heroin addict for every 250 non-assailants, evidence of the defendant's addiction should lead the fact-finder to halve her prior odds on the defendant's guilt. In either of these supposed cases there may be good reason to keep evidence of the defendant's addiction from the jury, but the reason is not that the information is logically irrelevant.
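The two heroin-addiction scenarios can be worked through in a few lines of Python. The variable names are illustrative; the rates are those given in the paragraph above.

```python
from fractions import Fraction

# P(E|G): one in 500 criminal assailants is a heroin addict.
p_addict_given_assailant = Fraction(1, 500)

# Scenario 1: one in 1000 non-assailants is an addict.
lr_rare_among_innocent = p_addict_given_assailant / Fraction(1, 1000)

# Scenario 2: one in 250 non-assailants is an addict.
lr_common_among_innocent = p_addict_given_assailant / Fraction(1, 250)

print(lr_rare_among_innocent)    # 2
print(lr_common_among_innocent)  # 1/2
```

In both scenarios P(E|G) is the same small number; only its size relative to P(E|not-G) determines whether the prior odds double or halve.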
....