

 

Philosopher contemplating sunset -- or, possibly, dinner.




 

The law needs a satisfactory mechanism for ascertaining whether or not instances of the types of spatio-temporal states that are said (by the law) to be necessary are in fact present. Whether or not U.S. law as it now stands has a satisfactory mechanism of this sort is unclear: quite literally, no one really knows if there is an adequate proof mechanism -- because no one really knows how to test how well (or how poorly) the existing proof mechanism works. (Nonetheless, most of us have our suspicions -- one way or another.)

 

 


 

Notes

for

Konstanz 2003 International Summer School Lectures on

Probability and Uncertainty in Law

© 2003 Peter Tillers

 

Introduction

 

Two Types of Uncertainty in Law

 

Uncertainty is a pervasive feature of law and legal systems. For example, uncertainty attends the interpretation of many or all legal norms. Legal systems, however, are also forced to grapple with a different kind of uncertainty. The law [1] frequently (and perhaps always) conditions the application of legal norms on the occurrence or non-occurrence of certain kinds of spatio-temporal events or conditions. The law conditions its own use in this way even though the occurrence or non-occurrence of instances of the kinds of spatio-temporal states that the law prescribes is always or almost always necessarily uncertain to some extent.  Consequently, the phenomenon of factual uncertainty confronts the law at every step.

 

Factual uncertainty -- uncertainty of propositions or hypotheses about states of the world  -- has been a prominent topic and problem for a very long time. Factual uncertainty has presumably always interested specialists in the law of evidence and proof. Factual uncertainty has been a major topic of philosophical discussion for millennia: the topic was of interest, for example, to Aristotle; [2] and, notwithstanding occasional suggestions to the contrary, [3] factual uncertainty interested medieval thinkers as well. [4] Finally, as is well known, factual uncertainty has been a central problem in epistemology for at least 200 years. [5] Despite this impressive lineage – despite the longstanding recognition of the phenomenon of uncertainty about facts – U.S. and U.K. legal theorists in the twentieth century devoted very little attention to factual uncertainty. [6] American and British legal theorists and philosophers were and apparently still are far more interested in matters such as the uncertainty of legal norms and the uncertainty or inconclusive nature of argument about legal norms. [7] I believe that this imbalance of attention is quite unfortunate. Today I will do just a little bit to correct this imbalance. In my time with you today I will focus on factual uncertainty and factual inference. I will talk about the law’s use of one particular method – the process of “proof” in litigation and adjudication – to make judgments and decisions about uncertain factual propositions. [8]


 

Factual Proof in Law as a Web of Evidence and Inference

 

A. Legal Rules as Pegs [9] for the Fabrication of Proof in Litigation [10]

 

Legal Rules as Conditional Imperatives

 

Although the focus of my discussion today is factual uncertainty in legal settings rather than legal uncertainty, I cannot afford to ignore legal rules altogether and, so, I shall not do so. But to keep my topic within manageable bounds I will make the provisional assumption that the extent of uncertainty about the identity and the meaning of legal rules is very limited. Toward the end of the lectures, I will relax this assumption.

 

Legal rules influence factual uncertainty and factual inference in legal settings in a variety of distinct ways. For example, legal rules may speak directly to the inferences that can be drawn from various kinds of evidence; they may serve to filter the evidence that a decision maker is allowed to see; they may instruct a decision maker and trier of fact about the burden of proof and persuasion that one party or another must discharge before a particular verdict can be returned; and they typically regulate how evidence is gathered prior to litigation or trial. These are adjectival rules. But substantive legal rules also influence the course of uncertain factual inference and proof in litigation, as well as factual inference and discovery in the shadow of possible litigation. Substantive legal rules have this sort of influence because they specify the conditions under which particular legal actions or effects should ensue.

 

It is true that legal rules can serve a wide variety of purposes; for example, legal rules can have hortatory, inspirational, educational, symbolic, and even cathartic purposes. But (as I have already suggested) substantive legal rules [11] also function as conditional imperatives.  Legal rules in the U.S. characteristically (or frequently) take the form of “if, then” statements. [12] These legal “if, then” statements provide that if conditions or circumstances m, n, and o obtain, then legal action x shall, should, or may be taken, or jural state or relation y shall, should, or may ensue. In the U.S. such antecedent conditions – the conditions in the “if” clause in this sort of “if, then” statement – are often called the “elements” of legal rules.

 

Such elements of a legal rule establishing a right or a duty are often called the “essential elements” [13] of the right or duty created or established by that legal rule. The law characterizes the ingredients of legal rules in this fashion – it calls them essential elements -- because of the seemingly-tautological premise that the elements of a right- or duty-creating legal rule set forth the necessary ingredients or conditions for the existence of the right or duty created or established by that legal rule.

·        If the doctrine of “essential elements” is described and explained in the way that I have just described it, the doctrine is unobjectionable – providing, of course, that one grants that legal rules are the sources of rights and duties.  The doctrine or notion of essential conditions, however, has some capacity to mislead people who are not members of the legal profession. A person without legal training might quite reasonably assume that U.S. law takes the view that each of the elements in a legal rule that purports to define a right or a wrong such as murder is essential for the existence of the right or the commission of the wrong. This seemingly plausible assumption, however, is usually incorrect. The definitions of rights and wrongs that one finds in particular legal rules are ordinarily not exhaustive definitions. It is usually the case that there are alternative ways of creating a right such as a copyright, and there are ordinarily alternative ways of committing wrongs such as murder or negligence. What this means is that the purported “essential” elements of a rule that purports to define a matter such as “negligence” are often not “necessary” at all! It is frequently more accurate to say that there are usually alternative paths to the creation of legal rights and legal wrongs and that each of these alternative paths has its own set of necessary or essential conditions. For example, the law provides that “murder” can be committed in a number of different ways. In each of these various ways a death is necessary. But “intent to kill” is not a necessary ingredient of all types of murder; it is a necessary ingredient of only a certain kind of murder (“straight murder”). [14] Similarly, for certain types of murder charges – for felony-murder charges – the prosecution must show that the death resulted from the commission of a felony by the accused, a felony such as arson or burglary. For this kind of murder charge the commission of a felony other than homicide is an essential element of the crime charged. The commission of a separate felony, however, is not an essential element of different kinds of murder charges, such as depraved-heart or reckless-indifference murder. (The importance of the general point I am making may not be apparent now. But its importance will become more obvious toward the end of my lecture(s), when I have done some talking about the dynamics of proof in litigation and about the wrinkles and surprises that time routinely injects into the process of proof in litigation.)


       Relationship between Conditional Legal Imperatives and “Proof of Facts” in Litigation

 

As I have already noted (and as you already surely knew), the reasons for the existence of legal rules and systems of legal rules are various. But one of the impulses supporting legal systems is the idea or ideal of the “rule of law,” the notion of a Rechtsstaat. Another factor supporting the existence of legal regimes is the common desire for efficient collective regulation of social activity. Because of these two factors and possibly for other reasons as well, [15] there is a widespread and deep-seated belief that it is very important that systems of law authorize and allow the production of serious consequences such as criminal sanctions, the imposition of taxes, property transfers, and the transformation of a not-for-profit entity into a for-profit entity only when -- only if -- those conditions are present that the law itself (via legal rules) proclaims to be necessary if certain serious legal actions or consequences are to ensue.

 

The semi-paradoxical but deeply-felt principle that the administration of legal rules should be bound by legal rules – including the rules and the principles in the rules to be administered – this belief that the law must respect the conditions that the law places on its own application is one reason why the topic of factual uncertainty in law is very important. It is widely assumed that most “elements” of legal rules refer to possible spatio-temporal events. [16] It is also widely (if tacitly) assumed that the bare fact that a particular element is an element of a particular rule means that the right, duty, claim, or defense created by the rule applies or should apply only if it can be said that an instance of the type of possible spatio-temporal event identified by an element of a legal rule is not merely a possibility, but that it was, is, or will be an actuality. In short, there is a widespread and ingrained belief that the application of legal rules is, by the terms of those very rules, conditioned on the existence or non-existence of particular spatio-temporal events or states of a spatio-temporal framework, on “facts.” Given such assumptions and beliefs about the character of the requirements that legal rules appear to embed in themselves, it seems apparent that the law needs sound methods for resolving disagreements in litigation about factual questions. The ability of the law to reach tolerably satisfactory levels of fact-finding accuracy may be essential if a legal regime is to be generally regarded as legitimate, and some level of accuracy may even be necessary if a legal regime is to be able to survive over the long haul. A legal system that regularly imposes legal consequences when the conditions said to be essential for those consequences are not present is generally thought to be a lawless or unprincipled system of law – and, in extremis, it may not even be perceived as a system of law. [17] And any system of legal rules that regularly permits legal consequences to ensue when the law proclaims that they should not ensue would surely be regarded as inefficient – since, practically by hypothesis, such a legal system is quite incapable of realizing its stated objectives except through dumb luck.

 

·        Proof of facts is not a legal sideshow. The process of factual proof in legal settings is a central feature, not only of adjudication and litigation, but also of legal systems as a whole.

 


 Some Peculiarities of Factual Inference in Legal Contexts


Assume now that substantive legal rules are conditional imperatives. Assume, furthermore, that the object of evidentiary processes in legal proceedings is to ascertain the existence or non-existence of the factual requirements for the application of legal rules. Given these assumptions, what can be said about the structure of inference and proof in legal proceedings? I shall start to explore this  question next. It is a question with a great many answers.

 

 

 

 Some Sources of the Multiplicity of Factual Issues in Litigation: (i) Multiplicity of the Conditions in Conditional Legal Imperatives: Multiplicity of the Elements, or Ingredients, of Rights, Duties, Claims, Charges, and Defenses.

 

You can expect that my brief examination here today will show that inconclusive argument or inference about factual questions is quite intricate, quite complex. But we should not rush into that thicket without first considering another source of both the complexity and the instability of inconclusive inferential argument about factual issues in litigation. Long before decision makers or actors in litigation engage in intricate (and unstable) reasoning about this or that factual question in litigation -- and, indeed, at the very same time that decision makers engage in such inferential deliberation -- decision makers and participants in litigation may be required – they ordinarily are required – to wend their way through a different kind of thicket or spider-web, one that interlocks and interacts with the ins-and-outs of complexes of arguments from and about evidence. This is because in legal settings [18] -- or, to speak more broadly and also more accurately, in settings or situations in which it is expected or thought that legal doctrine may come into play -- decision makers and actors are presented, not with a single inferential task, but with a large array of factual issues and problems. There are at least several distinct reasons for this. One important reason is the nature and workings of law and legal doctrine. In litigation and in adjudication (as well as in many other legal and social contexts) legal rules are frequently a source of the tasks, including the inferential tasks, that decision makers and actors are required, willy-nilly, to perform if they wish to maximize their prospects for favorable outcomes (whether in litigation and adjudication or in other contexts).

 

But why does the influence of legal rules on outcomes tend to multiply factual issues? Consider my (tentative and possibly controversial) account of the structure and function of legal rules. In my Kelsenesque account of the nature and workings of legal rules, legal rules contain antecedent conditions. These antecedent conditions play a part in determining the factual issues that arise in litigation. My Kelsenesque reading of legal rules implies that evidence must be introduced to establish the existence of an instance of each type of spatio-temporal event that the legal rule upon which one wishes to rely provides is necessary for the application of that legal rule.

 

So be it.

 

But if we think about legal rules in this way -- if we think this is the nature of legal rules and that this kind of proof is what the existence of legal rules entails or implies --, we are faced with a mystery, possibly a small mystery, but a mystery nonetheless. This minor mystery is pertinent to the present inquiry: how and why do law and legal doctrine tend to multiply factual issues -- and thereby render inferential argument more intricate and make it less stable?

 

Most scholars who venture to discuss the structure of factual inference in litigation assume that there is but a single issue in litigation. For example, the issue before the trier of fact, they suppose, might be, “Did Johnny Jones stab Valiant Victim or not?”

 

The typical picture that such scholars have of factual disputes in litigation looks something like this:


 

 

 

X  = death      E = evidence potentially suggesting or showing death

 

FIGURE 1

 

 


But the assumption that solitary factual issues are the rule rather than the exception in litigation is rather mysterious.

 

In reality, there is often controversy and disagreement about quite a large number of factual questions.

 

There are at least several reasons why this is so.

 

 Let me explore just one of those reasons with you now.

 

Consider the English common law definition of burglary:

 

Burglary is the breaking and entering into a dwelling at night with the intent to commit a felony therein.

 

As you can see, this definition of burglary provides that burglary has quite a few essential elements. The crime of burglary is committed (under English common law) if and only if:

                    i. there was a breaking; e.g., a door latch or a window was broken

                 ii. there was entry; e.g., the malefactor’s foot crossed the threshold

               iii. the malefactor’s trespass was against a dwelling, a home, and not, for example, an outhouse or a barn

               iv. the breaking and entering occurred at night; i.e., after sunset

                  v. the malefactor had the intent to commit a crime (in addition to the trespass)

               vi. the accused had a criminal intent at the time of the breaking and entering; a criminal intent that formed later would not suffice for a conviction of burglary

             vii. the miscreant intended to commit a felony (such as rape), and not, for example,  merely a misdemeanor

          viii. the miscreant intended to commit that felony “therein,” in the dwelling
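
A short sketch may help show why this list multiplies the factual issues. The Python below is a hedged illustration, not a statement of doctrine: the element labels are paraphrases of the eight items above, and the names `BURGLARY_ELEMENTS`, `unproven_elements`, and `burglary_established` are hypothetical. The charge is treated as a conjunction, so each element is a separate question and a single unproven element defeats the charge.

```python
# Paraphrased labels for the eight elements listed above (illustrative only).
BURGLARY_ELEMENTS = [
    "breaking",
    "entry",
    "dwelling",
    "at night",
    "intent to commit a further crime",
    "intent existed at the time of the breaking and entering",
    "intended crime was a felony",
    "felony was to be committed therein",
]


def unproven_elements(findings):
    """List the elements not found to be present.

    `findings` maps an element label to True or False; an element that the
    trier of fact never addressed counts as unproven.
    """
    return [e for e in BURGLARY_ELEMENTS if not findings.get(e, False)]


def burglary_established(findings):
    """Burglary is made out only if every one of the elements is found."""
    return not unproven_elements(findings)


if __name__ == "__main__":
    findings = {element: True for element in BURGLARY_ELEMENTS}
    print(burglary_established(findings))   # all eight issues resolved in favor
    findings["at night"] = False            # a single failure defeats the charge
    print(unproven_elements(findings))
```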

 

My list of the elements of the common law crime of burglary suggests that a more descriptive picture of the problems of inference typically faced in litigation would look something like this (Figure 2):


 

 

 

 

FIGURE 2

 

 

 

 

 


Well, as you can see, the inferential task facing the trier of fact has become more complex. And, as you know, in the case of a burglary charge, a diagram having the sort of structure shown in Figure 2 would have to have at least eight boxes, or nodes, at the very top.

 

But such a picture does not yet have nearly enough detail to provide us with a passable picture of the structure of the inferential tasks that triers of fact typically confront in litigation.

 

Before I go on to develop a more nuanced picture of the structure of problems of inference and proof in litigation, let me pause to make one important general observation:

 

One object of litigation and of proof in litigation is to generate answers to questions. But litigation is not an attempt to answer any questions that may happen to interest the parties or the trial judge. Law is an authoritative system. This authoritative system, as we have seen, specifies – authoritatively – the conditions under which specified legal consequences are to ensue or are allowed to ensue. What you must note here is that the law constrains ordinary human inclinations and curiosity by providing that litigation must resolve certain questions that the law regards as important. (Those questions are framed by the elements of the rights, rules, etc. that are in play in the lawsuit in question.) So the bottom line is that legal rules pose some of the questions that must be addressed by evidence in litigation. Furthermore, the authoritative character of the law and legal rules means that some questions that the parties or the trial judge might think should be addressed and answered in the dispute before the court cannot be addressed or resolved – because the law authoritatively views some factual propositions as immaterial to the resolution of the controversy before the court. So authoritative law both raises and forecloses some factual questions. [19]

 

 

Sources of the Multiplicity of Factual Issues in Litigation: (ii) The Generic or “Abstract” Character of the Elements of Legal Rules

 

Although the diagram in Figure 2 is useful – Figure 2 does capture an important feature of factual proof (and persuasion) in litigation and adjudication –, the diagram in Figure 2 misrepresents – by omission – a very important feature of the relationship between the elements of a legal rule – i.e., the law’s definition of a legal rule -- and the factual issues in a legal proceeding such as a lawsuit. The diagram in Figure 2 suggests that the elements of a legal rule are themselves possible facts in issue and that the evidence in a lawsuit goes to the question of the truth or falsity of those elements. But there is something very wrong with this picture; there is a fundamental error, the nature of which becomes apparent if we bring to mind that the elements of a legal rule are part of the definition of a legal rule or a legal right. Evidence of the sort that I have mentioned in my discussion thus far cannot show that the elements of a legal rule, the ingredients of a legally-prescribed definition of a rule, are true or false. There is, to be sure, a link between evidence and the elements of legal rules but the nature of that link does not resemble the sort of direct link that Figure 2 seems to suggest.

 

The matters represented by letters such as X in Figure 2 are the ingredients or elements of legal rules – e.g., “intent to kill”, “death” of a “person.” “Elements” or ingredients such as these are not themselves particular spatio-temporal events; they are not particular states of a spatio-temporal framework (except, of course, to the extent that any principle or rule – such as F = MA – is itself a spatio-temporal event). Elements of legal rules – the conditions found in conditional legal imperatives – are, instead, “abstract” or generic; i.e., a statement or legal principle or proposition such as the statement that a person commits murder if and only if he or she has an “intent to kill” does not amount to the proposition or hypothesis that a particular person in a particular place and time has, had, or will have an intent to kill. The factual matters at issue in litigation, however, are typically about such specific spatio-temporal events or states; the factual matters in dispute in a lawsuit are usually not about the (possibly factual!) question whether it is true or false to say that the law calls killing murder only if a killer intended to kill. [20]   There is, therefore, a distinction between the elements of a legal rule and the factual hypotheses that are ultimately in issue in a legal proceeding such as a trial.

 

The distinction I am making between disputes about the elements or ingredients of a legal rule and disputes about factual matters is not merely of scholastic significance.  The distinction is important because observers who wish to understand uncertainty in law need to understand that the “essential elements” of legal rules – the generic ingredients of legal rules -- do not, by themselves, establish which of a vast number – an uncountably large and possibly an infinite number – of possible “ultimate” [21] factual propositions are in issue in a legal proceeding such as a trial. [22]

 

Since the conditions in conditional legal imperatives are usually generic, such generic conditions can be satisfied by – or “instantiated” in – any one of a very large number of spatio-temporal events. For example, an “intent to kill” is a requirement for the commission of a certain kind of murder (“straight murder,” or, “intent-to-kill murder”).  An example of a typical factual issue (in litigation) is whether or not Sammy Smith had the intent to kill at 5:30 p.m. on June 1, 2003. An affirmative answer to this question about Sammy Smith is not the only way that the generic intent requirement for murder can be satisfied in a legal proceeding. It is not even the only way that an intent requirement can be satisfied in a trial of a murder charge against Sammy Smith. For example, the intent requirement in a trial involving Sammy Smith can also be satisfied by evidence showing that Sammy Smith had an intent to kill on July 15, 2003.
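
The relation between a generic element and its possible instantiations can be sketched as an "at least one" test. The Python below is a minimal illustration under assumptions: the class `FactualHypothesis`, the function `element_satisfied`, and the `found_true` values are all hypothetical and invented for the example; only the person, the element, and the two dates come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactualHypothesis:
    """A claim about a particular person, state, and time (hypothetical structure)."""
    person: str
    state: str        # the generic element this hypothesis would instantiate
    when: str
    found_true: bool  # invented finding, for illustration only


def element_satisfied(generic_element, person, hypotheses):
    """A generic element is satisfied if at least one particular hypothesis
    about that person, instantiating that element, is found to be true."""
    return any(
        h.found_true
        for h in hypotheses
        if h.state == generic_element and h.person == person
    )


hypotheses = [
    FactualHypothesis("Sammy Smith", "intent to kill", "June 1, 2003, 5:30 p.m.", False),
    FactualHypothesis("Sammy Smith", "intent to kill", "July 15, 2003", True),
]

if __name__ == "__main__":
    # The first hypothesis fails, but the generic requirement is still met
    # because a different particular event instantiates it.
    print(element_satisfied("intent to kill", "Sammy Smith", hypotheses))
```

The list of candidate hypotheses is open-ended: nothing in the generic element itself fixes which particular events will be put in issue.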

 

You can see where I am going: the number of possible spatio-temporal events or states that could “instantiate” or “exemplify” or “satisfy” a generic legal requirement such as the generic requirement of an “intent to kill” is very large. I do not mean to suggest that generic legal requirements such as “intent to kill” place no constraints on the factual hypotheses that can come to be in dispute in a legal proceeding; there are, after all, a very large number of possible spatio-temporal events or states that could not “instantiate” or “exemplify” or “satisfy” such a requirement. But I do want to reiterate that the number of possible particular spatio-temporal events that, were they to occur, would serve as instances of a generic condition such as “intent to kill” is plainly very large, and it may well be infinite. (Even if the number of such possible events or states is not infinite, it is very large, and probably uncountably large, I would think.)

 

So what?

 

The “so what” is in part revealed by the kind of diagram that we must construct if we are to capture the difference between, on the one hand, a generic legal requirement, and, on the other hand, a particular historical event that seems to constitute an example or an instance of a generic legal requirement. The picture in Figure 3 seems to do the trick. (Please note that this picture rests on the simplifying but potentially misleading assumption that the satisfaction of only a single generic legal requirement is ordinarily in question in a lawsuit or proceeding.)


 

 

 

FIGURE 3

 


By now you may be asking yourselves – by now you should be asking yourselves – what sort of a thing you are looking at when you look at the diagram in Figure 3. The diagram seems to have some of the characteristics of graphs. For example, the diagram seems to have nodes and arcs. But the nodes are peculiar, aren’t they?

 

The first oddity is that the diagram in Figure 3 seems to have two or perhaps even three kinds of nodes. The diagram has a rectangular node as well as two circular ones. The two circular nodes are in different colors. Is this kosher? Are the rectangular nodes meant to be qualitatively different from the circular nodes? And are the two different colors meant to express a fundamental difference between green circular nodes and yellow circular nodes? Does the use of such distinctive shapes and colors amount to an attempt to make qualitative distinctions among different kinds of nodes? If so, is this permissible within the framework of graph theory? (Isn’t a node just a node?)

 

The second feature of the diagram that may strike some of you as odd is that the directed arcs – they all appear to be directed arcs (but are they really?) – the directed arcs (if that is what they are) run upwards rather than downwards in the diagram in Figure 3. This feature alone may not lead you to say that the thing in Figure 3 is not a graph. But you might well wonder why the diagram does not follow the convention employed in many Bayes nets, which is to make directed arcs run downwards, from a hypothesis such as X – or should we say X1? – to matters E that serve as evidence of X or (alternatively) X1?

 

Let me grant that the graph in Figure 3 is a strange-looking graph and let me also stipulate that it may not be a graph at all, not a single or pure one, in any event. Let me also stipulate that the questions I myself have asked about Figure 3 are both interesting and important. But for the moment, let’s put aside the question of whether the picture or diagram in Figure 3 is a true graph and let’s just assume I have the right to draw my picture the way I have in fact drawn it. Is anything gained thereby?

 

The answer is, “Yes.”

 

We gain at least two things from the sort of diagram or picture found in Figure 3.  First, the fact that an arc passes from X1 to X (or the reverse) suggests that the connection between X and X1 can be insecure. Stated differently, the presence of that arc suggests that the step from X1 to X, or vice-versa, involves some kind of inference, the drawing of some kind of a conclusion. If that is what this kind of arc suggests or implies, it performs a valuable service. That’s because some sort of an inferential process, some sort of the drawing of conclusions, is in fact at play in the relationship between generic legal principles and specific events or states of the world: as all of you almost certainly know, there can be argument and there often is argument – vigorous and extensive argument! – about the strength and the nature of the connection between the two types of propositions. Even the fact that the arc runs upward, from X1 to X, is a fortunate fortuity – because the upward direction of the arc suggests that there can even be debate about the direction in which inference runs and it perhaps suggests that particular spatio-temporal events may sometimes help to constitute or shape generic legal requirements or principles. (In the eyes of a person schooled in the common law, this last possibility does not seem at all strange or bizarre.)

 

The second valuable contribution made by the visual distinction between nodes of type X and nodes of the type X1 is that the spatial separation between the two makes it possible to represent the important proposition that multiple historical events or states of the space-time framework have the capacity to instantiate or satisfy a generic legal requirement. Thus, the distinction between generic legal requirements and factual circumstances makes it possible to draw the sort of picture that appears in Figure 4:

 

 

 

 

Here X is, again, a generic requirement and it can, again, have a meaning such as “intent to kill.” As before, X1 represents a particular historical event. We might use it  to refer to a possible event such as “Sammy’s intent, on June 1, 1983, at 3:00 p.m., to kill Valiant.” X2 is a separate event. X2  might represent an event such as “Sammy’s intent, on July 15, 1988, at 4:15 p.m., to kill Valiant.”

 

·        In U.S. legal proceedings it is common – particularly but not exclusively in civil cases – for a party to proceed to trial with multiple alternative distinct hypotheses about events or states of the world that allegedly instantiate(d) some generic requirement such as “intent” or “cause-in-fact.”

 

There are at least two reasons why it is helpful to have a visual representation that maintains and displays a distinction between propositions of the form X1 and X2. First, propositions such as these are different propositions because these two propositions are propositions about distinct spatio-temporal events. (Thus, one proposition might be true and the other, false.) Our representations of the process of inference and proof in litigation at a minimum ought to correctly identify the contested propositions or hypotheses that may become the object of evidentiary submissions and evidential argument in legal proceedings.

 

Second, since separate propositions such as X1 and X2 concern distinct events or states of the world, the evidence pertinent to one hypothesis – e.g., X1 – may not be pertinent (or not pertinent in the same way or to the same degree)  to another hypothesis such as X2. If we wish to use diagrams to display and dissect the force and direction of evidence in litigation, we must, of course, clearly identify the propositions to which evidence is directed. The separation between X1 and X2 is useful, in short, because the separation makes it possible to see clearly that evidence that is pertinent to one possible historical instantiation of a generic legal requirement may not be pertinent to a separate possible historical instantiation of the same generic legal condition or requirement. Thus, we are now able to recognize that the situation shown in Figure 5 is entirely possible and, quite probably, very common.


 

Consider an illustration of the meaning of the diagram in Figure 5. The diagram might be used to represent a situation in a criminal prosecution for a murder that perhaps was committed long ago – or perhaps not. In this hypothetical case Sammy Smith is on trial in the year 2003 for the murder of Valiant Victim. E1 might represent Sammy’s statement on May 15, 1983, “Gee, I despise Valiant.” E2 might represent Sammy’s alleged statement on July 2, 1998, “Valiant’s behavior at my birthday party on July 4, 1952, still infuriates me.” [23] X1 might represent the hypothesis of “the killing of Valiant Victim by Sammy Smith on August 1, 1983.” X2 might represent the hypothesis or proposition, “Sammy Smith killed Valiant Victim on July 4, 1998.”

 

·                    Once it is understood that a question about X normally devolves into questions about propositions or hypotheses such as X1 and X2 – and that similar things ordinarily happen to other (generic) elements of a legal rule (such as the legal rule defining “murder”) – we are in a position to see that the ultimate factual question at a trial is very likely not

 

P([A|E1] + [B|E2] + … [Z|En])

 

but instead some variant of a question or proposition of the following sort:

 

P({[A1|EA1] or [A2|EA2] or … [An|EAn]} + {[B1|EB1] or [B2|EB2] or … [Bn|EBn]} + … {[Z1|EZ1] or … [Zn|EZn]})

 

 

 

Inference Networks in a Legal Style

 

Now let’s turn our gaze away from the legal scaffolding for juridical proof – from the legal pegs on which factual issues and, below them, webs of evidence and inference hang – and let’s look only at the webs of evidence and inference that hang from, or are attached to, hypotheses about specific spatio-temporal states or events. So, in terms of the notation above, let’s consider the connection between some An and some En. What is the nature of that connection?

 

The answer to this question is not straightforward. The thrust of your answer might well depend, for example, at least in part on whether you subscribe to some variant of Bayesianism or, instead, to some variant of a so-called Baconian theory of inference or induction. But perhaps we can at least agree on this much (at least provisionally): the question before us is how we should think about a proposition of the form P(An | En) – or, more broadly, how we should think about the task or problem or phenomenon of inferring the value of a hypothesis given some evidence. Let me make several observations about how some scholars, including some influential legal scholars, have conceived of this phenomenon or task.

 

John Henry Wigmore's Picture of Inference

John Henry Wigmore, unquestionably the most influential legal scholar in the field of the law of evidence – Beweisrecht -- during the 20th century, developed, in the first quarter of the 20th century, a method of diagramming argument about and from evidence that is in many respects oddly similar to the kinds of directed acyclic graphs that are used in modern Bayes’ nets.

 

Wigmore's Chart Method, 1913, a pioneering example of Computer-Supported Argument Visualization

 

Figure 6

 

Source: John H. Wigmore, Principles of Judicial Proof (1st ed., 1917). See also John H. Wigmore, The Science of Judicial Proof as Given by Logic, Psychology, and General Experience and Illustrated in Judicial Trials (3d ed. 1937).

 

Wigmore’s diagrams have a lot of clutter and many peculiarities – his diagrams are very “busy.” But most of those peculiarities and clutter need not detain us. [24] Wigmore’s charts – his graphs – have several features of enduring interest and importance.

 

The Direction of the Mind's Eye in Inference 

 

One interesting feature of Wigmore’s charts is that the arcs in his diagrams run from evidence E to hypothesis H, rather than from hypothesis to evidence. The same is true of many or most of the diagrams that are drawn by Wigmore’s followers and admirers, by people such as (i) the pair Terence Anderson & William Twining and (ii) the singleton David Schum.

 

That the arcs in Wigmore’s diagrams run from evidence to hypothesis may strike you as odd as well as interesting because if you are a Bayesian and if you use directed arcs in graphs to represent conditional probabilities, you are much more likely to draw graphs in which the direction of the arcs is reversed, in which the arcs run “downwards” from hypotheses H to the matters or events E that  serve as evidence.

 

Wigmore, like all U.S. legal scholars in the law of evidence who followed him, thought it was obvious that the central question in proof in legal proceedings concerns the probability of some factual hypothesis H given some evidence E. Furthermore, I very much doubt that he ever even considered the possibility that the arcs in his diagrams should run from hypotheses H to evidence E.

 

Bayesians are, of course, also interested in assessing the value of H given E, P(H|E). That assessment is, in a sense, what Bayes’ theorem is all about. But Bayesians are far more apt to say that the interesting question – the key question, the question that can yield a useful answer – is the probability of evidence E given hypothesis H; they are more likely to focus on the expression P(E|H) than on P(H|E). By the same token, to the extent that Bayesians draw graphs and use directed arcs to represent problems of probability, they are far more likely to draw graphs in which the arcs run “downward,” from hypotheses H to evidence E.

 

As I understand it,  modern Bayesian theory views the direction of the arcs in Bayes nets as a matter of convenience: “arc reversal” is in principle always possible.  In the eyes of legal scholars, however, the direction of the arcs in an inference network is not viewed as a mere matter of convenience. Law teachers are likely to think that the upward direction of the arcs expresses something fundamental about the nature of factual inference, of the drawing of conclusions about factual hypotheses – and they are likely to get confused, their thinking is likely to become disordered, if you suggest to them that the arcs might be made to run “backwards.”
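The claim that arc reversal is always possible in principle can be illustrated with a minimal sketch for a two-node network with binary H and E. The function names and the numbers below are hypothetical, chosen only for illustration, and the sketch assumes 0 < P(E) < 1 so that the divisions are defined. It suggests that the network H → E, specified by P(H) and P(E|H), and the reversed network E → H, specified by P(E) and P(H|E), encode the same joint distribution.

```python
def reverse_arc(prior_h, p_e_given_h, p_e_given_not_h):
    """Reverse the arc H -> E of a two-node network with binary H and E.

    Returns (P(E), P(H|E), P(H|not E)), the parameters of the network E -> H.
    Assumes 0 < P(E) < 1.
    """
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1.0 - prior_h)
    p_h_given_e = p_e_given_h * prior_h / p_e  # Bayes' theorem
    p_h_given_not_e = (1.0 - p_e_given_h) * prior_h / (1.0 - p_e)
    return p_e, p_h_given_e, p_h_given_not_e


def joint(p_root, p_child_given_root, p_child_given_not_root):
    """Joint distribution over (root, child), keyed by truth values."""
    table = {}
    for root in (True, False):
        pr = p_root if root else 1.0 - p_root
        pc = p_child_given_root if root else p_child_given_not_root
        table[(root, True)] = pr * pc
        table[(root, False)] = pr * (1.0 - pc)
    return table


# Illustrative numbers only (assumptions, not taken from any case).
PRIOR_H = 0.3
P_E_GIVEN_H = 0.8
P_E_GIVEN_NOT_H = 0.1

p_e, p_h_e, p_h_not_e = reverse_arc(PRIOR_H, P_E_GIVEN_H, P_E_GIVEN_NOT_H)
forward = joint(PRIOR_H, P_E_GIVEN_H, P_E_GIVEN_NOT_H)  # keys are (h, e)
backward = joint(p_e, p_h_e, p_h_not_e)                  # keys are (e, h)

for h in (True, False):
    for e in (True, False):
        print(h, e, round(forward[(h, e)], 6), round(backward[(e, h)], 6))
```

On this reading the direction of the arc is a bookkeeping choice; what changes is which numbers one must supply: P(E|H) together with a prior for H, or P(H|E) together with a marginal probability for E.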

 

Who has the better of the argument? Bayesians in the mold of Judea Pearl or your typical law teacher?

 

This is a situation in which mutual misunderstanding is likely to prevail, in which there are likely to be the proverbial ships that pass each other unseen in the night.  But neither view is really a direct challenge to the other, and it is not useful to ask who is wrong and who is right about this business of the direction of the arcs in diagrams or graphs that purport to depict rational factual inference.

 

Some or all of you know better than I do at least one possible reason why Bayesians tend to be interested in questions that take the form P(E|H): the importance that Bayesians attach to likelihood ratios. But I want to speculate that there may be another reason for the tendency of Bayesians to prefer downward-running (directed) arcs, arcs that run from hypotheses to matters that serve as evidence, rather than vice-versa; i.e., I want to suggest that there may be a broader explanation for the tendency of Bayesians to focus on the expression P(E|H) rather than on P(H|E). Let me put my suggestion as precisely as I can: perhaps there is a consideration in addition to formal considerations that provides psychological support for the inclination of Bayesians to focus on P(E|H), for their tendency to make arcs in Bayes’ nets run from H to E rather than vice-versa.

 

I speculate that one reason why probability theorists are comfortable with expressions of the form P(E|H) – one reason why they find it “natural” to think this way about the relationship between hypotheses and evidence – is that many Bayesians have a “scientific model” in mind. My guess is that many Bayesians think about the role of evidence in much the way that (they believe) (reputable) scientists such as physicists think about the relationship between hypotheses and evidence.

 

I trust you understand why I hesitate to speak about accounts of the scientific enterprise to an audience such as the one I have before me. But I beg you to allow me to use a very crude image of scientific investigation that I think will adequately serve my present purposes without getting me into a whole lot of trouble.

 

On one general kind of view of the scientific enterprise – on the kind of view advanced by philosophers of science such as Carl Hempel – scientific investigation involves at least two stages. One stage involves the formation of hypotheses. Another stage involves the testing of scientific hypotheses. As you undoubtedly know far better than I do, these two phases have been represented by an image called the “arch of knowledge.” In a simple version of this spatial metaphor (see Figure 7), the upward-sloping arc on the left-hand side of the arch represents the formation or development of hypotheses and the downward-sloping arc on the right-hand side represents the testing or verification of hypotheses.




FIGURE 7

 

 

Given the erudite audience I have before me, I need not dwell, nor do I want to dwell, on the details of this picture of scientific investigation and I certainly do not want to discuss controversies about how well or how poorly this picture represents how science works or should work. Suffice it to say that in the general model of science portrayed by the arch of knowledge in Figure 7 the second phase of investigation involves the extraction, whether by deduction or by some other kind of reasoning, of events or observations from the general laws or scientific principles that have been somehow developed or formulated in the first phase of the scientific investigation.  The general idea is that the investigator in the second phase of research is armed with a general hypothesis or a possible law of nature and, being so armed, uses that hypothesis to deduce or infer events, phenomena, or observations. In short, in this phase the investigator thinks: if H, then E.  And armed with the conclusions about E given H,  the investigator proceeds to see if E materializes or occurs in the fashion predicted, inferred, or deduced. This way of thinking about evidence, I am suggesting, comes naturally to your typical Bayesian: your typical Bayesian, being infused and suffused (perhaps excessively so suffused!) with a particular model or image of scientific investigation, a quasi-Hempelian, a quasi-hypothetico-deductive model – my imaginary Bayesian is comfortable in thinking in this kind of “scientific” fashion about the relationship between “hypothesis” and “evidence.” To be sure, perhaps there still exists, here and there, an occasional Laplacean-minded scientist. If so, my hypothetical Bayesian would not think exactly the way that such a scientist would: my sharp-minded Bayesian would firmly reject the view that the value of P(E|H) is ordinarily or ever 1.0 if H is (taken as) true. 
But in other respects my hypothetical Bayesian would think the same way that my hypothetical Laplacean and Hempelian scientist would during the sort of second phase of scientific investigation that is portrayed by the simple arch of knowledge in Figure 7. Furthermore, of course, she (my hypothetical Bayesian) would find it entirely natural and comfortable to think about the comparative probabilities of Es given Hs and Es given not-Hs.

 

To many of my legal friends, however, my hypothetical Bayesian’s way of thinking about the relationship between evidence and hypothesis seems strange. Legal scholars who specialize in the law of evidence and proof have a very pronounced tendency – practically a dogmatic tendency! – to think that the key question facing a fact finder in a legal proceeding is always the probability of some hypothesis or hypotheses given the evidence – that the key question is not P(E|H), but, instead, P(H|E). In fact, if asked to ponder the probability of E given H and the probability of E given not-H, my legal friends are likely to get quite confused; they are prone to become almost mentally disordered when they are asked to ponder P(E|H) and P(E|~H). Characteristically, for example, they tend to think that such questions beg the (ultimate) question, the (ultimate) question being the probability of H given E; they tend to think that questions such as the value of P(E|H) and the value of P(E|~H) beg the question of the value of P(H|E). See, e.g., H. Richard Uviller, Unconvinced, Unreconstructed, and Unrepentant: A Reply to Professor Friedman’s Response, 43 Duke Law Journal 834 (1994) (attempted but flawed critique of the Bayesian approach toward a problem taken by the legal scholar Richard Friedman; Uviller’s mind could tolerate a bit of Bayesian reconstruction!). [25]

 

When legal scholars make the assumption that the interesting question is P(H|E) – rather than the conditional probability of E given H and, then, given not-H – they are not, I think, rejecting, either expressly or tacitly, the sort of model of scientific research and investigation that I have just limned. Legal scholars do not have scientific research in mind when they talk about rational or sound methods of evaluating or approaching evidence in legal proceedings. So legal scholars are not implicitly or explicitly declaring war on conceptions of evidence that are embedded in images or models of science.

 

But even if it can safely be said that there is no overt or direct conflict between the law’s conception of the relationship between evidence and hypotheses and the conception of that relationship found in some models of scientific research, we are left with the puzzle of why legal professionals tend to think that it is obvious that the key problem or question before them is P(H|E).

 

· I am going to disappoint you: I am not going to tackle this question directly at this point. For the time being I will instead refer you to Section 3 of my Handout.  My reason for avoiding the question now is not that I think the question is unimportant. I do not think that at all. I avoid this big question only because I do not have time to do it justice. Furthermore, I am not at all sure that I have an answer for you. My Handout offers several possible explanations, including David Schum’s.



Complex -- Not Simple -- Inference 

The second noteworthy feature of Wigmore’s diagrams or graphs and those used by his admirers and followers – this again is a feature that is intrinsically interesting, and is not merely an effect of Wigmore’s awkward system of notation – is that, from a modern probability theorist’s perspective, they portray inference as “complex” rather than as “simple.” This means that Wigmore’s charts depict inference as a phenomenon having multiple steps, a series of linked inferences, and not as a phenomenon involving just a single inference.

 

Today, to a generation of probability theorists familiar with Bayesian networks and similar graphic devices, Wigmore’s depiction of inference as complex may seem both unexceptionable and unremarkable. But Wigmore’s charts first appeared no later than 1917(!). Decades would pass before the vast majority of modern probability theorists began to devote sustained attention to “cascaded” or “hierarchical” inference.


Richard Jeffrey was one of the earlier theorists to recognize the importance of the phenomenon of source uncertainty. See his justly-renowned discussion of the “rule of conditioning,” which garnered the epithet “Jeffrey’s rule of conditioning.” See R.C. Jeffrey, The Logic of Decision (1965; 2d ed., 1983). And David Schum started publishing papers about inference networks, source uncertainty, and cascaded inference in the late 1960s and early 1970s. But apparently it was not until the dawn of Bayes nets, influence diagrams, knowledge maps, and similar inference networks in the late 1970s and early 1980s that a substantial number of probability theorists started thinking seriously about multistage or complex inference problems. See generally Judea Pearl, Belief Networks Revisited, 59 Artificial Intelligence 49-56 (1993). So, all in all, the mere fact that Wigmore, entirely on his own, was devoting careful attention to very deep inference networks as early as 1917 is rather remarkable! [26]



 Webs of Inferences Not Just Chains

 

A third interesting feature of Wigmorean graphs or diagrams – I now refer to these types of graphs collectively because now we will be discussing a modification of Wigmore’s diagramming technique that all admirers of Wigmore’s charting methods have adopted, a modification that Wigmore himself did not use –,  a third feature of Wigmore-style graphs – of Schum’s directed acyclic graphs, for example –  is that the inference networks, if viewed as Bayes’ nets, are replete with dependent conditional probabilities – or, more generally speaking, with dependent conditional inferences. Furthermore, the dependencies in Schum’s inference networks are of almost every imaginable kind.

 

David Schum sometimes explicitly treats inference networks as Bayes' nets. This means that conditional probabilities reside in the nodes of Schum’s inference networks. These conditional probabilities are the conditional probabilities associated with the propositions that are represented or designated by the nodes. The arcs in the networks express dependencies among the uncertain propositions represented by the nodes. The presence of an arc between any two nodes means that, in the judgment of the person who allows that arc to be where it is, a change in the conditional probabilities associated with the proposition represented by one node can affect the conditional probabilities associated with any node that is connected to the first.

 

So: in Schum’s networks there are typically many dependencies and those dependencies are of almost every imaginable kind. For example, consider Figure 8, which shows a modified version of the graph found in Figure 8.15 in Section 4.5 at p. 423 of David Schum’s Evidential Foundations of Probabilistic Reasoning (1994):


 


 

Here – in Figure 8 -- we have an instance in which almost every node in a network is non-independent of, i.e., dependent on, i.e., capable of being influenced by, every other node in the network. Furthermore, the network you see in Figure 8 is a non-Markov network. This means there are some nodes in the graph whose conditional probabilities depend on, or can be influenced by, the conditional probabilities of remote ancestors, and not just by the conditional probabilities of immediate ancestors and the children of such immediate ancestors. For example, there is an arc between {A, Ac} and {H, Hc} as well as between {A, Ac} and {E, Ec}. Hence, the conditional probabilities associated with {E, Ec} do not “screen off” the influence between, or the dependence between, {A, Ac} and {H, Hc}, which they would do if the network were a Markov network and there were, therefore, no arc directly connecting {A, Ac} and {H, Hc}.
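The idea of “screening off” can be made concrete with a small sketch. It assumes three binary propositions A, E, and H arranged as a chain A → E → H; the direction of the arcs, the function names, and all numbers are illustrative assumptions and are not taken from Schum’s figure. In the Markov version H depends only on E; in the non-Markov version an extra arc lets A influence H directly.

```python
from itertools import product


def joint_table(p_a, p_e_given_a, p_h_given_ea):
    """Joint P(A, E, H) for binary propositions, keyed by (a, e, h).

    p_e_given_a:  dict a -> P(E=True | A=a)
    p_h_given_ea: dict (e, a) -> P(H=True | E=e, A=a)
    """
    table = {}
    for a, e, h in product((True, False), repeat=3):
        pa = p_a if a else 1.0 - p_a
        pe = p_e_given_a[a] if e else 1.0 - p_e_given_a[a]
        ph = p_h_given_ea[(e, a)] if h else 1.0 - p_h_given_ea[(e, a)]
        table[(a, e, h)] = pa * pe * ph
    return table


def prob_h(table, e, a=None):
    """P(H=True | E=e), or P(H=True | E=e, A=a) when a is given."""
    rows = [(k, p) for k, p in table.items()
            if k[1] == e and (a is None or k[0] == a)]
    total = sum(p for _, p in rows)
    return sum(p for k, p in rows if k[2]) / total


# Illustrative numbers only (assumptions).
P_A = 0.4
P_E_GIVEN_A = {True: 0.9, False: 0.2}

# Markov chain A -> E -> H: P(H | E, A) ignores A.
markov = joint_table(P_A, P_E_GIVEN_A, {
    (True, True): 0.7, (True, False): 0.7,
    (False, True): 0.1, (False, False): 0.1,
})

# Non-Markov variant: an extra arc A -> H, so A still matters once E is known.
non_markov = joint_table(P_A, P_E_GIVEN_A, {
    (True, True): 0.9, (True, False): 0.5,
    (False, True): 0.3, (False, False): 0.05,
})

for name, table in (("Markov", markov), ("non-Markov", non_markov)):
    print(name,
          round(prob_h(table, True, a=True), 4),
          round(prob_h(table, True, a=False), 4),
          round(prob_h(table, True), 4))
```

In the Markov table, learning A adds nothing once E is known, so the three printed values coincide; in the non-Markov table they differ, which is the sense in which E no longer screens off A from H.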

 

Now the graph shown in Figure 8 may not seem nightmarish to you. But if you had any sense, it would. The trouble is that if the number of nodes increases and the conditional probabilities in some nodes are dependent on those associated with other nodes, a combinatorial explosion can occur. For example, consider a “complete graph” – Kn – a graph in which every node is connected to every other node. In that case, where n represents the “number of nodes”, the number of arcs – and, in our terms, the number of dependencies among conditional probabilities – equals n(n-1)/2. (See Eric Weisstein’s World of Mathematics, http://mathworld.wolfram.com/CompleteGraph.html.) So if in Figure 8 – a graph with five (5) nodes – an arc connected every node to every other node, there would be a total of ten (10) arcs. But if the graph expanded such that there were ten (10) nodes rather than five (5) and every node in the graph were connected to every other, the number of arcs would not double, but would increase to 45. This illustrates, or hints at, a point that I am not competent to show mathematically (but you are), which is that given certain kinds of graphs, the “size” of a problem depicted by a graph is g(n) = 2^n where n is the number of the ingredients of such a graph. See David Schum, Evidential Foundations of Probabilistic Reasoning § 4.5, at 181-182 (1994). Whatever may be the precise equation that one must use for various kinds of graphs – for example, whether and when we should use the arc count n(n-1)/2 of Kn or g(n) = 2^n (or both) – for present purposes it is sufficient just to note that given certain kinds of inference networks (including, my weak intuition tells me, the network shown in Figure 8), the size and difficulty of the inference problem depicted by the networks increase exponentially as the number of nodes or the number of ingredients in the network increase.
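The two growth rates mentioned above can be compared with a few lines of arithmetic. The sketch below is only illustrative: the function names are hypothetical, and reading the “size” 2^n as the number of joint truth assignments to n binary propositions is an assumption, not necessarily Schum’s definition. Note that the arc count of a complete graph grows quadratically, whereas 2^n grows exponentially.

```python
def complete_graph_arcs(n):
    """Number of arcs in the complete graph on n nodes: n(n-1)/2."""
    return n * (n - 1) // 2


def joint_assignments(n):
    """Number of joint truth assignments to n binary propositions: 2**n."""
    return 2 ** n


for n in (5, 10, 20):
    print(n, complete_graph_arcs(n), joint_assignments(n))
# 5 10 32
# 10 45 1024
# 20 190 1048576
```

Doubling the nodes from five to ten takes the arc count from 10 to 45, as stated above, while the number of joint assignments goes from 32 to 1,024.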

 

It often takes very little evidence and very little reflection to make it necessary to generate inference networks that have far more than ten arcs or ingredients. Consider, for example, my diagrams of the inference problem presented in the legal case United States v. Robinson, 544 F.2d 611 (2d Cir., 1976) & United States v. Robinson, 560 F.2d 507 (2d Cir., 1977) (en banc). See, first, Figures 4, 6 & 7 at http://tillers.net/ev-course/materials/robinson.html#opinions and http://tillers.net/ev-course/materials/robinson.html#diagrams. Having examined those diagrams, examine Figure 5 at id. The Robinson diagrams shown in Internet Figures 4, 6, and 7 represent some of the inferences involved in the offer of gun evidence to show, variously, identity, opportunity, and preparation. If you combine these three diagrams with the diagram in Internet Figure 5, which contains a series of links representing a series of judgments relevant to a more global judgment about the credibility (in general) of just one of the witnesses (Simon) in the case, you can count, literally without putting too fine a point on your counter, at least 15 nodes or so; and, if you make certain assumptions, you can easily identify a total of 20 nodes or so.
 
To be sure, the inference problem in Robinson as portrayed in Internet Figures 4-7 does not involve a network in which every node is connected to every other node. But the network, graph, or diagram that would emerge from the combination of the networks, graphs, or diagrams shown in Internet Figures 4, 5, 6 & 7 would have at least several non-singly connected nodes. Furthermore – and this is the point on which I prefer to dwell – if you or I were to consider the inference problem presented in Robinson, it is entirely possible – yea, it is quite likely – that one or both of us would conclude that some of the nodes in the network are, after all, properly speaking, connected to nodes in addition to those already shown as being directly connected. You might, for example, conclude that nodes representing the veracity and the observational sensitivity of one of the witnesses – a man called Simon – are directly connected to the node representing the final probandum, the probandum, or proposition, that the man called Robinson, in addition to Simon, was one of the four bank robbers. You might reach this conclusion, because you might decide that it is entirely possible that the degree of Simon’s veracity and his observational sensitivity (and perhaps also his memory and objectivity!) do possibly bear a relationship to the proposition that Robinson, his former colleague in crime, is the culprit who committed the bank robbery with Simon and the other two malefactors.
 
The moral of this little story about Robinson is only that closer, or more granular, scrutiny is likely to reveal previously-unrecognized, previously-undiscerned, conditional dependencies. A case in which such dependencies multiplied precisely in this fashion – i.e., after closer examination and reflection – is the U.S. legal case called Bridges v. State, vol. 247 Wisconsin p. 350, 19 Northwestern (Reporter) 2d Series p. 529 (1945). See the diagrams in the internet figures 1-3 shown at  http://tillers.net/hearsay.html.

 



***


Simplicity and Fidelity in Representations of Inference in Legal Proceedings

A major feature of much contemporary work on Bayes nets (and inference networks generally) is that great care is taken to avoid unnecessary multiplication of dependent  conditional probabilities. Schum, however, seems almost oblivious to the supposed desideratum of keeping the number of dependencies to a manageable minimum. Schum seems to positively relish the discovery of dependencies. This is because, he says, dependencies are the key to many subtle and important inferential patterns – such as corroboration, convergence, and contradiction.


A major motivation for the strategies used by Judea Pearl to keep the number and character of dependencies under control, so to speak, is the objective of computational tractability. [27] Schum is also interested in computational tractability. Or at least he says he is. But he seems much less interested in computational convenience than are theorists such as Pearl. Indeed, Schum has proclaimed more than once in my presence that one should not distort a problem of inference just for the sake of computational convenience. This is an interesting difference of temperament, if not necessarily a direct conflict of opinion. It suggests to me that Pearl on the one hand and Schum on the other hand have different objectives. And my suspicion is that Schum’s objectives are more like those of the legal scholars he so admires.

 

Computational convenience is plainly not an insignificant consideration if automation of inference – the computability of inference – is what you are after. This, in an important sense, is what Pearl seems to want. Schum generally seems to see inference networks in a different light. His mathematical analyses of inference problems are not primarily meant to hasten the day when significant chunks of problems of inference can be automated. The purpose of Schum’s mathematical analyses and Bayesian computation is generally to promote the development of maps of the mind, of knowledge maps, representations of complex patterns of subjective beliefs and judgments. He wants inference networks to serve as ancillary devices, as support devices, as heuristic devices that decision makers can use to interrogate and elicit their own thoughts. This approach is in general more appealing to legal scholars than are the approaches of theorists such as Judea Pearl. This is because Schum’s approach, for one reason or another, ends up with the development of representations of inference that seem very intuitive and commonsensical to many legal scholars.

 

One question – one very BIG QUESTION -- that all of this raises is the following: Does Schum’s approach to complex inference problems say something fundamental about inference, does it lead to a new basic epistemological point, one that concerns the appropriate respective roles of natural and artificial reason in some contexts such as law?

 

I don’t yet know the answer to this question. But I think the issue is very much worth discussing. I think this is a BIG EPISTEMOLOGICAL QUESTION.

 

Why do I think that careful study of Schum’s approach to inference – and, by implication, the approach of some legal scholars to inference -- may pay important epistemological dividends?

 

Answer:

 

An approach such as Schum’s (and of some legal scholars as well) puts a premium on commonsense reasoning. There is a loose sense in which Schum’s approach resembles the approach(es) of the biases and heuristics theorists (some of whom must be here today). Both camps, it seems to me, are in a way suspicious of rigorously logical but extraordinarily refined and remote analytical procedures; and both camps seem to advocate deployment of a logic or procedure that is much simpler than some of the approaches that other probability theorists advocate. So there is this loose similarity or affinity. But what may be interesting about Schum’s approach (and similar approaches by some legal scholars) is the way that Schum’s approach differs from the general tack taken by some of the followers of Herbert Simon and by some of the heuristics and biases people. Although Schum does advocate the use of simplified or intuitive “evidence marshaling strategies,” he does not, in general, advocate the use of quick and dirty heuristics, cognitive strategies that deliberately ignore much of the complexity in the human environment. He seems to think that it is both possible and desirable to use simplified or intuitive analytical machinery without deliberately reducing the complexity of the problems that human beings confront. His inclination in the past has run in the opposite direction: he has advocated the use of certain types of analytical strategies – the use of certain evidence marshaling strategies – because he believes they have the ability to reveal and highlight complexities, complexities that in many instances previously were not even imagined.

 

Does this general approach – the use or development of a “simplified” “commonsensical” model, but a commonsense approach that usually or often heightens or accentuates complexity and nuance rather than suppressing it – does this general approach to inference – this “model” of inference -- make sense? This is the general question I would like you to keep in mind as I proceed. If we have time, I hope that we can return to this general question and discuss it. I hope that we can do so, not because I think I will have anything new to add, but because I would like to hear your thoughts about the general question I have posed.

 

Schum is fully aware that his methods of graphing complex inference – that the kinds of dependencies he routinely allows among conditional probabilities – not only violate the Markov condition but also produce an amount and variety of dependencies among conditional probabilities that probably almost drive people such as Pearl crazy. Schum, however, is unfazed.

 

Who, in this context, is correct: Schum (on the one side) or – alternatively – the probability theorists who insist on adherence to strategies that try to keep the number and variety of dependencies to manageable proportions? More generally speaking, which way of representing evidence and inference in trials and other legal proceedings is permissible, preferable, or (even) necessary?

 

Consider this general question in a concrete form. Suppose that the issue is whether Albert Accused is the person who killed Valiant Victim or whether someone else did so. The evidence at the trial is that several possible culprits were presented to two eyewitnesses, W1 and W2, shortly after Valiant Victim was killed and that W1 said, “I’m sure that Albert – that guy – is the culprit; he’s the one who pulled the trigger,” and that, after W1 said this, W2, W1’s younger and youthful brother, said, “I agree with W1. That guy – Albert Accused – shot the gun that killed Valiant.”

 

An American trial lawyer – and, perhaps, just about everyone else – would say that it is entirely possible that W2’s report was influenced by W1’s report – because, for example, the probability that W2 would say what he did is increased by the report made by his brother, W1 – and that, because of this, the probative value of W2’s report should be discounted. (It is also possible that there is bad blood between W1 and W2 and that W1’s report would make W2’s report less probable. In this event, W1’s report might enhance the probative value of W2’s report – but in either case, there is a possibility that W1’s report influences the probability of W2’s report.)

 

In the eyes of some U.S. and U.K. students of factual inference in legal settings a “natural” way of representing the possible influence of the report of W1 on the report of W2 might be the one that appears in Figure 9:



 

A      =         Albert Accused is the killer

SW1  =       W1’s statement,  “Albert Accused is the killer”

SW2  =       W2’s statement,  “Albert  Accused is the killer”

 

FIGURE 9

 



If an American trial lawyer were to be asked to cleanse his or her thinking (or the thinking of a trier of fact such as a jury or judge) of the dependent conditional probability shown in Figure 9, P(SW2|SW1), the likely response of such a trial lawyer would be that real problems of evidence are full of such impurities and that the scrubbing of this “impurity” from the consideration of a decision maker such as a judge or a juror would falsify, distort, and misrepresent the inferential problem that is actually at hand. If the goal of eliminating mathematically or formally inconvenient or awkward dependencies in reasoning about evidence were to be acceptable to my hypothetical U.S. trial lawyer, he or she would have to be convinced that any such alternative method of depicting the above situation would not cure the formal defect and achieve the formal desideratum at the cost of simply ignoring real and actual influences such as that of the report of W1 on the report made by W2. I do not see any probabilistic argument that would have the remotest chance of convincing my hypothetical trial lawyer that the dependent conditional probability P(SW2|SW1) should be scrubbed or rubbed out of the picture: no mode of representation that does not seem to directly assert and convey the possible influence of the one testimonial report on the other would seem to have a snowball’s chance in hell of surviving. {But cf. debate and discussion between David Schum and Paolo Garbolino on this point. The example used in their discussion is reasoning about attributes of witnesses that pertain to judgments or inferences about credibility.}

 

Ancillary Principles and Common Sense Reasoning

 

We have considered three interesting features of Wigmore- and Schum-style inference networks. Now let me turn to a fourth interesting (and, possibly, odd)  feature of Wigmore’s and Schum’s diagramming techniques. This fourth feature is not apparent from the Wigmore diagram you saw some time ago in Figure 6.

 

 

Wigmore's Chart Method, 1913, a pioneering example of Computer-Supported Argument Visualization

 

Figure 6

 

 

This fourth property does not appear in Wigmore’s own charts but it does appear in the diagrams of Wigmore’s admirers and imitators. I am referring to accounts of inference that emphasize the role of ancillary generalizations in factual inference and judicial proof. Ancillary generalizations and ancillary evidence appear explicitly and prominently in many of the diagrams of inference constructed by David Schum and by the collaborators William Twining and Terence Anderson, and by other such people.

 

As Figure 6 illustrates, Wigmore devised a “key number system.” The numbers attached to the nodes represent foundational, intermediate, and final or ultimate factual hypotheses, or factual probanda. The mechanical device of the key list was designed to make it possible for a user to keep track of all of the various parts and links of a complex inferential argument. Wigmore’s successors added to that system by adding the requirement that a user such as a judge or a law student also write down the ancillary evidence and the ancillary principles or propositions that support each move, each link, each inference, along a chain or web of inferences. This is essentially the procedure that has been advocated for years by figures such as Terence Anderson & William Twining and David A. Schum and more recently by scholars such as Robert Mislevy, Henry Prakken, and Douglas Walton.

 

The diagrams and charts that these Wigmore aficionados end up constructing are, in the eyes of some observers, sorry imitations of  “real graphs.” Wigmore-style diagrams with ancillary evidence look unnatural to many modern students of graph theory because these Wigmorean diagrams seem to have  some arcs that connect, not to nodes, but to arcs; or, alternatively, to nothing at all. For example, Wigmore’s and Schum’s diagrams of inference (and, I confess, some of my own) often take the following seemingly-forlorn pattern:

 

 

 

a

 

FIGURE 10A


 

 

 

 

 

FIGURE 10B

 

 

 

 

 

 

 

 

 

 

FIGURE 10C

 


What kinds of graphs, diagrams, representations, images, or pictures are these?

 

Wigmore, Anderson, Twining, and Schum speak here with one voice: a is ancillary evidence that supports a warrant or backing for the inference from e about {H, Hc}. This warrant or backing, they all say, is typically a thing that they call a generalization.

 

Whatever may be the underlying merits of this general sort of representation of inference, some observers will object that such representations are not graphs – because there are those odd lines out to the right – lines that appear to be arcs – and those lines or arcs out to the right in Figures 10A, 10B & 10C do not seem to connect to the two nodes in the diagram. The trouble, these observers might say, is that if we know anything at all about graphs, we know that arcs must connect to nodes, and we know that arcs cannot connect to other arcs. (If arcs stand free, so to speak, if they do not connect to any of the nodes in a particular graph G, we know that those free-standing arcs are not part of graph G.) For this reason (and perhaps also for others) some knowledgeable observers – students of graph theory, probability, and Bayesianism – have objected to Schum-style representations of complex inference – and, by indirection, to the conceptions of factual inference that many legal scholars in Evidence entertain.
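The critics' objection can be stated as a simple check. The following is a minimal sketch, not anyone's published formalism: the node labels e, H, and a follow the notation of Figures 10A to 10C, while the function name and the encoding of an arc as a (tail, head) pair are assumptions made for this illustration. An arc whose head is itself an arc fails the test that every arc must join two nodes.

```python
def arcs_not_joining_nodes(nodes, arcs):
    """Return the arcs whose two endpoints are not both nodes of the graph."""
    node_set = set(nodes)
    return [(tail, head) for tail, head in arcs
            if tail not in node_set or head not in node_set]


# Primary single-stage inference from evidence e to hypothesis H.
inference_arc = ("e", "H")

# Ancillary evidence a drawn as pointing at the inference arc itself,
# which is how the diagrams look to a graph theorist reading them as
# one single graph.
nodes = ["e", "H", "a"]
arcs = [inference_arc, ("a", inference_arc)]

offending = arcs_not_joining_nodes(nodes, arcs)
print(offending)
```

Read as one graph, the diagram contains exactly one offending arc: the line from a, whose head is an arc rather than a node.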

 

One or two of these caveators have suggested that Schum-style and Wigmore-style representations of inference can and should be reconstructed so that they become “true” graphs. This is effectively what Paolo Garbolino, a friendly critic, has advocated in print. Other observers have made similar suggestions in informal conversation and exchanges. (Judea Pearl did so in some e-mail exchanges with me over the UAI list several years ago.)

 

More is at stake here than just an academic interest in fidelity to the conventions of graph theory. What is implicitly at stake here is a question about the necessary or appropriate path to coherent argument about and from evidence.  Some of the critics of Schum-style graphs or diagrams may be effectively arguing that Schum-style graphs or diagrams – and the “logic” that such graphs or diagrams represent or incorporate -- render argument about and from evidence incoherent, illogical. (Some of Schum’s more restrained critics may want to suggest only that Schum’s diagramming strategies – and, by extension, the thought patterns of some legal scholars – inject unnecessary complexity – clutter – into representations of inference.)

 

I cannot, either in the time I have available – or, for that matter, with all the time in the world – resolve this debate. But I would like to offer several observations that seem pertinent to an assessment of the strengths and weaknesses of the rival views about the proper way to visually represent argument about and from evidence. First, it is not necessary, or even wise, to view all of the arcs in the sorts of diagrams by Schum that I have shown as belonging to a single graph. David Schum himself has described his diagrams of inferences cum ancillary evidence and warrants as tantamount to networks nested within networks: graphs within graphs. In his view, formally speaking, the sorts of representations shown in Figures 10A - 10C are not all part of a single graph; the arcs or images in ancillary networks (like those to the right of the primary [vertical] inference in Figures 10A, 10B & 10C) do not “connect” to either the arcs or the nodes in a primary or nested inference network such as the single-stage inference shown on the left-hand side of Figures 10A, 10B or 10C.

 

In Schum’s view these ancillary networks are nested within primary inference networks – as depicted in Figure 11: [28]

 

 

Figure 11

 

There is, therefore, perhaps nothing strictly illogical or incoherent about Schum’s method of representation –, i.e., given this graph-nesting theory, it is possible that nothing in Schum-style Toulminesque inference networks transgresses any fundamental conventions or principles of graph theory.
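One way to picture the graph-nesting idea is as a data structure in which each ancillary network is a separate graph that is merely associated with an arc of the primary network, rather than wired into it. This is a hedged sketch only: the class name, its fields, and the label G (standing for a generalization) are hypothetical and are not taken from Schum's own work.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """A graph plus, optionally, ancillary graphs attached to its arcs."""
    nodes: set
    arcs: set  # each arc is a (tail, head) pair of node names
    # Hypothetical structure: ancillary networks keyed by the primary arc
    # they bear on. They are associated with the arc, not joined to it.
    ancillary: dict = field(default_factory=dict)

    def is_graph(self):
        """True if every arc joins two of this network's own nodes."""
        return all(t in self.nodes and h in self.nodes for t, h in self.arcs)

    def all_levels_are_graphs(self):
        """Check this network and, recursively, each nested network."""
        return self.is_graph() and all(
            arc in self.arcs and sub.all_levels_are_graphs()
            for arc, sub in self.ancillary.items()
        )


# Ancillary network: evidence a supports generalization G.
ancillary = Network(nodes={"a", "G"}, arcs={("a", "G")})

# Primary network: inference from e to H, with the ancillary network
# nested under (associated with) the arc e -> H.
primary = Network(nodes={"e", "H"}, arcs={("e", "H")},
                  ancillary={("e", "H"): ancillary})

print(primary.all_levels_are_graphs())
```

Under this reading each level passes the ordinary test for a graph on its own, whereas flattening everything into one network with an arc from a to the arc e -> H would not.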

 

Second, part of the disagreement about the value of Schum-style graphs – which is, possibly, by extension, a disagreement about the logical coherence or validity of conventional methods of argument in law about factual inference –, part of the disagreement between Schum and his critics about the value or validity of Schum-style graphs and networks may be a disagreement about the coherence or usefulness of non-monotonic argument about evidence.

 

As Schum has put it, some graph and probability theorists want to suck up everything – all argument about and from evidence – into a single inference network, a single graphic representation of complex inference. These sorts of theorists don’t want anything, except undiscovered evidence or unidentified hypotheses, to stand outside of or apart from an inference network – and when new evidence or new hypotheses or possibilities come along, these new matters are to be incorporated into a revised form of a single inference network, perhaps a Bayes’ net. To my layman’s eyes, this strategy of putting all evidence, all argument, and all hypotheses into a single network is, in a crude sense, if not necessarily in a technical sense, a monotonic representation of argument about and from evidence: a one-dimensional representation of inference.

 

Schum’s portrait of factual inference has clear affinities with Stephen Toulmin’s non-monotonic theory of argument. [29] Indeed, in Schum’s case – although not in Wigmore’s – more than affinity is at work: Schum’s defense of his methods of representing or portraying inference expressly draws on the work of Stephen Toulmin and similar logicians and theorists. There is now a large body of scholars at work on such forms of argument, reasoning, and rhetoric – and I bet that some of them are here today.

 

Third, while it may be true that in the eyes of a graph theorist Schum-style diagramming methods seem both uneconomical and inelegant, neo-Wigmorean and Schum-style diagrams generally seem far more “natural” to legal professionals; to the extent that legal scholars can stomach any diagrams of inference at all, the relatively more cluttered diagrams, charts, or graphs that Schum or Anderson construct generally seem more digestible to legal specialists in the law of evidence. This means: the kinds of pictures of inference that Schum constructs generally seem to conform better to the kinds of intuitions that legal professionals tend to have about evidence and about argument about evidence; i.e., Schum’s relatively cluttered representations – involving, as they do, representations of free-standing ancillary information – seem, ironically, to a legal professional’s eye, more intuitive, more transparent, than do the cleaner monotonic graphs that scholars such as Paolo Garbolino and Judea Pearl prefer to construct.

 

This third point – the contrast between a lawyer’s intuitions and preferences and the intuitions and preferences of people such as mathematicians and graph theorists – suggests again at least a partial answer to at least some of the disagreements about the relative merits and demerits of two general contrasting approaches to the task of portraying inconclusive inference about events in the world.

 

If the overriding objective of developing representations of complex inference is to facilitate the computation of probabilities, traditional graphs may be superior to Schum’s representations. Schum does not (to my knowledge) directly deny this specific thesis. Schum’s main defense is confession and avoidance (viz., e.g., “yes, I killed the guy, but I didn’t mean to do so”). Schum maintains that the ancillary graphs nested within the primary networks are suggestive – Schum approves Ron Howard’s characterization: “evocative”: [30] the function of the nested networks, Schum thinks, is to be suggestive and evocative, not determinative. He maintains that this way of thinking about the role of ancillary evidence is a perfectly logical way – a perfectly coherent way – to think about inference networks. But I think Schum does not really frontally deny that monotonic representations “compute better.”

 

You can see that I am edging toward the BIG EPISTEMOLOGICAL QUESTION that I mentioned earlier.

 

Even if conventional graphing methods “compute better,” Schum and, by extension, most U.S. legal professionals are unwilling to sacrifice their way of visualizing inference for computational convenience or, if push comes to shove, even for the sake of at least some computational tractability.

 

While Schum is not indifferent to the importance of computational convenience and tractability (though many law teachers are!) – thus Schum argues that other strategies are available to make complex inference problems computationally tractable, including, for example, the use of relatively gross frames of discernment –, Schum also maintains that his way of visualizing inference is, if not essential, at a minimum an appropriate, useful, fruitful, and productive way of deliberating about inference. [31] And push comes to shove for Schum when the laudable objective of making problems of inference more tractable necessitates the distortion of problems of evidence and inference. Such distortion happens, Schum maintains, when known sources of uncertainty are ignored. [32] Schum did not say this, because he is not confrontational, but I will say this on his behalf: Sometimes using a certain brand of formal analytical machinery is worse than using none at all. Whether or not Schum would accept a proposition this bald, many legal scholars would do so.

 

Schum believes that if one wishes to use formal devices or argument to foster or support rational argument or deliberation about inference it is important to use formal devices or methods of argument – a grammar, I would say – that both allow and encourage a user to focus on – to directly represent – the types of matters that (the user thinks) license or support inferences: the propositions, principles, information, evidence, or knowledge that give or suggest grounds for drawing this or that conclusion from some ultimate or intermediate evidential premise. Even though Schum leans strongly toward subjectivist Bayesianism, Schum also believes that the mind does not move from an E to an H “without more.” If inference is to have any semblance of rationality, I can imagine Schum saying (though I don’t think he has ever put the matter this strongly), there must be some grounds or reasons for taking any non-compelled, non-deductive inferential step -- there just must be ancillary principles or knowledge or beliefs that induce the mind to “move” from E (evidence, or intermediate probandum) to H (hypothesis, or final probandum), and it must be possible, at least in some instances, to spell out such ancillary grounds or principles to some degree. [33]

 

You can now perhaps see one compelling ground in Schum’s mind for the view that considerations of computational convenience do not warrant the abandonment of argument and representational strategies that focus attention on such ancillary networks: one should not abandon this essential or useful perspective – a perspective that emphasizes the importance of the grounds for inferential links or steps – for the sake of computational convenience or tractability.

 

Now let me remind you of an earlier conjecture of mine and then let me reiterate and amplify a second conjecture.

 

As I said earlier, I think that the disagreement between Schum and advocates of a rather different kind of diagramming or graphing strategy rests in part on divergent hopes and expectations for rational or logical analysis of inference. Schum emphasizes the heuristic and suggestive functions of inference networks far more than most of his AI colleagues do.

 

Now I want to make a related but distinct suggestion,  one that I have hinted at before.


I suspect that Schum’s relatively more limited expectations about the possible benefits of formal analysis are more in line with what formal analysis can be reasonably expected to achieve in the legal process. I want to suggest that a relatively less rigorous, relatively less artificial, relatively more commonsensical logic or procedure is likely to be more appropriate and useful in the legal process. In short, perhaps there is a kind of lesser logic – a kind of psycho-logic – that is often more appropriate for law than are seemingly more rigorous, but also more “artificial” methods of reasoning. [34]


 

Conclusion

 

There is much more that I would like to say. But I have gone on for too long already. Let me conclude with just a few words about two subtopics that I promised to discuss: (i) the diversity of evidence marshaling methods in legal proceedings; and (ii) the dynamics of judicial proof.

 

1.      Diversity in Inference. In my talks today I have focused on webs of evidence and conditional inferences. But I do not want to leave you with the impression that I think that such webs are the heart or “essence” of uncertain factual inference (or of uncertain rational factual inference). I think that such webs of evidence and inference describe just one way that human beings do and can think about factual questions in connection with litigation or possible litigation. I think human beings do use and also should use other methods of organizing evidence, including (i) time lines (a/k/a event chronologies), (ii) scenarios (a/k/a causal hypotheses), (iii) possibilities, (iv) retroductive eliminative reasoning, (v) loose thinking, free-flowing thought; and (vi) legal marshaling. My view is that these methods and others form a kind of loose meta-network; these various methods interact with each other and influence each other but the conclusions and suggestions unearthed by any one of these marshaling strategies are not propagated in any firm way into other ways of organizing or visualizing evidence. See generally P. Tillers & D. Schum, A Theory of Preliminary Fact Investigation. David Schum and I have constructed various representations of this swarm of evidence marshaling methods and strategies. Here is one early picture:

 

 

 

·      For more recent summary representations of quasi-networks of evidence marshaling strategies or methods, see http://tillers.net/marshal.html and http://tillers.net/marshall.html.

 

 

One effect of  such quasi-networks of sets of evidence marshaling strategies is to introduce complexities into inference far beyond those that I have already described in my brief discussion of the properties of Wigmore- and Schum-style inference networks.

 

2.      Dynamic Inference. This is a topic that is dear to my heart. Let me just say now that I think that when one adds time to the soup a lot of apple carts get upended. Time upsets a lot of inferential and theoretical apple carts not merely because time adds a layer of complexity to the task of inference and proof, though time certainly has that effect. Time has a more radically corrosive effect. Time has a nasty habit of producing surprises. In the realm of evidence and inference this means that (i) time has a nasty tendency to generate new evidence and (ii) new evidence has a nasty tendency to suggest or raise new possibilities and issues and hypotheses.

 

This phenomenon – the generation of unanticipated possibilities – is in certain respects quite benign. But the unanticipated multiplication of possibilities can create difficulties for people who are engaged in planning for the future. The multiplication of possibilities can be even more destructive and depressing for people who are engaged in planning future investigative activity. See generally P. Tillers, Is Proof in Litigation Predictable?: Some Obstacles to Systematic Assessment of Decisions about Proof in Litigation. See also P. Tillers, Can AI Help Resolve Some Fundamental Puzzles of Judicial Proof?: Introductory Comments about the "Explosive Dynamic Complexity" of Evidentiary Processes associated with Litigation; P. Tillers, Tillers on Evidence  (archive, July 19, 2003).

 

The interaction of three general properties of inference – the intricacy of evidence marshaling methods, the multiplicity of evidence marshaling methods, and the instability of (the results of) evidence marshaling strategies or methods over time –, the interaction among these three attributes or variables generates a specific kind of general picture of factual inference. This general picture may be cast in terms of William Twining’s distinction between rational optimists and fact skeptics. The portrait of inference that I have presented here pushes in the direction of people who call themselves holists, constructivists, subjectivists, and that sort of thing. But in one respect my picture differs from the sort of irrationalism that is often found in such camps. If I stress the subjectivity of inference and its occasional ineffability, I do so not because I think there is no there anywhere; i.e., I do so not because I think Nature is infinitely plastic, infinitely malleable, or vaporous. (I do not think that Nature is infinitely plastic. The existence of death and disease proves my point.) My portrait of the fragility and unpredictability of inference and proof in litigation instead suggests this proposition:


If Nature is largely impenetrable to the human eye, it is because real but wily Nature has a tendency to confound human aspirations and pretensions.


This is a distinction with a difference: the real existence of events in the human environment suggests that it is at least possible for human beings to use rigorous evidence marshaling strategies to good advantage – as long as these human reasoners do not resist their intuitions and subjective judgments about when it is useful and when it is not useful to explicitly decompose and analyze problems of evidence and inference.

 

I will have to explain all of this more fully on another occasion. [35]

 

Thank you.




[1]      I readily acknowledge the possibility that some systems of “law” may not connect norms and facts in the way that I describe or presuppose in these lectures.


[2] I will not go to the trouble of providing references to Aristotle’s works. Suffice it to say that the phenomenon and problem of the uncertainty of events in the world and of human knowledge of such events can fairly be said to be one of the primary fulcra for Aristotle’s entire philosophical (and scientific!) enterprise.


[3] Ian Hacking, The Emergence of Probability (1975).


[4]      See, e.g., {early English treatise writers}. Contrary to the claims of some observers, Western medieval legal professionals were also well aware of the problem of inconclusive factual inference and proof. See M. Damaška, Rational and Irrational Proof Revisited, in J.F. Nijboer & J.M. Reijntjes, eds., Proceedings of the First World Conference on New Trends in Criminal Investigation and Evidence 75, 79-81 (1997); James Franklin, The Science of Conjecture: Evidence and Probability before Pascal (2001).


[5] The problem of empirical knowledge and factual inference, together with the notions of “happiness” and “utility,” is the central topic in the philosophical tradition known as British {or English} empiricism. See, e.g., David Hume, J.S. Mill, J. Bentham ... English empiricists, however, did not invent the problem of empirical knowledge and they were not the first philosophers to ponder the nature of empirical knowledge and its foundations.


[6] But pre-20th century English and Scottish thinkers considered the problem of uncertain factual inference at length.  See Barbara Shapiro, "Beyond Reasonable Doubt" and "Probable Cause": Historical Perspectives on the Anglo-American Law of Evidence (University of California Press, 1991).


[7]      See P. Tillers, Handout, Sec. 1, “Background Material on the Possible Distinction between Legal and Factual Uncertainty.”

 

[8] My discussion today emphasizes factual uncertainty and inference in legal settings. But I do not attempt to deal comprehensively even with just the phenomenon of  factual uncertainty in litigation; I do not attempt to analyze or identify  all discernible forms of factual uncertainty  either in legal settings, in litigation, or in trials.  I emphasize just one or two types of factual inference, and I pay relatively little attention to some other forms of argument about evidence. I restrict my gaze in this way only because time is short, and not because I think that the other types of uncertainty and argument are unimportant.

 

[9] “Pegs” are, typically,  stub-like cylindrical wooden shafts. Pegs are often found in closets or foyers, and clothes such as jackets and raincoats are hung on them. Some weavers of fabrics such as sweaters or rugs drape the threads they weave over knobs. The image of threads being stretched over knobs and woven in complex patterns is in fact an excellent metaphor for the process of factual proof in litigation.

 

[10] In this paper the phrases and words “forensic proof,”  “judicial proof,” and, sometimes, just “proof” are shorthand for “evidentially-supported demonstration in legal proceedings.”  I use the former phrases only for the sake of convenience, because they are short. There is no adequate shorthand label for this process. “Legal proof” will not do because the phrase suggests that proof can be about the existence and meaning of legal rules or norms, not about the truth and falsity of factual hypotheses, hypotheses about states of the world apart from legal rules. The phrase “factual proof” also does not quite do the trick – because it fails to evoke the legal function of the kind of proof under consideration here.


[11]      I do not claim that my assertions about the properties and workings of legal rules are true in all times or places. My claims, however, are very probably true of U.S. law – and my guess is that they are also true of legal rules in many other parts of the world.

 

[12] If legal rules in their raw form – in the “legal source material” found in, e.g., a statute book or a court decision – do not take the form of conditional imperatives, such legal source material – the pronouncements found in statutes, regulations, and so on – is usually translated by legal decision makers (e.g., judges) into “if, then” statements.


[13] As noted in the text, to talk about the “essential elements” of rights, duties, claims, wrongs, and similar matters has some capacity to mislead because in many accurate formulations of legal rights and duties, “if” clauses refer to conditions in conditional legal imperatives disjunctively: if (a or b) and c, then x. This seeming contradiction of essential elements talk is in principle capable of being resolved because, in principle, every legal rule containing a disjunction of elements could be boiled down into distinct sub-rules in which each element in each sub-rule is essential for the species of the right or wrong established by that particular sub-rule. Since treatise writers often (but not always) do this sort of paraphrasing, it is perhaps not surprising that it is often suggested, if not directly asserted, that every distinct legal rule consists of a single set of essential elements. The trouble with such a position, however, is apparent: there is nothing in the heavens or on earth that outlaws conditional imperatives that make use of disjunctive expressions. Furthermore, as a practical matter, it is often more economical to use disjunctive statements when describing a legal doctrine (or anything else).

 

 

[14] Vern Walker makes effective use of join trees to reveal such complexities in legal doctrine. See [VW's web site].


[15] For example, there may be some force to the hypothesis that symbolic action in law generally retains its social potency only if the symbolic actions taken by the law are generally warranted by circumstances; perhaps it is the case, for example, that symbolic action that rests on patently false factual judgments eventually generally appears hypocritical and unjust and thus loses its social efficacy, its social acceptability, and its grip on the psyche.

 

[16] Many complexities and issues are buried in the innocuous-sounding proposition in the text. For example, legal rules often make legal effects depend on the existence of other legal rights or conditions. For example, a statute or legal rule that authorizes a particular punishment for “willful trespass” may provide that the putative trespasser must somehow enter upon or invade a thing owned or possessed by another person. But “ownership” and “possession” are, legally speaking, legal states that cannot be observed directly, but are conditions or conclusions that must be inferred – or, in this instance, almost literally constructed -- on the basis of, among other matters, the character and meaning of yet other legal rules – rules, for example, that regulate how people acquire title to property. Furthermore, one might argue that “mental states” such as “intent to kill” are not “actually” spatio-temporal “events” – or that, if they are spatio-temporal events, they are radically different from events such as “the movement of Pedestrian Smith across Fifth Avenue at 3:13 p.m. on June 1, 2003.” See M. Damaška, "Rational and Irrational Proof Revisited," in J.F. Nijboer & J.M. Reijntjes, eds., Proceedings of the First World Conference on New Trends in Criminal Investigation and Evidence 75, 76 (“But the separation of the empirical from the evaluative and the juridical becomes more difficult when ‘subjective states of mind’ must be ascertained – such as intent, absence of consent, and the like. Whatever difficulties we confront in this regard are compounded when factfinding necessitates complex, inter-subjective evaluations … Establishing the ‘factual preconditions’ for the application of legal norm shades in these situations easily into the search for the norm.”)(footnotes omitted).


[17] See, e.g., Lon L. Fuller, The Morality of Law (Rev. ed., 1969).


[18] And perhaps also in other human, social, or institutional settings.


[19] One question that my general observation leaves unanswered as yet is whether or not it can be said that the range of inferential problems in litigation is relatively narrow – because of the preemptive effects of authoritative legal rules. I think and I hope that the rest of my lecture(s) will begin to provide an answer to this very question.

 

[20] This legal question is, conceivably,  also a “factual” one. Please see references in my Handout. But if the conventional distinction between questions of law and fact is invalid, the argument in this paper is unaffected: my argument in this paper {talk} remains valid as long as (i) generic legal rules and requirements exist; and (ii) the question of the existence or non-existence of instances of generic legal requirements remains important in the eyes of law and society.

 

[21] The meaning of “ultimate” will be clarified when I focus directly on inference networks.


[22] If all that you know about a legal controversy is the legal rules and doctrines that apply to the controversy, you have almost no way of knowing or guessing what particular factual possibilities are in issue in that controversy. (You can sometimes narrow the range of possible factual issues a bit if you know the names of the parties or the venue of the legal proceeding, but any restriction so achieved will ordinarily leave a vast variety of possible factual matters in issue in the case.)


[23] Plainly we are dealing with a situation in which there is substantial uncertainty even about the year of Valiant’s death.


[24] Wigmore’s diagrams antedate Bayes nets (and directed acyclic graphs?) by quite a long period of time – by roughly six decades! Wigmore was not a mathematician, a probabilist, a logician, or a graph theorist, and it is not surprising that he used a system of representation that today appears (and, in his own day, appeared) rather cumbersome, inelegant, uneconomical, and cluttered. We can dispense with most of that clutter, most of those peculiarities, without compromising the thrust of his diagramming technique. (But, as we shall see, Wigmore-style charts have several peculiarities that cannot be scrubbed without abandoning the basic insights that prompted Wigmore to develop them.)

 

[25] I go to some lengths to see to it that my students don’t fall prey to this sort of misunderstanding. See P. Tillers, Making Bayesian Thinking (More) Intuitive, http://tillers.net/ev-course/materials/tillersbayes.html.

 

[26] Douglas Walton traces the basic idea of multistage inference back to David Hume. See -- ?[lost citation; internet paper]. Edmund Morgan, a legal scholar, also used a diagram, one that was well-known to a generation of law students and legal scholars, to portray a chain of factual inferences. See Edmund Morgan, Basic Problems of Evidence 185-186 (1961), recapitulated and discussed at 1A John H. Wigmore, Wigmore on Evidence § 37.4, 1034-1035 & n. 13 at 1035-1036 (P. Tillers rev., 1983). Cf. Chaim Perelman, The Idea of Justice and the Problem of Argument 121-122 (Petri trans., 1963). The notion of chains of inference was not merely a plaything of legal academics. See, e.g., United States v. Ravitch, 421 F.2d 1196, 1204 n. 10 (2d Cir.), cert. denied, 400 U.S. 834 (1970). See also United States v. Robinson, 544 F.2d 611 (2d Cir., 1976) & United States v. Robinson, 560 F.2d 507 (2d Cir., 1977) (en banc). Both opinions are discussed by P. Tillers at http://tillers.net/ev-course/materials/robinson.html#opinions and at http://tillers.net/ev-course/materials/robinson.html#diagrams.

 

[27] See, e.g., Laura Martignon, "Comparing Fast and Frugal Heuristics and Optimal Models," in Gerd Gigerenzer, Bounded Rationality: The Adaptive Toolbox 147, at 164 (2001) (“What Theorem 7 states is that the Markov blanket of a node determines the state of the node regardless of the state of all other nodes not in the blanket (Pearl 1988). The theorem, based essentially on Bayes’s rule, represents an enormous computational reduction for the storage and computations of probability distributions. It is precisely due to this type of reduction of computational complexity that Bayesian networks have become a popular tool both in statistics and in artificial intelligence over the last decade.”)


[28] D. Schum, The Evidential Foundations of Probabilistic Reasoning 189 (1994).

I wonder if it might be more useful to take the reverse view: to view primary inference networks as being nested within ancillary inference networks. This is, perhaps, because I tend to think that primary inference networks – chains and webs of inferences about particular questions and hypotheses – “swim” in a mass of often unspoken assumptions, default assumptions. I tend to see ancillary networks, therefore, as part of the epistemic background for particular complexes of inference that we construct to consider when certain problems or questions arise. But I am not at all sure that anything hinges on which of the two types of networks is viewed as the nesting network and which as the nested network. Here again I would want to defer to your opinions and advice.


[29] Stephen Toulmin, The Uses of Argument (1958).


[30] Ronald A. Howard, "Knowledge Maps," 35 Management Science 903 (1989).


[31] [to be documented:] When pressed, Schum effectively acknowledges that he believes that his methods of visualizing inference are essential – in the sense of being fundamental – because, for example, he believes that limitations on human knowledge (e.g., of causes) would render the inference networks generated by conventional (monotonic) graphing methods both inordinately convoluted and beyond the ability of human imagination.

 

[32] [to be documented:] Schum recognizes, however, that simplification does not necessarily produce significant distortion. And he clearly recognizes that some simplification (and distortion) is necessary, that some simplification – and attendant distortion – are unavoidable. So how is one to decide whether a simplifying strategy is appropriate or not? I do not know if Schum has attempted to give a general answer to this question. In years past he would have been inclined to say that subjective judgment must prevail in matters of this sort.

 

[to be documented:] Schum has always insisted that decision makers must be attentive to the tradeoffs between the benefits of granular analysis and its burdens. (There is a price for every epistemic strategy, he seems to want to say.) I have the impression, however, that in recent years Schum has become more appreciative of the possibility that formal analysis (including sensitivity analysis and the computerization of some chores) might in and of itself suggest or even show, at least in some well-defined circumstances, which simplifying strategies work and which do not. But if that is the position he now takes, I do not know the details.


[33] I do not know how far Schum is willing to take this sort of argument. I say this because Schum has repeatedly recognized that there are limits to the degree of granularity that human beings can achieve in their analyses of problems of evidence and inference. This concession on his part suggests that sometimes there are real links that are supported by reasons that human beings just cannot manage to spell out.

 

[34] I think that in years past, this sort of suggestion would have made logicians shudder. I doubt that this is true to the extent that it once was.


Having suggested (in the text above) that rough and ready methods might be the better evidence marshaling methods for law, I feel compelled to add that I do not have any great confidence that the actual use of such rough evidence marshaling strategies by decision makers in and before trials would greatly improve – or improve at all – the quality or accuracy of fact finding in legal proceedings. But I also see no real danger that explicit use of the sorts of procedures that Schum has described would do much harm. So I think an experiment or two might be in order. Furthermore, I believe it may be possible – indeed, I believe it is possible – to devise user-friendly computer “tools” that will make this sort of marshaling of evidence seem fairly natural and comfortable in the eyes of judges and jurors. (I myself have done a little bit of work on the development of such computerized “inference-support” procedures.)


[35] My final resting place, I suspect, will not be as radical as the material in the text might suggest. My bottom line is that the level of analytical granularity one chooses is a matter of choice and that even the choice of which evidence marshaling strategies to employ is a matter of choice – and that, thus, one should use all the various marshaling “operations” with a sense of humility and only to the extent that they serve to make matters and problems seem a bit more clear. In the mass of the evidence problems that arise in legal proceedings nothing will ever replace the need for repeated use of human intuition and subjective judgment on either the parcels of evidence or on whole collections of evidence that decision makers such as judges and jurors will confront.


[PT1] Note 1: Should we look at X + Y + Z as a compound proposition? (But observe: E1 relates to X, but not Y or Z.) Are we to think here that the ultimate issue is P(A + B + C)? Or, better said: P([X|E1] + [Y|E2] + [Z|E3])?

 

[PT2] One might question – very reasonably – whether there should be a directed arc running from X1 or X2 to X. Indeed, one might question whether there should be an arc here at all. For are not X and X1 the equivalent of apples and oranges? Well, perhaps, and perhaps not. But regardless of how this last question should be answered, I do want to suggest – insist, really – that from a certain point of view it is perfectly legitimate to have such arcs and that the appropriate question to ask would be about the meaning of arcs of this sort. My partial justification for such arcs is that a "pragmatic dynamic perspective" supports them: pragmatic thinking suggests that it is quite simply a fact that X can suggest X1 or X2, and vice versa, and that it would therefore be useful to be able to draw some sort of a line or arc between X and X1 or X2. So perhaps it is appropriate to draw an arc when the occasion seems to demand it and then ponder what such a line or arc represents! (The answer to this last little puzzle sometimes may be: suggestions; perhaps instances of X and Xn sometimes function as hints. Cf. Glenn Shafer, ----.)

[PT3] Sammy’s two statements may have probative force for either hypothesis – X1 or X2 – particularly if we assume that feelings or personality has a certain durability. But it is plainly not necessary or even likely that E1 and E2 will have exactly the same probative force for either possible event X1 or possible event X2.





 
