
 

Joint Report (Testimony) by David A. Schum and Peter Tillers, submitted December 21, 1994, in United States of America versus Charles O. Shonubi, 92 CR 0007 (BW), United States District Court, Eastern District of New York

 

Amended Appendix (Exhibit) B

 

I. Background

 

     Defendant Charles Shonubi was convicted in this court of importing 427.4 grams of heroin on December 10, 1991. He was also convicted of possessing this heroin with the intent of distributing it. The evidence at the trial and at the ensuing sentencing hearing established that there were 103 balloons in Shonubi’s digestive tract when he arrived by airplane at JFK International Airport on December 10, 1991. These balloons were filled with a paste or mix that included heroin. Shonubi acquired this heroin in Nigeria. He then flew from Nigeria to JFK, with a stopover in Amsterdam. Shonubi swallowed the balloons with the heroin sometime before his arrival in New York. Shonubi’s December airplane trip from Nigeria to New York was the second leg of a round trip between JFK and Nigeria; Shonubi, a Nigerian citizen but a New Jersey resident, had flown to Nigeria from New York shortly before his return to the U.S. on December 10, 1991.

 

     After the jury returned its verdict of guilty, Judge Jack B. Weinstein held a sentencing hearing. The evidence at the trial and at the sentencing proceeding established that Shonubi imported 427.4 grams of heroin on December 10, 1991. However, the evidence also showed that Shonubi had made seven other drug-smuggling trips between Nigeria and New York. Under the Federal Sentencing Guidelines the choice of the appropriate guideline for sentencing depends in part on the “base offense level.” Moreover, the base offense level in drug trafficking cases “depends on the amount of drugs involved.” United States v. Shonubi, 998 F.2d 84, 89 (2d Cir. 1993).

 

     The government seems to take the position that if the heroin involved in a drug trafficking case such as this one weighs at least 1,000 grams but less than 3,000 grams, under Federal Sentencing Guideline §2D1.1(c) the applicable base offense level for sentencing purposes is 32. Government’s Memorandum of Law on Resentencing Issues at p. 5 (“Under Sentencing Guideline §2D1.1(c), level 32 applies for heroin weighing at least 1000 grams but less than 3000 grams”). As noted above, the trial court found that Shonubi imported 427.4 grams of heroin on December 10, 1991. However, the court also found that the seven previous drug-smuggling trips were part of the same course of conduct and it concluded that the weight of the heroin imported by Shonubi in all eight of his drug-smuggling trips therefore could and should be aggregated for purposes of determining the appropriate base offense level. United States v. Shonubi, 802 F. Supp. 859, 863-864 (E.D.N.Y. 1992) (Weinstein, J.).

 

     The trial court concluded that “[a]n overwhelming preponderance of the evidence ... establishe[d] that the defendant ... import[ed] [a total of] at least 3419.2 grams of heroin [during his eight drug-smuggling trips from Nigeria to the U.S.A.]” United States v. Shonubi, 802 F. Supp. 859, 864 (E.D.N.Y. 1992) (Weinstein, J.). The court arrived at this conclusion about the aggregate amount of heroin imported by Shonubi during 1990 and 1991 by multiplying (a) the amount of heroin that Shonubi imported on December 10, 1991 by (b) the number of drug-smuggling trips taken by Shonubi in 1990 and 1991. Thus, the court’s calculation was 427.4 grams x 8 = 3419.2 grams.

 

     Shonubi appealed his sentence. (The government cross-appealed, but the nature of that cross-appeal and its disposition are not relevant to the issues discussed in this report.) The Second Circuit Court of Appeals agreed with the trial court’s finding that Shonubi made at least eight trips from the U.S.A. to Nigeria and back during a 15-month period and that the purpose of each of these trips was to smuggle drugs. United States v. Shonubi, 998 F.2d 84, 89 (2d Cir. 1993). The Court of Appeals also agreed with the trial court that the amount of drugs imported on each of these trips could be aggregated for the purpose of determining the appropriate base offense level under the Sentencing Guidelines. Nonetheless, the Court of Appeals concluded that the trial court erred in finding that Shonubi imported a total of 3,419.2 grams of heroin. It said, “[T]here is simply no proof [Shonubi] imported 427.4 grams of heroin on each of his seven other trips.” 998 F.2d at 89. The Court then added, “Case law uniformly requires specific evidence . . . to calculate drug quantities for sentencing purposes.” Id. (emphasis added).

 

     The Court of Appeals remanded the Shonubi case to the trial court for resentencing. A statistical study by a consultant, Dr. David Boyum, has been submitted on behalf of the government for purposes of the resentencing proceeding in the trial court. The government urges the trial court to accept Boyum’s statistical study as sufficient proof that “Shonubi imported a total of at least 2090.2 grams of heroin during the seven trips that he took between September 1, 1990 and December 10, 1991.” Government’s Memorandum of Law on Resentencing Issues at p. 9. However, an expert in statistics, Michael O. Finkelstein, retained by the defense, is of the opinion that “Dr. Boyum’s analysis is not an adequate basis for estimating the total amount brought in by Shonubi in the seven trips made prior to his arrest.” Affidavit of Michael O. Finkelstein at p. 5 (November 4, 1994), attached to Defense Memorandum of Law on Resentencing Issues (December 17, 1994).

 

II. Task & Issue Definition

 

     We have been appointed by the trial court to advise the court on the evidentiary, inferential, and statistical issues presented by Dr. David Boyum’s statistical analysis. Our primary task is to advise the court whether the Boyum report is sufficient proof that Shonubi imported enough heroin to place him within base offense level 32.

 

     In the last sentence of the preceding paragraph we have framed the issue about the amount of heroin imported by Shonubi in terms of an abstract legal concept -- the concept of “base offense level 32” -- rather than in more concrete terms -- such as in terms of a specific number of grams. We have expressed ourselves in this ambiguous and uncertain way to emphasize the possibility that although the parties to this action have framed the factual issue before the court with great precision, it is not entirely clear that they have chosen the correct issue. For example, as we have already noted, the government seems to assert that the issue before the court is whether “Shonubi imported a total of at least 2090.2 grams of heroin during the seven trips that he took between September 1, 1990 and December 10, 1991.” Government’s Memorandum of Law on Resentencing Issues at p. 9. On the government’s own premises, however, it is not clear that this is the correct description of the issue facing the court. A possible implication of the position taken by the government in its resentencing memorandum is that the decisive question facing the trial court on resentencing is whether Shonubi imported a total of at least 1000 grams of heroin during his eight drug-smuggling trips. This implication is possible because the government seems to take the position that the Shonubi case falls (at least) within base offense level 32 for sentencing purposes if the aggregate weight of the heroin imported by Shonubi in his eight trips is at least 1,000 grams. Government’s Memorandum of Law on Resentencing Issues at p. 5. Hence, on the government’s premises, it may not matter whether the heroin imported by Shonubi weighs 1000 grams, 1500 grams, 2000 grams, or 2500 grams; the only issue that may matter is whether Shonubi imported at least 1000 grams. (Our understanding is that the government does not claim that the base offense level could be more than 32.)

 

     We recognize that we have been appointed by the court to offer our opinions about statistical and inferential issues and that it is not our job to offer our opinions about the accuracy of the government’s reading of the Federal Sentencing Guidelines. However, it is appropriate for us to note the evidentiary implications if the court agrees with the government’s interpretation of the Sentencing Guidelines. The proper formulation of the question about the amount of heroin involved in this case does depend in part on the court’s interpretation of the law. We only wish to emphasize that the precise way in which the issue about the amount of heroin involved in this case is formulated might turn out to be very important.

 

     Consider once again the government’s interpretation of Sentencing Guideline §2D1.1(c). Stated most starkly, a possible implication of the government’s interpretation of that Guideline is that the precise issue before this court in this resentencing proceeding is not whether Shonubi imported 3,419.2 grams in his eight known drug-smuggling trips, or whether he imported 2,090.2 grams of heroin during the seven drug-smuggling trips that he took before December 10, 1991, or even whether Shonubi imported at least 1,000 grams of heroin during his first seven drug-smuggling trips, but whether Shonubi imported a total of only 572.6 grams of heroin (or more) in his seven drug-smuggling trips before December 10, 1991.[1]
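The arithmetic behind these thresholds can be checked with a short Python sketch. It uses only figures stated above (427.4 grams on the final trip, eight trips, and the 1,000-gram floor that the government associates with level 32); the variable names are introduced here for illustration, and the per-trip average is a derived figure that does not appear in the record.

```python
from decimal import Decimal

# Figures taken from the text above; names are illustrative only.
FINAL_TRIP_G = Decimal("427.4")      # heroin found on December 10, 1991
LEVEL_32_FLOOR_G = Decimal("1000")   # lower bound the government cites for level 32
TOTAL_TRIPS = 8
PRIOR_TRIPS = 7

# The trial court's original aggregate: the final-trip amount times eight trips.
trial_court_total = FINAL_TRIP_G * TOTAL_TRIPS

# What the seven earlier trips must contribute for the aggregate to reach the floor.
needed_from_prior_trips = LEVEL_32_FLOOR_G - FINAL_TRIP_G

# Derived figure (not in the record): the average per earlier trip that would suffice.
average_per_prior_trip = needed_from_prior_trips / PRIOR_TRIPS

print(trial_court_total, needed_from_prior_trips, average_per_prior_trip)
```

Exact decimal arithmetic is used so that the printed figures are not affected by binary floating-point rounding.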

 

     It is intuitively obvious that the resolution of a question about the sufficiency of evidence to show the importation of a particular quantity of heroin may depend on just how much heroin must be shown to have been imported. For example, if the government were obligated to show only that Shonubi imported an aggregate of one (1) gram of heroin in his seven drug-smuggling trips before December 10, 1991, it is doubtful in the extreme that anyone would question the sufficiency of the evidence in the record to show that Shonubi probably imported at least that much heroin. 

 

     As extreme and unrealistic as it is, our hypothetical case of a one-gram requirement suggests a possibility that may be relevant to the problem in the Shonubi case. If the issue facing the court is redefined so that the material question is not whether Shonubi imported 3,419.2 grams, 2,090.2 grams, or 1,000 grams of heroin, but whether Shonubi imported 572.6 grams of heroin during his first seven trips, the issue before the court may have to be redefined in yet another way. The difference between having to prove that Shonubi imported more than a total of 2,090.2 grams in seven trips and having to show only that he imported more than 572.6 grams in seven trips seems so great that we feel compelled to suggest the possibility that there is in fact sufficient non-statistical evidence in the record to support the inference that Shonubi imported at least 572.6 grams of heroin. However, since our mandate and our special knowledge are limited, we must leave it to the court and to the parties to investigate and discuss this possibility.[2] 

 

     We wish to note that the trial court encouraged us to discuss methods of inference apart from statistical methods; we were invited to discuss any inferential strategies that we think may be relevant to the factual and inferential problem facing the court. Since both of us have devoted a substantial amount of our careers to the study of various methods of marshalling evidence and organizing thoughts about evidence, we were grateful for the trial court’s invitation. We regret to report, however, that we have relatively little to say here about non-statistical inferential strategies. There are several reasons for this. First, we believe that there is no intrinsic or inherent incompatibility between statistical methods of inference or argument and other methods of inference or argument. If we were to discuss all possible methods of wresting conclusions from evidence, we would stray far from our mandate, which is, primarily, to evaluate Dr. Boyum’s statistical argument. Second, our examination of the record suggested to us that the record is relatively bereft of “Shonubi-specific” evidence about the amount of heroin that he may have imported during his first seven trips from Nigeria to New York. Third, although we have studied various methods of marshalling evidence with considerable care and although we believe that our research may have given us some special ability to identify such methods and their special properties, we rather doubt that we have any special facility in using the “natural” evidence marshalling strategies that we discuss. For one thing, most of the evidence marshalling strategies that we have studied are “natural” -- that is, they are methods that lawyers and other people already use, without being told to do so by the likes of people such as ourselves. 
Moreover, although we have developed tools that we believe may assist people in organizing their thoughts about evidence, we have no reason to believe that we can use such tools any better than anyone else can. In any event, the business of drawing factual inferences is, at bottom, a subjective business. We have no wish to claim that our subjective judgments about the amount of heroin that Shonubi imported are any better than anyone else’s judgments. Nonetheless, we are in a position to make some suggestions about what a well-ordered inferential argument of a particular type might look like. Thus, we make a brief comment about a possible type of incentive-and-risk approach that might be used in an analysis of some drug quantity problems that are similar to the one facing the court in the Shonubi case. More important for present purposes, we can and do illustrate how non-statistical methods of marshalling evidence can affect the relevance and persuasiveness of statistical analysis. We claim that statistical analysis always rests on a variety of assumptions. We illustrate how one non-statistical method of marshalling evidence has the capacity to undermine one of the key assumptions -- an independence assumption -- of Dr. Boyum’s particular statistical analysis.

 

 

III. Species of Evidence: Direct Evidence, Indirect Evidence, and Statistical Evidence

 

     The Court of Appeals did not disturb the trial court’s finding that Charles Shonubi imported 427.4 grams of heroin into the United States on December 10, 1991. Moreover, the Court of Appeals agreed with the trial court’s inference that Shonubi made seven prior trips to Nigeria and that he brought heroin into the United States on each of those trips. See United States v. Shonubi, 998 F.2d 84, 89 (2d Cir. 1993). Hence, the factual issue now before the court involves a question only about the amount of heroin imported by Shonubi. Moreover, if the amount of heroin imported by Shonubi on his eighth and final smuggling trip is taken as settled, the court has to deal only with the uncertainty about the amount of heroin that Shonubi imported in his first seven known drug-smuggling trips.

 

     There are a number of possible methods of drawing inferences about drug quantities in cases like Shonubi’s. However, in the absence of direct evidence about the amount of heroin imported by Shonubi in his first seven drug-smuggling trips -- in the absence, that is, of evidence such as testimonial reports of first-hand observations of the amounts imported by Shonubi on those trips -- the court’s only option is to rely on indirect methods of drawing inferences about the amount of heroin imported by Shonubi.[3]

 

     Statistical evidence is a special form of indirect evidence. Certain kinds of problems are associated with the use of any kind of statistical method. We will discuss these problems in passing. However, we will also discuss problems that are specific to Boyum’s statistical analysis. In particular, we will emphasize that Boyum’s analysis relies on data about the behavior of people other than Shonubi although the matter in issue, of course, concerns Shonubi’s behavior. As we have said, statistical evidence is indirect evidence. Boyum’s analysis is a particularly indirect form of indirect evidence.

 

IV. Comments on Statistical Analyses Submitted on Behalf of the Government by Dr. David Boyum

 

     A. Some General Limitations of Statistical Evidence and Analysis. Dr. David A. Boyum has submitted a statistical analysis on behalf of the government. Boyum’s analysis rests upon data provided by the U.S. Customs Service about the amounts of heroin in balloons "passed" by 117 other persons who were apprehended for drug smuggling on their return from Nigeria. Hence, Boyum restricted his attention to a particular “reference class” -- the class of 117 other balloon-swallowing drug smugglers who were apprehended on their return from Nigeria -- as the basis for drawing inferences about Shonubi.

 

     It is important to recognize that Boyum’s choice of a reference class is not ineluctable. The same is true of the conclusions that Boyum draws from his analysis of the data provided by the Customs Service. Statisticians are in the unenviable position of having to choose among alternative ways of presenting statistical evidence that are all naturally misleading to some degree. This does not mean that statisticians are themselves perverse. There is no descriptive or inferential statistical method that excludes the possibility of reasonable controversy. Nature is complex and the meaning of the information obtained from nature is rarely obvious.

 

     It is useful to characterize the task of statisticians as being that of choosing the method or methods that are minimally misleading. Moreover, since it is usually not obvious which method is least misleading, it is important to keep in mind that two or more equally-qualified statisticians may justifiably disagree about the most appropriate method for obtaining data and about the meaning of any data they have obtained. We cannot assert as an absolute truth that any given method of obtaining data or any interpretation of such data is or is not correct. What we can and will do, however, is to spell out the assumptions on which statistical evidence and analysis such as Boyum’s rest and we can spell out the reasons why we do or do not find the underlying assumptions of a study such as Boyum’s defensible or plausible. In actual fact, we have several concerns both about Boyum’s method of presenting data and his interpretation of his data. We will try to spell out the nature of our concerns about those matters.

 

     B. Questions of Method in the Boyum Study. Boyum employed a method known as “Monte Carlo” for obtaining estimates of the total amount of heroin that Shonubi might have carried during his first seven trips to Nigeria and back in 1990 and 1991. It is important to recognize that a Monte Carlo method is a type of simulation. Such simulation methods are used with  great frequency and in many cases they yield useful results. They are often used in situations in which it is difficult or impossible to obtain more directly relevant data by other means. Although simulations can be useful, any simulation, including the Monte Carlo method, rests on various assumptions. If the assumptions make no sense, the simulation cannot represent how the events of interest are generated in nature, or “real life.” We will discuss the underlying assumptions of Boyum’s analysis after we describe Boyum’s Monte Carlo method and evaluate some of his characterizations of that method.  

 

     1. The Frequency Distribution in the Boyum Study. Using the data supplied by the U.S. Customs Service, Boyum plotted a frequency distribution showing the amounts of narcotics (net weight) taken from 117 balloon-swallowers at the time they were apprehended. Each of the 117 observations on which this frequency distribution was based was categorized in one of thirteen arbitrarily-determined class intervals or weight ranges. The categorization of frequency distributions on the basis of class intervals is regularly employed by statisticians in order to infer the form of the underlying theoretical probability distribution that may be "generating" or producing the data one observes.

 

     It should be noted that the choice of class interval size is completely arbitrary and determines what a frequency distribution will look like. For example, if Boyum had plotted all 117 values without grouping them into class intervals or weight ranges, the frequency distribution would have looked very flat. Boyum in fact used class intervals of 100 grams in constructing his frequency distribution. As a result of his choice of this class interval, the frequency distribution plotted by Boyum does not look flat. (See Exhibit F to Boyum Affidavit or Government's Memorandum at p. 7.)
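The effect of class-interval width can be illustrated with a short Python sketch. The twelve weights below are hypothetical stand-ins chosen for illustration, not the Customs Service data, and `bin_counts` is a helper name introduced here.

```python
from collections import Counter
import math

def bin_counts(weights, width):
    """Count observations per class interval, keyed by each interval's lower bound."""
    counts = Counter(math.floor(w / width) * width for w in weights)
    return dict(sorted(counts.items()))

# Hypothetical net weights in grams (NOT the 117 Customs Service observations).
hypothetical_weights = [212.0, 305.5, 310.2, 342.9, 388.0, 401.3,
                        415.7, 455.1, 498.6, 520.4, 610.0, 1234.5]

fine = bin_counts(hypothetical_weights, 1)      # nearly ungrouped: every bar has height 1
coarse = bin_counts(hypothetical_weights, 100)  # 100-gram intervals: a peak emerges

print(max(fine.values()))
print(coarse)
```

With one-gram intervals every occupied interval holds a single observation, so the plot is flat; with 100-gram intervals the same values pile up in the 300 and 400 gram ranges.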

 

     It is also very important to note that the frequency distribution constructed by Boyum shows the net weights of heroin taken from 117 people on just one occasion from each person, the single occasion on which each of these 117 people was apprehended. Hence, this frequency distribution by itself says nothing about the amounts of narcotics that any of these 117 persons may have carried on any other trips they may have taken -- trips that did not eventuate in an arrest. 

 

     2. The Monte Carlo Method and the Normal Probability Distribution. On page 5 of the Boyum Affidavit, Boyum gives a curious reason for his choice of a Monte Carlo simulation. He states, "I chose to use the 'Monte Carlo' approach instead of applying a statistical formula to the data because the charted data did not neatly match a standard probability distribution, such as a 'bell' curve." However, Boyum then proceeds to use exactly such a curve to present certain calculations (the results of which he incorrectly refers to as “probabilities”). This suggests that Boyum's choice of method rests only on statistical convenience and not on any reasoned argument. As it turns out, the "bell-shaped" curve that Boyum mentions emerges, naturally and necessarily, simply because of the kind of sampling operation that Boyum uses. This point is discussed further below, in the next part of this discussion of issues relating to method.

 

     It appears that the "bell" curve that Boyum has in mind is the Gaussian or Normal probability distribution. This probability distribution is unimodal but it is also symmetrical about its mean, median, and mode (all of which have the same numerical value in the Normal distribution). Boyum’s obtained relative frequency distribution (Boyum Affidavit, Exhibit F) for the weight ranges he used is unimodal but asymmetrical; that is, it is skewed to the right. This means that more frequencies are piled up to the left of the center of this distribution. In the sample of 117 cases, there are more small amounts than there are very large amounts. (In the obtained frequency distribution we have just one very risk-prone individual who had between 1200 and 1300 grams of heroin in his stomach when he was apprehended).

 

     The mean of a frequency distribution indicates its center of gravity, the point at which it would balance if it were taken as a system of weights. Boyum calculated the mean in his distribution to be 432.1 grams. The median is the point below which 50% of all observations fall and above which 50% of all observations fall. Boyum's obtained median is 414.5 grams. The mode is simply the most frequently-observed class interval or weight range. In Boyum's distribution the modal weight range is 300-400 grams.
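How the three measures separate in a right-skewed sample can be seen in a small Python sketch. The weights are hypothetical values chosen for illustration, not Boyum's data, so the numbers printed are not his 432.1 and 414.5 grams; only the ordering of the measures is of interest.

```python
from collections import Counter
import statistics

# Hypothetical net weights in grams (NOT the Customs Service observations).
hypothetical_weights = [212.0, 305.5, 310.2, 342.9, 388.0, 401.3,
                        415.7, 455.1, 512.6, 520.4, 610.0, 1234.5]

sample_mean = statistics.mean(hypothetical_weights)      # balance point of the sample
sample_median = statistics.median(hypothetical_weights)  # half the values lie on each side

# Modal class: lower bound of the 100-gram interval holding the most observations.
class_counts = Counter(int(w // 100) * 100 for w in hypothetical_weights)
modal_class = class_counts.most_common(1)[0][0]

print(sample_mean, sample_median, modal_class)
```

One large value (the 1234.5-gram observation) pulls the mean above the median, while the modal class stays in the lower weight ranges, which is the pattern described above for a distribution skewed to the right.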

 

     3. The Meaning of “Random Sampling” in the Boyum Analysis. Boyum tells us that he calculated the mean and standard deviation in the frequency distribution mentioned above, a frequency distribution that was based on 117 observations. The calculated values were: sample mean = 432.1 grams, and standard deviation = 172.6 grams. He then used these two values as estimates of the two parameters µ (population mean) and σ (population standard deviation) of a Gaussian or Normal probability distribution.

 

     Boyum tells us only that he wrote a program to generate his simulation data “at random.” Boyum Affidavit at p. 5. People sometimes construe the words "at random" to mean that every outcome is equally likely. This, however, is not the meaning of “random sampling” in the Boyum study. The numbers that Boyum generated in his simulation were generated from a bell-shaped distribution, not a flat or uniform distribution. Hence, he chose a method of random sampling that did not make each number equally likely. The only apparent reason for Boyum’s choice is statistical convenience.

 

     4. Numbers as Representations of the Outcomes of “Trips.” Boyum tells us that he used a computer suitably programmed to generate observations at random from a distribution having parameters µ = 432.1 and σ = 172.6. He also tells us that he generated 100,000 sets of seven numbers from this distribution. In Boyum’s view, each such set of seven numbers represents the amount of heroin a person might bring back to the U.S. in seven trips from Nigeria to the United States. As we explain below, however, there is considerable difficulty with such an interpretation.

 

     For each set of seven numbers Boyum’s computer found the total weight across the seven hypothetical "trips". He thus obtained 100,000 randomly-generated values of total heroin weight across seven simulated trips. The distribution of these total weights (across seven hypothetical "trips") is shown in Boyum Affidavit as Exhibit H. This distribution looks very Gaussian or Normal in form. However, this is no surprise. The sampling operation that Boyum used made it inevitable that the distribution of the total weight of heroin across seven “trips” would look Gaussian.
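A sampling operation of the kind just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reconstruction of Boyum's program, which is not reproduced here: it assumes independent draws from a Normal distribution with the reported mean and standard deviation, uses fewer sets than his 100,000 to keep the run short, and does not exclude the negative "weights" that a Normal distribution can occasionally produce.

```python
import random

MEAN_G = 432.1   # sample mean reported in the text
SD_G = 172.6     # sample standard deviation reported in the text
TRIPS = 7

def simulate_totals(n_sets, seed=0):
    """Return n_sets totals, each the sum of TRIPS independent Normal draws."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    return [sum(rng.gauss(MEAN_G, SD_G) for _ in range(TRIPS))
            for _ in range(n_sets)]

totals = simulate_totals(20_000)
avg_total = sum(totals) / len(totals)
print(round(avg_total, 1))
```

Under these assumptions the simulated totals cluster around seven times the per-trip mean. Nothing in the procedure uses any information about Shonubi himself.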

 

     Statistics are random variables having certain probability distributions called sampling distributions. Boyum generated sets of seven numbers at random and calculated the sum of these numbers. This sum is a statistic. It is well known that the sampling distribution of a sum of numbers drawn independently from virtually any assumed distribution will be approximately Normal or Gaussian in form, regardless of the form of the assumed distribution. (We discuss the matter of independence a bit later.) The Normal form of sampling distributions in this case is a consequence of what is known as the Central Limit Theorem. The larger the sample size on which a statistic like a sum is based, the more closely the sampling distribution for this sum will come to a Normal probability distribution. The convergence process that takes place here can come about quite rapidly in many situations and it does so in this case. Even with sample sizes of just seven, the sampling distribution is already close to Normal. Hence, the distribution shown in Exhibit H is in fact an estimate of a sampling distribution that will appear normal in form by the Central Limit Theorem. (Notice that the sampling distribution in Exhibit H is a discrete distribution and it is another frequency distribution, not a probability distribution. This is because Exhibit H is based on a sample of generated observations, though admittedly a very large one.)
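The tendency of sums to look Normal can be illustrated with a Python sketch that draws from a deliberately right-skewed distribution and compares the skewness of single draws with the skewness of seven-draw sums. The gamma distribution used here is an assumption made purely for illustration: it is matched to the reported mean and standard deviation but is not claimed to describe the Customs Service data.

```python
import random

def skewness(xs):
    """Sample skewness: third central moment over the 1.5 power of the variance."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

MEAN_G, SD_G, TRIPS = 432.1, 172.6, 7
# Hypothetical right-skewed stand-in: a gamma distribution with the same mean and SD.
SHAPE = (MEAN_G / SD_G) ** 2
SCALE = SD_G ** 2 / MEAN_G

rng = random.Random(1)
single = [rng.gammavariate(SHAPE, SCALE) for _ in range(20_000)]
sums = [sum(rng.gammavariate(SHAPE, SCALE) for _ in range(TRIPS))
        for _ in range(20_000)]

skew_single = skewness(single)
skew_sums = skewness(sums)
print(round(skew_single, 2), round(skew_sums, 2))
```

A symmetric distribution has skewness zero. The seven-draw sums come out markedly less skewed than the individual draws, which is why a plot of such sums looks bell-shaped whatever the shape of the distribution of single-trip amounts.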

 

     From the distribution shown in Exhibit H, Boyum formed a cumulative frequency distribution and faired in the smooth curve he presents in Exhibit I. It is from this smoothed curve that he obtains an estimate of the probability that the total weight in seven simulated trips is at least some number. For example, he determines that in any such set of seven trips the estimated probability of a total weight  of at least 2090.2 grams is 0.99. (Boyum Affidavit p. 5). This much is unproblematic, but Boyum gets into a bit of trouble when he says, "According to the generated distributions, there is a 99% probability that Shonubi carried at least 2090.2 grams of heroin on the seven trips combined..." Boyum Affidavit p. 9.
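Reading a figure such as "at least X grams with estimated probability 0.99" off a set of simulated totals amounts to finding an empirical lower quantile. The Python sketch below shows one way to do so; `amount_exceeded_by` is a name introduced here, and the input values are arbitrary stand-ins rather than simulated heroin weights. Integer percentages are used to avoid floating-point rounding in the count.

```python
def amount_exceeded_by(totals, percent):
    """Largest sample value t such that at least `percent`% of totals are >= t."""
    ordered = sorted(totals)
    n = len(ordered)
    needed = (percent * n + 99) // 100   # integer ceiling of percent * n / 100
    return ordered[n - needed]

# Arbitrary stand-in values 1..100, used only to show the mechanics.
stand_in_totals = list(range(1, 101))
bound_99 = amount_exceeded_by(stand_in_totals, 99)
share_at_or_above = sum(t >= bound_99 for t in stand_in_totals) / len(stand_in_totals)
print(bound_99, share_at_or_above)
```

The result describes the simulated numbers only. Whether it says anything about a particular person depends on the assumptions under which those numbers were generated.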

 

     One difficulty is that Boyum does not say that the figure 0.99 is in fact an estimated probability. Moreover, whether estimated or not, the probability calculated by Boyum -- 0.99 -- has very little if anything to do with Mr. Shonubi. Indeed, the figure 0.99 may have little to do even with the total amount of heroin brought back on all trips taken by any of the 117 persons whose data formed the basis for this simulation. Recall that the net weight data represents amounts of heroin found on (in?) each of 117 people during just a single trip -- the one on which they were apprehended. We have no data about the amounts of heroin that any of these people brought into the United States during trips that did not culminate in an arrest. The conclusions that Boyum draws about the behavior of Shonubi and, tacitly, about the behavior of the 117 members of the reference class rest on certain critical assumptions. As we explain next, we are not comfortable with all of those assumptions.

 

     C. Evaluation of the Assumptions of Boyum’s Analysis. Boyum explicitly assumed that Mr. Shonubi was a "typical heroin swallower". Government's Memorandum, p. 8. Stated differently, Boyum assumed that Shonubi belonged in a reference class consisting of all other persons apprehended with heroin that was acquired in Nigeria, put in balloons, and then swallowed. Boyum’s assumption that this is an appropriate reference class may be reasonable; if Shonubi is typical or substantially typical of persons in this reference class, it may be possible to draw at least some tentative conclusions about Shonubi based on our knowledge of the behavior of other members of the same reference class. Even if Boyum’s choice of a reference class is defensible, however, the question of how much we know about the other 117 members of this reference class remains unanswered. As we explain in the course of our discussion below, it turns out that we may not know as much as we think we do. (The central difficulty is that we may not know much about the behavior of the 117 members of the reference class in trips they took before the trip that culminated in their arrest.)

 

     We believe that Boyum’s simulation rests on one important assumption not made explicit in Boyum’s report. Boyum states that the 100,000 sets of seven numbers that were generated for his study were randomly chosen. Boyum Affidavit, p. 5. We strongly suspect that these seven numbers, which are taken to be (hypothetical) “observations,” were chosen by the computer independently. If that is in fact the case, it follows that for any single (hypothetical) “trip” the number that the computer generated to represent the weight of heroin carried on that trip did not depend in any way on the number that the computer generated for any prior (hypothetical) trip. Similarly, the total weight for one set of seven trips depended in no way on the total weight generated for any other set of seven trips.

 

     There are two major difficulties with Boyum’s unstated independence assumption. First, it is hard to believe that for Shonubi or for any other balloon-swallower there is no dependence at all among the amounts ingested on successive trips. Second, it may not make any sense to assume that there is complete independence from swallower to swallower. Each set of seven trips in Boyum's computer generation presumably refers to different persons who made seven trips. If so, it is reasonable to assume that some of the swallowers know each other and compare notes about their experiences with their risky behavior. Perhaps there are groups of couriers who work for the same wholesale dealers.

 

     Independence assumptions are commonly made in statistical analyses. The issue of independence assumptions is often related to the question of convenience. It is very convenient to assume independent trials in situations in which one does not want to take the trouble to imagine possible forms of nonindependence. However, independence assumptions may not make any sense. If they do not make sense, they may invalidate an entire analysis. There are strong reasons to believe that Boyum’s assumption of independence was unwarranted. It is hard to imagine that a person’s drug-smuggling behavior on one occasion has no significant influence on his drug-smuggling behavior on a later occasion. In general, repeated episodes of human behavior exhibit many kinds of nonindependencies. Moreover, as we will explain below in Section VII, some particular types of dependencies are especially likely to be found in problems involving a sequence of similar human actions.
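The practical effect of an independence assumption can be illustrated with a Python sketch that compares independent trips with trips that share a person-specific component. The dependence model (a "habitual load" common to all seven trips of one person, with a correlation of 0.5 between trips) is a hypothetical chosen only for illustration. It is not drawn from the record and is not one of the particular dependencies discussed in this report.

```python
import math
import random
import statistics

MEAN_G, SD_G, TRIPS = 432.1, 172.6, 7

def simulate_totals(rho, n_sets, seed=0):
    """Seven-trip totals where any two trips by one person have correlation rho."""
    rng = random.Random(seed)
    shared_sd = SD_G * math.sqrt(rho)        # spread of the person-level component
    trip_sd = SD_G * math.sqrt(1.0 - rho)    # spread of the trip-to-trip component
    totals = []
    for _ in range(n_sets):
        habit = rng.gauss(0.0, shared_sd)    # drawn once per simulated person
        totals.append(sum(MEAN_G + habit + rng.gauss(0.0, trip_sd)
                          for _ in range(TRIPS)))
    return totals

independent = simulate_totals(rho=0.0, n_sets=20_000)
dependent = simulate_totals(rho=0.5, n_sets=20_000)

spread_independent = statistics.pstdev(independent)
spread_dependent = statistics.pstdev(dependent)
print(round(spread_independent), round(spread_dependent))
```

Each simulated trip has the same mean and standard deviation in both runs, yet the totals are far more spread out when the trips are dependent. A lower bound computed under independence would therefore overstate what can be said about the total if the amounts carried on successive trips move together.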

 

     D. Some Problems with the Government’s and Boyum’s Interpretations of the Results of the Monte Carlo Simulation. There are several misleading statements about Boyum’s simulation in both the Boyum Affidavit and in the Government's Memorandum. Exhibits H and I are described in a potentially misleading way. They are described as 100,000 simulations of seven trips. Boyum Affidavit, p.6. See also Government's Memorandum, p. 16. It would be more accurate to say that these distributions concern 100,000 sets of hypothetical trips taken by seven randomly-selected individuals from the 117 people who were apprehended on their last trip.

 

     There are two misleading statements about probabilities in both the Boyum Affidavit and in the Government's Memorandum. The Boyum Affidavit (page 6) contains the assertion, "I have charted the probability distribution of the total net weights generated by the computer simulations and the cumulative distribution" (our emphasis). The Government's Memorandum (page 8) contains the assertion, "Mr. Boyum then charted the probability distribution of total net weights generated by the simulations and the cumulative distribution" (emphasis is ours). These two statements are misleading because there is no such thing as "the" probability distribution for Shonubi or for anyone else. A reader may conclude from either of the above two statements that the Monte Carlo process is capable of generating "true" or exact probabilities. This is most definitely not the case.

 

     As it happens, both Boyum and Carter hedge their statements (quoted directly above) by adding the phrase "generated by the simulations." Although this hedge is entirely appropriate, it robs the two statements quoted above of much or all of their significance. There is an infinity of possible probability distributions that might be applied in the present case. Each of these probability distributions rests on different sets of assumptions. No particular probability distribution has significance without a demonstration that its underlying assumptions are reasonable.

  

     Any interpretation of the probability estimates that Boyum's Monte Carlo methods may yield must take into account another arbitrary feature of Boyum’s analysis. Boyum generated 100,000 values of net weights, across seven hypothetical trips, in his simulation. The question is why 100,000. Why not 1,000, 10,000, or a million? The government explains Boyum’s choice of 100,000 simulation trials in the following way:

 

A much smaller number of simulations might produce less accurate estimates due to the influence of chance. With the simulation of 100,000 trips, however, the 'law of averages' insures that the influence of chance is negligible.[6] (Government's Memorandum, p. 17.)

 

In footnote 6 the government adds,

 

The law of averages is an axiom in statistics that states that the more times one repeats an event that has a random outcome, the closer one gets statistically to the true probability distribution of the outcome.

 

     There is a simple error in the first quotation. As we read the Boyum Affidavit (page 5), the simulation consisted of the generation of 100,000 sets of seven numbers (700,000 numbers in all). Hence, it is incorrect to say that Boyum simulated 100,000 trips; it is correct to say that the simulation consisted of 100,000 generations, or sets, of seven trips.

 

     The first quotation also contains an erroneous interpretation concerning “chance.” An increase in sample size actually reduces, not the “influence of chance,” but the variance and standard deviation of any sampling distribution such as the one Boyum determined in his 100,000 trials. (Boyum generated 100,000 samples of seven numbers and determined the frequency distribution of the sum of the numbers across each set of seven numbers, which indicates "total weight"). Chance is omnipresent. The quotation makes it sound as if taking a large sample somehow removes chance. What is reduced is only sampling variation. This means that there is less variability in any distribution of the sum of seven numbers chosen randomly than there is in the variability in a distribution of single numbers chosen randomly.

 

     The real trouble, however, comes in the statement made in footnote 6 of the Government’s Memorandum. There are three difficulties with that statement. First, what the government calls "the law of averages" is actually a collection of laws called the "laws of large numbers." Second, the so-called laws of large numbers are not axioms, or things taken for granted. They are provable theorems based on other axioms.

 

     The third difficulty is that the laws of large numbers do not concern probability distributions themselves, but only the parameters of probability distributions. These laws concern convergence processes that take place between a statistic (like the sample mean) and a population parameter (like population mean µ) when the sample size upon which the statistic is based is increased without limit. It is entirely inaccurate to say, as is said in footnote 6, that the law of averages allows one to get statistically closer to "the true probability distribution." We cannot put any finer point on it than this: there is no such thing as "the probability distribution" shown by Boyum’s study for Shonubi (or, for that matter, for any other individual in the chosen reference class). Boyum made many assumptions and arbitrary choices en route to the ultimate determination of the cumulative curve shown in Exhibit H of Boyum’s Affidavit. Boyum could have generated 100,000 trillion numbers and yet have come no closer to determining "the" probability distribution regarding Shonubi. Such large collections of numbers say nothing about “true” probabilities if, for example, Boyum's assumptions were wrong.
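     The convergence that the laws of large numbers actually describe can be sketched as follows. The population below is hypothetical (it is not the Customs Service data), and the function name and seed are our own assumptions. The sketch shows only that a sample mean approaches the mean of whatever population one chose to sample from; it cannot show that the chosen population describes any particular person.

```python
import random
import statistics

# HYPOTHETICAL population of net weights in grams (NOT the Customs data).
HYPOTHETICAL_POPULATION = [120, 250, 310, 340, 360, 390, 420, 450, 480, 510, 560, 580, 700]


def sample_mean(population, n, seed=0):
    """Mean of n independent draws (with replacement) from the population."""
    rng = random.Random(seed)
    return statistics.fmean(rng.choices(population, k=n))


if __name__ == "__main__":
    population_mean = statistics.fmean(HYPOTHETICAL_POPULATION)
    for n in (10, 1_000, 100_000):
        error = abs(sample_mean(HYPOTHETICAL_POPULATION, n) - population_mean)
        # The error tends to shrink as n grows, but only relative to the mean
        # of the assumed population; a wrong assumption stays wrong at any n.
        print(n, round(error, 2))
```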

 

     These statistical issues aside, there is a matter that disturbs us even more about the use of Monte Carlo simulations as a basis for making judgments about individuals. Boyum is in the world of facts when he makes his choice of a reference class for Shonubi and when he constructs his initial frequency distribution on the basis of the data provided by the U.S. Customs Service. However, this distribution concerns only the amounts of heroin taken from persons on trips that eventuated in an arrest. When Boyum proceeds to use these 117 items of data as a basis for generating 100,000 sets of seven "trips," he leaves the world of facts and enters a world of fiction.

 

     A simulation produces a fictional account of what might be happening in some interesting part of the world. Our dictionary defines a “simulation” as a sham object, a counterfeit, an imitative representation of the functioning of one system or process by means of the functioning of another. Webster’s New Collegiate Dictionary, p. 1074 (1980). What we see in Boyum’s study is an effort to compare the behavior of an individual (Shonubi) with the functioning of a computer in generating fictitious episodes of drug smuggling. Fictions can be useful if they are sensible and if they are used with appropriate caution when they are applied to real events or sequences of events, particularly those involving individuals. However, the 100,000 sets of “trips” generated by Boyum are entirely fictional because, to our knowledge, there are no records of the amounts of heroin any person actually brought into the U.S. during seven trips from Nigeria.

 

     When courts in proceedings such as this one are required to consider the use of simulation evidence they have to decide the extent to which any given simulation succeeds in capturing matters that are relevant to a determination of matters such as Shonubi's probable past behavior. It is important to understand that fictional episodes of human behavior may be unpersuasive even if the simulation generates a very large number of them.

 

     We have already noted that although Boyum made an independence assumption, this assumption was left unstated both in the Boyum Affidavit and the Government's Memorandum. The reason why we are confident that independent sampling was assumed in this simulation is that, if it had not been assumed, Boyum would have been obliged to specify the particular form of nonindependence that his computer implemented when it generated 100,000 instances of seven "trips."

 

     The independence assumption made by Boyum and implemented by his computer allows for some preposterous instances of seven trips. For example, it would permit (though with low probability) the generation of seven trips in which the fictitious traveler carried over 1000 grams in his or her stomach on each trip. It would also allow, with high probability, the generation of sets of seven trips in which more heroin was carried during the early trips than during the later trips. For reasons we discuss in Section VII below, we cannot believe that the behavior of persons who engage in balloon-swallowing is entirely unsystematic. In the independent sampling employed in this simulation, the generated sets of seven "trips" were entirely unsystematic, made without rhyme or reason. This we cannot believe characterized the actual behavior of Shonubi or of any other person in the reference class to which he is being compared.

   

V. Comments on Michael Finkelstein’s Report

 

     David Secular, Shonubi’s attorney, has submitted a legal memorandum on resentencing issues. See Defense Memorandum of Law on Resentencing Issues (November 17, 1994). The defense has also submitted an affidavit by Michael O. Finkelstein, Esq. (This affidavit, dated November 8, 1994, is attached as an appendix to the legal memorandum by Secular just mentioned.) Both the defense’s legal memorandum and Finkelstein’s affidavit comment on Boyum’s statistical analysis.

 

     Finkelstein has two main objections to Boyum's study. We discuss these two objections, as well as one objection that he dismisses.

 

     1. Interperson Variability and Trip Variability. Finkelstein finds fault with Boyum for using interperson variability in heroin amounts to estimate trip variability in heroin amounts for individual balloon-swallowers. Affidavit of Michael O. Finkelstein [hereafter “Finkelstein Affidavit”] at p. 4, ¶ 8. Finkelstein’s point can be illustrated by means of the frequency distribution given in Boyum's Exhibit F. This distribution shows the net weights of heroin known to have been brought into the U.S. by 117 other persons. (These net weights are the amounts of narcotics that balloon-swallowing drug smugglers "passed" after their apprehension at JFK Airport.) This frequency distribution gives one picture of the variability from person to person based on the net weight of heroin they brought in during one trip -- the trip which culminated in their apprehension. Boyum provides an appropriate statistical measure of this interperson variability. He reports a standard deviation in this frequency distribution of 172.6 grams. If we are to provide an example of trip variation, however, we must resort to a hypothetical fact situation. The reason is that the court has been given no evidence that measures this kind of variation. This is one of Finkelstein's major points.

 

     Imagine that Person X has taken seven trips to Nigeria and has brought back heroin on each occasion. Person X confesses to seven prior episodes of heroin smuggling and reports that he smuggled the following amounts on each trip:


         

     Trip        Amount (grams)

       1               75
       2               93
       3              135
       4              176
       5              230
       6              220
       7              300

 

There is variation here, from 75 to 300 grams, but it concerns the variation in the amount of heroin that Person X brought back in his or her seven trips. As a measure of the variation in this set of seven trips made by Person X, we determine its standard deviation, which is 80.62 grams.
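     The standard deviation reported for Person X's seven hypothetical trips can be checked with a short calculation. The sketch below uses the amounts from the table above; the variable names are our own. The figure of 80.62 grams corresponds to the sample standard deviation (divisor n - 1); the population form (divisor n) would give a smaller value.

```python
import statistics

# Amounts (grams) that the hypothetical Person X reports for trips 1 through 7.
PERSON_X_TRIPS_GRAMS = [75, 93, 135, 176, 230, 220, 300]

# Sample standard deviation (divisor n - 1).
trip_sd_sample = statistics.stdev(PERSON_X_TRIPS_GRAMS)

# Population standard deviation (divisor n), shown only for comparison.
trip_sd_population = statistics.pstdev(PERSON_X_TRIPS_GRAMS)

print(round(trip_sd_sample, 2))      # 80.62
print(round(trip_sd_population, 2))  # 74.64
```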

 

     Suppose that there were data like those in the table above for all 117 persons in the sample used by Boyum; suppose, for example, that we have trustworthy confessions from all 117 of these people and that each one of them confesses to taking other trips and informs us of the amount he or she brought in on each trip. If we had this information we might find the trip standard deviation for each of these 117 persons and then determine the average of these standard deviations. Finkelstein's point is that the interperson variation in the frequency distribution across 117 persons in Boyum's frequency distribution (whose standard deviation = 172.6 grams) cannot be taken as an estimate of the variation across person X's seven trips or the average of the trip standard deviations taken across other trips made by each one of the 117 persons in Boyum's sample.

 

     We concur with Finkelstein on this point; we have said as much in our own analysis in Section IV-B above, but in other terms. One of the reasons for the fictitious nature of Boyum’s probability calculations about Shonubi and the amounts of heroin he carried on seven prior trips involves Boyum’s equating of these two sources of variation. These two sources of variation are, in fact, not the same. The 117 cases forming the basis for Boyum's analysis provide no information about trip-to-trip variation for any one of these persons or for any measure of such variation when it is also taken across all 117 persons.
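     The distinction between the two sources of variation can be made concrete with a small sketch. All of the data below are invented for illustration (no such trip records exist in the evidence), and the names are our own. In this invented example each courier is very consistent from trip to trip, yet the couriers differ greatly from one another, so the interperson figure tells us nothing about the trip-to-trip figure.

```python
import statistics

# HYPOTHETICAL trip records (grams per trip) for three imaginary couriers.
# The last entry in each list stands for the trip that ended in arrest.
HYPOTHETICAL_TRIPS = {
    "courier_a": [100, 110, 120],
    "courier_b": [400, 410, 420],
    "courier_c": [800, 810, 820],
}


def interperson_sd(trips_by_person):
    """SD across persons of the weight carried on each person's last trip."""
    last_trips = [trips[-1] for trips in trips_by_person.values()]
    return statistics.stdev(last_trips)


def mean_trip_sd(trips_by_person):
    """Average, across persons, of each person's own trip-to-trip SD."""
    per_person = [statistics.stdev(trips) for trips in trips_by_person.values()]
    return statistics.mean(per_person)


if __name__ == "__main__":
    print(round(interperson_sd(HYPOTHETICAL_TRIPS), 1))
    print(round(mean_trip_sd(HYPOTHETICAL_TRIPS), 1))
```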

 

     2. Sampling with Replacement. Finkelstein observes that it must be assumed that Boyum's computer-based generation of hypothetical trips was done with replacement. Finkelstein Affidavit at p. 3, ¶ 6. This comment essentially refers to the independence issue that we discussed in Section IV-C above. Sampling with replacement from a finite population means essentially that the probabilities in effect remain the same from trial to trial in a sampling operation. In Sections IV-C and IV-D we noted just some of the preposterous things that can be the result of assuming such trial independence with respect to human behavior. We offer a few more observations about this matter in Section VII below.

 

     3. Boyum’s Probability Calculations. Finkelstein says he does not question Boyum's calculations but only their application to Shonubi. Finkelstein Affidavit, p. 3, ¶ 6. We have a somewhat different view of Boyum's calculations. The probability calculations that Boyum made on the basis of his simulation make sense only if his simulation methods trap critical features of the part of the world being simulated. The part of the world of interest in this case concerns Shonubi and his past behavior. For reasons discussed above in Section IV-D, we have considerable doubts about the extent to which Boyum's simulation succeeded in trapping the critical features of this part of the world.

 

 

VI. Illustration of an Alternative Statistical Analysis

 

     A difficult question for Judge Weinstein and other judges who will be called upon to make sentencing decisions in the future in situations such as this one is whether any statistical analysis of past cases is helpful in assessing novel individual cases. The same problem arises in medical diagnosis. Physicians now often face the question of the extent to which a statistic, compiled over other patients in the past, is relevant to diagnoses regarding a particular new patient.

 

     We have already noted that any statistical method rests on certain assumptions that may or may not hold in particular situations in which this method is used to support an inference. In addition, every statistic requires the making of judgments that are quite arbitrary. In view of these lamentable considerations, it is not surprising that there are many other statistical strategies that might have been used as a basis for inferring the total amount of heroin Shonubi might have carried on his first seven drug-smuggling trips from Nigeria.

 

     We wish to describe one possible alternative statistical method. One of the virtues of this alternative method is that it rests only upon the 117 observations provided by the U.S. Customs Service; in other words, our alternative method does not involve the generation of hypothetical cases that were never observed. We describe this alternative method of statistical analysis partly because it might be of value in this case in its own right, but mainly because it illustrates certain important features of any statistical analysis of data, including the one done by Boyum.

 

     The U.S. Customs Service data contain information about 117 persons who are believed to have brought heroin into the U.S. in balloons (or prophylactics) that they swallowed. The use of this data as a basis for inferring the behavior of other people on other trips involves formidable difficulties. As we have already noted, the frequency distribution Boyum shows us (Boyum Affidavit, Exhibit F) considers just one way in which these 117 persons varied among themselves: in terms of the amount of heroin they were carrying when they were apprehended. These 117 persons, however, almost certainly varied among themselves in at least one other way: in terms of the number of prior trips they had taken. It is important to consider this variable. For example, it is unlikely that the man who swallowed the balloons with heroin that weighed a total of 2,093 grams was on his first trip. Moreover, it is likely that balloon-swallowing is a gastronomic "art" that may improve over time. (See our discussion of this possibility in Section VII below.)

 

     If the amount of heroin imported may be affected by the number of trips taken, any statistics calculated from this base of 117 cases -- such as Boyum's mean, median, and standard deviation -- must be taken over different numbers of prior trips made by the 117 persons in the Customs Service data as well as over different heroin weights on the trip on which a person was apprehended. It seems prodigiously unlikely that all of the 117 persons whose weight data were supplied to Boyum had taken exactly the same number of prior trips. The difficulty, of course, is that the data available to the court do not show how many prior trips any of them had actually taken and how much heroin they may have imported on each such prior trip. All that we can reasonably infer is that these 117 persons had taken different numbers of prior heroin-smuggling trips and, if they took such trips, that they imported different amounts of heroin on each such trip. The alternative statistical method described below may avoid some of these difficulties because it does not involve a simulation or the generation of fictitious cases. (Our method also has the advantage of being conservative in favor of Shonubi.)

 

     The alternative strategy we describe rests entirely on the modal class interval or weight range. (The mode in this case indicates the most frequently-observed weight range.) Our first illustration of how this alternative strategy works will rely on the frequency distribution that Boyum constructed. The class interval size of this distribution is 100 grams.[4] As shown in Boyum Affidavit Exhibit F and in Government's Memorandum (page 6), the modal class interval in this frequency distribution is the weight range 300-400 grams. This means that this weight range was the most frequently-observed range of heroin weights in the sample of 117 persons; 32 of the 117 persons fall in this weight range.

 

     For reasons mentioned above, we suspect that the 32 persons in the weight range of 300-400 grams had taken different numbers of prior trips on which they brought heroin into the U.S.; we believe that some of them had taken few or perhaps no prior trips, while others had taken several or many. Hence, the modal weight range of 300-400 grams of heroin might be taken as a range of heroin weight typical across people who had taken different numbers of prior trips.

 

     The hypothesis that 300-400 grams represents the typical range of heroin imported by people who have taken a varying number of prior trips is, of course, an assumption. The frequency data available to us concern only the weights of heroin 117 people were carrying on the single trip on which they were apprehended. The question, however, is whether our assumption is a reasonable one. We can begin by making the point that it is possible that some of the 32 persons who are known to have carried 300-400 grams were arrested on their first trip, some on their second, some on their third, and so on. We might also argue that these possibilities are reasonable and likely possibilities. If this assumption or argument seems reasonable, we can also argue that it is reasonable to assume that the "typical" amount of heroin these 32 persons carried on any of their prior trips was 300-400 grams.

 

     We could take the additional statistical step of calculating the mean, median, or mode within this modal weight range (300-400 grams) and use it as a basis for inferring the typical amount of heroin that Shonubi or any member of his reference class might have been carrying on each prior trip. However, for reasons involving conservatism with respect to Mr. Shonubi, we will not do so. We could instead take the lowest value in this modal weight range as an estimate of the typical amount of heroin that Shonubi might have carried on each of his seven prior trips. Suppose we do so.

 

     The lowest value in the modal weight range of Shonubi’s reference class is 300 grams. On at least one of the eight trips that we believe Shonubi made, he carried more than 300 grams. We know this for a fact, because we know that Shonubi was carrying 427.4 grams when he was apprehended. However, we are now assuming that sometimes Shonubi carried less. Perhaps, for example, on his early trips he was letting his stomach and bowels become accustomed to the abuse he was inflicting on them.

 

     The answer that arises from the strategy just described is the figure 7(300) + 427.4 grams = 2527.4 grams. This figure represents the total grams across all eight trips that it is believed that Shonubi made. There are two reasons why this is a conservative figure. First, we have taken the lowest possible value in the weight range that occurred most frequently in the 117 persons whose data form the reference class to which Shonubi can reasonably be compared. Second, this approach gives a smaller typical weight than taking either the mean, median, or mode in the class interval 300-400. Our approach certainly yields a more conservative figure than the mean or median in the overall sample of 117 cases provided by the U.S. Customs Service. Recall that Boyum calculated the overall mean to be 432.1 grams and the overall median to be 414.5 grams.
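     The arithmetic behind this comparison can be laid out in a short sketch. The 300-gram floor, the 427.4-gram final trip, and Boyum's overall mean and median are taken from the text above; the function name and the eight-trip totals computed from the mean and the median are our own illustration and appear in no affidavit.

```python
KNOWN_LAST_TRIP_GRAMS = 427.4   # amount found on the December 10, 1991 trip
PRIOR_TRIPS = 7                 # number of earlier trips found by the court


def eight_trip_total(per_trip_grams):
    """Total across seven prior trips at per_trip_grams plus the known last trip."""
    return round(PRIOR_TRIPS * per_trip_grams + KNOWN_LAST_TRIP_GRAMS, 1)


floor_total = eight_trip_total(300)      # lowest value of the modal range
median_total = eight_trip_total(414.5)   # Boyum's overall median
mean_total = eight_trip_total(432.1)     # Boyum's overall mean

print(floor_total)   # 2527.4
print(median_total, mean_total)
```

     Under these assumptions the floor-based total is the smallest of the three, which is the sense in which the figure is described as conservative.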

 

     Having made our argument about the apparent merits of a modal statistical strategy, we now wish to use our own suggested strategy to make the point that the choice of 100 grams as a class interval size in the frequency distribution for the U.S. Customs data is arbitrary. We could either increase or decrease class interval size. The table shown below illustrates how our choice of class interval size alone influences what we would regard as even a conservative value for the "typical" amount of heroin Shonubi might have been carrying on each trip. Changing the class interval size may seem trivial but it is not. As illustrated in the table below, changing the class interval size can affect the location of the modal weight range. 

         

Class Interval    Modal Weight    Number      "Conservative"       "Conservative" Estimate
Size (grams)      Range           of Cases    Estimate Per Trip    for Seven Trips

     100          300-400 gr         32            300 gr                2100 gr
     200          400-600 gr         52            400 gr                2800 gr
      50          350-400 gr         14            350 gr                2450 gr
      25          350-375 gr         12            350 gr                2450 gr

 

     This simple, almost trivial, analysis of the 117 data provided by the U.S. Customs Service produces some interesting results.

 

     1) The first row of the above table shows the determination we just made based on Boyum's choice of a class interval size of 100. Notice that this choice produces a "conservative" estimate that Shonubi (or anyone else in this reference class who made seven prior trips) carried a total of 2100 grams on these seven trips. This total falls within just 9.8 grams of a figure Boyum himself reported based on the generation of 100,000 sets of seven fictitious trips. He claimed that the probability was 99% that Shonubi carried at least 2,090.2 grams on his seven prior trips. The large value of this probability was, however, a result of various questionable assumptions Boyum made, and was of course an artifact of his having generated 100,000 cases. If we have enough cases to work with we can report very large probabilities. There were only 117 data points provided by the U.S. Customs Service. The trouble is that Boyum's 100,000 cases are all fictitious. Notice that the method we are now illustrating involves no probability calculations. We could perform none ourselves without making further assumptions that would be difficult to defend as far as their application to Shonubi is concerned.

    

     2) In Row 2 we show how, just by arbitrarily increasing the class interval size for the U.S. Customs Service data, we can produce a "conservative" result that differs by only a small amount from the total across seven trips inferred by Judge Weinstein in his original sentencing decision. In this case we have a modal class interval of 400-600 grams per trip. The most conservative point in this interval is, of course, 400. Across seven trips, the total amount is 2,800 grams, an amount that differs by only 191.8 grams from the amount Judge Weinstein inferred. Notice that the modal class interval here contains the greatest number of cases (52) of any of the class interval sizes we are examining.

 

     3) Rows 3 and 4 simply show how decreasing the size of the class intervals for the U.S. Customs data produces a per-trip estimate that falls between the ones mentioned in 1) and 2) above. As these two examples illustrate, the smaller the class interval size, the fewer the cases in the modal class interval.
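     The effect of class interval size on the modal weight range, as discussed in points 1) through 3), can be sketched in a few lines. The weights below are hypothetical placeholders, not the 117 Customs Service observations, and the function name and the rule for breaking ties (lowest interval wins) are our own assumptions. The sketch shows only that the same data can yield different "conservative" floors when the interval width changes.

```python
from collections import Counter

# HYPOTHETICAL net weights in grams (NOT the 117 Customs Service observations).
HYPOTHETICAL_WEIGHTS = [120, 250, 310, 340, 360, 390, 420, 450, 480, 510, 560, 580, 700]


def modal_interval(weights, width):
    """Return (lower, upper, count) for the most populated class interval.

    Intervals are [0, width), [width, 2*width), and so on. If two intervals
    tie, the lower one is returned (an arbitrary choice made for this sketch).
    """
    counts = Counter(int(w // width) for w in weights)
    best_bin, best_count = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    lower = best_bin * width
    return lower, lower + width, best_count


if __name__ == "__main__":
    for width in (100, 200):
        lower, upper, count = modal_interval(HYPOTHETICAL_WEIGHTS, width)
        # "Conservative" seven-trip total: seven times the interval's lower bound.
        print(width, f"{lower}-{upper}", count, 7 * lower)
```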

 

     Clearly, when we say we are being "conservative" with respect to Shonubi in making these statistical estimates, this apparent conservatism is always relative. In the above illustrations it is relative to the apparently innocuous act of deciding how we are to report our data in the form of a frequency distribution. The limiting case of such conservatism would, of course, be the one in which we took just one class interval from zero to 1,300 grams. (All 117 cases fall within this class interval.) In this case we would report that Shonubi brought in zero grams on each of his seven previous trips. Apparently, this would be objectionable even to Finkelstein, who assumed that each of Shonubi's seven trips was for the purpose of smuggling heroin.

 

     The bottom line is that the uncritical acceptance of any statistical method invites us to be misled. By the same token, the rejection of all statistical evidence may leave us ill-informed when we do not have to be. The specter of trial by statistics frightens many persons, particularly those among us who are aware of at least some of the difficulties inherent in statistical analyses of any form when they are used as a basis for inferences about individual persons such as Shonubi. The point of our examples is only that all statistical analyses rest upon assumptions and upon arbitrary choices. One moral of our story is that statistical analyses must be treated with great caution, with an awareness of the assumptions and the arbitrary choices on which they rest. However, the other moral of our story -- as odd as it may sound -- is that sometimes it makes sense to make use of statistical analyses that rest on assumptions and some arbitrary choices. Hence, it is true, on the one hand, that the 117 cases provided by the U.S. Customs Service cannot yield “specific evidence” about Shonubi. On the other hand, the Customs Service data do provide evidence about a reference c