Negative Evidence versus Scientific Critique

-- In defense of the scientific method.

By J. T. Enright

<<This brief essay has been composed in an attempt to clarify the difference between "negative evidence" and a "negative" critique, a subtle distinction that is perhaps not adequately appreciated by the general public.>>

In private correspondence from a non-scientist, I have been criticized for "negativism" and for relying on "negative evidence" in my publications about the Munich dowsing experiments (Skeptical Inquirer, Jan./Feb. 1999, and other references there). My correspondent went on to say, "Negative anything is by definition frustrating, energy draining, etc. There is a certain stagnant quality about such cul-de-sacs associated with the term `negative'". Those are allegations of serious concern in a society where everyone is urged to "think positively", and in which negativism was given a strongly pejorative flavor in the public mind by Spiro Agnew many years ago, when he referred disparagingly to his critics as the ". . . nattering nabobs of negativism . . .".

What, exactly, is "negative evidence"? In the context of scientific research, it does not simply mean a result indicating that nothing of interest was observed. Sometimes a researcher is so convinced of his preferred interpretation that he may ignore some of his own results (which go the other way: negative evidence) in describing the research. This is a cardinal sin for a scientist, reflecting lack of proper objectivity. Another common situation is that a different scientist may attempt to replicate a phenomenon reported in the literature, and find that he gets a quite different and uninteresting result. This also represents negative evidence; it is, indeed, a negative result. Replication of a real phenomenon should consistently yield the same kind of evidence, and when that's not so, the original report is open to suspicion. This sort of situation, however, is particularly problematic in science because the broader community often has too little evidence to decide which of the two conflicting results might reflect incompetence or a serious mistake on the part of one of the experimenters. Sometimes, indeed, science does proceed in this way, and if attempts at replication in many laboratories all fail, the original report certainly loses credibility.

Often, however, scientific disagreements are not based on genuine failure of replication. Instead, the differences in interpretation arise from other aspects of the scientific process and method. Examples include the direct or implied critique of certain claimed results because of a demonstration that the methods used to obtain the original data are flawed, or a demonstration that the data available do not provide convincing evidence in support of the new interpretation that has been offered.

A hypothetical example can perhaps help to clarify the distinctions being made here. In order to appreciate these issues, suppose that an established scientist publishes a report like the following:

"The ability of the Indian guru, Mr. Sham, who claims to be able to levitate, was tested in the following way: Mr. Sham sat in a sling that was attached to the weighing arm of a Smithson Model 37B Super-electronic balance, which registered a weight of 139 pounds. He was then told to try to levitate, and because of difficulty in reading the scale during `levitation', the range of the balance was readjusted (from the 0-to-350-pounds setting to the 0-to-10-pounds setting). Following that adjustment, the registered weight dropped to 1.30 pounds, but when the guru relaxed, the weight (with scale readjusted upward again) returned to 139 pounds. That remarkable outcome is a quantitative demonstration of the power of mind over matter: apparently genuine levitation."

Such a remarkable claim would probably induce many other scientists to attempt to investigate the claimed phenomenon. Let us consider some of the possible outcomes.

A) Negative Evidence. Suppose, then, that in an attempt at replication by scientist N.E.G., using exactly the same equipment and procedures with the collaboration of Mr. Sham, the apparent initial weight of 139 pounds was observed to remain unchanged after the guru tried to levitate. A single "failure" of this sort is not overwhelmingly convincing, but if several similar attempts all resulted in no apparent change in weight, the original report would be seriously discredited.

B) Methodological Critique. Suppose instead that another scientist, M.C., places an encyclopedia in the sling attached to the same balance, which initially registers a weight of 259 pounds. The range of the instrument is then readjusted downward to the 0-to-10-pounds setting (as in the levitation report), and the registered weight becomes 2.1 pounds. Since no one would suspect that books can levitate, that result would indicate rather convincingly that this particular balance (and perhaps others of this model) can give fatally flawed readings following range readjustment. By implication, that result would suggest that the original report on Mr. Sham may well have involved an instrumental artifact rather than genuine levitation. The result of course reflects negatively on the original claim, but it is not negative evidence in the usual sense. The outcome, when published, would represent nothing more than a scientific critique resulting from application of proper skepticism to the primary measuring instrument.

C) Data Insufficiencies. Suppose, now, that still another scientist, D.I., obtained from the original claimant the printout of the original data tape from the balance, recorded while the guru had tried to levitate, and discovered that the reported 1.30-pound value involved a small incidental ink smear between the 1 and the 3, and that the actual number printed by the machine was 130 pounds, not 1.30. Would a report of that discovery constitute "negative evidence"? Certainly not in the usual sense involving replication; it would be just the result of applying scientific skepticism to data, even though the result would clearly reflect negatively on the original claim.

My re-analysis of the Munich dowsing data (Naturwissenschaften 82: 360-369, 1995) is an example resembling the last of these outcomes. I demonstrated that the claim by Betz and colleagues, who contended that they had proven the reality of dowsing (Wagner, Betz and Konig, Schlussbericht 01-KB8602, Bundesministerium für Forschung und Technologie, 1990), was not based on a consistent or convincing data set, and that the obvious interpretation of those very data was that dowsers could NOT do what they claimed.

When differences of opinion arise between scientists, with one claiming that such-and-such is the case and the other the contrary, this does not mean that the latter report constitutes "negativism", implying a failure, an undesirable or disappointing sort of outcome. The genuine scientific method requires the deliberate application of honest doubt, of skepticism, to every surprising claim that is offered for incorporation into our view of how the world works. It has been the cure for vast bodies of superstition, as well as the corrective for many instances of well-intended misinterpretation. A critique which concludes that a given body of scientific data deserves a different interpretation than originally proposed should not be treated disrespectfully just because it doesn't add something really novel to the accumulated body of interesting facts. To learn that dowsing DOESN'T seem to work as claimed indeed involves something worth knowing and, in that sense, the experiments were certainly worthwhile, though not in the sense that the researchers themselves concluded.

-----------

James T. Enright is a Professor at Scripps Institution of Oceanography, UCSD, and a member of SDARI.