“Get your facts first; then you can distort them as you please.” (Mark Twain)
“Statistics don’t lie; statisticians do.”
Distortion of reality—whether intended or not—amounts to pure and simple propaganda. The 2012 election provides a perfect example, if one is needed. Both sides of the aisle practice the art, but one side does seem less restrained than the other. It is hard work indeed to discern the truth, the real truth.
Our goal on this page is to ease the hard work it takes to maintain a democracy that is responsive to each of its citizens. This is doubly serious when corporate executives can spend as much as they like in shaping attitudes and an economy in ways that allow them to milk us all for their personal benefit as well as expand the wage gap. Boards of directors are ultimately in charge, but CEOs solve that inconvenience by sitting on each other's boards. The income gap is the most important economic result when chicanery clothes its deceit in propaganda.
Other sectors of our society are also not immune, especially where the profit motive is embedded. As if that were not bad enough, the financial community has joined the parade of skimming the profits created by the labor of citizens, not to mention entrepreneurs, civic organizations, and corporations alike. The financial community not only sets the interest rates to their liking, but also creates instruments with inflated values for others to buy. The less regulated an economy, the more common such crimes against democracy become. The 2007 housing bubble is just the most recent example. It came with a twist: Profits were privatized while the risk was socialized in one of the most serious instances of hypocrisy changing history in favor of the bad guys.
For another example, if one is needed, America has the most expensive medical system in the world, but our life expectancy lags that of many other nations that spend far less. This speaks to inefficiency of the highest order. The only reason we can see for that situation is that our health system is the child and captive of politicians who in turn are too dependent financially on the corporate beneficiaries of the system. The Supreme Court specifically legalized this system where money trumps human issues. It need not be this way, but a new era has dawned.
Biased reporting is not yet universal, but human nature being what it is, some individuals may have taken their cue. Science is all about facts and laws of nature, but even there human avarice is fully evident. For a glaring example, research papers are increasingly being retracted. This is especially true when a profit motive is involved. One example would be a faulty experimental plan that yields seemingly favorable results. Human nature being what it is, the medical research community seems to have more than its share of this particular problem. More on that below.
"Peer reviewed" science" is not immune either. For example, Science magazine and other journals in its class have been victimized by careless or unscrupulous authors who either made regrettable mistakes, falsified their results, or improperly omitted data—frauds—to fit their preconceived notions that other researchers cannot verify. The many scientific scandals in recent years speak to this problem. Journal retractions help limit the damage but do not totally stop later researchers from going astray if their literature surveys are not thorough—the case for too many. Sadly, some 2/3 of all retractions resulted from misconduct—that included outright fraud, plagiarism, and duplicate publications (Science, Vol 338, 5 Oct 2012, p23). Many journals, especially those with a limited subscriber base, find the reviewing process too expensive and cumbersome to use effectively. Fortunately, there are effective ways to detect errors. Whether specialists or generalists, there are questions we can ask.
There are two kinds of errors. Type One: accepting a result as true when it is actually false, a false positive (the null hypothesis is true in spite of the indication otherwise). In plain English, the effect observed cannot be reproduced. Type Two: rejecting a result as false when it is actually true, a false negative (accepting the null hypothesis when it is false). In plain English again, a real effect is missed. Both errors plague data mining. The reason is simple: random numbers, being random, can produce both false positives and false negatives, especially when the level for significance is set too low or too high as the case may be. Data miners need to guard against both types of error; that is not as easy as one might think. And guarding against both at once can be expensive indeed.
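To make the two error types concrete, here is a minimal simulation sketch in Python (numpy and scipy assumed; all numbers are illustrative, not drawn from any study cited here). Testing pure noise at the 0.05 level triggers false positives at about the alpha rate, while a small real effect tested on a small sample is missed most of the time:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)
trials, n = 10_000, 30

# Type One: both samples are pure noise, so every "significant" result is false.
false_pos = sum(
    stats.ttest_ind(rng.normal(0, 1, n), rng.normal(0, 1, n)).pvalue < 0.05
    for _ in range(trials)
)

# Type Two: a real effect of 0.3 standard deviations exists; every miss is a false negative.
false_neg = sum(
    stats.ttest_ind(rng.normal(0, 1, n), rng.normal(0.3, 1, n)).pvalue >= 0.05
    for _ in range(trials)
)

print(f"Type One rate: {false_pos / trials:.3f}")  # close to alpha = 0.05
print(f"Type Two rate: {false_neg / trials:.3f}")  # large: the test is underpowered
```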
How certain can we be that a positive result is actually true or that a negative result is really negative?
This heading is about more than how the math of statistics works. What is needed is rigorous thinking: how likely is it that an observed positive result is not true? This little issue is not little when one first encounters it. This requirement implies statistics cannot prove anything; that is true enough. Back to data mining in the practical world. The arcane vocabulary of statistics aside, most of us tend to think in terms of positive results and real effects even if we must state their significance from the opposite side of the coin. The issues are: Is this signal real? Is the absence of a signal real? And most importantly: How can we know, within reason at least?
With all this in mind, two concepts enter the question: the level of assurance we want to have, call it alpha (denoted by the Greek letter α), and p, the probability of obtaining the observed result by chance alone when no real effect exists. Setting the level of alpha is at the investigator's discretion. Convention for the typical case is 0.05. But there is nothing magical about that level. Charlatans realize it gives them opportunities to lie to us by simply saying: "This result is significant." Unless the investigator or his/her reporter mentions a number like 0.05 or 0.0001, a conclusion will be suspect on its face. S/he may, to suit his/her own purposes, in fact use 0.5. This is equivalent to flipping a coin! The smaller the p value, the higher the real significance, the probability of a "correct conclusion". A significance of 0.05 means the odds are 19:1 against chance alone producing the initial result. The greater the odds, the more likely it is that the result is in fact true. Data mining is especially vulnerable to Type One errors. So demand the actual "p value", not just the term "significance" according to some unmentioned alpha.
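As a worked illustration of p versus alpha (a sketch assuming scipy 1.7 or later; the coin counts are invented), consider 60 heads in 100 flips of a supposedly fair coin:

```python
from scipy import stats

heads, flips, alpha = 60, 100, 0.05
# Exact two-sided binomial test against the fair-coin null of p = 0.5.
p_value = stats.binomtest(heads, flips, p=0.5).pvalue
verdict = "significant" if p_value < alpha else "not significant"
print(f"p = {p_value:.4f}: {verdict} at alpha = {alpha}")  # p is about 0.057
```

Sixty heads looks lopsided, yet it just misses the conventional 0.05 bar; a charlatan quoting alpha = 0.5 could call far weaker results "significant".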
Type Two errors can arise simply by chance alone, but more commonly arise from other sources. The Type Two error plagues databases that are not complete, a fact in most cases of data mining in our day. Important effects are overlooked or unrecognized as such. This one challenges the researcher and interpreter alike, especially when they are the same person. Repeating failed experiments, as well as collecting more data on positive results, can be expensive, and is usually less rewarding than plowing a field for the first time. In the end, however, Type Two errors retard science and technology, the child of science, while Type One errors merely waste money, cause temporary embarrassment, or deflate one's ego or career as the case may be. Deliberately biasing results from drug research to achieve FDA approval can lead to deaths, sometimes many. There really is no substitute for repeating experiments until positive results reach high levels of significance when lives are at stake. False negatives can only be found the same way, by repeating failed experiments. Done properly, most research is expensive and time-consuming. This feature alone does not sit well in our age of instant gratification.
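A rough sketch (numpy and scipy assumed, numbers illustrative) of why repetition and more data matter: the same small real effect, tested with larger samples, is missed far less often:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=2)
effect, trials = 0.3, 5_000  # a real effect of 0.3 standard deviations

for n in (20, 80, 320):
    misses = sum(
        stats.ttest_ind(rng.normal(0, 1, n), rng.normal(effect, 1, n)).pvalue >= 0.05
        for _ in range(trials)
    )
    print(f"n = {n:3d}: Type Two rate ~ {misses / trials:.2f}")
# The miss rate falls from roughly 0.85 through 0.5 to a few percent as n grows.
```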
The media are prone to making both types of error. Assessing the veracity of the information put out during election campaigns is an important service now provided by pundits and commentators, who may indeed have axes of their own to grind. Nevertheless, there are specific questions we might ask of the data miners, even of ourselves, if we wish to do the mining:
How was this work sponsored, and by whom?
If it was done and published by a private organization, be especially wary. So also if the source did not report, or is vague about, critical details. S/he may be as unaware as you or I. More often the source is the problem, the case about 2/3rds of the time in two leading scientific publications reporting research that later had to be retracted. Ignorance, simple mistakes, or inadequate experimental designs accounted for only about 1/3rd of the retractions.
Were the samples or data relied upon representative?
Valid statistical results will be accompanied by error bars. Were they? There is a myriad of ways to test data and interpret results. Each technique has its own assumptions. Were they met? Were the samples representative and taken at random? Was a control group or comparative baseline used? Were its parameters defined? Polls have become popular in recent years. Polls are treacherous enough, even when taken by independent experts interested only in being correct. Do not believe any poll that is not accompanied by an error bar. Even then it is difficult to arrive at a valid poll, because one can never be sure that the statements given by those polled, by whatever means, are accurate. So even an error bar does not guarantee accuracy, but many well-known polls do seem to be reasonably accurate for their purposes. We reference several on "Peace Related News Links".
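For what a poll's error bar amounts to, here is a minimal sketch (the percentages are invented) using the usual normal approximation for a sampled proportion:

```python
import math

p_hat, n = 0.52, 1_000  # hypothetical: 52% support among 1,000 respondents
z = 1.96                # two-sided 95% confidence level
margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
print(f"{p_hat:.0%} ± {margin:.1%}")  # about 52% ± 3.1%
```

A 52% result with a 3.1% error bar cannot distinguish a lead from a tie, which is exactly why a poll without its error bar tells us little.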
How strong is the effect or difference?
Do not rely upon a medical result if the error bar includes zero, as many will if the alpha value chosen for useful significance is lax enough and/or the effect is small. Beware if the effect and its attending error bar are both near zero. (Sigma is the number of standard deviations on one or both sides of the mean used to draw an error bar; the standard deviation is the root-mean-square deviation from the mean.)
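Here is the zero-in-the-error-bar check in miniature (numpy assumed; the effect data are invented):

```python
import numpy as np

effects = np.array([0.8, -0.3, 1.1, 0.2, -0.5, 0.9, 0.4, -0.1])  # made-up changes
mean = effects.mean()
sem = effects.std(ddof=1) / np.sqrt(len(effects))  # standard error of the mean
low, high = mean - 2 * sem, mean + 2 * sem         # a rough two-sigma error bar
print(f"effect = {mean:.2f}, error bar [{low:.2f}, {high:.2f}]")
print("bar includes zero: distrust the claim" if low <= 0 <= high
      else "effect excludes zero")
```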
How certain is it that this result is true?
Was the alpha value reported, or just an error bar defined by an unmentioned alpha? Drug companies often abuse the system, especially for over-the-counter drugs, or when they have a huge investment in a very marginal medication. The over-the-counter folks typically protect themselves from legal problems by using a disclaimer on the label, in fine print. It is usually good practice to avoid all such products. Even drugs approved by the FDA will not necessarily do what is claimed; recent court cases attest to that fact. See Pfizer for a violation by one of the largest drug companies. That "violation" was just one instance. Another instance by the same company led to a $2.3 billion settlement. These are not the only such examples. The entire tobacco industry is paying for misrepresenting the health hazards.
How many variables were analyzed at the same time?
This is a most prevalent error when data mining, and is commonly a result of statistical naivety and human exuberance. For instance, if one is mining data for effects at the two-sigma level, then on average one result in twenty will be an accidental positive. One might be looking at 400 variables (yielding 200 independent results) affecting one feature, say the weather. By statistical variation alone one would expect to find 10 or so "significant" examples at two sigma, and one instance at three sigma might be observed naturally. It is entirely possible that none of these are real! Well-directed data mining will typically produce more positives than expected from the math. In order to complete an effective data mining operation, it is necessary to repeat the mining on new and independent databases wherever possible. This is not always or even usually possible. Finally, to an eager researcher, a three-sigma result might seem like proof positive, but it is not. So ask about it. There must be an operative scientific mechanism that explains it all before it can be considered new science. Even then, the research must be repeated until the five-sigma p value is reached, roughly one chance in a million that it is an artifact of the Type One error.
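The arithmetic above is easy to verify by simulation. This sketch (numpy assumed) screens 200 variables of pure noise at the two-sigma level and still finds about ten "discoveries", every one of them false:

```python
import numpy as np

rng = np.random.default_rng(seed=3)
n_vars, n_obs = 200, 50

hits = 0
for _ in range(n_vars):
    x = rng.normal(0, 1, n_obs)                      # pure noise, no real effect
    t = x.mean() / (x.std(ddof=1) / np.sqrt(n_obs))  # t statistic against zero
    if abs(t) > 2:                                   # the two-sigma screen
        hits += 1
print(f"{hits} 'significant' variables out of {n_vars}, all of them accidental")
```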
In all data-mining cases, there will typically be Type Two errors as well. The situation is worse—or better in a scientific sense—if a number of variables “co-vary” showing effects in unison. For example, average rainfall might co-vary with latitude, average temperature, and elevation. In those events, the real science of weather becomes more deducible and predictable when the co-variates are sorted out.
Was the placebo effect included when reporting a drug efficacy?
If the study involved human subjects, one should ask: What fraction of the total variance was due to the placebo effect? If a placebo was not employed, or not reported, be especially wary; it may be that the response claimed was entirely the placebo effect at work. Or the placebo effect may be some fraction of the total response. In real life, the placebo effect reflects the power of mind over matter. If a person in a medical trial even thinks s/he might be getting the real medication, that person will typically show some positive response, possibly at some "significant" level. Since it affects both the experimental and control groups equally, the placebo response must be subtracted from the apparent medical effect to arrive at the true medical response.
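The subtraction itself is simple arithmetic; a toy example with invented numbers:

```python
drug_improved    = 0.42  # hypothetical: 42% of the drug group improved
placebo_improved = 0.30  # hypothetical: 30% improved on the sugar pill alone
net_effect = drug_improved - placebo_improved
print(f"Apparent effect {drug_improved:.0%}, placebo {placebo_improved:.0%}, "
      f"true drug effect about {net_effect:.0%}")
```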
How were the many variables handled?
This is where data mining can easily run astray in both ways. Data mining employs metrics of various complexities where dozens of variables are assumed to operate. Such situations arise in public health. Consider the possible incidence of leukemia among the populations living near twenty different electric power plants over three intervals of time. The national incidence of leukemia for each time interval is used for the expected rate. Data of this sort are ripe for data mining for good purposes. All too often, however, a data miner running upon a correlation at two sigma grabs it and runs to the media, only to discover later that this exuberance brought on a Type One error. Type Two errors were likely there as well: real effects simply went unrecognized.
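A hedged sketch of such a comparison (scipy assumed; the populations, rate, and counts are all invented): each site's observed count is compared with the count expected from the national rate via a Poisson tail probability, and the significance bar is tightened because twenty sites were examined at once:

```python
from scipy import stats

national_rate = 14e-5                  # hypothetical cases per person per interval
sites = [(120_000, 21), (95_000, 13)]  # (population near site, observed cases), ...

alpha, n_sites = 0.05, 20
for pop, observed in sites:
    expected = national_rate * pop
    p = stats.poisson.sf(observed - 1, expected)  # P(count >= observed) by chance
    # Bonferroni: demand alpha / n_sites to avoid the experiment-wise error.
    flag = "worth a closer look" if p < alpha / n_sites else "within chance"
    print(f"expected {expected:.1f}, observed {observed}: p = {p:.3f} ({flag})")
```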
The variables here are power-transmission lines. If their designs differ, or their rated power does, then these features provide additional ways to block the data for comparative analyses. Even prevailing wind might be a factor if noxious or radioactive gases or particles are released. Local air quality affected by local industry would be yet another variable to consider. The possibilities may seem endless. If one station stood out with a high (or low) incidence at a significance of 0.05, one could not be sure the finding is real, as it is likely to be a statistical artifact of the odds implied by the significance level. To conclude the finding is real is to commit an experiment-wise error. That is not to say the result should be ignored either. In fact, controversy surrounding this very issue still exists. But the website EM Watch does not supply any data allowing us to decide for ourselves. Their claims are subject to the errors cited here, never mind their emotional appeal. So what is the real picture? To research it for yourself, visit each of the following sites:
Electric power used in America peaked in 2008.
In contrast, the US cancer rate has been declining since the early 1990s.
In particular, leukemia is not tracking power generation in the US as the emwatch.com site claims. See the National Cancer Institute. The incidence of leukemia since 1991 has been trending down for both males and females. For this reason alone we cannot accept the power-line thesis of emwatch.com at face value. Have they cherry-picked their data? Did they make it up? Did they study a nonrepresentative group? Is it merely propaganda? We cannot say from the evidence presented, but their position is consistent with any of these theses. Compared with the obvious effects of gender and racial differences appearing in the referenced tables, EM Watch has it wrong, dead wrong. Decide for yourself, but we see little merit in the arguments put forth on emwatch.com.
Similar controversy surrounds the use of cell phones. Neither power lines nor cell phones emit ionizing radiation, the kind that could damage DNA. If there is a relationship, it must be small and indirect.
How scientific is this result?
To reach scientific acceptability, two conditions must be satisfied: 1) The significance must be at least five sigma (0.0000003, or less than one in a million). 2) There must be a mechanism behind the observations whose reason and logic are consistent with scientific laws. Even then, an independent confirmation is required to eliminate Mark Twain's concern about human nature. It took years of high-energy-collider research to establish the existence of what looks like the Higgs boson. The fit is not exact in some important ways; there may be some new physics in the offing. But the critical point is that five sigma was reached by two independent means. Since Higgs and others predicted its existence decades ago, the scientific reasoning is there as well. We can safely bet our lives on it being real.
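For reference, the sigma levels quoted here translate into tail probabilities as follows (a small sketch assuming scipy):

```python
from scipy import stats

for sigma in (2, 3, 5):
    p = stats.norm.sf(sigma)  # one-sided upper-tail probability
    print(f"{sigma} sigma: p ~ {p:.1e}")
# 2 sigma: ~2.3e-02; 3 sigma: ~1.3e-03; 5 sigma: ~2.9e-07 (the 0.0000003 above)
```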
In summary, statistics can never prove anything; science is ultimately required for that. Nevertheless, strong statistical results can provide roadmaps for good health or further research as the case may be.
These simple questions, answered honestly and forthrightly, are usually sufficient for us to assess the veracity of a finding. It may be that you do not want to be bothered. But if you do, you have about an even chance of being taken in if your source is a politician of any type, simply because we rarely have the time, ability, or resources to get answers for ourselves. Our risk can be higher than 50% if the media source is owned by ideologues. Moreover, we cannot rely even on an honest media reporter to be in command of all these details. If it is important that we know for sure, and it often is, then we should search out the source and evaluate its significance for ourselves before making a decision. Even our family doctor may not be sufficiently aware: if s/he is not skilled in statistics, if the drug or treatment is new, if it was not rigorously certified by the FDA, or in almost any of the number of ways devious people can corrupt the system.
To be sure, human nature can corrupt science.
The good and bad news is: corruption in science is minuscule in effect compared with the damage created by politicians.
Those who care little for the math or details can still ask the foregoing questions in bold. Then be critical of the reply. Ask further questions, and trust your intuition: does the reporter's demeanor spell confidence? If the statement is in print, can it be verified from other, more reliable sources? Finally, one can ask oneself: Is this statement consistent with the solidly-known facts? Politicians make their bread and butter by using sleight of hand and camouflage. True, some are better at it than others. Watch also for how much weight they put on loudness and forcefulness. Of course these traits may be appropriate depending on the occasion. But in the recent decade, the one who shouts loudest, or tells the biggest lie in trying to demonize an opponent, does indeed get extra votes. As citizens living in a new era where propaganda is driven by money and money buys anything, we must tread carefully.
Posted by RoadToPeace on Sunday, October 14, 2012.