Physical performance can be measured quite accurately; mental performance is less accurately quantified, and not just because it has many different expressions. Behavior is even less predictable. Nevertheless, useful information on behavior can be assessed with confidence as Varshney has shown so dramatically. Our purpose here is to provide some background justifying that confidence.
We review some of the history of how social science came to be "quantified" using probabilistic methods. Unlike the chemist or physicist, the social scientist must deal with soft information where measurement itself may be in question; it may not be a hard number at all, just an impression.
Like Newton and Einstein, Adolphe Quetelet and Karl Pearson were the giants upon whose shoulders a new science was born. Social science may not be as quantitative as the "hard" sciences, but its logic is just as sound. Social science has the added handicap of having numerous detractors and critics who do not rely upon data at all, but who project their individual biases and hang-ups for all the world to see. Such Authoritarian Personalities can make the lives of social scientists quite difficult.
Adolphe Quetelet:
Quetelet was among the first to view human behavior as a societal issue and to argue that there is such a thing as an "Average Man" (or Average Woman). By that he meant how an average person will react to a given set of circumstances. Of course, such knowledge is of great help in interpreting history. Its greatest expression, however, is in predicting the future. In that mode, it gives one better judgment, whether leading or following others.
The concept of the Average Man was born 5 March 1831, about the time Ameer Ali and his band of Thugs were run down in India. The newborn "christening" came on 9 July of the same year as Quetelet read papers at the Brussels Academie Royale. Quetelet was first to quantify societal behavior in terms we would now call statistical. His second paper dealt with the penchant for crime at different ages. We will not go into his methodologies here, but rather discuss some of the techniques he developed that bear on what we need to do today if terror is ever to be relegated to the history books.
In his paper on the dependence of crime on age, Quetelet pointed out a critical feature, one the current Administration ignored in its haste to eliminate Saddam Hussein. To paraphrase our source, Quetelet cautioned that no interpretation of data can be sound unless it considers all possibilities as to what may be causing the feature being measured or counted. He pointed out that politicians in particular are prone to using gut-feel instead of hard data, with attendant errors of considerable import. This feature is still with us and explains why politicians get it wrong so often. It affects all political parties, but not necessarily equally. In Quetelet's case, previous authors had erroneously attributed crime to education. Quetelet also showed that theories of medicine suffered from the same lack of thoroughness. Quetelet was among the first to stand tall against wishful thinking by those in influential positions.
A contemporary of Quetelet studied longevity as a function of profession. To pick a few of his results, natural philosophers lived 74.7 years, authors on revealed religion lived 67.5 years, poets lived 57.2 years, while students lived a mere 20.7 years. Laughable? Yes, but we must remember the times. That author was limited by the information available on deaths that had occurred. Still, he neglected the fact that students who survived went on to become something else by the time they died. A repeat study, better done, confirmed an effect of profession on age at death, but the range was much smaller, and more reasonable.
Quetelet's career was some 60 years long and throughout it all he was a staunch advocate of looking for and finding the true cause or causes underlying any effect.
Karl Pearson:
Pearson was an apostle of Quetelet, and like Quetelet, adamant about interpreting data. Unlike Quetelet, Pearson had an acid tongue. Pearson extended many of Quetelet's methods and refined their use in addressing social questions of his day. He never retreated from a fight that pitted his solid data against bias or conjecture.
Pearson's greatest renown came from a very public and sharp disagreement among "statisticians" of his day on how to generate and interpret data. These people came from different professions; that of statistician did not yet exist.
In observational studies, several issues must be recognized and dealt with:
- Category definition.
- Data selectivity.
- Adequacy of sample.
- What can be learned about society from limited information.
The controversy had to do with the temperance movement which was all the rage in 1910. Of course everyone knows drinking is bad for your health -- when done to extreme. And everyone could conclude that prohibiting the sale of alcohol would improve public health. OK so far?
Well, no. It seems that when a few hardy souls tried to prove the effect, they came up empty-handed. For example, try as they might to show otherwise, children of drinking parents thrived just as well as children of teetotalers. Pearson was among the data collectors. His antagonists were important, even thoughtful, people of his day. Many were at Cambridge. Unlike Pearson, they disdained data as unnecessary. In public opinion the latter carried the day. But Pearson's basic data has by and large stood the test of time. He certainly won the statistical battle. It took another two decades for the temperance movement to run out of steam for lack of sufficient social improvement to warrant permanent adoption. Thus, the 18th Amendment to the American Constitution came and went.
Far more important is the fact that Pearson's methodologies formalized the mathematical approach to social science, and for that society can be forever grateful. Pearson's correlation coefficient has become a standard tool in medicine, psychology, sociology, and anthropology.
In his time, eugenics was a very sensitive topic, and Pearson did not back away. Many in the temperance movement thought alcoholism was genetic and that negative behavior would result. Pearson showed that that was simply not so. Alcoholics and their families thrived as well as, or even better than, non-alcoholic families did. Pearson did see one trend that he put little stock in at the time, for his data lacked the necessary power to see a small effect clearly. His observation that children of alcoholics seemed to live shorter lives foretold fetal alcohol syndrome and similar effects only rediscovered in larger, later studies.
See Ashutosh Varshney for a particularly good application of Pearson's scientific methods in our time. His approach can tell us why Honolulu and El Paso are so much more peaceful than Detroit, Baltimore or Washington.
Quetelet says not to skimp when looking for possible reasons; Pearson tells us how to be rigorous in interpretations.
It is hard to appreciate Pearson's technical contributions without some background in math. However, his most important one, the essence of the product-moment sample correlation coefficient, can be grasped easily. If a factor is believed to cause a certain result, Pearson's coefficient can be used in two important ways:
- to estimate how much confidence the data sample justifies in the supposed effect, and
- to determine the fraction of the variance in the result that the supposed relationship accounts for (the square of the coefficient).
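For readers who want to see the arithmetic, here is a minimal sketch of both uses in plain Python. The function names and the six data points are made up for illustration; they do not come from Quetelet, Pearson, or Varshney. The t statistic shown is one common way to gauge confidence in a correlation, and it still has to be compared against a t table with n - 2 degrees of freedom.

```python
import math

def pearson_r(xs, ys):
    """Product-moment sample correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable does not vary")
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for 'no linear relationship' (n - 2 degrees of freedom).

    Undefined when r is exactly +1 or -1.
    """
    return r * math.sqrt((n - 2) / (1 - r * r))

# Hypothetical data: a supposed cause and a supposed result.
factor = [1, 2, 3, 4, 5, 6]
result = [2, 1, 4, 3, 6, 5]

r = pearson_r(factor, result)
r_squared = r * r                    # fraction of variance accounted for
t = t_statistic(r, len(factor))      # input to the confidence judgment

print(f"r = {r:.3f}, r^2 = {r_squared:.3f}, t = {t:.3f}")
```

Note that nothing in the calculation knows which list is the cause and which is the result; swapping the two arguments gives the same coefficient.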
Variation observed gives rise to its mathematical descriptor, variance. The two are not to be confused. Pearson's correlation coefficient relies on variance. It is not all things to all people, nor does it provide scientific proof; it just quantifies the argument over possible cause and effect by limiting the possible degree of relatedness. Which variable is cause and which is effect is not what the correlation coefficient determines. The actual root cause must be determined by other means.
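It is also difficult for the non-practitioner to distinguish between the strength of a correlation and its degree of certainty. Either can be high while the other is low; they answer different questions. Strength says how tightly two variables move together, while certainty depends heavily on how much data stands behind the estimate.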
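The difference between strength and certainty can be sketched with a confidence interval around the coefficient. The example below uses the Fisher z-transformation, a standard approximation that is swapped in here for illustration and is not claimed to be Pearson's own procedure. The function name and the two (r, n) pairs are hypothetical.

```python
import math
from statistics import NormalDist

def fisher_interval(r, n, level=0.95):
    """Approximate confidence interval for a correlation via Fisher's z.

    Assumes roughly bivariate-normal data, more than 3 observations,
    and -1 < r < 1.
    """
    if n <= 3:
        raise ValueError("need more than 3 observations")
    z = math.atanh(r)                    # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)          # approximate standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)

# Strong correlation, tiny sample: high strength, low certainty.
strong_small = fisher_interval(0.9, 4)

# Weak correlation, large sample: low strength, high certainty.
weak_large = fisher_interval(0.1, 1000)

print("r = 0.9, n = 4:    interval", strong_small)
print("r = 0.1, n = 1000: interval", weak_large)
```

The first interval is wide enough to include zero, so the strong-looking result cannot be trusted. The second is narrow and excludes zero, so we can be fairly sure the relationship is real, even though it is weak.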
In practical science, proper experimental design can usually determine which variable or variables are cause and which effect. The experimental design transcends all other features in assembling data. Once the information is in hand, it is good practice to use all reasonable tools to interpret the data. We say that because it is not always clear which tool is most accurate. But if two or more tools arrive at the same interpretation, then we can be pretty sure of being basically correct.
Aside from design of experiment, there are two ways to go astray:
- accepting a result as "true" when it is actually false (a false positive), and its opposite,
- accepting a result as "false" when it is actually true (a false negative).
What to do? Well, for one thing, the experiment can be repeated one or even more times. This adds to the "power" of the result and is always good practice whenever it is possible. Partisan politics is the bane of social science research. Experimental results on human subjects are "soft" compared with those of the physical sciences. And since social issues are also subject to emotions, partisanship tends to push logic aside.
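The two ways of going astray, and the benefit of repeating an experiment, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are artificial normally distributed pairs, the true correlation of 0.3 is a hypothetical value, the significance check uses the Fisher z approximation, and "repeating the experiment" is modeled simply as pooling four studies of 20 people into one sample of 80.

```python
import math
import random
from statistics import NormalDist

def sample_r(rng, n, rho):
    """Draw n pairs with true correlation rho; return the sample correlation."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [rho * x + math.sqrt(1 - rho * rho) * rng.gauss(0, 1) for x in xs]
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

def detection_rate(n, rho, trials=2000, alpha=0.05, seed=1):
    """Fraction of simulated studies that declare the correlation 'real'."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(trials):
        r = sample_r(rng, n, rho)
        # Fisher z test of 'no correlation' (an approximation).
        if abs(math.atanh(r)) * math.sqrt(n - 3) > z_crit:
            hits += 1
    return hits / trials

# No true effect: every detection is a false positive.
false_positive_rate = detection_rate(n=20, rho=0.0)

# A true but modest effect: every non-detection is a false negative.
power_one_study = detection_rate(n=20, rho=0.3)
power_four_pooled = detection_rate(n=80, rho=0.3)

print("false positives with no effect:", false_positive_rate)
print("detections, one study of 20:   ", power_one_study)
print("detections, four pooled (80):  ", power_four_pooled)
```

In runs like this the false positive rate stays near the chosen 5 percent, while the chance of detecting a real but modest effect rises sharply as the sample grows. That is the sense in which repetition adds to the power of a result.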
Posted by RoadToPeace on Monday, March 06, 2006.