Epidemiology is the indispensable basic science of public health. It provides the logical framework for the facts that enable public health officials to identify important public health problems and to delineate their dimensions. Epidemiologic methods are used to define these health problems; to classify, identify, and elucidate their causes; and to plan and evaluate rational control measures.
HISTORICAL DEVELOPMENT OF EPIDEMIOLOGY
In ancient times, epidemics and plagues were terrifying natural phenomena that cried out for a more rational explanation than that they were due to the wrath of god or the machinations of evil spirits. Hippocrates (c. 460–377 B.C.E.) described many kinds of epidemics, and in On Airs, Waters, Places and other writings he offered empirical insights into environmental and behavioral factors that might be associated with certain kinds of disease. Doctors and others engaged in the healing arts did not clearly understand the concept of contagion until many centuries later, when Fracastorius (c. 1478–1553) identified several ways that infections can be transmitted: by direct contact, by what we now call droplet spread, and by contaminated clothing.
The science of epidemiology took root with empirical observations of epidemics and other causes of death. John Graunt (1620–1674), in London, compiled the first mortality tables from England's bills of mortality. Statistical analyses of deaths due to childbed fever by Ignaz Semmelweis (1818–1865) in Vienna in the mid-nineteenth century, and of tuberculosis by Pierre Charles Alexandre Louis (1787–1872) in Paris, demonstrated the power of numbers. In London, in 1848 and 1854, meticulous, logical examination of the facts and figures about cholera epidemics by John Snow (1813–1858) revealed the mode of communication of this deadly epidemic disease. Snow is regarded as the founder of modern epidemiology because of his use of such careful methods.
Until early in the twentieth century almost all epidemiology focused on communicable diseases, although the observations of Percivall Pott (1714–1788) on cancer of the scrotum in chimney sweeps and the dietary experiment of James Lind with fresh fruit to prevent scurvy (1747) were precursors of modern noncommunicable disease epidemiology and clinical trials, respectively. The use of epidemiology in studies of coronary heart disease and cancer, in large-scale trials of many new preventive and therapeutic regimens, in nationwide surveys of health status, and in evaluation of health services came to the fore in the second half of the twentieth century. In the final quarter of the twentieth century, powerful computers, information technology, and more rigorous methodological approaches transformed epidemiology and made it a mandatory feature of clinical science as well as the most fundamental basic science of public health.
DEFINITION AND SCOPE
The word "epidemiology" was coined in the mid-nineteenth century to describe the scientific study of epidemics. Its meaning has expanded over the years, and present-day epidemiology encompasses the study of all varieties of illness and injury as they affect defined groups of people. In 1983 a committee representing the International Epidemiological Association defined epidemiology as "the study of the distribution and determinants of health-related states or events in specified populations, and the application of this study to control of health problems." Study includes observation, surveillance, hypothesis-testing research projects, analysis of epidemiologic and other kinds of data, and certain other kinds of experiments. Distribution includes analysis of data according to the time scale over which events occur, the places where the events occur, and the categories of persons to whom they occur. Determinants are all the physical, biological, behavioral, social, and cultural factors that influence health. Health-related states or events include diseases, causes of death, behaviors such as the use of tobacco, reactions to preventive regimens, and provision and use of health services. Specified populations are those with identifiable characteristics such as known numbers and age groups. The ultimate aim and purpose of epidemiology, which is to promote, protect, and restore good health, is manifested in the "application of this study to control of health problems."
Epidemiologists attempt to identify, measure, count, and control diseases, injuries, and causes of untimely death; and to relate these events to the associated inherited, environmental, and behavioral factors that cause or contribute to them. One of the great intellectual challenges of epidemiology is to dissect these factors and unravel their connections in order to identify exactly what is ultimately responsible for a particular disease or health problem.
RELATIONSHIP TO OTHER SCIENCES AND TECHNOLOGIES
The information used by epidemiologists comes from a diverse array of sources, draws on a wide range of sciences and technologies, and calls on the expertise of technologists and other people engaged in many kinds of crafts. Some connections are obvious: those with vital statistics, biostatistics, microbiology, immunology, and chemistry, and with every clinical specialty from pediatrics to geriatrics and palliative care, and from family practice to hematology and neurosurgery. Other obvious connections are to the social and behavioral sciences; less obvious ones are to animal husbandry, wildlife biology, agricultural science, physics, atmospheric sciences, oceanography, engineering, town planning, education, law enforcement, communications technology, and the media. Epidemiology may be the most ecumenical of all the sciences. Probably no other branch of biomedical science has so many connections to such a wide range of other human activities.
The basis of all epidemiology is the comparison of groups of people. For these comparisons to be valid, it is necessary to convert raw numbers into rates. A rate is a fraction: the upper part (the numerator) is the number of people affected by the problem, event, or condition of interest; the lower part (the denominator) is the number of persons in the population who are at risk of experiencing the problem, event, or condition. Because the events normally continue over a long period, often indefinitely, rates are expressed in relation to a specified time. Since fractions are awkward to deal with, there is commonly a multiplier, and the rate, as shown in the following formula, is expressed in terms of so many per thousand, per hundred thousand, etc., in a specified time, usually a year, though shorter periods are used when circumstances warrant it:

Rate = (number of events in a specified period ÷ number of persons at risk during that period) × 10^n

Here 10^n stands for the multiplier (for example, 1,000 or 100,000).
In practice there are many variations in the ways rates are expressed, but the basic elements of events, population at risk, and time are common to all.
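The three elements named above (events, population at risk, and time) can be put into a short calculation. The following is a minimal sketch; the function name `rate`, the default multiplier of 100,000, and the city figures are all hypothetical and chosen only for illustration.

```python
def rate(events: int, population_at_risk: int, per: int = 100_000) -> float:
    """Events per `per` persons at risk during one specified period."""
    if population_at_risk <= 0:
        raise ValueError("population at risk must be positive")
    # Multiply before dividing so that round figures stay exact.
    return events * per / population_at_risk


# Hypothetical figures: deaths in one year in two cities of different size.
city_a = rate(events=120, population_at_risk=400_000)
city_b = rate(events=45, population_at_risk=90_000)

print(f"City A: {city_a:.1f} per 100,000 per year")
print(f"City B: {city_b:.1f} per 100,000 per year")
```

City A has more deaths in absolute terms, but once both counts are converted to rates the smaller city turns out to have the higher mortality, which is why raw numbers alone cannot support a valid comparison.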
Rates have many uses. By comparing rates, epidemiologists can examine the experience of particular groups of people at specified times, in different cities, countries, or occupational groups.
The terms "incidence" and "prevalence" are often confused. Incidence refers to the number of new cases, events, or deaths that occur in a specified time, usually one year. Prevalence refers to the total number of events or cases, both new and long-term, that are present at a particular point in time. Prevalence is therefore expressed as a number, not a rate, as there is no time dimension involved.
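The distinction can be made concrete with a small set of case records. This sketch assumes each record carries an onset date and, if the case has ended, a resolution date; the `Case` type, the function names, and the five records are hypothetical.

```python
from datetime import date
from typing import NamedTuple, Optional


class Case(NamedTuple):
    onset: date
    resolved: Optional[date]  # None means the case is still ongoing


def incidence(cases, start: date, end: date) -> int:
    """Number of new cases whose onset falls within [start, end]."""
    return sum(start <= c.onset <= end for c in cases)


def point_prevalence(cases, on: date) -> int:
    """Number of cases, new or long-term, present on the given date."""
    return sum(
        c.onset <= on and (c.resolved is None or c.resolved > on)
        for c in cases
    )


# Hypothetical records for one small community.
cases = [
    Case(date(2018, 3, 1), None),                # long-term, still present
    Case(date(2019, 11, 20), date(2020, 2, 1)),
    Case(date(2020, 1, 15), date(2020, 3, 10)),
    Case(date(2020, 5, 5), None),
    Case(date(2020, 9, 9), date(2020, 10, 1)),
]

new_in_2020 = incidence(cases, date(2020, 1, 1), date(2020, 12, 31))
present_mid_2020 = point_prevalence(cases, date(2020, 7, 1))
print(new_in_2020, present_mid_2020)
```

Incidence counts onsets over a period, so the long-term case from 2018 is excluded. Prevalence counts whatever is present on a single day, so that same case is included while the short-lived cases that had already resolved are not.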
An epidemic is the occurrence of a number of cases of a disease clearly in excess of normal expectation. This is usually a large number when the disease is one of the common infectious fevers, but even a single case of a dangerous contagious disease, such as typhoid, that has long been absent from a community should suffice to activate the highest level of epidemic surveillance and control measures. The occurrence of a small number of cases of a rare variety of cancer, closely clustered in time and space, may also signal an epidemic. Observational and analytic epidemiology blend in the investigation of epidemics. The investigation demands meticulous attention to detail in collecting information about all the cases of the condition, including mild and inconspicuous cases as well as those with florid manifestations, and must include details about all possible associated factors, such as dietary intake (especially important in outbreaks of food poisoning), occupation, living conditions, and unusual recent experiences. Particular attention is paid to the index case, the first identified case of a condition. In most infectious disease epidemics, this could be the case that introduced the infection into the affected community. Information is also gathered about healthy people in the same community, with the aim of discovering why they have not been affected. Laboratory tests are used to confirm the diagnosis; to identify the pathogenic organism, toxic chemical, or other agent that caused the disease; and to measure immunological responses among both sick and healthy people. Analyzing all this information often clarifies the nature and cause of an epidemic and points the way to appropriate control measures.
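The phrase "clearly in excess of normal expectation" can be illustrated with a simple screening rule. The sketch below is an assumption, not a method described in this article: it treats the mean of past counts as the expectation and flags an observed count that exceeds the mean by more than two sample standard deviations. The function name, the cutoff `z`, and the baseline counts are hypothetical, and such a rule would not capture the single-case and rare-cluster situations described above.

```python
from statistics import mean, stdev


def exceeds_expectation(baseline_counts, observed, z=2.0):
    """True if `observed` is more than z standard deviations above the baseline mean."""
    expected = mean(baseline_counts)
    threshold = expected + z * stdev(baseline_counts)
    return observed > threshold


# Hypothetical counts for the same calendar week in five earlier years.
baseline = [12, 15, 11, 14, 13]

print(exceeds_expectation(baseline, observed=15))  # within the usual range
print(exceeds_expectation(baseline, observed=25))  # well above the usual range
```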
Investigating epidemics demands such painstaking work that it can seem a tedious, even boring, routine task. But often it is as exciting as detective fiction. For example, an epidemic of typhoid in Aberdeen, Scotland, was traced eventually to a contaminated can of processed beef from Argentina. The can had been cooled in a river adjacent to the canning works. As the pressure inside the can fell when it cooled, a partial vacuum was created and typhoid bacilli in raw sewage in the water were sucked into the can through a minute hole.
Identifying the existence of an epidemic sometimes requires unusual vigilance and an ability to make connections among seemingly isolated events. An epidemic of lethal pneumonia among members of the American Legion, who attended a convention in Philadelphia in 1976 and then returned to their hometowns before becoming ill, would not have come to light without rigorous scrutiny on the part of epidemic intelligence service officers of the Centers for Disease Control. Subsequent investigations led to the identification of Legionnaires' disease.
Techniques of molecular biology, notably DNA typing and the identification of biomarkers, have immensely enhanced the precision of epidemic investigation. It is now possible to trace the exact passage of an infectious agent such as the gonococcus or HIV (human immunodeficiency virus) as it is transmitted by direct contact from one individual to another among a group of people; or to show that coughing by a passenger with
The application of several analytic methods of epidemiologic study has contributed substantially to scientists' understanding of disease causation, and therefore to control and prevention of many conditions of great public health importance. The available methods are observational epidemiology (the empirical study of naturally occurring events), analytic study, and, under carefully defined conditions and with all due ethical safeguards, human experimentation.
Observational Epidemiology. This method begins with surveillance of populations, using vital and health statistics, including analysis of death rates arranged by age, sex, locality, and cause of death. Other information is derived from notified cases of infectious diseases of public health importance, from registries of cancer or other diseases, and from hospital discharge statistics. Since 1957, the National Center for Health Statistics has continuously conducted a National Health Survey that has carried observational epidemiology to new levels of comprehensiveness.
It is often possible to make imaginative use of many other kinds of available information about defined population groups. Schools and many employers keep records of absences due to sickness, sometimes with reasons for these absences. Police and other law enforcement agencies keep records of calls to settle domestic disputes and of damage due to vandalism, which are useful indicators of social pathologies associated with local variations in the frequency of domestic violence, alcohol abuse, and broken families. All such sources of information combine to make it possible for epidemiologists and public health specialists to produce a multidimensional "community diagnosis." Serial measurements can indicate whether things are improving or getting worse, and in which ways these trends are moving for each of different indicators ranging from adolescent smoking behavior to reasons for long-term disability among the elderly.
Analytic Observational Studies. The possibilities of observational epidemiology are considerable, but not limitless. They are powerfully reinforced by analytic studies. The two main analytic methods are the case-control study and the cohort study.
Careful questioning of patients has enabled many doctors to make inferences about the influence of past experience on present disease. Percivall Pott, an eighteenth-century British surgeon, observed that cancer of the scrotum occurred among former chimney sweeps, and correctly inferred that it was associated with the accumulation of soot in the skin creases. More than 150 years later, in 1941, Norman Gregg, an ophthalmologist in Sydney, Australia, similarly inferred correctly that the cases he was seeing of congenital cataract must be associated with rubella (German measles), which their mothers had had during early pregnancy.
The case-control study is a systematic extension of routine medical history taking, in which the past histories of patients (the cases) suffering from the condition of interest are compared to the past histories of persons (the controls) who do not have the condition of interest, but who otherwise resemble the cases in such particulars as age and sex. Analysis of data about a series of cases and controls may show differences that are statistically significant. Sometimes only small numbers of cases are required to demonstrate significant differences between cases and controls. This makes the case-control study a suitable way to search for causes of rare conditions. For example, the discovery that a very rare form of liver cancer was strongly associated with occupational exposure to vinyl chloride required only four cases, and the fact that expectant mothers' use of artificial estrogens during early pregnancy can cause cancer of the vagina many years later in their daughters was based on a case-control study of eight cases. Although case-control studies can be flawed by the presence of biases that are often difficult or even impossible to eliminate, they are a valuable method of investigation because they can be done rapidly and at
A cohort study is conducted by identifying individuals in a defined population who are exposed to varying levels of known or suspected risk for the condition of interest, such as cancer of the lung or coronary heart disease. The population is observed over a certain period, and the death and disease incidence rates among those exposed to varying and known levels of risk are compared. Cohort studies require large numbers, commonly many thousands, and prolonged observation, commonly years or even decades. They are therefore expensive, requiring a large and dedicated staff and maintenance of detailed records of very large numbers of people, only a small proportion of whom will ultimately fall ill and die of the condition of interest. Some cohort studies have become famous. The people of Framingham, Massachusetts, have been the subjects of cohort studies of coronary heart disease since 1948. In 1951, Richard Doll and Austin Bradford Hill began a cohort study of lung cancer in relation to tobacco smoking in a cohort of about 40,000 male British doctors. Later phases of this study have expanded to include risk factors for coronary heart disease and other chronic conditions; and by the late 1990s this study had yielded dramatic evidence of the relationship of tobacco smoking to cancers of many kinds—and to coronary heart disease, chronic obstructive lung disease, and various other life-shortening chronic diseases.
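The comparison at the heart of a cohort study can be sketched as follows. The article does not name a summary measure; the ratio of each group's incidence rate to that of the least-exposed group (commonly called a rate ratio or relative risk) is used here as an assumption. The exposure labels, counts, and function names are hypothetical and are not taken from any study mentioned above.

```python
def incidence_rates(cohort, per=1_000):
    """Incidence per `per` persons for each exposure level over the follow-up period."""
    return {
        level: events * per / persons
        for level, (events, persons) in cohort.items()
    }


def rate_ratios(rates, reference):
    """Each level's rate divided by the rate in the reference (least exposed) group."""
    return {level: r / rates[reference] for level, r in rates.items()}


# Hypothetical cohort: exposure level -> (new cases during follow-up, persons followed)
cohort = {
    "none": (10, 10_000),
    "low": (30, 10_000),
    "high": (120, 10_000),
}

rates = incidence_rates(cohort)
ratios = rate_ratios(rates, reference="none")
for level in cohort:
    print(f"{level:>5}: {rates[level]:.1f} per 1,000, ratio {ratios[level]:.1f}")
```

Even in this made-up cohort of 30,000 people, only 160 develop the condition, which illustrates why cohort studies need large numbers and long follow-up to produce stable comparisons.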
A cohort study can yield results without a wait of many years if sufficiently detailed information about past exposure to risk factors is available for a population of sufficient size. This approach depends on a method that reliably links past and present medical and other relevant records. Record linkage is the process of relating information from two or more sets of records, compiled years apart and sometimes by different agencies, about the same individuals. A prerequisite is a way to identify individuals with a high degree of precision, such as a unique numbering system, or a system that combines a sequence of digits for birthdate, birthplace, and sex with alphabet letters or a phonetic code for other details, such as the individual's mother's maiden name. The logistics make this a costly method, but the yield can justify the expense. This method, known as an historical cohort study, has demonstrated the relationship of childhood cancer and developmental anomalies to prenatal maternal exposure to small diagnostic doses of X-rays. Record linkage and historical cohort studies have also demonstrated a relationship between birthweight and the occurrence of cardiovascular disease in middle age.
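An identifier of the kind described above might be assembled as in the sketch below. The article does not specify a phonetic code; American Soundex is used here as one well-known example, and the key layout, the three-character birthplace code, and the function names are hypothetical.

```python
from datetime import date

# Soundex digit for each consonant; vowels, H, W, and Y have no digit.
_CODES = {
    ch: digit
    for digit, letters in {
        "1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
        "4": "L", "5": "MN", "6": "R",
    }.items()
    for ch in letters
}


def soundex(name: str) -> str:
    """American Soundex: first letter plus three digits, zero-padded."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "HW":  # H and W do not separate letters with the same digit
            prev = digit
    return (code + "000")[:4]


def linkage_key(birthdate: date, birthplace_code: str, sex: str,
                mothers_maiden_name: str) -> str:
    """Digits for birthdate, then birthplace and sex, then a phonetic name code."""
    return (f"{birthdate:%Y%m%d}{birthplace_code.upper()}"
            f"{sex.upper()}{soundex(mothers_maiden_name)}")


# Two hypothetical records for one person, compiled by different agencies
# with different spellings of the mother's maiden name.
key_a = linkage_key(date(1950, 4, 12), "017", "F", "Smith")
key_b = linkage_key(date(1950, 4, 12), "017", "f", "Smyth")
print(key_a, key_a == key_b)
```

A phonetic code tolerates spelling variants that would defeat exact matching, at the price of occasionally giving different names the same code, so keys of this kind usually need the other components to keep false matches rare.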
Experimental Epidemiology. In the 1920s, experimental epidemiology meant observing the passage of infectious pathogens in colonies of rodents, but such experiments are now rarely necessary, and the meaning of the term has changed. Experiments in which the investigator studies the effects of intentional alteration or intervention in the course of a disease are now done on humans rather than experimental animals, usually using a randomized controlled-trial design.
The randomized controlled trial (RCT) is a form of human experimentation in which the subjects, usually patients, are randomly allocated to receive either a standard accepted therapeutic or preventive regimen, or an experimental regimen. The purpose of random allocation is to eliminate or minimize bias in the selection of subjects. This greatly enhances the validity of the results. Preferably, the subjects and those observing the trial's results should be unaware of which subjects are receiving the experimental and control regimens, thus eliminating the power of suggestion as a factor influencing the response of individuals to the regimen. There are very important ethical constraints on the conduct of randomized controlled trials. The only ethically acceptable justification for conducting a randomized controlled trial is uncertainty about which of the available regimens is the best, a state of affairs known as "equipoise." It is absolutely essential to obtain the genuinely informed consent of all human subjects on whom a trial is conducted.
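Random allocation, as described above, can be sketched in a few lines. This is a minimal illustration of simple randomization into two arms of equal size, not a description of how any particular trial was run; the function name, the arm labels, and the subject identifiers are hypothetical, and real trials often use more elaborate schemes and keep the allocation list concealed from those enrolling subjects.

```python
import random


def allocate(subject_ids, seed=None):
    """Randomly split subjects between the experimental and standard regimens."""
    rng = random.Random(seed)  # a seed makes the allocation list reproducible for audit
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"experimental": shuffled[:half], "standard": shuffled[half:]}


# Twenty hypothetical subjects identified only by number.
arms = allocate(range(1, 21), seed=42)
print(len(arms["experimental"]), len(arms["standard"]))
```

Because chance alone decides which arm each subject enters, neither the investigator nor the subject can steer particular people toward one regimen, which is the source of the protection against selection bias.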
CLINICAL EPIDEMIOLOGY AND EVIDENCE-BASED MEDICINE
In the final quarter of the twentieth century, physicians in clinical practice discovered the value of epidemiologic methods in enhancing the efficacy of treatment regimens, mainly through rigorous attention to the nature and quality of the evidence
OTHER RECENT ADVANCES
Epidemiology made spectacular progress in several other directions in the 1990s. One was in the application of molecular biology, resulting in what is sometimes called molecular epidemiology. Other advances have been made in genetic epidemiology, where the meeting of molecular genetics with public health, occupational and environmental health, and infant and child health has produced both exciting stories of great progress and difficult ethical and moral problems. What are scientists and physicians to do, for instance, with the newfound knowledge and technical capability to identify defective genes, especially genes that, in interaction with some environmental circumstances, can disqualify certain individuals from particular occupations and can render others ineligible for life insurance? Such dilemmas presage a testing time for society's values.
Another set of new challenges faces epidemiologists who specialize in studies of risk management. The global environment is changing as the burden of greenhouse gases increases and leads to a rise in average global ambient temperatures, and remote sensing and climate models enable us to predict the likely future distribution of vector-borne diseases such as malaria, dengue, and schistosomiasis. A new realm of risk factor analysis is thus emerging, based on future health scenarios that incorporate climate models and, in the most sophisticated applications, include sets of models for future patterns of biodiversity, human settlements, and economic and industrial dynamics. In these ways epidemiologists are helping to plan the public health services that will be needed in the future.
JOHN M. LAST
(SEE ALSO: Case-Control Study; Cohort Study; Cross-Sectional Study; Epidemiologic Transition; Graunt, John; Hippocrates of Cos; Mortality Rates; Notifiable Diseases; Pott, Percivall; Rates; Rates: Age-Adjusted; Record Linkage; Semmelweiss, Ignaz; Snow, John; Vital Statistics; and other articles on specific diseases mentioned herein)
Ashton, J., ed. (1994). The Epidemiological Imagination. Buckingham, UK: Open University Press.
Beaglehole, R.; Bonita, R.; and Kjellström, T. (1993). Basic Epidemiology. Geneva: World Health Organization.
Buck, C.; Llopis, A.; Nájera, E.; and Terris, M., eds. (1988). The Challenge of Epidemiology. Washington, DC: Pan American Health Organization.
Last, J. M., ed. (2000). A Dictionary of Epidemiology, 4th edition. New York: Oxford University Press.
Rothman, K. J., and Greenland, S., eds. (1998). Modern Epidemiology, 2nd edition. Philadelphia: Lippincott-Raven.
Roueché, B. (1954). Eleven Blue Men, and Other Narratives of Medical Detection. Boston: Little, Brown & Co.