Introduction
All psychological research begins with observations of behavior, either informal, everyday observations or formal observations based on prior psychological research. Such observations frequently lead to questions. For example, many people drink beverages containing caffeine throughout the day. When asked why, these people often say that it helps them stay alert. Does caffeine really help people stay alert?
For every question one can ask, there are many possible answers. Caffeine might affect alertness at all dosages. It might only affect alertness at certain dosages or only in some situations. On the other hand, caffeine might decrease, not increase, alertness. Finally, one should not discount the possibility that caffeine has no effect on alertness and that the reported effects are placebo effects—that is, they might be caused by expectations about caffeine and not by caffeine itself.
In the language of science, the possible answers to questions are called hypotheses, and the procedure scientists employ to choose among these hypotheses is called hypothesis testing. Hypothesis testing, when used correctly, is a powerful tool for advancing knowledge because it provides a procedure for retaining hypotheses that are probably true and rejecting those that are probably false.
Scientists test hypotheses by making predictions and collecting information. A prediction is a statement of the evidence that would lead the scientist to accept a particular hypothesis. The hypothesis that caffeine maintains alertness leads to the prediction that people who ingest a measured dose of caffeine a given time before engaging in a task that requires alertness will perform better than people who do not ingest caffeine. An experiment could be conducted to test this hypothesis. An experiment is a set of controlled conditions used to test hypotheses. It contains at least one experimental group and one or more comparison, or control, groups.
Role of Predictions
Hypothesis tests are only as good as the predictions generated. Good predictions tell the researcher what evidence to collect. To provide a good test of the hypothesis, predictions must have three characteristics. First, the predictions must follow as a logical consequence from the hypothesis and the assumptions the researcher makes about the test situation. Second, the hypothesis and its corresponding prediction must be testable, in the sense that the researcher could decide from the data whether a given prediction has been confirmed. Finally, it should be unlikely that a given prediction is confirmed unless the hypothesis on which it is based is correct.
If the prediction were not logically related to the hypothesis, a confirmation by the data would reveal nothing about the truth of the hypothesis. This logical relationship is complicated by the fact that a prediction can rarely be generated solely from the hypothesis. To generate a prediction, a number of additional assumptions are usually required. In the caffeine example, these assumptions might include the following: It takes a certain amount of time for caffeine to enter the system and affect behavior; any behavioral effects will not be evident until the dosage reaches a certain level; and the time it takes to detect the appearance of a new object on a computer screen is a valid measure of alertness.
Although the prediction may logically follow from the hypothesis, it will not be confirmed if any of the assumptions are incorrect. Unfortunately, there is no foolproof way to avoid this problem. Researchers must carefully consider all assumptions they make.
The condition of testability means that it must be possible to decide on the basis of evidence whether the prediction has been confirmed. Testable predictions are falsifiable; that is, certain experimental outcomes would lead the researcher to conclude that the prediction and the hypothesis are incorrect. A prediction that is not potentially falsifiable is worthless. If researchers cannot conceive of data that would lead them to disconfirm a prediction, the prediction cannot be tested; all data would lead to confirmation.
The third condition, that it should be unlikely for a given prediction to be confirmed unless the hypothesis on which it is based is correct, is met when the hypothesis leads to a very specific prediction. Specificity is a characteristic of the hypothesis; if the hypothesis is specific, it should lead to a specific prediction. Hypotheses that meet the criterion of specificity tend to require fewer additional assumptions. Unfortunately, such hypotheses are rare in psychological research. Because of this lack of specificity, a single experiment rarely provides a definitive test of a hypothesis. As knowledge about behavior increases, the ability of psychologists to generate and test such hypotheses will no doubt improve.
Testing Strategies
Hypotheses are tested by comparing predictions to data. If the data confirm the prediction, then the researcher can continue to consider the hypothesis as a reasonable explanation for the phenomenon under investigation and a possible answer to the research question. If subjects react faster to new objects after ingesting caffeine, the investigator would conclude that caffeine affects alertness in this situation. If the data do not confirm the prediction, however, then the investigator would conclude that the hypothesis does not provide a correct explanation of the phenomenon and is not a correct answer to the research question. If subjects given caffeine perform similarly to those not given caffeine, then the researcher would conclude that caffeine does not have much, if any, effect on alertness in this experimental situation.
A useful strategy for testing hypotheses is to generate two hypotheses and attempt to show that if one is false, the other must be true. This can be accomplished if the hypotheses selected are mutually exclusive (they cannot both be true) and exhaustive (they are the only logical possibilities). When two hypotheses satisfy these conditions, demonstrating the truth or falseness of one of these hypotheses determines the status of the other: If one is probably true, the other must be probably false, and vice versa.
The conclusion drawn from a hypothesis test is that the hypothesis is either probably true or probably false. The qualifier “probably” is needed because it is possible that the wrong conclusion has been reached. The success of any hypothesis test depends on the logical connections among the hypothesis, assumptions, and prediction. Even when these connections are sound, however, different results might occur if the experiment were repeated. Repeating the experiment and obtaining similar results increases the researcher’s confidence in the data.
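The point that a repeated experiment can produce a different result can be illustrated with a small simulation. The Python sketch below is an illustration only: the baseline reaction time, the size of the caffeine effect, the noise level, and the group size are invented values, and the function names are hypothetical. It simulates many replications of a two-group reaction-time experiment and reports the fraction in which the caffeine group was faster on average.

```python
import random
from statistics import mean


def simulate_experiment(rng, true_effect_ms=15.0, noise_sd_ms=40.0, n_per_group=20):
    """Simulate one experiment; return mean(control) - mean(caffeine) in ms.

    All numeric defaults are invented for illustration, not taken from data.
    """
    control = [rng.gauss(450.0, noise_sd_ms) for _ in range(n_per_group)]
    caffeine = [rng.gauss(450.0 - true_effect_ms, noise_sd_ms)
                for _ in range(n_per_group)]
    return mean(control) - mean(caffeine)


def fraction_confirming(n_replications=1000, seed=1, **experiment_kwargs):
    """Fraction of simulated replications in which the caffeine group was faster."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    differences = [simulate_experiment(rng, **experiment_kwargs)
                   for _ in range(n_replications)]
    return sum(d > 0 for d in differences) / n_replications


if __name__ == "__main__":
    print("true effect present:", fraction_confirming())
    print("no true effect:     ", fraction_confirming(true_effect_ms=0.0))
```

Under these assumed values, most but not all replications favor the caffeine group even though the simulated effect is real, and about half favor it when there is no effect at all. This is why a single confirming or disconfirming result warrants only a "probably."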
It is typically the case that a given experiment will not answer the research question unambiguously. By performing additional experiments with a variety of dosages, times, and tasks, researchers should discover the range of conditions under which caffeine affects alertness. Testing one hypothesis typically leads to more questions, and the cycle of asking questions, generating hypotheses, making predictions, collecting data, and drawing conclusions is repeated. Each round of hypothesis tests increases knowledge about the phenomenon in question.
Group Behavior Research
An excellent example of hypothesis testing was described by Bibb Latané and John M. Darley in their 1970 book The Unresponsive Bystander: Why Doesn’t He Help? Latané and Darley became interested in this topic from reading newspaper accounts of people assaulted in the presence of bystanders who did little to assist the victims. The most famous of these accounts described the murder of Kitty Genovese as having been witnessed by thirty-eight neighbors from their apartment windows, none of whom reportedly intervened or even called the police. Latané and Darley asked the question, What determines in a particular situation whether one person will help another?
One of their hypotheses was that the number of people present is an important factor affecting how likely it is that someone will react to a dangerous situation. They also hypothesized that what a person does depends on the behavior of other people at the scene. From the first hypothesis, Latané and Darley predicted that people would be more likely to respond to an emergency when alone than when in the presence of others. From the second, they predicted that a person’s behavior would be affected by what the others present do.
To test these hypotheses, Latané and Darley solicited the participation of male college students to complete a questionnaire. Students went to a room in a university building where they worked on the questionnaire in one of three conditions: alone, with two confederates of the experimenters, or with two other people who were also naïve subjects. After several minutes, smoke was introduced into the room through a small vent in the wall. The response measures were whether subjects would seek assistance, and, if they did, how long it took them to do so. If subjects did not respond after six minutes of sitting in a room that was filling with smoke, someone came in to get them.
The predictions and the experiments to test them were based on the above hypotheses and on assumptions about how subjects would view the test situation. Latané and Darley assumed that subjects would believe the emergency in the experiment was real and not contrived, and that subjects would perceive the situation as potentially dangerous.
Their predictions were confirmed by the data. Subjects were most likely to report the smoke when alone. When there were two passive confederates who acted in a nonchalant manner in the presence of the smoke, subjects were least likely to respond. Being in a group of three naïve subjects also inhibited responding, but not as much as when the other two people ignored the apparent danger.
The results of this experiment appear to provide rather convincing evidence for the correctness of Latané and Darley’s hypotheses; however, the assumption that subjects would view the smoke as potentially dangerous is suspect. Postexperimental interviews revealed that some subjects did not perceive the smoke as potentially dangerous. Furthermore, the results of this experiment suggested additional questions to Latané and Darley: “Does the inhibitory effect of other people depend on the fact that in a three-person group, the subject is in a minority? What would happen if only one other person were present? Does the effect depend on the fact that the other people were strangers to the subject? What would happen if the subject were tested with a close friend?”
These questions were addressed in subsequent experiments. In each case, hypotheses were generated and predictions were derived and tested. Care was taken to make the dangerous situation as unambiguous as possible and not to give away the fact that it was contrived. The general results confirmed Latané and Darley’s predictions and supported their hypotheses about bystander apathy.
Eyewitness Testimony Research
For more than eighty years, psychologists have studied people’s accuracy at describing events they have witnessed (eyewitness testimony). The history of this work is recounted by Gary Wells and Elizabeth F. Loftus in their edited volume Eyewitness Testimony: Psychological Perspectives (1984). Considerable evidence, collected in a wide variety of settings, demonstrates that people’s recollections of an event can be influenced by postevent experiences such as interviews by the police and attorneys or the viewing of mug shots. This raises an interesting question: When people’s recollection of an event is changed by postevent information, is the underlying memory changed, or is the original memory still intact but rendered temporarily inaccessible? In other words, does reporting of the event change following postevent experiences because the memory has changed, or because the new information blocks the ability to recall the event as originally experienced?
This is an example of an interesting question to which no satisfactory answer has yet been obtained—but not for lack of trying. The question suggests two mutually exclusive and exhaustive hypotheses: The underlying memory is changed by the postevent experience, so that what the person reports is not what he or she originally saw or heard, or the postevent experience has created a new memory that is in competition with the existing, original memory, and the new memory simply overwhelms the old memory. Both hypotheses lead to the prediction that postevent experiences will affect what a subject reports, but the second hypothesis leads to the additional prediction that the original memory can be teased out into the open under the right circumstances. The problem is how to do this in a convincing manner.
David Hall, Loftus, and James Tousignant reviewed the research on this question. They noted that it is difficult, if not impossible, to disprove the hypothesis that both memories coexist. In some studies, what appear to be original memories seem to have been recovered, but in other studies they have not. The question as asked assumes an either-or situation (either the original unaltered memory still exists or it does not). Unfortunately, this question cannot be answered unambiguously with present knowledge and technology. On the other hand, the question of under what conditions recollections (as opposed to memories) are changed can be answered, because it leads to a number of testable predictions based on hypotheses about these conditions.
Clearly, not all questions lead to testable hypotheses and falsifiable predictions. When data support more than one hypothesis, or if the data are likely to occur even if the hypotheses are false, hypotheses have not been adequately tested. The hypotheses about the status of the original memory are an excellent illustration of this. Knowledge of the effects of postevent experiences was advanced by asking the more limited question that led to testable hypotheses. Scientific understanding advances when people ask the right question.
Laboratory Versus Field Experimentation
Laboratory experimentation, with its tight control over the variables that affect behavior, is the best way to test hypotheses. By randomly assigning subjects to conditions and isolating the variables of interest through various control procedures, researchers are afforded the opportunity to arrange situations in which one hypothesis will be confirmed to the exclusion of all other hypotheses. Unfortunately, laboratory experimentation is not always an appropriate way to test hypotheses.
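Random assignment, mentioned above, can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular study's procedure: the function name, the subject identifiers, and the condition labels are hypothetical. Subjects are shuffled and then dealt out to conditions in turn, so that chance alone, and not any characteristic of the subject, determines who ends up in which group.

```python
import random


def randomly_assign(subject_ids, conditions, seed=None):
    """Shuffle subjects, then deal them out to the conditions in rotation.

    Group sizes differ by at most one. Pass a seed to make the result
    reproducible; leave it as None for a fresh random assignment.
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    assignment = {condition: [] for condition in conditions}
    for index, subject in enumerate(shuffled):
        condition = conditions[index % len(conditions)]
        assignment[condition].append(subject)
    return assignment


if __name__ == "__main__":
    # Hypothetical subject numbers and condition labels.
    groups = randomly_assign(range(1, 21), ["caffeine", "no_caffeine"], seed=7)
    for condition, members in groups.items():
        print(condition, sorted(members))
```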
When laboratory experimentation is neither possible nor appropriate, researchers can use field experiments, in which experimental methods are used in a natural setting. The inability to control the field setting, however, makes it more difficult to exclude some explanations for the data. Thus, the advantages and disadvantages of field experimentation must be weighed against those of the laboratory. The earliest research of Latané and Darley on bystander apathy involved field experimentation; however, they found it inadequate for rigorous hypothesis testing and moved their research to the laboratory.
Alternate Forms of Inquiry
Surveys and questionnaires can provide answers to some kinds of questions that experiments cannot: Do certain characteristics distinguish people with different attitudes or opinions on an issue? How many people have a given attitude or opinion? Do certain attitudes or opinions tend to occur together? Predictions can be made and tested with carefully designed surveys and questionnaires.
Archival data and case studies can be extremely useful sources for generating questions and hypotheses, but they are poor techniques for testing hypotheses. Both archival research and case studies can indicate relationships among various factors or events. Whether these relationships are causal or accidental cannot be determined by these methodologies; therefore, the only question that can be answered from archival data and case studies is whether certain relationships have been observed. An experiment is necessary to ascertain whether there is a causal connection.
Statistical significance tests are often part of hypothesis testing. Statistical hypotheses parallel research hypotheses; statistical hypotheses are about aspects of populations, while research hypotheses are about the subject of inquiry. Deciding which statistical hypothesis to accept tells the researcher which research hypothesis to accept. The logic of hypothesis testing in general applies to statistical significance tests.
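One simple kind of statistical significance test, a permutation test, can be sketched in Python as follows. The sketch is an illustration under stated assumptions: the reaction times are invented numbers, the function name is hypothetical, and the permutation test is one of many possible tests, not one the studies discussed here are known to have used. The statistical hypothesis being tested is that the group labels make no difference. The test estimates how often a random relabeling of the subjects produces a difference in mean reaction time at least as large as the one observed.

```python
import random
from statistics import mean


def permutation_p_value(treatment, control, n_permutations=10_000, seed=0):
    """Estimate a one-sided p-value for mean(control) - mean(treatment).

    A small value means that a difference this large would rarely arise
    if the group labels were unrelated to the scores.
    """
    rng = random.Random(seed)
    observed = mean(control) - mean(treatment)
    pooled = list(treatment) + list(control)
    n_treatment = len(treatment)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly relabel the subjects
        difference = mean(pooled[n_treatment:]) - mean(pooled[:n_treatment])
        if difference >= observed:
            at_least_as_large += 1
    # Adding 1 to both counts keeps the estimate from being exactly zero.
    return (at_least_as_large + 1) / (n_permutations + 1)


# Hypothetical reaction times in milliseconds, invented for illustration.
caffeine = [412, 398, 405, 420, 391, 402, 415, 399]
no_caffeine = [441, 455, 430, 448, 462, 437, 451, 444]

if __name__ == "__main__":
    print("estimated p-value:", permutation_p_value(caffeine, no_caffeine))
```

A small p-value leads the researcher to reject the statistical hypothesis of no difference, which in turn counts as support for the research hypothesis that caffeine affects alertness in this situation.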
Knowledge is advanced when research questions can be answered. Hypothesis testing, when used correctly, is a powerful method for sorting among possible answers to find the best one. For this reason, it is frequently employed by psychologists in their research.
Bibliography
Barnard, C. J., et al. Asking Questions in Biology: A Guide to Hypothesis Testing, Experimental Design, and Presentation in Practical Work and Research Projects. New York: Pearson, 2011. Print.
Giere, Ronald N. Understanding Scientific Reasoning. 5th ed. Belmont, Calif.: Thomson/Wadsworth, 2006. Print.
Hall, David F., Elizabeth F. Loftus, and James P. Tousignant. “Postevent Information and Changes in Recollection for a Natural Event.” In Eyewitness Testimony: Psychological Perspectives, edited by Gary L. Wells and Elizabeth F. Loftus. New York: Cambridge University Press, 1984. Print.
Hardin, James W., and Joseph Hilbe. Generalized Estimating Equations. 2nd ed. Boca Raton: CRC, 2013. Print.
Latané, Bibb, and John M. Darley. The Unresponsive Bystander: Why Doesn’t He Help? New York: Appleton-Century-Crofts, 1970. Print.
Moore, K. D. A Field Guide to Inductive Arguments. 2nd ed. Dubuque, Iowa: Kendall-Hunt, 1990. Print.
Stanovich, Keith E. How to Think Straight About Psychology. 8th ed. Boston: Allyn & Bacon, 2007. Print.
Wilcox, Rand. Introduction to Robust Estimation and Hypothesis Testing. 3rd ed. Waltham: Academic Press, 2012. Print.