Research Terms & Ideas
What to Be Cautious About
Conducting research in the social sciences that clearly shows connections between one thing (e.g., texting as a primary form of socializing) and another (e.g., quality of workplace relationships) is very difficult. And, even when done well, showing that the research results have implications for the “real world” is even more difficult. If an experiment on university students shows that texting changes relationship quality, should employers hire only people who don’t text, or who do, or for whom texting is a minor part of their socializing activities?
For traditional research, the problems within a study are called “threats to internal validity” and the problems regarding taking the study’s findings to the real world (i.e., generalizing) are called “threats to external validity.” These threats are categorized, named, and described below. Please note that only a cursory overview is provided here. Human research is a complex undertaking usually involving years of study.
Few studies are free of any of the threats. Use your judgment and the analysis of related research to determine how much a study’s findings will influence your thinking or practice. Also, remember that the threats described in this section pertain only to the design of the study, not the nitty-gritty of its statistical methods and execution. Most of us simply must temporarily trust that the researchers chose the right methods and applied them properly. Then, we wait in hope that another researcher will discover any problems.
Threats to Internal Validity – Possible Problems Within a Study
History: Other Events Affect Study Participants
Studies often measure one or more variables (e.g., confidence in finding employment) before and after an intervention (e.g., a weekly job-search webinar series). Finding a change, such as increased confidence in finding employment, cannot always be attributed to the intervention. Other events in participants’ lives may have caused the observed changes. These events, which are part of the person’s history, can be specific to the individual (e.g., a person is told about a work opportunity by a neighbour) but more often affect the whole group.
For example, a job search workshop series delivered in November could show a large increase in participants’ confidence in finding employment, whereas the same series run in December might show a decrease. These changes may have nothing to do with the workshop but rather are caused by events in the labour market: Christmas hiring creates opportunities in late November and December, whereas January sees the aftermath of many employees being laid off as post-Christmas work settles down. A comparison group of similar participants receiving no intervention (control group) or a different intervention would help clarify the cause of the change.
Maturation: Sometimes People Change on Their Own
Adolescent males, for example, naturally become more risk-averse as they mature into adulthood. A year-long study that finds a drop in criminal activity (activity that typically involves risk) after a career development intervention with adolescent males may not be able to claim that the intervention was related to the drop. A comparison group of similar participants receiving no intervention (control group) or a different intervention would help clarify the cause of the change.
Selection Bias: Comparing Apples and Oranges (Usually Accidentally)
For example, consider a study investigating the effectiveness of an employment intervention. A group of volunteers is selected to receive the intervention. Changes in this group are compared to changes among a group of individuals receiving the typical services they must participate in to retain their social assistance benefits. Finding that the group receiving the intervention experiences greater positive change than the other group would be no surprise. By virtue of volunteering, a bias has been created: volunteers are typically more eager and motivated than those who participate only to keep other benefits. Selection bias can be far more subtle, however, creating findings that may not hold up for other groups, including many minority populations.
Experimental Mortality or Attrition: Participants Who Drop Out of Studies May Be Different Than Those Who Stay
Participants drop out of research for a variety of reasons such as becoming ill, moving, changing schedules, or changing priorities. Sometimes, these reasons may be tied to the study’s area of concern. Had they stayed, study-leavers may have responded differently than those remaining in a study. Consider a test of a 3-month entrepreneurship program teaching five groups of 20 participants. On average, each group has four dropouts. The results of the program are based on measures of the remaining 80 participants. Regardless of whether the results show the program’s participants successfully started entrepreneurial ventures or did not, the study’s conclusions will need to be tentative. Perhaps the 20 participants who left did so because their entrepreneurial idea became very successful and time-consuming while in the program. Or, maybe those who left did so because the program inadvertently sent subtle messages that they would not succeed as entrepreneurs because of systemic issues related to their culture, gender identity, or racial identity. Researchers can mitigate the impact of this uncertainty by gathering relevant information about participants at the start of a study and analyzing the differences between those who stayed and those who left, as in the sketch below.
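As a concrete illustration (not drawn from any actual study), the following Python sketch compares a hypothetical baseline measure for the 80 participants who finished against the 20 who left. All numbers are invented; the point is only that a meaningful baseline difference between stayers and leavers is a warning sign that the remaining sample may no longer represent the group the program set out to serve.

```python
# Illustrative only: compare a hypothetical baseline measure for participants
# who completed a program ("stayers") with those who dropped out ("leavers").
# All values are invented for demonstration purposes.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)

# Hypothetical baseline "entrepreneurial self-efficacy" scores (1-10 scale)
stayers = rng.normal(loc=6.0, scale=1.5, size=80)   # the 80 who finished
leavers = rng.normal(loc=7.2, scale=1.5, size=20)   # the 20 who left

# Welch's two-sample t-test: did leavers differ from stayers at baseline?
t_stat, p_value = stats.ttest_ind(stayers, leavers, equal_var=False)

print(f"Baseline mean (stayed): {stayers.mean():.2f}")
print(f"Baseline mean (left):   {leavers.mean():.2f}")
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# A notable baseline difference is a warning that the remaining sample may no
# longer represent the group the program was designed to serve.
```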
Testing: Participants May Change by Being Tested
People sometimes change their behaviour when ideas are brought to the forefront of their minds. For example, a study examining the impact of labour market exploration on participant optimism has two pre-test questionnaires, one about participants’ labour market exploration activities and one about their optimism. Participants then go through an online intervention showing them ways to explore the labour market and, sometime later, complete post-test questionnaires on labour market exploration and optimism. The researchers are pleased to find that both exploration and optimism increased. However, the increase could have been due to the pre-tests: Perhaps the test questions prompted participants to pay more attention to their beliefs about the future and to think about going out to see what the world has to offer. This prompting may have been the cause of the behaviour change rather than the intervention. A comparison group of similar participants receiving no intervention (control group) or a different intervention would help clarify the cause of the change.
Instrumentation: The Way Things Are Measured May Change Over Time
Imagine a study of worker initiative and feelings of agency in which video recordings of workplace behaviour are reviewed by observers before and after an intervention. Observers follow a detailed behaviour coding system, tracking behaviour minute-by-minute for up to 3 hours per study participant. New video recordings are similarly coded immediately and 3 months after the intervention. Findings of improvements (or declines) in initiative and agency may be due to the intervention but could instead be due to:
- observers getting better at coding behaviour over the course of the study because their skill develops,
- observers getting tired of or bored with coding behaviour and getting worse over the course of the study, and/or
- different observers (e.g., research assistants at universities) doing the coding at the end of a study than at the beginning of the study.
Researchers can minimize these problems by having thorough observer training, multiple observers, and random checks of observer accuracy.
Statistical Regression: Extreme Scores on Anything Usually Become Less Extreme
Studies are published occasionally that show remarkable results with very difficult participants or contexts. The findings are quite seductive when a treatment or intervention is shown to work with a group that is typically hard to help (“This works with the most difficult clientele!”). From a statistical perspective, though, it is known that extreme scores naturally become less extreme on retesting because of measurement error and natural variation. The fact that participants score very low (or high) on a variable increases the chance that changes will be seen, regardless of whether participants actually changed. A comparison group of similar participants receiving no intervention (control group) or a different intervention would help clarify the cause of the change. The simulation sketched below illustrates the effect.
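Here is a minimal simulation (Python, with made-up numbers unrelated to any real study) of regression to the mean: a large group is scored twice with no intervention at all, yet the lowest scorers on the first test still look “improved” on the second test because the unlucky measurement error in their first score does not repeat.

```python
# Illustrative only: regression to the mean with no intervention at all.
# Everyone is scored twice; each score is a stable trait plus random
# measurement error. All values are invented for demonstration purposes.
import numpy as np

rng = np.random.default_rng(seed=42)

true_level = rng.normal(loc=50, scale=10, size=10_000)    # stable trait
test_1 = true_level + rng.normal(scale=8, size=10_000)    # first measurement
test_2 = true_level + rng.normal(scale=8, size=10_000)    # second measurement

# Select the "most difficult" cases: the lowest 5% of first-test scores
cutoff = np.percentile(test_1, 5)
extreme = test_1 <= cutoff

print(f"Overall mean, test 1:       {test_1.mean():.1f}")
print(f"Extreme group, test 1 mean: {test_1[extreme].mean():.1f}")
print(f"Extreme group, test 2 mean: {test_2[extreme].mean():.1f}")
# The extreme group's second score moves back toward the overall mean even
# though nothing was done to them: part of the very low first score was
# unlucky measurement error that does not repeat on the second test.
```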
Threats to External Validity – Possible Problems Generalizing a Study’s Findings to the Real World
The typical problems preventing generalization have to do with studies’ samples (participants in the study), measurement, and people’s responses to experimentation.
Sample Representation: There Are Too Few Participants in the Study and/or Participants Do Not Represent the Population
Sample selection (sampling / selection bias). Consider testing a new work search intervention designed to motivate clients in 1-to-1 employment services, one you believe will work with all unemployed clients. Now imagine the time and effort it would take to make sure your study’s sample[1] represents all possible clients, meaning you would need to find and gain the cooperation of a diverse group of individuals representing different cultures, sexes, ages, socioeconomic status levels, and other variables. Most researchers do not have the resources needed to include representatives of all populations they would like their results to generalize to. They therefore often pick a research sample that includes most of the target population’s sub-groups (e.g., unemployed working-age men and women who are immigrants, refugees, or Canadian-born). In the early stages of research, when they are not sure what they will find, researchers want to invest very little. Many researchers work for universities and therefore have research participants readily available: university students. Samples composed of students are fine for preliminary testing of ideas, but generalizing from this sample to others should be done with caution. “Selection bias” is the formal terminology for choosing a sample that will respond differently than the population the researcher wants it to represent (https://www.iwh.on.ca/what-researchers-mean-by/bias).
[1] Researchers use “sample” to refer to the participants (e.g., people, animals, plants, rocks) in their study, recognizing that they cannot study the entire population that is of interest to them (e.g., all unemployed immigrant women, all corporate executives who move to other corporations).
Experimental Setting or Artificial Context: Lab Findings May Not Work in Real Life
Research in the real world (in situ, if Latin is one of your languages) is expensive and time-consuming. Creating artificial contexts can help researchers reduce costs and control variables that could interfere with their studies. However, it can sometimes be a big leap to generalize findings found in a lab to day-to-day behaviour. For example, a study of adolescent career-related decision-making using online scenarios in a computer lab may produce results difficult to transfer to adolescents’ actual lives.
Testing: Being Observed Influences Behaviour, and the Way Things Are Measured Can Influence Results
There are a number of generalizability concerns related to testing or measurement. First, people behave differently when studied (known as the Hawthorne effect). Also, people often behave differently when they are in novel situations or when asked to do things outside of their normal routines.
Second, testing can interact with or influence an intervention. Giving participants a pre-test can prime their thinking so that they are mentally ready for an intervention (e.g., career strategy workshop) that they would not have been ready for without a pre-test. Similarly, giving participants a post-test can solidify concepts learned in an intervention such that everything comes together and makes sense for participants. Without the post-test, the change may not have been solidified and the intervention would not have been shown to make a difference. Also, the timing of measurement can make a difference. A test given right after a life skills program may show few changes, but the same test given a month or two later could show dramatic results as participants had time to get their ducks in a row.
Finally, one measure (e.g., a multiple-choice test) may find no changes due to an intervention but another measure (e.g., short-answer quiz) may show significant change.
Experimenter Effect: Some People Produce Change that Others Cannot
Interventions involving humans working with humans, which are common in the career development field, are particularly prone to the experimenter effect. A common scenario is that a researcher/practitioner finds that an intervention seems to work really well. The researcher puts this to the experimental test, finds great results and, even though the study is done well, goes on to find that no one else gets the same results. Something about the researcher (or group of researchers) beyond the intervention itself is enabling them to get results that others cannot with the intervention alone.
Where to Find Credible Research
Given the difficulty of doing research well, it is really important to pay attention to the source of the research you read. Traditional academic journals typically have two features that make them more trustworthy than other sources:
- They use a peer-review process in which at least one other expert reviews the study and approves it or suggests improvements before publication.
- They do not have a vested interest in the results of the research; it should not matter to the journal whether the results go one way or another.