Internal validity refers to the extent to which a study establishes a trustworthy cause-and-effect relationship between a treatment and an outcome. In other words, it’s about ensuring that the changes in the dependent variable are directly caused by the independent variable, not by other confounding factors.

Why internal validity matters

Internal validity is crucial because it allows researchers to draw confident, trustworthy conclusions about the causal relationships they are investigating. Without high internal validity, the results of a study may be influenced by factors other than the treatment, making it difficult to determine the true effect of the independent variable on the dependent variable.

Research example

You want to test the hypothesis that listening to classical music while studying improves test performance. You recruit a group of high school students and assign them to either a treatment group or a control group based on their availability. The treatment group is asked to listen to classical music while studying for a history test, while the control group is asked to study without music.

After the study period, both groups take the same history test. Upon analyzing the results, you find that the treatment group (classical music) performed better on the test compared to the control group (no music).

Can you conclude that listening to classical music while studying improves test performance?

To ensure the validity of your conclusion, you must rule out alternative explanations, such as extraneous and confounding variables, that could account for the observed results.

How to check whether your study has internal validity

To determine if your study has internal validity, consider the following three questions:

  1. Did the treatment and response variables change together?
  2. Did the treatment occur before changes in the response variables?
  3. Can any confounding or extraneous factors explain the results of your study?

If you can answer “yes” to the first two questions and “no” to the third question, then your study likely has high internal validity, and you can establish a causal relationship between your independent and dependent variables.

In the research example about listening to classical music while studying, let’s examine these three questions:

  1. Did listening to classical music and test performance change together? Yes, the treatment group (classical music) performed better on the test than the control group (no music).
  2. Did listening to classical music occur before the history test? Yes, the treatment group listened to classical music while studying, which happened before taking the test.
  3. Can any confounding or extraneous factors explain the results? Yes, the availability of students is an extraneous factor that could explain the differences in test performance between the groups.

Because the participants were assigned to groups based on their availability, there may have been pre-existing differences between the students in each group that influenced their test performance, regardless of the classical music intervention. As a result, this study has low internal validity, and you cannot conclude that listening to classical music while studying improves test performance.
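
To see how group assignment by availability can produce a misleading result, here is a minimal simulation sketch in Python. Everything in it is a hypothetical assumption made for illustration, not data from a real study: music is given no effect at all, students who are free in the afternoon are assumed to score 5 points higher on average (for example, because they have more study time), and the names `simulate_study` and `free_afternoons` are invented for this example.

```python
import random
import statistics


def simulate_study(n_students=4000, seed=0):
    """Simulate test scores in a world where music has NO true effect.

    Hypothetical assumption: students who are free in the afternoon score
    5 points higher on average, whatever they listen to while studying.
    """
    rng = random.Random(seed)
    by_availability = {"music": [], "no_music": []}
    by_coin_flip = {"music": [], "no_music": []}

    for _ in range(n_students):
        free_afternoons = rng.random() < 0.5
        # The score depends on availability only; music adds nothing.
        score = rng.gauss(70, 10) + (5 if free_afternoons else 0)

        # Design 1: available students go to the music group.
        group = "music" if free_afternoons else "no_music"
        by_availability[group].append(score)

        # Design 2: a coin flip decides the group.
        group = "music" if rng.random() < 0.5 else "no_music"
        by_coin_flip[group].append(score)

    def gap(groups):
        return statistics.mean(groups["music"]) - statistics.mean(groups["no_music"])

    return gap(by_availability), gap(by_coin_flip)


availability_gap, random_gap = simulate_study()
print(f"Gap with availability-based groups: {availability_gap:.2f}")
print(f"Gap with randomly assigned groups:  {random_gap:.2f}")
```

Under these assumptions, the availability-based design shows a gap of roughly 5 points in favor of the music group even though music does nothing, while the randomly assigned groups differ only by chance.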

Trade-off between internal and external validity

There is often a trade-off between internal and external validity in research studies. Studies with high internal validity are often conducted in tightly controlled laboratory settings, which may limit their external validity (generalizability to real-world settings). Conversely, studies with high external validity are often conducted in more natural settings, which may reduce their internal validity due to the presence of confounding variables.

Research example

A study on the effects of a new medication for depression may have high internal validity if it is conducted in a controlled clinical setting with strict inclusion criteria and a placebo control group. However, the same study may have limited external validity because the participants may not be representative of the general population of people with depression, and the controlled setting may not reflect real-world conditions.

Threats to internal validity and how to counter them

There are several common threats to internal validity that researchers must be aware of and take steps to mitigate.

Single-group studies

In single-group studies, participants are exposed to a treatment, and their outcomes are measured before and after the treatment.

Research example (single-group)

A school counselor wants to investigate the effectiveness of a new mindfulness-based intervention on reducing anxiety in high school students. The counselor recruits a group of students who have reported high levels of anxiety and implements the mindfulness program for 6 weeks. All participants complete an anxiety assessment questionnaire before (pre-test) and after the intervention (post-test).

Threat | Meaning | Example
History | An unrelated event influences the outcomes. | During the 6-week intervention, the school announces changes to the grading system, which may impact students’ anxiety levels independently of the mindfulness program.
Maturation | The outcomes of the study vary as a natural result of time. | As the school year progresses, students may naturally develop better coping strategies and time management skills, leading to reduced anxiety levels regardless of the intervention.
Instrumentation | Different measures are used in pre-test and post-test phases. | The pre-test anxiety assessment is administered on paper, while the post-test is completed online, potentially influencing the way students respond to the questions.
Testing | The pre-test influences the outcomes of the post-test. | Participants become more familiar with the anxiety assessment questions, leading to more self-aware responses in the post-test, which may not accurately reflect the impact of the mindfulness intervention.

In this example, the single-group design makes it difficult to attribute any changes in anxiety levels solely to the mindfulness intervention. The threats to internal validity could provide alternative explanations for the observed results. To improve the internal validity of this study, the school counselor could consider adding a control group that does not receive the mindfulness intervention, using consistent measurement tools, and controlling for potential confounding variables.

How to counter threats in single-group studies

To minimize threats to internal validity in single-group studies, researchers can:

  • Use a control group that does not receive the treatment
  • Use multiple pre-tests and post-tests to establish a stable baseline and trend
  • Use reliable and valid measures consistently throughout the study
  • Be aware of and account for potential confounding events or maturation effects

Multi-group studies

In multi-group studies, participants are divided into two or more groups, typically an experimental group that receives a treatment and a control group that does not.

Research example (multi-group)

A psychologist wants to investigate the effectiveness of two different therapy approaches, cognitive-behavioral therapy (CBT) and mindfulness-based stress reduction (MBSR), in reducing symptoms of depression. They recruit participants diagnosed with depression and divide them into three groups based on their age. Group A receives CBT, Group B receives MBSR, and Group C is put on a waiting list as a control group. The participants’ depression levels are measured using a standardized questionnaire before (pre-test) and after the 8-week intervention period (post-test).

Threat | Meaning | Example
Selection bias | Groups are not comparable at the beginning of the study. | Participants in Group A (CBT) are mostly in their 20s and 30s, while those in Group B (MBSR) are mostly in their 40s and 50s. The age difference between the groups may influence the effectiveness of the therapy approaches, making it difficult to attribute any changes in depression levels solely to the interventions.
Regression to the mean | There is a statistical tendency for people who score extremely low or high on a test to score closer to the middle the next time. | Participants with extremely high depression scores are more likely to show improvement in the post-test, regardless of the intervention they receive, due to the natural tendency for scores to regress towards the mean.
Social interaction and social desirability | Participants from different groups may compare notes and either figure out the aim of the study or feel resentful of others or pressured to act/react a certain way. | Participants in the control group (Group C) may feel disappointed about not receiving any treatment and discuss their experiences with those in the intervention groups, potentially influencing their responses in the post-test questionnaire.
Attrition bias | Participants drop out of the study, and those who remain may differ from those who leave. | 25% of participants in Group B (MBSR) drop out of the study before completing the intervention, citing scheduling conflicts. The remaining participants in this group may not be representative of the original sample, making it challenging to compare the effectiveness of MBSR to CBT and the control condition.

In this example, the multi-group study faces several threats to internal validity. The non-random assignment of participants to groups based on age leads to selection bias, while regression to the mean and social interaction effects can influence the observed results. Additionally, the high dropout rate in one of the intervention groups creates attrition bias.

To improve the internal validity of this study, the psychologist should consider randomly assigning participants to the intervention and control groups, using larger sample sizes to minimize the impact of extreme scores and dropouts, and implementing measures to control for potential confounding variables and social interaction effects.
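
Regression to the mean can be illustrated with a short simulation. The sketch below is hypothetical and rests on a deliberately simple assumption: each observed questionnaire score is a stable underlying level plus independent measurement noise, and no treatment is applied between the two measurements. The score scale, the cutoff of 30, and all variable names are invented for illustration.

```python
import random
import statistics

rng = random.Random(42)
N = 5000

# Hypothetical model: observed score = stable level + independent noise.
true_level = [rng.gauss(20, 5) for _ in range(N)]
pre = [level + rng.gauss(0, 5) for level in true_level]
post = [level + rng.gauss(0, 5) for level in true_level]  # no treatment given

# Select only participants with extreme pre-test scores.
cutoff = 30  # hypothetical threshold for "extremely high"
extreme = [i for i in range(N) if pre[i] >= cutoff]

pre_mean = statistics.mean(pre[i] for i in extreme)
post_mean = statistics.mean(post[i] for i in extreme)

print(f"Participants selected: {len(extreme)}")
print(f"Mean pre-test score:   {pre_mean:.1f}")
print(f"Mean post-test score:  {post_mean:.1f}")
```

In this model, the selected participants score noticeably lower at post-test even though nothing was done to them, because part of what made their pre-test scores extreme was noise that does not repeat. A comparable control group would show the same drop, which is how a controlled design separates it from a treatment effect.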

How to counter threats in multi-group studies

In multi-group studies, modifying the experimental design can help mitigate various threats to internal validity.

  • To address selection bias and regression to the mean, researchers should randomly assign participants to different groups. Random assignment makes the groups comparable on average at the beginning of the study, so pre-existing differences and extreme scores are spread evenly across groups rather than concentrated in one.
  • To reduce the influence of social interaction and participants’ expectations, researchers can employ blinding techniques. By keeping participants unaware of the study’s specific aims and hypotheses, researchers can prevent participants from adjusting their behavior or responses based on their understanding of the study’s purpose.
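
As a rough illustration of random assignment, the following Python sketch shuffles a list of participants and deals them into groups in turn. The function name `randomly_assign`, the participant IDs, and the group labels are hypothetical choices made for this example; real studies often use dedicated randomization procedures (for example, stratified or blocked randomization) instead.

```python
import random


def randomly_assign(participants, groups, seed=None):
    """Shuffle participants, then deal them into groups round-robin.

    Group sizes differ by at most one. Pass a seed to make the
    allocation reproducible and auditable.
    """
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the input list is not modified
    rng.shuffle(shuffled)

    assignment = {group: [] for group in groups}
    for index, person in enumerate(shuffled):
        assignment[groups[index % len(groups)]].append(person)
    return assignment


# Hypothetical participant IDs and group labels.
participants = [f"P{n:02d}" for n in range(1, 31)]
assignment = randomly_assign(participants, ["CBT", "MBSR", "Waitlist"], seed=7)

for group, members in assignment.items():
    print(group, len(members), members[:3])
```

Because chance alone decides group membership, characteristics such as age or baseline severity are not systematically tied to any one group, unlike the age-based grouping in the example above.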