The Merits of Externally Invalid Survey Experiments


Gustavo Diaz
McMaster University




“Future research should confirm if our findings generalize…”

  • …with a representative sample
  • …in other countries
  • …beyond the survey setting
  • …when using behavioral outcomes

Usual workflow

  1. Research idea

  2. Realize resource/ethical/practical limitations

  3. Conduct experiment with limitations

  4. Wave hands about external validity


  • Should we ever implement an externally invalid survey experiment on purpose?

  • Identify what makes external invalidity desirable

  • Challenge: Different kinds of external (in)validity

External validity concerns

Type        Concern
Samples     Does this apply to a different population?
Contexts    Does this apply in a different setting?
Treatments  Do they resemble real-world phenomena?
Outcomes    Do they reflect actual behaviors?






Saudi Arabia and Kuwait were selected for their theoretical case value; both are high in gender inegalitarianism, and they offer tough tests. In addition, while these neighboring countries have much in common, both resource-rich and highly conservative, they also differ in important ways. Thus, if similar results are found, the case for generalizability across different interaction types and varying national circumstances will be strengthened.






Invalid     Benefit
Samples     Contour generalizations
Contexts    Contour generalizations
Treatments  Statistical properties
Outcomes    Hypothetical/rare scenarios
  • Endline: Consider merits before implementation

  • What would persuade you to embrace external invalidity?