A common assumption across inferential statistical tests is that your data come from a random sample of your population of interest. In a truly random sample, every subject in the target population has an equal chance of being selected. As an example of violating this assumption, suppose you conduct a study to estimate how much time college students at your university spend working out each week. If you collected data only at the campus gym, you would have a biased sample. That design excludes every student who does not work out at the campus gym, so students at the university did not all have an equal chance of being included in the study.
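The gym example can be illustrated with a small simulation. This is a minimal sketch on made-up data, not an analysis of real students: the population size, the 30% share of gym users, and the ranges of weekly workout hours are all assumptions chosen only to make the bias visible.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 students, roughly 30% of whom use the campus gym.
# Assumed weekly workout hours: 3 to 8 for gym users, 0 to 3 for everyone else.
students = []
for _ in range(10_000):
    uses_gym = rng.random() < 0.30
    hours = rng.uniform(3, 8) if uses_gym else rng.uniform(0, 3)
    students.append((uses_gym, hours))

population_mean = statistics.mean(h for _, h in students)

# Simple random sample: every student has the same chance of being selected.
random_sample = rng.sample(students, 200)
random_mean = statistics.mean(h for _, h in random_sample)

# Biased sample: only students who use the gym can be selected.
gym_users = [s for s in students if s[0]]
gym_sample = rng.sample(gym_users, 200)
gym_mean = statistics.mean(h for _, h in gym_sample)

print(f"population mean:      {population_mean:.2f} hours/week")
print(f"random-sample mean:   {random_mean:.2f} hours/week")
print(f"gym-only sample mean: {gym_mean:.2f} hours/week")
```

Under these assumptions the random-sample estimate lands close to the population mean, while the gym-only estimate overshoots it, because the students who were excluded are exactly the ones who work out least.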
If you suspect your sample was not randomly selected, you can try one of the following:
1. Narrow your target population in your interpretation of the results.
Example: Let’s say you want to estimate the average amount undergraduate students in the US spend on textbooks each semester. However, it is only feasible for you to sample students who attend your university. In this case, you might want to discuss the results of your analysis in terms of the average amount students at your university spend on textbooks each semester, instead of generalizing to the nationwide college student population.
2. Consider which subsets of your population are less likely to be in your sample, and how those individuals might differ from those you have in your dataset. Try to redesign your sampling plan to specifically target those individuals, or narrow your population of interest as described above. Techniques such as stratified random sampling can help overcome these issues.
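As a rough illustration of stratified random sampling, the sketch below splits a population into subgroups (strata) and draws a simple random sample of the same fraction from each one, so that no subgroup is left out by chance or by convenience. The roster, the class-year strata, and the helper name `stratified_sample` are hypothetical and exist only for this example; equal-fraction (proportional) allocation is one of several possible allocation schemes.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(units, stratum_of, fraction, rng):
    """Draw a simple random sample of the same fraction from every stratum."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for label in sorted(strata):
        members = strata[label]
        # Take at least one unit so that small strata are still represented.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical roster of (student_id, class_year) pairs.
sizes = {"first-year": 400, "sophomore": 300, "junior": 200, "senior": 100}
roster = []
for year, size in sizes.items():
    for _ in range(size):
        roster.append((len(roster), year))

sample = stratified_sample(roster, lambda student: student[1], 0.10, rng)
counts = Counter(year for _, year in sample)
print(counts)
```

With a 10% fraction, each class year contributes in proportion to its size (40, 30, 20, and 10 students here), whereas a convenience sample could easily miss a whole class year.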