If you have two numeric variables that are not linearly related, or if one or both of your variables are ordinal variables, you can still measure the strength and direction of their relationship using a non-parametric correlation statistic. The most common of these is the Spearman rank correlation coefficient, ρ, which considers the ranks of the values for the two variables. For example, consider the lengths and weights of a sample of five kittens:
Kitten | Length (cm) | Weight (g) |
1 | 7.8 | 245 |
2 | 8.2 | 321 |
3 | 7.5 | 260 |
4 | 9.0 | 405 |
5 | 8.1 | 272 |
Ranking each variable from smallest (1) to largest (5) gives the following table:
Kitten | Length Rank | Weight Rank |
1 | 2 | 1 |
2 | 4 | 4 |
3 | 1 | 2 |
4 | 5 | 5 |
5 | 3 | 3 |
Spearman’s correlation is equivalent to calculating the Pearson correlation coefficient on the ranked data. So ρ will always be a value between -1 and 1. The further away ρ is from zero, the stronger the relationship between the two variables. The sign of ρ corresponds to the direction of the relationship. If it is positive, then as one variable increases, the other tends to increase. If it is negative, then as one variable increases, the other tends to decrease.
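To make the "Pearson on the ranks" idea concrete, here is a minimal Python sketch that ranks the kitten data and then computes the Pearson correlation of the ranks. The helper names `ranks` and `pearson` are illustrative, not from any library, and the ranking helper assumes there are no tied values (true for this sample).

```python
from math import sqrt

def ranks(values):
    """Rank values from smallest (1) to largest; assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Kitten data from the first table
length_cm = [7.8, 8.2, 7.5, 9.0, 8.1]
weight_g = [245, 321, 260, 405, 272]

length_rank = ranks(length_cm)  # [2, 4, 1, 5, 3], matching the rank table
weight_rank = ranks(weight_g)   # [1, 4, 2, 5, 3]

# Spearman's rho is the Pearson correlation of the two rank columns
rho = pearson(length_rank, weight_rank)
print(round(rho, 3))  # 0.9
```

For the kittens, ρ = 0.9, a strong positive relationship: longer kittens tend to be heavier.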
You might want to use Spearman’s correlation if your data have a non-linear relationship (like an exponential relationship) or contain one or more outliers. However, Spearman’s correlation is only appropriate if the relationship between your variables is monotonic, meaning that as one variable increases, the other tends to either increase or decrease (not both).
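The following Python sketch illustrates why monotonicity matters, using two small made-up datasets (they are assumptions for illustration, not data from this page). One is exponential, so it is monotonic but not linear. The other is U-shaped, so it is not monotonic. The helper names are hypothetical; `average_ranks` gives tied values the average of the ranks they share.

```python
from math import sqrt

def average_ranks(values):
    """Rank from smallest (1) to largest; tied values share their average rank."""
    ordered = sorted(values)
    # A value first appearing at 0-based index i with c copies occupies
    # ranks i+1 .. i+c, whose average is i + (c + 1) / 2.
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the (average) ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up exponential data: monotonic, but not linear
x = [1, 2, 3, 4, 5, 6]
exponential = [2 ** v for v in x]
print(round(pearson(x, exponential), 2))   # about 0.91: Pearson understates it
print(round(spearman(x, exponential), 2))  # 1.0: the ranks agree perfectly

# Made-up U-shaped data: not monotonic
u_x = [-2, -1, 0, 1, 2]
u_shaped = [v ** 2 for v in u_x]
print(round(abs(spearman(u_x, u_shaped)), 2))  # 0.0: rho misses the relationship
```

In the U-shaped case the two variables are clearly related, yet ρ is zero, which is why a scatterplot check for monotonicity should come before interpreting ρ.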
Inference
You can determine if ρ is significantly different from zero by running a Pearson correlation t-test on the ranks of the two variables.
Assumptions:
- Random samples
- Independent observations
- The relationship between the two variables is monotonic (assessed visually with a scatterplot).
Hypotheses:
H0: The ranks of the two variables are not linearly related (ρ = 0).
HA: The ranks of the two variables are linearly related (ρ ≠ 0).
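As a rough sketch of the test statistic behind this procedure, the Python snippet below uses the standard Pearson correlation t statistic, t = ρ·sqrt((n − 2) / (1 − ρ²)), with n − 2 degrees of freedom. This formula is not spelled out on this page and is stated here as an assumption about what the test computes. The function name `spearman_t_statistic` is hypothetical. The example uses the five kittens, whose rank differences have squares summing to 2.

```python
from math import sqrt

def spearman_t_statistic(rho, n):
    """Return (t, df) for testing rho = 0; requires n > 2 and |rho| < 1."""
    df = n - 2
    t = rho * sqrt(df / (1 - rho ** 2))
    return t, df

# Kitten example: rank differences are 1, 0, -1, 0, 0, so sum of d^2 is 2
n = 5
sum_d_squared = 2

# Shortcut formula for rho, valid only when there are no tied ranks
rho = 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

t, df = spearman_t_statistic(rho, n)
print(round(rho, 2), round(t, 2), df)  # 0.9 3.58 3
```

The p-value is then read from a t distribution with `df` degrees of freedom, using a table or statistical software. With only five observations this approximation is rough, so treat the kitten result as an illustration of the arithmetic rather than a reliable test.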
Example: Performing analysis in R
The following video investigates the relationship between age of shelter animals and the number of days they wait until they are adopted.
Dataset used in video
R script file used in video
Sample conclusion: We have no evidence to suggest that the age of shelter animals is related to how long they spend in the shelter (ρ = -0.18, p > .05).