Statistical measures of the relationship between two variables

Dr. Anthony Picciano - Education Research Methods

Correlation between two variables indicates that a relationship exists between them. Graphs and the relevant statistical measures often work better in tandem. Suppose there are two variables, A and B, and the question is whether A has an effect on B or B has an effect on A. No statistical analysis, by itself, will demonstrate cause and effect. Correlation techniques assume variables measured on a continuous scale, i.e. at least an interval scale. Even strongly related variables do not move in lockstep: people of the same height vary in weight. The Survey System's optional Statistics Module includes the most common type of correlation, and it can also be used to look at the relationship between two variables while removing the effect of other variables.

This is especially true if you labeled the mid-points of your scale: you cannot assume "good" is exactly halfway between "excellent" and "fair".

Most statisticians say you cannot use correlations with rating scales, because the mathematics of the technique assumes the differences between numbers are exactly equal. Nevertheless, many survey researchers do use correlations with rating scales, because the results usually reflect the real world. Our own position is that you can use correlations with rating scales, but you should do so with care.

When working with quantities, correlations provide precise measurements. When working with rating scales, correlations provide general indications.

Correlation Coefficient

The main result of a correlation is called the correlation coefficient, or "r". It ranges from -1.0 to +1.0. If r is close to 0, it means there is little or no linear relationship between the variables. If r is positive, it means that as one variable gets larger, the other gets larger. If r is negative, it means that as one gets larger, the other gets smaller (often called an "inverse" correlation).

The square of the coefficient (r squared) is equal to the proportion of the variation in one variable that is related to the variation in the other. To read it as a percentage, square r and ignore the decimal point (that is, multiply by 100).
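The calculation of r and r squared can be sketched in a few lines of plain Python. This is a minimal illustration, not the routine used by any particular survey package; the function name `pearson_r` and the hours/scores data are hypothetical and chosen only for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: hours studied and test score for five people.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]

r = pearson_r(hours, scores)
r_squared = r ** 2
print(f"r = {r:.3f}, r squared = {r_squared:.3f}")
print(f"About {r_squared * 100:.0f}% of the variation in one variable is related to the other.")
```

With real survey data the same function applies unchanged; only the two input lists differ.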

For example, an r value of 0.7 means that 49% of the variation is related (0.7 squared is 0.49). A correlation report can also show a second result of each test: statistical significance.

Australian Bureau of Statistics

In this case, the significance level will tell you how likely it is that the correlations reported may be due to chance in the form of random sampling error. If you are working with small sample sizes, choose a report format that includes the significance level. This format also reports the sample size. A key thing to remember when working with correlations is never to assume a correlation means that a change in one variable causes a change in another.

Sales of personal computers and athletic shoes have both risen strongly in the last several years and there is a high correlation between them, but you cannot assume that buying computers causes people to buy athletic shoes or vice versa. The second caveat is that the Pearson correlation technique works best with linear relationships. It does not work well with curvilinear relationships, in which the relationship does not follow a straight line. An example of a curvilinear relationship is age and health care.

They are related, but the relationship doesn't follow a straight line. Each of these two characteristic variables is measured on a continuous scale.
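The caveat about curvilinear relationships can be seen numerically. The sketch below uses made-up U-shaped data in which y is completely determined by x, yet the Pearson coefficient over the full range is zero because the falling and rising halves cancel. The helper `pearson_r` and the data are illustrative assumptions, not taken from the sources quoted here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical U-shaped data: y depends entirely on x, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_full_range = pearson_r(xs, ys)          # both halves of the U together
r_rising_half = pearson_r(xs[3:], ys[3:])  # only the part where y rises with x

print(f"r over the full range:  {r_full_range:.3f}")
print(f"r over the rising half: {r_rising_half:.3f}")
```

A low r therefore does not rule out a relationship; plotting the data first is a quick way to check whether a straight line is a reasonable description.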

Negative values simply indicate the direction of the association, whereby as one variable increases, the other decreases. The significance of an association is a separate analysis of the sample correlation coefficient, r, using a t-test to measure the difference between the observed r and the expected r under the null hypothesis.

Spearman rank-order correlation coefficient

The Spearman rank-order correlation coefficient (Spearman rho) is designed to measure the strength of a monotonic (in a constant direction) association between two variables measured on an ordinal or ranked scale.
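Before moving on to ranked data, the t-test for the significance of Pearson's r described above can be sketched as follows. The sketch assumes the usual null hypothesis that the expected r is 0, in which case the test statistic is r times the square root of (n - 2) / (1 - r squared), with n - 2 degrees of freedom. The function name and the example values of r and n are hypothetical.

```python
import math

def t_statistic_for_r(r, n):
    """t statistic and degrees of freedom for testing r against a null of 0."""
    if n < 3:
        raise ValueError("need at least 3 paired observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    dof = n - 2
    t = r * math.sqrt(dof / (1.0 - r ** 2))
    return t, dof

# Hypothetical result: r = 0.5 observed in a sample of 27 pairs.
t, dof = t_statistic_for_r(0.5, 27)
print(f"t = {t:.3f} with {dof} degrees of freedom")
```

The resulting t is then compared with a t distribution with the stated degrees of freedom (from a table or a statistics library) to obtain the significance level. The same r gives a larger t as the sample size grows, which is why small samples call for particular care.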

Statistical Language - Correlation and Causation

Data that result from ranking, and data collected on a scale that is not truly interval in nature, are suited to analysis with the Spearman rho. In addition, any interval data may be transformed to ranks and analyzed with the Spearman rho, although this results in a loss of information.

Nonetheless, this approach may be used, for example, if one variable of interest is measured on an interval scale and the other is measured on an ordinal scale. A similar measure of strength of association is the Kendall tau, which also may be applied to measure the strength of a monotonic association between two variables measured on an ordinal or rank scale.

As an example of when Spearman rho would be appropriate, consider the case where there are seven substantial health threats to a community. Health officials wish to determine a hierarchy of threats in order to most efficiently deploy their resources. They ask two credible epidemiologists to rank the seven threats from 1 to 7, where 1 is the most significant threat.
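A minimal sketch of the calculation for such a pair of rankings follows. It uses the shortcut formula for untied ranks, rho = 1 - 6 * (sum of squared rank differences) / (n * (n squared - 1)). The two sets of ranks are invented for illustration and the function name is hypothetical.

```python
def spearman_rho_from_ranks(ranks_a, ranks_b):
    """Spearman rho for two rankings of the same n items, assuming no ties."""
    n = len(ranks_a)
    expected = list(range(1, n + 1))
    if sorted(ranks_a) != expected or sorted(ranks_b) != expected:
        raise ValueError("each ranking must use every rank from 1 to n exactly once")
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

# Hypothetical ranks given by two epidemiologists to the same seven threats
# (position i in each list refers to threat i; 1 = most significant).
epidemiologist_1 = [1, 2, 3, 4, 5, 6, 7]
epidemiologist_2 = [2, 1, 4, 3, 5, 7, 6]

rho = spearman_rho_from_ranks(epidemiologist_1, epidemiologist_2)
print(f"Spearman rho = {rho:.3f}")
```

If ties were allowed (for example, two threats judged equally serious), the shortcut formula would no longer be exact; a common alternative is to compute the Pearson correlation on the ranks.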

Correlation and Regression

If there is a significant association between the two sets of ranks, health officials may feel more confident in their strategy than if a significant association is not evident.

Chi-square test

The chi-square test for association (contingency) is a standard measure for association between two categorical variables. A simple and generic example follows.
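Computationally, the test compares the counts observed in each combination of categories with the counts that would be expected if the two variables were unrelated. The sketch below shows that calculation for a table of counts; the function name and the counts are hypothetical, and no continuity correction is applied.

```python
def chi_square_statistic(table):
    """Chi-square statistic and degrees of freedom for a table of counts.

    `table` is a list of rows, each row a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

# Hypothetical 2x2 table: rows are one categorical variable, columns the other.
observed_counts = [
    [30, 20],
    [20, 30],
]
chi2, dof = chi_square_statistic(observed_counts)

# 3.841 is the standard critical value for 1 degree of freedom at the 0.05 level.
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841
print(f"chi-square = {chi2:.2f} with {dof} degree(s) of freedom")
print("significant at 0.05" if chi2 > CRITICAL_VALUE_DF1_ALPHA_05 else "not significant at 0.05")
```

For tables with more rows or columns the degrees of freedom change, so a different critical value applies.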

If scientists were studying the relationship between gender and political party, then they could count people from a random sample belonging to the various combinations of gender and party. The scientists could then perform a chi-square test to determine whether there was a significant disproportionate membership among those groups, indicating an association between gender and political party.

Relative risk and odds ratio

Specifically in epidemiology, several other measures of association between categorical variables are used, including relative risk and odds ratio.

Relative risk is appropriately applied to categorical data derived from an epidemiologic cohort study. It measures the strength of an association by considering the incidence of an event in an identifiable group (numerator) and comparing that with the incidence in a baseline group (denominator). A relative risk of 1 indicates no association, whereas a relative risk other than 1 indicates an association. As an example, suppose that 10 out of 1,000 people exposed to a factor X developed liver cancer, while only 2 out of 1,000 people who were never exposed to X developed liver cancer.

Thus, the strength of the association is 5, or, interpreted another way, people exposed to X are five times more likely to develop liver cancer than people not exposed to X. If the relative risk were less than 1, the association would run the other way: people exposed to X would be less likely to develop liver cancer than people not exposed. The categorical variables are exposure to X (yes or no) and the outcome of liver cancer (yes or no).
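The arithmetic behind the liver cancer example can be sketched as follows. The sketch assumes two groups of 1,000 people each, which is consistent with the relative risk of 5 stated above; the function names are hypothetical. The odds ratio, mentioned earlier but not worked through in the text, is computed from the same counts for comparison.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Incidence in the exposed group divided by incidence in the baseline group."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed

def odds_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Odds of the outcome among the exposed divided by odds among the unexposed."""
    odds_exposed = exposed_cases / (exposed_total - exposed_cases)
    odds_unexposed = unexposed_cases / (unexposed_total - unexposed_cases)
    return odds_exposed / odds_unexposed

# Assumed group sizes: 1,000 exposed and 1,000 unexposed people.
rr = relative_risk(10, 1000, 2, 1000)
oratio = odds_ratio(10, 1000, 2, 1000)

print(f"relative risk = {rr:.2f}")
print(f"odds ratio    = {oratio:.2f}")
```

When the outcome is rare, as here, the odds ratio comes out close to the relative risk; the two diverge as the outcome becomes more common.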