What is the coefficient of determination?

The coefficient of determination (R²) measures how well a statistical model predicts an outcome. The outcome is represented by the model’s dependent variable.

For a least-squares linear regression that includes an intercept, the lowest possible value of R² is 0 and the highest possible value is 1.

More technically, R² is a measure of goodness of fit. It is the proportion of variance in the dependent variable that is explained by the model.

You can see in the first dataset that when the R² is high, the observations are close to the model’s predictions. In other words, most points are close to the line of best fit.

Note: The coefficient of determination is never negative when it is calculated as the square of the correlation, even when the correlation itself is negative, because squaring removes the sign.

In contrast, you can see in the second dataset that when the R² is low, the observations are far from the model’s predictions. In other words, many points are far from the line of best fit.

Calculating the coefficient of determination

You can choose between two formulas to calculate the coefficient of determination (R²) of a simple linear regression. The first formula is specific to simple linear regressions, and the second formula can be used to calculate the R² of many types of statistical models.

Formula 1: Using the correlation coefficient

R² = r²

Where r = Pearson correlation coefficient

Example: Calculating R² using the correlation coefficient
You are studying the relationship between heart rate and age in children, and you find that the two variables have a negative Pearson correlation. This value of r can be used to calculate the coefficient of determination (R²) using Formula 1.

Formula 2: Using the regression outputs

R² = 1 − RSS / TSS

Where RSS = sum of squared residuals and TSS = total sum of squares

Example: Calculating R² using regression outputs
As part of performing a simple linear regression that predicts students’ exam scores (dependent variable) from their study time (independent variable), you obtain the regression outputs that Formula 2 requires. These values can be used to calculate the coefficient of determination (R²) using Formula 2.

Interpreting the coefficient of determination

You can interpret the coefficient of determination (R²) as the proportion of variance in the dependent variable that is predicted by the statistical model.

Another way of thinking of it is that the R² is the proportion of variance that is shared between the independent and dependent variables.

You can also say that the R² is the proportion of variance “explained” or “accounted for” by the model. The proportion that remains (1 − R²) is the variance that is not predicted by the model.

If you prefer, you can write the R² as a percentage instead of a proportion.
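The two ways of calculating R² described in this article can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study-time and exam-score numbers are made up for the example, the function names are hypothetical, and Formula 2 is taken in its standard form, 1 minus the sum of squared residuals divided by the total sum of squares. For a simple linear regression fitted by least squares, both approaches should give the same result.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares slope and intercept of a simple linear regression."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def r_squared_from_correlation(r):
    """Formula 1: square the correlation (simple linear regression only)."""
    return r ** 2


def r_squared_from_sums(observed, predicted):
    """Formula 2 (standard form): 1 - RSS / TSS."""
    y_bar = mean(observed)
    rss = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    tss = sum((y - y_bar) ** 2 for y in observed)
    return 1 - rss / tss


# Made-up illustration data: hours studied and exam scores.
study_hours = [1, 2, 3, 4, 5]
exam_scores = [52, 60, 61, 70, 77]

slope, intercept = fit_line(study_hours, exam_scores)
predictions = [intercept + slope * h for h in study_hours]

r2_formula1 = r_squared_from_correlation(pearson_r(study_hours, exam_scores))
r2_formula2 = r_squared_from_sums(exam_scores, predictions)

print(f"Formula 1: {r2_formula1:.3f}")
print(f"Formula 2: {r2_formula2:.3f}")
print(f"As a percentage: {r2_formula2 * 100:.1f}%")
```

With this made-up data, both formulas give the same R², which can be reported either as a proportion or as a percentage. Note that `r_squared_from_correlation` returns a non-negative value even when the correlation passed in is negative.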