What is the general form of the least-squares regression line equation?
$\hat{y} = a + bx$, where $\hat{y}$ is the predicted value, $a$ is the y-intercept, and $b$ is the slope.
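As a minimal sketch of where $a$ and $b$ come from, the code below assumes the usual least-squares formulas $b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$ and $a = \bar{y} - b\bar{x}$. The function name `fit_line` and the five data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return a, b

# Hypothetical sample data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)
print(f"y-hat = {a:.2f} + {b:.2f}x")  # y-hat = 2.20 + 0.60x
```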
How do you calculate the correlation coefficient (r) from R-squared?
$r = \pm \sqrt{R^2}$. The sign of $r$ matches the sign of the slope: positive for an upward trend, negative for a downward trend.
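A short sketch of that sign rule, assuming a regression with a single explanatory variable. The helper name `r_from_r_squared` and the input values are hypothetical.

```python
import math

def r_from_r_squared(r_squared, slope):
    """Recover r from R^2, taking the sign from the slope."""
    return math.copysign(math.sqrt(r_squared), slope)

print(round(r_from_r_squared(0.6, 0.6), 4))    # 0.7746
print(round(r_from_r_squared(0.49, -1.3), 4))  # -0.7
```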
What is the formula for the test statistic (t) in a t-test for the slope?
$t = \frac{b - 0}{SE_b}$, where $b$ is the sample slope, 0 is the slope assumed under the null hypothesis, and $SE_b$ is the standard error of the slope.
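The sketch below computes the statistic from raw data. It assumes $SE_b = s / \sqrt{\sum (x - \bar{x})^2}$ with $s = \sqrt{\sum \text{residual}^2 / (n - 2)}$, which these cards do not state. The function name `slope_t_statistic` and the data are made up.

```python
import math

def slope_t_statistic(xs, ys):
    """Return (b, se_b, t) for testing H0: true slope = 0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
    a = y_bar - b * x_bar
    # Sum of squared residuals around the fitted line
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))   # standard deviation of the residuals
    se_b = s / math.sqrt(s_xx)     # standard error of the slope
    t = (b - 0) / se_b
    return b, se_b, t

# Hypothetical sample data
b, se_b, t = slope_t_statistic([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"b = {b:.3f}, SE_b = {se_b:.3f}, t = {t:.3f}")
# b = 0.600, SE_b = 0.283, t = 2.121
```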
What is the formula for the confidence interval for the slope (b)?
$b \pm t^* SE_b$, where $b$ is the sample slope, $t^*$ is the critical t-value for the chosen confidence level, and $SE_b$ is the standard error of the slope.
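A minimal sketch of the interval calculation. The function name `slope_confidence_interval` and the input values are illustrative: $b = 0.6$ and $SE_b = 0.2828$ come from a made-up sample of 5 points, and $t^* = 3.182$ is the table value for 95% confidence with 3 degrees of freedom.

```python
def slope_confidence_interval(b, se_b, t_star):
    """Return (low, high) for the interval b +/- t* x SE_b."""
    margin = t_star * se_b
    return b - margin, b + margin

# Hypothetical inputs: n = 5 points, so df = 5 - 2 = 3 and t* = 3.182 at 95%
low, high = slope_confidence_interval(b=0.6, se_b=0.2828, t_star=3.182)
print(f"({low:.3f}, {high:.3f})")  # (-0.300, 1.500)
```

Because this interval contains 0, the same data would not give a significant slope at the 5% level.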
How do you calculate the degrees of freedom (df) for t-tests and t-intervals for slopes?
$df = n - 2$, where n is the number of data points.
Explain the concept of standard error of the slope.
The standard error of the slope ($SE_b$) estimates how much the sample slope would typically vary from sample to sample around the true population slope. A smaller $SE_b$ indicates a more precise estimate of the slope.
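For reference, one common way to write the formula in simple linear regression is shown here. The symbols $s$ (standard deviation of the residuals) and $s_x$ (standard deviation of the x-values) are not defined elsewhere in these cards and are introduced only for this formula.

$$SE_b = \frac{s}{s_x \sqrt{n - 1}}, \qquad s = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}}$$

Larger samples and more spread in the x-values both shrink $SE_b$, while more scatter around the line (larger $s$) inflates it.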
Explain the importance of checking conditions (LINE) before performing inference for linear regression.
Checking the conditions (Linearity of the relationship, Independence of observations, Normality of the residuals, Equal variance of the residuals across x) ensures the validity of the inference procedures. If the conditions are not met, the results of the t-test or t-interval may be unreliable.
Explain why a statistically significant slope does not necessarily imply causation.
Correlation does not equal causation. A significant slope indicates a linear association, but other factors (lurking variables, confounding variables) may be influencing the response variable.
Explain the meaning of R-squared ($R^2$) in linear regression.
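$R^2$ represents the proportion of the variation in the response variable that is explained by the linear relationship with the explanatory variable. A higher $R^2$ means the line accounts for more of that variation, although it does not by itself confirm that a linear model is appropriate.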
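As a sketch, $R^2$ can be computed as $1 - \frac{SSE}{SST}$, where SSE is the sum of squared residuals and SST is the total sum of squared deviations of $y$ from its mean. The function name `r_squared` and the numbers are made up for illustration.

```python
def r_squared(ys, predictions):
    """Proportion of variation in ys explained by the predictions."""
    y_bar = sum(ys) / len(ys)
    sse = sum((y - p) ** 2 for y, p in zip(ys, predictions))  # unexplained
    sst = sum((y - y_bar) ** 2 for y in ys)                   # total
    return 1 - sse / sst

# Hypothetical data; predictions come from y-hat = 2.2 + 0.6x at x = 1..5
ys = [2, 4, 5, 4, 5]
predictions = [2.8, 3.4, 4.0, 4.6, 5.2]
print(round(r_squared(ys, predictions), 3))  # 0.6
```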
Explain the concept of residuals in linear regression.
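Residuals are the differences between the observed values and the values predicted by the regression line: residual $= y - \hat{y}$. They represent the error in the model's predictions; a positive residual means the line underpredicted, a negative one means it overpredicted.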
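A brief sketch of computing residuals for a fitted line. The function name `residuals`, the data, and the coefficients $a = 2.2$, $b = 0.6$ are hypothetical.

```python
def residuals(xs, ys, a, b):
    """Observed y minus predicted y-hat = a + b*x, for each point."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Hypothetical data and its least-squares line y-hat = 2.2 + 0.6x
res = residuals([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], a=2.2, b=0.6)
print([round(e, 2) for e in res])  # [-0.8, 0.6, 1.0, -0.6, -0.2]
```

For a least-squares line the residuals sum to zero (up to rounding), which is a quick sanity check on a fit.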
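What is the definition of an explanatory variable?
The explanatory variable (independent variable) is plotted on the x-axis and is used to explain or predict the patterns seen in a scatterplot. Loosely, it plays the role of the 'cause', although an association alone does not establish causation.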
What is the definition of a response variable?
The response variable (dependent variable) is plotted on the y-axis and responds to changes in the explanatory variable. Loosely, it plays the role of the 'effect'.
What is the definition of inference?
Inference uses sample data to estimate a population parameter or test claims about it, moving from describing data to drawing conclusions about the population.
What is a t-interval for slopes?
A confidence interval used to estimate the true slope of the population regression line, providing a range of plausible values.
What is a t-test for a slope?
A hypothesis test used to determine if there is a significant linear relationship between two variables by testing if the slope is significantly different from zero.