What is .hat in a regression output?

https://stats.stackexchange.com/a/256364/154908

Q. The augment() function in the broom package for R creates a data frame of predicted values from a regression model. The columns it creates include the fitted values, the standard error of the fit, and Cook's distance. They also include a column I'm not familiar with: .hat.

library(broom)
data(mtcars)

m1 <- lm(mpg ~ wt, data = mtcars)

head(augment(m1))
#> # A tibble: 6 x 10
#>   .rownames    mpg    wt .fitted .se.fit .resid   .hat .sigma .cooksd .std.resid
#>   <chr>      <dbl> <dbl>   <dbl>   <dbl>  <dbl>  <dbl>  <dbl>   <dbl>      <dbl>
#> 1 Mazda RX4   21    2.62    23.3   0.634 -2.28  0.0433   3.07 1.33e-2    -0.766 
#> 2 Mazda RX4…  21    2.88    21.9   0.571 -0.920 0.0352   3.09 1.72e-3    -0.307 
#> 3 Datsun 710  22.8  2.32    24.9   0.736 -2.09  0.0584   3.07 1.54e-2    -0.706 
#> 4 Hornet 4 …  21.4  3.22    20.1   0.538  1.30  0.0313   3.09 3.02e-3     0.433 
#> 5 Hornet Sp…  18.7  3.44    18.9   0.553 -0.200 0.0329   3.10 7.60e-5    -0.0668
#> 6 Valiant     18.1  3.46    18.8   0.555 -0.693 0.0332   3.10 9.21e-4    -0.231
# .hat vector
augment(m1)$.hat
#>  [1] 0.0433 0.0352 0.0584 0.0313 0.0329 0.0332 0.0354 0.0313 0.0314 0.0329
#> [11] 0.0329 0.0558 0.0401 0.0419 0.1705 0.1953 0.1838 0.0661 0.1177 0.0956
#> [21] 0.0503 0.0343 0.0328 0.0443 0.0445 0.0866 0.0704 0.1291 0.0313 0.0380
#> [31] 0.0354 0.0377

Can anyone explain what this value is, and is it different between linear regression and logistic regression?

A. Those are the diagonal elements of the hat matrix, which describe the leverage each observation has on its own fitted value.
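
As a quick cross-check (a minimal sketch, reusing the m1 fit from the question), base R computes exactly these leverages via hatvalues(), so the .hat column can be verified directly:

# base R returns the same leverages that augment() stores in .hat
h_base <- hatvalues(m1)                  # equivalently: lm.influence(m1)$hat
all.equal(unname(h_base), augment(m1)$.hat)
# should be TRUE (up to floating-point tolerance)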

If one fits:

\[\vec{Y} = \mathbf{X} \vec {\beta} + \vec {\epsilon}\]

then:

\[\mathbf{H} = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\]

is the matrix that maps the observations onto the fitted values, \(\hat{\vec{Y}} = \mathbf{H}\vec{Y}\); it "puts the hat on" \(\vec{Y}\), which is where the .hat column gets its name. In this example:

\[ \begin{pmatrix}Y_1\\ \vdots\\ Y_{32}\end{pmatrix} = \begin{pmatrix} 1 & 2.620\\ \vdots & \vdots\\ 1 & 2.780 \end{pmatrix} \cdot \begin{pmatrix} \beta_0\\ \beta_1 \end{pmatrix} + \begin{pmatrix}\epsilon_1\\ \vdots\\ \epsilon_{32}\end{pmatrix} \]
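
As an aside (a small sketch using the same m1 fit), this design matrix does not have to be typed out by hand; R stores the formula and model.matrix() rebuilds it:

# the design matrix R used for m1: an intercept column and wt
X_fit <- model.matrix(m1)
head(X_fit)        # rows start 1, 2.620; 1, 2.875; ...
dim(X_fit)         # 32 x 2, matching the matrix written out above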

This \(\mathbf{H}\) matrix can then be calculated directly in R:

library(MASS)   # for ginv(), the Moore-Penrose generalized inverse

# the predictor: column 6 of mtcars is wt
wt <- mtcars[, 6]

# design matrix: a column of ones for the intercept plus wt
X <- matrix(cbind(rep(1, length(wt)), wt), ncol = 2)

# hat matrix H = X (X'X)^{-1} X'
H <- X %*% ginv(t(X) %*% X) %*% t(X)

This \(\mathbf{H}\) is a \(32 \times 32\) matrix, and the hat values reported by augment() sit on its diagonal.
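
Two sanity checks worth running on \(\mathbf{H}\) (an aside, reusing the H computed above): the hat matrix is symmetric and idempotent, and its trace equals the number of estimated coefficients, so these 32 leverages must sum to 2 and average \(2/32\).

# H is symmetric and idempotent (up to numerical error)
all.equal(H, t(H))
all.equal(H %*% H, H)

# trace(H) = number of coefficients (intercept + slope = 2),
# so the average leverage is p/n = 2/32
sum(diag(H))
mean(diag(H))

The dimensions of each sub-expression also work out as expected: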

X                           32x2
t(X)                        2x32
X %*% t(X)                  32x32
t(X) %*% X                  2x2
ginv(t(X) %*% X)            2x2
ginv(t(X) %*% X) %*% t(X)   2x32
X %*% ginv(t(X) %*% X)      32x2
dim(ginv(t(X) %*% X) %*% t(X))
#> [1]  2 32
x1 <- X %*% ginv(t(X) %*% X)
dim(x1)
#> [1] 32  2
dim(x1 %*% t(X))
#> [1] 32 32
x2 <- ginv(t(X) %*% X) %*% t(X)
dim(x2)
#> [1]  2 32
dim(X %*% x2)
#> [1] 32 32
# the diagonal of this 32×32 matrix holds the hat values reported by augment()
diag(H)
#>  [1] 0.0433 0.0352 0.0584 0.0313 0.0329 0.0332 0.0354 0.0313 0.0314 0.0329
#> [11] 0.0329 0.0558 0.0401 0.0419 0.1705 0.1953 0.1838 0.0661 0.1177 0.0956
#> [21] 0.0503 0.0343 0.0328 0.0443 0.0445 0.0866 0.0704 0.1291 0.0313 0.0380
#> [31] 0.0354 0.0377
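
As for the second half of the question: the idea carries over to logistic regression (and GLMs generally), except that the hat matrix is built from the weighted design matrix, \(\mathbf{H} = \mathbf{W}^{1/2}\mathbf{X}(\mathbf{X}^T\mathbf{W}\mathbf{X})^{-1}\mathbf{X}^T\mathbf{W}^{1/2}\), where \(\mathbf{W}\) is the diagonal matrix of working weights from the final IRLS iteration, so the leverages now depend on the fitted values as well as on the predictors. A sketch of that check (the model am ~ wt is just an illustrative choice, not part of the original post):

# logistic regression: leverages come from the weighted hat matrix
m2 <- glm(am ~ wt, data = mtcars, family = binomial)

Xg <- model.matrix(m2)              # 32 x 2 design matrix
W  <- diag(m2$weights)              # working weights at convergence
Hg <- sqrt(W) %*% Xg %*% solve(t(Xg) %*% W %*% Xg) %*% t(Xg) %*% sqrt(W)

# the diagonal should match hatvalues(m2) up to numerical tolerance,
# which is also what broom's augment() reports in .hat for a glm
all.equal(unname(diag(Hg)), unname(hatvalues(m2)))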