# Linear Regression (1): An Introduction to the Theory of Multiple Linear Regression

## Multiple Linear Regression Theory

### Starting from Simple Linear Regression

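| Sample | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Y | 6.7 | 7.2 | 10.3 | 12.4 | 15.1 | 17.6 | 19.4 | 21.3 | 24.0 |
| X | 343.4 | 477.6 | 739.1 | 1373.9 | 1510.2 | 1700.6 | 2026.6 | 2577.4 | 3496.2 |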

#### The Fitting Criterion

$$\begin{array}{l} \frac{{\partial L}}{{\partial {\beta _0}}} = 0\\ \frac{{\partial L}}{{\partial {\beta _1}}} = 0 \end{array}$$
$$\begin{array}{l} ①\frac{{\partial L}}{{\partial {\beta _0}}} = \sum { - 2[{y_i} - ({\beta _0} + {\beta _1}{x_i})]} = - 2n[\bar y - ({\beta _0} + {\beta _1}\bar x)] = 0\\ ②\frac{{\partial L}}{{\partial {\beta _1}}} = \sum { - 2{x_i}[{y_i} - ({\beta _0} + {\beta _1}{x_i})]} = - 2\sum {\left( {{x_i}{y_i} - {\beta _0}{x_i} - {\beta _1}x_i^2} \right)} = 0 \end{array}$$
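Solving conditions ① and ② for the two unknowns gives a closed form for the coefficients. The sketch below is a minimal illustration, assuming $L$ is the sum of squared residuals (as its derivatives imply) and using the sample data from the table above; the function name `fit_simple` and the variable names are illustrative, not from the original.

```python
# Sample data from the table above (Y: response, X: predictor).
Y = [6.7, 7.2, 10.3, 12.4, 15.1, 17.6, 19.4, 21.3, 24.0]
X = [343.4, 477.6, 739.1, 1373.9, 1510.2, 1700.6, 2026.6, 2577.4, 3496.2]


def fit_simple(x, y):
    """Solve the two normal equations for (beta0, beta1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Substituting beta0 = y_bar - beta1 * x_bar (from condition 1)
    # into condition 2 and solving for beta1:
    sxy = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar
    sxx = sum(xi * xi for xi in x) - n * x_bar ** 2
    beta1 = sxy / sxx
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1


beta0, beta1 = fit_simple(X, Y)

# Both partial derivatives should vanish (up to rounding) at the solution.
grad0 = sum(-2 * (yi - (beta0 + beta1 * xi)) for xi, yi in zip(X, Y))
grad1 = sum(-2 * xi * (yi - (beta0 + beta1 * xi)) for xi, yi in zip(X, Y))

print("beta0 =", beta0, "beta1 =", beta1)
print("dL/dbeta0 =", grad0, "dL/dbeta1 =", grad1)
```

Evaluating the two gradients at the fitted coefficients is a quick check that the closed form really satisfies both conditions.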

$$r = \frac{{Cov(X,Y)}}{{\sqrt {{\mathop{\rm var}} (X){\mathop{\rm var}} (Y)} }}$$


$$\begin{array}{l} E((X - EX)(Y - EY))\\ = E(XY - YEX - XEY + EXEY)\\ \because E(YEX) = E(XEY) = EXEY\\ \therefore \text{original expression} = E(XY) - EXEY\\ = \frac{1}{n}\sum {{x_i}{y_i}} - \bar x\bar y \end{array}$$

$$\begin{array}{l} E({(X - EX)^2})\\ = E({X^2} - 2XEX + {(EX)^2})\\ \because E(XEX) = {(EX)^2}\\ \therefore \text{original expression} = E({X^2}) - {(EX)^2}\\ = \frac{1}{n}\sum {x_i^2} - {{\bar x}^2} \end{array}$$
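The two identities above can be checked numerically, and together they give the correlation coefficient $r$. The following is a small sketch on the sample data from the beginning of this section; the helper names (`mean`, `cov_definition`, `cov_shortcut`) are illustrative assumptions, and expectations are taken as plain sample averages with weight $\frac{1}{n}$, matching the formulas above.

```python
import math

# Sample data from the table at the start of this section.
Y = [6.7, 7.2, 10.3, 12.4, 15.1, 17.6, 19.4, 21.3, 24.0]
X = [343.4, 477.6, 739.1, 1373.9, 1510.2, 1700.6, 2026.6, 2577.4, 3496.2]


def mean(values):
    return sum(values) / len(values)


def cov_definition(x, y):
    """E((X - EX)(Y - EY)), computed directly from the definition."""
    mx, my = mean(x), mean(y)
    return mean([(xi - mx) * (yi - my) for xi, yi in zip(x, y)])


def cov_shortcut(x, y):
    """E(XY) - EX * EY, the simplified form derived above."""
    return mean([xi * yi for xi, yi in zip(x, y)]) - mean(x) * mean(y)


# Var(X) is the special case Cov(X, X), so the same shortcut applies.
cov_xy = cov_shortcut(X, Y)
var_x = cov_shortcut(X, X)
var_y = cov_shortcut(Y, Y)

r = cov_xy / math.sqrt(var_x * var_y)

print("Cov by definition:", cov_definition(X, Y))
print("Cov by shortcut:  ", cov_xy)
print("r =", r)
```

The shortcut form needs only running sums of $x_i$, $y_i$, $x_i y_i$ and $x_i^2$, which is why it is convenient for hand calculation.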

F-test

##### Testing the Regression Equation

$$\begin{array}{l} \sum {{{({y_i} - \bar y)}^2}} = \sum {{{({y_i} - \hat y + \hat y - \bar y)}^2}} \\ = \sum {{{({y_i} - \hat y)}^2}} + \sum {{{(\hat y - \bar y)}^2}} + 2\sum {({y_i} - \hat y)(\hat y - \bar y)} \\ = \sum {{{({y_i} - \hat y)}^2}} + \sum {{{(\hat y - \bar y)}^2}} \end{array}$$
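The cross term drops out in the last step because the least-squares residuals satisfy $\sum (y_i - \hat y_i) = 0$ and $\sum x_i (y_i - \hat y_i) = 0$, which are exactly the two first-order conditions of the fit. The sketch below checks the decomposition numerically on the sample data from the start of this section and then forms an F statistic. The formula $F = \frac{SSR/1}{SSE/(n-2)}$ is the standard one for a single predictor and is an assumption here, since the text names the F-test without writing it out; all variable names are illustrative.

```python
# Sample data from the table at the start of this section.
Y = [6.7, 7.2, 10.3, 12.4, 15.1, 17.6, 19.4, 21.3, 24.0]
X = [343.4, 477.6, 739.1, 1373.9, 1510.2, 1700.6, 2026.6, 2577.4, 3496.2]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# Least-squares fit of y = beta0 + beta1 * x.
beta1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
         / sum((x - x_bar) ** 2 for x in X))
beta0 = y_bar - beta1 * x_bar
y_hat = [beta0 + beta1 * x for x in X]

# Total, residual, and regression sums of squares.
sst = sum((y - y_bar) ** 2 for y in Y)
sse = sum((y - yh) ** 2 for y, yh in zip(Y, y_hat))
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)

# The cross term from the expansion; it should be zero up to rounding.
cross = 2 * sum((y - yh) * (yh - y_bar) for y, yh in zip(Y, y_hat))

# F statistic with 1 and n - 2 degrees of freedom (assumed standard form).
f_stat = (ssr / 1) / (sse / (n - 2))

print("SST =", sst, "SSE + SSR =", sse + ssr, "cross term =", cross)
print("F =", f_stat)
```

To complete the test, `f_stat` would be compared with the critical value of the F distribution with $(1, n-2)$ degrees of freedom at the chosen significance level.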

### Multiple Linear Regression

$$\left\{ {\begin{array}{*{20}{c}} {{y_1} = {\beta _0} + {\beta _1}{x_{11}} + {\beta _2}{x_{12}} + \cdots + {\beta _p}{x_{1p}} + {\varepsilon _1}}\\ {{y_2} = {\beta _0} + {\beta _1}{x_{21}} + {\beta _2}{x_{22}} + \cdots + {\beta _p}{x_{2p}} + {\varepsilon _2}}\\ {{y_3} = {\beta _0} + {\beta _1}{x_{31}} + {\beta _2}{x_{32}} + \cdots + {\beta _p}{x_{3p}} + {\varepsilon _3}}\\ \vdots \\ {{y_n} = {\beta _0} + {\beta _1}{x_{n1}} + {\beta _2}{x_{n2}} + \cdots + {\beta _p}{x_{np}} + {\varepsilon _n}} \end{array}} \right.$$

$$\begin{array}{l} Y = \left[ {\begin{array}{*{20}{c}} {{y_1}}\\ {{y_2}}\\ {{y_3}}\\ \vdots \\ {{y_n}} \end{array}} \right],\beta = \left[ {\begin{array}{*{20}{c}} {{\beta _0}}\\ {{\beta _1}}\\ {{\beta _2}}\\ \vdots \\ {{\beta _p}} \end{array}} \right],\varepsilon = \left[ {\begin{array}{*{20}{c}} {{\varepsilon _1}}\\ {{\varepsilon _2}}\\ {{\varepsilon _3}}\\ \vdots \\ {{\varepsilon _n}} \end{array}} \right]\\ X = \left[ {\begin{array}{*{20}{c}} 1&{{x_{11}}}&{{x_{12}}}& \cdots &{{x_{1p}}}\\ 1&{{x_{21}}}&{{x_{22}}}& \cdots &{{x_{2p}}}\\ 1&{{x_{31}}}&{{x_{32}}}& \cdots &{{x_{3p}}}\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1&{{x_{n1}}}&{{x_{n2}}}& \cdots &{{x_{np}}} \end{array}} \right] \end{array}$$
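With these definitions the system reads $Y = X\beta + \varepsilon$. Minimizing the sum of squared residuals ${\left\| {Y - X\beta } \right\|^2}$, the same criterion as in the simple case, leads to the normal equations ${X^T}X\beta = {X^T}Y$. The sketch below solves them in plain Python on made-up, noise-free data generated from known coefficients, so the estimate can be compared with the truth. The data and all helper names (`transpose`, `matmul`, `solve`, `fit_ols`) are hypothetical illustrations, not part of the original derivation.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular: columns of X are dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_ols(features, y):
    """Estimate beta from the normal equations X^T X beta = X^T Y."""
    X = [[1.0] + list(row) for row in features]  # prepend the column of ones
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    XtY = [sum(xi * yi for xi, yi in zip(col, y)) for col in Xt]
    return solve(XtX, XtY)


# Hypothetical data: y = 1 + 2*x1 + 3*x2 exactly (no noise term).
features = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]]
y = [1.0 + 2.0 * x1 + 3.0 * x2 for x1, x2 in features]

beta = fit_ols(features, y)
print(beta)  # approximately [1.0, 2.0, 3.0]
```

In practice a numerical library's least-squares routine is preferable to forming ${X^T}X$ explicitly, since the explicit product worsens the conditioning of the problem; the hand-written solver is used here only to keep each step visible.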

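1. Check whether the values of the influencing factors are linearly dependent; that is, remove linearly dependent columns from the design matrix $X$ so that redundant factors are dropped.
2. Derive the fitting criterion for the multiple case from the fitting criterion used in simple linear regression.
3. Run a significance test on the samples.
4. Run a significance test on the regression equation.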
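Step 1 of the list above can be sketched with Gaussian elimination: a column that never receives a pivot is a linear combination of the columns to its left and can be dropped. This is one possible approach under stated assumptions, not necessarily the method the source had in mind; the function name `independent_columns`, the tolerance `tol`, and the example data are all hypothetical.

```python
def independent_columns(rows, tol=1e-9):
    """Return indices of columns that are not linear combinations
    of the columns to their left (the pivot columns)."""
    m = [[float(v) for v in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivots = []
    row = 0
    for col in range(n_cols):
        if row >= n_rows:
            break
        # Partial pivoting: pick the largest remaining entry in this column.
        pivot = max(range(row, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no pivot: this column depends on earlier ones
        m[row], m[pivot] = m[pivot], m[row]
        for r in range(row + 1, n_rows):
            factor = m[r][col] / m[row][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[row][c]
        pivots.append(col)
        row += 1
    return pivots


# Hypothetical design matrix: a column of ones, x1, x2, and a redundant
# fourth column x3 = 2*x1 + x2.
features = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
design = [[1, x1, x2, 2 * x1 + x2] for x1, x2 in features]

kept = independent_columns(design)
print(kept)  # [0, 1, 2]: the fourth column is dropped as redundant

reduced = [[row[j] for j in kept] for row in design]
```

The reduced matrix has full column rank, so ${X^T}X$ is invertible and the fit in step 2 has a unique solution. On real, noisy data, columns are rarely exactly dependent, so the choice of `tol` matters.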

## References

https://www.eatrice.cn/post/多元线性回归推导和误差处理/

March 30, 2020