See Least Squares for the Linear Algebra view
Regression In Statistics
Bayesian
Classical Statistics
Practical Considerations
- Heteroskedasticity: the noise terms may not have the same variance at every data point.
- Nonlinearity: the effect of x on y may be non-linear.
- Multicollinearity: strong correlation between explanatory variables makes it hard to distinguish the relative effect of each variable.
- Overfitting: too many variables may fit the sample well but not the population. Rule of thumb: at least 10x as many data points as parameters.
- Causality: the direction of the causal effect may be unknown, or a third variable may be causing both observed variables.
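
A minimal sketch of quick checks for two of the issues above (multicollinearity and heteroskedasticity), in plain Python. The function names and the toy data are illustrative assumptions, not part of these notes. With exactly two explanatory variables, the variance inflation factor reduces to 1 / (1 - r^2), where r is their correlation. The residual spread ratio is only a crude split-sample comparison, not a formal test.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """Variance inflation factor when there are exactly two predictors.

    Values near 1 mean little collinearity; large values mean the two
    variables carry nearly the same information.
    """
    r2 = pearson_r(x1, x2) ** 2
    return float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2)


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def residual_spread_ratio(xs, ys):
    """Mean squared residual for the upper half of x divided by the lower half.

    A ratio far from 1 hints that the noise variance changes with x.
    """
    intercept, slope = fit_line(xs, ys)
    pairs = sorted(zip(xs, ys))  # order observations by x
    residuals = [y - (intercept + slope * x) for x, y in pairs]
    half = len(residuals) // 2
    lower = sum(e * e for e in residuals[:half]) / half
    upper = sum(e * e for e in residuals[-half:]) / half
    return upper / lower


# Toy data (made up): x2 is almost a multiple of x1.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 4, 6, 8, 10, 13]
print("VIF:", vif_two_predictors(x1, x2))

# Toy data (made up): the noise is much larger for larger x.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.1, 1.9, 3.1, 3.9, 7.0, 4.0, 9.0, 6.0]
print("residual spread ratio:", residual_spread_ratio(xs, ys))
```

On the toy data both numbers come out far above 1, which is the pattern to look for. With more than two explanatory variables, each VIF would instead need the R^2 from regressing that variable on all the others.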