Risk Model Handbook  
- What is risk
  
  
- Why a Multi-Factor Model?  
 Computing sample statistics directly from historical data, however, is fraught with danger. Historical returns are typically noisy; even in the absence of actual data errors, false signals and spurious relationships abound. Two assets may appear closely related when their seemingly-correlated behavior is in fact an artifact of data-mining.
 
 Weak signals and noise aside, when a new asset enters the existing universe, there is no reliable way of calculating its relationships with the other assets, because it does not yet possess a returns history. One could construct various proxies, but such an approach is dubious at best.
 Finally, data points totalling no less than the number of assets are required to accurately estimate all the variances and covariances directly. For any realistic number of assets, it is extremely unlikely that sufficient observations exist. Even with a universe of 100 assets, ½ · 100 · (100 + 1) = 5,050 relationships, i.e. more than 5,000, need to be estimated. For stock markets like the U.S. (over 12,000 assets), this becomes completely infeasible.
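 The count quoted above follows from the symmetry of the covariance matrix: n variances plus n(n − 1)/2 distinct covariances. A minimal sketch of that arithmetic is given here; the function name `n_covariance_terms` and the intermediate universe size of 1,000 are illustrative choices, not taken from the handbook.

```python
def n_covariance_terms(n_assets: int) -> int:
    """Distinct entries of a symmetric n x n covariance matrix.

    n variances plus n*(n-1)/2 pairwise covariances, i.e. n*(n+1)/2.
    """
    return n_assets * (n_assets + 1) // 2


# Universe sizes: 100 and 12,000 come from the text; 1,000 is an
# illustrative intermediate value.
for n in (100, 1_000, 12_000):
    print(f"{n:>6} assets -> {n_covariance_terms(n):>12,} variances and covariances")
```

 For 100 assets this gives 5,050 terms, and for 12,000 assets roughly 72 million, which is why direct estimation is impractical.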
 A better approach is to first impose some structure on the asset returns by identifying common factors within the market, that is, factors which drive asset returns.
 
 
- The Returns Model  
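 r = Bf + u, where r is the vector of asset returns, B is the matrix of factor exposures, f is the vector of factor returns (the regression coefficients), and u is the vector of residual returns.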
 3.1 Least Squares Regression Solution
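 As an illustration of estimating f in r = Bf + u for a single period, the sketch below uses ordinary least squares via NumPy's `numpy.linalg.lstsq`. It assumes equal weighting of all assets and purely synthetic data; the function name `estimate_factor_returns`, the universe size, and the "true" factor returns are hypothetical, and a production model would likely use a weighted or robust variant.

```python
import numpy as np


def estimate_factor_returns(B, r):
    """Ordinary least squares estimate of f in r = B f + u.

    B : (n_assets, n_factors) matrix of factor exposures
    r : (n_assets,) vector of asset returns for one period
    Returns the estimated factor returns and the residual returns.
    """
    B = np.asarray(B, dtype=float)
    r = np.asarray(r, dtype=float)
    # lstsq minimizes ||r - B f||^2 and is stabler than inverting B'B.
    f_hat, *_ = np.linalg.lstsq(B, r, rcond=None)
    u_hat = r - B @ f_hat
    return f_hat, u_hat


# Synthetic example (all numbers are made up for illustration).
rng = np.random.default_rng(0)
n_assets, n_factors = 50, 3
B = rng.normal(size=(n_assets, n_factors))
f_true = np.array([0.01, -0.02, 0.005])
r = B @ f_true + rng.normal(scale=0.001, size=n_assets)

f_hat, u_hat = estimate_factor_returns(B, r)
print("estimated factor returns:", f_hat)
```

 By construction the OLS residuals are orthogonal to every column of B, which is a quick sanity check on any implementation.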
  
 
 4. Residuals are normally distributed: strengthening the previous assumption to u ∼ N(0, Ω), where Ω = σ²Iₙ, is not strictly required. Nevertheless, it is a convenient assumption for testing the estimators, since it simplifies constructing confidence intervals, evaluating hypothesis tests, and so forth.


3.1.3 Outliers

3.1.4 Robust Regression


3.2 Statistical Approaches
3.2.1 Principal Components

3.2.2 Asymptotic Principal Components  
3.2.3 Maximum-Likelihood Estimation  
5. Model Factors  

5.2 Industry Factors
5.3 Style Factors
5.3.1 Standardization of Style Factors
Because style factor definitions are expressed in a mixture of units, it is best to standardize them to ensure a level of consistency across the regression estimates. Failure to do so may result in scaling problems in the regression or possibly an ill-conditioned covariance matrix.
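
One common way to put exposures on a comparable scale is a cross-sectional z-score: subtract the mean and divide by the standard deviation. The sketch below assumes an equal-weighted mean and population standard deviation; the handbook's exact scheme is not stated here, and other choices (such as capitalization-weighted means or outlier trimming) are possible. The function name `standardize` and the sample exposures are hypothetical.

```python
from statistics import fmean, pstdev


def standardize(exposures):
    """Cross-sectional z-score: zero mean, unit standard deviation.

    Assumes an equal-weighted mean and population standard deviation.
    """
    mu = fmean(exposures)
    sigma = pstdev(exposures)
    if sigma == 0:
        raise ValueError("exposures are constant; cannot standardize")
    return [(x - mu) / sigma for x in exposures]


# Hypothetical raw exposures in different units for the same five assets.
raw_size = [20.1, 22.5, 24.0, 21.3, 23.1]        # e.g. log market cap
raw_momentum = [0.12, -0.05, 0.30, 0.02, -0.10]  # e.g. trailing return

z_size = standardize(raw_size)
z_momentum = standardize(raw_momentum)
print([round(z, 3) for z in z_size])
print([round(z, 3) for z in z_momentum])
```

After standardization both factors are dimensionless with unit dispersion, so their regression coefficients are on comparable scales.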



6. The Risk Model  



6.2 Autocorrelation in the Factor Returns  
Over short time frames, market microstructure tends to produce lead-lag relationships that induce autocorrelation in the factor returns.
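
To check whether a factor return series shows this effect, one can estimate its sample autocorrelation at a given lag. The sketch below is a plain-Python version of the standard estimator (lagged autocovariance divided by the variance); it is a diagnostic only, not the handbook's correction procedure. The function name `autocorrelation` and the toy series are hypothetical.

```python
def autocorrelation(series, lag=1):
    """Sample autocorrelation of a series at the given lag.

    Uses the standard estimator: sum of lagged products of deviations
    from the full-sample mean, divided by the sum of squared deviations.
    """
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(series)")
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation undefined")
    num = sum(dev[t] * dev[t - lag] for t in range(lag, n))
    return num / denom


# Toy series: a steadily trending one (positive autocorrelation) and a
# sign-flipping one (negative autocorrelation).
trending = [1.0, 2.0, 3.0, 4.0, 5.0]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(autocorrelation(trending))
print(autocorrelation(alternating))
```

A value materially different from zero at short lags suggests that risk estimated from high-frequency factor returns will not scale to longer horizons by simple multiplication.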
6.5 Specific Risk Calculation  
