Magazine Issues » November 2014

EDHEC: Guarding against data mining

As factor-based investing gains ground, the Edhec-Risk Institute says investors need to be wary of influences, such as data mining, that can thwart the efficacy of a factor.

Institutional investors have started to review factor-based equity investment strategies. The parliament of Norway, which acts as a trustee for the Norwegian Oil Fund, has commissioned a report on the investment returns of the fund. 

This report was requested after the fund’s performance fell short of the performance of popular equity market benchmarks. 

The report shows that the returns of the fund’s actively managed portfolio, measured relative to a cap-weighted benchmark, can be explained by exposure to a set of well-documented alternative risk factors. 

After taking into account such exposures, active management did not have any meaningful effect on the risk and return of the portfolio. 
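
The kind of attribution described here can be sketched as a regression of active returns on factor returns. The example below is a minimal illustration on synthetic data, not the report's methodology: the two factors (labelled value and size), their loadings of 0.4 and 0.2, and all return parameters are assumptions chosen for the example.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(y, columns):
    """Regress y on an intercept plus the given regressor columns.

    Returns [intercept, slope_1, slope_2, ...] from the normal equations.
    """
    rows = [[1.0] + [col[t] for col in columns] for t in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic monthly data (all parameters are assumptions for illustration).
random.seed(0)
n_months = 240
value = [random.gauss(0.003, 0.03) for _ in range(n_months)]
size = [random.gauss(0.002, 0.03) for _ in range(n_months)]
# Active return built from factor exposure plus noise, with no true alpha.
active = [0.4 * v + 0.2 * s + random.gauss(0.0, 0.005)
          for v, s in zip(value, size)]

alpha, beta_value, beta_size = ols(active, [value, size])
print(f"mean active return: {sum(active) / n_months:.4f}")
print(f"alpha after factor adjustment: {alpha:.4f}")
print(f"estimated loadings: value={beta_value:.2f}, size={beta_size:.2f}")
```

Because the synthetic series is built with no alpha, the estimated intercept should come out close to zero while the loadings recover the assumed exposures. That is the pattern the report describes for the fund's active management.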

The authors argue that such exposures can be obtained through purely systematic strategies without a need to rely on active management. Therefore, rather than simply observing the factor tilts brought by active managers ex-post, investors may consider which factors they wish to tilt towards and make explicit decisions on these tilts.

This discussion of active managers’ sources of outperformance has naturally led to factor indices being considered as a more cost-efficient and transparent way of implementing such factor tilts. 

As institutional investors’ discussions have gone on, providers of exchange-traded products have rolled out a series of factor-based equity investment products. 

The notion that such factor-based equity strategies may deliver outperformance over standard cap-weighted indices receives support from asset pricing theory, which postulates that multiple sources of systematic risk are priced in securities markets. In particular, both equilibrium models, such as the intertemporal capital asset pricing model, and no-arbitrage models, such as the arbitrage pricing theory, allow for the existence of multiple priced risk factors. 

The economic intuition for the existence of a reward for a given risk factor is that exposure to such a factor is undesirable for the average investor because it leads to losses in bad times (that is, when marginal utility is high).

This can be illustrated by liquidity risk. While investors may gain a payoff from exposure to illiquid securities as opposed to their more liquid counterparts, such illiquidity may lead to losses in times when liquidity dries up and a flight to quality occurs, such as during the 1998 Russian default crisis and the 2008 financial crisis. In such conditions, hard-to-sell (illiquid) securities may post heavy losses. 

While asset pricing theory provides a sound rationale for the existence of multiple factors, it provides little guidance on which factors should be expected to be rewarded. Empirical research, however, has come up with a range of factors that have led to significant risk premia in typical samples of data from US and international equity markets. For investors to accept a factor as relevant in their investment process, however, a key requirement is a clear economic intuition as to why exposure to this factor constitutes a systematic risk that requires a reward and is likely to continue producing a positive risk premium. 

A review of the empirical literature on factors that affect the cross section of expected stock returns counts 314 factors for which results have been published.

It is interesting that these factors explain expected returns across stocks not only in US markets but also in international equity markets and, in many cases, even in other asset classes including fixed income, currencies and commodities. 

It is worth noting that the debate about the existence of positive premia for these factors is far from closed. For example, the debate on the low-risk premium is ongoing. Early empirical evidence suggests that the relation between systematic risk (stock beta) and return is flatter than predicted by the capital asset pricing model. 

Other research has found that stocks with high idiosyncratic volatility have had low returns. Other papers have documented a flat or negative relation between total volatility and expected return. 

However, recent papers have questioned these results, showing that the findings are not robust to changes in portfolio formation or to adjustments for short-term return reversals.

What is important in addition to an empirical assessment of factor premia is to check whether there is any compelling economic rationale as to why the premium would persist. Such persistence can be expected notably if the premium is related to risk taking. 

In an efficient market with rational investors, systematic differences in expected returns should be due to differences in risk. Researchers have argued that, to determine meaningful factors, one should place less weight on the data that models are able to match and instead scrutinise the theoretical plausibility of, and the empirical evidence for, their main economic mechanisms. 

This point is best illustrated with the example of the equity risk premium. Given the wide fluctuation in equity returns, the equity risk premium can be statistically indistinguishable from zero even for relatively long sample periods. 
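
A rough calculation shows why long samples can still be inconclusive. The sketch below assumes independent annual excess returns and uses illustrative figures that are not from the paper: a 5% average premium and 20% annual volatility. Under those assumptions the t-statistic of the sample mean is the premium divided by the volatility over the square root of the number of years.

```python
import math


def t_statistic(mean_premium, volatility, years):
    """t-statistic of the average annual premium, assuming i.i.d. returns."""
    return mean_premium / (volatility / math.sqrt(years))


def years_needed(mean_premium, volatility, t_threshold=1.96):
    """Sample length at which the t-statistic reaches the chosen threshold."""
    return (t_threshold * volatility / mean_premium) ** 2


# Illustrative assumptions: 5% premium, 20% volatility.
print(f"t-statistic over 25 years: {t_statistic(0.05, 0.20, 25):.2f}")
print(f"years needed for t = 1.96: {years_needed(0.05, 0.20):.1f}")
```

With these inputs, 25 years of data give a t-statistic of 1.25, and roughly 61 years are needed before the premium clears the conventional 1.96 threshold.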

However, one may reasonably expect that stocks have higher reward than bonds because investors are reluctant to hold too much equity due to its risks. For other equity risk factors, such as value, momentum, low risk and size, similar explanations that interpret the factor premia as compensation for risk have been put forth in the literature. 

It is worth noting that the existence of the factor premia could also be explained by investors making systematic errors due to behavioural biases, such as over-reaction or under-reaction to news on a stock. 

However, whether such behavioural biases can affect asset prices in the presence of some smart investors who do not suffer from these biases is a point of contention.

In fact, even if the average investor makes systematic errors due to behavioural “biases”, it could still be possible that some rational investors who are not subject to such biases exploit any small opportunity resulting from the irrationality of the average investor. 

The trading activity of such smart investors may then make the return opportunities disappear. Therefore, behavioural explanations of persistent factor premia often introduce so-called “limits to arbitrage”, which prevent smart investors from fully exploiting the opportunities arising from the irrational behaviour of other investors. The most common limits to arbitrage are short-selling constraints and funding-liquidity constraints.

Investors who wish to exploit factor premia need to address robustness when selecting a set of factors. 

Indeed, an important issue is that the premium may decrease if investors are increasingly investing to capture it. 

Another issue is that the discovery of the premium in the first place may have been a result of data mining. To avoid the pitfalls of non-persistent factor premia and achieve robust performance, investors should keep the following checks in mind. 

First, investors should require a sound economic rationale for the existence of a premium. 

Second, due to the risks of data mining, investors may be well advised to stick to simple factor definitions that are widely used in the literature rather than rely on complex and proprietary factor definitions.
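
The data-mining risk behind these checks can be illustrated with a small simulation. The sketch below generates candidate factors that are pure noise, with a true premium of zero, and counts how many nonetheless look significant. The count of 314 candidates echoes the number of published factors cited above; the 20-year monthly sample, the 3% monthly volatility and the t-statistic threshold of 2 are assumptions made for the example.

```python
import math
import random
import statistics


def t_stat(returns):
    """t-statistic of the mean return against a true mean of zero."""
    mean = statistics.fmean(returns)
    stdev = statistics.stdev(returns)
    return mean / (stdev / math.sqrt(len(returns)))


def count_false_discoveries(n_factors, n_months, threshold=2.0, seed=0):
    """Count noise-only factors whose |t-statistic| exceeds the threshold."""
    rng = random.Random(seed)
    count = 0
    for _ in range(n_factors):
        # Each candidate factor has zero expected return by construction.
        returns = [rng.gauss(0.0, 0.03) for _ in range(n_months)]
        if abs(t_stat(returns)) > threshold:
            count += 1
    return count


found = count_false_discoveries(n_factors=314, n_months=240)
print(f"{found} of 314 noise-only factors pass |t| > 2")
```

About one in twenty noise-only candidates is expected to pass such a test by chance. The more factor definitions or specifications are tried, the more apparent premia of this kind will turn up, which is why an economic rationale and a simple, widely used definition matter.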

Different papers in the empirical literature use different proxies to capture a given factor exposure, and practical implementations of factor exposures may deviate considerably from factor definitions in the literature. 

For example, when capturing the value premium one may use extensive fundamental data including not only valuation ratios but also information on, for example, sales growth of the firm. 

Moreover, many value-tilted indices include a large set of ad-hoc methodological choices, opening the door to data mining. 

As an illustration, it is easy to consider the effect of strategy specification for fundamental equity indexation strategies, which are commonly employed as a way to harvest the value premium.

The outperformance of a fundamental equity indexation strategy is highly sensitive to strategy specification choices. 

Different leverage adjustment methods can also lead to large differences when measuring yearly performance. 

Differences of as much as 10% in annual return between strategies that make different ad-hoc choices clearly show that back histories depend heavily on strategy specifications. 

In contrast to the wide variety of factor definitions used by index providers and asset managers, most empirical asset pricing studies resort to simple and consensual factor definitions. 

For example, the most widely used definition of “value” is based on a single variable – the book-to-market ratio of a particular stock. 

This simplicity of the factor definition may be an important guard against data mining risks.
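
To show how little specification a single-variable definition needs, the sketch below ranks stocks by book-to-market ratio and selects the highest-ranked names. The tickers, the figures and the choice of two names are hypothetical and purely illustrative; they are not taken from the paper.

```python
def book_to_market(book_equity, market_cap):
    """Book-to-market ratio: book value of equity over market capitalisation."""
    return book_equity / market_cap


def value_portfolio(stocks, n_top):
    """Return the tickers of the n_top stocks with the highest book-to-market."""
    ranked = sorted(
        stocks,
        key=lambda s: book_to_market(s["book_equity"], s["market_cap"]),
        reverse=True,
    )
    return [s["ticker"] for s in ranked[:n_top]]


# Hypothetical universe (figures in arbitrary currency units).
stocks = [
    {"ticker": "AAA", "book_equity": 50.0, "market_cap": 100.0},
    {"ticker": "BBB", "book_equity": 90.0, "market_cap": 100.0},
    {"ticker": "CCC", "book_equity": 10.0, "market_cap": 100.0},
    {"ticker": "DDD", "book_equity": 120.0, "market_cap": 100.0},
    {"ticker": "EEE", "book_equity": 30.0, "market_cap": 100.0},
]

print(value_portfolio(stocks, n_top=2))
```

The only free choice here is the cutoff. Every further input or adjustment added to a factor definition is another choice that can be tuned to the back history.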

This article is based on a paper by Noël Amenc, professor of finance, Edhec Business School, and director, Edhec-Risk Institute; Felix Goltz, head of applied research, Edhec-Risk Institute; and Ashish Lodh, senior quantitative analyst, ERI Scientific Beta

©2014 funds europe