Bruce Feibel, author of Investment Performance Measurement, spoke at the Fall meeting of the Performance Measurement Forum's North America chapter on the topic of risk. I was quite pleased when he expressed his support for the risk-adjusted measure M-squared, also known as Modigliani-Modigliani, named for its designers, Franco and Leah. This grandfather/granddaughter duo developed what Bruce and I feel is a superior metric to evaluate risk.

Unlike the other risk-adjusted measures, its results are highly intuitive. To me there's a parallel to the way we show gross and net performance, as well as pre- and after-tax results. For example, from a gross/net perspective we might see:

- Gross return = 17.45%
- Net return = 16.68%.

The net result reflects the impact of the advisor's fee. From a tax perspective:

- Pre-tax return = 19.35%
- After-tax return = 16.45%.

Well, with M-squared we could see:

- Before adjusting for risk = 18.27%
- After adjusting for risk = 16.45%.

Granted, we might not label our results this way, but I'm simply trying to convey how the results align with what we often see today. Our return *has* been adjusted for risk, unlike the other measures. And unlike the other measures, it's reported in percentage terms, which are much easier to comprehend.

If you're not including M-squared in your reporting, you should be!

And, if you'd like more information on this measure, send me a note.

M-squared is becoming part of the performance "orthodoxy" but (like the Interaction Effect) it is a deeply flawed approach - in this case looking at the challenging topic of Risk Adjusted Performance. The problem with M-squared is definitional and logical, not mathematical. Unfortunately, performance professionals have a tendency to rush into calculations before considering a) the questions being asked and b) the connection between the information they supply and the investment decision making process. With this in mind, let's consider M-Squared.

M-squared pretends that leveraging or deleveraging an investment is appropriate, desirable, meaningful... or even possible. The problem is that none of these expectations are really true. For example, one can pretend that it makes sense to our clients to take an actual investment, sell enough of it to match the benchmark's risk by investing the proceeds in cash, and then evaluate this "theoretical" new investment relative to the benchmark. OR... the client can borrow at the cash rate and leverage the existing investment if it happens to exhibit less volatility than the benchmark.

The problems with this approach are obvious:

#1: We need to evaluate the actual investment, not some theoretical, manipulated version of it.

#2: Neither the leveraged nor the deleveraged versions of the investment may be possible, since you can't necessarily sell a portion of an investment quickly and without cost, and some investments are not available for sale (such as a closed fund).

#3: This is the most important problem: NONE OF THIS MAKES SENSE FROM THE CLIENT'S PERSPECTIVE. Performance evaluation of active managers means examining the extra return earned over a market opportunity with the same risk as the investment, not the other way around. We fit the risk of the benchmark to the investment; we don't change the risk of the investment to match the benchmark.

From the client's perspective, risk adjusted performance is essentially an opportunity cost analysis, and here's the relevant question that it answers: "What was the extra return I earned relative to the market opportunity for the amount of risk my manager took?"

The simple solution to this is to run a Sharpe Ratio line for the benchmark, and then to evaluate the excess return of the portfolio at the same risk level as the portfolio - NOT at the same risk as the benchmark.

Another significant problem with M-squared that no one seems to have addressed is the inherent bias in its results: it overcompensates managers who take less risk than the stated benchmark, and it penalizes those managers with skill who take more risk. Shouldn't we be treating managers fairly? From a client's perspective, it is highly desirable for a skillful manager to take more risk when he/she gets paid better for it than the benchmark. M-squared takes the opposite view. Why? Because the focus of M-squared is on having a single risk measure for evaluating managers. This might make things easier and more standardized for the performance analysts, but it provides incorrect information to the client. The focus here should be on the client and in presenting something that is relevant and understandable to them. M-squared is neither of these things.

Still not persuaded? Then take this challenge: Show your client a "risk vs return" chart with a market line that connects Cash and the Benchmark, along with a "dot" that represents the manager's performance. Then see if the client looks at the manager's performance relative to his/her own level of risk, or at the risk of the benchmark. See for yourself. This may be all the proof needed to show that M-squared is a convenient abstraction, but a flawed measure of risk adjusted performance.

Steve, very interesting perspective and insights; worth reflecting on. Thanks!

Here's an illustration of the M-squared alpha vs the "Differential" return alpha (to coin a term used by Carl Bacon in his excellent book on performance measurement).

Imagine a benchmark that posts 12% return with 20% risk, relative to a risk-free asset with a 4% return. We also have 2 funds that each outperform their stated benchmark, each having the same Sharpe Ratio, but one having less risk than the benchmark while the other has more risk. Fund #1 posts a 16% return with 22% risk while Fund #2 posts a 13.82% return with 16% risk.

Adjusting for risk, we find that Fund #1 has a required return of 12.80% while Fund #2 has a required return of 11.20%. The difference between each fund's actual and required returns is the "differential" alpha. For Fund #1 this is 3.20% and for Fund #2 it is 2.62%. When we compare these results to the "one size fits all" M-squared alpha of 2.91% we see the inherent bias that M-squared has against higher risk managers: Fund #1 alpha is understated by 29 bps while Fund #2 alpha is overstated by 29 bps.

Here we see the difference in perspective between M-squared alpha and differential return alpha. M-squared conveniently brings each fund to the corresponding risk of the benchmark, answering the theoretical question: "What return would I have earned IF the fund had the same risk as the benchmark?" This may seem interesting, but it is of no practical value, because the fund DID NOT have the same risk as the benchmark. Instead, the manager deliberately chose to have a different level of risk as part of his active process. So, the practical and appropriate question to ask and answer is this: "Did the manager produce enough return to compensate me for the risk he chose to maintain?" We can clearly see that the differential return alpha is the right answer to this question, because it measures alpha relative to the actual risk of the fund.

It is true that the M-squared alpha is a convenient theoretical abstraction that is easy to calculate, and it requires little thought because it puts all managers into the same box. However, it fails to answer the client's compelling questions about risk-adjusted performance. The differential alpha is just as easy to calculate, and it makes sense.

Steve, it appears that you are making a compelling argument that identifies a shortcoming with the measure. I will need more time to review what you've presented, but look forward to better understanding your argument. Thanks for sharing!

Stephen, the required return of fund #2 in your example is 10.40%, not 11.20%. The M2 for fund #1 in your example is 2.91, the M2 for fund #2 is 4.28. The "differential return" à la Bacon for #1 and #2 is 3.20 and 3.42 respectively. The ranking order (and this is the only relevant thing about risk-adjusted performance measurement) is therefore not affected here: #2 > #1 in the case of all three measures.
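For anyone who wants to check these figures, here is a minimal Python sketch. The function names are invented for illustration, and it assumes the definitions used in this thread: the required return sits on the benchmark's Sharpe Ratio line at the fund's own risk, while M2 rescales the fund to the benchmark's risk. All inputs are in percent.

```python
# Illustrative sketch only: function names are hypothetical, and the
# definitions follow the ones described in this comment thread.

def sharpe(ret, risk, rf):
    """Excess return per unit of volatility."""
    return (ret - rf) / risk

def required_return(fund_risk, bench_ret, bench_risk, rf):
    """Return on the benchmark's Sharpe Ratio line at the fund's own risk."""
    return rf + sharpe(bench_ret, bench_risk, rf) * fund_risk

def differential_alpha(fund_ret, fund_risk, bench_ret, bench_risk, rf):
    """Actual return minus the required return at the fund's own risk."""
    return fund_ret - required_return(fund_risk, bench_ret, bench_risk, rf)

def m2_alpha(fund_ret, fund_risk, bench_ret, bench_risk, rf):
    """Fund return rescaled to the benchmark's risk, minus the benchmark return."""
    m2 = rf + sharpe(fund_ret, fund_risk, rf) * bench_risk
    return m2 - bench_ret

# Inputs as posted in the example above (percent).
RF, BENCH_RET, BENCH_RISK = 4.0, 12.0, 20.0
funds = {"Fund #1": (16.0, 22.0), "Fund #2": (13.82, 16.0)}

results = {}
for name, (ret, risk) in funds.items():
    results[name] = {
        "required": required_return(risk, BENCH_RET, BENCH_RISK, RF),
        "differential": differential_alpha(ret, risk, BENCH_RET, BENCH_RISK, RF),
        "m2_alpha": m2_alpha(ret, risk, BENCH_RET, BENCH_RISK, RF),
    }
    print(name, {k: round(v, 3) for k, v in results[name].items()})
```

With the inputs as posted (Fund #2 at 16% risk), this gives a required return of 10.40% and a differential return of 3.42% for Fund #2, and M2 alphas of roughly 2.91% and 4.28%.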

The "differential return" has major weaknesses:

It can be shown that the M2 measure is equivalent to vb*(SRp-SRb), i.e. the M2 of a portfolio is equal to the difference in the Sharpe Ratios of portfolio and benchmark multiplied by the benchmark's volatility; it can also be shown that the "differential return" is equal to vp*(SRp-SRb). The flaws of this measure are due to its positive relationship with portfolio volatility: 1) otherwise identical but less diversified portfolios have higher differential returns. 2) otherwise identical but leveraged portfolios also have a higher differential return.

Therefore, "differential returns" can be manipulated easily and create perverse incentives. If they are used at all, they should be complemented either with better risk-adjusted performance measures (M2 or Sharpe Ratio or many others) or with additional portfolio information.
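A small sketch can make the leverage point concrete. It assumes an investor can borrow at the risk-free rate with no costs (a simplification), it reuses the benchmark and Fund #1 figures from the example earlier in this thread, and the helper names are hypothetical.

```python
# Illustrative sketch only. Assumes borrowing at the risk-free rate with no
# frictions; helper names are made up for this example. Inputs in percent.

def sharpe(ret, risk, rf):
    return (ret - rf) / risk

def m2_excess(sr_p, sr_b, bench_risk):
    """M2 relative to the benchmark: vb * (SRp - SRb)."""
    return bench_risk * (sr_p - sr_b)

def differential_return(sr_p, sr_b, port_risk):
    """Differential return: vp * (SRp - SRb)."""
    return port_risk * (sr_p - sr_b)

def lever(ret, risk, rf, k):
    """Scale the exposure by k, financing the difference at the risk-free rate."""
    return rf + k * (ret - rf), k * risk

RF, BENCH_RET, BENCH_RISK = 4.0, 12.0, 20.0
sr_b = sharpe(BENCH_RET, BENCH_RISK, RF)

base_ret, base_risk = 16.0, 22.0                   # Fund #1 as posted
lev_ret, lev_risk = lever(base_ret, base_risk, RF, k=2.0)

sr_base = sharpe(base_ret, base_risk, RF)
sr_lev = sharpe(lev_ret, lev_risk, RF)             # same Sharpe Ratio as sr_base

diff_base = differential_return(sr_base, sr_b, base_risk)
diff_lev = differential_return(sr_lev, sr_b, lev_risk)
m2_base = m2_excess(sr_base, sr_b, BENCH_RISK)
m2_lev = m2_excess(sr_lev, sr_b, BENCH_RISK)

print(f"differential: {diff_base:.2f} -> {diff_lev:.2f}")
print(f"M2 excess:    {m2_base:.2f} -> {m2_lev:.2f}")
```

Under these assumptions, doubling the leverage doubles the differential return (from 3.20% to 6.40%) while the M2 figure does not move, because leverage at the risk-free rate leaves the Sharpe Ratio unchanged.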

David, risk-adjusted return is measured on a different scale than return. Switching from the accounting world of returns to risk-adjusted returns is more than just converting Celsius to Fahrenheit or vice versa. Risk-adjusted returns are economic value indicators measured in units of "utility". Calculating utility figures is notoriously difficult: first, the figures will be different for each investor (unlike return, which is the same for all investors), and second, they are simply horribly difficult to derive (Kahneman won a Nobel Prize some years ago for his work with Tversky in this area). Fortunately, we do not need to express utility as a number the way we do returns. For practical purposes, it is enough if we can express utility as an "ordinal" magnitude, i.e. a number that can be used as a score to produce a ranking between a set of portfolios. This is exactly what the Sharpe Ratio does (as do a large number of other good RAPMs): if the Sharpe Ratio of portfolio A is 0.6 and that of portfolio B is 0.8, we know that for any risk level (be it the actual portfolio volatility, a benchmark, a peer group, whatever), portfolio B will give us a higher return than portfolio A and is therefore more attractive for investors who like return and dislike volatility.

The M2 measure is an interesting answer to a question that nobody really asked. This is why it didn't become popular, and this is perfectly OK. The more important issue is that realistic investor risk preferences are about more than just excess return and volatility risk, and asset risk characteristics are about more than just the first two moments of their P&L distribution. These questions are addressed by a number of very interesting alternative risk-adjusted performance measures that have been introduced more recently.

I must first welcome Andreas’ comments. We are privileged to have contributions from someone with his knowledge and stature within the industry. His website is a treasure trove of useful information about investment performance and risk analysis. (For an example, look at his wonderful spreadsheet on evaluating contributions to risk in a diversified portfolio – terrific stuff!)

Now it seems that he and I may disagree on a few numbers in the previous example that I related, but the more important discussion relates to the general statements that he and I have made. He favors M2 as a risk adjusted performance measure. I suggested that clients need realistic appraisals of performance, and that the M2 idea of leveraging/deleveraging the portfolio (rather than the benchmark) is unrealistic, impractical and meaningless to clients. Clients do understand and easily relate to an “opportunity cost” analysis of a manager’s actual performance relative to the passive market opportunity at that same level of risk. Therefore we would reject M2 as an unrepresentative and unrealistic abstraction in favor of a simple analysis of comparing the manager’s actual return to the required rate of return for his given level of volatility risk. Clients understand and relate to the simple risk vs return chart showing portfolios and benchmarks lying near or along the Sharpe Ratio line of each.

I was fascinated by the comment that this differential return approach “can be manipulated easily and creates perverse incentives.” This view is understandable because the amount of alpha will be directly related to the manager’s volatility. As Andreas states: “leveraged portfolios have a higher differential return.” This can be observed by simply graphing the example that I proposed: since each manager has a higher Sharpe ratio than the benchmark, the gap between these two lines increases as risk increases. While each of the different portfolios has the same Sharpe ratio, the higher risk portfolios have greater alphas compared to the corresponding risk portfolios along the benchmark's Sharpe ratio line. However, this is not a “manipulated and perverse incentive.” Rather, this is the case of a fair representation of the manager’s active process, which includes a decision regarding how much risk to take. When the market rewards risk, then the right decision may be to increase risk. When a manager has good selection skill, then it's reasonable to hold fewer (outperforming) issues and be less “benchmark driven” since his job is to create alpha. So a manager will control both the level of risk and the selection of issues to implement that risk level, recognizing that when skill is present then the client is rewarded for increased risk. Clients expect this from their active managers, and these managers make a good case for taking more risk than the benchmark when the risk they take is rewarded more highly than that same level of risk in the benchmark. Of course, clients don’t need active managers to move the risk level; they can do that for themselves by leveraging or deleveraging the investment in a passive benchmark. Therefore, the true benchmark for the active manager is a leveraged or deleveraged benchmark with the same level of risk. 
M2 is the opposite of this since it changes the risk of the portfolio to match the benchmark, resulting in an abstraction that does not fairly represent the investment process or the results of the actual portfolio. The criticism of M2 and the preference for the differential return approach are based on a fair representation of the manager’s active results in a context that clients find both relevant and understandable.

Regarding the "M2 vs Differential" debate:

A second area of discussion is around the statement that “M2 is an interesting answer to a question that nobody really asked.” Quite frankly, the business world typically dismisses useless ideas by calling them “answers looking for a question.” I have to wonder WHY no one has asked the question that M2 answers – especially after they have had this statistic’s answer for well over a decade. I suggest that the market has not embraced this statistic because it is unrealistic, unrepresentative and therefore less than useful. It’s certainly an easy statistic to visualize because it is stated in terms of return, so it doesn’t suffer from the complexity of some of the “downside risk” statistics that have gained acceptance. I suggest that the market’s lack of affirmation of M2 is some objective evidence of the credibility of the criticisms that I have put forward.

Lastly, I suggest that a differential return approach can be useful in showing performance in the context of the client’s true financial goals. Here, the actual volatility of an investment is a critical factor, as is the amount of excess return that can be reasonably expected. The amount of risk a manager takes relative to a benchmark is an indication of whether the investment is appropriate for the investor, and whether the manager is exercising proper discretion by increasing or decreasing the risk of the portfolio. The amount of excess return actually earned is relevant to understanding whether the investor is earning an adequate return relative to the target return goal. It is not enough to simply “rank order” the managers; we have to evaluate their actual results in the context of the investor’s financial goals. An M2 analysis cannot do this; a differential approach can.

I suggest that we have enough "abstractions" in the world of investment performance. We can always use more representative and understandable risk adjusted performance measures, but first we need to use the ones we already have.

Stephen, this is an interesting debate. M2 is not really a "risk-adjusted performance measure" (RAPM), but just a transformation of any RAPM of the type "reward function divided by risk function". The Sharpe Ratio is one representative of that class. Note that you can adjust any RAPM of this class with the M2 transformation. For example, if the Sortino Ratio of your portfolio is 0.5 and benchmark 0.7, the downside risk of your benchmark is 10%, then the “downside risk” adjustment % is 0.1*(0.5-0.7)=-2%. The adjustment is negative because the portfolio’s Sortino ratio is smaller than the benchmark’s.
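The transformation described here can be sketched as a one-line function. The name `m2_adjustment` is made up for illustration, and the inputs are the Sortino figures quoted above, expressed as decimals.

```python
# Illustrative sketch only: `m2_adjustment` is a hypothetical name.
# It applies the M2-style transformation to any "reward / risk" ratio.

def m2_adjustment(portfolio_ratio, benchmark_ratio, benchmark_risk):
    """Benchmark risk times the difference between the two ratios."""
    return benchmark_risk * (portfolio_ratio - benchmark_ratio)

# Sortino example from the text: portfolio 0.5, benchmark 0.7,
# benchmark downside risk 10%.
adjustment = m2_adjustment(0.5, 0.7, 0.10)
print(f"{adjustment:.2%}")  # prints -2.00%
```

The same function works for the Sharpe Ratio by passing Sharpe Ratios and the benchmark's volatility instead of the Sortino inputs.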

The fact that the “differential return” can be manipulated by leverage is a severe problem. Jensen’s Alpha suffers from the same issue; in fact, this is what motivated Treynor to develop his ratio (which is nothing other than a beta-adjusted Alpha, and therefore independent of leverage).

People have become aware of the issues of “relative returns”: one cannot eat benchmark- (or peer-) relative performance. The same applies in this context: one cannot eat “risk-adjusted returns”, neither benchmark-risk nor portfolio-risk adjusted returns. So my prediction is that expressing RAPMs as percentage returns would create more confusion than anything else among investors. I actually interpret the low popularity of M2 as an indication that investors have a fine nose and sense these issues. “Rank orders” provided by RAPMs are sufficient to evaluate investment portfolios in the context of investor financial goals and risk preferences. Evaluating actual results relative to “investor’s pure financial goals” is an ex post accounting exercise in $ or % return space. I am afraid that higher “M2 returns”, “differential returns” or whatever will not help pay pensions.