Portfolio Evaluation for Passive Investors: Eleven Simple Rules

By Gerd Kommer and Tobias Jerschensky

In this blog post, we formulate eleven simple rules of thumb to help private investors evaluate the performance (return and risk) of their own or someone else's passively managed portfolio. In financial jargon, such an evaluation is often called benchmarking when it involves a comparison with a reference value, e.g. an index (market return, asset class return) or another objectively comparable investment.

Some of the following basic evaluation and benchmarking rules do not apply to actively managed portfolios, or apply only with additional assumptions.

(1) Short periods of time are usually useless and often even misleading for evaluating the performance of a portfolio

Periods of less than three to four years are not meaningful if you want to draw reliable, robust conclusions from the observed portfolio performance (return, risk). When looking at shorter periods of time, there is a risk of drawing conclusions that are harmful for the future.

Judgments derived from history tend to become more reliable the longer the data series being analyzed is. The returns of listed and unlisted investments over short periods of less than three to four years are heavily influenced by “statistical noise”: factors that are often random in nature, or at any rate beyond the control of the portfolio holder or his advisor, and whose specific form could not be predicted ex ante. For this reason, little or nothing can be derived for the future from such noise-influenced results. Basing decisions on short data series can be downright harmful.

An example: A securities account has existed for six years. Over this period, the performance is satisfactory from the portfolio holder's perspective. Viewed in isolation, however, the last twelve months are significantly worse than the underlying benchmark and also less satisfactory than the period as a whole. What can be inferred from the poor performance over the most recent twelve months? Most likely nothing, even if that seems unsatisfactory to the portfolio holder.

In the following table we show how long it would take, from a purely statistical point of view, until under normal circumstances one no longer has to consider a given outperformance (excess return) of an actively managed fund or other portfolio relative to its correctly chosen benchmark to be “possibly coincidental”.

Table: How long does it take until you can reliably distinguish random outperformance from non-random outperformance (excess return) in an actively managed investment fund?

► “Alpha”, or tracking difference, is the average return difference of an investment A compared to a benchmark B over a sufficiently long measurement period. This return difference fluctuates around its mean from year to year with some volatility. Three typical example values for this fluctuation intensity are given in the second column from the left. An average alpha of one percentage point p.a. after all costs over a period of ten years is generally considered a considerable achievement for active portfolios (e.g. investment funds).

► “Tracking error” is the volatility of the periodic tracking difference over the overall period. It depends heavily on the active strategy and is usually higher the higher the targeted “outperformance” relative to the benchmark.

► The calculated numbers of years are based on a confidence level of 95%. After this number of years, the probability that the “alpha” was just a coincidence is only 5%.
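The logic behind such a table can be sketched in a few lines of Python. This is a minimal illustration, not the authors' calculation: it assumes that the annual return differences are independent and roughly normally distributed, and that a one-sided test at the 95% level is used (z of about 1.645). The tracking error values of 2, 4 and 6 percentage points are hypothetical inputs chosen for illustration. Under these assumptions, the t-statistic after T years is alpha / (tracking error / sqrt(T)), so the required number of years is (z × tracking error / alpha)².

```python
Z_95_ONE_SIDED = 1.645  # assumption: one-sided test at the 95% level


def years_until_significant(alpha_pct, tracking_error_pct, z=Z_95_ONE_SIDED):
    """Years T after which alpha / (tracking_error / sqrt(T)) reaches z.

    Assumes independent, roughly normal annual return differences.
    Both inputs are in percentage points per year.
    """
    if alpha_pct <= 0:
        raise ValueError("alpha must be positive")
    if tracking_error_pct < 0:
        raise ValueError("tracking error must not be negative")
    return (z * tracking_error_pct / alpha_pct) ** 2


# Hypothetical tracking errors; the alpha of 1 percentage point follows the text.
for te in (2.0, 4.0, 6.0):
    years = years_until_significant(alpha_pct=1.0, tracking_error_pct=te)
    print(f"alpha 1.0 pp, tracking error {te:.0f} pp: about {years:.0f} years")
```

Because the tracking error enters the formula squared, doubling it quadruples the required observation period. This is why short track records say so little about whether an excess return was skill or chance.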

Anyone who, without a hard, objective reason, gives greater weight to the recent performance of a portfolio or investment than to its performance in the more distant past falls prey to a common and dangerous error of reasoning: “recency bias”. [1]

(2) Return alone is not meaningful

That two portfolios A and B should not be compared solely on the basis of their (hopefully correctly measured) returns, and that the comparison must also take risk, liquidity, taxes and costs into account, is a trivial statement that no one will dispute. Nevertheless, in our observation, “one-dimensional” comparisons of portfolios or financial products are constantly made solely on the basis of return, and often enough on the basis of return over far too short a period, e.g. 12 or 24 months. We address the common mistake of ignoring differences in liquidity between two investments being compared in the next rule of thumb.

(3) Illiquidity or liquidity should be taken into account in the ex post evaluation of the individual components of a portfolio

When the performance (return, risk) of liquid and illiquid investments is evaluated side by side over a given period (e.g. stocks, bonds, gold and cryptos as liquid investments on the one hand, and real estate or private equity as illiquid investments on the other), there are two insights to consider:

A rational assessment of risk, especially the risk of fluctuations in the value of illiquid investments over a given historical period, is complex because these investments are not listed on an exchange, i.e. there are no daily updated market prices for them. Concluding from the absence of daily updated market prices, and from the resulting apparent stability of value, that the illiquid investment in question has relatively stable prices over time will often be incorrect and potentially very damaging. When evaluating a portfolio, an investor should ask himself at what price he could sell the illiquid investment right now within a window of at most two months. In the majority of cases, he will find that an immediate sale is either impossible or possible only via a “secondary market”, at a presumably considerable discount to the price reported by the provider.

Underestimating or even ignoring the illiquidity disadvantages of individual investments in their performance assessment because, ex post, these disadvantages did not play a role (as will usually be the case in any specific instance, and statistically must be the case) is a common mistake among private investors that can have a very damaging effect at some point.

(4) The individual components in a consciously and systematically diversified portfolio should not be evaluated purely individually in isolation

If a securities account consists of several individual components, e.g. several ETFs and other financial investments, that were originally chosen deliberately when the overall structure of the portfolio was defined, then the portfolio performance should primarily be evaluated “as a whole”, as a “team performance”, so to speak.

What usually matters, then, is not the return of an individual investment in isolation but the structural role this investment plays “in the team”, i.e. in the overall portfolio. For example, it is a mathematical necessity that even among ten individually excellent but different investments within a portfolio, one must produce the worst individual return of the ten in any given period (whether 12 months or 12 years).

The analogy of a Bundesliga soccer team: what counts in terms of its success in an individual game or in a season (34 games) is primarily the team's game result, the collective performance of all eleven players, i.e. how these eleven players played together in different roles and tasks. For example, a defender

(5) When evaluating a portfolio, one should avoid hindsight bias

A not uncommon mistake when evaluating the performance of a portfolio over a given period is to judge it using information that became available only during or after that period. This is called “hindsight bias” in technical jargon. [2]

In most situations, the past performance of a portfolio, or of the person managing it, should be judged primarily on the basis of the information that was available before the start of the period. This approach avoids “confusing strategy with result”: the quality of a strategy, including an investment strategy, can ultimately be assessed meaningfully only on the basis of the information and goals that were known when the strategy was chosen and defined. The result of the strategy alone, whether pleasant or unpleasant, is an inadequate and often very poor criterion of its quality.

The following simple example should make this clear. The goal is to get from Berlin to Munich by train in the shortest possible travel time. The Sprinter is therefore chosen, which is scheduled to be around half an hour faster than the normal ICE. Statistically, the Sprinter should get you there sooner. On this particular journey, however, the Sprinter is delayed by an hour, while the normal ICE arrives on time. Nevertheless, hardly anyone would doubt that booking the Sprinter was the right strategy ex ante, even though the specific result was disadvantageous.

(6) The valuation level of an investment at the end of the evaluation period should be taken into account

Here is an example to illustrate what is meant: Two stock portfolios A and B generated the same, satisfactory return over a period of eight years (i.e. an evaluation period that tends to be sufficiently long). They also had similar fluctuations in value (volatility) during this time and there were no serious differences in other risk types (e.g. maximum drawdown, diversification contribution in the portfolio, etc.). However, at the end of the observation period, stock portfolio A is now valued significantly higher (more expensive) than portfolio B - measured by a common and reliable valuation indicator, here the P/E ratio.

What are the conclusions? Portfolio B was, and now is, the better investment because it has a higher expected future return (all other things being equal). Portfolio B remains superior even if the valuation estimate for both investments is subject to an (identical) degree of uncertainty.
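One way to make the link between valuation and expected return tangible is the earnings yield, the reciprocal of the P/E ratio. The sketch below treats the earnings yield as a very rough proxy for the expected long-term real return, which is a common simplification and not a method stated in this post. The P/E values of 25 and 15 are hypothetical.

```python
def earnings_yield(pe_ratio):
    """Earnings yield (E/P) as the reciprocal of the P/E ratio."""
    if pe_ratio <= 0:
        raise ValueError("P/E ratio must be positive")
    return 1.0 / pe_ratio


# Hypothetical valuations at the end of the evaluation period.
pe_portfolio_a = 25.0  # the more expensive portfolio
pe_portfolio_b = 15.0  # the cheaper portfolio

for name, pe in (("A", pe_portfolio_a), ("B", pe_portfolio_b)):
    print(f"Portfolio {name}: P/E {pe:.0f}, earnings yield {earnings_yield(pe):.1%}")
```

With identical past returns and risk, the cheaper portfolio B starts the next period with the higher earnings yield, which is the sense in which it is the better investment going forward.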

(7) Non-materialized cluster risks should be taken into account when evaluating the portfolio return

A good portfolio is structured in such a way that it contains no concentration or default risks other than those that the investor has recognized, understands and has consciously accepted, e.g. temporarily. Typically, the materialization of a cluster risk is a rare black swan event. Black swan events may occur only once every 20 to 50 years. [3]

It can be assumed that such a cluster or default risk did not materialize in a given evaluation period. This is exactly what was to be expected ex ante. However, this “normal non-occurrence” does not mean that unmaterialized cluster risks should be ignored in the ex post evaluation of a given portfolio.

From a conventional, so to speak “vulgar” risk perspective, which consists in basing risk evaluation solely on volatility, a portfolio A in which cluster risks have been well diversified away cannot be directly compared with a portfolio B in which a cluster risk exists.

Here is an example: Portfolio A consists of 100 different high-yield bonds from 100 different issuers from different industries and countries, so it is well diversified in terms of default risk. Portfolio B consists of only one high-yield bond. The two portfolios are identical in all other important characteristics (current yield, currency, duration). Portfolio B is very likely to outperform over a period of normal length (e.g. one year or five years) because the probability of this one bond defaulting is low for these periods. In Portfolio A, on the other hand, one of the 100 bonds is almost guaranteed to default, which lowers the return of portfolio A. Only if the small risk of default for the individual bond in B materializes will A perform better (and then dramatically so). Therefore, if B outperforms as expected, one cannot conclude that it was the better portfolio (investment). The B investor was simply lucky, albeit with a kind of luck that occurs in the majority of cases. However, if the luck fails to materialize, the consequences for Portfolio B are extremely negative.

Examples of further cluster risks include decades of underperformance of a country's assets (stocks, bonds, real estate) due to political factors. Examples of default risks include the bankruptcy of the bank holding one's accounts or of the provider of an endowment life insurance policy.
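The arithmetic behind this example can be sketched as follows. The annual default probability of 4% per bond is a hypothetical value chosen for illustration, and the sketch assumes that defaults are independent across issuers and constant over time, which real bond markets do not guarantee.

```python
def survival_probability(annual_default_prob, years):
    """Probability that a single bond does not default in any of the years."""
    return (1.0 - annual_default_prob) ** years


def prob_at_least_one_default(annual_default_prob, years, n_bonds):
    """Probability that at least one of n independent bonds defaults."""
    return 1.0 - survival_probability(annual_default_prob, years) ** n_bonds


def expected_defaults(annual_default_prob, years, n_bonds):
    """Expected number of defaulted bonds among n independent bonds."""
    return n_bonds * (1.0 - survival_probability(annual_default_prob, years))


ANNUAL_DEFAULT_PROB = 0.04  # hypothetical assumption, not from the text

for years in (1, 5):
    p_b_survives = survival_probability(ANNUAL_DEFAULT_PROB, years)
    p_a_hit = prob_at_least_one_default(ANNUAL_DEFAULT_PROB, years, n_bonds=100)
    n_a = expected_defaults(ANNUAL_DEFAULT_PROB, years, n_bonds=100)
    print(f"{years} year(s): B's single bond survives with p = {p_b_survives:.1%}; "
          f"A sees at least one default with p = {p_a_hit:.1%} "
          f"(about {n_a:.0f} expected defaults)")
```

Under these assumptions, B avoids any default in most periods while A almost certainly suffers some. That is why B usually looks better ex post, even though B carries the risk of a near-total loss and A does not.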

(8) Ex post, the benefit of a mediocre investment may have been that it prevented an even worse investment

The financial benefit from an investment A - in addition to its return - often also consists in the fact that the investment A prevented the investor from making a worse investment B. This statement is not sophistry.

When evaluating the past performance of a given investment, an investor should always ask himself, "If I'm being honest, did Investment A stop me from making an even worse Investment B?" Investment success is not only the result of smart, positive decisions and actions, but also the result of avoiding harmful, negative decisions/actions.

We have presented this unusual evaluation perspective in a separate blog post entitled “Via Negativa – an unknown concept for more success when investing”. The via negativa concept is based on the obvious, but often overlooked, insight that the economic success of most wealthy households relative to less wealthy households is due in large part to the fact that they have made fewer investment mistakes than other households over a long period of time. One such avoided mistake could be a “disaster investment” that was avoided by making another investment that may be suboptimal in itself.

(9) When evaluating a portfolio, you should also think about “negative parallel universes”

A portfolio should be structured to provide a minimum level of financial resilience even in “negative future worlds”. Here are some examples of negative “future worlds”, negative “future scenarios”:

The general level of interest rates in the Eurozone is rising noticeably above the current level. As a result, real estate prices fall sharply and the real estate market “freezes”: the transaction volume (purchases/sales) shrinks by more than half. Sellers no longer want to sell at the sharply reduced prices. Buyers do not want to buy because they are waiting for further price declines. For two years I cannot act on my wish to sell at short notice, and after that only at a lower price than expected today.
The German state is tightening the existing rent cap. As a result, the value of rented and owner-occupied residential properties only increases below inflation over 13 years, i.e. falls in real terms.
The USA is experiencing a sovereign debt crisis due to its high national debt. US dollar interest rates (as well as bond interest rates in other countries) are therefore rising sharply. As a result, bond prices fall by 40% in a short period of time. There are reports in the media about a possible haircut on US government bonds. The US dollar is depreciating sharply. My call money (overnight deposit) in US dollars experiences a drawdown of 35%.
There is a systemic banking crisis in the Eurozone. At the same time, many banks are running into serious liquidity problems and are restricting their customers' account withdrawals. [4] According to media reports, my bank, where I hold 700,000 euros in a current account, is potentially insolvent and will no longer allow withdrawals until further notice. It has been unclear for months whether there will be a bailout by the state for my bank above the statutory deposit protection limit of 100,000 euros.
Due to their high valuation today, tech stocks will significantly underperform the general stock market over the next ten years, as was the case for around ten years from the beginning of the noughties. Tech stocks have a weight of around 60% in my portfolio. This pulls the portfolio return below the general market return for years.
The company that I own is in crisis. Its estimated value is halved and no distributions are possible for several years. As a result, my financial peace of mind as an entrepreneur and as a person decreases significantly.

The structure and distribution of a household's assets, once it has accumulated significant assets, should ideally be designed so that no such black swan scenario has an individually catastrophic financial impact on the household, i.e. so that the household's assets suffer but “the very worst” is still prevented, even if that costs some return “in good times”. This is called “financial resilience”. The household should revisit the question of the correct asset allocation and portfolio structure at longer intervals.
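Thinking in negative scenarios can be turned into a simple stress test, sketched below. The household balance sheet, the scenario names and the 25% loss threshold are hypothetical. The shocks of 40% on bonds, 35% on the US dollar deposit and 50% on the owner's company are taken from the scenarios above, while the 30% real estate shock is an assumption made for illustration.

```python
# Hypothetical household balance sheet in euros.
holdings = {
    "global_equity_etf": 400_000,
    "bonds": 200_000,
    "usd_deposit": 100_000,
    "rental_property": 500_000,
    "own_company": 300_000,
}

# Fractional value changes per asset in each scenario.
scenarios = {
    "us_debt_crisis": {"bonds": -0.40, "usd_deposit": -0.35},
    "real_estate_freeze": {"rental_property": -0.30},  # assumed shock size
    "own_company_crisis": {"own_company": -0.50},
}


def scenario_loss(holdings, shocks):
    """Loss in a scenario as a fraction of total household assets."""
    total = sum(holdings.values())
    loss = -sum(holdings.get(asset, 0) * shock for asset, shock in shocks.items())
    return loss / total


def failing_scenarios(holdings, scenarios, max_loss):
    """Names of scenarios whose loss exceeds the tolerated fraction max_loss."""
    return [name for name, shocks in scenarios.items()
            if scenario_loss(holdings, shocks) > max_loss]


MAX_TOLERATED_LOSS = 0.25  # hypothetical resilience threshold

for name, shocks in scenarios.items():
    print(f"{name}: loss of {scenario_loss(holdings, shocks):.1%} of total assets")
print("Scenarios above threshold:",
      failing_scenarios(holdings, scenarios, MAX_TOLERATED_LOSS))
```

For the diversified household in this sketch, no single scenario breaches the threshold. A household with most of its wealth in one company or one property would fail the corresponding scenario, which is the kind of cluster risk this rule is meant to expose.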

(10) When evaluating the performance of a passively managed portfolio, one should consider which goals can be achieved rationally and realistically and which cannot

From our point of view, the primary financial goal of a passively managed portfolio is to be in the top quintile (the top 20%) of all comparable investor households after ten, 20, 30 or 40 years, in terms of final assets achieved or correctly measured average return. A secondary but also very important goal is to avoid disaster performance, i.e. not to be among the worst fifth, say, of all meaningfully comparable investors (see previous point).

“Meaningfully comparable” requires taking into account the risk taken during the period in question (e.g. volatility risks, counterparty risks, cluster risks, liquidity risks, default risks).

For an investor with a passively managed portfolio, a rational, realistic goal with regard to the portfolio cannot be “to have the most profitable investment” or to “be among the most successful 2% of all comparable investors”. By definition, this cannot realistically be achieved with a passively managed portfolio over periods of less than, say, 20 to 30 years (it is possible over longer periods). If the goal is not achievable, then asking after the fact “why did I underperform the best investment Y in the ten years this portfolio has existed?” makes no sense in performance evaluation.

(11) For a portfolio managed by a service provider: Be careful when comparing your own portfolio with the portfolios of other asset managers or banks

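When asset managers and banks want to acquire a new customer, they often ask for information about the customer's existing portfolio X, e.g. based on a securities account statement. They then benchmark this portfolio X against one of their own strategies Y. Naturally, strategy Y has paid better in the past than portfolio X. The provider's other strategies, which on average were no better or even worse than X, are not shown.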

In this case, does the historical outperformance of Y compared to X prove that the new asset manager/bank has more investment skills than the one who managed the previous portfolio X?

No, because to do this you would have to compare all existing strategies of the new asset manager/bank with portfolio X. But the new asset manager/bank will not do this.

Conclusion

Successful investing is a long-term process, a marathon. To judge at each kilometer of this marathon whether one is sufficiently on track toward the desired performance, evaluation criteria are needed. We have formulated eleven simple criteria or rules of thumb here. Anyone who uses them during the annual portfolio review (anything much more frequent is unnecessary) increases the likelihood of actually achieving the marathon goals they have set for themselves.

Endnotes

[1] See article Recency effect in the German Wikipedia or article Recency Bias in the English Wikipedia.

[2] See article Hindsight bias in the German Wikipedia or article Hindsight bias in the English Wikipedia.

[3] What is characteristic of black swan events is that they have a very low probability of occurrence, which often cannot even be quantified, but cause particularly great damage if they occur.

[4] Securities accounts (as opposed to bank accounts) cannot and must not be blocked for legal reasons.