In stock investing, mean reversion refers to the tendency of a stock price to move toward its mean, or average, over time. In particular, when the current price is below the mean, the stock is considered undervalued and expected to rise. Conversely, when the current price is above the mean, the stock is considered overvalued and expected to fall. Mean reversion trading comprises strategies that buy or sell the stock when its price has deviated substantially from its average.

Mean reversion trading determines stock buy and sell points more systematically than traditional charting approaches, mainly because precise numerical threshold levels can be derived from historical data. Mean reversion often arises in sideways markets, e.g., DJIA (1960-1980) as shown above. Other household-name examples of mean-reverting stocks include MSFT (2000-2010) and WMT (2000-2010). Additional studies supporting mean reversion in stock returns can be found in Ref. [1].

Mean reversion can be generated by mathematical simulation. For example, take *X(t)* to be a process governed by the stochastic differential equation

*dX(t) = a(b - X(t)) dt + σ dW(t),*

where *b* is the mean, *a* the rate of mean reversion, and *σ* the volatility.

Here the mean equals 2, the rate of mean reversion is 0.8, the volatility is 50%, and the noise *W(t)* is a standard Brownian motion. A sample path is given in Figure 1 below.
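As a rough illustration (not the authors' code), such a path can be simulated with a simple Euler-Maruyama scheme; the parameter names and step counts below are our own choices:

```python
import numpy as np

# Euler-Maruyama simulation of the mean-reverting (Ornstein-Uhlenbeck) process
# dX(t) = a*(b - X(t))*dt + sigma*dW(t), with the parameters quoted in the text:
# mean b = 2, rate of mean reversion a = 0.8, volatility sigma = 0.5.
def simulate_ou(a=0.8, b=2.0, sigma=0.5, x0=2.0, T=10.0, n_steps=2500, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.empty(n_steps + 1)
    x[0] = x0
    for i in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt))                  # Brownian increment
        x[i + 1] = x[i] + a * (b - x[i]) * dt + sigma * dw
    return x

path = simulate_ou()
```

The sample path fluctuates around the mean level 2, pulled back whenever it strays, which is the behavior shown in Figure 1.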

Consider the case where the stock price is given by *S(t) = exp(X(t))* and the net position can be either flat (no stock holding) or long (one share of stock) at any time. In addition, a slippage cost is added to each transaction. In this case, a set of differential equations (Hamilton-Jacobi-Bellman equations) can be solved to obtain *X1* and *X2* with *X1 < X2*. Here *X1* is the **low** and *X2* the **high**. One should buy if the price is less than or equal to *exp(X1)* and sell if the price is greater than or equal to *exp(X2)*.

To illustrate, let us consider a numerical example. Taking the discount to be *0.5* and the slippage cost 1%, we obtain *X1 = 1.331* and *X2 = 1.631*. For mathematical details, we refer to [1]. These two levels are shown as green lines in Figure 1. Clearly, two main factors affect the overall return: the probability that the price travels from *X1* to *X2*, and the frequency with which it does so.
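A minimal sketch of the buy-low/sell-high rule applied to a simulated path (the helper `run_threshold_rule` and the toy path below are our own illustrative constructions, not the optimal-stopping solution of [1]):

```python
import numpy as np

# Sketch of the buy-low/sell-high rule: go long when the price falls to
# exp(X1) or below, flatten when it rises to exp(X2) or above.  Thresholds
# X1 = 1.331, X2 = 1.631 and the 1% slippage are from the numerical example;
# any position still open at the end is simply ignored.
def run_threshold_rule(x_path, x1=1.331, x2=1.631, slippage=0.01):
    wealth, holding, entry = 1.0, False, None
    for x in x_path:
        price = np.exp(x)
        if not holding and x <= x1:
            entry = price * (1.0 + slippage)            # buy, paying slippage
            holding = True
        elif holding and x >= x2:
            wealth *= price * (1.0 - slippage) / entry  # sell, paying slippage
            holding = False
    return wealth

# A toy mean-reverting path (Euler scheme for the process described above).
rng = np.random.default_rng(1)
dt, x_path = 1.0 / 252, [2.0]
for _ in range(2520):
    x_path.append(x_path[-1] + 0.8 * (2.0 - x_path[-1]) * dt
                  + 0.5 * np.sqrt(dt) * rng.normal())
final_wealth = run_threshold_rule(x_path)
```

Every completed round trip multiplies wealth by at least exp(X2 - X1) net of slippage, so wealth never falls below its starting value under this rule on a single buy-then-sell cycle.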

In practice, mean reversion trading rules are often used in conjunction with trend following strategies. In addition to more mathematical issues such as model calibration and prediction, one has to decide *a priori* on a stop-loss level when implementing the trading rules, to prevent substantial losses.

**References:**

[1] H. Zhang and Q. Zhang, Trading a mean-reverting asset: Buy low and sell high, Automatica, Vol. 44, pp. 1511-1518, (2008).

*Leverage space is simply a manifold, structured such that we can examine and discuss matters pertaining to growth functions of stochastic outcomes, such as processes of money management and portfolio allocations in the capital markets and gambling scenarios. It is a framework for observing and analyzing these functions, as well as drawing conclusions about how we manage these functions for our benefit.*

*Since growth functions are a function of time, much pertaining to what occurs in the manifold of leverage space is a function of time also. Additionally, we examine what occurs therein in the asymptotic or long-run sense as well.*

We begin with the notion of a *player*, who is confronted with a set of *discrete*, possible outcomes for a random event that he will allocate resources to, and the outcome of this event will determine how much resources he gets back or loses.

These discrete outcomes each have an associated probability. Let us take the hypothetical case of a player confronted with a coin-toss proposition that returns 2 units to him if heads is tossed, but costs him 1 unit if tails is tossed. Thus, the set of discrete outcomes presented is:

| Outcome | Quantity | Probability |
| --- | --- | --- |
| Heads | 2 | 0.5 |
| Tails | -1 | 0.5 |

The player wishes to maximize what he will get back, considering the sum of all possible discrete outcomes times their respective probabilities (the classical *‘expectation’*), and considering the amount he has available to risk on the *proposition*.

Since the expectation per unit wagered is the same regardless of the amount risked, expected gain grows linearly with the amount wagered, and so it is maximized by wagering 100%.

Our player is permitted to risk a fraction, *f*, of his available capital. Being a fraction, it is bounded between 0 and 1; thus, given that the per-unit expectation does not depend on the amount wagered, expected growth is always maximized for a positive expectation at *f* = 1, whereas for a negative expectation it is always maximized at *f* = 0 (*i.e.*, nothing risked).

However, this assumes we are quitting after only one play. That is to say, for a positive expectation, such as this 2-to-1 coin toss, expected growth is maximized by risking 100% of the player's available resources only if the player's horizon is 1 play. Thus, the fraction to risk is a function of the player's *horizon*, the point at which he intends to cease risking resources on the proposition.

Let us examine what now happens with two plays of this same proposition, the coin toss that pays 2 to 1. Utilizing the model presented in [2] and [3] (which yields an expected growth-optimal fraction to wager of 1 for a horizon of 1), we find that the fraction of available resources to wager for a horizon of 2 plays is now *f* = .5 per play.

And if we continue expanding the horizon, that is, the number of plays after which the player will cease risking resources on the proposition, we find the fraction to risk (on each play) continues to diminish up to a point:

This point, where the expected growth-optimal fraction settles to an asymptote as the horizon (which we will denote by the variable *Q*) approaches infinity, we will refer to as the *Kelly Criterion* growth-optimal fraction (*f* = .25 in our hypothetical 2-to-1 coin toss case).
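This asymptotic value can be checked numerically by maximizing the expected log growth per play, which for the 2-to-1 coin toss is g(f) = 0.5 log(1+2f) + 0.5 log(1-f) (a standard Kelly computation; the grid search below is our own sketch):

```python
import numpy as np

# Numerical check of the asymptotic growth-optimal (Kelly) fraction for the
# 2-to-1 coin toss: maximize g(f) = p*log(1 + win*f) + q*log(1 - loss*f).
def kelly_fraction(win=2.0, loss=1.0, p=0.5, grid=100001):
    f = np.linspace(0.0, 0.999, grid)
    growth = p * np.log(1.0 + win * f) + (1.0 - p) * np.log(1.0 - loss * f)
    return f[np.argmax(growth)]

f_star = kelly_fraction()   # closed form: p - q/b = 0.5 - 0.5/2 = 0.25
```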

In the real world, what is risked in terms of resources is often not that which is &#8220;put up.&#8221; Consider the case of short sales, for example, or futures positions, where a certain good-faith margin is put up, but what is actually risked is not the same amount.

In the 2-to-1 coin toss example already cited, what is put up (1 unit) is also that which is risked, and this is almost always the case in gambling situations. In capital-markets trading situations, the mechanics are often a little more complicated.

Building on this example, let us suppose the player must put up 1 unit to assume a wager, but what he can actually lose is a different amount, ranging from 1 unit down to .1 unit. In such situations, the Kelly Criterion must be amended to weight for the largest losing outcome among the discrete outcomes that are possible. This amendment I refer to as Optimal *f*; it is necessary so as to keep the value a fraction (0 <= *f* <= 1).

| Heads | Tails | Kelly Criterion | Optimal f |
| --- | --- | --- | --- |
| 2 | -1 | 0.25 | 0.25 |
| 2 | -0.8 | 0.375 | 0.3 |
| 2 | -0.5 | 0.75 | 0.375 |
| 2 | -0.25 | 1.75 | 0.4375 |
| 2 | -0.1 | 4.75 | 0.475 |
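The table can be reproduced numerically under the standard Optimal f construction, in which outcomes are scaled by the largest losing outcome (the function below is our own sketch of that computation):

```python
import numpy as np

# Optimal f maximizes g(f) = p*log(1 + f*win/|L|) + q*log(1 + f*loss/|L|),
# where |L| is the largest losing outcome; the Kelly-style leverage factor
# in the table is then Optimal f divided by |L|.
def optimal_f(win, loss, p=0.5, grid=200001):
    big_loss = abs(loss)
    f = np.linspace(0.0, 0.999, grid)
    g = (p * np.log(1.0 + f * win / big_loss)
         + (1.0 - p) * np.log(1.0 + f * loss / big_loss))
    f_opt = f[np.argmax(g)]
    return f_opt, f_opt / big_loss   # (Optimal f, leverage factor)

rows = {loss: optimal_f(2.0, loss) for loss in (-1.0, -0.8, -0.5, -0.25, -0.1)}
```

For the 2 to -0.25 row this returns roughly (0.4375, 1.75), matching the table.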

The critical difference, as exemplified here, is that the answer returned by the Kelly Criterion, sans weighting to the largest potential loss, is *not* the optimal fraction but rather the optimal *leverage factor*. That is, if we look at the case of a 2 to .25 coin toss, we find the Kelly Criterion solution to be 1.75. This does not mean to risk 175% of our available capital[1], but rather to operate as though we had 1.75 times our available resources, into which 1 unit (the amount that must be put up) is divided.

In the largest-losing-outcome-weighted solution, the Optimal *f* solution, we have an actual fraction to wager. (It should be pointed out that the Kelly Criterion solution, being a leverage factor, never equals the expected asymptotic growth-optimal fraction to wager except in the special case where the amount put up to assume the wager equals the worst-case possible outcome.) The player thus multiplies the amount of available resources by this fraction, dividing the largest potential losing outcome into that product.

Ultimately, the number of wagers to assume is the same regardless of whether one uses the original Kelly Criterion calculation or the worst-possible outcome weighted solution (Optimal *f*) but only the latter is an actual fraction. By using the actual fraction, we bound the risk axes between 0 and 1.

Thus far we have discussed the expected growth-optimal peak of the curve, which, for a set of discrete outcomes with a positive expectation, begins at *f* = 1 for a horizon of 1 (*Q* = 1) and migrates to its asymptotic peak. But there is much more to the character of the curve and its implications than simply the peak.

Let us examine some of these properties now. Shown in the next figure is our 2-to-1 coin toss game after 40 plays. The vertical axis represents what we would expect to make, as a multiple of our initial resources, and the horizontal axis represents the fraction of our resources we risk; we can state that the more we risk, the greater will be our drawdown.

Notice that if we risk a fraction of .4 we make exactly the same as if risking a fraction of .1 (both making less than the peak at .25), yet risking .4 will have approximately 4 times the drawdown that risking .1 has. Clearly, it does not pay to exceed the peak, a point we will refer to as *kappa*.

Additionally, notice that beyond risking a fraction of .5 the expected growth is less than 1. In other words, if we risk more than a fraction of .5 (that is, more aggressively than 1 unit for every 2 units we have), we are multiplying our resources by an amount less than 1. Thus, even though we are engaged in a very favorable proposition and are *not* borrowing, we can still lose all of our resources by being aggressive beyond a certain point. We call this point *psi*.
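These features of the Q = 40 curve can be checked numerically if we model the expected growth multiple as geometric-mean growth, G(f) = ((1+2f)(1-f))^(Q/2) (an assumption on our part that is consistent with the equalities described in the text):

```python
import numpy as np

# Growth multiple after Q plays of the 2-to-1 coin toss, modelled as
# geometric-mean growth: G(f) = ((1+2f)^0.5 * (1-f)^0.5)^Q.
def growth_multiple(f, Q=40):
    f = np.asarray(f, dtype=float)
    return ((1.0 + 2.0 * f) ** 0.5 * (1.0 - f) ** 0.5) ** Q

g_01, g_025, g_04, g_05 = growth_multiple([0.1, 0.25, 0.4, 0.5])
```

Under this model, risking .1 and risking .4 yield the same multiple (both below the peak at .25), the multiple at .5 is exactly 1, and any fraction beyond .5 multiplies resources by less than 1.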

Starting at *f* = 0 and increasing the value of *f*, we find that the expected return increases at an ever-faster rate up to a point, then the rate decreases. That is to say, there is a point left of the peak, *kappa*, where the marginal increase in gain slows down for increasing risk. This is the point where the curve goes from concave up to concave down, and it represents the peak of the first derivative of the curve. This *inflection point* we call *nu*, and we view it as the most conservative of the risk-adjusted-return optimal points.

The (left-of-the-peak) inflection point (*nu*) does not appear at first. Clearly there can be no inflection point at *Q* = 1, where the curve is a straight line (as shown in Figure 1), nor for the purely convex function at *Q* = 2 (as shown in Figure 2); in the 2-to-1 coin toss example, even at *Q* = 8 (as shown in Figure 3) there is still no inflection point. Figure 6, where *Q* = 40, clearly shows one. This point, *nu*, like the peak *kappa* itself, migrates towards the asymptotic peak value as *Q* grows towards infinity.

Another important point, the more aggressive of the two risk-adjusted optimal points, we refer to as *zeta*; it represents the point where gain with respect to risk is generally maximized. If we look at the vertical axis as gain and the horizontal axis as risk, then clearly the point on the curve whose connecting line to the point where *f* = 0 has the steepest slope (a line therefore tangent to the curve, which can only appear when *Q* is sufficiently large that a *nu* point appears) is the point where the ratio of return to risk is maximized. We call this point *zeta* (*nu* < *zeta* < *kappa*).
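Assuming the same geometric-growth model of the curve, the points *nu*, *zeta*, and *kappa* can be located numerically (a sketch; the grid resolution and the Q = 40 horizon are our own choices):

```python
import numpy as np

# Locate nu (inflection), zeta (tangent through f = 0), and kappa (peak)
# for the Q = 40 growth curve G(f) = ((1+2f)(1-f))^(Q/2) of the 2-to-1 toss.
Q = 40
f = np.linspace(1e-4, 0.4, 40001)
G = ((1.0 + 2.0 * f) * (1.0 - f)) ** (Q / 2)

dG = np.gradient(G, f)                 # first derivative of the curve
nu = f[np.argmax(dG)]                  # inflection: peak of G'
zeta = f[np.argmax((G - 1.0) / f)]     # steepest line through (0, G(0) = 1)
kappa = f[np.argmax(G)]                # growth-optimal peak
```

For Q = 40 this gives roughly nu near .13 and zeta near .18, both left of the peak kappa near .25, consistent with the ordering nu < zeta < kappa.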

Notice in Figure 7 that the slope at the point *kappa* (in green) is shallower than that at the return/risk-optimal point, *zeta*, the point where the tangent line (in blue) touches the return function.

Finally, it should be pointed out that if we are using the fractional representation, the expected growth-optimal peak, *kappa*, will never be greater than the sum of the probabilities of the winning discrete outcomes. Thus, in a coin-toss situation, *kappa* can never be greater than .5 for an unbiased coin, regardless of the payout. One can always be certain the peak of the curve, *kappa*, for all possible sets of discrete outcomes, will reside somewhere between 0 and the sum of the probabilities of the winning outcomes.

One final point must be made here: whenever one places a wager, assumes a position in the capital markets, or has a vested stake in a growth function based on a stochastic outcome, one resides somewhere on this very same curve. The peak and the other chronomorphically-important loci of growth regulation (*nu*, *zeta*, *psi*) sit at different points for different *Q*, but one is always somewhere on this curve, the benefits or consequences of that specific location at work whether acknowledged or not.

Thus far we have looked at only one proposition. We now turn our attention to multiple, simultaneous propositions.

If we have two coins, two propositions, two components, we are now looking for the two values of *f* that represent a coordinate in three-dimensional space. For *N* components, we are thus looking for a peak in an *N*+1 dimensional space. For two 2-to-1 coin tosses, played simultaneously, we therefore have a peak at *f*1 = .23, *f*2 = .23.
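This peak can be checked with a brute-force grid search over the joint log growth of the four equally likely outcome pairs (our own sketch, assuming the two tosses are independent):

```python
from math import log

# Joint expected log growth for two simultaneous, independent 2-to-1 coin
# tosses: the four outcome pairs each occur with probability 1/4.
def joint_growth(f1, f2):
    outcomes = ((2, 2), (2, -1), (-1, 2), (-1, -1))
    return sum(0.25 * log(1.0 + f1 * a + f2 * b) for a, b in outcomes)

# Grid search over the unit square (restricted so total risk stays below 1).
grid = [i / 1000 for i in range(451)]
best = max(((joint_growth(f1, f2), f1, f2) for f1 in grid for f2 in grid),
           key=lambda t: t[0])
f1_star, f2_star = best[1], best[2]
```

By symmetry the optimum is the same on both axes, and the search lands near (.23, .23), below the single-proposition Kelly fraction of .25.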

The propositions do not have to be identical; they can be entirely different propositions. What is necessary, however, is that the propositions transpire over the same windows of time. That is, if we are looking at multiple simultaneous propositions, they should all transpire over what we call a holding period, which can be any uniform window of time, a day, a week, a month, a year, so long as it is uniform across components. Thus, for example, if our holding periods are months, our table of potential outcomes must consist of outcomes over the course of a month.

Notice now the necessity for scaling all axes to values between 0 and 1. Thus, we obtain an *N + 1* dimensional manifold scaled in all dimensions between 0 and 1 save for the return (altitude) dimension.

Also notice the grey area in Figure 8, representing those *f*-value coordinates where the growth function is less than 1. Notice how one can be at the asymptotic peak value on one axis (*f* = .23) while the other axis is such that the aggregate growth function for both simultaneous propositions is less than 1. This is counter to the usually-accepted notions of diversification.

Just as with the single proposition, the surface changes shape as the number of plays or holding periods increases, the important points of growth regulation migrating at varying rates towards the asymptotic peak. Further, these important loci of growth regulation (*nu*, *zeta*, *psi*), although single points in the single-proposition case, become manifolds of *N* dimensions within the *N*+1 dimensional leverage space manifold (with the exception of *kappa*, which is always a solitary point). The calculation of these manifolds is covered in depth in [1], and just as with the single proposition, one is always at some locus, some point on the surface in leverage space, whether acknowledged or not.

Therefore, one often wishes to allocate based upon the coordinates of a point in leverage space. For example, by residing within the *N*-dimensional *zeta* manifold at a given horizon, *Q*, in an *N*+1 dimensional leverage space of *N* components, one seeks to satisfy the criterion of being reward/risk optimal (with risk defined as drawdown in the case of the *zeta* manifold), as opposed to simply solving, as with most portfolio-allocation strategies, the more conventional criterion of being mean-variance return optimal over the next, solitary period. The manifold of leverage space allows us to satisfy other criteria, either by residing at certain loci to achieve those criteria, or by following paths along the surface within the leverage space manifold, based on other events, to achieve other criteria.

Often, various risk metrics are employed which, if violated, would result in ruin to the leverage space implementer. For example, if the probability of a given drawdown by a given horizon, *Q*, is exceeded, then that part of the surface of leverage space should be removed (so that the surface at such loci is at an altitude of 0). This is exemplified in Figure 9 for the two-component case:

Another risk-violating criterion might be, for example, allocating more than *X*% to any one component (in which case the altitude, the return function, for all loci where any component has an *f* value greater than that threshold would be reduced to 0). The superimposition of risk metrics on the surface of leverage space, and the notion that various criteria can be satisfied by residing at different loci or traversing various paths along this surface, opens up a wide range of possibilities in portfolio construction, allocation, and money management techniques.

There is much more to explore here.

**References**

[1] M. L. de Prado, R. Vince, Q. J. Zhu, Optimal Risk Budgeting under a Finite Investment Horizon, SSRN, 2364092, 2013.

[2] R. Vince and Q. J. Zhu. Inflection point significance for the investment size. SSRN, 2230874, 2013.

[3] R. Vince and Q. J. Zhu. Optimal Betting Sizes for the Game of Blackjack. SSRN, 2324852, 2013.

[4] R. Vince. *Risk-Opportunity Analysis*. Createspace Division of Amazon, 2012.


So what should we do in practice? Here we shed some light on this issue by giving a graphic illustration of the recent analysis of Vince and Zhu [1].

Consider a simple game betting on the flip of a biased coin. Kelly's formula is well illustrated by the graph of the average log return per play below, where the unique peak corresponds to the Kelly optimal betting size.

Nevertheless, two key components pertinent to real applications are absent in Kelly's formula: risk aversion and a finite investment horizon. Adding these practical considerations, the picture changes dramatically. It turns out that, when playing the game only a finite number of times, the total return as a function of the betting size becomes, in general, a bell-shaped curve. Moreover, the risk as measured by drawdown is approximately proportional to the bet size. The goal of a player is then to maximize the ratio of total return to bet size. Graphically, for any given point on the total return curve, this ratio is exactly the slope of the line connecting that point to the origin. Three typical lines are illustrated in the featured graph at the beginning of the blog.

It is clear that the top line, which is tangent to the return curve, indicates the bet size maximizing the return / bet size ratio. Comparing it to the middle line, which passes through the peak of the return curve (a point that can be shown to be very close to the Kelly optimal bet size), we see a theoretical justification for being more conservative than the Kelly optimal bet size in practice. The lower line is also significant. It passes through the inflection point, the boundary between increasing and decreasing marginal return as the bet size increases. This is the most conservative of the three points: when the bet size increases beyond this inflection point, the return / bet size ratio may still increase, but the marginal increase diminishes. This makes the inflection point a reasonable conservative choice as well.

Empirical analysis of a realistic Blackjack game in [1] shows that in practice a reasonable bet size is only one quarter to one third of what the Kelly formula suggests. The ideas and qualitative conclusions discussed here also apply to investment capital allocation. Of course, when dealing with problems involving multiple investment assets or strategies, the analysis is technically much more involved. Detailed discussion of the related theory and implementations can be found in Lopez de Prado, Vince and Zhu [2]. The clear message regarding asset allocation: when in doubt, be conservative.

References:

[1] Vince, R. and Zhu, Q. J., Optimal Betting Sizes for the Game of Blackjack, (2013) SSRN 2324852.

[2] Lopez de Prado, M., Vince, R. and Zhu, Q. J., Optimal Risk Budgeting under a Finite Investment Horizon, (2013) SSRN 2364092.

Putting the correctness of such experts' opinions aside, the sheer single-minded focus on stock picking is already very misleading. In fact, most successful professional traders and investors will tell you that money management is an important component of their trading systems.

To understand why, let us try this question: can one lose money in a game in which one has a favorable probability of winning? The answer is: absolutely yes. For example, suppose that a skillful gambler is, on average, right on 9 out of 10 of his bets. For each winning bet he doubles the amount staked, and for each losing bet he loses the entire stake. The following table, created with a spreadsheet, is thought-provoking. We assume that the gambler starts with 100 units, plays 10 games, and the sole losing one is game 7.

| Game / Bet Size | 20% | 40% | 60% | 80% | 90% | 100% |
| --- | --- | --- | --- | --- | --- | --- |
| 1 Win | 120 | 140 | 160 | 180 | 190 | 200 |
| 2 Win | 144 | 196 | 256 | 324 | 361 | 400 |
| 3 Win | 173 | 274 | 410 | 583 | 686 | 800 |
| 4 Win | 207 | 384 | 655 | 1050 | 1303 | 1600 |
| 5 Win | 249 | 538 | 1049 | 1890 | 2476 | 3200 |
| 6 Win | 299 | 753 | 1678 | 3401 | 4705 | 6400 |
| 7 Lose | 239 | 452 | 671 | 680 | 470 | 0 |
| 8 Win | 287 | 632 | 1073 | 1224 | 894 | 0 |
| 9 Win | 344 | 885 | 1718 | 2204 | 1698 | 0 |
| 10 Win | 413 | 1240 | 2749 | 3967 | 3227 | 0 |

Columns of the table correspond to the size of the gambler's bets as a percentage of his account balance, and the rows show his balance after the corresponding number of games.

Clearly, betting all one has produces the worst outcome, since a single losing bet leads to the loss of all capital despite the gambler's high success rate. By examining the last row of the table, we find that the best betting size in this situation is 80%. A bit of thought will convince us that, in fact, when the one losing bet happens does not influence the final outcome. The question of how to find the best betting size in general was first carefully analyzed by John Kelly, a scientist at AT&T Bell Labs. Here is Kelly's reasoning. Suppose that the gambler has a winning probability of *p*, so that the losing probability is *q = 1 - p*. Then the long-term asymptotic average log gain per play, as a function of the betting fraction *f*, is

*p* log(1+*f*) + *q* log(1-*f*).

A typical graph of the log return as a function of *f*, using the data from our gambling example, is featured at the beginning of this blog.

This simple graph contains rich information with important ramifications for investment decisions. First, we see that the curve has a unique peak at 0.8, which coincides with the maximum in the table above, validating that 80% is the best betting size. Moreover, it is not hard to calculate the critical point of the log return function and derive that the best betting size in general is *p - q*. This is the famous Kelly formula. Furthermore, when *f* approaches 1 the curve dips below the horizontal axis, indicating that even with a huge advantage in the game one could still lose money by betting inappropriately. Similarly, inappropriate investment allocation can significantly hurt the performance of even a skilled stock picker. Finally, the curve is asymmetric about the peak: the right-hand side looks like a cliff which drops into losing territory in no time. Thus, in practice one should never go near the peak. In fact, Ed Thorp, the pioneer who extended the Kelly formula and used it in stock and option trading, said in a recent interview that he is more comfortable using only a fraction of the allocation that the Kelly formula suggests.
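The table and Kelly's formula can be verified with a few lines (our own sketch; note that the order of the wins and the loss does not affect the final balance):

```python
# Reproduce the gambler's table: starting from 100 units, nine winning bets
# (each doubling the amount staked) and one losing bet, for a fixed fraction f.
def final_balance(f, wins=9, losses=1, start=100.0):
    balance = start
    balance *= (1.0 + f) ** wins       # each win adds the amount staked
    balance *= (1.0 - f) ** losses     # the single loss forfeits the stake
    return balance

balances = {f: final_balance(f) for f in (0.2, 0.4, 0.6, 0.8, 0.9, 1.0)}
kelly = 0.9 - 0.1                      # Kelly's formula f* = p - q = 0.8
```

The 80% column indeed ends highest (about 3967), the 100% column ends at 0, and the Kelly formula p - q recovers the same 0.8.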

We restrict ourselves to analyzing a simple gambling problem mainly to avoid technical complications. The methods and the qualitative principles indicated by the results apply equally to investment problems. The take-home message from this discussion is that how you allocate your investment capital is at least as important as how you pick your investments. The technical details involved in applying such capital allocation in practice also explain why this is one of the best-kept secrets among professionals: it is hard to convey in 30-second sound bites. This and similar in-depth knowledge is exactly how serious investors distinguish themselves from the rest of the pack.

Trend following is an investment strategy based on technical analysis. The basic premise is that the market can be regarded as either a bull market or a bear market at any given time.

Trend followers take advantage of these trends in making their buying and selling decisions. Rather than focusing on predicting specific price levels, as in other areas of technical analysis, trend followers simply jump on the trend and ride it.

There are several different ways, over various time frames, to determine the general market direction. Traditionally, moving averages and channel breakouts are used to identify current market trends. Here we focus on a new trend-following approach developed by Dai, Zhang, and Zhu [1].

This approach starts with two assumptions: (1) The stock prices follow a bull-bear switching geometric Brownian motion; (2) The switching process is a hidden Markov chain.

Under these assumptions, we are able to compute the conditional probability {p(n): n=0,1,2,…} of a bull market given the past stock daily closing prices S(0),S(1),…,S(n) with S(n) being the most recent price.

Our trading rules are based purely on the readings of p(n). To facilitate our exposition, we introduce the following notation:

M1: annual bull market return rate,

M2: annual bear market return rate,

1/L1: average duration of bull markets,

1/L2: average duration of bear markets,

V: stock volatility,

dt: time unit 1/252.

For example, using DJIA (1962-2008), we can estimate these parameters and have M1=0.18, M2=-0.77, L1=0.36, L2=2.53, and V= 0.184.

The conditional probability p(n), n=1,2,…, can be obtained as follows:

**p(n)=min{max{p(n-1)+F(p(n-1))*dt+[[(M1-M2)*p(n-1)*(1-p(n-1))]/(V*V)]*ln(S(n)/S(n-1)),0},1},**

where **p(0)** is an initial guess taking value in [0,1] and

**F(p)=-(L1+L2)*p+L2-{(M1-M2)*p*(1-p)*[(M1-M2)*p+M2-0.5*V*V]}/(V*V).**

Note that **p(n)** stays in [0,1] for all n. It behaves pretty much like a traditional indicator (e.g., RSI) in technical analysis. Nonetheless, the way it is used in determining market trends is quite different from that of RSI.
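The recursion for **p(n)** can be implemented directly; the short price series below is purely synthetic, for illustration only:

```python
import numpy as np

# The conditional-probability recursion from the text, with the DJIA
# (1962-2008) parameter estimates: M1=0.18, M2=-0.77, L1=0.36, L2=2.53, V=0.184.
M1, M2, L1, L2, V, dt = 0.18, -0.77, 0.36, 2.53, 0.184, 1.0 / 252

def F(p):
    return (-(L1 + L2) * p + L2
            - (M1 - M2) * p * (1 - p) * ((M1 - M2) * p + M2 - 0.5 * V * V)
            / (V * V))

def update(p_prev, s_prev, s_now):
    p = (p_prev + F(p_prev) * dt
         + (M1 - M2) * p_prev * (1 - p_prev) / (V * V) * np.log(s_now / s_prev))
    return min(max(p, 0.0), 1.0)       # clip to [0, 1]

# Illustrative use on a short synthetic price series (not real index data):
prices = [100.0, 101.0, 100.5, 102.0, 101.2]
p = 0.5                                # initial guess p(0)
for s_prev, s_now in zip(prices, prices[1:]):
    p = update(p, s_prev, s_now)
```

Note that F(0) = L2 and F(1) = -L1, so the drift pushes p(n) back into the interior of [0,1] from either boundary, and the clipping enforces the min/max in the formula.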

In the above figure, the red curve represents DJIA from 1960 to 1981 and the blue curve the corresponding conditional probability **p(n)**.

We only consider the long-side of trading, i.e., go long in a bull market and sit on the sideline in a bear market.

Assuming a fixed percentage transaction cost, it is shown in [1] that the optimal trading rules can be given in terms of two threshold levels BL > SL. For example, using the DJIA (1962-2008) estimates, we obtain BL=0.934 and SL=0.768. These two threshold levels can be seen in the figure (DJIA: 1960-1980). The trend-following trading rule is to buy when **p(n)** crosses BL from below and to close the long position when **p(n)** crosses SL from above.
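A sketch of this crossing rule as code (the helper `tf_signals` and the sample probability series are our own illustrative constructions):

```python
# Buy when p(n) crosses BL = 0.934 from below; close the long position when
# p(n) crosses SL = 0.768 from above (thresholds from the DJIA estimates).
def tf_signals(p_series, BL=0.934, SL=0.768):
    position, signals = 0, []
    for p_prev, p_now in zip(p_series, p_series[1:]):
        if position == 0 and p_prev < BL <= p_now:
            position, sig = 1, "buy"
        elif position == 1 and p_prev > SL >= p_now:
            position, sig = 0, "sell"
        else:
            sig = None
        signals.append(sig)
    return signals

sigs = tf_signals([0.5, 0.95, 0.97, 0.70, 0.60, 0.96])
```

On this toy series the rule buys on the upward crossing of BL, holds while p(n) stays high, sells on the downward crossing of SL, and re-enters on the next upward crossing.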

We test our trend following strategy on the historical data of the SP500 (1962-2008), DJIA (1962-2008) and NASDAQ (1991-2008) indices. In these tests we assume a transaction cost K=0.1% and use the 10-year Treasury bond as the alternative risk-free investment instrument, applying the actual yield while holding the bonds. The test results for the total (annual) returns of this trend following strategy on the three indices are contrasted with buy-and-hold and bond investment in the following table.

| Indices | Period | Trend Following | Buy and Hold | 10yr Bond | #Trades |
| --- | --- | --- | --- | --- | --- |
| NASDAQ | 1991-2008 | 8.82 (12.86%) | 4.24 | 2.63 | 66 |
| SP500 | 1962-2008 | 64.98 (10.00%) | 56.2 | 23.44 | 80 |
| DJIA | 1962-2008 | 26.03 (7.18%) | 12.11 | 23.44 | 80 |

Graphic illustrations of these results are presented in Figures log(NASDAQ: 1991-2008), log(SP500: 1962-2008) and log(DJIA: 1962-2008), respectively, where the blue curve is the return of the trend following strategy in log scale.

We should be clear that the results described here are not intended for direct use as an investment strategy. To develop this model into a useful investment strategy one needs to address several important issues: (1) go beyond tests on the three indices and use techniques such as hold-out samples to avoid backtest overfitting; (2) diversify to make the strategy more stable; and (3) control the risks using various mechanisms. Nevertheless, these results demonstrate how mathematics can be useful in making trading decisions. Further studies of trend following can be found in [2] and [3].

**References:**

[1] M. Dai, Q. Zhang, and Q. Zhu, Trend following trading under a regime switching model, SIAM Journal on Financial Mathematics, Vol. 1, pp. 780-810, (2010).

[2] M. Dai, Q. Zhang, and Q. Zhu, Optimal trend-following trading rules, Working paper, (available upon request).

[3] D. Nguyen, G. Yin, and Q. Zhang, A stochastic approximation approach for trend-following trading, Hidden Markov Models in Finance: Volume II (Further Developments and Applications), Springer, R.S. Mamon and R.J. Elliott, Eds.
