Factors to predict Out-of-Sample Performance

sedelstein

#1

6/2/2016 12:50 PM

Here is a new paper dated 3/9/16 from the folks at Quantopian that people here may be interested in reading.

The paper, "All that Glitters Is Not Gold: Comparing Backtest and Out-of-Sample Performance on a Large Cohort of Trading Algorithms" (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2745220), offers some interesting factors that might be useful to have in the Performance+ visualizer.

Some of the factors are already available to us. The attached graph is an excerpt which ranks the factors the authors found to be important.

Some interesting ones include:

tail_ratio - Ratio between the 95th and (absolute) 5th percentile of the daily returns distribution. This was found to be the most important factor. For example, a tail ratio of 0.25 means that losses are four times as bad as profits.
skewness - Skewness of the daily returns distribution
kurtosis - Kurtosis of the daily returns distribution
sharpe_std - Standard deviation of the rolling 6-month Sharpe ratio
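
For anyone who wants to try these four factors on their own daily returns, here is a minimal Python sketch. It is not the paper's code: the function names, the 126-day window standing in for "6 months", the 252-day annualization and the zero risk-free rate are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def tail_ratio(returns):
    """95th percentile divided by the absolute 5th percentile of the returns."""
    r = np.asarray(returns, dtype=float)
    return np.percentile(r, 95) / abs(np.percentile(r, 5))


def rolling_sharpe_std(returns, window=126, periods_per_year=252):
    """Std. dev. of a rolling annualized Sharpe ratio.

    Assumptions: 126 trading days ~ 6 months, risk-free rate = 0.
    """
    r = pd.Series(returns, dtype=float)
    roll = r.rolling(window)
    sharpe = roll.mean() / roll.std() * np.sqrt(periods_per_year)
    return sharpe.std()


def summarize(returns):
    """Collect the four factors from the list above into one dict."""
    r = pd.Series(returns, dtype=float)
    return {
        "tail_ratio": tail_ratio(r),
        "skewness": r.skew(),
        "kurtosis": r.kurt(),  # pandas reports excess kurtosis (normal = 0)
        "sharpe_std": rolling_sharpe_std(r),
    }


# Synthetic daily returns, used only to show the call; not real strategy data.
rng = np.random.default_rng(0)
daily = rng.normal(0.0005, 0.01, 750)
print(summarize(daily))
```

With 10 days at -4%, 80 flat days and 10 days at +1%, tail_ratio comes out at 0.25, which matches the "losses are four times as bad as profits" reading above.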

Food for thought for those of us interested in avoiding overfitting.

Regards

Capture1.PNG

Carova

#2

6/2/2016 8:13 PM

It would be great if these factors could be added to the MS123 Scorecard.

Would that be possible Eugene?

Vince

Eugene

#3

6/3/2016 12:49 PM

Steve, thanks for the find. Most of the metrics aren't suitable, for different reasons: skewness and kurtosis are too specific, and a rolling Sharpe ratio is too big an effort. Let's see if Tail Ratio is useful or not. At the moment I'm pretty skeptical re: tail ratio because it isn't bulletproof.

Consider a simple example. The 95th percentile of a system's EOD returns is 43419 and the absolute 5th percentile is 66144. The tail ratio is therefore 0.65, meaning that the loss is only ~1.5 times greater than the win. That should be easy enough to stomach, right? Well, the attached chart (tail_ratio.png) shows the actual backtested equity behind that 0.65 tail ratio, with a close to total loss of capital: its Net Profit is -74%.

Vince, why do you think it'd be great? If you had tested them before then what were your findings? Don't you find that skewness and kurtosis are too specific to fit the paradigm of the MS123 Scorecard? At least I do. If I were to consider adding something I'd give it a go first in the visualizer to see if it proves useful and only then go along.

tail_ratio.png

sedelstein

#4

6/3/2016 6:07 PM

Hi Eugene

Were the dollar P&L figures above for a constant dollar amount per trade? I would have thought the returns should be in percent and the strategy executed with a constant percentage of the portfolio allocated to each trade.

At any rate, I thought the paper might be helpful to the folks here.

Eugene

#5

6/3/2016 8:25 PM

Steve, the test was done using Percent Equity sizing and the daily returns were absolute. In my experiments, values below 0.50 were also reported in some cherry-picked tests (thereby confirming the percentile formula is correct), but most of the time the TR stays above 0.7 even for desperately losing systems.

Taking the returns of a losing system in percent doesn't make a difference: the tail ratio numbers still make no sense to me. Thanks for the suggestion anyway.
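
To illustrate the point with something reproducible, here is a rough Python sketch. It is not the backtest discussed above: the "losing system" is a hypothetical series of normally distributed daily percent returns with a small negative drift, and the seed, drift and volatility are arbitrary assumptions. Because the tail ratio only compares two percentiles, a steady negative drift barely moves it while the equity curve still bleeds out.

```python
import numpy as np


def tail_ratio(returns):
    """95th percentile over the absolute 5th percentile of the returns."""
    r = np.asarray(returns, dtype=float)
    return np.percentile(r, 95) / abs(np.percentile(r, 5))


# Figures quoted earlier in the thread (95th vs. absolute 5th percentile).
quoted_ratio = 43419 / 66144  # about 0.656

# Hypothetical losing system: negative drift plus symmetric daily noise.
rng = np.random.default_rng(42)
daily_returns = rng.normal(loc=-0.002, scale=0.01, size=500)

losing_tr = tail_ratio(daily_returns)
net_return = np.prod(1.0 + daily_returns) - 1.0  # compounded over all days

print(f"quoted tail ratio:    {quoted_ratio:.3f}")
print(f"synthetic tail ratio: {losing_tr:.2f}")
print(f"synthetic net return: {net_return:.1%}")
```

Under these assumptions the tail ratio should land well above 0.5 even though the compounded return is deeply negative, which is the same pattern as in the real test.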

Eugene

#6

6/6/2016 9:22 AM

Although it will not make its way to the Scorecard, look for Tail Ratio in Performance+ in the upcoming release of MS123 Visualizers.

Carova

#7

6/6/2016 4:22 PM

Hi Eugene!

QUOTE:
Vince, why do you think it'd be great? If you had tested them before then what were your findings? Don't you find that skewness and kurtosis are too specific to fit the paradigm of the MS123 Scorecard? At least I do. If I were to consider adding something I'd give it a go first in the visualizer to see if it proves useful and only then go along.

The "tails ratio" has a strong basis in both theory and practice. It is based on the observation that financial returns do not follow a Gaussian Distribution (with exponential tails) but are rather much closer to a Pareto or Cauchy Distribution (with power law tails). This is often referred to as "fat tails" in returns. Several academic and financial studies have shown that the vast majority of returns (both positive and negative) result from these tails. As a result, you would expect that any trading strategy that reduced either the positive or negative tail would significantly impact future results. I have used something similar to sort my strategies prior to extensive testing, and have found it of value.

QUOTE:
Although it will not make its way to the Scorecard, look for Tail Ratio in Performance+ in the upcoming release of MS123 Visualizers.

The value of this would be limited. This type of metric is MUCH MORE valuable during an optimization rather than ex post facto.

Vince

Eugene

#8

6/6/2016 4:46 PM

Thank you for the feedback Vince. Let's first evaluate the tail ratio numbers in backtest mode, and if they are reasonable we may consider going along.

Eugene

#9

6/30/2016 11:41 AM

For what it's worth, check out the tail ratio on the Performance+ tab in new version 2016.07.

sedelstein

#10

7/14/2016 12:20 PM

FWIW, and for those who are interested, here is another paper dated last month with some interesting metrics that I had not seen before.
It is not a request for support, unless perhaps the user community thinks it worthwhile.

Don't Stand So Close to Sharpe

http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2792592

The reference section has interesting papers as well.

Their FT ratio seems to have better OOS performance and a fairly low correlation (0.6ish) with the Sharpe Ratio.

Carova

#11

7/14/2016 2:13 PM

Interesting statistic (FT ratio) that I have never heard of prior to your post. It does make some sense, and it appears to be somewhat similar to the Tails Ratio in that it is examining the contributions of the positive and negative portions of the distribution (where most of the gains and losses are located).

I have been using the sum of the top 2.5% of trades (gains) over the bottom 2.5% of trades (losses) as a ranking metric for non-trend-following systems for the past few years and have found it to have significant value. Unfortunately, it seems to have little value for trend-following systems.
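
A rough Python sketch of that ranking metric, for anyone who wants to experiment. The function name, the per-trade P&L input, and the choice of "2.5% of the trade count, at least one trade" on each side are assumptions made for illustration, not an exact specification.

```python
import numpy as np


def extreme_trade_ratio(trade_pnl, fraction=0.025):
    """Sum of the best `fraction` of trades over the absolute sum of the worst.

    Assumption: the tails are picked by trade count, with at least one
    trade on each side.
    """
    pnl = np.sort(np.asarray(trade_pnl, dtype=float))
    k = max(1, int(round(len(pnl) * fraction)))
    top = pnl[-k:].sum()          # k largest trades
    bottom = abs(pnl[:k].sum())   # k smallest trades, as a positive number
    return top / bottom if bottom > 0 else float("inf")


# Hypothetical trade list: 200 trades with P&L from -100 to +99.
trades = np.arange(-100, 100)
print(extreme_trade_ratio(trades))  # 485 / 490 for this list
```

Higher is better here: a value above 1 means the best trades more than pay for the worst ones.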

Vince