mjj38
- ago
This is one of my main robustness tests, from which I derive a lot of information. Please let me know your thoughts.

6476-SSRN-id2423187-pdf
4 Replies

Glitch8
- ago
#1
Thanks for taking advantage of the new PDF posting feature!

I'll need some time to really read this, but a quick glance right now reminds me of what Volker always says in the Webinars about the importance of Parameter Stability Testing.

Maybe I can convince him to submit a paper so we can coin that acronym PST :)
mjj38
- ago
#2
Looking forward to your thoughts on it.
- ago
#3
I found this paper very inspiring.
First, what I like about this paper:
It has a very good introduction to and explanation of the topic of "overoptimization":

QUOTE:
In system optimization, regression toward the mean indicates that the specific combination of optimized parameter values which led to extreme performance in historical simulation will probabilistically not retain a level of extreme performance in the future.


It talks about the practical implications:
QUOTE:
Unfortunately, many traders are subsequently frustrated by poor realized trading system performance that does not live up to overly optimistic expectations. One large and prevalent source of overly optimistic expectations that remains largely misunderstood and underestimated is the data mining bias (DMB).


And discusses the reasons why these things happen:
QUOTE:
Many traders are familiar with the idea that future trading system performance is likely to be worse than was seen in historical simulation. However, the origins of this performance degradation are often not well understood. One significantly large cause is the DMB, also commonly known by other names such as curve-fitting, over-fitting, data snooping, or over-optimization. DMB is built-into the typical system development process and yet largely remains unknown, misunderstood, and/or ignored.
...
Unfortunately ignoring the problem doesn’t eliminate the consequence which is that the trading system fails to live-up to performance expectations in cross-validation or worse in live trading.


The paper has a good explanation of where the DMB comes from:
QUOTE:
To understand DMB, one must first recognize its two preconditions which are inherent to the system development process: 1) randomness and, 2) a multiple comparison procedure in the search for the best system rules. The interaction of randomness and the search process is unique to the system rules evaluated and the historical market data and results in inflated performance metrics.
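
The interaction of randomness and a multiple comparison is easy to demonstrate with a few lines of Python (my own toy example, not from the paper; all numbers are made up): every parameter combination below is pure noise with zero true edge, yet the "best" combination picked in-sample shows an inflated metric that disappears on fresh data.

CODE:
import numpy as np

rng = np.random.default_rng(42)

n_combinations = 500   # parameter combinations tried (the multiple comparisons)
n_trades = 250         # trades per historical simulation

# Every combination has zero true edge: per-trade returns are pure noise.
returns = rng.normal(loc=0.0, scale=0.01, size=(n_combinations, n_trades))

# In-sample performance metric per combination (mean trade return here).
in_sample = returns.mean(axis=1)
best = in_sample.argmax()

print(f"Average in-sample metric over all combinations: {in_sample.mean():+.5f}")
print(f"In-sample metric of the selected 'best' combination: {in_sample[best]:+.5f}")

# Fresh data for the selected combination: the apparent edge disappears.
out_of_sample = rng.normal(loc=0.0, scale=0.01, size=n_trades).mean()
print(f"Out-of-sample metric of that combination: {out_of_sample:+.5f}")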


It also enumerates various methods of DMB mitigation along with their pros and cons.

Then the paper introduces the SPP (System Parameter Permutation) method, which combines:
* time frame variations
* parameter variations
* randomly skipped trades (as a consequence of parameter variations)
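
In code, the core of the SPP sweep is just an exhaustive run over the predefined parameter ranges, collecting the chosen metric of every permutation. A minimal sketch, assuming hypothetical parameter ranges and a placeholder backtest function (if I read the paper correctly, the median of the resulting distribution serves as the long-run performance estimate):

CODE:
import itertools
import numpy as np

# Hypothetical parameter ranges, fixed ex ante as the paper requires.
param_ranges = {
    "fast_ma": [5, 10, 15, 20],
    "slow_ma": [50, 100, 150, 200],
    "stop_pct": [1.0, 2.0, 3.0],
}

def backtest_apr(params):
    # Placeholder for the real historical simulation; returns the APR of one run.
    # Here it is just a parameter-dependent random value so the sketch runs end to end.
    rng = np.random.default_rng(abs(hash(tuple(sorted(params.items())))) % 2**32)
    return rng.normal(loc=8.0, scale=4.0)

# Exhaustive permutation of all parameter combinations (the SPP sweep).
keys = list(param_ranges)
results = np.asarray([backtest_apr(dict(zip(keys, combo)))
                      for combo in itertools.product(*param_ranges.values())])

print("Number of permutations:", results.size)
print("SPP long-run estimate (median APR):", round(float(np.median(results)), 2))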

Good points:

* It uses an optimizer to perform parameter variations. This is a very clever idea.
* It uses the cumulative distribution function (CDF) to judge the robustness of performance metrics under parameter variations. This allows for very detailed analysis.
* It uses several overlapping time intervals for the "SPP for short-run time period" analysis.
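
The CDF evaluation itself is nothing more than the empirical distribution of the collected metric values, from which any percentile can be read off. A small sketch with made-up APR values:

CODE:
import numpy as np

def empirical_cdf(values, threshold):
    # Fraction of parameter permutations with a metric at or below the threshold.
    values = np.asarray(values)
    return float(np.mean(values <= threshold))

# Made-up APR values (in percent) collected from a parameter sweep.
results = np.array([2.1, 5.4, -1.3, 7.8, 3.2, 4.9, 0.4, 6.1])

print("P(APR <= 0):", empirical_cdf(results, 0.0))        # chance of a losing system
print("5th percentile APR:", np.percentile(results, 5))   # pessimistic estimate
print("Median APR:", np.percentile(results, 50))          # long-run estimate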

Bad Points:
* It uses predefined parameter ranges:
QUOTE:
the set of parameter ranges under which the trading system is expected to function is determined ex ante in preparation for optimization


This is highly subjective and makes the results hard to reproduce.

* It uses the "Exhaustive" Optimizer which uses "unlikely" or "bad" parameter combinations with the same weight as "useful" parameter combinations. I think this leads to overly pessimistic results.
- ago
#4
To overcome (what I think are) the weak points of the SPP method, I suggest these improvements:

Use a "targeted" optimizer like "Shrinking Window" or SMAC.
This will make the optimizer concentrate on "useful" parameter combinations.
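
To illustrate what "targeted" means, here is a rough sketch of the general shrink-the-search-window idea (my own simplification, not Wealth-Lab's actual Shrinking Window optimizer and not SMAC): sample candidates, then repeatedly narrow the range around the best result found so far, so most evaluations land in the "useful" region.

CODE:
import numpy as np

rng = np.random.default_rng(0)

def objective(x):
    # Placeholder for the backtest metric (e.g. APR) over a single parameter.
    # A smooth bump around x = 37 plus noise stands in for a real response surface.
    return -(x - 37.0) ** 2 / 100.0 + 10.0 + rng.normal(scale=0.5)

lo, hi = 1.0, 200.0                 # initial parameter range
best_x, best_val = None, -np.inf

for stage in range(6):              # each stage shrinks the window around the best point
    for x in rng.uniform(lo, hi, size=20):
        val = objective(x)
        if val > best_val:
            best_x, best_val = x, val
    width = (hi - lo) / 2           # halve the window, re-centered on the current best
    lo = max(1.0, best_x - width / 2)
    hi = min(200.0, best_x + width / 2)

print(f"best parameter ~ {best_x:.1f}, metric ~ {best_val:.2f}")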

Use only the top x% of optimizer results
This will still provide for "parameter variations", but within a limited part of the parameter space.
This should produce much more realistic results (i.e. future strategy results should be covered much better).

Judge Optimizer Results by a Second-Level Cross-Validation
If the targeted optimizer is simply run for a target metric like "APR", it will head into "overoptimized territory" immediately. Therefore it is useful to calculate an out-of-sample (OS) metric for every optimizer run and use these OS metrics for further analysis (see the top x% above).

This is a bit too complicated to explain in a few sentences. It is best to demonstrate the principle with a working new extension... ;)
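
Until such an extension exists, here is a rough sketch of one possible reading of the last two points (all function names, the IS/OOS split and the numbers are placeholders of mine): rank the optimizer runs by their in-sample metric, keep only the top x%, and judge that subset by the distribution of its out-of-sample metrics.

CODE:
import numpy as np

rng = np.random.default_rng(1)

def run_backtest(param, segment):
    # Placeholder: backtest one parameter value on an in-sample ("IS") or
    # out-of-sample ("OOS") data segment; real code would run the strategy.
    base = -(param - 37.0) ** 2 / 100.0 + 10.0
    noise = 2.0 if segment == "IS" else 3.0
    return base + rng.normal(scale=noise)

# 1) Targeted optimizer: here simply 200 candidate runs scored on in-sample data.
candidates = rng.uniform(1.0, 200.0, size=200)
is_metric = np.array([run_backtest(p, "IS") for p in candidates])

# 2) Second-level validation: an out-of-sample metric for every optimizer run.
oos_metric = np.array([run_backtest(p, "OOS") for p in candidates])

# 3) Keep only the top x% of runs (ranked in-sample) and analyze their OOS distribution.
x_pct = 10
cutoff = np.percentile(is_metric, 100 - x_pct)
top_oos = oos_metric[is_metric >= cutoff]

print("Median OOS metric of the top runs:", round(float(np.median(top_oos)), 2))
print("5th percentile OOS metric:", round(float(np.percentile(top_oos, 5)), 2))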
