Concept to improve Generalization during Optimization
Author: Carova
Creation Date: 6/19/2016 1:15 PM

Carova

#1
Hi All!

I am suggesting that we consider adopting a concept that is common in many branches of "automated learning", i.e. using a combination of a Training Set and a Test Set to provide a preliminary assessment of the generalization of an optimizing system. The Test Set might be a different time period of the current WL, a different WL during the same time frame, or a different WL during a different time frame. This would in no way guarantee that generalization would occur, but it would definitely flag instances where generalization did not occur.

Any thoughts?

Vince

Eugene

#2
Hi Vince,

I don't pretend to have understood what you said, but maybe you'd have better success through an example? If you could show us what you mean:

Creating Optimizers in Wealth-Lab Pro

Carova

#3
Hi Eugene!

The concept of using a Test Set is given here, here, and here. It is an "out of sample" test set that is evaluated simultaneously with the "in sample" training set, but is not used for any ranking during the optimization (training) phase.
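To make the idea concrete, here is a minimal sketch in Python rather than Wealth-Lab code. Everything in it is hypothetical: `optimize_with_holdout`, `toy_score`, and the price lists are illustration-only names and data, not part of any Wealth-Lab API. It only shows the bookkeeping: each candidate is scored on both sets in the same pass, but the ranking uses the training score alone.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple

Params = Tuple[int, ...]


def toy_score(params: Params, prices: Sequence[float]) -> float:
    """Hypothetical stand-in for a backtest: hold for one bar whenever
    the price is above its value `lookback` bars ago; return total profit."""
    (lookback,) = params
    profit = 0.0
    for i in range(lookback, len(prices) - 1):
        if prices[i] > prices[i - lookback]:
            profit += prices[i + 1] - prices[i]
    return profit


def optimize_with_holdout(
    candidates: Sequence[Params],
    score: Callable[[Params, Sequence[float]], float],
    train: Sequence[float],
    test: Sequence[float],
) -> List[Dict[str, Any]]:
    """Score every candidate on both sets, but rank by the training score only."""
    rows = [
        {"params": p, "train": score(p, train), "test": score(p, test)}
        for p in candidates
    ]
    # The test score is reported next to the training score, never used for ranking.
    rows.sort(key=lambda r: r["train"], reverse=True)
    return rows


# Made-up price series, for illustration only.
TRAIN = [10.0, 11.0, 12.0, 11.0, 13.0, 14.0, 13.0, 15.0, 16.0, 15.0]
TEST = [20.0, 19.0, 21.0, 20.0, 22.0, 21.0, 23.0, 22.0, 24.0, 23.0]

if __name__ == "__main__":
    for row in optimize_with_holdout([(1,), (2,), (3,)], toy_score, TRAIN, TEST):
        print(row)
```

A parameter set that ranks near the top on "train" but scores poorly on "test" is the case this check is meant to expose.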

Is this clearer?

Vince

Eugene

#4
Thanks Vince. While I don't think there's appetite for such major changes, where would this enhancement be destined for? Perhaps WFO?

Carova

#5
Hi Eugene!

This would be for the standard Optimization primarily, since WFO does provide some info already about generalization. Understand your thoughts, but would adding another WL evaluation be that difficult?

Vince

Eugene

#6
Your best bet might be to check out the API user guide above.

Carova

#7
Thanks Eugene! I would code it myself if I had the appropriate expertise, but that is not the case. Additionally, there are already PSO and GA Optimizers available that I might tackle if they were in C, but could not even dream of doing in C#. As I said, I still do not have that level of skill.

Vince

superticker

#8
QUOTE:
Carova: This would be for the standard Optimization primarily, since WFO does provide some info already about generalization. Understand your thoughts, but would adding another WL evaluation ...

I've considered trying to provide another method of optimization as well, but have run into numerous problems.

Truth be told, trading optimization is a non-linear, fuzzy-system programming problem. I say fuzzy system because you're trying to fit an underdetermined system. For example, it takes 3 unique, non-collinear points (and therefore 3 linearly independent equations) to define a plane for an explicit solution (or unique fit). If you have more than three points, then you have an overdetermined system; if you have fewer than three points, then you have an underdetermined system, which is a fuzzy system by definition.

So if we try to fit our fuzzy plane model with two points (which defines a line), we will get a solution vector space (infinite possible planes revolving around this line) rather than a unique solution.
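A small numeric sketch of the two-point case, in Python. The plane form z = a*x + b*y + c and the two sample points are my own choices for illustration. Both points lie on the line y = 0, so the coefficient b is left completely free: every value of b gives a plane that fits both points exactly.

```python
def plane_z(a: float, b: float, c: float, x: float, y: float) -> float:
    """Height of the plane z = a*x + b*y + c at the point (x, y)."""
    return a * x + b * y + c


# Two sample points (x, y, z); two points define a line, not a plane.
POINTS = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]


def max_residual(a: float, b: float, c: float) -> float:
    """Largest absolute fitting error of the plane over the sample points."""
    return max(abs(plane_z(a, b, c, x, y) - z) for x, y, z in POINTS)


# With a = 1 and c = 0, any b fits both points: a family of planes
# rotating about the line through the two points.
FAMILY = [(1.0, b, 0.0) for b in (-5.0, 0.0, 2.5, 100.0)]
RESIDUALS = [max_residual(*plane) for plane in FAMILY]

# Away from that line the planes disagree: at (x, y) = (0, 1) the height equals b.
OFF_LINE = [plane_z(a, b, c, 0.0, 1.0) for a, b, c in FAMILY]

if __name__ == "__main__":
    print("residuals on the two points:", RESIDUALS)
    print("heights at (0, 1):", OFF_LINE)
```

All four planes have zero error on the data yet give different answers everywhere off the line, which is the "solution vector space rather than a unique solution" described above.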

During Wealth-Lab optimization, we are also performing a fuzzy-model fit. We employ a strategy with, say, 5 parameters to fit. That means each individual stock needs at least 5 winning round-trip trades to fit the model precisely. But when I optimize over a two-year span, each stock in my WL dataset gets fewer than 5 winning trades. So there's no way to arrive at a precise fit theoretically, and that's what makes this system fuzzy.

So how can you make your fuzzy WL strategy less fuzzy? Well, you can start by minimizing the number of parameters it has. That can be done by employing self-adjusting adaptive indicators to some degree. You can also include more external information (i.e. market sentiment) in your strategy.

I've also looked at trying to optimize a metric other than Profit-per-Bar that would provide more opportunities for feedback other than trade simulation. But the price behavior is so stochastic even after decorrelating market behavior, that I've run into problems here as well. But I do think modeling some form of filtered, decorrelated price behavior has more feedback opportunity than relying on trade simulation alone.

LenMoz

#9
Regarding training and test sets: you can do that now, albeit with some effort. Divide your target symbols into two portfolios, say evens and odds. Optimize one, then test the other with the best optimization parameters. You're obviously looking for similar results; otherwise, you may be benefitting from overfitting or outliers.
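The split itself is easy to sketch outside Wealth-Lab. The Python below is only an illustration: `split_evens_odds`, `looks_similar`, the symbol list, and the 25% tolerance are all hypothetical choices, and the optimization and backtest steps still have to be run by hand in WL.

```python
from typing import List, Sequence, Tuple


def split_evens_odds(symbols: Sequence[str]) -> Tuple[List[str], List[str]]:
    """Sort the symbols, then send alternating entries to two portfolios."""
    ordered = sorted(symbols)
    return ordered[0::2], ordered[1::2]


def looks_similar(train_result: float, test_result: float, tolerance: float = 0.25) -> bool:
    """True if the test result is within `tolerance` (as a fraction) of the
    training result. The default tolerance is an arbitrary assumption."""
    return abs(train_result - test_result) <= tolerance * abs(train_result)


# Example symbol list, for illustration only.
SYMBOLS = ["AAPL", "MSFT", "IBM", "GE", "XOM", "KO", "T"]
EVENS, ODDS = split_evens_odds(SYMBOLS)

if __name__ == "__main__":
    print("optimize on:", EVENS)
    print("then test on:", ODDS)
```

Optimize on the first portfolio, run the best parameters on the second, and compare the two results with something like `looks_similar`.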
Len

Carova

#10
QUOTE:
superticker

As you point out, we rarely have enough trades to justify the number of adjustable parameters that we need to get reasonable generalization. In other Machine Learning contexts I have used Principal Component Analysis (a purely linear process) to extract the eigenvectors having the greatest contribution to precondition inputs into a Neural Net. While that type of approach helps control the vector space somewhat, it comes at the expense of eliminating all the non-linear information.

However, we do not even have this luxury in a rule-based system. As you point out, we need to use external information to help improve the situation, but I would also argue that your concept of adaptive "indicators" is even more important. It is only with these that we can capture some of the dynamic non-linearity of the Market.

QUOTE:
LenMoz

Yes, Len, that is a possible work-around (one that I have been using for the last couple of years), but it is highly labor-intensive (which is the reason we have computers! ;) ), kludgy, and SLOW! This was the reason that I initially brought up the suggestion.

Vince

superticker

#11
QUOTE:
we rarely have enough trades to justify the number of adjustable parameters that we need to get reasonable generalization.
This is what makes this a fuzzy system and is the most challenging aspect of this problem.

QUOTE:
In other Machine Learning contexts I have used Principal Component Analysis (a purely linear process) to extract the eigenvectors having the greatest contribution to precondition inputs into a Neural Net.
Yes, Principal Component Analysis, or even Factor Analysis, are excellent ways to extract relevant eigenvectors for solving a multidimensional problem. But this problem will still remain fuzzy (or underdetermined) because you can't find enough eigenvectors to uniquely "solve for" (or characterize) all your model parameters.

QUOTE:
While that type of approach helps control the vector space somewhat, it comes at the expense of eliminating all the non-linear information.
Well, slow down a moment. The power of a neural network solution is to solve a nonlinear problem, not a linear one. You're right; if you use eigenvectors (or linear methods) to set up your neural network, then it will be crippled by the linear (eigenvector) front-end solution. But in practice, you wouldn't do it that way.

Since the neural net can solve nonlinear problems directly, one would skip the eigenvector solution altogether and feed the raw nonlinear data into the neural network directly. This way you'll realize a nonlinear solution rather than a linear one. However, the eigenvectors may reveal which parts of the raw data are orthogonal and which are redundant. That's essential in setting up your neural network problem because you must avoid providing any redundant raw data into the neural network; otherwise, you'll have a grossly unstable neural net solution.
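As a rough sketch of that redundancy check, consider just two candidate inputs. Their correlation matrix is [[1, r], [r, 1]], whose eigenvalues are 1 + r and 1 - r, so the ratio of the larger to the smaller works as a condition number: it blows up as the inputs become redundant. The Python below is illustration only; the function names and data series are hypothetical and have nothing to do with WL's neural network facility.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def condition_number(r: float) -> float:
    """Ratio of the eigenvalues (1 + |r|) and (1 - |r|) of [[1, r], [r, 1]]."""
    r = max(-1.0, min(1.0, r))  # guard against floating-point overshoot
    smallest = 1.0 - abs(r)
    return math.inf if smallest == 0.0 else (1.0 + abs(r)) / smallest


# Made-up input series, for illustration only.
CLOSE = [10.0, 11.0, 12.0, 11.0, 13.0, 14.0]
RESCALED = [2.0 * c + 1.0 for c in CLOSE]  # a linear copy of CLOSE: fully redundant
OTHER = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]     # a different series, only partly correlated

R_REDUNDANT = pearson(CLOSE, RESCALED)
R_OTHER = pearson(CLOSE, OTHER)

if __name__ == "__main__":
    print("redundant pair:", R_REDUNDANT, condition_number(R_REDUNDANT))
    print("other pair:", R_OTHER, condition_number(R_OTHER))
```

With more than two inputs the same idea needs a full eigenvalue decomposition of the correlation matrix, which this sketch does not attempt.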

Disclaimer: I know nothing about WL's neural network solving facility. I'm just talking theory here, not WL implementation. Also, I have never attempted solving for eigenvectors for steering a stock trading strategy--no experience here either. And solving for eigenvectors on most fuzzy systems may be extremely difficult if not impossible. If you've actually done this, I would like to talk to you more about your experience. For example, what was the final condition number of your system matrix in your solution? What external inputs were most relevant for solving this problem?

QUOTE:
... we need to use external information to help improve the situation, but I would also argue that your concept of adaptive "indicators" is even more important.
I would say we need all the help we can get--use all methods--to make these systems less fuzzy.