Neuro-Lab Performance Evaluation Script
Author: LenMoz
Creation Date: 3/3/2014 11:59 AM

LenMoz

#1
Background
Here's a script you can use (with modifications) to analyse the performance of your neural network. The output format is unabashedly copied from Neuro-Lab, with some improvements. It looks like this.
CODE:
Please log in to see this code.

Why use this script?
1. Evaluate at any time, during or after training the NN. Neuro-Lab's equivalent "Evaluate Performance" tab is unavailable after training. I find this script useful to check an NN's efficacy, even months after it was trained.
2. The script provides additional statistics: correlation and Least Squares Alpha and Beta. Since it's a script, you are able to add other statistics to meet your needs. This script provides a framework.
3. Greater control of input date ranges. The script uses the same "Data Range" as a strategy script. How is this useful? You might want to select a year at a time and compare correlation, year to year, to evaluate robustness.
4. Save the evaluation results by copying the Debug Log into a document.
5. Compare using different portfolios. I typically run Neuro-Lab against the portfolio I propose to write a strategy for. Using this tool, I've found that a given NN sometimes correlates to other portfolios, too.
6. Compare NNs. Multiple neural networks can be tested with this script, based on a parameter. So, I can set up a date range and portfolio, then quickly compare the NNs. The code included below is customized for three neural networks, "Broad Markets" which comes with Neuro-Lab, plus two of my own.
7. You can create a .CSV from the Debug Log to do additional analysis. I do an XY graph of the predicted and actual columns, typically.
What does "(with modifications)" mean?
Nothing worthwhile comes for free. Obviously, this script needs to know about your neural network. You must hand tailor the script for this. Update variable "availNN" with your NN name(s). You'll need to add your network's output script and the MinOutput and MaxOutput values from your network's XML. I've left the customizations I use for my networks as prototypes.
Caveats
1. The script only runs as a multi-symbol backtest (MSB), because I chose not to implement the convoluted reflection code for detecting MSB/SSB. Workaround: use a portfolio containing only one symbol.
2. Even if you want to try this routine with "Broad Markets", you'll need to do some prep. First you'll need to train "Broad Markets" and find its XML. Then you'll need to update routine "SetupBroadMarkets" with your trained values of minOutput and maxOutput. The values in the script are based on the portfolio and date range I used when I trained "Broad Markets".

Any feedback would be appreciated.
CODE:
Please log in to see this code.

ronc

#2
Len - I hope that late feedback is better than none/never.

Great script, but something I have never understood from the WL documentation is how to interpret the output data. I understand that the first 3 columns are a histogram showing the frequency counts for NN outputs in certain bins. After that I am lost.

Predicted and Actual Outputs are outputs from the NN and actual data, but are these the median or mean values for each histogram bin? In either case, why are these important?

What is the meaning of the graphic? The WL User Guide says it is "how much higher or lower the average output of this row is compared to the Average Output of all Observations." That seems to be a description of the variance of the observation data rather than how closely the model fits the data. I would think that we need something like MSE or similar metric to see how well the model fits the data.

Any help would be appreciated.

LenMoz

#3
Ron,
QUOTE:
Predicted and Actual Outputs are outputs from the NN and actual data, but are these the median or mean values each histogram bin? In either case why are these important?
The mean. Why...? For a predictive NN, predicted and actual should move together, showing that the NN successfully predicts the actual.

QUOTE:
What is the meaning of the graphic?
The graphic provides a quick visual of the above. In the output, bars are sorted from smallest to largest NN score. The graphic is the average actual for that bucket. If the actuals don't follow the NN, i.e. the bars don't show a clear smaller to larger pattern, the NN is not predictive for that portfolio/timeframe.
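To make the bucketing concrete, here is a minimal sketch of the per-bin averaging described above. It's in Python rather than the WealthScript from post #1 (which is not shown here), and the bin width of 4 is an assumption chosen to mirror the 4-8, 8-12, ... buckets in the sample output:

```python
# Illustrative sketch (Python, not the original WealthScript):
# group observations by NN-score bin and average predicted vs. actual.

def bin_report(scores, actuals, bin_width=4):
    """Return {bin_start: (count, mean_score, mean_actual)}, one entry per bucket."""
    bins = {}
    for score, actual in zip(scores, actuals):
        start = int(score // bin_width) * bin_width  # left edge of this score's bucket
        bins.setdefault(start, []).append((score, actual))
    report = {}
    for start in sorted(bins):  # buckets sorted smallest to largest NN score
        rows = bins[start]
        n = len(rows)
        mean_score = sum(s for s, _ in rows) / n
        mean_actual = sum(a for _, a in rows) / n
        report[start] = (n, mean_score, mean_actual)
    return report
```

For a predictive NN, the mean_actual values should rise as the bucket's score range rises; that smaller-to-larger pattern is what the bar graphic makes visible at a glance.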

QUOTE:
I would think that we need something like MSE or similar metric to see how well the model fits the data.
I use the statistics at the top (Mean Error, Correlation, etc.) to compare NN performance, especially Correlation.
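For reference, the summary statistics named above can be computed with the standard formulas below. This is a hedged Python sketch, not the hidden script's code: correlation is Pearson's r, and Alpha/Beta are the intercept and slope of the least-squares line actual = alpha + beta * predicted.

```python
# Standard Pearson correlation and least-squares fit of actual on predicted.
# The original script's exact implementation is not shown, so this is a sketch.

def summary_stats(predicted, actual):
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a)
              for p, a in zip(predicted, actual)) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    var_a = sum((a - mean_a) ** 2 for a in actual) / n
    beta = cov / var_p                    # slope: change in actual per unit predicted
    alpha = mean_a - beta * mean_p        # intercept
    corr = cov / (var_p ** 0.5 * var_a ** 0.5)
    return corr, alpha, beta
```

A correlation near zero says the NN scores carry no linear information about the actuals for that portfolio/timeframe, which is why it is the first number to compare across runs.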


ronc

#4
Thanks Len, but unfortunately I am still not understanding. Let's use the first output row in your original post as an example. If I understand it correctly, it says that there were 5 occurrences of NN output values between 4 and 8. Then, I am guessing that the Predicted column means that all 5 of those cases produced a NN prediction value of -14.81. But I do not understand how all 5 occurrences would produce exactly the same value. Hence my guess that -14.81 might be the mean of all NN predictions in that 4-8 bin. If the Predicted column is not the mean of the values in a bin then what is it? Perhaps it is the NN prediction at one edge of the bin? Similarly, exactly what does the "Actual" column mean?
Thanks,
Ron

LenMoz

#5
QUOTE:
my guess that -14.81 might be the mean of all NN predictions in that 4-8 bin
That's correct. The "Predicted" column is indeed the mean of the 5 predicted values that fell in that bucket (they were not all -14.81), and the "Actual" is the mean of the associated actual values. I copied NeuroLab's format.

The value I see in the script is that I can run it after NeuroLab has locked down the scripts, against other date ranges and portfolios. I use it to periodically check if the NN remains predictive months or even years after it was trained, a good test of robustness.


ronc

#6
OK, got it. My next question is, how do the NN outputs map to "real world" outputs? Your code looks like it is doing a linear transform from NN to "real" based on the min and max values observed in NN and "real"?

Also, does your graphic column represent something like (Actual - Predicted) / Actual for each bin?

In the WL output there is a "% of Average" column, described in the User Guide as "This column displays how much higher or lower the average output of this row is compared to the Average Output of all Observations." If I understand that correctly, WL is showing how far each actual bin differs from the average of all actual data, while you are showing how far each actual bin differs from each predicted bin. I can see the value in the latter, as a measure of model fit, but I am not clear why the former is useful.

LenMoz

#7
QUOTE:
... how do the NN outputs map to "real world" outputs?
Above (post #1) I said, "You'll need to add your network's output script and the MinOutput and MaxOutput values from your network's XML." The mapping is the reason. A neural network score of zero linearly maps to MinOutput and 100 maps to MaxOutput.

Here's the code that does the transformation, copied from the script in post #1.
CODE:
Please log in to see this code.
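The mapping itself is a simple linear interpolation: a score of 0 maps to MinOutput, 100 to MaxOutput, and everything between scales proportionally. A sketch of the idea, in Python rather than the WealthScript above (min_output/max_output stand in for the MinOutput/MaxOutput values taken from your network's XML):

```python
# Sketch of the linear score-to-output mapping described in posts #1 and #7.
# min_output and max_output come from the trained network's XML.

def map_score(score, min_output, max_output):
    """Linearly map an NN score in [0, 100] to [min_output, max_output]."""
    return min_output + (score / 100.0) * (max_output - min_output)
```

So with MinOutput = -20 and MaxOutput = 30, a score of 50 maps to the midpoint, 5.0.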