NeuroLab MessageBox from "Evaluate Performance" - "The given key was not present in the dictionary"
Author: LenMoz
Creation Date: 4/23/2013 10:28 AM

LenMoz

#1
This error, "The given key was not present in the dictionary", appears in a MessageBox when I try to "Evaluate Performance" with the "Use Out-Sample Range" radio button selected. "OK" is the only option. "In-Sample" on the same portfolio works fine.

Can you give me any ideas on what causes this error?

I am using a custom "Large Tech" portfolio. I have used this portfolio previously to train other networks with no errors. Symbols are A AAPL ABB ADBE ADI ALTR AMAT AME APH ARMH ASML ATVI BIDU BRCM CA CAJ CERN CHKP CRM CSCO CTRX CTSH CTXS DELL DHR EMC EMR ERIC ETN FISV GLW GOOG HON HPQ HTHIY IBM INTC INTU JNPR KYO LNKD MSFT MSI NOK NTAP ORCL QCOM RHT ROK ROP SAP SI SNDK STX SYMC TDC TEL TSM TXN WDC WIT.

Eugene

#2
Unfortunately, it's not possible to tell right away what went wrong from a MessageBox alone. Because the logic behind it is quite complex, the developer first needs to be able to reproduce the error.

Is it happening with all your systems or one in particular? (Obviously, it would be best if we could reproduce it using a canned/downloadable strategy or a simple script.) With all portfolios or just this one? With all data ranges or only some? With Fidelity data only or with other providers too? What happens if you change other NL settings? And so on.

The more observations, the better the chances to track the problem down and help you.

LenMoz

#3
I had hoped a quick scan of the code might find the conditions leading to that MessageBox. If many scenarios can cause it, then that would not be much help. It does occur instantly when the Evaluate button is clicked.

I think I found it. On the "Select Training Data" tab, the "Lead Bars" value was set too low. I had set it based on the Input Script, forgetting that the Output Script needed additional leading bars.
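
A minimal sketch of that check, assuming each script needs its own warm-up period and that "Lead Bars" has to cover the largest one. The lookback figures and the `lead_bars_needed` helper are hypothetical, not NeuroLab settings or API:

```python
def lead_bars_needed(script_lookbacks):
    """Return the smallest Lead Bars value that covers every script.

    script_lookbacks maps a script name to the number of leading bars
    its indicators need before they produce valid values (assumed).
    """
    return max(script_lookbacks.values())


# Hypothetical warm-up requirements, for illustration only.
lookbacks = {"input_script": 50, "output_script": 80}

lead_bars = lead_bars_needed(lookbacks)

# Sizing the setting from the Input Script alone would fall short here.
too_low = lookbacks["input_script"] < lead_bars
```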

This question is about NeuroLab. Is there a way to choose a data date range for training (besides inside the scripts)?

It would be useful if that MessageBox included the name of the "given key." That may have simplified my diagnosis.

Eugene

#4
QUOTE:
I think I found it. On the "Select Training Data" tab, the "Lead Bars" value was set too low. I had set it based on the Input Script, forgetting that the Output Script needed additional leading bars.


Good to know that.

QUOTE:
It would be useful if that MessageBox included the name of the "given key." That may have simplified my diagnosis.


The thing is, a KeyNotFoundException is not the only exception that may arise in this routine. For whatever reason, the developer decided not to scare users away here by showing the standard exception dialog.

LenMoz

#5
I'm a big guy. I wouldn't be scared. ;-)

A message should at least be useful. This one may as well have said the ubiquitous, "An error has occurred", from my perspective.

It is still strange to me that the message does not occur on the In-Sample data, which should have the same issue. Why would that be?

Again,
QUOTE:
This question is about NeuroLab. Is there a way to choose a data date range for training (besides inside the scripts)?


LenMoz

#6
In the User Guide Help file (NeuroLab/Select Training Data), I find:

QUOTE:
Important!

The Training Period is [currently] bar-based, not date-based. This is an important distinction if you select a DataSet that contains symbols with varying date ranges. For simplicity, assume that your DataSet has two symbols, YHOO and IBM. Selecting 50% of data for training means that you are selecting 50% of the available data in both sets. Depending on the data provider, YHOO data could start in 1996, but in 1962 (or earlier) for IBM. Assuming the year is 2010, the in-sample data for IBM would include data from 1962 to 1986, but from 1996 to 2003 for YHOO. In other words, you would be "accidentally" training the network with out-of-sample data with respect to IBM.

To avoid this, we recommend creating DataSets of symbols that are synchronized by Starting Date for the purpose of training, if necessary.


How do I create a DataSet that is synchronized by Starting Date?
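
The arithmetic in the quoted passage can be checked with a short Python sketch. It assumes bars are spread evenly over the years, which is a simplification, and `in_sample_end_year` is a hypothetical helper, not a NeuroLab function:

```python
def in_sample_end_year(start_year, end_year, train_fraction):
    """Approximate the last in-sample year for a bar-based split.

    Assumes bars are evenly distributed between start_year and end_year.
    """
    return start_year + int((end_year - start_year) * train_fraction)


# Start years and the 50% split are taken from the User Guide example.
ibm_end = in_sample_end_year(1962, 2010, 0.5)
yhoo_end = in_sample_end_year(1996, 2010, 0.5)

# Years that are in-sample for YHOO but already out-of-sample for IBM.
overlap = (max(ibm_end, 1996), yhoo_end)
```

With the User Guide's figures this gives 1986 for IBM and 2003 for YHOO, so 1996 to 2003 is trained on for YHOO while it counts as out-of-sample for IBM.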

Eugene

#7
QUOTE:
To avoid this, we recommend creating DataSets of symbols that are synchronized by Starting Date for the purpose of training, if necessary.


I think the User Guide is pretty straightforward here: it is telling you not to put stocks like LNKD together with almost anything else in your list. The Yahoo/MSN providers allow you to specify the starting date for a DataSet. While that has proved to be suboptimal in practice, here (and only here) it sounds like a useful feature.
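
One way to picture "synchronized by Starting Date" is to keep only the symbols whose history already exists on a chosen common start date. The sketch below is plain Python, not Wealth-Lab code; `synchronize_by_start` and the first-bar dates are illustrative assumptions rather than provider data:

```python
from datetime import date


def synchronize_by_start(first_bar_dates, common_start):
    """Return the symbols that already have data on common_start, sorted."""
    return sorted(
        symbol
        for symbol, first_bar in first_bar_dates.items()
        if first_bar <= common_start
    )


# Illustrative first-bar dates; check the actual data provider before use.
first_bar_dates = {
    "IBM": date(1962, 1, 2),
    "MSFT": date(1986, 3, 13),
    "LNKD": date(2011, 5, 19),
}

kept = synchronize_by_start(first_bar_dates, date(2000, 1, 3))
dropped = sorted(set(first_bar_dates) - set(kept))
```

Here a late starter such as LNKD is dropped, and the remaining symbols could then be loaded from the common start date so that a bar-based percentage covers roughly the same calendar period for each.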