Clean up DataSets from invalid symbols?
Author: Sabawi
Creation Date: 2/15/2011 7:40 PM

Sabawi

#1
Does anyone know how I can clean up my DataSets by removing bad data, such as invalid symbols and their empty records?

Thanks!!!

Eugene

#2
What is an "invalid symbol"? What do you mean by its "empty record"?

The request isn't clear enough; please clarify.

Sabawi

#3
Hi, these are stock symbols in my DataSets that are no longer listed and are not valid anymore. The DataSets I built have a few thousand stocks in them. So instead of letting the code go through these bad stocks, I want to purge them from the DataSets. I don't want to do this manually. I was wondering if there is a quick way for WL to clean up the DataSets.

Eugene

#4
Delisted symbols are quarantined automatically by Fidelity symbol management. Check this out:

Data Manager > Symbol Management

Since these quarantined symbols should return 0 as their Bars.Count (Robert, please correct me if I'm wrong), you can scan through your DataSets to find them and then weed them out.
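
To make the idea of that scan concrete, here is a rough sketch in plain Python. It is not WealthScript and does not use any Wealth-Lab API. It assumes you have already collected the bar count for each symbol into a dictionary by some means; the function name `split_empty_symbols` and the tickers are hypothetical, made up for illustration only.

```python
from typing import Dict, List, Tuple


def split_empty_symbols(bar_counts: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Split symbols into (keep, purge) based on how many bars each one has.

    A symbol with zero bars is treated as a candidate for removal,
    following the assumption that quarantined symbols report 0 bars.
    """
    keep = [symbol for symbol, count in bar_counts.items() if count > 0]
    purge = [symbol for symbol, count in bar_counts.items() if count <= 0]
    return keep, purge


# Hypothetical bar counts; these tickers are placeholders, not real data.
sample_counts = {"AAA": 2500, "BBB": 0, "CCC": 1200, "DDD": 0}

keep, purge = split_empty_symbols(sample_counts)
print("keep:", keep)
print("purge:", purge)
```

It would be worth reviewing the purge list before deleting anything, since a symbol could also show 0 bars simply because its data has not been downloaded yet.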

gbullr

#5
I have been having a related problem. I understand that some symbols no longer trade, but is there a way to include their data in a backtest? Otherwise there is a massive "survivorship" bias in any presently valid DataSet, and consequently in every long strategy. To illustrate, suppose I am trying to create a shorting strategy that would have made money in the .com era. The problem I have encountered is that my DataSet contains none of the dot.bombs, because none of them survived. That is a huge problem for backtests of short strategies, and of long ones too, because those stocks are simply not included. Solutions?

Thanks.

Cone

#6
Get a data subscription at http://premiumdata.net/. I think you can get data for all stocks traded back to about 1950, and it's not expensive.

Also, don't think for a minute that "survivorship bias" works only one way. Consider all the M&A activity that's been occurring over the last 6 months. Stocks get "taken out" at premiums too.

gbullr

#7
Thank you very much. I completely agree w/ you. Survivorship bias works both ways. I just thought it was easier to explain the short side.