Fidelity Bad Data
Author: torpedo333
Creation Date: 5/11/2012 1:42 PM

torpedo333

#1
I'm just getting started with Wealth-Lab and have been running some simple backtests on S&P 500 Daily data and S&P 500 30-minute data. After getting some strange results, I checked the historical trade log, which indicated that the system made $19 million on CSCO. So I opened the CSCO chart to look at all of the signals generated by the system. If you look at 2/28/2007 on 30-minute data, you'll find the High for the day is $100,000. NOT A TYPO!

I called Fidelity and they had the exact same error in their system. Five years later and still no correction. There are many, many other bad data ticks, not just on CSCO but on many stocks in the Fidelity database. Please don't tell me to use the "Edit Bar Data" feature. There are just way too many stocks for that. The Fidelity S&P 500 Daily data has 20 years of history. That's 2.5 million bars for 500 symbols over 20 years. For the 30-minute data it's approximately 9.7 million bars for 500 symbols over the 6 years of intraday data they give you. No one has time to correct all of these errors by hand.
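As a rough check on those bar counts, here is a back-of-the-envelope calculation in Python. It assumes about 250 trading days per year and 13 thirty-minute bars per regular 6.5-hour session; neither figure is stated above, and the variable names are just for illustration.

```python
# Rough bar-count estimate for the S&P 500 DataSets described above.
SYMBOLS = 500
TRADING_DAYS_PER_YEAR = 250       # assumption: roughly 250 sessions a year
BARS_PER_SESSION_30MIN = 13       # assumption: 6.5-hour session / 30 minutes

daily_bars = SYMBOLS * TRADING_DAYS_PER_YEAR * 20
intraday_bars = SYMBOLS * TRADING_DAYS_PER_YEAR * 6 * BARS_PER_SESSION_30MIN

print(f"Daily bars over 20 years:    {daily_bars:,}")
print(f"30-minute bars over 6 years: {intraday_bars:,}")
```

Under those assumptions the totals come out to 2,500,000 and 9,750,000 bars, in line with the figures quoted.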

So the point of this post is to find out what other Wealth-Lab users do about this issue. Which data providers do people use for their backtesting? Yahoo, CSI, or other? I mean, if the database is bad, the entire process is useless. So how do I get good-quality Daily and 30-minute data?

Thanks in advance.


Eugene

#2
In addition, Cone has found an issue with the Fidelity static intraday data: right now, the history retrieved for 3- and 5-minute intervals goes back only as far as 11 Jan 2010 for all symbols starting with M-Z.

Cone

#3
I'm not really sure it's all symbols since I can't test them all, but the symptoms point to that.

Anyway, the good news is that Fidelity data has come a long way since integrating new providers with the release of WLP 6.0 a couple of years ago. The bad news is that spikes in historical data, especially before 2010, are going to remain a problem for some time. Structural changes on the back end are required to apply corrections that don't automatically come in from their providers. The fact is that all those "spikes" are real prints reported by the exchanges... the problem for the data vendor boils down to weeding through literally thousands of codes associated with those prints and creating filters for most of them. Long story short, fixes are in the works, but will take some time.

karla2010

#4
CODE:
[Not shown: the code block in the original post is visible only to logged-in forum users.]

torpedo, this is not anything fancy but works ok for me.

I put it up front in my code before any processing. All it does is take the High-Low range and find its median value over a moving window of 200 bars. It then divides each bar's High-Low by that median, and if the ratio is greater than "ef" (which I set to 20 just by inspection), it substitutes a synthetic value for the spike bar.
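For readers who can't see the code block, here is a minimal Python sketch of the filter as described. It is not the original WealthScript. Only the 200-bar window and ef = 20 come from the post. The names `filter_spikes`, `window`, `ef`, and `min_history` are made up for illustration, bars are assumed to be plain dicts with open/high/low/close keys, and the synthetic replacement (clamping High and Low to the open/close body) is just one possible choice, since the post doesn't say which synthetic value is used.

```python
from statistics import median

def filter_spikes(bars, window=200, ef=20.0, min_history=20):
    """Return a copy of `bars` with suspicious High/Low spikes replaced.

    A bar is flagged when its High-Low range exceeds `ef` times the median
    range of the previous `window` (already cleaned) bars.
    """
    cleaned = []
    ranges = []  # High-Low ranges of the cleaned bars seen so far
    for bar in bars:
        o, h, l, c = bar["open"], bar["high"], bar["low"], bar["close"]
        recent = ranges[-window:]
        # Skip filtering until there is enough history (hypothetical warm-up rule).
        if len(recent) >= min_history:
            typical = median(recent)
            if typical > 0 and (h - l) / typical > ef:
                # Assumption: synthetic bar clamps High/Low to the open/close body.
                h, l = max(o, c), min(o, c)
        cleaned.append({**bar, "high": h, "low": l})
        ranges.append(h - l)
    return cleaned
```

Note that this only repairs spikes in the High or Low. A bad print that lands in the Open or Close would need a different substitute, for example values carried over from the previous bar.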

Fool around with it, maybe you can improve it. Good luck.

torpedo333

#5
Thanks everyone for your responses.

Has anyone tried using ASCII or Yahoo data? Or any other providers that have more accurate data?


Eugene

#6
Asking about "using ASCII data" is like asking "has anyone tried using trading software?". There are countless vendors of ASCII data, and any platform can export to CSV/ASCII. It's a file format, not a data vendor. Finally, it comes with limitations, such as the inability to use the Edit Bar Data feature.

Yahoo data is limited to EOD only, but you can try out IQFeed's intraday data (free trial available).

Cone

#7
For clean, filtered EOD data for Large Cap stocks (S&P 500 and Nasdaq 100), use Wealth-Data!
See what it's all about at https://www.wealth-data.com

The Wealth-Data Provider is now included with Wealth-Lab Developer installations, but Fidelity Wealth-Lab Pro users can install it from Extensions.