Data Manager - Update Pricing and the number of B
Author: ivraju
Creation Date: 7/2/2009 1:31 PM

ivraju

#1
I am trying to update 5-minute data for a couple of hundred securities. The 5-minute data for each security has more than 100,000 bars in its .WL file, i.e. more than 4 years' worth of data. Since the files are big, it's taking a long time to update each file. Is there any way I can reduce the number of bars in each file to about 30,000?

I also update 15-minute data for the same securities. Each security has about 33,000 bars, and they get updated very quickly.

Cone

#2
No. It's only a one-time download though, so you won't have to do it again. Updates will be much quicker.

ivraju

#3
Thanks Cone,

To be more specific, I update the securities of the S&P 500 index.

The data set with the 5-minute interval takes about 4 minutes to update all 500 securities, whereas the 15-minute data set takes a little more than a minute, every time I update.

I am trying to get the 5-minute data set to update in a little more than a minute instead of 4 minutes.

The only difference between the 2 data sets, besides the time interval, is the physical file size for each security. The 5-minute files are 3 times bigger than the 15-minute files, as they are supposed to be, but I think their size is why the updates take longer. Am I missing anything here?

The only other thing to consider is whether the Fidelity server takes more time to return 5-minute requests than 15-minute requests (I don't think that is the case here).

I don't look back more than 1 year with 5-minute data, so I don't really need 4 years of data in those files.

Eugene

#4
QUOTE:
I am trying to get the 5-minute data set to update in a little more than a minute instead of 4 minutes.

Your original request has nothing to do with the file size. Once the "body" of the file (i.e. those 100,000 bars) is downloaded and saved to disk, as Robert says above, that data is not refreshed any more (unless the user deletes the file by some means), and your update should take much less time.

Updates cannot be made quicker than they are, assuming your network connection is all right. You could try a faster connection, or a compressing proxy (if available, though it increases ping times).

ivraju

#5
Thanks.

The file size is directly proportional to the number of bars in the file. 15-minute data files are about 1 MB (33,000 bars) and 5-minute data files are about 4 MB (100,000 bars).

I am not talking about first-time updates, which take much longer. I am talking about subsequent updates (daily, every 15 minutes, or whenever I click the "Update Dataset(Pricing)" button in Data Manager).

The 5-minute dataset with 500 securities takes about 4 minutes to update (about 100,000 bars per security).
The 15-minute dataset with 500 securities takes about 1 minute 30 seconds to update (about 33,000 bars per security).
The daily dataset with 500 securities takes about 25 to 35 seconds to update (about 3,700 bars per security).

All 3 datasets above use the same computer, the same connection speed and the same number of securities, so I don't think it has anything to do with connection speed.

I believe that WL scans all the bars in the file every time it updates a security, so scanning 100,000 bars (5-minute dataset) takes more time than scanning 33,000 bars (15-minute dataset), and that is why the 5-minute dataset takes longer to update than the 15-minute dataset. So if I can bring the number of bars in the 5-minute data files down to around 30,000, they will update as fast as the 15-minute dataset.

Eugene

#6
QUOTE:
So if I can bring the number of bars in the 5-minute data files down to around 30,000, they will update as fast as the 15-minute dataset.

Highly unlikely. The time it takes to process a binary file, be it a 1.5 Megabyte 15-minute .WL file or a 4.0 Megabyte 5-minute file, is equally low. A modern computer with a SATA II drive reads tens of megabytes per second, an order of magnitude (or two) more than the size of these files, so for it the two are the same thing.
QUOTE:
5 minutes dataset ... 4 minutes to update.
15 minutes dataset ... 1 minute 30 seconds to update.

Your connection speed, as you said, stays the same. To me, the dependence looks pretty linear.

You're looking in the wrong direction.

ivraju

#7
Thanks Eugene.

"Highly unlikely. The time it takes to process a binary file, be it a 1.5 Megabyte 15-minute .WL file or a 4.0 Megabyte 5-minute file, is equally low"

When you do a few files, you may not notice the difference, as you said, but when you do 500 files, it adds up. Now we are talking about accessing 750 MB vs. 2 GB every time we make an update.

Could you do me a favor?

Could you create 3 datasets with the S&P 500 securities: 1 for 5 minutes, 1 for 15 minutes and 1 for daily?

Ignore the first-time update times.

On a trading day, do an on-demand update for those 3 datasets one after another and see if the update takes the same time (it doesn't have to be exactly the same!) for all 3 datasets.

The results I see are consistent across several updates during trading hours. If the connection were contributing to the slowness, I would expect the 15-minute dataset to sometimes take 4 minutes. That never happens. Whenever I update the daily dataset, it just flies...

Please post your results here.

Wishing you a happy July 4th.

StreetSmart

#8
Hope you all don't mind my piggybacking my question in this thread (it's relevant to the topic).

Currently all my data sets/downloads are scaled for daily pricing (close). Is there a way to change the scale (to 5 min or 15 min) without rebuilding all the data sets as new sets?

Thanks,

StreetSmart

#9
An additional question on data sets:

When I click the data set to get a list of the symbols, I get them all in the right-side window and can sort them in detail. However, right-clicking on those symbol(s) does not allow deleting them from the data set.

Is there some way of deleting individual symbols in the set through the data manager?

Thanks,

ivraju

#10
You can only compress a lower scale into a higher scale.

I.e. if you have a 5-minute dataset, you can change the scale to 10, 15, 30 or 60 minutes without downloading any data from the server.
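
To illustrate why this only works in one direction, here is a minimal Python sketch of compressing 5-minute bars into 15-minute bars. It is not Wealth-Lab's code: the function name `compress_bars`, the tuple layout and the alignment of intervals to midnight are all assumptions made for the example. The reverse is impossible because the finer bars cannot be recovered from the coarser ones.

```python
def compress_bars(bars, source_minutes, target_minutes):
    """Aggregate intraday bars into a coarser interval.

    bars: list of (start_minute_of_day, open, high, low, close, volume),
    sorted by time. This layout is hypothetical, chosen for illustration.
    """
    if target_minutes % source_minutes != 0:
        raise ValueError("target must be a whole multiple of the source interval")
    buckets = {}  # dicts keep insertion order, so output stays time-sorted
    for start, o, h, l, c, v in bars:
        # Assumption: coarse intervals are aligned to midnight.
        key = start - (start % target_minutes)
        if key not in buckets:
            buckets[key] = [key, o, h, l, c, v]   # first bar sets the open
        else:
            b = buckets[key]
            b[2] = max(b[2], h)   # highest high
            b[3] = min(b[3], l)   # lowest low
            b[4] = c              # last bar sets the close
            b[5] += v             # volumes add up
    return [tuple(b) for b in buckets.values()]


five_min = [
    (570, 10.0, 10.5, 9.9, 10.2, 100),   # 09:30
    (575, 10.2, 10.8, 10.1, 10.7, 150),  # 09:35
    (580, 10.7, 10.9, 10.4, 10.5, 120),  # 09:40
    (585, 10.5, 10.6, 10.0, 10.1, 90),   # 09:45
]
fifteen_min = compress_bars(five_min, 5, 15)
```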

StreetSmart

#11
Yeah, I understand that (the data would be non-existent in the current log).

What I'm trying to do is change the download scale from daily to 15 minutes without having to completely rebuild the datasets as new data sets.

Eugene

#12
QUOTE:
What I'm trying to do is change the download scale from daily to 15 minutes without having to completely rebuild the datasets as new data sets.

It's possible.

1. Find your Data directory (see WLP User Guide -> Data -> Where data is stored).
2. Open the DataSets folder.
3. Copy the daily DataSet XML file you want to duplicate as a 15-minute one, under a different name (e.g. "Dow 30" becomes "Dow 30 (15)").
4. Open the new XML file with a text/XML editor.
5. Find the following block:
CODE:
<Scale>Daily</Scale> <BarInterval>0</BarInterval>

6. Change it this way:
CODE:
<Scale>Minute</Scale> <BarInterval>15</BarInterval>

7. Save it.
8. (And restart WL5 to see the change - thanks, Robert)
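
For anyone who has several DataSets to convert, the manual edit in steps 5 and 6 could be scripted. The following is only a rough Python sketch, not an official tool: the function name `retarget_dataset_xml` is made up, and it assumes the file contains exactly the `<Scale>` and `<BarInterval>` elements quoted later in this thread (post #17). Work on a copy of the file, as in step 3.

```python
import re


def retarget_dataset_xml(xml_text, scale, bar_interval):
    """Return xml_text with the first <Scale> and <BarInterval> values replaced.

    Assumes one <Scale>...</Scale> and one <BarInterval>...</BarInterval>
    element, as quoted in this thread. Anything else raises an error
    rather than silently writing a half-edited file.
    """
    xml_text, n_scale = re.subn(
        r"<Scale>[^<]*</Scale>", f"<Scale>{scale}</Scale>", xml_text, count=1
    )
    xml_text, n_interval = re.subn(
        r"<BarInterval>[^<]*</BarInterval>",
        f"<BarInterval>{bar_interval}</BarInterval>",
        xml_text,
        count=1,
    )
    if n_scale != 1 or n_interval != 1:
        raise ValueError("expected one <Scale> and one <BarInterval> element")
    return xml_text


daily = "<Scale>Daily</Scale> <BarInterval>0</BarInterval>"
fifteen = retarget_dataset_xml(daily, "Minute", 15)
```

To apply it to a real file, read the copied XML into a string, pass it through the function and write the result back. The rest of the file is left untouched.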

Eugene

#13
QUOTE:
However, right-clicking on those symbol(s) does not allow deleting them from the data set.

Is there some way of deleting individual symbols in the set through the data manager?


Wealth-Lab User Guide > Data > Data Manager > How to: Add Symbols or Modify a DataSet.

Eugene

#14
ivraju

These are binary files. When updating, such a file is neither read completely from scratch nor parsed like an ASCII file (when it's not cached by the ASCII provider).

A .WL file is not a flat binary: Wealth-Lab knows the position where to insert new data and the total bar count just by reading the header. When updating, it appends the new bars to the end of the file and updates the header. The actual file size does not matter.
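
The idea can be sketched in a few lines of Python. This is NOT the real .WL layout, which is not described in this thread: the header (a single bar count) and the bar record (six doubles) are hypothetical, picked only to show that an append needs to read the header and nothing else, whatever the file size.

```python
import struct

HEADER = struct.Struct("<i")     # hypothetical header: bar count only
BAR = struct.Struct("<6d")       # hypothetical bar: date, open, high, low, close, volume


def create(path):
    """Create an empty file whose header says it holds zero bars."""
    with open(path, "wb") as f:
        f.write(HEADER.pack(0))


def append_bars(path, bars):
    """Append bars without reading any of the existing bars."""
    with open(path, "r+b") as f:
        (count,) = HEADER.unpack(f.read(HEADER.size))
        # The insert position follows from the header alone.
        f.seek(HEADER.size + count * BAR.size)
        for bar in bars:
            f.write(BAR.pack(*bar))
        f.seek(0)
        f.write(HEADER.pack(count + len(bars)))   # update the bar count
    return count + len(bars)


def bar_count(path):
    """Read the total bar count from the header."""
    with open(path, "rb") as f:
        return HEADER.unpack(f.read(HEADER.size))[0]
```

In this sketch, appending one bar to a file with 100,000 bars costs the same as appending it to a file with 33,000 bars: one small read, one seek and two small writes.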

Cone

#15
Why not just copy the symbols from one DataSet, create a new DataSet, paste the symbols and select a new time scale? That's got to be easier than finding the DataSet XML files, manually editing them, and restarting WL!

StreetSmart

#16
QUOTE:
However, right-clicking on those symbol(s) does not allow deleting them from the data set.

Is there some way of deleting individual symbols in the set through the data manager?


Wealth-Lab User Guide > Data > Data Manager > How to: Add Symbols or Modify a DataSet.

Yes, that's definitely correct. However, I was trying to delete symbols from the list in the Symbols Detail mode (so that I could delete all symbols that are returning 0 bars).

StreetSmart

#17
QUOTE:
1. Find your Data directory (see WLP User Guide -> Data -> Where data is stored).
2. Open the DataSets folder.
3. Copy the daily DataSet XML file you want to duplicate as a 15-minute one, under a different name (e.g. "Dow 30" becomes "Dow 30 (15)").
4. Open the new XML file with a text/XML editor.
5. Find the following block:

<Scale>Daily</Scale> <BarInterval>0</BarInterval>
6. Change it this way:

<Scale>Minute</Scale> <BarInterval>15</BarInterval>

That's exactly what I was looking for! Can I assume that if I change the existing datasets in this way, the data will now be downloaded in 15-minute bars and I can adjust the chart scale to anything 15 minutes or more?

Thanks.

StreetSmart

#18
QUOTE:
Why not just copy the symbols from one DataSet, create a new DataSet, paste the symbols and select a new time scale? That's got to be easier than finding the DataSet XML files, manually editing them, and restarting WL!


That's a great idea, Cone. But I'd still end up with twice as many datasets. My objective is to alter the download scale on the existing datasets. After all, I can always chart them in longer-duration scales (daily) as long as I have the shorter-duration scale (15-minute) data.

Thanks

Eugene

#19
QUOTE:
Yes, that's definitely correct. However, I was trying to delete symbols from the list in the Symbols Detail mode (so that I could delete all symbols that are returning 0 bars).

That's not an option.

Script a quick Strategy that:
* goes through the DataSetSymbols list (see it in the QuickRef),
* collects only the symbol names with > 0 bars (Bars.Count > 0),
* prints the collected symbols with PrintDebug.

Then:
* Run the Strategy in single-symbol mode.
* Copy the tickers.
* In your DataSet, Select All and hit Delete.
* Finally, paste the list of valid symbols.

Repeat for each dataset involved.

Eugene

#20
QUOTE:
Can I assume that if I change the existing datasets in this way, the data will now be downloaded in 15-minute bars and I can adjust the chart scale to anything 15 minutes or more?

Yes, I think that it should. Basically, I suggested copying just for backup purposes.

StreetSmart

#21
Thanks Eugene. That's exactly what I did (changed Daily to 15 Minutes) and the outcome was exactly what I was looking for. Of course, the first data download took 7 hours (often adding 300K bars to an equity), but subsequent downloads will be much quicker.

Thank you.