Random data provider
Author: Eugene
Creation Date: 6/1/2010 4:20 AM

Eugene

#1
The Random static data provider, part of Community.Providers, is available for installation from our website.

This provider can quickly create large DataSets with randomized historical data for blind testing purposes. Hopefully, it will help you fight over-optimization and design better trading systems without curve-fitting.

Random data provider - installation
Random data provider - home page (online documentation)

Sammy_G

#2
Tried downloading it. Got error message: This extension is for WealthLab Developer only. Why are WLPro customers being excluded?

Eugene

#3
Thank you for the heads-up. A new version fixing the bug has been uploaded. I hope this wasn't too aggravating.
profile picture

Sammy_G

#4
Downloaded and installed, thanks.

Created 2 datasets of 10 symbols each - one daily, the other 1-min bars - with a start date approx. 1 month ago (all default choices, by the way). When I opened their charts, they were all blank with the message 'No Data Available'. I must be doing something wrong.

Eugene

#5
Opening a blank chart of a "new" symbol happens because on-demand data updates were turned off in the Data Manager. (And for the better! As a side note, I always suggest turning this option off right after installing WL5: it's better to have your DataSets updated physically, since collecting data on demand slows down strategy execution and gives a false impression of slow application performance.)

In this case, you need to update the two DataSets first. (The easiest way with other data providers is to "Update all data..." by provider, but the Random provider currently doesn't support this feature.)

What you need to do is to open the Data Manager, highlight one of the DataSets, and click "Update DataSet", then repeat for any other Random dataset you might have.

Sammy_G

#6
OK, did that and now I have data!
Not to be critical or anything, but it's even created data for dates that were weekends or holidays; that's unrealistic for trading purposes (at the very least it will affect total # of bars between dates).

Also, will the data have to be manually updated thus every time?

Eugene

#7
QUOTE:
Not to be critical or anything, but it's even created data for dates that were weekends or holidays;

What a stupid bug with trading on weekends. Thank you for spotting it. I've just fixed it; another version has been uploaded to the server.

A holiday in one country is not a holiday in another, so for simplicity's sake, we'll leave this as is for now.
QUOTE:
Also, will the data have to be manually updated thus every time?

Right, but on-demand updates and/or refreshing the chart will work as well - anything but the "Update all data" button.
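
The weekend handling discussed above can be sketched as follows. This is only an illustration in Python, not the provider's code: the function name `weekday_dates` and the optional `holidays` argument are hypothetical. It keeps Monday to Friday and, in line with the reply above, removes holidays only if the caller supplies its own calendar.

```python
from datetime import date, timedelta


def weekday_dates(start, end, holidays=frozenset()):
    """Yield each date from start to end (inclusive) that falls on Mon-Fri.

    `holidays` is a hypothetical, caller-supplied set of dates to skip;
    by default nothing but weekends is removed.
    """
    day = start
    while day <= end:
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        if day.weekday() < 5 and day not in holidays:
            yield day
        day += timedelta(days=1)


if __name__ == "__main__":
    for d in weekday_dates(date(2010, 6, 1), date(2010, 6, 14)):
        print(d.isoformat(), d.strftime("%a"))
```

Because holiday schedules differ by market, leaving the calendar to the caller keeps the generator itself market-neutral.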

Sammy_G

#8
Updated to build 2, created a new daily dataset of 10 symbols (choosing all default values). The weekend bars are gone, but the holiday bars are there; you think Cone might develop a Data Truncation Utility for these random datasets ;)

Now there's a new bug: Of the 10 symbols, the first 2 have identical prices, the next 5 have identical prices and then the last 3 have identical prices. In other words, there are really only 3 unique price series in the 10 symbol dataset.

Eugene

#9
Re: holidays, please see my reply above. Holiday schedules are very different around the globe, so it's not practical to include them in this provider. (Later on, when the Market Manager tool arrives, it will be a no-brainer to "remove" holidays w/o affecting the data itself.)

As to the bug, it took me quite some time to figure out the problem. Long story short, there's nothing wrong with the logic. The problem is caused by the provider... being too fast. It's only evident when creating short periods of daily data, and never happens when creating hundreds of years and millions of intraday bars.

For whatever reason, the random data generator isn't being reset properly. In the next build I'll insert a delay in the code. For now, you can simply request more bars (e.g. 10 years of daily data or a month of intraday bars is just fine).

Eugene

#10
To anyone interested.

By design, the provider creates Random objects with the parameterless constructor, in which case .NET uses a time-based seed. Because the provider is fast, returning a new symbol with 30 bars of data in no time, .NET ended up using the same seed for several consecutive instances. That's why only the tiny 30-bar symbols repeated instead of being random, while normal usage scenarios (e.g. millions of bars) were unaffected: generating that much data takes some CPU time. Inserting just a token delay between function calls fixed the issue. A new version has been uploaded.

Two helpful links:

Random Number in C#, Be careful of some of the samples you find.
C# Random Number Generator
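
The seeding behaviour described in this post can be reproduced in a small sketch. It is written in Python rather than C#, and a simulated coarse clock stands in for the time-based default seed. The function names, the 300-microsecond cost per symbol, and the 1-millisecond clock tick are all illustrative assumptions, not the provider's code. The last generator function shows a commonly suggested alternative to inserting a delay, namely drawing every symbol from one shared generator; it is not the fix the provider used.

```python
import random


def make_series(rng, bars=30, start=100.0):
    """Random-walk closing prices drawn from the supplied generator."""
    price = start
    closes = []
    for _ in range(bars):
        price *= 1.0 + rng.gauss(0.0, 0.01)
        closes.append(round(price, 2))
    return closes


def time_seeded_symbols(count, cost_us=300, tick_us=1000):
    """One new generator per symbol, seeded from a simulated coarse clock.

    cost_us and tick_us are assumed values chosen for illustration.
    """
    series = []
    for i in range(count):
        elapsed_us = i * cost_us      # simulated time at which symbol i is created
        seed = elapsed_us // tick_us  # the clock value only changes once per tick
        series.append(make_series(random.Random(seed)))
    return series


def shared_rng_symbols(count, seed=12345):
    """All symbols draw from a single generator, so no two share a stream."""
    rng = random.Random(seed)
    return [make_series(rng) for _ in range(count)]


def unique_count(series):
    """Number of distinct price series in the list."""
    return len({tuple(s) for s in series})


if __name__ == "__main__":
    print("time-seeded:", unique_count(time_seeded_symbols(10)))
    print("shared generator:", unique_count(shared_rng_symbols(10)))
```

With the assumed numbers, the ten symbols receive the seeds 0, 0, 0, 0, 1, 1, 1, 2, 2, 2, so only three distinct series come out of the time-seeded variant. A shared generator does not depend on how fast the symbols are created.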

Sammy_G

#11
Good detective work, Eugene. It's amazing what bugs can teach you about the inner workings of software. One thing I worry about, just thinking ahead: as CPUs get faster and faster, will the delay still be long enough in the future?

Potential trap: Once the Random data provider creates a dataset and the symbols get populated with data, I don't know whether refreshing the data recreates all symbols afresh or just appends data at the end, as with any other provider. If it's the latter, a user must delete previously created defective data files by going to the respective folders (this is what I did). Which brings me to a suggestion: Perhaps the Random data provider could delete previously created data files and/or data sets, or some such option could be made available in the Data Manager.

Eugene

#12
Actually, refreshing is simply a matter of deleting all DataSets of a provider and updating in "all data" mode with the option "Delete data for non-DataSet symbols" enabled. This effectively wipes out the existing data. For some technical reasons not to be discussed here at this time, this option is disabled in the provider.

Eugene

#13
Updated Random provider to v2010.06.4:

* Added: support for Strategy Monitor
In response to:
QUOTE:
Also, will the data have to be manually updated thus every time?

* Added: possible to update all data by provider

Note: due to a live bug, do not use this mode ("Update all data...") when your Random DataSets contain new symbols w/o data. Instead, create new DataSets and update each of them manually to pre-fill them with data. Then you can safely use "Update all data..."

Eugene

#14
Random data provider updated to 2010.09:

Summary of changes:

Added: streaming provider
Changed: possible to install in Wealth-Lab 6.x
Fixed: bar data editing
Fixed: previous close is honored on updates to maintain data consistency
Fixed: Open/Close could potentially get above/below High/Low
Fixed: increased delay between symbols (sometimes identical random numbers still slipped in)
Fixed: various cosmetic/internal changes and fixes
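
Two of the fixes listed above, keeping Open/Close inside the High/Low range and honoring the previous close, can be illustrated with a short Python sketch. It is not the provider's code: the names `next_bar` and `extend_series` and the volatility value are hypothetical, and reading "previous close is honored" as "each new bar opens at the last stored close" is an assumption.

```python
import random


def next_bar(prev_close, rng, volatility=0.01):
    """Build one random OHLC bar that opens at the previous close."""
    open_ = prev_close
    close = open_ * (1.0 + rng.gauss(0.0, volatility))
    # Widen the range outward from the open/close body, so that
    # High >= max(Open, Close) and Low <= min(Open, Close) always hold.
    high = max(open_, close) * (1.0 + abs(rng.gauss(0.0, volatility / 2)))
    low = min(open_, close) * (1.0 - abs(rng.gauss(0.0, volatility / 2)))
    return open_, high, low, close


def extend_series(last_close, count, rng):
    """Append `count` bars, continuing from the last stored close."""
    bars = []
    prev = last_close
    for _ in range(count):
        bar = next_bar(prev, rng)
        bars.append(bar)
        prev = bar[3]  # this bar's close seeds the next bar's open
    return bars


if __name__ == "__main__":
    for o, h, l, c in extend_series(50.0, 5, random.Random(1)):
        print(f"O={o:.2f} H={h:.2f} L={l:.2f} C={c:.2f}")
```

Deriving High and Low from the bar's body, rather than drawing them independently, is what rules out an Open or Close outside the range.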

Wealth-Lab 6 customers can install it from Extensions. For more information, please check out the Wealth-Lab Wiki page.

mkalayog

#15
The daily option of the Random data provider generates data only back to 1/1/10, whereas the minute option generates data as far back as I need. If I enter 1/1/2000 as the daily data starting point and then update the dataset, the log indicates that I add the right # of bars (something like 2600+). However when I chart any of the RAND symbols, the chart shows that the data only go back to 1/1/10. What's the problem?

Thanks-

Eugene

#16
Most likely, the problem is that you updated its data in Update All Data mode. The Random provider does not support this feature properly due to some complications (though support will be added in the upcoming release early next year).

To reset the data for the existing symbols, follow the tips in the Wealth-Lab Wiki FAQ on Data and Data Providers:

How to delete (e.g. broken) data files?

Also you have the option to create a new DataSet with different symbols (i.e. not RAND0... but something else e.g. RND0...) and update this particular DataSet.

mkalayog

#17
Neither of those approaches works:
Refreshing dataset doesn't show data before 2010; I also tried different symbols and the data still get loaded to beginning of 2010 and not earlier. Keep in mind that loading minute data works just fine...

Thanks.

Eugene

#18
Sorry, I forgot to mention that you need to delete the existing Daily DataSet as well if you wish to start from an earlier date. (The starting date is fixed for a symbol.)

With regard to approach #2, please detail the steps that you are taking.

mkalayog

#19
Data Manager -> Create New DataSet -> Random Data Provider -> symbol name prefix = abczzz (new name), symbols = 10, data scale = daily, starting date 1/1/2000, market hours unchanged --> next --> dataset name = random abczzz --> finished --> select 'data sets tab' in data manager, then select 'random abczzz' --> click 'update dataset' --> here is the log:
Up-to-date symbols: 0, Update required for: 0, New symbols: 10
Creating new random symbols...
Symbol: ABCZZZ0, Created random bars: 2861
Symbol: ABCZZZ1, Created random bars: 2861
Symbol: ABCZZZ2, Created random bars: 2861
Symbol: ABCZZZ3, Created random bars: 2861
Symbol: ABCZZZ4, Created random bars: 2861
Symbol: ABCZZZ5, Created random bars: 2861
Symbol: ABCZZZ6, Created random bars: 2861
Symbol: ABCZZZ7, Created random bars: 2861
Symbol: ABCZZZ8, Created random bars: 2861
Symbol: ABCZZZ9, Created random bars: 2861
Update completed (0.312 sec)

So then I select a symbol from that dataset and double-click it to chart it (no strategy loaded), with scale=daily + data range = all data, and I see the bars going back only to 12/21/2009.... Also when I click on symbol details of the dataset each symbol has only 261 bars showing...


Eugene

#20
Thank you for reporting. I confirm this bug with 6.x but I'm pretty sure it was working back in the 5.6 days when the provider was released. Marking it to fix in the upcoming release.

mkalayog

#21
yes it used to work w/ 5.6...
thx

mkalayog

#22
Eugene, it would also be great to have the option to select the method of randomization (Power Law method (by tmh13) vs. simple method (by johnjan)). Thanks,

Eugene

#23
Sorry, currently this is not planned because I'm reluctant to add a Data Manager tab for just one option. I prefer not to overload the interface with options, switches, choices etc. (Maybe later.) What's wrong with having the randomization method selected at random?

Eugene

#24
The Random provider has been updated to version 2011.01 on 01/12/2011.

This is a maintenance release. Many bugs have been fixed. To name a few: "Update All Data" is fully working again, and the failing Bar Data Editor for intraday charts, stability issues, and the incorrect starting date of generated data have been fixed.

Note: the provider can be installed in Wealth-Lab 6.1+ only, yet it is incompatible with the new (as of v6.1) GetSessionOpen method.

mkalayog

#25
Thanks for the update, Eugene. Do you know when you'll release the ability to select the method of randomization (per 12/26/2010 above)? This is critical since not all random #s are the same, and the difference between results generated by assuming a Gaussian distribution ("simple method" by johnjan in the provider) vs. a power law/scalar distribution (by tmh13) can be enormous. Since there is currently no way to determine which symbols are generated by which method, I have no way of parsing out the effect of the random provider on my strategy.

Thanks,
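
To illustrate why the choice of distribution matters, here is a minimal Python sketch comparing Gaussian moves with heavy-tailed, power-law moves. It does not reproduce either randomization method in the provider: the function names, the Pareto exponent of 3, and the scale of 0.01 are assumptions chosen only to show the difference in tail behaviour.

```python
import random
import statistics


def gaussian_moves(n, rng, sigma=0.01):
    """Price changes drawn from a normal distribution."""
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def power_law_moves(n, rng, alpha=3.0, scale=0.01):
    """Symmetric price changes whose magnitudes follow a Pareto (power-law) tail."""
    moves = []
    for _ in range(n):
        # paretovariate() returns values >= 1, so subtract 1 to start at zero.
        magnitude = scale * (rng.paretovariate(alpha) - 1.0)
        sign = 1.0 if rng.random() < 0.5 else -1.0
        moves.append(sign * magnitude)
    return moves


def tail_ratio(moves):
    """Largest absolute move divided by the median absolute move."""
    sizes = [abs(m) for m in moves]
    return max(sizes) / statistics.median(sizes)


if __name__ == "__main__":
    rng = random.Random(7)
    print("gaussian tail ratio: ", tail_ratio(gaussian_moves(10_000, rng)))
    print("power-law tail ratio:", tail_ratio(power_law_moves(10_000, rng)))
```

The power-law sample produces occasional moves far larger than its typical move, which is the kind of difference a strategy tested on mixed symbols could not attribute to either method.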

Eugene

#26
I'll consider it. A lot of updates to our extensions were made this month, including two new Streaming providers, so this option didn't make it into the current release cycle.

mkalayog

#27
Thanks Eugene - I know you guys have been busy and I saw the 10 or so updates to the extensions! Thanks for considering,

Eugene

#28
The change turned out to be slightly more involved than it seemed initially. I'll implement the choice of randomization method when creating new DataSets (only) with a few exceptions:

1. Not possible to select a method for Streaming
2. Not possible to select a method in Strategy Monitor
3. Does not apply to intraday data which always uses its own sort of simple randomization

Any old Random DataSets will be lost. Stay tuned for the update early next month.

mkalayog

#29
Thank you - looking forward to it!

Eugene

#30
Random provider updated to version 2011.02.

As requested, the New DataSet Wizard dialog contains a new option. The provider now lets you specify which randomization method to use when creating new DataSets.

Breaking change: all DataSets created before will be lost and have to be created from scratch (but all previously generated data is preserved).

For more information refer to the online manual page in the Wiki.

Eugene

#31
Random data provider updated to 2012.09.

Summary of changes:

* Fix: (62778) Shifted DataSeries break execution in Strategy Monitor on non-native scale
* Change: possible to install in Wealth-Lab 6.3

Eugene

#32
Random data provider 2017.04 can be installed or updated in Wealth-Lab 6.9+; .NET 4.5+ is required.