Dataset Optimization Slowing
Author: sdbens20
Creation Date: 2/11/2011 10:57 AM

sdbens20

#1
I am optimizing a dataset of 280 symbols. The first symbol optimizes in 40 seconds. At the 30th symbol, optimization is taking 4+ minutes. Is this decline in speed to be expected as the process steps through the dataset? Is there a way to reduce this slowdown?

My computer processor is a quad-core hyperthreading i7. It's using only 12% of its capacity. 33% of system memory is being used. Is there a way to get WLP to use more of my computer's capacity?

Are there plans to modify WLP to utilize multi-core processors? Plans to utilize graphics processors?

Thank you.

sdbens20

#2
Just found a post about a similar problem from a couple of years ago. In that case, the user solved the problem by turning off "On Demand Data Updates - Automatically update data for symbols on-demand when they are charted or accessed." I'm testing this change on my optimization now and will let you know if speed improves.

Thanks.

Eugene

#3
QUOTE:
Is this decline in speed to be expected as the process steps through the dataset?

It's not possible to tell for sure because we haven't seen your code. Normally - no, it's not expected, but the information provided is not enough. No optimization method, no data loading range, no code (for starters).
QUOTE:
Is there a way to get WLP to use more of my computer's capacity?

Unfortunately, at this stage there's no easy solution. That requires changing .NET Framework version to 4.0 to utilize the new parallel processing goodness.
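Until that happens, the shape of per-symbol parallelism that a .NET 4.0-style task library would allow can be sketched generically. This is an illustrative Python stand-in, not Wealth-Lab code: `optimize_symbol` is a hypothetical placeholder for the per-symbol backtest loop, and Python threads are used only to show the structure of the code.

```python
from concurrent.futures import ThreadPoolExecutor

def optimize_symbol(symbol):
    # Hypothetical placeholder for a per-symbol optimization run;
    # in reality this would be the CPU-heavy backtest/optimize loop.
    return symbol, sum(i * i for i in range(10_000))

symbols = ["AAPL", "MSFT", "IBM", "CSCO"]  # made-up stand-ins

# Dispatch one optimization job per worker instead of running
# strictly sequentially on a single core.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(optimize_symbol, symbols))

print(sorted(results))
```

In .NET 4.0 the equivalent tasks would genuinely spread across cores; the point of the sketch is only the per-symbol work decomposition.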
QUOTE:
Are there plans to modify WLP to utilize multi-core processors?

Fidelity is aware that it's an item of high demand. Please call Fidelity and leave a request, thank you.
QUOTE:
Plans to utilize graphics processors?

Developers may do this on their own, and I know that some of our partners are already having success boosting WL6 with GPGPU technologies.

sdbens20

#4
Turning off "On Demand Data Updates - Automatically update data for symbols on-demand when they are charted or accessed" did not fix the slowdown problem. After changing the setting and restarting the optimization, speed has gone from 40 seconds for the first symbol to an average of 113 seconds at symbol 26.

My code follows: It's a Williams %R crossover system I created by selecting rules and then having Wealth Lab convert it to script.
CODE:
Please log in to see this code.


Optimization Method: Generic
Data Loading Range: 1 year, daily values
Raw Profit Mode: Fixed Dollar



sdbens20

#5
I have called Active Trader Support and spoken to a WLP specialist to request that Fidelity improve WLP to include support for multi-core processors and GPU processing. Please let me know what else I can do to help promote these improvements being made.

Thank you.

Sherm

Cone

#6
There's no reason that your script would run longer than several msec for each symbol. If you're using a Fidelity Daily DataSet with On-demand off, symbols will be loaded and executed very fast.

What type of optimization are you running, and how do you know which symbol is currently being optimized?

Eugene

#7
He's probably running Genetic (not 'Generic') optimization. In that case, two things should generally be avoided: 1) running concurrent GA optimizations and 2) watching the progress graph in real-time.

sdbens20

#8
Cone,

I open the results tab to see where the optimization is in the list of symbols.

Eugene,

Yep, I meant Genetic. Also, what button do I push to avoid running concurrent GA optimizations? I'm running only one instance of optimization (only one green progress bar is displayed). Finally, does "watch the progress graph in real-time" mean that I should not have the Optimization Control tab open so that the green progress bar is visible?

Sherm

Eugene

#9
Sherm,

Please disregard. A single running GA instance is OK (concurrent ones are currently questionable, although for some users they work just fine). The progress bar is OK; it's the Fitness graph that will slow optimization down.

sdbens20

#10
Eugene,

Where do we stand on defining/correcting the cause of the slowdown? From Cone's comment, it looks like it shouldn't be happening.

Eugene

#11
Have you tried copying/pasting the code into a new strategy window and optimizing?
What about optimizing other, canned strategies?
Do other optimization methods work as expected, with your code and with canned strategies?

sdbens20

#12
The optimization continuously gets slower and slower as it works through my dataset, whether using my strategy or a canned strategy or using Genetic or Monte Carlo methods.

Do you experience a slowdown when optimizing a dataset?

Here's some data from different trial runs optimizing my 290 symbol dataset using Active Trader 2008-02 Bow Tie Variation or my strategy, and using either Genetic or Monte Carlo methods. Data Scale is Daily, Data Range is 1 Year, and Position Size is $1,000,000. I'm running Windows 7 Pro, 64-bit; no other applications are open, the processor is 12% utilized, and 25% of memory is in use.

Several trials are included below because I noticed that closing and reopening WLP between optimizations resulted in reduced per-symbol processing times and, for the Genetic method, steadier increases in processing time from one symbol to the next.

Trial #1 Active Trader 2008-02 Bow Tie Variation, WLP not restarted before optimizing

Genetic:

Symbol   Time to Optimize   Per-Symbol Time Increase
#1       35 sec             ---
#2       41                 6 sec
#3       47                 7
#4       44 (??)            -3
#5       69                 25
#6       80                 11
#7       92                 12
#8       109                17
#9       122                13
#10      123                1
#11      139                16
#12      152                13

Trial #2 Active Trader 2008-02 Bow Tie Variation, WLP restarted before optimizing

Genetic:

Symbol   Time to Optimize   Per-Symbol Time Increase
#1       28 sec             ---
#2       33                 5 sec
#3       37                 4
#4       37 (??)            0
#5       46                 9
#6       50                 4
#7       54                 4
#8       61                 7
#9       63                 2
#10      60                 -3
#11      73                 13
#12      72                 -1

Trial #3 my strategy, WLP restarted before optimizing

Genetic:

Symbol   Time to Optimize   Per-Symbol Time Increase
#1       33 sec             ---
#2       39                 6 sec
#3       44                 5
#4       50                 5
#5       56                 6
#6       61                 5
#7       67                 6
#8       72                 5
#9       78                 6
#10      84                 6
#11      90                 6
#12      95                 5

Note: At this rate of increase, the optimization of 290 symbols will take 72 hours.

Trial #4 my strategy, WLP not restarted before optimizing

Monte Carlo (20 runs, 10 tests per run; data presented differently because the green progress bar displays differently)

Elapsed Time   Symbols Processed     Remaining Time
1 min          15 (avg=15/min)       19m33s
2              23 (avg=11.5/min)     15m54s
3              34 (avg=11.3/min)     16m41s
4              41 (avg=10/min)       24m24s
6              54 (avg=9/min)        26m6s

Trial #5 Active Trader 2008-02 Bow Tie Variation, had to restart WLP before optimizing due to a crash

Monte Carlo (20 runs, 10 tests per run; data presented differently because the green progress bar displays differently)

Elapsed Time   Symbols Processed     Remaining Time
1 min          22 (avg=22/min)       12m26s
2              37 (avg=18.5/min)     13m43s
3              50 (avg=16.7/min)     14m27s
4              61 (avg=15.3/min)     14m59s
6              81 (avg=13.7/min)     15m31s

Finally, I let my strategy optimize (Monte Carlo, 40 runs, 20 tests per run) overnight (8+ hours). This morning, with 2+ hours remaining runtime, I clicked on the Results tab in the Optimization Control window to see how many symbols had been processed. The program crashed without presenting results.

Please let me know if you need more information.

Thanks,

Sherm

sdbens20

#13
Noticed that the formatting of my data was lost when posting my last message. To clarify, a line of numeric data in Trials 1-3 such as "#2 33 5 sec" means that the second symbol in the dataset took 33 seconds to process, which was 5 seconds longer than it took to process symbol #1. For Trials 4 and 5, a line such as "2 37 (avg=18.5/min) 13m43s" means that after 2 minutes of optimization, 37 symbols had been processed at an average of 18.5 symbols per minute, and Optimization Control was reporting that it would take another 13 minutes 43 seconds to process the entire dataset.
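Incidentally, the steady per-symbol increase in Trial 3 can be extrapolated with a short sketch (illustrative Python; the timings are copied from the Trial 3 data above, and the linear-trend projection is an assumption, not a measurement):

```python
# Per-symbol optimization times from Trial 3 (seconds).
times = [33, 39, 44, 50, 56, 61, 67, 72, 78, 84, 90, 95]

# Per-symbol increases, matching the third column of the trial data.
increases = [b - a for a, b in zip(times, times[1:])]

# Average slope of the slowdown (seconds added per symbol).
slope = (times[-1] - times[0]) / (len(times) - 1)

# Projected total time for 290 symbols if the linear trend holds:
# symbol n takes roughly times[0] + slope * n seconds.
total_sec = sum(times[0] + slope * n for n in range(290))
print(f"slope = {slope:.2f} s/symbol, projected total = {total_sec / 3600:.0f} hours")
```

The slope comes out to about 5.6 seconds per symbol and the projection lands near 68 hours, in the same ballpark as the 72-hour figure noted with the trial data.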

Eugene

#14
Which WLP version are you running, 32-bit or 64-bit?

Please create a new support ticket and leave a link to this thread. Thanks in advance.

p.s. By the way, your script boils down to:
CODE:
Please log in to see this code.

sdbens20

#15
64-bit

Support ticket just created.

Thanks for the improved script.

Sherm

Eugene

#16
Try not opening the Results tab to see where the optimization is in the list of symbols.

Populating ListView with thousands of entries can slow processing down, as does updating the Fitness graph in real-time.

p.s. Given that Wealth-Lab does not report per symbol run times, you must be calculating them with a timer?

sdbens20

#17
Eugene,

1. The slowdown occurs without opening the Results View. I started briefly opening/closing the Results view after noticing that dataset simulation was taking much longer than expected. Also, I have never opened the Fitness graph.

2. The Results window is only opened briefly from time to time to figure out when the slowing optimization might be ending.

3. Yep, I'm manually clocking the per symbol speed.

4. I noted in the support ticket that slowdown occurs when using either GA or Monte Carlo methods.

Sherm

PS: Have any of you experienced this slowdown yourselves? The recommendations I've received from different responders make it seem that no one has duplicated the problem. Please give it a first-hand look using the code I provided (or Eugene's improved version posted above) to see what's happening. Thanks.

Eugene

#18
While trying to reproduce the issue on Win7 x64 Enterprise using 64-bit WLD 6.1 (I don't have WLP around) with the strategy code and settings above, what I noticed was some difference between the estimated optimization time and the final elapsed time when profiling a Monte Carlo-based optimization. The first few symbols of the Nasdaq-100 DataSet predicted 19-20 minutes, and the total optimization finished in 22 minutes. No drastic slowdown was registered; the figures are in the same ballpark.

It does not make sense to time the Genetic optimizer due to the way it works: you never know in advance how many runs are required. Nonetheless, as Sherm points out, the slowdown occurs for him using either of these methods, so it's possibly not related to a particular optimizer implementation (all Wealth-Lab optimizers are in fact plugins, even the built-in methods) but to something more general.

Do you have a different PC to try?

sdbens20

#19
Eugene,

1. If you were asking whether I have another PC to try, the answer is yes, but both are much older than the one I'm using when experiencing the slowdown. As mentioned in my first post, I'm experiencing this slowdown on my i7 quad-core hyperthreading machine while using only 12% of processor capacity and 24% of available memory. It seems to me that my computer ought not be the cause of the slowdown.

2. Please take another look at my data posted 2/12 @ 11:04 AM. Trials 4 and 5 both used the Monte Carlo method. The slowdown was not drastic at first, but it becomes more drastic as each symbol is processed. In trial #4, using my script, processing speed dropped from 15 symbols/minute to 9 symbols/minute 6 minutes into the optimization. In trial #5, using a canned script as you suggested, processing speed dropped from 22/min to 13.7/min 6 minutes into the optimization. With the optimization speed progressively declining with each symbol processed, it takes days for my 280 symbol dataset to finish optimizing.

3. There is another difference between trials 4 and 5: WLP had not been restarted for trial 4 but was for trial 5. Restarting the program before running an optimization consistently reduces initial per-symbol processing time but does not fix the slowdown problem.

4. You provided me an improved version of my original script. I don't use it because it gives different results from the original. I'm not requesting that any attention be given to this difference unless you believe it's important to resolving the slowdown issue.

5. You mentioned your trial finished in 22 minutes compared to an early 19-20 minute prediction. As my optimization runs on the 280 symbol dataset, it also starts with a reasonable time to completion and then gets to predicting "days" for completion. At this moment, I'm GA-optimizing a 78 symbol dataset that started out taking only 34 seconds on the first symbol. At that rate, the optimization should complete in 44 minutes. It has taken 2 hours 54 minutes to get to the 56th symbol, and it now predicts taking 12 days 23 hours to complete.

6. I understand your comment about timing the Genetic optimizer's per-symbol processing. However, I do believe there is a rough range that the time per symbol will fall into, and that range is helpful in seeing that optimization time per symbol is, in very general terms, increasing. Trial #3 (2/12 @ 11:04 AM posting) shows that the time per symbol can, in fact, increase at a very consistent rate when using GA.

Thanks,

Sherm

sdbens20

#20
Eugene,

I created a support ticket for this problem but see that it's being addressed as a forum issue, not a support item. Does the fact that my problem is not being handled as a support issue reduce the priority given to finding a solution? Why is it not being treated as a support problem, and does it make a difference to the priority my problem receives?

Thanks,

Sherm

Eugene

#21
1. I'm asking not about their configurations but about whether they show the same behavior. You are the one with this symptom, currently; I'm merely trying to scout for hints, and currently there aren't many.

2. On a less powerful computer, it took less than half an hour to complete 100 symbols; I couldn't reproduce the issue. The available information doesn't make it possible for me to suggest anything else, given that you've disabled on-demand data updates. One suggestion, reinstalling the .NET Framework, which we make in the 1% of oddest-looking cases (DRASTIC slowdowns where opening certain WL windows takes minutes but backtests run very fast) and which does help there, wasn't given to you because those cases usually involved older XP machines, before .NET was part of Windows. You may still give it a try, but in your case the benefit is not evident.

3. Restarting always helps because it frees memory. On top of that, the Garbage Collector in .NET 2.0 is a slowpoke. I'd anticipate many positive effects if WL were moved to .NET 4.0; even the GC was overhauled.

4. Actually, the difference is caused by this line in my version:
CODE:
Please log in to see this code.

You may replace it with the similar line from your script w/o breaking anything:
CODE:
Please log in to see this code.


5. GA timings shouldn't be blindly trusted. Due to the algorithm's complexity, they depend on various factors and vary wildly. As the GA makes progress, notice how rapidly the number starts changing towards the middle/end.


Re: support vs. forum.

The notion that you're being given a lower priority is incorrect.

a. I asked you to create a ticket so we can exchange some data files and possibly have the GA optimizer developer look into it and, if possible, share his experience. That's it.
b. Currently, it is essential to keep this conversation in one place.

sdbens20

#22
In my computer's Control Panel listing of installed programs, there is a line called Microsoft .NET Framework 4 Client Profile. Should that program be reinstalled? Should .NET Framework 2 also be installed, or installed instead? Can multiple versions of the .NET Framework be installed on the same computer? Point #3 in your last post almost makes it seem that WLP is designed to run with .NET 2, not .NET 4.

I know nothing about this topic.

Eugene

#23
Yes, WL5/6 was designed to run with .NET 2 back in 2006/2007 because it was the current target platform version at the time. Nevertheless, .NET 4 includes .NET 3.5, 2.0, and 1.1 with service packs. You may try to reinstall the .NET 4.0 framework, and it will get 2.0 reinstalled as well, but there's no guarantee that it will work: your case is only somewhat similar to the cases we fixed in the past by reinstalling .NET 2.0 on older XP machines, before the framework became integrated into Windows starting with Vista.

Eugene

#24
1. As GA estimated run times can't be relied upon in this context, I wonder whether Exhaustive optimization has the same issue?

2. I'm also interested in a different scenario: a Portfolio Simulation multi-symbol backtest optimization. Could you run this kind of Exhaustive/MC optimization on your DataSet of Daily data (hopefully it's not minute data scaled into Daily?), noting the estimated time after processing the first 5-10 symbols and then again after 50 and 100 symbols? Is there a drastic slowdown, or something reasonable and expected, e.g. 10-20%?

3. Finally, I'd still like to know whether the issue can be reproduced on your other, older PCs. Since the issue occurs only for you, and so far we don't have a good hint as to why it might happen, it may be something about your computer's setup, installed 3rd-party programs, etc., something not evident.

P.S. To further stress the point that talk of reducing the priority of your issue is incorrect:

c. Creating a ticket already means raising the priority: the issue becomes registered, we can work on it later as more evidence/hints become available, and it's possible to share ideas with the GA developer, etc.

d. The two technicians, Cone and I, are used to having loads of requests/bugs/ideas/projects/articles/extensions to process in serving the Wealth-Lab user community. Our manpower is limited, and we already dedicate 6-7 days a week to various W-L activities. Thank you for your understanding.

sdbens20

#25
***Note: This post was written (but not sent) before your 2:07 AM and 4:07 AM messages.***

Eugene,

Just noticed that your version of my script displays (in the Results window) one column of period data and one column of smoothing data, while my version displays four columns each of period and smoothing data. Might that explain your faster processing time and the differences in profitability I see?

I am currently optimizing a 67 symbol dataset using your code with the one-line change you mentioned. It does process symbols much faster than my code, but slowing still occurs. The first symbol processed in 11 seconds. At symbol 25, the average was 26 seconds per symbol; at symbol 42 it had risen to 31 seconds, and by symbol 53 it was up to 40 seconds. All symbols were processed in 51 minutes, a new record for me. Net profits seemed greater, also.

Sherm

Eugene

#26
QUOTE:
Might that explain your faster processing time and the differences in profitability I see?

Your original code originated from the Rule Wizard. As a result, four times more %R and SMA series were created than actually needed. They all act the same, which is why there's no difference in Net Profit between the original and the handcrafted versions after replacing that line of code from my message dated 2/18/2011 3:02 PM. This is not necessarily a deficiency, as the generated code is correct, just not optimal. But some processing overhead can be added by the duplicate DataSeries.
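To picture why duplicated series add overhead, here is a generic caching sketch (plain Python, not WealthScript; the `sma` helper and its cache are hypothetical stand-ins, used only to contrast computing a series once versus four times):

```python
computations = 0  # counts how many times a series is actually built

def sma(values, period, _cache={}):
    """Return a simple moving average series, computing each
    (series, period) combination only once via a cache."""
    global computations
    key = (id(values), period)
    if key not in _cache:
        computations += 1
        _cache[key] = [
            sum(values[i - period + 1:i + 1]) / period if i >= period - 1 else None
            for i in range(len(values))
        ]
    return _cache[key]

prices = [10.0, 11.0, 12.0, 13.0, 14.0]  # made-up price series

# Rule-Wizard-style code asks for the same series four times...
series = [sma(prices, 3) for _ in range(4)]

# ...but the cache computes it once, like a handcrafted script would.
print(computations)   # 1
print(series[0][-1])  # (12 + 13 + 14) / 3 = 13.0
```

The results are identical either way; only the redundant work differs, which matches the "correct, just not optimal" point above.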

Eugene

#27
This may sound cumbersome, but try running several copies of Wealth-Lab 6 under different Windows user names. In contrast to simply opening several Workspaces, I think this way you could load all of the CPU cores, because each copy runs as an independent process with its own CPU affinity. (Edit: you might even want to set manual CPU affinity by highlighting each WealthLabPro.exe in Task Manager, right-clicking and selecting "Set affinity".) Then break the DataSet in two (four, etc.) and "assign" each part to one of the instances.

Sure, we've been suggesting that Fidelity move to .NET 4.0 and optimize WL6.x for parallel processing, but while that's not around, let me know if this speeds things up.
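The DataSet split itself is mechanical; here is a sketch of dividing a symbol list into near-equal chunks, one per Wealth-Lab instance (illustrative Python; the symbol names and `split_dataset` helper are made up for the example):

```python
def split_dataset(symbols, n_instances):
    """Divide a symbol list into n near-equal chunks, one per instance.
    Earlier chunks absorb any remainder, so sizes differ by at most 1."""
    size, extra = divmod(len(symbols), n_instances)
    chunks, start = [], 0
    for i in range(n_instances):
        end = start + size + (1 if i < extra else 0)
        chunks.append(symbols[start:end])
        start = end
    return chunks

# Stand-in for the 280-symbol DataSet discussed in this thread.
symbols = [f"SYM{i:03d}" for i in range(280)]
chunks = split_dataset(symbols, 4)
print([len(c) for c in chunks])  # [70, 70, 70, 70]
```

Each chunk would then be saved as its own DataSet and optimized by one of the independently running WLP instances.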