Core i5 runs WLP faster than Core i7
Author: haytac
Creation Date: 8/18/2019 1:27 AM

haytac

#1
Hi,

I have two machines both running WL Pro strategy (~ 300K lines).

To my surprise, the i5 is more than twice as fast as the i7.

                      i7           i5
CPU model             3770         2430M
Clock speed           3.4 GHz      2.4 GHz
Cores                 4            2
Threads               8            4
RAM                   8 GB         8 GB
Memory type           DDR3-1600    DDR3-1333
Module rating         PC3-12800    PC3-10600

Task Manager / Performance / Memory
                      i7           i5
Speed                 1600 MHz     20307 MHz
Form factor           DIMM         SODIMM
Hardware reserved     60.5 MB      88.8 MB

Time to completion: 10 seconds (i7) versus 4 seconds (i5)

The only thing that jumps out is the Task Manager memory speed: 20307 MHz versus 1600 MHz.
There are no decimal points, so I'm not sure 20307 makes sense. It does correlate, though.

I added 8 more GB to the i7 machine for a total of 16 GB with no appreciable difference.

Any inputs are appreciated.

forgot to add that:
the i7 (slower for WL) machine runs Windows 8.1
the i5 (faster for WL) machine runs Windows 10

Eugene

#2
Sorry, but I can't take this set of numbers seriously because it leaves out several really important factors:

1. First and foremost, the equality of WLP configurations. As per the Wiki FAQ:

* On-demand updates should be turned off.
* The data must be updated prior to running any tests.
* It's advised to disable as many performance visualizers as possible in WL's Preferences. Reopen the Strategy window to apply.
* From experience, it isn't uncommon for users reporting a similar issue to be running different Strategy revisions on each machine.

2. Disk type: SSD vs HDD. It's amazing that you didn't mention it at all, but WLP will run slower on an HDD.

3. How many runs were averaged to get these numbers? The first run doesn't mean a thing.

4. Software configuration. For example, what if the antivirus on the i7 is slowing down disk access?
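
To illustrate point 3, here is a minimal sketch of averaging several runs while throwing away the first ones. It is plain Python, not WealthScript, and `benchmark`, `task`, `runs` and `warmup` are hypothetical names made up for this example; substitute whatever starts your strategy run.

```python
import statistics
import time


def benchmark(task, runs=5, warmup=1):
    """Time task() `runs` times after `warmup` untimed calls.

    The warm-up calls absorb one-off costs (cold disk cache, first-time
    loading), which is why a single first run is not representative.
    """
    for _ in range(warmup):
        task()  # result and duration deliberately ignored

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        task()
        timings.append(time.perf_counter() - start)

    return {
        "mean": statistics.mean(timings),
        # stdev needs at least two samples
        "stdev": statistics.stdev(timings) if len(timings) > 1 else 0.0,
        "timings": timings,
    }


if __name__ == "__main__":
    # Placeholder workload standing in for a strategy run.
    result = benchmark(lambda: sum(range(100_000)), runs=5, warmup=1)
    print(f"mean {result['mean']:.4f} s, stdev {result['stdev']:.4f} s")
```

Comparing the mean and spread from both machines, measured the same way, says much more than two single numbers.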

superticker

#3
QUOTE:
i7 i5
...
3.4 GHz 2.4 GHz
...
Task manager / Performance / Memory
Speed 1600 MHz 20307 MHz (???)
I'm not completely following these specs. I think you mean 2307 MHz, not 20,307 MHz. Are you saying the 3.4 GHz machine has a front-side bus speed of 1.6 GHz, and the 2.4 GHz machine has an on-chip main memory (RAM) speed of 2.307 GHz?

If that's so, then there's your answer. On an L3/L2 cache miss, the 3.4 GHz processor will only be running at 1.6 GHz on the front-side bus. In contrast, the 2.4 GHz processor has no front-side bus or L3/L2 cache. All its main memory is on-chip (no DIMM RAMs present), so there will never be a cache miss at the L3 or L2 level. It only runs at 2.307 GHz with on-chip memory. (There's an L1 cache running at 2.4 GHz, but that's another issue. Forget I mentioned that.)

Bottom line, placing all the main memory on-chip (without the need for L3/L2 cache memory) really speeds things up. But you'll need to clarify this for sure because I'm having trouble determining--for certain--whether the 2.4 GHz has a front-side bus or not. I "think" you're saying it doesn't.

---
The number of processor cores isn't going to make much difference (WL optimization is single threaded) unless you're using BTUtils for Wealth-Lab. Are you? https://www.wealth-lab.com/Forum/Posts/BTAnalytics-Store-deeply-analyze-your-Backtests-and-improve-your-Strategies-39681 You can respond to that in the BTAnalytics thread.

Eugene

#4
@superticker, even though CPU/cache is your forte you'll see that this is irrelevant to the context after checking out the CPU specs or running a comparison. The "M" in Intel chips stands for mobile. The 2430M is a low performance laptop chip from 2011. No way it can beat a desktop i7 which should be roughly 3 times faster according to this benchmark.

superticker

#5
QUOTE:
The 2430M is a low performance laptop chip from 2011.
I didn't realize that. In that case, it has a front-side bus with L3/L2 cache and will be slow, and my analysis is wrong. I thought he was using one of the new generation chips with on-chip main memory. You can just delete my post #3; it's just noise.

haytac

#6
I have an update:

I inserted timing probes into sections of the code:
- In all other segments the i7 is faster, as expected. The main loop takes 1.3 seconds on the i7 and 1.9 seconds on the i5.
- In segments that contain DataSeries statements, the i5 is much, much faster.

For example, for a segment that has 10 DataSeries statements the i7 takes 2.2 seconds,
whereas the i5 takes hardly any time at all.

This code runs on 1-minute data.

Any inputs are appreciated.
Thanks!
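
For anyone wanting to repeat this kind of measurement, a timing probe can be sketched roughly as follows. This is generic Python, not the WealthScript code used above; `probe`, `segment_totals` and the segment labels are hypothetical names for illustration only.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per labelled code segment.
segment_totals = defaultdict(float)


@contextmanager
def probe(label):
    """Add the time spent inside the `with` block to segment_totals[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        segment_totals[label] += time.perf_counter() - start


if __name__ == "__main__":
    with probe("series setup"):
        data = [i * 0.5 for i in range(200_000)]  # placeholder work
    with probe("main loop"):
        total = sum(data)  # placeholder work
    for label, seconds in segment_totals.items():
        print(f"{label}: {seconds:.4f} s")
```

Running the same probes on both machines shows which segment accounts for the difference, as in the DataSeries case above.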

Eugene

#7
Impossible to tell without having any detail. We still know nothing (zilch, nada) about your DataSeries, how Wealth-Lab is configured (see post #2), symbol(s) and data provider and so on.

haytac

#8
Thank you for your patience,

Both machines had identical settings for:
- Visualizers
- On demand data. Help says: this is ignored in streaming mode. Makes sense.

In the process of comparing, I saw that the i5 was set to load 4,000 bars whereas the i7 loaded all data.

Once I changed the i7 to 4,000 bars, its run time became shorter than the i5's.

So, problem resolved.

The i5 1 minute file was about 20 MB. The i7 before the change was 70 MB.
I am guessing that all data amounts to about 12,000 minutes of data.

It is interesting that each DataSeries costs about 200 milliseconds to process the 8,000 additional data points
from multiple original series. That is about 25 microseconds for each minute's computation.
The 1-minute file is probably in RAM, and each computation probably takes a dozen memory-to-register transactions. With 1600 MHz RAM the 25 microsecond number makes sense.
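
The arithmetic behind those numbers, written out as a small Python check. The inputs are the rough figures quoted in this thread (4,000 bars, a guessed 12,000 bars, about 200 ms per DataSeries, 20 MB and 70 MB files); the variable names are just for illustration, and the file-size estimate assumes size scales linearly with bar count.

```python
bars_loaded_i5 = 4_000       # bars the i5 was set to load
bars_all_data = 12_000       # guessed size of "all data" on the i7
ms_per_dataseries = 200      # approximate extra cost per DataSeries statement

# Extra bars the i7 had to process compared with the i5.
extra_bars = bars_all_data - bars_loaded_i5

# Cost per extra bar, converted from milliseconds to microseconds.
us_per_bar = ms_per_dataseries * 1_000 / extra_bars

# Rough cross-check of the bar count from the file sizes (20 MB vs 70 MB).
bars_from_file_size = bars_loaded_i5 * 70 / 20

print(extra_bars)           # 8000
print(us_per_bar)           # 25.0
print(bars_from_file_size)  # 14000.0
```

The file-size ratio suggests around 14,000 bars, the same ballpark as the 12,000 guess.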

Thanks again!

Eugene

#9
Glad you got the puzzle solved. :)
QUOTE:
In the process of comparing I saw that i5 had 4,000 bars whereas i7 had all data.

Hint: Since identical configuration matters, it would have been easier to achieve by copying WealthLabConfig.txt (and other related files) over from the donor PC.
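
If copying the file over is not an option, comparing the two copies also works. Below is a minimal Python sketch using the standard `difflib` module. It assumes the config is a line-oriented text file; the function name `config_differences` and the setting names in the example are hypothetical, not actual WealthLabConfig.txt keys.

```python
import difflib


def config_differences(text_a, text_b):
    """Return only the lines that differ between two config texts.

    Lines prefixed "- " exist only in text_a, "+ " only in text_b.
    """
    diff = difflib.ndiff(text_a.splitlines(), text_b.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    # Hypothetical contents standing in for the two machines' config files.
    config_i7 = "Bars=0\nVisualizers=off"
    config_i5 = "Bars=4000\nVisualizers=off"
    for line in config_differences(config_i7, config_i5):
        print(line)
```

In practice the two texts would be read from the copy of the file on each machine.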