My little Delphi Memory Replacement Test
2004-03-02

The Candidates

Name                   | Distributable (1) | Version          | Where to get?                            | Cost
-----------------------+-------------------+------------------+------------------------------------------+---------
Pure Delphi            | Yes               | D6               | Built in                                 | included
HPMM                   | ?                 | 2001-09-19 - 1.2 | formerly www.optimalcode.com, down today | free
RecyclerMM             | ? (No)            | 2004-01-29       | http://glscene.sourceforge.net/index.php | free
BigBrain               | ? (Yes)           | 2004-02-16       | http://www.digitaltundra.com/            | $50
NexusDB Memory Manager | Yes               | 2004-01-19       | http://www.nexusdb.com/                  | $70
QMemory                | ?                 | 2001-01-24 - 2.01 | via DSP                                 | free

(1) Read as: you can distribute your compiled commercial application with it royalty-free. "?" means that no license information could be found or that I did not understand it.

The Test

A 'data compiler' that merges some huge data definition files and translates them from their own formats into one common format. To do so it creates a huge number of cache TLists and TStringLists, does some GetMem/FreeMem, creates TObjects from interpreted data, and discards these lists and objects again. It is a single-threaded console application with no Forms unit and less than 12.000 lines of code. Most of the time it is busy with string manipulation, .Create and .Free. It creates 100% load.

HPMM was used to collect some data about the 'data compiler' (of course not during the timed test). I simply added counters to its getmem, freemem and reallocmem procedures.

Getmem-Calls: 47.915.159, Reallocmem-Calls: 6.348.579, Freemem-Calls: 47.915.060

Blocksize |     Getmem | Reallocmem
----------+------------+-----------
<16       | 28.317.517 |  2.606.869
<32       | 10.947.741 |  2.677.423
<64       |  8.400.639 |    490.131
<128      |    248.320 |    473.327
<512      |        904 |     89.195
<1k       |          2 |      6.224
<10k      |         30 |      5.388
<100k     |          5 |         22
<500k     |          0 |          0
other     |          1 |          0

(Read <32 as: 17...31)
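The counting described above can be sketched roughly as follows. This is not the modified HPMM source, only a minimal illustration that wraps whatever memory manager is currently installed, using the Delphi 6 TMemoryManager record with GetMemoryManager and SetMemoryManager. The unit name CountingMM, the counter names and the BucketOf helper are made up for this example, and reading "k" as 1024 bytes is an assumption. The counters are not thread safe, which is acceptable only because the test program is single threaded.

```pascal
unit CountingMM; // hypothetical unit name

interface

const
  BucketCount = 10;
  // Exclusive upper bounds of the size classes <16 ... <500k; the last
  // class ("other") has no bound. Reading "k" as 1024 is an assumption.
  BucketLimit: array[0..BucketCount - 2] of Integer =
    (16, 32, 64, 128, 512, 1024, 10 * 1024, 100 * 1024, 500 * 1024);

var
  GetMemCalls, FreeMemCalls, ReallocMemCalls: Int64;
  GetMemBySize, ReallocMemBySize: array[0..BucketCount - 1] of Int64;

// Index of the size class a request of Size bytes falls into.
function BucketOf(Size: Integer): Integer;

implementation

var
  OldMM: TMemoryManager; // the manager that was installed before this unit

function BucketOf(Size: Integer): Integer;
begin
  Result := 0;
  while (Result < BucketCount - 1) and (Size >= BucketLimit[Result]) do
    Inc(Result);
end;

// Each hook counts the call and then forwards it to the previous manager.
function CountingGetMem(Size: Integer): Pointer;
begin
  Inc(GetMemCalls);
  Inc(GetMemBySize[BucketOf(Size)]);
  Result := OldMM.GetMem(Size);
end;

function CountingFreeMem(P: Pointer): Integer;
begin
  Inc(FreeMemCalls);
  Result := OldMM.FreeMem(P);
end;

function CountingReallocMem(P: Pointer; Size: Integer): Pointer;
begin
  Inc(ReallocMemCalls);
  Inc(ReallocMemBySize[BucketOf(Size)]);
  Result := OldMM.ReallocMem(P, Size);
end;

const
  CountingManager: TMemoryManager = (
    GetMem: CountingGetMem;
    FreeMem: CountingFreeMem;
    ReallocMem: CountingReallocMem);

initialization
  GetMemoryManager(OldMM);            // remember the current manager
  SetMemoryManager(CountingManager);  // route all requests through the hooks

finalization
  SetMemoryManager(OldMM);            // restore the previous manager

end.
```

Such a unit has to be the first entry in the program's uses list, so that it is installed before any other unit allocates memory in its initialization section. The counters can then be written out at the end of the run.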

The "pure Delphi" version was run twice, once at the beginning and once at the end. This should give an idea of the accuracy of the time measurement. The average of both runs is used as 100%.
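As a worked example of this normalisation, the small console program below recomputes two relative execution times from the absolute System 1 figures. The program and function names are chosen for illustration only. It should reproduce the 100,27% and 133,27% entries of the relative table for System 1; the decimal separator in the output depends on the locale settings.

```pascal
program RelativeTime; // hypothetical program name
{$APPTYPE CONSOLE}

uses
  SysUtils;

// The baseline is the average of the two "pure Delphi" runs.
function Baseline(RunA, RunB: Double): Double;
begin
  Result := (RunA + RunB) / 2;
end;

// Execution time expressed as a percentage of the baseline.
function RelativePercent(Time, Base: Double): Double;
begin
  Result := 100 * Time / Base;
end;

var
  Base: Double;
begin
  // Absolute execution times taken from the System 1 table.
  Base := Baseline(2107.671, 2096.295);
  Writeln('Pure Delphi, first value: ', Format('%.2f', [RelativePercent(2107.671, Base)]), '%');
  Writeln('HPMM: ', Format('%.2f', [RelativePercent(2801.258, Base)]), '%');
end.
```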

Peak memory usage is queried with GetProcessMemoryInfo; execution time is measured with GetTickCount.
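A minimal sketch of such a measurement, assuming the Windows and PsAPI units that ship with Delphi 6. Which field of the memory counters was actually read for the tables is not stated; PeakWorkingSetSize is an assumption here, and PeakPagefileUsage would be the other obvious candidate. The program name and the PeakMemoryUsage helper are made up for this example.

```pascal
program MeasureRun; // hypothetical program name
{$APPTYPE CONSOLE}

uses
  Windows, PsAPI, SysUtils;

// Peak working set of the current process in bytes, or 0 if the query fails.
// Using PeakWorkingSetSize is an assumption, see the text above.
function PeakMemoryUsage: Cardinal;
var
  Counters: TProcessMemoryCounters;
begin
  FillChar(Counters, SizeOf(Counters), 0);
  Counters.cb := SizeOf(Counters);
  if GetProcessMemoryInfo(GetCurrentProcess, @Counters, SizeOf(Counters)) then
    Result := Counters.PeakWorkingSetSize
  else
    Result := 0;
end;

var
  StartTick, Elapsed: Cardinal;
begin
  StartTick := GetTickCount;
  // ... run the workload to be measured here ...
  Elapsed := GetTickCount - StartTick; // elapsed time in milliseconds
  Writeln('Elapsed (ms): ', Elapsed);
  Writeln('Peak memory (bytes): ', PeakMemoryUsage);
end.
```

GetTickCount only has a resolution of roughly 10 to 16 ms, so it is suited to long runs like this one and not to timing individual allocations.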

I took each memory manager 'out of the box': no modifications, no compiler switch changes, nothing. I just added its unit at the first position of the 'uses' list, built, and copied the exe into the test folder; then I put the next memory manager into the uses list, built, copied, and so on.

The Test Results

System 1: an AMD Duron 1200 MHz, 100 MHz FSB, L1 64k, L2 64k, 512 MB/1

2004-02-28 Absolute numbers
Name                              | ExeSize (bytes) | Execution Time        | Peak Memory Usage (bytes)
----------------------------------+-----------------+-----------------------+--------------------------
Pure Delphi (two runs)            |         198.144 | 2.107,671 / 2.096,295 | 149.680.128 / 149.680.128
HPMM                              |         215.040 | 2.801,258             | 151.879.680
RecyclerMM 2004-01-29             |         216.064 | 3.208,183             | 155.672.576
BigBrain 2004-02-16               |         204.800 | 2.428,402             | 237.596.672
NexusDB Memory Manager 2004-01-19 |         206.336 | 2.215,226             | 152.059.904
QMemory                           |         202.752 | 2.720,803             | 152.743.936
2004-02-28 Relative numbers
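Name                              | ExeSize (difference, bytes) | Execution Time (% of Pure Delphi average) | Peak Memory Usage (difference, bytes)
----------------------------------+-----------------------------+-------------------------------------------+--------------------------------------
Pure Delphi (two runs)            |                          =0 | 100,27% / 99,73%                          |                                    =0
HPMM                              |                      16.896 | 133,27%                                   |                             2.199.552
RecyclerMM 2004-01-29             |                      17.920 | 152,63%                                   |                             5.992.448
BigBrain 2004-02-16               |                       6.656 | 115,53%                                   |                            87.916.544
NexusDB Memory Manager 2004-01-19 |                       8.192 | 105,39%                                   |                             2.379.776
QMemory                           |                       4.608 | 129,44%                                   |                             3.063.808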

System 2: an AMD Barton 2500 MHz, 166 MHz FSB, L1 64k, L2 512k, 1 GB/2

2004-03-02 Absolute numbers
Name                              | ExeSize (bytes) | Execution Time        | Peak Memory Usage (bytes)
----------------------------------+-----------------+-----------------------+--------------------------
Pure Delphi (two runs)            |         198.144 | 1.356,952 / 1.322,251 | 149.684.224 / 149.684.224
HPMM                              |         215.040 | 1.782,473             | 151.883.776
RecyclerMM 2004-01-29             |         216.064 | 2.048,465             | 155.676.672
BigBrain 2004-02-16               |         204.800 | 1.575,846             | 237.600.768
NexusDB Memory Manager 2004-01-19 |         206.336 | 1.372,413             | 152.055.808
QMemory                           |         202.752 | 1.372,834             | 152.748.032
2004-03-02 Relative numbers
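Name                              | ExeSize (difference, bytes) | Execution Time (% of Pure Delphi average) | Peak Memory Usage (difference, bytes)
----------------------------------+-----------------------------+-------------------------------------------+--------------------------------------
Pure Delphi (two runs)            |                          =0 | 101,30% / 98,70%                          |                                    =0
HPMM                              |                      16.896 | 133,06%                                   |                             2.199.552
RecyclerMM 2004-01-29             |                      17.920 | 152,92%                                   |                             5.992.448
BigBrain 2004-02-16               |                       6.656 | 117,64%                                   |                            87.916.544
NexusDB Memory Manager 2004-01-19 |                       8.192 | 102,45%                                   |                             2.371.584
QMemory                           |                       4.608 | 102,48%                                   |                             3.063.808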

Conclusion

I'm surprised.

All memory manager vendors advertise better speed and lower memory usage. Some vendors even explicitly state that they believe their MM is the fastest and that nothing can be faster, even in single-threaded environments.

All my tests show that every replacement memory manager is slower and uses more memory than the built-in one.

Again, I'm surprised.

After running the second test I'm even more surprised: while 2 MMs showed the results I expected (almost no change), 2 MMs had a speed change of about 2%, which could be within measuring accuracy. And one had a change of more than 25%.

Again, I'm even more surprised.

Addendum (2004-06-13):

One of the problems with multi-threaded applications: what percentage of the memory management is done in which thread? Are the threads really competing for memory requests, or are they idle most of the time (waiting for the next job)?

You might not see much of a speed-up for applications that are mostly idle. And other benefits might be eaten up by slow implementations and higher overall memory usage (which is what this test shows).

So from a 'better' MM I expect at least the same performance and memory usage in single-threaded situations as the Borland MM has.

Explanations

The difference for QMemory is surprising. My only idea is that it reflects the different processor platform, especially the much larger L2 cache. (Yes, I double-checked that all the executables are the right ones and ran the test multiple times.)

After publishing I got some feedback from vendors.

Motivation

My 'data compilers' have a bad runtime. Some need about two to three days to process one file. So every speed-up, even a small percentage, is a benefit. So while running my small test 'data compilers' I had time to surf the net and look for such speed-ups.

By the way, algorithm changes were of course the biggest benefit. The 'data compiler' started out with estimated runtimes of multiple weeks.

The 'data compiler' has been tested with MemCheck (Vincent Mahon) and with Range/Stack/Var-checking.