Storage Benchmarks
Test Description
This test runs several application-specific benchmarks against four different storage implementations to compare performance, memory usage, and disk usage. The four storages under test are:
- FileStorage, the conventional implementation
- FileStorage, the fsIndex branch designed to reduce memory usage
- BerkeleyStorage Full
- BerkeleyStorage Packless
The tests use databases that are as large as possible given the time constraints; however, they are still not 'very large'.
These tests were performed by Toby Dickenson, [email protected], between 13 and 17 December 2001. The permanent URL for this document is http://www.zope.org/Members/htrd/benchmarks/storages.
Conclusions
If you want to skip ahead, the conclusions are at the end of this document.
Availability
The test data, test scripts, and test application are derived from a closed-source product and are not publicly available.
Test Equipment
- Red Hat Linux 7.1 on an Athlon 1700 with 512MB of memory.
- Storage data on an ATA100 IDE disk, formatted with the ReiserFS filesystem.
- Python 2.1
- Zope 2.4.1 (with many custom patches)
- BerkeleyStorage 1.0beta5, db 3.3.11, pybsddb 3.3.2
- The storage was hosted in a ZEO server (1.0b3) so that I could isolate storage memory usage from memory used by Zope and the application logic. Note that this test machine is well endowed with memory, so I record memory usage as the size of the ZEO process's virtual address space (VSZ); see the measurement sketch after this list. This may not be representative of the amount of memory needed before performance is affected, but I believe it is close enough.
- The ZODB cache configuration parameters were: 60 seconds, 1000 objects.
- Zope was started with one worker thread. All test scripts are also single-threaded.
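One way to read a process's VSZ on Linux is from the /proc filesystem. The snippet below is a minimal sketch of such a measurement, assuming the ZEO server's pid is supplied on the command line; it is not one of the original test scripts, which may simply have used ps.

    import sys

    def vsz_kb(pid):
        # Read the VmSize line from /proc/<pid>/status; the value is the
        # virtual address space size in kB, matching the VSZ column of ps.
        for line in open('/proc/%d/status' % pid).readlines():
            if line.startswith('VmSize:'):
                return int(line.split()[1])
        raise ValueError('no VmSize for pid %d' % pid)

    if __name__ == '__main__':
        # Hypothetical usage: python vsz.py <pid-of-ZEO-server>
        print 'VSZ = %dk' % vsz_kb(int(sys.argv[1]))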
BerkeleyStorage Configuration
I didn't spend too much effort optimising BerkeleyStorage for this test. DB_CONFIG was:
    set_lk_max_locks 2000
    set_lk_max_objects 2000
    set_lk_max_lockers 10
    set_cachesize 0 1048576 0
    set_lg_bsize 262144
The 1MB cachesize gives a cache hit rate of 98% during the bmadd script, compared with 80% using the default 256kB cache. This was a significant performance improvement (I didn't measure how much).
The 256kB log buffer is something the BerkeleyDB documentation says should improve throughput, since it reduces the 'writes due to overflow' listed by db_stat -l to about 2% of the total writes. In practice I couldn't measure a difference.
It looked like the default number of locks and lock objects would have been enough for this test (because there is no concurrency?), but I doubled the defaults 'just in case'.
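For readers who know the pybsddb API better than DB_CONFIG, the sketch below shows the programmatic equivalent of those settings. It is illustrative only: in the actual test the settings lived in the DB_CONFIG file read by the BerkeleyStorage environment, and the environment directory name used here is a placeholder.

    from bsddb3 import db

    env = db.DBEnv()
    # These calls mirror the DB_CONFIG settings used in the test.
    env.set_lk_max_locks(2000)
    env.set_lk_max_objects(2000)
    env.set_lk_max_lockers(10)
    env.set_cachesize(0, 1048576, 0)   # 1MB cache (gbytes, bytes, ncache)
    env.set_lg_bsize(262144)           # 256kB log buffer
    env.open('./bsddb-env',            # placeholder environment directory
             db.DB_CREATE | db.DB_INIT_MPOOL | db.DB_INIT_LOCK |
             db.DB_INIT_LOG | db.DB_INIT_TXN)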
During the test, db_checkpoint -v -p 5 was running to checkpoint the logfile every 5 minutes. An additional db_checkpoint -1 was run manually shortly before the end of the two long test scripts, to ensure that the checkpointing cost was included in the elapsed time measurement.
Test Scripts
I had originally planned to exercise the storages using two scripts, but due to time constraints the second test has not yet been performed.
The first script is write-heavy, the second read-heavy. In both scripts the application logic is expected to use much more processor time than the storage.
bmadd.py
This script transfers roughly 1000 documents into the ZODB over HTTP, where each document is indexed (ZCatalog-style indexing plus an application-specific indexing process). This corresponds to roughly 18000 ZODB objects.
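The script itself is not publicly available (see Availability above), but its overall shape is roughly the loop sketched below. The URL, form fields, and source directory are hypothetical placeholders; the real script posts to an application-specific upload method that performs the indexing.

    import os, urllib, time

    BASE = 'http://localhost:8080/site/upload_document'   # hypothetical Zope URL
    SRCDIR = 'documents'                                   # hypothetical source directory

    start = time.time()
    for name in os.listdir(SRCDIR):
        body = open(os.path.join(SRCDIR, name), 'rb').read()
        # POST one document per request; Zope indexes it before returning.
        params = urllib.urlencode({'id': name, 'text': body})
        urllib.urlopen(BASE, params).read()
    print 'elapsed %.0fs' % (time.time() - start)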
calc.py
This script traverses all of the documents added by bmadd.py, performing several memory-intensive calculations on each document. Note that this test has not yet been performed.
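As with bmadd.py, the real script cannot be published, so the sketch below only suggests its shape. The list_document_ids view and the calculate_all method are hypothetical placeholders for the application-specific calculation step.

    import urllib, time

    BASE = 'http://localhost:8080/site'    # hypothetical Zope URL

    # Hypothetical index view returning one document id per line.
    ids = urllib.urlopen(BASE + '/list_document_ids').read().split()

    start = time.time()
    for docid in ids:
        # Invoke the (hypothetical) memory-intensive calculations on each document.
        urllib.urlopen('%s/%s/calculate_all' % (BASE, docid)).read()
    print 'elapsed %.0fs' % (time.time() - start)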
Test Procedure
- Restore the nearly empty 'preadd' database.
- If this test uses a FileStorage: delete any index file, start ZEO to create a new index, then stop ZEO to leave behind a clean index file of the correct type.
- Delete any ZEO client cache. Start ZEO and Zope.
- Run the bmadd.py script. When the first indexing operation is complete (that is, the first of roughly 1000), measure the VSZ of the ZEO process. At this point any delayed initialization should have occurred.
- Measure the elapsed time to perform bmadd.py.
- Record the VSZ of the ZEO process soon after the script terminates (checking occasionally during the run that this is indeed the maximum).
- Record the size of the FileStorage file, or the size of the Berkeley database files (excluding log files).
- Restart both ZEO and Zope processes. Record the ZEO VSZ.
- Pack the storage; a sketch of driving the pack from a script is given after this procedure. Record the largest VSZ during the pack (it changes quickly, so I may not always have caught the highest peak).
- Repeat the file size measurements.
- Note: The remainder of this test has not yet been performed.
- Restart both ZEO and Zope processes. Record the ZEO VSZ.
- Run the calc.py script, and measure the elapsed time.
- Repeat the VSZ measurement.
That procedure was repeated for the four storage implementations listed above.
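The procedure does not depend on how the pack is triggered, but for completeness the sketch below shows one way to drive a pack from a small script over ZEO. The server address is a placeholder, and the import paths assume the ZODB 3 / ZEO 1.0 layout used in this test.

    import ZODB
    from ZEO.ClientStorage import ClientStorage

    # Connect to the ZEO server hosting the storage under test.
    # ('localhost', 9999) is a placeholder address.
    storage = ClientStorage(('localhost', 9999))
    db = ZODB.DB(storage)

    # Pack away everything older than the current time.
    db.pack(days=0)
    db.close()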
Results
| | Conventional FileStorage | fsIndex FileStorage | Berkeley Full | Berkeley Packless | Conventional (2a) | fsIndex (2b) |
|---|---|---|---|---|---|---|
| ZEO VSZ before bmadd | 6480k | 6448k | 8336k | 8764k | | |
| Time for bmadd | 2979s | 2990s | 2999s | 3064s (3) | | |
| ZEO VSZ after bmadd | 9096k | 6516k (1) | 8364k | 13096k | | |
| ZEO VSZ growth during bmadd | 2616k | 68k | 28k | 4332k (4) | | |
| Disk space | 132M | 132M | 188M | 134M | | |
| ZEO VSZ after restarting | 7272k | 7452k (2) | 8336k | 8604k | 7536k | 6420k |
| ZEO VSZ peak during packing | 12876k | 9560k | 14468k | 11316k | 13028k | 9210k |
| ZEO VSZ growth during pack | 5604k | 2180k | 6132k | 2106k | 5492k | 2790k |
| Disk space after packing | 111M | 111M | 188M | 135M | | |

(The two FileStorage variants share the same data file, so a single disk space figure applies to both. Columns 2a and 2b are explained in note 2 below.)
1. Toward the end of bmadd.py, the virtual memory size of the fsIndex FileStorage process exhibited short-lived peaks above its initial value of 6448k, but always returned to that original value. I guess those tiny BTree nodes were just on the point of filling up the fragments of free memory left after initialisation.
2. Why is the VSZ of the ZEO process so much larger after restarting? Suspecting that there may be a memory leak or similar problem with de-persisting the BTree index, I repeated the pack test after deleting the index file. These results are in column 2a for the conventional FileStorage, and 2b for the fsIndex. The post-packing numbers suggest that memory was not leaked, just fragmented.
3. During this test the machine was running some other low-load processes, which may account for the longer run time.
4. Looking at the source, I see no good reason why Packless should use increasing amounts of memory as this test proceeds. Is there a memory leak?
Conclusions
- There was less than a 1% difference in elapsed time between the different storages (ignoring the Packless run, which was affected by background load; see note 3). The extra overhead of BerkeleyDB appears to be negligible.
- The packing test caused the BerkeleyStorage process to grow by 6M for Full and 2M for Packless. This growth is even larger than for FileStorage. This does not live up to BerkeleyStorage's high-scalability reputation.
- In this test the fsIndex branch of FileStorage reduces the memory growth during bmadd by an impressive factor of 38 (2616k versus 68k). During packing the factor is a less impressive 2.5 (5604k versus 2180k). Of all the FileStorages, the fsIndex variant uses the least memory when packing.