CSC/ECE 506 Fall 2007/wiki1 3 as1506
Commercial computers have long used parallel architectures for their high-end applications. However, unlike scientific or engineering computing, where most of the work depends on raw computing ability, commercial applications require the system to sustain the maximum number of transactions at any given time so that it can support large databases and serve large numbers of customers. This class of systems is referred to as on-line transaction processing (OLTP).
TPC benchmarks
The goal of TPC benchmarks is to define a set of functional requirements that can be run on any transaction processing system, regardless of hardware or operating system. This methodology allows any vendor, using "proprietary" or "open" systems, to implement the TPC benchmark and guarantees to end-users that they will see an apples-to-apples comparison.
One of the OLTP system benchmarks in this suite, TPC-C, simulates a complete environment where a population of terminal operators executes transactions against a database. The benchmark is centered around the principal activities (transactions) of an order-entry environment. These transactions include entering and delivering orders, recording payments, checking the status of orders, and monitoring the level of stock at the warehouses.
The throughput of TPC-C is a direct result of the level of activity at the terminals. Each warehouse has ten terminals, and all five transactions are available at each terminal. A remote terminal emulator (RTE) is used to maintain the required mix of transactions over the performance measurement period. This mix represents the complete business processing of an order as it is entered, paid for, checked, and delivered. More specifically, the required mix is defined to produce an equal number of New-Order and Payment transactions and to produce one Delivery transaction, one Order-Status transaction, and one Stock-Level transaction for every ten New-Order transactions.
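As a rough illustration of the mix described above, the following sketch converts the stated ratios into the share each transaction type contributes to the total. The names `MIX_COUNTS` and `mix_shares` are hypothetical and used only for this example; the counts are taken directly from the ratios in the text, not from the benchmark specification itself.

```python
from fractions import Fraction

# Relative counts taken from the mix described in the text:
# ten New-Order and ten Payment transactions for each Delivery,
# Order-Status, and Stock-Level transaction.
MIX_COUNTS = {
    "New-Order": 10,
    "Payment": 10,
    "Delivery": 1,
    "Order-Status": 1,
    "Stock-Level": 1,
}


def mix_shares(counts):
    """Return each transaction type's share of the total as a Fraction."""
    total = sum(counts.values())
    return {name: Fraction(n, total) for name, n in counts.items()}


shares = mix_shares(MIX_COUNTS)
for name, share in shares.items():
    print(f"{name:<12} {share} = {float(share):.1%}")
```

Under these ratios, New-Order and Payment each make up 10/23 of all transactions (about 43.5%), and each of the other three types makes up 1/23 (about 4.3%).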
The tpm-C metric is the number of New-Order transactions executed per minute. Given the required mix and the wide range of complexity and types among the transactions, this metric more closely simulates a complete business activity, not just one or two transactions or computer operations. For this reason, the tpm-C metric is considered to be a measure of business throughput. It does not measure just a few basic computer or database transactions, but how many complete business operations can be processed per minute. The benchmark therefore gives users a more extensive and more complex yardstick for measuring OLTP system performance.
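The definition of the metric can be sketched as a simple rate calculation. The function name `tpm_c` and the run figures below are hypothetical, chosen only to show the arithmetic; they are not taken from any published result.

```python
def tpm_c(new_order_count, interval_minutes):
    """New-Order transactions completed per minute of measurement."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    return new_order_count / interval_minutes


# Hypothetical run: 1,200,000 New-Order transactions in a 120-minute interval.
rate = tpm_c(1_200_000, 120)
print(f"{rate:,.0f} tpmC")
```

Only New-Order transactions are counted in the numerator; the other four transaction types still have to run in the required mix, which is why the rate reflects the whole workload.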
The current version of the TPC-C benchmark is Version 5.9. Compared to the version mentioned in the text, the pricing changes include reducing the maintenance support period from 5 years to 3 years, raising maintenance coverage from 8x5 to 24x7, removing terminal network pricing (hubs, switches), and allowing pricing quotes from web pages and print materials. The runtime changes include reducing the disk space requirement from 180 days to 60 days, increasing the measurement interval from 20 minutes to 2 hours, reporting checkpoint durations, and reporting the number of user connections lost during the measurement interval.
Top 10 TPC-C according to performance
Company | Name | tpmC | Price/tpmC in $ |
---|---|---|---|
HP | HP Integrity Superdome- Itanium2/1.6Ghz/24MB iL3 | 4,092,799 | 2.93 |
IBM | IBM System p5 595 | 4,033,378 | 2.97 |
IBM | IBM eServer p5 595 | 3,210,540 | 5.07 |
IBM | IBM System p 570 | 1,616,162 | 3.54 |
IBM | IBM eServer p5 595 | 1,601,784 | 5.05 |
FUJITSU | PRIMEQUEST 540 16p/32c | 1,238,579 | 3.94 |
HP | HP Integrity Superdome | 1,231,433 | 4.82 |
Top 10 TPC-C according to price/performance
Company | Name | tpmC | Price/tpmC in $ |
---|---|---|---|
HP | HP ProLiant ML350G5 | 100,926 | 0.74 |
DELL | PowerEdge 2900/1/2.33Ghz/2x4M | 69,564 | 0.91 |
DELL | PowerEdge 2900/3Ghz/4M | 65,833 | 0.98 |
DELL | PowerEdge 2800/1/2.8Ghz/2+2M | 38,622 | 0.99 |
DELL | PowerEdge 2800/1/3.6Ghz/2M | 28,244 | 1.29 |
DELL | PowerEdge 2900/1/2.66Ghz/2x4M | 126,371 | 1.33 |
DELL | PowerEdge 2800/1/3.4Ghz/2M | 28,122 | 1.40 |
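Because the price/performance figure is the priced configuration divided by tpmC, multiplying the two columns gives a rough estimate of what each system cost. The sketch below does this for three rows of the table above; the helper name `implied_system_price` is hypothetical, and the results are approximate because the published price/tpmC values are rounded to cents.

```python
# (system, tpmC, price per tpmC in $) copied from the table above.
RESULTS = [
    ("HP ProLiant ML350G5", 100_926, 0.74),
    ("DELL PowerEdge 2900/1/2.33Ghz/2x4M", 69_564, 0.91),
    ("DELL PowerEdge 2900/1/2.66Ghz/2x4M", 126_371, 1.33),
]


def implied_system_price(tpmc, price_per_tpmc):
    """Approximate total priced configuration in dollars."""
    return tpmc * price_per_tpmc


for name, tpmc, price_per_tpmc in RESULTS:
    total = implied_system_price(tpmc, price_per_tpmc)
    print(f"{name:<36} ~${total:,.0f}")
```

The comparison shows why the two rankings differ: a system can post a higher tpmC and still rank lower on price/performance if its total price grows faster than its throughput.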
Processor and memory specifications
The ever-increasing gap between processor and memory speeds has made memory system design the critical performance factor for commercial workloads. Memory design depends on a number of factors, such as data and instruction locality and branch behavior. A well-designed memory system incurs fewer cache misses and lower branch miss rates. Memory sizes range from a few gigabytes up to terabytes.
The following table shows the memory and processor resources of some of today's systems used for commercial workloads.
System | Processor | Number of processors | L3 cache/proc. (MB) | Memory (GB) | tpmC |
---|---|---|---|---|---|
HP Integrity Superdome- Itanium2/1.6Ghz/24MB iL3 | 1.6GHz Intel Itanium 2 | 64 | 24 | 2048 | 4,092,799 |
IBM System p5 595 | 2.3GHz POWER5+ | 32 | 36 | 2048 | 4,033,378 |
IBM eServer p5 595 | 3.2GHz 1MB L3 Xeon Processor | 32 | 36 | 2048 | 3,210,540 |
IBM System p 570 | 4.7GHz POWER6 | 8 | 32 | 768 | 1,616,162 |
IBM eServer p5 595 | 3.2GHz 1MB L3 Xeon Processor | 2 | 36 | 2048 | 1,601,784 |
PRIMEQUEST 540 16p/32c | 1.6GHz Dual-Core Intel Itanium2 | 16 | 24 | 1024 | 1,238,579 |
HP Integrity Superdome | Intel Itanium 2 9M CPUs at 1.6GHz | 64 | 9 | 1024 | 1,231,433 |
HP Integrity rx5670 Cluster–Itanium2/1.5 GHz-64p/6 | 1.5GHz Itanium 2 6M w/ 6MB Cache | 64 | 6 | 768 | 1,184,893 |
IBM eServer pSeries 690 Model 7040-681 | 2.8 GHz Xeon W/512KB L2 Cache | 2 | 0.5 | 2 | 1,025,486 |
IBM System p5 570 Model 9117-570 | 3.2GHz 1MB L3 Xeon Processor | 2 | 1 | 2 | 1,025,169 |
HP Integrity rx6600 | Dual Core Itanium 2 Processor 9050 | 4 | 24 | 192 | 344,928 |
HP Integrity rx8620 – Itanium2/1.6 GHz-16p/16c | Intel Itanium2 - 1.6 GHz | 16 | 6 | 256 | 332,265 |
From the above table it can be seen that clock speed alone does not determine throughput:
performance of HP Integrity Superdome with a 1.6GHz processor > performance of IBM System p5 595 with a 2.3GHz processor > performance of IBM eServer p5 595 with a 3.2GHz processor.
The same data also shows that the IBM eServer p5 595 (L3 cache/proc. = 36MB, memory = 2048GB) outperforms the IBM System p5 570 Model 9117-570 (L3 cache/proc. = 1MB, memory = 2GB), even though both systems have the same number of processors (2) and the same processor type (3.2GHz 1MB L3 Xeon Processor).
Speedup
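Processor speed matters less to the throughput of the system than the number of processors. In the table below, speedup is the tpmC of each system divided by the tpmC of the first system listed, the IBM System p5 570 Model 9117-570.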
System | Number of processors | tpmC | Speedup |
---|---|---|---|
IBM System p5 570 Model 9117-570 | 2 | 1,025,169 | 1 |
IBM eServer pSeries 690 Model 7040-681 | 2 | 1,025,486 | 1.000309217 |
IBM eServer p5 595 | 2 | 1,601,784 | 1.562458482 |
IBM System p 570 | 8 | 1,616,162 | 1.576483487 |
IBM eServer p5 595 | 32 | 3,210,540 | 3.131717795 |
IBM System p5 595 | 32 | 4,033,378 | 3.934354238 |
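The speedup column above can be reproduced by dividing each tpmC by the tpmC of the first row. The sketch below does that and, as an added assumption not present in the original table, also reports a scaling efficiency, defined here as speedup divided by the increase in processor count. The names `SYSTEMS` and `speedup_table` are hypothetical.

```python
# (system, processors, tpmC) copied from the table above.
SYSTEMS = [
    ("IBM System p5 570 Model 9117-570", 2, 1_025_169),
    ("IBM eServer pSeries 690 Model 7040-681", 2, 1_025_486),
    ("IBM eServer p5 595", 2, 1_601_784),
    ("IBM System p 570", 8, 1_616_162),
    ("IBM eServer p5 595", 32, 3_210_540),
    ("IBM System p5 595", 32, 4_033_378),
]


def speedup_table(systems):
    """Speedup and scaling efficiency relative to the first entry."""
    _, base_procs, base_tpmc = systems[0]
    rows = []
    for name, procs, tpmc in systems:
        speedup = tpmc / base_tpmc
        # Efficiency (an assumption of this sketch): speedup divided by
        # the growth in processor count relative to the baseline.
        efficiency = speedup / (procs / base_procs)
        rows.append((name, procs, speedup, efficiency))
    return rows


rows = speedup_table(SYSTEMS)
for name, procs, speedup, efficiency in rows:
    print(f"{name:<40} {procs:>3} {speedup:6.3f} {efficiency:6.1%}")
```

With the figures as listed, going from 2 to 32 processors multiplies the processor count by 16 but yields a speedup of less than 4, so throughput grows with processor count but far from linearly. The systems also differ in processor type, cache, and memory, so this is a cross-system comparison rather than a controlled scaling experiment.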