The Aggregate Cluster Design Rules Form

Version 20070523

Warning: The CGI program that handles this form is a work in progress, but should be stable. If you encounter any problems, or have other suggestions, please send email to either Bill Dieter or Hank Dietz.


Costs & Internal Parameters

Obviously, one of the most important cluster design constraints is cost... but cost changes rapidly, and is not necessarily the same for everybody: it depends on where you live, what deals your institution has with vendors, etc. To get truly accurate prices, the best approach is probably to first discuss the cluster with someone who knows your institution's purchasing procedure and any vendor relationships, then to get prices yourself (via the WWW, phone calls, etc.), and finally to write a document explaining precisely what you need so that people in purchasing will not make the common mistake of treating your request like a purchase of generic PCs. For the purpose of computing approximate price/performance for the cluster configurations that are otherwise viable, there are a few cost parameters used by our code. The default prices used are very approximate; you may want to alter the values to more accurately reflect the prices you find.

The full set of cost and internal parameters used by the CGI is stored in a database. You can select any of these parameter sets, some of which have been made available by other users, or create one with your own values to better reflect your circumstances. Each email address is allowed one parameter set.

NOTE: all prices for components with an Amazon ID are updated daily. These price updates may cause the CDR to produce different designs given the same input depending on what the best prices are on any given day.

To edit your own parameter set you must first log in. You need not log in to evaluate a design using any of the parameter sets that appear above. If you are unsure, you probably do not need to log in.

Email address:
Password:
  

If you would like to log in and have not done so before, you must first create a new user account.

Warning: The password entered in this form is sent across the network in the clear. Though unlikely, it is possible that someone could intercept it. Do not use the same password here that you use for more secure accounts.


This form contains a series of questions about your supercomputing application and constraints on the cluster design. A CGI program will use your answers to provide a set of cluster designs that might be appropriate for your application. The search space is generally very large, so the CGI limits the search time to 10 seconds. Of course, the designs generated by the CGI should be treated only as a starting point for your own detailed design analysis.

Memory

Although data size often increases somewhat when a program is parallelized (due to copies buffered to avoid communication), the memory space that would be taken by your serial program's data for the largest problem you hope to solve can be used as an estimate of the data space needed in parallel execution. In integer MB, about how much data does your program need to hold in main memory?
per cluster

Your application program will need to be running on each node, so each node's memory will need to hold a copy of the executable code. Not counting data, how big is your compiled program?
per node

Operating systems differ, but any OS you use on each node will have to fit in node memory. This OS memory use includes not just the OS kernel, but also any background OS tasks and a reasonable set of I/O buffers. How much main memory will your OS use on each node?
per node
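The three memory answers above can be combined into a rough per-node memory estimate. The following is a minimal sketch, not the CGI's actual calculation: it assumes the cluster-wide data is spread evenly over the nodes, and the function and parameter names are hypothetical.

```python
import math


def node_memory_mb(data_mb_per_cluster, program_mb_per_node, os_mb_per_node, nodes):
    """Estimate the main memory needed on each node, in MB (illustrative only)."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # Assumption: data is divided evenly over the nodes; round up to whole MB.
    data_share_mb = math.ceil(data_mb_per_cluster / nodes)
    # Every node also holds its own copy of the program and the OS.
    return data_share_mb + program_mb_per_node + os_mb_per_node


# Hypothetical numbers: 64000 MB of data over 16 nodes, 50 MB program, 200 MB OS.
print(node_memory_mb(64000, 50, 200, 16))  # 4000 + 50 + 200 = 4250
```

Because parallelized programs often buffer extra copies of data, the result is best treated as a lower bound.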

Ever since cache moved on-chip, cache bandwidth has been very good for most processors. Unfortunately, codes that have cache-unfriendly memory access patterns are very sensitive to main memory bandwidth. If you specify a particular memory bandwidth (GBytes/second per GFLOPS) here, the GFLOPS rating quoted for each design will be limited to the GFLOPS supported by the memory bandwidth of the system.
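One plausible reading of this limit is that the quoted GFLOPS is the smaller of the processor peak and the GFLOPS that the memory system can feed. The sketch below illustrates that reading only; the names are hypothetical and the CGI may compute the limit differently.

```python
def bandwidth_limited_gflops(peak_gflops, memory_gbytes_per_sec, gbytes_per_sec_per_gflops):
    """Cap peak GFLOPS by what the memory bandwidth can sustain (illustrative only)."""
    if gbytes_per_sec_per_gflops <= 0:
        # No bandwidth requirement specified: report the peak unchanged.
        return peak_gflops
    # GFLOPS that the available memory bandwidth is able to supply with data.
    supported_gflops = memory_gbytes_per_sec / gbytes_per_sec_per_gflops
    return min(peak_gflops, supported_gflops)


# Hypothetical node: 40 GFLOPS peak, 20 GBytes/second of memory bandwidth.
print(bandwidth_limited_gflops(40.0, 20.0, 1.0))   # memory-bound: 20.0
print(bandwidth_limited_gflops(40.0, 20.0, 0.25))  # compute-bound: 40.0
```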

Hard Disk Drives

Although use of disk-based virtual memory should be avoided to achieve maximum performance, many existing software packages have "memory leaks" that make their memory image slowly grow despite the fact that the memory space they are truly using does not grow. If your code leaks, you may want to use local disks for virtual memory swap space. How much swap space should be allocated?

In total, how many GB of hard disk storage do you need for live data in your cluster? (Keep in mind that your cluster can borrow disk space from other machines, so no disk is needed within the cluster nodes.)

One of the advantages of clusters is that you can use additional disk drives in each node to increase reliability (using simple disk mirroring or fancier RAID techniques) or to hold backups. Which of these techniques do you intend to use?
no RAID, or add RAID for recovery from one disk failure
no extra disk space, or add space for a backup copy

Network

The minimum message latency of the network plays a major role in determining the finest grain parallelism you can use across the nodes of a cluster. What is the largest minimum latency that you expect your code can tolerate?

Latency in performing various types of aggregate functions (aka, collective communications) is different from latency on point-to-point messages. What is the largest minimum latency that you expect your code can tolerate on aggregate operations like barrier synchronization?

Bisection bandwidth is very important in achieving good performance from some applications, but is relatively unimportant for others. If all the processor cores in the cluster are communicating simultaneously, what is the minimum amount of network bandwidth that must be available to each processor core? (I.e., what is bisection bandwidth / number of processor cores?)
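The ratio in the question above is simple arithmetic, sketched here in both directions. The function names are hypothetical, and the units are whatever you use consistently (Mb/s in the example is only an assumption).

```python
def bandwidth_per_core(bisection_bandwidth, cores):
    """Bisection bandwidth divided by the number of processor cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return bisection_bandwidth / cores


def required_bisection_bandwidth(per_core_bandwidth, cores):
    """Bisection bandwidth needed to give every core the requested share."""
    return per_core_bandwidth * cores


# Hypothetical cluster: 128 cores sharing 64000 Mb/s of bisection bandwidth.
print(bandwidth_per_core(64000, 128))          # 500.0 Mb/s per core
print(required_bisection_bandwidth(500, 128))  # 64000 Mb/s
```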

When a node needs to communicate, on average, how many other nodes does it need to talk with? We call this number the coordinality of communication. For example, to update boundary regions in a simple 2D grid code, each node typically communicates with every node that is handling an adjacent portion of the 2D space; thus, communication would involve 4 other nodes.
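For the 2D grid example, the average number of communication partners can be counted directly. This is a minimal sketch with hypothetical names; it assumes one grid cell per node and communication with the four edge-adjacent neighbors only. Without wraparound, nodes on the border have fewer neighbors, so the average falls below 4.

```python
def grid_coordinality(rows, cols, wraparound=False):
    """Average number of distinct neighbor nodes in a rows x cols node grid."""
    total_neighbors = 0
    for r in range(rows):
        for c in range(cols):
            neighbors = set()
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if wraparound:
                    # Torus: indices wrap around the edges of the grid.
                    nr, nc = nr % rows, nc % cols
                elif not (0 <= nr < rows and 0 <= nc < cols):
                    continue  # off the edge of a non-periodic grid
                if (nr, nc) != (r, c):
                    neighbors.add((nr, nc))
            total_neighbors += len(neighbors)
    return total_neighbors / (rows * cols)


print(grid_coordinality(4, 4))                   # 3.0 (border nodes have fewer neighbors)
print(grid_coordinality(4, 4, wraparound=True))  # 4.0
```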

Physical Parameters

No matter how reliable your vendors are, even if you have a maintenance contract, you need spare nodes. Without spares, hardware failures not only will take time to be repaired, but also probably will result in your hardware becoming heterogeneous, because identical parts are no longer available. "Hot" spares give you higher availability than "cold" spares, but increase your network costs. How many spare nodes do you want to include?

Not counting spares, does your cluster need to have a specific number of nodes, processors, or processor cores? What are the constraints?

The cluster nodes and switches need to be mounted in some kind of rack or shelving unit. A typical rack needs about 2x2 feet of floor space; shelving units come in various sizes, but most require about 2x4 feet of floor space (i.e., roughly the space of two 2x2 racks). Allowing at least a 2 foot wide path to walk around the racks, what is the maximum number of 2x2 rack units that you can fit in the space you have available?

Power requirements for nodes vary depending on the precise contents of the nodes, but a large cluster needs a lot of power. In the USA, that power is typically 110VAC, usually with a 15A or 20A limit on each circuit coming from the main power box. Approximately how much 110VAC power can be dedicated to the cluster?
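A rough way to answer the power question is to multiply out the circuits you can dedicate to the cluster. The sketch below uses hypothetical names. The 80% factor is an assumption that is not part of this form: it reflects the common practice of loading a circuit to no more than 80% of its rating for continuous use. Set it to 1.0 to get the raw circuit capacity.

```python
def available_watts(circuits, amps_per_circuit, volts=110.0, continuous_load_fraction=0.8):
    """Approximate power, in watts, that a set of identical circuits can supply."""
    # Assumption: each circuit is only loaded to a fraction of its rated current.
    return circuits * amps_per_circuit * volts * continuous_load_fraction


# Hypothetical room: four 20A circuits at 110VAC.
print(f"{available_watts(4, 20):.0f} W")                                # 7040 W
print(f"{available_watts(4, 20, continuous_load_fraction=1.0):.0f} W")  # 8800 W
```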

Essentially all the power consumed by a cluster gets turned into heat, which has to be removed from the room housing the cluster. How much air conditioning is available in the cluster room?
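Since nearly all the electrical power ends up as heat, the cooling load can be estimated from the power figure. This sketch uses the standard conversions of about 3.412 BTU/hour per watt and 12,000 BTU/hour per ton of cooling; the function name is hypothetical, and the form may expect a different unit.

```python
BTU_PER_HOUR_PER_WATT = 3.412    # 1 watt is about 3.412 BTU/hour
BTU_PER_HOUR_PER_TON = 12000.0   # 1 ton of cooling is 12,000 BTU/hour


def cooling_needed(cluster_watts):
    """Return (BTU/hour, tons of cooling) needed to remove the cluster's heat."""
    btu_per_hour = cluster_watts * BTU_PER_HOUR_PER_WATT
    return btu_per_hour, btu_per_hour / BTU_PER_HOUR_PER_TON


# Hypothetical cluster drawing 7040 W.
btu, tons = cooling_needed(7040)
print(f"{btu:.0f} BTU/hour, about {tons:.1f} tons of cooling")
```

This covers only the heat produced by the cluster itself; people, lighting, and other equipment in the room add to the load.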

Costs

The cost of the cluster does not end after it is purchased. Operating a cluster incurs costs as well. Enter local cost factors below to help the CDR find designs that fit within your budget.

Operating Costs

In addition to the money you spend to buy your cluster, it will cost money to operate. Often some or all of these costs are borne by other parts of the support organization, and thus are "free" to the cluster purchaser. However, operating costs can be a significant percentage of the cluster purchase price, especially when considered over the life of the cluster. These operating costs tend to vary widely based on location.

How much does electricity cost at the location at which the cluster will be housed?
$/kWh + $/month
What kind of cooling system will be used to cool the cluster?

How much does floor space cost per square foot per month?
$ per ft² per month

What is your annual operating budget? (Enter 0 to disregard operating costs)
$ per year

Software

A cluster needs an operating system, system management tools, run-time libraries, and application software to do useful work. Many cluster users are happy with free/open source software. However, some users want specific commercial software packages that require licensing fees.

How much does your software cost?
per cluster
per

Acquisition Cost

What is your budget for the cluster? Be somewhat conservative, because there are always a few unexpected costs.

Metrics

What is the minimum number of GFLOPS your cluster must be able to achieve?

The HPL model estimates cluster performance based on a model of the HPL benchmark. What is the minimum performance (in GFLOPS) your cluster must be able to achieve on the High Performance Linpack (HPL) benchmark (experimental)? A value of 0 means no constraint on High Performance Linpack (HPL) performance.

The SWEEP3D model estimates cluster performance based on a model of the SWEEP3D benchmark. What is the minimum performance (in 1/sec) your cluster must be able to achieve on the SWEEP3D benchmark (experimental)? A value of 0 means no constraint on SWEEP3D performance.

The Total Cost of Ownership model measures total operating cost over five years plus acquisition cost. The operating cost is computed based on power consumption for the cluster itself, power for air conditioning, and space rental costs. Acquisition cost includes all costs to buy the necessary hardware and software. What is the minimum performance (in 1/$) your cluster must be able to achieve on the Total Cost of Ownership benchmark (experimental)? A value of 0 means no constraint on Total Cost of Ownership performance.
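The Total Cost of Ownership description can be sketched as follows. This is an illustration under stated assumptions, not the CGI's model: it assumes the cluster and its air conditioning draw constant power around the clock, and all names and example figures are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365


def total_cost_of_ownership(acquisition_dollars, cluster_watts, cooling_watts,
                            dollars_per_kwh, fixed_dollars_per_month,
                            floor_sqft, dollars_per_sqft_month, years=5):
    """Acquisition cost plus operating cost over the given number of years."""
    # Assumption: constant power draw, 24 hours a day, all year.
    kwh_per_year = (cluster_watts + cooling_watts) / 1000.0 * HOURS_PER_YEAR
    electricity = kwh_per_year * dollars_per_kwh + 12 * fixed_dollars_per_month
    floor_space = 12 * floor_sqft * dollars_per_sqft_month
    return acquisition_dollars + years * (electricity + floor_space)


def tco_metric(tco_dollars):
    """The metric is expressed in 1/$, so a lower cost gives a higher value."""
    return 1.0 / tco_dollars


# Hypothetical cluster: $100000 to buy, 8000 W plus 2000 W of cooling,
# $0.10/kWh with no fixed monthly charge, 16 square feet at $2 per month each.
tco = total_cost_of_ownership(100000, 8000, 2000, 0.10, 0, 16, 2.0)
print(f"TCO over 5 years: ${tco:.2f}")  # $145720.00
print(f"metric: {tco_metric(tco):.3e} per dollar")
```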

What is the maximum number of designs that should be displayed?

There are a variety of metrics computed for each feasible configuration.

The following parameters allow you to set weightings for how these metrics will be combined to determine the sorted order of configurations, and hence which design is best. Weightings are multiplicative factors; higher factors imply that extra performance is more important in that metric.
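How the weighted metrics are combined is not spelled out here. One simple possibility, shown purely as an assumption, is to multiply each metric by its weighting, sum the results, and sort the designs by that score. The metric names, the design records, and the weighted-sum rule itself are all hypothetical.

```python
def rank_designs(designs, weights):
    """Sort designs from best to worst by a weighted sum of their metrics."""
    def score(design):
        # Assumption: each metric is scaled by its weighting, then summed.
        # Metrics without a weighting contribute nothing.
        return sum(weights.get(name, 0.0) * value
                   for name, value in design["metrics"].items())
    return sorted(designs, key=score, reverse=True)


designs = [
    {"name": "A", "metrics": {"gflops": 100.0, "memory_gb": 64.0}},
    {"name": "B", "metrics": {"gflops": 80.0, "memory_gb": 256.0}},
]

# Weighting only GFLOPS favors design A; weighting memory as well favors B.
print([d["name"] for d in rank_designs(designs, {"gflops": 1.0})])
print([d["name"] for d in rank_designs(designs, {"gflops": 1.0, "memory_gb": 1.0})])
```

In practice, metrics with very different scales would need to be normalized before a sum like this is meaningful.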

Component Limitations

For non-technical reasons, you may be restricted to including certain components in, or excluding certain components from, your cluster. You may limit the search to either include or exclude systems that have a particular string in the name or type of at least one component.

Include only designs with of the space separated strings below in the name or type of at least one component. Surround strings that have spaces with quotes (e.g., "Fast Ethernet").

Exclude all designs with of the space separated strings below in the name or type of at least one component. Surround strings that have spaces with quotes (e.g., "Fast Ethernet").
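The quoting convention for the include and exclude strings matches shell-style word splitting, which Python's `shlex.split` implements. The sketch below shows one way such a filter could work. It is an illustration only: the design and component records are hypothetical, matching is assumed to be case-insensitive, and `require_all` stands in for a choice between matching any or all of the strings.

```python
import shlex


def component_matches(component, term):
    """True if the term appears in the component's name or type."""
    term = term.lower()  # assumption: matching ignores case
    return term in component["name"].lower() or term in component["type"].lower()


def design_matches(design, terms, require_all=False):
    """Check whether the design's components contain any (or all) of the terms."""
    hits = [any(component_matches(c, t) for c in design) for t in terms]
    return all(hits) if require_all else any(hits)


def filter_designs(designs, include_text="", exclude_text="", require_all=False):
    """Keep designs that satisfy the include strings and none of the exclude strings."""
    include_terms = shlex.split(include_text)  # "Fast Ethernet" stays one term
    exclude_terms = shlex.split(exclude_text)
    kept = []
    for design in designs:
        if include_terms and not design_matches(design, include_terms, require_all):
            continue
        if exclude_terms and design_matches(design, exclude_terms):
            continue
        kept.append(design)
    return kept


designs = [
    [{"name": "Generic NIC", "type": "Fast Ethernet"}],
    [{"name": "Generic NIC", "type": "Gigabit Ethernet"}],
]
print(shlex.split('"Fast Ethernet" Gigabit'))  # ['Fast Ethernet', 'Gigabit']
print(len(filter_designs(designs, exclude_text='"Fast Ethernet"')))  # 1
```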


Credits

The C program that generated this page was written by Hank Dietz, with modifications by Bill Dieter, using the CGIC library to implement the CGI interface. Additional features were added by Anand Kumar Kadiyala. Venu Venkata Subramanya Surampudi added support for automatic price updates from Amazon.com, and did an initial conversion of the database to XML, with later modifications by Bill Dieter.

CGI POST data is processed using the CGIC library, which is licensed under the following license:

CGIC, copyright 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004 by Thomas Boutell and Boutell.Com, Inc.. Permission is granted to use CGIC in any application, commercial or noncommercial, at no cost. HOWEVER, this copyright paragraph must appear on a "credits" page accessible in the public online and offline documentation of the program. Modified versions of the CGIC library should not be distributed without the attachment of a clear statement regarding the author of the modifications, and this notice may in no case be removed. Modifications may also be submitted to the author for inclusion in the main CGIC distribution.

IF YOU WOULD PREFER NOT TO ATTACH THE ABOVE NOTICE to the public documentation of your application, consult the information which follows regarding the availability of a nonexclusive commercial license for CGIC
