A Cluster of Workstations (COW) is also a collection of workstations used as a parallel computer, but the workstations are managed by a single entity with a view to making all nodes available for parallel computing. The nodes are still used as regular workstations and may still be physically distributed, but they typically share common software and file systems, and a user with access to one node has access to all of them for parallel computation.
A Dedicated Cluster Parallel Computer (DCPC) is a collection of workstations gathered specifically for parallel computation. These machines typically do NOT have a display and keyboard, are usually located physically close to one another, have a unified management facility, share common software and file systems, and are pre-loaded with parallel software. Because the nodes are dedicated, all of them are available for parallel computation without impacting interactive users. Because they are physically close, specialized hardware such as very high performance networks can be employed to improve performance.
A Pile of PCs (POPC) is a dedicated cluster built from commodity components (CPUs, motherboards, memory, disks, network devices, etc.) to form a parallel computer system. The POPC differs from the DCPC primarily in that the budget is shifted from "bigger" or "faster" nodes to MORE nodes. In addition, the upgrade path of a POPC is much more flexible, as most components adhere to public de facto standards.
A BEOWULF is a POPC with software. The BEOWULF project is one of building and integrating software components that make a POPC architecture "feel" like a unified parallel computer. Thus, a BEOWULF is a cluster, its nodes are dedicated, the components are commodity, and it includes software that provides a unified system image.
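The definitions above build on one another, each adding one property to the one before. A minimal Python sketch of that progression follows. The field names and the `classify` function are hypothetical, and reducing the DCPC/POPC distinction to a single "commodity parts" flag is a simplifying assumption, not part of any BEOWULF software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    # Hypothetical flags, each standing in for one property named in the text.
    centrally_managed: bool     # a single entity manages all nodes
    dedicated: bool             # nodes are not also interactive workstations
    commodity_parts: bool       # built from commodity components (simplification)
    unified_system_image: bool  # software makes it "feel" like one machine


def classify(c: Cluster) -> str:
    """Return the most specific label from the text that the cluster satisfies."""
    if not c.centrally_managed:
        return "not a COW"
    if not c.dedicated:
        return "COW"
    if not c.commodity_parts:
        return "DCPC"
    if not c.unified_system_image:
        return "POPC"
    return "BEOWULF"


if __name__ == "__main__":
    print(classify(Cluster(True, True, True, True)))
    print(classify(Cluster(True, True, True, False)))
```

Read this way, a BEOWULF is simply the case in which all four properties hold at once.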
The nodes of a BEOWULF parallel computer are connected by the fastest network the budget allows. Most parallel processing applications need good network performance, so a certain amount of emphasis is placed here. To date, 100 Mbps Ethernet has been the favorite implementation medium, and by far the best performance for the money is to be had from 100BaseT switches: switched networks give good bandwidth and are fairly inexpensive. Alternatives include Gigabit Ethernet, Fibre Channel, FDDI, Myrinet, channel-bonded Ethernet, and routed Ethernet topologies.
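To get a rough feel for why link speed matters, the following Python sketch estimates how long a message takes to cross a link at its nominal rate. It is a back-of-the-envelope model only: the function name and parameters are hypothetical, and it ignores protocol overhead, switch contention, and software latency unless a latency value is supplied.

```python
def transfer_seconds(num_bytes: int, link_mbps: float, latency_s: float = 0.0) -> float:
    """Idealized time to move num_bytes over a link rated at link_mbps.

    Assumes the full nominal line rate is achieved (1 Mbps = 1,000,000 bits/s).
    """
    bits = num_bytes * 8
    return latency_s + bits / (link_mbps * 1_000_000)


if __name__ == "__main__":
    payload = 100_000_000  # 100 MB of data exchanged between two nodes
    print(transfer_seconds(payload, 100))   # 100 Mbps Ethernet -> 8.0 seconds
    print(transfer_seconds(payload, 1000))  # Gigabit Ethernet  -> 0.8 seconds
```

Real throughput will be lower than these nominal figures, but the ratio between link types gives a first approximation of what a faster network buys for communication-heavy applications.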
The head or manager node of a BEOWULF parallel computer should be reserved for logins, compiling, and managing running applications, but NOT used for computation. There is a certain amount of disagreement on this between those who feel the unified system image is achieved by making every node in the machine (including the head) appear as an equal to all others, and those who feel the unified system image is achieved by treating compute nodes as resources used by programs. The latter approach holds that users should not even log into the compute nodes; rather, those nodes serve only as targets for parallel tasks to execute on. The author is in the latter camp and much of his writing will reflect this fact. At the same time, we must note the alternative opinions.
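The "compute nodes as resources" view can be illustrated with a small Python sketch that hands out tasks round-robin to every node except the head. The function, the hostnames, and the round-robin policy are all assumptions made for illustration; actual BEOWULF job management tools may work quite differently.

```python
from itertools import cycle


def assign_tasks(tasks, nodes, head):
    """Map each task to a compute node, never to the head node.

    tasks: iterable of task identifiers (hypothetical names)
    nodes: all hostnames in the cluster, including the head
    head:  hostname reserved for logins, compiling, and management
    """
    compute = [n for n in nodes if n != head]
    if not compute:
        raise ValueError("no compute nodes available")
    # cycle() repeats the compute nodes so tasks wrap around round-robin.
    return {task: node for task, node in zip(tasks, cycle(compute))}


if __name__ == "__main__":
    plan = assign_tasks(["t0", "t1", "t2"], ["head", "n1", "n2"], head="head")
    print(plan)  # {'t0': 'n1', 't1': 'n2', 't2': 'n1'}
```

Under the alternative view, in which every node is an equal, the same sketch would simply skip the filtering step and include the head in the rotation.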
Finally, what makes a BEOWULF a BEOWULF is its software. The purpose of the software is to provide the unified system image that makes program development and execution as simple and straightforward as possible. This includes code for compiling and debugging, monitoring and managing, communicating and sharing, and performing I/O and computation. This software exists in the kernel, in libraries, in tools, and in packages. No single piece of software makes a BEOWULF, and there is not one configuration that is "right." BEOWULF is now and always will be a "roll-your-own" type of system, and users will be free to select the tools and methods that make the most sense for them. At the same time, those of us in the BEOWULF community will work to bring together what software there is, build what software there isn't, and provide our advice on how best to build a BEOWULF.