Marathon Extends Reach of Fault Tolerance

Marathon Technologies hardened its fault-tolerant offerings for Windows this week with an option for separating redundant machines by up to 10 kilometers.

Marathon calls the technology Endurance Long Distance SplitSite (LDSS). It extends Marathon's previous limit of 500 meters between its lockstep servers, but it comes at a much higher cost.

Separations of 10 km have been available for a while for Windows servers in a failover cluster, but Marathon's fundamentally different approach to high availability has made the fibre distance milestone more difficult to achieve.

Standard Windows failover clusters require only that the data be replicated between sites. Latency is not an issue when data simply moves from disk to disk, to be retrieved later if one server fails.

Marathon's approach, by contrast, delivers fault tolerance, but it requires that the separated servers work on the same data at the same time. A Marathon array involves four servers running one application. One computer called a "compute element," a one- or two-processor server, handles transaction processing. It is connected to another server that handles I/O.

A second "compute element" handles the exact same transaction at the exact same time as the first compute element. The second compute element connects to a second I/O server, which is not synchronous with the first I/O server to protect against rare I/O errors crashing both machines simultaneously.

Under this configuration, if one of the compute element-I/O "tuples" fails, the other carries on with any transactions that are in progress and keeps working without interruption. Clustered systems usually require the backup server to start the application from scratch once it detects a failure in the first server.
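The difference can be illustrated with a toy model. The sketch below is not Marathon's implementation; the class and function names are hypothetical, and the "application" is just a running total. It shows only the idea that two compute elements apply every transaction in step, so the survivor already holds the full state when its partner fails.

```python
class ComputeElement:
    """Hypothetical stand-in for one compute element in a redundant pair."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.total = 0        # application state (a running sum)
        self.processed = 0    # number of transactions applied

    def apply(self, amount):
        self.total += amount
        self.processed += 1


def run_lockstep(transactions, fail_first_at):
    """Feed each transaction to both elements; element A dies at index fail_first_at."""
    a, b = ComputeElement("A"), ComputeElement("B")
    for i, amount in enumerate(transactions):
        if i == fail_first_at:
            a.alive = False   # simulated failure of the first tuple
        for element in (a, b):
            if element.alive:
                element.apply(amount)
    # Return whichever element is still running.
    return a if a.alive else b


survivor = run_lockstep([10, 20, 30, 40], fail_first_at=2)
print(survivor.name, survivor.processed, survivor.total)
```

In this model the surviving element has applied all four transactions with no restart step. A failover cluster, by the article's description, would instead bring the application up on the backup server after detecting the failure.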

Separating the Marathon servers by 10 km means that one site can suffer a local disaster without any impact on in-process transactions.

"At that distance, we've found that we're able to measure a slight performance degradation, but it's not noticeable," says Dennis Birch, director of product marketing for Marathon.
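A rough calculation suggests why distance matters for servers that must stay in step. The sketch below estimates only the propagation delay of light in fibre. It assumes a refractive index of about 1.47, a typical figure for silica fibre that does not come from Marathon, and it ignores switching, protocol and processing overhead.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458   # speed of light in vacuum
FIBRE_REFRACTIVE_INDEX = 1.47           # assumed typical value for silica fibre


def one_way_delay_us(distance_km, refractive_index=FIBRE_REFRACTIVE_INDEX):
    """Propagation delay over a fibre run, in microseconds."""
    speed_in_fibre = SPEED_OF_LIGHT_KM_PER_S / refractive_index
    return distance_km / speed_in_fibre * 1e6


# Compare the earlier 500-meter limit with the new 10 km option.
for distance_km in (0.5, 10.0):
    one_way = one_way_delay_us(distance_km)
    print(f"{distance_km:5.1f} km: one-way {one_way:6.2f} us, "
          f"round trip {2 * one_way:6.2f} us")
```

Under these assumptions, 10 km of fibre adds on the order of 50 microseconds each way, about 20 times the delay at 500 meters. That is small in absolute terms but is incurred on every exchange between the sites.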

Marathon positions its technology as a low-cost route to fault tolerance. The four-server array, which requires only one Windows 2000 Server license, and its associated fibre cards and connections have been priced to parallel the cost of a standard Microsoft failover cluster. The new option drives up the price point, in part due to the need for fibre connections between the sites.

LDSS systems start at $100,000.

About the Author

Scott Bekker is editor in chief of Redmond Channel Partner magazine.