Introduction
This document describes how to tune the TCP/IP and BackHome!/ADSM environment for optimal performance over an Ethernet LAN. Most of the TCP/IP tuning information also applies to other forms of data transfer, such as FTP.
Future Technical Notes will address results obtained in a larger Tandem environment as well as SNA over Token-Ring and TCP/IP over Token-Ring.
The test environment
The test environment consisted of the following hardware and software:
A Himalaya K1000 machine (with 2 processors) running NSK Guardian D43.02. A 3615 Ethernet controller was connected to CPU 0;
A Pentium-PRO 200MHz machine running Windows NT Version 4.0SP3 and IBM ADSM Server Version 3.0. A 3Com 3C905TX NIC was used.
For comparative testing, a second Windows NT machine was used, identical to the first one;
The network consisted of one 10BaseT segment. The machines were inter-connected through an unmanaged 3Com hub.
In both environments, the native TCP/IP stack was used.
Here is a diagram of this environment:
Methodology
Three different applications were used to generate the load on the network:
A small test program that wrote 20 megabytes of data to the DISCARD port of the Windows NT server;
The regular NSK Guardian FTP Client;
The BackHome!/ADSM product;
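The sample test program itself is not reproduced in this Technical Note. As a rough illustration only, a program of this kind can be sketched in Python as follows. The function name, the 8192-byte chunk size and the reading of "20 megabytes" as 20 x 1024 x 1024 bytes are assumptions, not details of the original program; port 9 is the standard DISCARD port.

```python
import socket
import time


def send_to_discard(host, port=9, total_bytes=20 * 1024 * 1024, chunk_size=8192):
    """Write total_bytes of filler data to a DISCARD service.

    Returns (bytes_sent, elapsed_seconds). The chunk size and the default
    20 MiB volume are illustrative assumptions.
    """
    chunk = b"\0" * chunk_size
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as sock:
        while sent < total_bytes:
            # Send a full chunk, or whatever remains on the final pass.
            remaining = min(chunk_size, total_bytes - sent)
            sock.sendall(chunk[:remaining])
            sent += remaining
    elapsed = time.perf_counter() - start
    return sent, elapsed
```

The returned byte count and elapsed time are the two inputs needed for a throughput figure.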
All throughput measures were computed by dividing the actual number of bytes transferred by the elapsed time in seconds. Throughout this Technical Note, all throughput figures are expressed in kilobytes per second (KB/sec).
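The throughput calculation described above can be written as a one-line function. This is a minimal sketch; the note does not say whether a kilobyte is 1000 or 1024 bytes, so the divisor is a parameter and 1024 is only an assumed default.

```python
def throughput_kb_per_sec(bytes_transferred, elapsed_seconds, bytes_per_kb=1024):
    """Bytes transferred divided by elapsed seconds, expressed in KB/sec.

    bytes_per_kb=1024 is an assumption; pass 1000 for decimal kilobytes.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return bytes_transferred / elapsed_seconds / bytes_per_kb
```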
All the measurements were made using a dedicated Ethernet network. No other activity was taking place on the segment.
The same three applications (or their equivalents) were used to measure the throughput obtained when performing the same functions from Windows NT to Windows NT. We then compared these results with those obtained in the NSK Guardian environment.
The BackHome!/ADSM architecture
The following diagram depicts the architecture of the BackHome!/ADSM product.
The data is extracted from the disk by the standard Guardian BACKUP utility. Its output is fed to the BFSBKUP process, which emulates a tape drive. BFSBKUP repackages the data and sends its output to the ADSM Requestor, which is responsible for sending the data over TCP/IP via a standard socket. The BCOM Monitor oversees the global operation, whereas the Logger handles messages issued by all BackHome!/ADSM components.
THE MEASUREMENTS
Initial measurements
Initial measurements were made with the three test applications before any tuning was performed, to show the gains that can be expected from proper tuning.
The following results were obtained on the NSK Guardian client machine:
130KB/sec for the sample application;
150KB/sec for the FTP client;
40KB/sec for BackHome!/ADSM using one requestor.
The following results were obtained on the Windows NT client machine:
1000KB/sec for the sample application;
1000KB/sec for the FTP client;
1000KB/sec for the ADSM Client for Windows NT.
Clearly, the results obtained in the NSK Guardian environment were disappointing and not adequate for bulk data transfer operations.
In view of these results, we opened a problem report with Tandem Support, who were very helpful in tuning the environment. Tandem Support told us that their FTP client should be able to reach transfer rates of over 700KB/sec on large K-Series machines.
Final measurements
After tuning, the following results were obtained in the NSK Guardian environment:
760KB/sec for the sample application, a 480% improvement;
630KB/sec for the FTP client, a 320% improvement;
250KB/sec for BackHome!/ADSM using one requestor, a 525% performance improvement. High CPU usage on our small K1000 system was a limiting factor.
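The improvement percentages quoted above are relative gains over the initial measurements. The following sketch recomputes them from the before and after figures; the function name is ours and is used for illustration only.

```python
def percent_improvement(before, after):
    """Relative gain of 'after' over 'before', as a percentage."""
    return (after - before) / before * 100.0


# Before/after throughput in KB/sec, taken from the measurements above.
measurements = {
    "sample application": (130, 760),
    "FTP client": (150, 630),
    "BackHome!/ADSM, one requestor": (40, 250),
}

for name, (before, after) in measurements.items():
    gain = percent_improvement(before, after)
    print(f"{name}: {before} -> {after} KB/sec, {gain:.0f}% improvement")
```

Computed this way, the FTP and BackHome!/ADSM gains come out at 320% and 525%, and the sample application at about 485%, close to the 480% quoted.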
We can see that significant performance improvements can be obtained by properly tuning the environment. The next section details the actions taken to obtain the above results.
THE TUNING
Tuning was done in the following places:
ADSM client tuning;
TLAM and 3615 controller tuning;
CPU affinity tuning;
Process priority tuning.
The first item applies to the ADSM client only. The last three items were applied to all three test applications.
ADSM client tuning
The ADSM client (through the ADSM API) makes use of the DSMSYS configuration file. The file contains two parameters that directly affect performance: TCPWINDOWSIZE and TCPBUFFSIZE. After testing, we found that the optimal values for these parameters are:
TCPWINDOWSIZE 31
TCPBUFFSIZE 31
With these changes, we were able to increase the throughput of the ADSM requestor from 40KB/sec to 90KB/sec, a 125% improvement.
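If you want to verify these two settings across several configuration files, a small script can help. The sketch below is an illustration, not part of the product. It assumes that each option appears as a name followed by a value on its own line, as shown above, and that lines starting with an asterisk are comments; check both assumptions against your own DSMSYS file. The function names are hypothetical.

```python
# Values recommended in this note for the test environment.
RECOMMENDED = {"TCPWINDOWSIZE": "31", "TCPBUFFSIZE": "31"}


def parse_options(text):
    """Parse 'NAME VALUE' lines into a dict keyed by upper-case option name.

    Assumption: blank lines and lines starting with '*' are ignored.
    """
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            options[parts[0].upper()] = parts[1].strip()
    return options


def mismatched_options(options, recommended=RECOMMENDED):
    """Return the options whose value differs from the recommended one."""
    return {
        name: options.get(name)
        for name, value in recommended.items()
        if options.get(name) != value
    }
```

An empty result from mismatched_options means both parameters carry the values used in our tests; a missing option is reported with the value None.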
TLAM and 3615 controller tuning
Make sure that the QIOMODE ON parameter is specified for the controller you will be using for bulk data transfer. Also make sure that the QIOMODE process is running on your system.
The TLAM and 3615 controller tuning consisted of adjusting two parameters, DATAFORWARDCOUNT and DATAFORWARDTIME. The default value for DATAFORWARDTIME is 0.010 (10 milliseconds). By lowering this value to 0.001 (1 millisecond), we were able to increase throughput by another 50% for the test application. You can achieve the same result by setting DATAFORWARDCOUNT to 1.
These parameters work in conjunction by "buffering" TCP acknowledgements in the 3615 controller. Data is sent up to the TCP/IP stack only when DATAFORWARDCOUNT is reached or when DATAFORWARDTIME has elapsed. By setting DATAFORWARDCOUNT to 1, you are actually telling the controller to immediately interrupt the CPU to process the incoming packet.
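The interaction between the two parameters can be illustrated with a small model. This is a simplified sketch of the behaviour described above, not the controller's actual algorithm: it assumes that the timer starts when the first packet of a batch is buffered, and the packet arrival times and the count of 4 used in the example are hypothetical. Times are in milliseconds.

```python
def forward_times(arrivals, forward_count, forward_time):
    """Return the time at which each buffered packet reaches the TCP/IP stack.

    arrivals      -- sorted packet arrival times, in milliseconds
    forward_count -- batch size that triggers forwarding (DATAFORWARDCOUNT)
    forward_time  -- maximum buffering time in milliseconds (DATAFORWARDTIME)
    """
    delivered = []
    batch = []
    for t in arrivals:
        # Flush a pending batch whose timer expired before this arrival.
        if batch and t >= batch[0] + forward_time:
            deadline = batch[0] + forward_time
            delivered.extend([deadline] * len(batch))
            batch = []
        batch.append(t)
        # Forward immediately once the batch reaches the configured count.
        if len(batch) >= forward_count:
            delivered.extend([t] * len(batch))
            batch = []
    if batch:
        # Remaining packets wait for the timer to expire.
        deadline = batch[0] + forward_time
        delivered.extend([deadline] * len(batch))
    return delivered


arrivals = [0, 2, 4]                      # hypothetical arrival times (ms)
print(forward_times(arrivals, 4, 10))     # count not reached, 10 ms timer
print(forward_times(arrivals, 4, 1))      # timer lowered to 1 ms
print(forward_times(arrivals, 1, 10))     # count of 1: no buffering delay
```

In this model, three acknowledgements arriving within 4 milliseconds are all held until the 10 millisecond timer expires, whereas a count of 1 forwards each one as it arrives.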
Be warned that setting DATAFORWARDCOUNT to 1 or DATAFORWARDTIME to a low value (such as 0.001) increases CPU consumption and may interfere with other traffic on the same controller. This should not be a problem if you dedicate controllers to bulk data transfer where possible (see the recommendations on configuring for bulk data transfer, later in this document).
You can use SCF to view and set these parameters. Here is an example of how to display the current values ($CHAMA is the TLAM process for the 3615 controller):
ASSUME LINE $CHAMA
INFO PORT #IP
Here is how you can change the value:
ASSUME LINE $CHAMA
ALTER PORT #IP,DATAFORWARDCOUNT 1
ALTER PORT #IP,DATAFORWARDTIME 0.001
CPU affinity tuning
Another important aspect that has a significant impact on performance is how you distribute the various processes among the available CPUs.
When performing a network I/O over TCP/IP, the following processes are involved:
The TCP/IP application;
The TCP/IP stack ($ZTC0, for example);
The TLAM process ($CHAMA in our case);
The 3615 controller.
When using BackHome!/ADSM, the following additional processes are involved:
The process that runs the standard NSK Guardian BACKUP program;
The BFSBKUP process;
The BMON and LOGGER processes.
When doing bulk data transfer operations, it is extremely important that all processes involved in the network I/O be executing from the same CPU.
With the sample test application, we obtained an additional 50% gain in throughput (from 500KB/sec to 750KB/sec) just by running the application in the same CPU as the other processes involved in the network I/O. Similar improvements were obtained with FTP. With BackHome!/ADSM, an improvement of 25% was observed (from 200KB/sec to 250KB/sec). Results were less impressive in the BackHome!/ADSM environment mainly because CPU usage was reaching capacity. Better results are to be expected on larger (K10000, K20000 or S70000) machines.
Note that the BMON and LOGGER processes are not on the data path and do not need to run in the same CPU as the other processes.
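The co-location rule can be checked mechanically once you have listed where each process runs. The sketch below is only an illustration: the placement table is hypothetical (we place BMON and LOGGER in CPU 1 as an example), and treating BACKUP, BFSBKUP and the requester as part of the data path is our reading of the architecture described earlier.

```python
# Processes on the data path for a BackHome!/ADSM transfer (assumed list).
DATA_PATH = ["BACKUP", "BFSBKUP", "ADSM requester", "$ZTC0", "$CHAMA", "3615 controller"]

# Hypothetical placement: process or device name -> CPU number.
placement = {
    "BACKUP": 0,
    "BFSBKUP": 0,
    "ADSM requester": 0,
    "$ZTC0": 0,
    "$CHAMA": 0,
    "3615 controller": 0,
    "BMON": 1,      # not on the data path, may run elsewhere
    "LOGGER": 1,    # not on the data path, may run elsewhere
}


def is_colocated(placement, data_path=DATA_PATH):
    """True when every process on the data path runs in the same CPU."""
    return len({placement[name] for name in data_path}) == 1


print(is_colocated(placement))
```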
Process priority tuning
If you run more than one BackHome!/ADSM requester in one CPU, make sure that they do not have the same priority (parameter CPU-PRIORITY of the ADSM-REQUESTOR statement of the BackHome!/ADSM configuration). You will get better throughput when one requester is prioritized over the other. Consult the next section (Configurations) for details on how to configure your system when multiple requesters are used.
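A quick way to spot requesters that share both a CPU and a priority is to group them by that pair. The sketch below assumes you have already collected the CPU and CPU-PRIORITY of each requester from your configuration; the requester names and priority values shown are hypothetical.

```python
from collections import defaultdict


def priority_clashes(requesters):
    """Group requesters by (cpu, priority) and return the groups with clashes.

    requesters -- iterable of (name, cpu, priority) tuples
    """
    groups = defaultdict(list)
    for name, cpu, priority in requesters:
        groups[(cpu, priority)].append(name)
    return {key: names for key, names in groups.items() if len(names) > 1}


# Hypothetical configuration: two requesters in CPU 0 share priority 150.
requesters = [
    ("requester-1", 0, 150),
    ("requester-2", 0, 150),
    ("requester-3", 1, 150),
]
print(priority_clashes(requesters))
```

Requesters in different CPUs may share a priority; only a clash within one CPU is reported.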
CONFIGURATIONS
BackHome!/ADSM has the ability to manage simultaneous transfers (up to 16) to one or more ADSM servers. You can make use of this feature by running multiple BackHome!/ADSM requesters in parallel thus obtaining larger aggregate throughput.
You have to find the adequate balance between CPU usage and LAN utilization. If your CPU usage is too high, you will not be able to take advantage of all the bandwidth available on the network. If you have too many requesters in parallel, you will saturate the Ethernet segment. Do not forget that Ethernet is CSMA/CD: there are collisions. If the collision rate is too high, you will probably decrease your throughput instead of increasing it.
On our test configuration (the K1000), we were able to get an aggregate throughput of approximately 400KB/sec (after proper TCP/IP tuning, see above) by running two ADSM requesters in parallel, in the same CPU.
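To judge how much of the LAN a given aggregate throughput uses, compare it with the raw capacity of the segment. The sketch below does this for a 10BaseT segment (10 megabits per second). It assumes 1 KB = 1024 bytes and ignores Ethernet, IP and TCP header overhead, so actual utilization of the wire is somewhat higher than the figure it returns.

```python
def lan_utilization(throughput_kb_per_sec, link_bits_per_sec=10_000_000, bytes_per_kb=1024):
    """Fraction of the raw link capacity consumed by the payload throughput.

    Header overhead and collisions are ignored (simplifying assumption).
    """
    return throughput_kb_per_sec * bytes_per_kb * 8 / link_bits_per_sec


for rate in (250, 400, 1000):  # KB/sec figures quoted in this note
    print(f"{rate} KB/sec uses {lan_utilization(rate):.0%} of a 10BaseT segment")
```

Under these assumptions, 400KB/sec is roughly a third of the segment, while the 1000KB/sec measured between the Windows NT machines is about 82% of it.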
The following diagram illustrates how you can configure your system to run parallel transfer with adequate performance:
In the above example:
CPU 0 runs 3 ADSM requesters, a TCP/IP stack and has one 3615 controller. It is LAN-connected to an ADSM server on the same segment;
CPU 5 runs 2 ADSM requesters, a TCP/IP stack and one 3615 controller;
CPU 6 runs 1 ADSM requester, a TCP/IP stack and one 3615 controller. CPU 5 and 6 transfer data on the same Ethernet segment, where a second ADSM server operates.
With the above configuration, you must carefully watch the collision rate on the second Ethernet segment. If it is too high, you should consider adding an Ethernet segment, moving the ADSM requester from CPU 6 to CPU 5, or removing the ADSM requester from CPU 6 altogether and consolidating its transfer requests in the existing ADSM requesters in CPU 5.
When configuring for multiple requesters in parallel, follow these guidelines:
Ensure that the LAN controller, the TCP/IP stack (the $ZTCx process) and the BackHome!/ADSM Requester all run in the same CPU. On a K1000 CPU, you should be able to run 2 requesters in parallel in the same CPU;
At all times, when using parallel configurations, make sure that you are not overloading your CPU (you can use the VIEWSYS utility to monitor CPU usage);
Also, make sure that the collision rate on your Ethernet segment is as low as possible. You can use multiple 3615 controllers to increase throughput but remember: if you put both controllers on the same segment, you will not get better results;
If you are using more than one Ethernet controller, run a copy of TCP/IP ($ZTCx) in each CPU where you have your 3615. The BackHome!/ADSM requester as well as all the I/O processes must also run in the same CPU;
Make sure that your ADSM Server is capable of sustaining the transfer rate you will be generating. You may want to use multiple ADSM Servers, connected through independent LAN segments, to achieve good performance;
Make sure that your concurrent backup requests do not create bottlenecks in the disk I/O subsystem. If you are running multiple backups from the same disk, you can saturate the data transfer capability of that disk.
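Some of these guidelines can be turned into a simple review of a planned layout. The sketch below encodes the three-CPU example described earlier in this section and flags two conditions: requesters without a TCP/IP stack and controller in the same CPU, and several controllers sharing one Ethernet segment. The data layout, the segment labels "A" and "B" and the function name are illustrative assumptions.

```python
from collections import Counter

# The example configuration from this section; segment labels are ours.
layout = [
    {"cpu": 0, "requesters": 3, "has_stack": True, "has_controller": True, "segment": "A"},
    {"cpu": 5, "requesters": 2, "has_stack": True, "has_controller": True, "segment": "B"},
    {"cpu": 6, "requesters": 1, "has_stack": True, "has_controller": True, "segment": "B"},
]


def review_layout(layout):
    """Return a list of warnings for a planned multi-CPU layout."""
    warnings = []
    for cpu in layout:
        local_path = cpu["has_stack"] and cpu["has_controller"]
        if cpu["requesters"] and not local_path:
            warnings.append(
                f"CPU {cpu['cpu']}: requesters without a local TCP/IP stack and controller"
            )
    # Controllers on the same segment compete for the same bandwidth.
    per_segment = Counter(cpu["segment"] for cpu in layout if cpu["has_controller"])
    for segment, count in per_segment.items():
        if count > 1:
            warnings.append(
                f"segment {segment}: {count} controllers share it; watch the collision rate"
            )
    return warnings


for warning in review_layout(layout):
    print(warning)
```

For the example layout, the only warning concerns the segment shared by CPU 5 and CPU 6. CPU load, ADSM server capacity and disk I/O still have to be monitored on the running system.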
CONCLUSION
You can significantly increase the throughput of your TCP/IP application on NSK Guardian by properly configuring your environment.
Additionally, with BackHome!/ADSM, you can increase the throughput by running requesters in parallel in the same CPU or in multiple CPUs. In the latter case, you need to carefully plan your hardware and software configuration so that your system is well balanced.
Copyright 2003. ETI-NET. All Rights Reserved. All content copyright their respective owners.
Last Updated: 03/23/2006 01:21 PM MST (GMT -0500)