Tuesday, March 16, 2004

MPICH Cluster Setup

This is a test of setting up a two-node cluster using my home machines.

/etc/hosts entries (both machines run Redhat 9.0 with kernel 2.4.20):
Dell Inspiron 8100 (master node): 192.182.1.2 node1
Dell Dimension L600 (secondary node): 192.182.1.3 node2

Download MPICH 1.2.5.2 from http://www-unix.mcs.anl.gov/mpi/mpich/ and follow the instructions to install it on both machines. MPICH uses rsh or ssh to start processes on the other nodes; the default is rsh. If you would like to use ssh (secure shell) instead, run configure with the following parameters in the directory where you unpacked MPICH:

[root]# ./configure --with-device=ch_p4 --prefix=/usr/local/mpich --rsh=ssh
[root]# make

After installation, add /mpich_install_dir/bin and /mpich_install_dir/util to your $PATH environment variable. To let the master node (the laptop, node1) know about the other nodes, list all nodes in the file /mpich_install_dir/util/machines/machines.LINUX:

[huang]$ cat machines.LINUX
# Change this file to contain the machines that you want to use
# to run MPI jobs on. The format is one host name per line, with either
# hostname
# or
# hostname:n
# where n is the number of processors in an SMP. The hostname should
# be the same as the result from the command "hostname"
#localhost.localdomain
node1
node2
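
Before launching jobs, it can help to see exactly which host names the machines file yields. The following is a minimal sketch, assuming a POSIX shell and sed; the function name list_hosts is made up for illustration and is not part of MPICH. It skips comment and blank lines and strips the optional ":n" processor count.

```shell
# list_hosts FILE: print one host name per line from an MPICH machines file.
# (list_hosts is a hypothetical helper name, not an MPICH command.)
list_hosts() {
    # Delete comment lines and blank lines, then strip an optional ":n" suffix.
    sed -e '/^[[:space:]]*#/d' -e '/^[[:space:]]*$/d' -e 's/:.*$//' "$1"
}
```

For the file above, "list_hosts machines.LINUX" prints node1 and node2, each of which should match the output of the command "hostname" on that machine.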

To enable rsh (remote shell), edit /etc/xinetd.d/rsh and change the line "disable = yes" to "disable = no". For convenience, I also enabled the rlogin service. After the modification, you have to restart the xinetd daemon:

[root]# /etc/rc.d/init.d/xinetd restart

To let node1 (the master node) run programs on node2 without a password prompt, add a .rhosts file to the user's home directory on node2:

[huang] $ cat ~/.rhosts
node1 huang

Also, the /etc/hosts.allow and /etc/hosts.deny files must be configured to allow the rsh service. For simplicity, I added the following line to /etc/hosts.allow to accept all services between the two machines:

ALL: node1 node2 192.182.1.0/255.255.255.0, 127.0.0.1

To allow the super user root to use the rsh and rlogin services, add the following entries to /etc/securetty, one per line:

rsh
rlogin
rexec
pts/0
pts/1

The authentication file /etc/pam.d/rsh should also be modified:

[root]# cat /etc/pam.d/rsh
#%PAM-1.0
# For root login to succeed here with pam_securetty, "rsh" must be
# listed in /etc/securetty.
auth sufficient /lib/security/pam_nologin.so
auth optional /lib/security/pam_securetty.so
auth sufficient /lib/security/pam_env.so
auth sufficient /lib/security/pam_rhosts_auth.so
account sufficient /lib/security/pam_stack.so service=system-auth
session sufficient /lib/security/pam_stack.so service=system-auth

We can verify the rsh service from the master node (node1):

[huang]$ rsh node2 "ps -ef"

The processes running on node2 are then listed on node1. Now we are ready to run parallel programs. There are some sample programs in the /mpich_install_dir/examples/basic directory. Enter that directory and compile the source code with the make command (on both machines). For example, cpi is an MPI program that computes the value of pi:

[huang]$ mpirun -np 2 cpi
Process 0 of 2 on node1
pi is approximately 3.1415926544231318, Error is 0.0000000008333387
wall clock time = 0.001943
Process 1 of 2 on node2

Make sure the executable files are in the same directory path on each machine. We can also specify a configuration file (a P4 procgroup file) instead of using the default machines.LINUX configuration:

[huang]$ cat my.conf
node1 0 /home/huang/cpi
node2 1 /huang/cpi
[huang]$ mpirun -p4pg my.conf cpi
Process 0 of 2 on node1
pi is approximately 3.1415926544231318, Error is 0.0000000008333387
wall clock time = 0.002097
Process 1 of 2 on node2
P4 procgroup file is my.conf.
[huang]$
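
A procgroup file like my.conf can also be generated rather than typed. The following is a minimal sketch, assuming a POSIX shell and my understanding of the ch_p4 format: the first line names the host where mpirun is started, with 0 extra processes, and each following line starts one process on a remote host. The function name make_procgroup is made up for illustration, and the sketch assumes the program sits at the same path on every node.

```shell
# make_procgroup PROGRAM MASTER [WORKER...]: print a ch_p4 procgroup file.
# (make_procgroup is a hypothetical helper name, not an MPICH command.)
make_procgroup() {
    prog=$1
    master=$2
    shift 2
    # First line: the host where mpirun runs; 0 means no processes there
    # beyond the one that mpirun itself starts.
    printf '%s 0 %s\n' "$master" "$prog"
    # One line per remote host, each starting a single process of PROGRAM.
    for host in "$@"; do
        printf '%s 1 %s\n' "$host" "$prog"
    done
}
```

For example, "make_procgroup /home/huang/cpi node1 node2 > my.conf" writes a two-line file that can be passed to mpirun with -p4pg.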

Enjoy the power of parallel computing!

Thursday, March 11, 2004

Wireless Setup (WPC11V4) in RedHat Linux

I occasionally work at Middlesex College with my Dell Inspiron 8100 notebook, but I couldn't get a wireless connection there when booted into Linux (Redhat 9.0 with kernel 2.4.20). The card is a Linksys Wireless-B network card (WPC11 V4). Note that the WPC11 V4 has a different chipset (Realtek 8180) from the previous models, which are based on Prism2. Getting some wireless cards to work in Linux is not trivial, and I spent a couple of days fiddling around to figure out how to make it work. A good source of wireless LAN resources for Linux is Jean Tourrilhes' excellent wireless collection.

Go to Realtek's download website and search for the 8180L driver. There is one for Linux kernel 2.4.20. Download and unzip it, then compile it as root. If there are no errors, a driver file named "rtl8180_24x.o" is created. If something goes wrong, it might be caused by a messed-up kernel, or by kernel source that doesn't match the kernel you are running. In that case you have to download a new kernel (2.4.20-8) and compile it. Once the driver file is created, copy it to the system's module library:

[root]# cp rtl8180_24x.o /lib/modules/`uname -r`/kernel/drivers/net/wireless/
[root]# cardctl insert
[root]# depmod -ae
[root]# modprobe rtl8180_24x
[root]# cardctl ident
Socket 0:
no product info available
Socket 1:
product info: "Realtek", "Rtl8180"
manfid: 0x0000, 0x024c
function: 6(network)
[root]#

Now the device is recognized. To bring it up on the wireless network, we have to do some configuration work. Here is my script to enable the wireless card:

[huang]$ cat /etc/init.d/wlanup
# Load wireless lan driver
#/sbin/insmod -f rtl8180_24x.o
/sbin/modprobe rtl8180_24x

# Work as AP mode & Assign SSID and operation channel.
# Channel 1, 2, 10 are working fine in UWO
/sbin/iwpriv wlan0 wlan_para ssid=uwo
/sbin/iwpriv wlan0 wlan_para ssid2scan=uwo
/sbin/iwpriv wlan0 wlan_para channel=2

# Configure WEP. UWO doesn't use it, a "blue socket" instead
/sbin/iwpriv wlan0 wlan_para encmode=off
/sbin/iwpriv wlan0 wlan_para wepmode=off

# Configure debugging message
/sbin/iwpriv wlan0 msglevel 0


# Enable wireless lan driver
/sbin/iwpriv wlan0 enable
sleep 2

# Get IP address from DHCP server
/sbin/dhclient -1 -q -lf /var/lib/dhcp/dhclient-wlan0.leases \
-pf /var/run/dhclient-wlan0.pid wlan0
echo "$(/sbin/ifconfig wlan0)"
[huang]$

Note that on UWO's campus wireless network, WEP (Wired Equivalent Privacy) is disabled; instead, the school's username and password are used for authentication. The SSID (Service Set Identifier) is lowercase "uwo". After running the wlanup script, the wireless card's IP address shows up if the connection is established. To shut down the connection, another script is used:

[huang]$ cat /etc/init.d/wlandown
# Shut down wlan0 net interface
/sbin/ifconfig wlan0 down

# Disable wireless lan driver
/sbin/iwpriv wlan0 disable
# Unload module
/sbin/rmmod rtl8180_24x
[huang]$ su -
[root]# cardctl eject

Finally, we add a file named "S99wireless" containing the single line "/etc/init.d/wlanup" to the "/etc/rc.d/rc3.d/" directory, so that the script runs automatically when Linux boots.
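
Creating that boot file can be sketched as a small shell function. This is only an illustration, assuming a POSIX shell: the function name install_wlan_hook is made up, the target directory is passed as an argument so the function can be tried in a scratch directory first, and the "#!/bin/sh" line is my addition to the one-line file described above.

```shell
# install_wlan_hook RCDIR: create RCDIR/S99wireless, which calls wlanup.
# (install_wlan_hook is a hypothetical helper name; the "#!/bin/sh" line
# is an addition, the file described in the post has only the wlanup line.)
install_wlan_hook() {
    rcdir=$1
    printf '#!/bin/sh\n/etc/init.d/wlanup\n' > "$rcdir/S99wireless"
    # Mark the file executable so it can be run at boot.
    chmod 755 "$rcdir/S99wireless"
}
```

As root, "install_wlan_hook /etc/rc.d/rc3.d" would create the file for runlevel 3.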

Tuesday, March 02, 2004

Resolve Hotmail Login Problem

I had trouble logging in to some websites, such as Hotmail, with Mozilla (Firefox).

The hardware is a Motorola SB5100 cable modem and a Netgear RP614 router (firmware 4.15RC4). The systems worked properly with Sympatico and Execulink ADSL high-speed connections. After switching to Rogers Cable high-speed Internet, logging on to Hotmail's web mail failed and hung at "transferring data from server".

Solution: reduce the MTU size for the connections.

Linux: [root] # /sbin/ifconfig eth0 mtu 1418

Windows XP: Modify the Windows Registry with "regedit": under the key "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces", find the network card's entry, something like "{C42A4EA1-8C79-48F4-9309-99D7DE35D462}", add a DWORD value named "MTU", select the "Decimal" base and enter "1418" as the value data; then save and quit.

An MTU size of 1418 bytes is small enough to solve the problem in most cases. Generally, an MTU of 1448-1472 bytes should work for most network environments.
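
One way to pick a value instead of guessing is to look for the largest ping payload that gets through unfragmented and add 28 bytes (a 20-byte IP header plus an 8-byte ICMP header). The following is a rough sketch, assuming a POSIX shell; the function names are made up for illustration, and probe_ping assumes an iputils ping that supports "-M do" (forbid fragmentation) and a PROBE_HOST variable that you set yourself. The result only reflects how ICMP packets are treated along the path, so take it as a starting point rather than a definitive answer.

```shell
# max_payload LOW HIGH PROBE: largest size in [LOW, HIGH] for which the
# command "PROBE size" succeeds, assuming LOW itself succeeds.
# (All function names here are hypothetical helpers.)
max_payload() {
    lo=$1
    hi=$2
    probe=$3
    # Binary search: keep the invariant that size "lo" is known to pass.
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi + 1) / 2 ))
        if "$probe" "$mid"; then
            lo=$mid
        else
            hi=$(( mid - 1 ))
        fi
    done
    echo "$lo"
}

# payload_to_mtu SIZE: add the 20-byte IP and 8-byte ICMP headers.
payload_to_mtu() {
    echo $(( $1 + 28 ))
}

# probe_ping SIZE: send one unfragmented ping of SIZE payload bytes to
# $PROBE_HOST (an assumption: set it to a host that answers pings).
probe_ping() {
    ping -c 1 -M do -s "$1" "$PROBE_HOST" >/dev/null 2>&1
}
```

With PROBE_HOST set, "payload_to_mtu $(max_payload 1200 1472 probe_ping)" prints a candidate MTU; a payload of 1472 corresponds to the usual Ethernet MTU of 1500, and a payload of 1390 to the 1418 used above.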