Sunday, May 2, 2010

Set up a 3-server MPI cluster environment

Environment Description

Hardware/OS: 3 servers running Linux (server 0, 1, 2)

MPI Software: Intel MPI 4.0 for Linux (Red Hat 5.0)

Download the trial version from the Intel web site. Try this link first (no registration required, my link): http://registrationcenter-download.intel.com/akdlm/irc_nas/1718/l_mpi_p_4.0.0.025.tgz. If that link does not work, please download it from here (requires registration).

Example: Matrix Multiplier



Key reference docs
Intel MPI 4.0 documentation (Getting Started guide, Reference Manual)


Installation

1) Prepare servers

The hostnames of my 3 servers are:
ericsoa1
ericsoa2
ericsoa3

From each server, you must be able to SSH to the other 2 servers (by hostname, NOT IP address) with NO password required.
If that is not the case, set up passwordless SSH login first.
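Passwordless SSH can be set up with a sketch like the following (assumes OpenSSH and the ericsoa* hostnames from this cluster; run it on each server, adjusting the host list to the other two nodes):

```shell
# Sketch of passwordless SSH setup, assuming OpenSSH is installed.
mkdir -p ~/.ssh
# Generate a key pair only if one does not exist yet (empty passphrase):
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N "" -q -f ~/.ssh/id_rsa
# Append the public key to authorized_keys on the other nodes:
for host in ericsoa2 ericsoa3; do
    ssh-copy-id "root@$host" || echo "could not reach $host (check network/hostname)"
done
```

After this, `ssh ericsoa2` should log you in without prompting for a password.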


2) Install the Intel MPI SDK [Note: ONLY do the following on server 0/zero; the installer will install the software on all other servers]
  • Follow the installation instructions [1] above. Do steps 1, 2, and 3.
  • Step 4 of the installation instructions [1] installs the Intel MPI Library on all nodes of your cluster. ASSUMING NO shared file system, create a machines.LINUX file with the following content:
ericsoa1
ericsoa2
ericsoa3
  • SKIP Step 5 of the installation instructions [1]
  • Step 6 of [1]
  • Step 7 of [1]. // use default choice
  • Step 8 of [1]: choose 2 // install cluster node software on every node of your cluster
  • It will ask for "machines.LINUX". Press Enter, since the file is already in the current directory
  • review and accept the agreement
  • accept default in the following part
When the installation finishes, you will see the following result: "Completed cluster installation successfully."
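The machines.LINUX node file mentioned in step 4 above can be created like this (the hostnames are the three cluster nodes; the installer looks for the file in the directory it is run from):

```shell
# Create the machines.LINUX node file in the current directory:
cat > machines.LINUX <<'EOF'
ericsoa1
ericsoa2
ericsoa3
EOF
wc -l machines.LINUX   # prints: 3 machines.LINUX
```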



3) Prepare the mpd.hosts file
Create mpd.hosts in the user's home directory ($HOME; for the root user, that is /root).
Include the following content in the file:
ericsoa1
ericsoa2
ericsoa3

4) Set up the MPI runtime environment (paths, etc.)

cd /opt/intel/impi/4.0.0.025/bin
. ./mpivars.sh // note the "." at the beginning: the script must be sourced, not executed
export CC=gcc
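Since mpivars.sh must be sourced in every new shell that runs MPI commands, it is convenient to put these lines in ~/.bashrc. A sketch (the path assumes the default install prefix used above, and the guard makes it a no-op on machines without Intel MPI):

```shell
# Source the Intel MPI environment if it is installed (default prefix assumed):
MPIVARS=/opt/intel/impi/4.0.0.025/bin/mpivars.sh
[ -f "$MPIVARS" ] && . "$MPIVARS"
# Tell mpicc which backing compiler to wrap:
export CC=gcc
echo "CC=$CC"   # prints: CC=gcc
```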


5) Start the mpd process
Go to the home directory (for the root user: cd /root), then run:
mpdboot -n 3 -r ssh

Use mpdtrace to verify the result. It should show all three server nodes.


Compile and Run the application

DO THE FOLLOWING from Server 0/zero ONLY

1) Set up the project folder on Server 0/zero
mkdir /opt/tp/project

Extract the example files (*.c, *.h) into this project folder:

cd /opt/tp/project/

mpicc -o tp_10_5_5 *.c // tp_10_5_5 is the name of the output file; it means A (1000*500) * B (500*500)

Use "ls" to check that the file has been generated.


2) Do the following from Server 0/zero:

Create the project folder (e.g. /opt/tp/project) on the other servers:
ssh ericsoa2 mkdir -p /opt/tp/project
ssh ericsoa3 mkdir -p /opt/tp/project

3) Copy the generated executable files to all other servers:
scp /opt/tp/project/tp* root@ericsoa2:/opt/tp/project
scp /opt/tp/project/tp* root@ericsoa3:/opt/tp/project


4) Run the example application from server 0/zero
run the following from the project folder:
mpiexec -n 3 ./tp_10_5_5

and you should see the result.


Good luck!
