This page gives an overview of how to run on cosmea, a machine located at Argonne National Laboratory. It is not a general resource; it is available only to the SHARP team at Argonne and their collaborators.
Basics of getting onto the machine
- You can request an account through MCS.
- You can only log in to cosmea from an MCS machine (e.g. terra.mcs.anl.gov).
- You should log in to hostname "login01.cosmea".
- The first time you log in, it likely won't work. Here is why, and how to fix it:
  - When you requested your MCS account, they asked for your public key.
  - They put this key in your authorized_keys file.
  - That key is tied to the identity (the private key) on your desktop machine.
  - When you try to get to cosmea from an MCS machine (e.g. terra), it rejects you because the MCS machine does not hold the same identity as your desktop machine, which is the one listed as "authorized" in authorized_keys.
  - So: you need to copy id_rsa.pub (maybe id_rsa as well) into your .ssh directory on the MCS machine.
- I found that I was unable to compile until I got a .modulerc file placed in my home directory.
- I also had to log out and log back in before the .modulerc took effect.
The file is:
    #%Module1.0
    module add IntelCompiler
    module rm mvapich-0.99-intel
    module add pbs
    module add mvapich2-1.0-2008-02-06-intel
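The key-copy fix from the login steps in this section can be sketched roughly as below. This is only a sketch under assumptions: `MCS_USER` is a placeholder for your MCS username, the `run` helper is a made-up dry-run wrapper (it prints each command unless you set `DRY_RUN=0`), and the `chmod` step is a precaution that is not in the notes above.

```shell
#!/bin/sh
# Sketch of the first-login fix. Run from your desktop machine.
MCS_USER="${MCS_USER:-yourname}"   # placeholder (assumption): your MCS username
MCS_HOST="terra.mcs.anl.gov"       # any MCS machine you can reach
COSMEA_HOST="login01.cosmea"

# Hypothetical helper: print the command by default, run it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# 1. Copy the key pair from your desktop into .ssh on the MCS machine.
run scp "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh/id_rsa" "$MCS_USER@$MCS_HOST:.ssh/"

# 2. Make sure the private key is readable only by you; ssh will not
#    use a private key whose permissions are too open.
run ssh "$MCS_USER@$MCS_HOST" chmod 600 .ssh/id_rsa

# 3. Hop through the MCS machine to the cosmea login node.
run ssh -t "$MCS_USER@$MCS_HOST" ssh "$COSMEA_HOST"
```

Once you are on cosmea with the .modulerc in place (and after logging out and back in), `module list` should show the modules named in the file. If you would rather not copy a private key around, SSH agent forwarding (`ssh -A`) is a standard alternative, but that is not what the steps above describe.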
Basics of getting a parallel job to run
- You need to set up an .mpdboot thingy. I don't have the details on this yet.
- I copied a .mpd.conf from Dave into my home directory. Its permissions must be 600.
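If you don't have a .mpd.conf to copy, something like the following might work. It is a minimal sketch that assumes the file only needs the single `MPD_SECRETWORD=` line described in the MPICH2 MPD documentation; the contents of the file copied from Dave are not recorded here, and the secret word shown is a placeholder you should change.

```shell
#!/bin/sh
# Create ~/.mpd.conf with owner-only permissions, without clobbering
# a file that is already there.
conf="$HOME/.mpd.conf"

if [ ! -e "$conf" ]; then
    # Set umask 077 in a subshell so the new file is never readable by others.
    # 'change-this-secret' is a placeholder (assumption); pick your own.
    ( umask 077; printf 'MPD_SECRETWORD=%s\n' 'change-this-secret' > "$conf" )
fi

# mpd refuses to start if the file is accessible by anyone but you.
chmod 600 "$conf"
ls -l "$conf"
```

In standard MPD usage the ring is then started with `mpdboot` and checked with `mpdtrace`; how that should be done on cosmea specifically is still not worked out on this page.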
Basics of monitoring a job
- "qstat -a" gives basic information about who is running.
- "qstat -t" gives a lot of information, including what nodes you have.
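A small sketch of checking on your own jobs is below. The `-u` and `-n` options are standard PBS/Torque `qstat` options rather than something taken from this page, so treat their behavior on cosmea as an assumption. The guard is only there so the snippet does nothing on a machine without PBS.

```shell
#!/bin/sh
# Show the queue, then your own jobs with the nodes assigned to them.
me="${USER:-$(id -un)}"

if command -v qstat >/dev/null 2>&1; then
    qstat -a              # everyone's jobs, one line each
    qstat -n -u "$me"     # only your jobs, with the nodes allocated to each
    pbs_found=yes
else
    echo "qstat not found: run this on a cosmea login node"
    pbs_found=no
fi
```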
Background on the machine
- The file system on this machine is NFS, so I/O is slow.
- It has a good amount of memory and compute on each node.
- VisIt is installed on cosmea at /gfs/software/software/visit