Cheatsheet for Nimrod

 

Cheatsheet for Nimrod Installation

Download (cvs co nimrod)

  • setenv CVSROOT where-is-nimrod
  • cvs co nimrod

Setup dirs

  • mkdir -p ~/.nimrod/experiments
  • touch ~/.nimrodrc

Install Python (version 2.3.1 or later)

  • which python
  • python -V
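
The version requirement can also be checked from inside the interpreter. The following is a minimal sketch and not part of nimrod; the helper name meets_minimum is made up for illustration, and the code is written for a current Python interpreter rather than the 2.3 series.

```python
import sys

# Minimum version stated in this cheatsheet.
MINIMUM = (2, 3, 1)


def meets_minimum(version, minimum=MINIMUM):
    """Return True if the (major, minor, micro) tuple is at least `minimum`."""
    return tuple(version[:3]) >= tuple(minimum)


if __name__ == "__main__":
    # sys.version_info starts with (major, minor, micro).
    current = tuple(sys.version_info[:3])
    print("%d.%d.%d" % current, meets_minimum(current))
```

Tuples compare element by element, so (2, 3, 1) passes while (2, 2, 9) does not.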

Install postgres with Python support, and start the postmaster

  • which psql

Ensure the Python modules for postgres are installed

  • ls $PYTHON/lib/site-packages
  • Check pg module is available to python:
  • python> import pg
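
The same check can be scripted so that a missing module is reported rather than raising an error. This is a small sketch using only the standard library, written for a current Python interpreter; the function name module_available is an assumption for illustration.

```python
import importlib.util


def module_available(name):
    """Return True if `import name` would find a module, without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # "pg" is the module this cheatsheet expects for postgres access.
    print("pg available:", module_available("pg"))
```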

Create a database under your user name

  • createdb

Install Globus 2.4

  • Get a certificate and register it on a remote machine
  • grid-proxy-init
  • Check GRAM works:
  • globusrun -a -r brecca-2.vpac.org/jobmanager

Compile nimrod

  • setenv NIMROD_INSTALL ${HOME}/programs/nimrod
  • setenv NIMROD_DATABASE pgsql
  • cd ~/src/nimrod/
  • ./configure --with-pgsql --prefix=$NIMROD_INSTALL
  • make
  • make install

Create the database tables

  • $NIMROD_INSTALL/bin/nimrod dbcreate
  • (looks OK, many notices re "NOTICE: implicit trigger...")

Check database tables

  • psql -c "select * from nimrodapexecutable"
  • (looks good)

Test with Demo

Setup

(Optional) Start with a clean database

  • nimrod dbclean
  • nimrod dbcreate

Tell nimrod where your proxy is by setting X509_USER_PROXY (do this before running grid-proxy-init):

  • setenv X509_USER_PROXY ${HOME}/.nimrod/globus_proxy

Setup a grid certificate/proxy to use

  • grid-proxy-init -cert ~bbeeson/.globus/usercert.pem -key ~bbeeson/.globus/userkey.pem
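
When these steps are driven from a script instead of an interactive csh session, the proxy location has to be placed in the environment of the child process. A hedged sketch follows; proxy_environment is a hypothetical helper, and only the variable name and the proxy path come from the notes above.

```python
import os
import posixpath


def proxy_environment(home, base_env=None):
    """Return a copy of the environment with X509_USER_PROXY set."""
    env = dict(os.environ if base_env is None else base_env)
    # Same location as the setenv line above: ${HOME}/.nimrod/globus_proxy
    env["X509_USER_PROXY"] = posixpath.join(home, ".nimrod", "globus_proxy")
    return env


# Possible use (not executed here):
#   import subprocess
#   subprocess.run(["grid-proxy-init"], env=proxy_environment("/home/me"), check=True)
```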

Untar the demo

Add a new experiment

  • cd ~/.nimrod/experiments/
  • tar -zxf demo.tar.gz

This adds the directory ~/.nimrod/experiments/demo. The main file is demo.pln; it is unclear whether this is the only file required, and whether it must be named after the experiment.

From this 'plan' file, a 'run' file (demo.run) is created. The run file is basically an expanded version of the plan file, i.e. it lists each possible permutation of parameter values.
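
To illustrate what "expanded" means, the sketch below enumerates every combination of values for a set of parameters. It is not nimrod's implementation and does not read plan file syntax; the function name and the parameters x and y are invented for the example.

```python
import itertools


def expand_parameters(parameters):
    """Return one dict per combination of parameter values.

    `parameters` maps a parameter name to the list of values it may take.
    """
    names = sorted(parameters)
    value_lists = [parameters[name] for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*value_lists)]


if __name__ == "__main__":
    # Hypothetical parameters; these names do not come from demo.pln.
    for job in expand_parameters({"x": [1, 2], "y": ["a", "b", "c"]}):
        print(job)
```

Two values of x and three of y give six combinations, each of which would correspond to one job.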

Add resources to use

  • nimrod resource add globus bezek.dstc.monash.edu.au --limit 2 --immediate T
  • This limits the number of agents to 2 (each agent can run one job at a time). Do not add a slash to the host name!
  • nimrod resource check globus bezek.dstc.monash.edu.au
  • Again, do not add a slash to the host name!
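
If resources are added from a script, the slash mistake can be caught before nimrod is called. A minimal sketch, assuming the command form shown above; resource_add_command is a hypothetical helper, not a nimrod function.

```python
def resource_add_command(host, limit=2):
    """Build the `nimrod resource add globus ...` argument list used above."""
    if "/" in host:
        raise ValueError("host name must not contain a slash: %r" % host)
    return ["nimrod", "resource", "add", "globus", host,
            "--limit", str(limit), "--immediate", "T"]


# Possible use (not executed here):
#   import subprocess
#   subprocess.run(resource_add_command("bezek.dstc.monash.edu.au"), check=True)
```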

Run, using resources

  • Put entries into the database, based on experiment 'demo':
  • nimrod portalapi addrun demo G
  • (Possibly optional, since startexp may already do this.) Convert the run file into database entries:
  • nimrod create demo
  • Start the experiment:
  • nimrod portalapi startexp demo

Creating my own demo

Here I create my own demo, one which I can fully understand. I will then modify it to run the zeus code.

  • mkdir ~/.nimrod/experiments/zeus
  • cd ~/.nimrod/experiments/zeus/
  • vi zeus.pln
  • Plan file syntax help is still needed here; for the moment, copy an existing plan file
  • nimrod generate zeus.pln
  • This generates zeus.run
  • nimrod portalapi addrun zeus G
  • nimrod portalapi startexp zeus
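
The three nimrod commands in this list follow the same pattern for any experiment name, so they can be generated rather than retyped. The sketch below only builds the argument lists; experiment_commands is a made-up name, and whether these three steps are sufficient for every experiment is an assumption.

```python
def experiment_commands(name):
    """Return the nimrod commands from the list above, in order, for `name`."""
    return [
        ["nimrod", "generate", name + ".pln"],         # writes <name>.run
        ["nimrod", "portalapi", "addrun", name, "G"],  # put entries into the database
        ["nimrod", "portalapi", "startexp", name],     # start the experiment
    ]


if __name__ == "__main__":
    for command in experiment_commands("zeus"):
        print(" ".join(command))
```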

Commandline Usage

  • nimrod portalapi ... (commands used by portal)
  • nimrod ... (other commands)
  • If you didn't add resources before adding the experiment, run this to add them:
  • nimrod --experiment experiment-id control addserver computeid
  • nimrod dbclean
  • nimrod dbcreate
  • Examine the tables to see what is happening:
  • psql -c 'select * from TABLE'
  • Where TABLE = nimrodgridrun, etc (see below)
  • Examine the process table to see what is happening:
  • ps auxwww | grep python

Database tables

Common ones:

  • nimrodenfexperiment - experiments and their ids
  • nimrodcomputeresource - all the resources (i.e. computers) and their ids
  • nimrodenfserver - maps resources to experiments. Note this uses the ids of each.
  • nimrodgridrun - a process started on a remote machine (eg. by GRAM). The local 'actuator' will actually use globus to do this. Each agent is put into this table, then run by the actuator.
  • nimrodenfjob - jobs that are ready, executing or finished. Here you can see how your running experiment is progressing. The python process which opens a port for the Java applet to connect to probably examines this table.

Others:

  • nimrodparameter - list of all parameter values
  • nimrodtask - what each job will do (parameters are substituted in to $x, etc)
  • nimrodenfexperiment - list of each experiment and status
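
To look at several of these tables in one go, the psql command from the Commandline Usage section can be generated per table. A sketch under the assumption that the default database (the one named after your user) is the right one; psql_select_command is a hypothetical helper.

```python
import re

# Table names listed under "Common ones" in this cheatsheet.
COMMON_TABLES = [
    "nimrodenfexperiment",
    "nimrodcomputeresource",
    "nimrodenfserver",
    "nimrodgridrun",
    "nimrodenfjob",
]


def psql_select_command(table):
    """Build the argument list for: psql -c 'select * from TABLE'."""
    # Accept plain identifiers only, so nothing else can end up in the SQL.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError("not a plain table name: %r" % table)
    return ["psql", "-c", "select * from %s" % table]


if __name__ == "__main__":
    for table in COMMON_TABLES:
        print(psql_select_command(table))
```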

Glossary

  • Experiment - a collection of jobs, based on a run file (possibly exactly one).
  • Job - a SINGLE set of commands to run. If you are exploring parameters, each permutation of parameters needs to be run - this is a job. A job can involve several tasks, such as copy file, run program.
  • Resource - a computer which can run things for you. Specified by a hostname and type of resource (eg Globus).
  • Actuator - a python process which examines nimrodgridrun and executes (via Globus GRAM) commands on remote machines.
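
The relationships between these terms can be pictured as a small data model. This is only an illustration of the glossary; the class and field names are invented here and do not mirror nimrod's database schema or code.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Resource:
    hostname: str
    kind: str  # type of resource, e.g. "globus"


@dataclass
class Job:
    # One combination of parameter values.
    parameters: Dict[str, str]
    # The steps the job performs, e.g. copy a file, run a program.
    tasks: List[str] = field(default_factory=list)


@dataclass
class Experiment:
    name: str
    jobs: List[Job] = field(default_factory=list)
    resources: List[Resource] = field(default_factory=list)


if __name__ == "__main__":
    demo = Experiment("demo")
    demo.resources.append(Resource("bezek.dstc.monash.edu.au", "globus"))
    demo.jobs.append(Job({"x": "1"}, ["copy input file", "run program"]))
    print(demo)
```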

-- BrettBeeson - 07 Nov 2003