-- SimonKing - 13 Jul 2006
Keeping remote sites in sync with JHU
/export/ws06afsr here at JHU is the master copy. All sites should be kept in sync with this copy. This means transferring files both TO and FROM the other three sites.
How to synchronise the filespace
You should only synchronise things in your own project directories (i.e. below /export/ws06afsr/projects). Things at the top level, or in directories such as /export/ws06afsr/data, are shared and so should only be synchronised infrequently (and by Simon). You are also allowed to synchronise your /export/ws06afsr/users/username directory.
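As a rough illustration of the rule above, the following shell sketch checks whether a directory is one you are allowed to synchronise before you run rsync.sh. The function name in_sync_scope is hypothetical and not part of the workshop tools; the paths are the ones given in the text.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a directory lies inside the areas
# the text allows ordinary users to synchronise.
# $1 = directory to check, $2 = your username.
in_sync_scope() {
    dir=$1
    user=$2
    case "$dir" in
        # anything strictly below the projects directory
        /export/ws06afsr/projects/?*) return 0 ;;
        # your own users directory, or anything below it
        /export/ws06afsr/users/"$user"|/export/ws06afsr/users/"$user"/*) return 0 ;;
        # top level, data/, other users' directories
        *) return 1 ;;
    esac
}

# Example use before a sync (assumes $USER holds your username):
#   in_sync_scope "$(pwd)" "$USER" || echo "Not a directory you should synchronise" >&2
```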
How to run an experiment on a remote site
This example is for an experiment to be run in Edinburgh (EDI). Other site codes are UIUC and UW.
- Set up the experiment at JHU
  - In your project directory, such as /export/ws06afsr/projects/my_project, set up all the files you require and do a dry run on the JHU cluster to make sure it works
- Transfer the experiment over to Edinburgh, and run it
  - On the JHU system: cd to /export/ws06afsr/projects/my_project and run /export/ws06afsr/rsync.sh JHU EDI
  - Log in to Edinburgh and run your experiment
- Transfer files back to JHU
  - On the JHU system: cd to /export/ws06afsr/projects/my_project and run /export/ws06afsr/rsync.sh EDI JHU
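The two transfer steps above can be sketched as a small shell helper, run on the JHU system. The function name sync_project and the RSYNC_SH and PROJECT_DIR variables are hypothetical conveniences; the default paths and the site-code arguments are the ones given in the text.

```shell
#!/bin/sh
# Defaults are the paths named in the text; override them for another project.
RSYNC_SH="${RSYNC_SH:-/export/ws06afsr/rsync.sh}"
PROJECT_DIR="${PROJECT_DIR:-/export/ws06afsr/projects/my_project}"

# Hypothetical wrapper: run rsync.sh from inside the project directory.
# $1 = source site code, $2 = destination site code (JHU, EDI, UIUC or UW).
sync_project() {
    # The subshell keeps the cd from changing your interactive directory.
    ( cd "$PROJECT_DIR" && "$RSYNC_SH" "$1" "$2" )
}

# Push the experiment out to Edinburgh:
#   sync_project JHU EDI
# ...log in to Edinburgh and run the experiment...
# Pull the results back to JHU:
#   sync_project EDI JHU
```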
Avoiding being asked for your password when logging in, or running rsync.sh
- For the EDI and UIUC systems:
  - make a single pair of public/private ssh keys (without passphrase) by running ssh-keygen -t dsa in your home directory; hit enter when asked for a passphrase or file location.
  - append the contents of ~/.ssh/id_dsa.pub to your ~/.ssh/authorized_keys file on each of the remote systems
  - place the private key in a file called ~/.ssh/id_dsa on the JHU system
- For UW, we don't yet have a working authentication system: you'll have to type your password
Here is a tutorial: http://fedoranews.org/dowen/sshkeys/
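The "append the public key" step can be sketched as follows. This is a minimal illustration that assumes you have already copied id_dsa.pub to the remote system by some means; append_pubkey is a hypothetical helper name, and the file names are those given in the text.

```shell
#!/bin/sh
# Hypothetical helper for the "append the public key" step. Run it on the
# remote system after copying id_dsa.pub there.
# $1 = public key file, $2 = authorized_keys file to update.
append_pubkey() {
    pubkey=$1
    auth=$2
    touch "$auth"
    # Append only if this exact key line is not already present,
    # so running the step twice does not create duplicates.
    if ! grep -qxFf "$pubkey" "$auth"; then
        cat "$pubkey" >> "$auth"
    fi
    # sshd commonly ignores key files that other users can write to.
    chmod 600 "$auth"
}

# Example (id_dsa.pub is assumed to have been copied to your remote home directory):
#   append_pubkey ~/id_dsa.pub ~/.ssh/authorized_keys
```

Appending rather than overwriting matters if authorized_keys already holds keys for other machines.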
Avoiding confusion when running rsync
I have set up the rsync.sh script so that it compares the contents of files and not just their timestamps/sizes. I did this because the timestamps on files here at JHU are not 100% reliable: the clocks on individual machines are not all in agreement. Some preliminary tests showed rsync copying files when they didn't need to be copied.
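To illustrate the difference between comparing contents and comparing timestamps, here is a small shell sketch using cmp. It is not the content of rsync.sh, which is not reproduced here, and needs_copy is a hypothetical name. rsync itself offers content-based comparison through its --checksum (-c) option.

```shell
#!/bin/sh
# Hypothetical helper: decide whether src must be copied over dest by
# comparing file contents only; modification times are ignored.
# Returns 0 (true) if a copy is needed, 1 otherwise.
needs_copy() {
    src=$1
    dest=$2
    [ -e "$dest" ] || return 0      # destination missing: copy
    if cmp -s "$src" "$dest"; then
        return 1                    # identical bytes: nothing to do
    fi
    return 0                        # contents differ: copy
}

# Example:
#   needs_copy run.cfg /some/mirror/run.cfg && echo "would be copied"
```

With this kind of test, a file whose timestamp is wrong but whose bytes are unchanged is left alone.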
Setting up rsync to look at the file contents does mean that it is possible to overwrite a newer file with an older one, if you are not careful.
I strongly recommend never editing a file on a remote system. Edit it here at JHU, then rsync it (to make the rsync as quick as possible, cd into the directory where the modified file lives and run rsync.sh from that directory). This way, you know that the JHU copy of a file is always correct and up-to-date, and the only thing you can do wrong is forget to rsync it out to the remote system.
When rsyncing back from the remote system to JHU, all you should see being copied are the output files of your experiment (and not your scripts etc).
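One way to check this is to scan the list of transferred files for anything that looks like a script. The sketch below reads file names, one per line, on standard input and prints those with script-like suffixes. The function name and the suffix list (.sh, .pl, .py) are assumptions, so adjust them to your project; it also assumes you have saved the list of copied files that is printed during the sync.

```shell
#!/bin/sh
# Hypothetical check: print any transferred file that looks like a script
# rather than experiment output. Reads one file name per line on stdin.
# Exit status follows grep: 0 if a suspicious name was found, 1 otherwise.
unexpected_transfers() {
    grep -E '\.(sh|pl|py)$'
}

# Example (transfer.log is a hypothetical file holding the printed file list):
#   unexpected_transfers < transfer.log && echo "scripts were copied back: check them" >&2
```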