The cluster is just another computer
Open a terminal and play with these commands:
ls
pwd
mkdir new_directory
touch new_file
cd new_directory
cd ..
ls -l
cp new_file new_directory/
ls -l new_directory
rm -r new_directory new_file
Try exploring your own computer using the terminal. For example, navigate to your last project’s directory.
Log into your account
Create a new directory in your home directory. It works just the same as on your computer.
Change your password
To exit the cluster use
Ctrl + D
Start by making an R script by typing the following:
Sys.sleep(20)
message("my first job in the cluster")
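If you prefer to create the file straight from the terminal, a here-document is one way to do it. This is only a minimal sketch: it assumes the file is called r_script_name.R (the name used in the transfer step) and that you run it in the directory where you want the script to live.

```shell
# Write the two-line R script to a file without opening an editor.
# Assumption: the file name r_script_name.R matches the one used for the transfer.
cat > r_script_name.R <<'EOF'
Sys.sleep(20)
message("my first job in the cluster")
EOF

# Print the file to confirm its contents.
cat r_script_name.R
```

Quoting 'EOF' keeps the shell from interpreting anything inside the script, so the R code is written exactly as typed.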
Now let’s move it to the cluster
In your computer’s terminal type:
scp ~/path_to_script/r_script_name.R usr123@abacus:~/workshop
We can also modify the R script in Abacus
In the Abacus terminal type:
cd ~/workshop
nano r_script_name.R
Besides nano, you can also use other text editors like vi
in case you need or want to modify files directly in the cluster
Visit the GitHub repository on http://github.com/efcaguab/sge-workshop
We can download the materials from GitHub to the cluster.
But first we need to enable internet in Abacus.
telnet ienabler 259
On the Abacus terminal:
cd ~/
git clone git@github.com:efcaguab/sge-workshop.git
Now you have all the workshop materials in Abacus. Try to modify
script-2.R from Abacus using a text editor such as nano.
A high performance computer (HPC) is not a faster computer. It’s several computers connected together.
Each of these computers is called a node. Each node can have multiple cores
Every core can run one job
One node acts as a master which distributes jobs across the other nodes
To distribute those jobs, the master node uses queues. We’ll look into that in Part 2
When you log in, as we did before, you access the master node
Abacus has 22 nodes total. 15 of them are available for computing at the School of Biology
The available nodes have between 8 and 48 cores each, for a total of 432 cores
You can explore this yourself by going to http://abacus/ganglia/ in your browser
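To get a feel for these numbers, you can compare them with your own computer. The sketch below counts the cores on the machine it runs on and relates them to the 432 cores quoted above. It assumes nproc or getconf is available, which is the case on most Linux and macOS systems. The comparison is only a rough guide, since every core runs one job.

```shell
# Count the cores on this machine (nproc on Linux, getconf as a fallback).
cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)

# Figure taken from the description of Abacus above.
cluster_cores=432

echo "This machine has ${cores} cores, so it can run about ${cores} one-core jobs at once."
echo "Abacus has ${cluster_cores} cores, roughly $((cluster_cores / cores)) times as many."
```

The division is integer division, so the ratio is rounded down.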
Have a look at the file structure
When you log in you’re actually in
Usually you store your scripts and raw data in your home directory
The output files, especially if they’re heavy, should go to
Create your own directory in
In your computer you would run an R script using
☢ Never do that in the cluster ☢
because you could break it…
Make sure that your script runs well in your computer before sending it to the cluster.
Troubleshooting in the cluster can be a nightmare.
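One way to follow this advice is a quick local check before copying anything over. The sketch below is an illustration rather than a prescribed workflow: it assumes R is installed on your computer and that Rscript is the command you use to run scripts. The file quick_check.R is a hypothetical stand-in for your own script.

```shell
# Hypothetical stand-in script, created in a temporary directory for this example.
workdir=$(mktemp -d)
script="$workdir/quick_check.R"
printf 'message("runs fine locally")\n' > "$script"

# Run the script on your own computer and report what happened.
if ! command -v Rscript >/dev/null 2>&1; then
  status="Rscript not found: install R before testing"
elif Rscript "$script"; then
  status="script ran without errors: ready to send to the cluster"
else
  status="script failed: fix it locally first"
fi
echo "$status"

# Clean up the temporary directory.
rm -r "$workdir"
```

Replace the stand-in with the path to your real script. For long jobs, a cut-down version with a small subset of the data is usually enough to catch errors early.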
Abacus is a shared resource. Be nice to other people. Don’t use it all unless you really need it.
Don’t be afraid to use it though. It’s there for us.