31 Mar 2021 (updated: 2021-03-30)
slurm job.
You’ll be able to…
slurm batch scheduling.
You can follow Dr. Eric Olson’s guidance on the Okapi server homepage to open an SSH tunnel that enables a virtual desktop from off campus.
On Mac/Linux, open a terminal and execute:
ssh -L 33389:localhost:3389 -l <NETID> okapi.math.unr.edu
replacing <NETID> with your user name. Then open a Microsoft Remote Desktop instance. Let’s do that now.
From Rizzo (2019) (Example 7.1):
Suppose that \(X_1, X_2\) are iid from a standard normal distribution. To estimate the mean difference \(\theta = E|X_1 - X_2|\), we obtain a Monte Carlo estimate based on \(m=1000\) replicates of random samples \(x^{(j)} = (x_1^{(j)}, x_2^{(j)}),\ j=1,\ldots,m\), of size 2 from \(N(0,1)\). Then compute the replicates \(\hat{\theta}^{(j)} = |x_1^{(j)} - x_2^{(j)}|\) and average them: \(\hat{\theta} = \frac{1}{m}\sum_{j=1}^{m} |x_1^{(j)} - x_2^{(j)}|\).
set.seed(44)
m <- 1000
g <- numeric(m)
for (i in 1:m) {
  x <- rnorm(2)
  g[i] <- abs(x[1] - x[2])
}
est <- mean(g)
est
## [1] 1.119111
The exact theoretical answer by integration is \(E|X_1-X_2| = 2/\sqrt{\pi} \approx 1.128379\).
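As a short check of where this value comes from (a sketch added here, not taken from Rizzo's text): the difference of two independent standard normals is normal with variance 2, and the mean absolute value of a centered normal with standard deviation \(\sigma\) is \(\sigma\sqrt{2/\pi}\).

```latex
\[
D = X_1 - X_2 \sim N(0,\, 2), \qquad
E|Z| = 2\int_0^\infty \frac{z}{\sigma\sqrt{2\pi}}\, e^{-z^2/(2\sigma^2)}\, dz
     = \sigma\sqrt{\frac{2}{\pi}} \quad \text{for } Z \sim N(0, \sigma^2),
\]
\[
E|X_1 - X_2| = \sqrt{2}\,\sqrt{\frac{2}{\pi}} = \frac{2}{\sqrt{\pi}} \approx 1.128379 .
\]
```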
args = commandArgs(trailingOnly = TRUE)  ## Expect command line args at the end.
tmpM = as.numeric(args[1])
tmpSeed = as.numeric(args[2])

estimateTheta <- function(m = 1000, seed = NULL) {
  if (!is.null(seed)) {
    set.seed(seed)
  }
  est <- mean(replicate(n = m, expr = {
    x <- rnorm(2)
    abs(x[1] - x[2])
  }))
  return(est)
}

(est <- estimateTheta(m = tmpM, seed = tmpSeed))  # print and store
jobID <- paste(tmpM, tmpSeed, sep = "_")
saveRDS(data.frame(m = tmpM, seed = tmpSeed, est = est),
        paste0("thetaHat_", jobID, ".rds"))
For more on this topic see this nice blog post:
https://thecoatlessprofessor.com/programming/r/working-with-r-on-a-cluster/.
R noninteractively, passing arguments
Rscript ./estimateTheta.R 100 44 > estimateTheta.Rout
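Because the R script converts its arguments with as.numeric, a missing or mistyped argument only surfaces as NA inside R. One possible guard is a small bash wrapper that checks the arguments first. This is a minimal sketch: the helper names is_positive_int and run_estimate are hypothetical, and requiring positive integers for both m and the seed is an assumption of the sketch, not a rule of R.

```shell
#!/bin/bash
# Hypothetical guard: succeed only if the argument is a positive integer.
is_positive_int() {
  [[ "$1" =~ ^[1-9][0-9]*$ ]]
}

# run_estimate M SEED: validate both arguments, then print the command that
# would be run (replace the final echo with the command itself to execute it).
run_estimate() {
  local m="$1" seed="$2"
  if ! is_positive_int "$m" || ! is_positive_int "$seed"; then
    echo "usage: run_estimate M SEED (positive integers)" >&2
    return 1
  fi
  echo "Rscript ./estimateTheta.R $m $seed"
}
```

Calling run_estimate 100 44 prints the Rscript command; calling it with a missing or non-numeric argument prints the usage message and returns a non-zero status.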
slurm job: a .slm file, submitted with sbatch estimateTheta.slm.
#!/bin/bash
#SBATCH -n 1
#SBATCH --mem=8GB
Rscript ./estimateTheta.R 100 4444 > estimateTheta.Rout
Before executing the script via the slurm command sbatch estimateTheta.slm, check the queue:
sinfo
squeue -p fast
sbatch estimateTheta.slm
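Since the batch file differs between runs only in the number of replicates and the seed, it can be generated rather than edited by hand. The following is a minimal sketch, assuming the same resource requests as above; the function name write_slm is hypothetical.

```shell
#!/bin/bash
# write_slm M SEED [OUTFILE]
# Writes a slurm batch script that runs estimateTheta.R with M replicates
# and the given seed. OUTFILE defaults to estimateTheta.slm.
write_slm() {
  local m="$1" seed="$2" out="${3:-estimateTheta.slm}"
  cat > "$out" <<EOF
#!/bin/bash
#SBATCH -n 1
#SBATCH --mem=8GB
Rscript ./estimateTheta.R ${m} ${seed} > estimateTheta.Rout
EOF
}

# Example (submission is commented out so the sketch runs without slurm):
# write_slm 100 4444 estimateTheta.slm
# sbatch estimateTheta.slm
```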
slm
#!/bin/bash
#SBATCH -n 1
#SBATCH --mem=8GB
Rscript ./estimateTheta.R 1000000 44 > estimateTheta.Rout
sbatch estimateTheta.slm
squeue -p fast
squeue -u aschissler
less estimateTheta.Rout
scancel <JOBID>
scancel -u aschissler ## cancel all your jobs
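The job ID needed by scancel is printed by sbatch on submission in the form "Submitted batch job <JOBID>". A minimal sketch for capturing it in a script follows; the helper name job_id_from_sbatch is hypothetical. (sbatch also offers a --parsable option that prints the ID without the surrounding text.)

```shell
#!/bin/bash
# job_id_from_sbatch LINE
# Prints the numeric job ID from sbatch's "Submitted batch job <JOBID>"
# message; returns non-zero if the line does not match that form.
job_id_from_sbatch() {
  if [[ "$1" =~ ^Submitted\ batch\ job\ ([0-9]+) ]]; then
    echo "${BASH_REMATCH[1]}"
  else
    return 1
  fi
}

# Example (requires slurm, so left commented out):
# jobid=$(job_id_from_sbatch "$(sbatch estimateTheta.slm)")
# squeue -j "$jobid"
# scancel "$jobid"
```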
estimateTheta.R and estimateTheta.slm
rslurm, slurmR to scale up
conda, docker, singularity virtual environments
bash scripting
#!/bin/bash
today=$(date +%Y-%m-%d.%H:%M:%S)
id=0
echo "$today"
for l in 15000 10000 1000 500 250 100 50 25 12
do
  for group in 0 1
  do
    id=$((id+1))
    ## id=$(echo $l-$k)
    echo "working on pear$id"
    # Write a one-off batch script for this (l, group) combination.
    echo "#!/bin/bash
#SBATCH -n 1
#SBATCH --job-name=$id
#SBATCH --mem=4GB
cd ~/Research/Mul_NB/brca_nb_pearson_vital
time Rscript ./estimate_R_tcga_brca_nb_pearson_group.R $l $group
" > "launch-$id.slm"
    chmod a+rx "launch-$id.slm"
    # Submit the script on stdin, then remove the temporary file.
    sbatch < "launch-$id.slm"
    rm "launch-$id.slm"
  done
done
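An alternative to writing and submitting one file per combination is a slurm job array, in which a single batch script is run once per task and each task reads its index from the SLURM_ARRAY_TASK_ID environment variable. This is a different technique from the loop above, shown only as a sketch; the helper task_to_params and the array names are hypothetical. It maps a 1-based task index onto the same (l, group) grid, in the same order as the nested loops.

```shell
#!/bin/bash
# Parameter grid, in the same order as the nested loops (l outer, group inner).
lengths=(15000 10000 1000 500 250 100 50 25 12)
group_ids=(0 1)

# task_to_params TASK_ID
# Prints "l group" for a 1-based task index (1..18 for this grid).
task_to_params() {
  local idx=$(( $1 - 1 ))                          # 0-based grid position
  local n_groups=${#group_ids[@]}
  local l=${lengths[$(( idx / n_groups ))]}        # outer loop value
  local group=${group_ids[$(( idx % n_groups ))]}  # inner loop value
  echo "$l $group"
}

# Inside a batch script submitted with `#SBATCH --array=1-18`, one might write:
# read -r l group <<< "$(task_to_params "$SLURM_ARRAY_TASK_ID")"
# Rscript ./estimate_R_tcga_brca_nb_pearson_group.R "$l" "$group"
```

With this layout, task 1 corresponds to l=15000, group=0 and task 18 to l=12, group=1, matching the id counter in the loop version.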
rslurm
See the rslurm Getting Started vignette to specify a job, submit a job, and inspect the results.
rslurm automates slurm scheduling for large, embarrassingly parallel jobs in the R programming language.
slurmR
slurmR is a lightweight wrapper from USC Biostats to automate slurm jobs. https://github.com/USCbiostats/slurmR
conda
slurm. See here for some convenient commands.
R, rmarkdown, Rstudio
References
Rizzo, Maria L. 2019. Statistical computing with R. 2nd ed. CRC Press.