Cluster submission from notebook
This notebook shows how to take code from BESOS and run it on
ComputeCanada cluster resources.
To run BESOS modules on ComputeCanada cluster resources, you will need a ComputeCanada account with the following installed:

- Python 3 and GLPK (both can be loaded with `module load`)
- Bonmin and EnergyPlus
- besos (pip installable)

You can also develop Python files that use the BESOS modules locally and submit them to the cluster via your usual method.
Overall process
The general process is as follows:

- Write your code to a .py file.
  - As shown below, you can write out a single cell from your notebook by adding `%%writefile filename.py` at the top. This has the advantage that you can test your code on BESOS, then uncomment this line.
  - To output a whole notebook, select File, Export Notebook As… and select Executable Script.
- Write a batch file to submit it. We use the `%%writefile` method below.
- Execute the following steps in the terminal (here we use a window inside a notebook cell, or open a separate terminal tab):
  - Move the files over using SFTP (see here).
  - Submit the job.
  - Copy back the results files using SFTP.
- Unpickle the results and continue post-processing.

While this process is somewhat cumbersome, it can be convenient for novice terminal users to use the cells below as a crib sheet of commands. We recommend copying snippets of this notebook and the SFTP one together to make your own workflow.
Python file for execution on cluster

The following cell will write a Python file `cluster.py` to be submitted to the ComputeCanada cluster.
#%%writefile cluster.py
import pickle
import time
import numpy as np
import pandas as pd
from IPython.display import IFrame
from besos import eppy_funcs as ef, pyehub_funcs as pf, sampling
from besos.evaluator import EvaluatorEH, EvaluatorEP
from besos.parameters import (
    FieldSelector,
    FilterSelector,
    GenericSelector,
    Parameter,
    RangeDescriptor,
    expand_plist,
    wwr,
)
from besos.problem import EHProblem, EPProblem, Problem
now = time.time() # get the starting timestamp
building = ef.get_building() # load example file if no idf filename is provided
parameters = expand_plist(
    {'NonRes Fixed Assembly Window': {'Solar Heat Gain Coefficient': (0.01, 0.99)}}
)
objectives = ['Electricity:Facility', 'Gas:Facility'] # these get made into `MeterReader` or `VariableReader`
problem = EPProblem(parameters, objectives) # problem = parameters + objectives
evaluator = EvaluatorEP(problem, building) # evaluator = problem + building
samples = sampling.dist_sampler(sampling.lhs, problem, 2)
outputs = evaluator.df_apply(samples, keep_input=True)
passedtime = round(time.time()-now,2)
timestr = 'Time to evaluate '+str(len(samples))+' samples: '+str(passedtime)+' seconds.'
with open('time.cluster', 'wb') as timecluster:
    pickle.dump(timestr, timecluster)
with open('op.out', 'wb') as op:
    pickle.dump(outputs, op)
timestr
'Time to evaluate 2 samples: 3.55 seconds.'
Batch file

The following cell will write a batch file `clusterbatch.sh` used for submitting our job.
%%writefile clusterbatch.sh
#!/bin/bash
#SBATCH --account=def-revins
#SBATCH --time=00:10:00
#SBATCH --nodes=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=1000mb
#SBATCH --output=%x-%j.out
echo "Current work dir: `pwd`"
echo "Starting run at: `date`"
echo "Job ID: $SLURM_JOB_ID"
echo "prog started at: `date`"
mpiexec python cluster.py
echo "prog ended at: `date`"
Overwriting clusterbatch.sh
File transfers
Now we need to transfer the following files to the cluster using SFTP as described in this notebook:

- `cluster.py`
- `clusterbatch.sh`

Note that we need to transfer these files into a folder residing in `/scratch` (e.g. `/scratch/job`) on the cluster, since we do not have access to submit jobs from `/home`.
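For illustration, the uploads can be scripted with an `sftp` batch file; the hostname, username, and `/scratch/job` folder below are example values, so adapt them to your own workflow (see the SFTP guide).

```shell
# Write an sftp batch file listing the transfer commands (example paths).
# To run the transfer:  sftp -b transfer.batch cc_username@cedar.computecanada.ca
cat > transfer.batch <<'EOF'
mkdir scratch/job
cd scratch/job
put cluster.py
put clusterbatch.sh
EOF
cat transfer.batch
```

Using `sftp -b` keeps the transfer repeatable, which is handy if you resubmit the job after editing `cluster.py`.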
Job submission
SSH login on the server
Get the terminal inside the notebook:
your_gitlab_username = 'mdrpanwar'  # change this to your username
IFrame("https://hub.besos.uvic.ca/user/"+your_gitlab_username+"/terminals/2", width=1200, height=250)
If you have trouble getting the `$` prompt, change the number at the end of the above URL to 3 (or higher) to start a new terminal.

- Execute `ssh -Y cc_username@cluster_name.computecanada.ca` in the terminal, e.g. `ssh -Y mpanwar@cedar.computecanada.ca`
- Enter the password when prompted.
- This should get you the `[cc_username@cluster_name<login_node> ~]$` prompt, e.g. `[mpanwar@cedar1 ~]$`
Submitting the job
- Execute `module load python/3.7`
- Execute `module load glpk/4.61`
- Assuming you have transferred the required files (as mentioned above), we will now submit the job.
- Navigate to the directory inside of `/scratch` where you transferred your files (i.e. `/scratch/job` as per the example) by executing `cd scratch/job`.
- Verify that this folder contains `cluster.py` and `clusterbatch.sh` by executing `ls`.
- Execute `sbatch clusterbatch.sh`.
- You can check the status of the job by executing `squeue -u cc_username`, e.g. `squeue -u mpanwar`.
Getting the results
The job is finished when `squeue -u cc_username` does not contain any job details.

Go back to the second half of the SFTP guide and get the results files, in this case:

- `time.cluster`
- `op.out`
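The download can be scripted the same way with an `sftp` batch file; again, the hostname, username, and paths are example values.

```shell
# Write an sftp batch file that fetches the results (example paths).
# To run the transfer:  sftp -b fetch.batch cc_username@cedar.computecanada.ca
cat > fetch.batch <<'EOF'
cd scratch/job
get time.cluster
get op.out
EOF
cat fetch.batch
```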
Loading the results
You can now unpickle the results and continue post-processing in BESOS.
with open('time.cluster', 'rb') as timestr:
    passedtime = pickle.load(timestr)
with open('op.out', 'rb') as df:
    outputs = pickle.load(df)
print(passedtime)
outputs
Time to evaluate 2 samples: 3.55 seconds.
|   | Solar Heat Gain Coefficient | Electricity:Facility | Gas:Facility |
|---|---|---|---|
| 0 | 0.145485 | 1.787176e+09 | 2.640505e+09 |
| 1 | 0.920974 | 2.073960e+09 | 2.537977e+09 |
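Once unpickled, `outputs` is an ordinary pandas DataFrame, so any post-processing can continue from here. A minimal sketch, assuming the column names shown above (the values below are copied from the example table, not new results):

```python
import pandas as pd

# Example values copied from the table above (not new results)
outputs = pd.DataFrame({
    'Solar Heat Gain Coefficient': [0.145485, 0.920974],
    'Electricity:Facility': [1.787176e9, 2.073960e9],
    'Gas:Facility': [2.640505e9, 2.537977e9],
})

# Total site energy per sample, then pick the lowest-energy design
outputs['Total'] = outputs['Electricity:Facility'] + outputs['Gas:Facility']
best = outputs.loc[outputs['Total'].idxmin()]
print('Best SHGC:', best['Solar Heat Gain Coefficient'])  # prints: Best SHGC: 0.145485
```

With only two samples this is trivial, but the same pattern scales to larger sample sets returned by `df_apply`.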