Cluster submission from notebook

This notebook shows how to take code from BESOS and run it on ComputeCanada cluster resources.
To run BESOS modules on ComputeCanada cluster resources, you will need:

  • A ComputeCanada account with the following installed:

    • python 3 and glpk (both can be loaded with module load).

    • bonmin and energyplus.

    • besos (pip installable).

You can also develop Python files that use the BESOS modules locally and submit them to the cluster via your usual method.

Overall process

The general process is as follows:

  • Write your code to a .py file.

    • As shown below, you can write out a single cell from your notebook by adding %%writefile filename.py at the top. This has the advantage that you can test your code on BESOS first, then uncomment this line to write the file.

    • To output a whole notebook, select File, Export Notebook As… and select Executable Script.

  • Write a batch file to submit it.

    • We use the %%writefile method below.

  • Execute the following steps in the terminal (here we use a window inside a notebook cell; you can also open a separate terminal tab):

    • Move the files over using SFTP (see here).

    • Submit the job.

    • Copy back the results files using SFTP.

  • Unpickle the results and continue post-processing.

Whilst this process is somewhat cumbersome, novice terminal users may find it convenient to use the cells below as a crib sheet of commands.
We recommend copying snippets of this notebook and the SFTP one together to make your own workflow.

Python file for execution on cluster

The following cell will write a Python file cluster.py to be submitted to the ComputeCanada cluster.

#%%writefile cluster.py
import pickle
import time

import numpy as np
import pandas as pd
from IPython.display import IFrame
from besos import eppy_funcs as ef, pyehub_funcs as pf, sampling
from besos.evaluator import EvaluatorEH, EvaluatorEP
from besos.parameters import (
    FieldSelector,
    FilterSelector,
    GenericSelector,
    Parameter,
    RangeParameter,
    expand_plist,
    wwr,
)
from besos.problem import EHProblem, EPProblem, Problem


now = time.time() # get the starting timestamp

building = ef.get_building() # load example file if no idf filename is provided
parameters = expand_plist(
    {'NonRes Fixed Assembly Window':
          {'Solar Heat Gain Coefficient':(0.01,0.99)}}
    )
objectives = ['Electricity:Facility', 'Gas:Facility'] # these get made into `MeterReader` or `VariableReader`
problem = EPProblem(parameters, objectives) # problem = parameters + objectives
evaluator = EvaluatorEP(problem, building) # evaluator = problem + building
samples = sampling.dist_sampler(sampling.lhs, problem, 2)
outputs = evaluator.df_apply(samples, keep_input=True)

passedtime = round(time.time()-now,2)
timestr = 'Time to evaluate '+str(len(samples))+' samples: '+str(passedtime)+' seconds.'

with open('time.cluster', 'wb') as timecluster:
    pickle.dump(timestr, timecluster)
with open('op.out', 'wb') as op:
    pickle.dump(outputs, op)

timestr
'Time to evaluate 2 samples: 3.76 seconds.'
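
The two pickle.dump calls above are the only link between the cluster run and the notebook, so it can be worth checking the round trip locally before submitting. The following is a minimal sketch using only the standard library. The helper names save_result and load_result are hypothetical (they are not part of BESOS), and a plain string stands in for the real outputs.

```python
import pickle
import tempfile
from pathlib import Path


def save_result(obj, path):
    """Pickle obj to path, in the same way cluster.py writes its results."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_result(path):
    """Load a pickled object back from path."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for the real results; any picklable object is handled the same way.
timestr = "Time to evaluate 2 samples: 3.76 seconds."

# Write to a temporary folder so the check leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "time.cluster"
    save_result(timestr, target)
    restored = load_result(target)

print(restored == timestr)
```

The same check can be applied to the outputs DataFrame. Note that a pickled DataFrame may fail to load if the pandas versions on the cluster and on BESOS differ significantly, so it is worth keeping them close.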

Batch file

The following cell will write a batch file clusterbatch.sh used for submitting our job.

%%writefile clusterbatch.sh
#!/bin/bash
#SBATCH --account=def-revins
#SBATCH --time=00:10:00
#SBATCH --nodes=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=1000mb
#SBATCH --output=%x-%j.out

echo "Current work dir: `pwd`"
echo "Starting run at: `date`"

echo "Job ID: $SLURM_JOB_ID"

echo "prog started at: `date`"
mpiexec python cluster.py
echo "prog ended at: `date`"
Overwriting clusterbatch.sh
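
If you submit several jobs with different accounts or time limits, it may be easier to generate the batch file from Python than to edit it by hand. The sketch below rebuilds the main lines of clusterbatch.sh from a few arguments. The function make_batch_script and its parameter names are hypothetical; the #SBATCH directives and their default values are copied from the batch file above.

```python
def make_batch_script(account, script, time_limit="00:10:00", cpus=4, mem="1000mb"):
    """Return the text of a Slurm batch file that runs a Python script.

    Hypothetical helper: the directives mirror clusterbatch.sh above.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --account={account}",
        f"#SBATCH --time={time_limit}",
        "#SBATCH --nodes=1",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem}",
        "#SBATCH --output=%x-%j.out",
        "",
        'echo "Job ID: $SLURM_JOB_ID"',
        f"mpiexec python {script}",
    ]
    return "\n".join(lines) + "\n"


batch_text = make_batch_script("def-revins", "cluster.py")
print(batch_text)
```

To write the file, pass the returned text to open("clusterbatch.sh", "w").write(...), or paste it into a %%writefile cell.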

File transfers

Now we need to transfer the following files to the cluster using SFTP, as described in this notebook:

  • cluster.py

  • clusterbatch.sh

Note that these files must be placed in a folder residing in /scratch (e.g. /scratch/job) on the cluster, since we do not have access to submit jobs from /home.
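
Before opening the SFTP session, it can help to confirm that both files were actually written. This is a minimal sketch, assuming the files sit in a single local folder; the function missing_files is a hypothetical helper, and the temporary folder only exists to make the example self-contained.

```python
import tempfile
from pathlib import Path

# The two files named in this section.
REQUIRED_FILES = ["cluster.py", "clusterbatch.sh"]


def missing_files(folder, required=REQUIRED_FILES):
    """Return the names in required that are not regular files in folder."""
    folder = Path(folder)
    return [name for name in required if not (folder / name).is_file()]


# Demonstration in a temporary folder that contains only one of the two files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "cluster.py").write_text("print('hello')\n")
    still_missing = missing_files(tmp)

print(still_missing)
```

In the notebook itself, missing_files(".") would check the current working directory; an empty list means both files are ready to transfer.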

Job submission

SSH login on the server

Get the terminal inside the notebook:

your_gitlab_username = 'mdrpanwar'  # change this to your username
IFrame("https://hub.besos.uvic.ca/user/"+your_gitlab_username+"/terminals/2", width=1200, height=250)

If you have trouble getting the $ prompt, change the number at the end of the URL above to 3 (or higher) to start a new terminal.

  • Execute ssh -Y cc_username@cluster_name.computecanada.ca in the terminal, e.g. ssh -Y mpanwar@cedar.computecanada.ca

  • Enter the password when prompted.

  • This should give you the [cc_username@cluster_name<login_node> ~]$ prompt, e.g. [mpanwar@cedar1 ~]$

Submitting the job

  • Execute module load python/3.7

  • Execute module load glpk/4.61

  • Assuming you have transferred the required files (as mentioned above), we will now submit the job.

  • Navigate to the directory inside /scratch where you transferred your files (i.e. /scratch/job in this example).

    • Execute cd scratch/job

  • Verify that this folder contains cluster.py and clusterbatch.sh by executing ls.

  • Execute sbatch clusterbatch.sh.

  • You can check the status of the job by executing squeue -u cc_username, e.g. squeue -u mpanwar.

Getting the results

The job is finished when the output of squeue -u cc_username no longer lists any jobs.
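
The completion check described above can be sketched as a small text test. It assumes the text comes from squeue -h -u cc_username, where the -h (--noheader) option suppresses the header row, so that every non-blank line is one job. The function names and the sample job line are hypothetical; the actual squeue column layout depends on the cluster configuration.

```python
def count_jobs(squeue_text):
    """Count the jobs listed in header-free squeue output (one job per line)."""
    return sum(1 for line in squeue_text.splitlines() if line.strip())


def job_finished(squeue_text):
    """Return True when squeue lists no jobs for the user."""
    return count_jobs(squeue_text) == 0


# Hypothetical output while a job is still running (column layout is illustrative).
running = "12345678 mpanwar def-revins clusterbatch.sh R 9:12 1 4\n"
# Output once the queue is empty for this user.
done = ""

print(job_finished(running), job_finished(done))
```

Copy the squeue output from the terminal into a string to use it, since the notebook kernel is not running on the cluster.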

Go back to the second half of the SFTP guide and get the results files, in this case:

  • time.cluster

  • op.out

Loading the results

You can now unpickle the results and continue post-processing in BESOS.

# pickle was imported in the cluster.py cell above
with open('time.cluster', 'rb') as f:
    passedtime = pickle.load(f)
with open('op.out', 'rb') as f:
    outputs = pickle.load(f)

print(passedtime)
outputs
Time to evaluate 2 samples: 3.76 seconds.
   Solar Heat Gain Coefficient  Electricity:Facility  Gas:Facility
0                     0.118725          1.775400e+09  2.643277e+09
1                     0.638928          1.939934e+09  2.591439e+09
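
As a first post-processing step, the two objectives can be combined into a total energy figure per sample. The sketch below copies the two rows from the table above into plain dictionaries so that it runs without pandas. It assumes the meter values are in joules (the default unit of EnergyPlus meters, not stated in this notebook), and the function total_energy_kwh is a hypothetical helper.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 MJ

# Rows copied from the results table above.
rows = [
    {"Solar Heat Gain Coefficient": 0.118725,
     "Electricity:Facility": 1.775400e09,
     "Gas:Facility": 2.643277e09},
    {"Solar Heat Gain Coefficient": 0.638928,
     "Electricity:Facility": 1.939934e09,
     "Gas:Facility": 2.591439e09},
]


def total_energy_kwh(row):
    """Sum electricity and gas use, converted from joules (assumed) to kWh."""
    return (row["Electricity:Facility"] + row["Gas:Facility"]) / JOULES_PER_KWH


# Pick the sample with the lowest combined energy use.
best = min(rows, key=total_energy_kwh)

for row in rows:
    shgc = row["Solar Heat Gain Coefficient"]
    print(f"SHGC {shgc:.6f}: {total_energy_kwh(row):.1f} kWh")
print("Lowest total at SHGC", best["Solar Heat Gain Coefficient"])
```

With the real DataFrame, outputs.to_dict("records") should give a list of rows in this shape, so the same function can be applied to the unpickled results.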