Fit feedforward Neural Network model With Dask¶
This notebook takes the “Fit feedforward Neural Network model” notebook and parallelizes its processes using Dask. It skips explanations of code unrelated to Dask; refer to the “Fit feedforward Neural Network model” notebook for those details.
First, import the packages and start a Dask client, which launches a local scheduler and workers.
import joblib
from besos import eppy_funcs as ef, sampling
from besos.evaluator import EvaluatorEP, EvaluatorGeneric
from besos.problem import EPProblem
from dask.distributed import Client
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
import warnings
from parameter_sets import parameter_set
client = Client()
client
We gather the parameters and the building, then create the problem and evaluator.
parameters = parameter_set(7)
building = ef.get_building()
# Single objective: facility electricity use
problem = EPProblem(parameters, ["Electricity:Facility"])
evaluator = EvaluatorEP(problem, building)
When df_apply is called, the dataframe is processed concurrently. The processes parameter sets the number of partitions the dataframe is divided into. If you are running this notebook locally, you can open the Dask dashboard using the link provided by the client object (see the first cell in the notebook, where we initialized Client). On the dashboard, you can see which processes are running.
%%time
inputs = sampling.dist_sampler(sampling.lhs, problem, 50)
outputs = evaluator.df_apply(inputs, processes=4)
inputs
CPU times: user 1.47 s, sys: 287 ms, total: 1.75 s
Wall time: 23.3 s
 | Conductivity | Thickness | U-Factor | Solar Heat Gain Coefficient | ElectricEquipment | Lights | Window to Wall Ratio |
---|---|---|---|---|---|---|---|
0 | 0.055023 | 0.289414 | 0.971157 | 0.453890 | 11.879346 | 12.800983 | 0.566172 |
1 | 0.161703 | 0.151686 | 0.248658 | 0.433111 | 14.650401 | 14.239511 | 0.888224 |
2 | 0.146857 | 0.293566 | 3.995612 | 0.905855 | 12.620180 | 14.457510 | 0.120768 |
3 | 0.193907 | 0.242702 | 0.570291 | 0.814085 | 12.934642 | 13.796069 | 0.085228 |
4 | 0.050311 | 0.180439 | 3.084850 | 0.061345 | 10.312514 | 13.480864 | 0.677188 |
5 | 0.132940 | 0.119451 | 1.384960 | 0.863318 | 12.853962 | 14.055337 | 0.160043 |
6 | 0.059651 | 0.106822 | 3.443967 | 0.919450 | 10.687633 | 10.351821 | 0.366714 |
7 | 0.098363 | 0.170353 | 4.623611 | 0.548621 | 13.039621 | 10.172123 | 0.729395 |
8 | 0.166916 | 0.231849 | 4.274812 | 0.371251 | 12.335803 | 14.189433 | 0.395333 |
9 | 0.185077 | 0.298956 | 2.575012 | 0.737015 | 14.540488 | 13.018182 | 0.592655 |
10 | 0.039453 | 0.239708 | 1.237301 | 0.714868 | 13.738186 | 13.961779 | 0.019614 |
11 | 0.155364 | 0.120459 | 3.664835 | 0.128758 | 13.968844 | 10.897971 | 0.348100 |
12 | 0.114585 | 0.261586 | 1.784999 | 0.489818 | 11.676686 | 11.817922 | 0.958709 |
13 | 0.074132 | 0.113085 | 0.168243 | 0.361055 | 12.403512 | 10.030331 | 0.780304 |
14 | 0.125671 | 0.172620 | 2.243686 | 0.386179 | 14.729687 | 13.321995 | 0.977003 |
15 | 0.197555 | 0.145545 | 4.553300 | 0.625993 | 11.044226 | 10.291775 | 0.702575 |
16 | 0.111633 | 0.209954 | 2.939609 | 0.465666 | 14.208762 | 11.182091 | 0.763889 |
17 | 0.099961 | 0.138505 | 3.765241 | 0.030199 | 12.058338 | 10.654501 | 0.038554 |
18 | 0.032217 | 0.235183 | 4.716094 | 0.224655 | 13.895373 | 12.428741 | 0.442455 |
19 | 0.087545 | 0.250956 | 2.493906 | 0.972490 | 13.501133 | 11.628705 | 0.629117 |
20 | 0.079181 | 0.109904 | 3.152474 | 0.801282 | 13.471743 | 11.307619 | 0.925028 |
21 | 0.186814 | 0.212415 | 1.760755 | 0.766256 | 14.987510 | 14.726486 | 0.176318 |
22 | 0.088964 | 0.160811 | 1.490295 | 0.123413 | 10.849416 | 13.278679 | 0.263240 |
23 | 0.122988 | 0.284068 | 2.429327 | 0.592769 | 10.470285 | 12.325289 | 0.405024 |
24 | 0.026182 | 0.204374 | 4.931655 | 0.156479 | 11.135790 | 12.527733 | 0.235172 |
25 | 0.151733 | 0.256406 | 0.770869 | 0.517453 | 10.936935 | 12.189044 | 0.105411 |
26 | 0.129251 | 0.226350 | 4.456843 | 0.238534 | 11.440466 | 14.341981 | 0.146293 |
27 | 0.158677 | 0.279158 | 1.049345 | 0.082118 | 11.503272 | 13.661504 | 0.675995 |
28 | 0.044414 | 0.133110 | 0.489562 | 0.682752 | 11.953141 | 12.231597 | 0.484533 |
29 | 0.047032 | 0.220143 | 2.046230 | 0.402257 | 13.301214 | 14.895942 | 0.830231 |
30 | 0.139758 | 0.201843 | 4.852871 | 0.572439 | 12.562844 | 14.610093 | 0.608437 |
31 | 0.145235 | 0.185034 | 3.405676 | 0.536250 | 11.372711 | 11.752122 | 0.288744 |
32 | 0.022745 | 0.124947 | 1.601113 | 0.655625 | 10.097336 | 11.988407 | 0.871940 |
33 | 0.105943 | 0.158020 | 3.567968 | 0.662704 | 12.759740 | 14.547573 | 0.646999 |
34 | 0.067095 | 0.281144 | 2.657010 | 0.932130 | 14.058247 | 13.580063 | 0.810015 |
35 | 0.095066 | 0.266337 | 0.605780 | 0.248594 | 13.143416 | 12.748950 | 0.318998 |
36 | 0.178696 | 0.142271 | 3.887490 | 0.299538 | 14.449008 | 10.709081 | 0.277599 |
37 | 0.189534 | 0.100527 | 2.833761 | 0.179466 | 11.209249 | 12.931609 | 0.503187 |
38 | 0.136791 | 0.154748 | 3.021646 | 0.276294 | 14.889010 | 10.416115 | 0.432814 |
39 | 0.176826 | 0.270708 | 4.078908 | 0.845699 | 12.276264 | 11.480877 | 0.202349 |
40 | 0.106603 | 0.274593 | 0.875880 | 0.101857 | 11.742971 | 13.101020 | 0.221728 |
41 | 0.059221 | 0.130004 | 4.122003 | 0.606201 | 10.706087 | 12.061205 | 0.892999 |
42 | 0.169417 | 0.164906 | 1.134970 | 0.872666 | 14.375552 | 11.061248 | 0.323963 |
43 | 0.073724 | 0.219424 | 2.096114 | 0.957559 | 10.143611 | 11.256390 | 0.836476 |
44 | 0.035543 | 0.191936 | 1.891513 | 0.013676 | 10.211157 | 11.525980 | 0.531643 |
45 | 0.081527 | 0.198758 | 3.251619 | 0.192598 | 13.253799 | 14.983523 | 0.934755 |
46 | 0.064496 | 0.176742 | 0.329414 | 0.715988 | 14.191205 | 12.682700 | 0.744238 |
47 | 0.028722 | 0.255675 | 2.270806 | 0.791544 | 12.183257 | 13.852691 | 0.059637 |
48 | 0.118118 | 0.195794 | 4.325877 | 0.304809 | 13.637606 | 10.525400 | 0.555531 |
49 | 0.172703 | 0.244258 | 1.308726 | 0.340048 | 10.511005 | 10.924541 | 0.476248 |
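The partitioning idea described above can be sketched without BESOS or Dask. The snippet below is only an illustration and is not how df_apply is implemented: `simulate_row` and `apply_in_partitions` are hypothetical names, and a standard-library thread pool stands in for the Dask workers.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import pandas as pd


def simulate_row(row):
    # Hypothetical stand-in for one expensive evaluation (e.g. a simulation run).
    return row["a"] + 2 * row["b"]


def apply_in_partitions(df, func, partitions=4):
    # Split the row positions into roughly equal chunks, one per partition.
    positions = np.array_split(np.arange(len(df)), partitions)
    chunks = [df.iloc[idx] for idx in positions]
    with ThreadPoolExecutor(max_workers=partitions) as pool:
        # Each worker applies func row by row to its own chunk.
        parts = list(pool.map(lambda chunk: chunk.apply(func, axis=1), chunks))
    # pool.map returns results in submission order, so row order is preserved.
    return pd.concat(parts)


df = pd.DataFrame({"a": np.arange(10.0), "b": np.ones(10)})
result = apply_in_partitions(df, simulate_row, partitions=4)
print(result.head())
```

More partitions only help while there are idle workers to take them, so a partition count close to the number of available cores is a reasonable starting point.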
Set up model parameters¶
In this cell, we set up the model. More detail can be found in the “Fit feedforward Neural Network model” notebook.
# Hold out 20% of the samples for testing
train_in, test_in, train_out, test_out = train_test_split(
    inputs, outputs, test_size=0.2
)
# Standardize inputs and outputs; both scalers are fitted on the training split only
scaler = StandardScaler()
inputs = scaler.fit_transform(X=train_in)
scaler_out = StandardScaler()
outputs = scaler_out.fit_transform(X=train_out)
# Grid of network shapes and L2 regularization strengths to search
hyperparameters = {
    "hidden_layer_sizes": (
        (len(parameters) * 16,),
        (len(parameters) * 16, len(parameters) * 16),
    ),
    "alpha": [1, 10, 10 ** 3],
}
neural_net = MLPRegressor(max_iter=1000, early_stopping=False)
folds = 3
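To get a feel for how much work the grid search involves, the candidates can be counted with scikit-learn's ParameterGrid. This is a small sketch that assumes seven parameters, matching the seven sampled input columns; `n_parameters`, `n_candidates` and `n_fits` are names introduced here for illustration.

```python
from sklearn.model_selection import ParameterGrid

n_parameters = 7  # assumption: matches the seven sampled input columns
folds = 3

hyperparameters = {
    "hidden_layer_sizes": (
        (n_parameters * 16,),
        (n_parameters * 16, n_parameters * 16),
    ),
    "alpha": [1, 10, 10 ** 3],
}

# Every combination of layer layout and alpha is one candidate model.
candidates = list(ParameterGrid(hyperparameters))
n_candidates = len(candidates)
# Each candidate is fitted once per cross-validation fold.
n_fits = n_candidates * folds
print(n_candidates, n_fits)  # 6 18
```

These 18 cross-validation fits are independent of each other, which is what makes the search easy to spread across workers. With the default refit=True, GridSearchCV then fits the best candidate once more on the full training set.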
Model fitting with Dask¶
Here, we use the NN model from scikit-learn. In a different example we use TensorFlow (with and without the Keras wrapper).
Below we parallelize the model fit. Normally, scikit-learn uses joblib to parallelize model fitting. By specifying Dask as the parallel backend, joblib switches over to using the Dask scheduler. For this example, using Dask may not be any faster, because joblib can already parallelize across local cores. This tool becomes useful when Dask runs on a distributed network with access to more cores.
%%time
# Route joblib's parallel work through the Dask scheduler
with joblib.parallel_backend("dask"):
    # Note: the iid argument was removed in scikit-learn 0.24; drop it on newer versions
    clf = GridSearchCV(neural_net, hyperparameters, iid=True, cv=folds)
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=FutureWarning)
        clf.fit(inputs, outputs.ravel())
# best_score_ is the mean cross-validated score of the best candidate
print(f"Best performing model $R^2$ score on training set: {clf.best_score_}")
print(f"Model $R^2$ parameters: {clf.best_params_}")
print(
    f"Best performing model $R^2$ score on a separate test set: {clf.best_estimator_.score(scaler.transform(test_in), scaler_out.transform(test_out))}"
)
Best performing model $R^2$ score on training set: 0.9859359222733858
Model $R^2$ parameters: {'alpha': 1, 'hidden_layer_sizes': (112,)}
Best performing model $R^2$ score on a separate test set: 0.9973443498852271
CPU times: user 565 ms, sys: 69.8 ms, total: 635 ms
Wall time: 4.38 s
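The backend switch used in the cell above is a general joblib mechanism, not something specific to scikit-learn. As a minimal sketch, the example below uses joblib's built-in "threading" backend so that it runs without a cluster; `square` is a hypothetical toy task. With a dask.distributed Client running, "dask" could be passed instead.

```python
import joblib
from joblib import Parallel, delayed


def square(x):
    # Hypothetical toy task standing in for one model fit.
    return x * x


# Any Parallel call inside this block uses the selected backend and n_jobs.
with joblib.parallel_backend("threading", n_jobs=2):
    squares = Parallel()(delayed(square)(i) for i in range(6))

print(squares)  # [0, 1, 4, 9, 16, 25]
```

Because scikit-learn calls joblib internally, wrapping clf.fit in the same kind of block is enough to redirect its cross-validation fits, with no change to the estimator code.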
Surrogate Modelling Evaluator object¶
We can wrap the fitted model in a BESOS Evaluator. This has identical behaviour to the original EnergyPlus Evaluator object. The parallelization occurs when calling the df_apply function.
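def evaluation_func(ind, scaler=scaler):
    # Scale a single input row the same way as the training data
    ind = scaler.transform(X=[ind])
    # Predict in scaled space, then convert back to the original output units
    return (scaler_out.inverse_transform(clf.predict(ind))[0],)

NN_SM = EvaluatorGeneric(evaluation_func, problem)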
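The scale, predict and inverse-scale pattern in the evaluation function can be tried in isolation. The sketch below is an assumption-laden stand-in: it uses synthetic data, swaps the neural network for a LinearRegression so the result is deterministic, and introduces the hypothetical names `x_scaler`, `y_scaler` and `evaluate`. It also keeps every array two-dimensional, which is what scikit-learn transformers expect.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 3))
# Synthetic large-magnitude target, roughly the scale of annual energy use in joules.
y = 1e9 + 1e8 * (X @ np.array([1.0, 2.0, 3.0]))

x_scaler = StandardScaler().fit(X)
y_scaler = StandardScaler().fit(y.reshape(-1, 1))

# LinearRegression is a stand-in for the fitted neural network.
model = LinearRegression().fit(
    x_scaler.transform(X), y_scaler.transform(y.reshape(-1, 1)).ravel()
)


def evaluate(individual):
    # Scale one input row, predict in scaled space, then undo the output scaling.
    scaled_in = x_scaler.transform(np.asarray(individual).reshape(1, -1))
    scaled_out = model.predict(scaled_in).reshape(-1, 1)
    return (float(y_scaler.inverse_transform(scaled_out)[0, 0]),)


prediction = evaluate([0.5, 0.5, 0.5])
print(prediction)
```

Returning a one-element tuple mirrors the evaluation function above, where each objective of the problem occupies one position in the result.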
Running a large surrogate evaluation¶
Here we bump up the sample count to 50,000 and partition the data into 4. (If you have more cores available, feel free to try increasing the number of processes.)
%%time
inputs = sampling.dist_sampler(sampling.lhs, problem, 50000)
outputs = NN_SM.df_apply(inputs, processes=4)
results = inputs.join(outputs)
results.head()
CPU times: user 724 ms, sys: 149 ms, total: 873 ms
Wall time: 9.38 s
 | Conductivity | Thickness | U-Factor | Solar Heat Gain Coefficient | ElectricEquipment | Lights | Window to Wall Ratio | Electricity:Facility |
---|---|---|---|---|---|---|---|---|
0 | 0.192527 | 0.153194 | 4.481314 | 0.567995 | 11.835818 | 13.052947 | 0.857530 | 2.057846e+09 |
1 | 0.080337 | 0.110154 | 2.589947 | 0.811769 | 11.163278 | 13.104261 | 0.966630 | 1.995323e+09 |
2 | 0.095729 | 0.285002 | 4.567989 | 0.364190 | 14.112162 | 12.881720 | 0.017776 | 2.154193e+09 |
3 | 0.156267 | 0.214472 | 2.800968 | 0.401643 | 11.264160 | 11.086406 | 0.247729 | 1.882376e+09 |
4 | 0.196672 | 0.273954 | 2.064289 | 0.267733 | 10.874480 | 10.699428 | 0.980151 | 1.845306e+09 |