Architectures
Energy Flow Networks (EFNs) and Particle Flow Networks (PFNs) are model architectures designed for learning from collider events as unordered, variable-length sets of particles. Both EFNs and PFNs are parameterized by a learnable per-particle function \Phi and latent space function F.
An EFN takes the following form:
\text{EFN} = F\left(\sum_{i=1}^M z_i \Phi(\hat p_i)\right)
where M is the number of particles in the event, z_i is a measure of the energy of particle i, such as z_i=p_{T,i}, and \hat p_i is a measure of the angular information of particle i, such as \hat p_i = (y_i,\phi_i). Any infrared- and collinear-safe observable can be parameterized in this form.
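As a minimal sketch in plain Python (not the EnergyFlow implementation), an EFN can be read as F applied to the energy-weighted sum of \Phi(\hat p_i) over particles. The functions `phi` and `f` below are hypothetical stand-ins for the learned \Phi and F. Because each particle enters linearly through z_i, reordering the particles or splitting one particle into two collinear pieces that share its energy leaves the output unchanged.

```python
import math

def phi(phat):
    # Hypothetical stand-in for the learned per-particle map \Phi:
    # sends angular coordinates (y, phi) to two latent features.
    y, ph = phat
    return [math.cos(y) * math.cos(ph), y * y + ph * ph]

def f(latent):
    # Hypothetical stand-in for the learned backend function F.
    return math.tanh(latent[0]) + 0.5 * latent[1]

def efn(zs, phats):
    # Latent observables: energy-weighted sum of \Phi over particles.
    latent = [0.0, 0.0]
    for z, phat in zip(zs, phats):
        for a, value in enumerate(phi(phat)):
            latent[a] += z * value
    return f(latent)

zs = [0.5, 0.3, 0.2]
phats = [(0.1, -0.2), (-0.3, 0.4), (0.0, 0.1)]
base = efn(zs, phats)

# Reordering the particles does not change the output.
permuted = efn(zs[::-1], phats[::-1])

# Collinear splitting: particle 0 becomes two particles at the same angle,
# each carrying half of its energy.
split = efn([0.25, 0.25, 0.3, 0.2], [phats[0], phats[0], phats[1], phats[2]])
```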
A PFN takes the following form:
\text{PFN} = F\left(\sum_{i=1}^M \Phi(p_i)\right)
where M is the number of particles in the event and p_i is the information of particle i, such as its four-momentum, charge, or flavor. Any observable can be parameterized in this form. See the Deep Sets framework for additional discussion.
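A PFN can be sketched the same way, with F applied to an unweighted sum of \Phi(p_i). In the plain-Python toy below, `phi` and `f` are hypothetical stand-ins for the learned functions, and the four features per particle (pt, y, phi, charge) are an illustrative choice. The per-particle map may depend nonlinearly on the energy, which is what separates this form from the EFN.

```python
def phi(p):
    # Hypothetical stand-in for the learned per-particle map \Phi acting on
    # the full particle features (pt, y, phi, charge).
    pt, y, ph, charge = p
    return [pt * pt, pt * (y + ph), charge]

def f(latent):
    # Hypothetical stand-in for the learned backend function F.
    return latent[0] - latent[1] + 2.0 * latent[2]

def pfn(particles):
    # Latent observables: unweighted sum of \Phi over particles.
    latent = [0.0, 0.0, 0.0]
    for p in particles:
        for a, value in enumerate(phi(p)):
            latent[a] += value
    return f(latent)

event = [(0.5, 0.1, -0.2, 1.0), (0.3, -0.3, 0.4, -1.0), (0.2, 0.0, 0.1, 0.0)]
value = pfn(event)

# The sum over particles makes the output independent of their order.
shuffled = pfn([event[2], event[0], event[1]])
```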
Since these architectures are not used by the core EnergyFlow code and require the external TensorFlow and scikit-learn libraries, they are not imported by default and must be explicitly imported, e.g. from energyflow.archs import *. EnergyFlow also contains several additional model architectures for ease of using common models that frequently appear at the intersection of particle physics and machine learning.
ArchBase
Base class for all architectures contained in EnergyFlow. The mechanism of specifying hyperparameters for all architectures is described here. Methods common to all architectures are documented here. Note that this class cannot be instantiated directly as it is an abstract base class.
energyflow.archs.archbase.ArchBase(*args, **kwargs)
Accepts arbitrary arguments. Positional arguments (if present) are dictionaries of hyperparameters; keyword arguments (if present) are hyperparameters given directly. Keyword hyperparameters take precedence over positional hyperparameter dictionaries.
Arguments
- *args : arbitrary positional arguments
- Each argument is a dictionary containing hyperparameter (name, value) pairs.
- **kwargs : arbitrary keyword arguments
- Hyperparameters as keyword arguments. Takes precedence over the positional arguments.
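The precedence rule can be illustrated with a short sketch. The helper `merge_hyperparameters` is hypothetical and not part of EnergyFlow; it also assumes that later positional dictionaries override earlier ones, which the text above does not specify.

```python
def merge_hyperparameters(*args, **kwargs):
    # Hypothetical helper, not part of EnergyFlow: collects hyperparameters
    # in the order the text describes.
    hps = {}
    for d in args:
        # Assumption: later dictionaries override earlier ones.
        hps.update(d)
    # Keyword arguments are applied last, so they take precedence.
    hps.update(kwargs)
    return hps

defaults = {'input_dim': 2, 'Phi_sizes': (100, 100), 'F_sizes': (100,)}
hps = merge_hyperparameters(defaults, {'patience': 5}, Phi_sizes=(64, 32))
```

Here the keyword value of Phi_sizes replaces the one in the positional dictionary, while input_dim, F_sizes, and patience pass through unchanged.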
Default NN Hyperparameters
Common hyperparameters that apply to all architectures except for LinearClassifier.
Compilation Options
- loss='categorical_crossentropy' : str
- The loss function to use for the model. See the Keras loss function docs for available loss functions.
- optimizer='adam' : Keras optimizer or str
- A Keras optimizer instance or a string referring to one (in which case the default arguments are used).
- metrics=['accuracy'] : list of str
- The Keras metrics to apply to the model.
- compile_opts={} : dict
- Dictionary of keyword arguments to be passed on to the compile method of the model. loss, optimizer, and metrics (see above) are included in this dictionary. All other values are the Keras defaults.
Output Options
- output_dim=2 : int
- The output dimension of the model.
- output_act='softmax' : str or Keras activation
- Activation function to apply to the output.
Callback Options
- filepath=None : str
- The file path for where to save the model. If None then the model will not be saved.
- save_while_training=True : bool
- Whether the model is saved during training (using the ModelCheckpoint callback) or only once training terminates. Only relevant if filepath is set.
- save_weights_only=False : bool
- Whether only the weights of the model or the full model are saved. Only relevant if filepath is set.
- modelcheck_opts={'save_best_only':True, 'verbose':1} : dict
- Dictionary of keyword arguments to be passed on to the ModelCheckpoint callback, if it is present. save_weights_only (see above) is included in this dictionary. All other arguments are the Keras defaults.
- patience=None : int
- The number of epochs with no improvement after which the training is stopped (using the EarlyStopping callback). If None then no early stopping is used.
- earlystop_opts={'restore_best_weights':True, 'verbose':1} : dict
- Dictionary of keyword arguments to be passed on to the EarlyStopping callback, if it is present. patience (see above) is included in this dictionary. All other arguments are the Keras defaults.
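One way to picture how the callback options interact is the sketch below. The helper `plan_callbacks` is hypothetical: it returns (callback name, keyword arguments) pairs instead of real Keras objects so that the logic can be inspected without TensorFlow. How user-supplied option dictionaries combine with the listed defaults is an assumption here, not a statement about the EnergyFlow source.

```python
def plan_callbacks(filepath=None, save_while_training=True,
                   save_weights_only=False, modelcheck_opts=None,
                   patience=None, earlystop_opts=None):
    # Hypothetical helper: describes which Keras callbacks would be created.
    callbacks = []
    if filepath is not None and save_while_training:
        # Assumption: user options are layered on top of the listed defaults.
        opts = {'save_best_only': True, 'verbose': 1}
        opts.update(modelcheck_opts or {})
        opts['save_weights_only'] = save_weights_only
        opts['filepath'] = filepath
        callbacks.append(('ModelCheckpoint', opts))
    if patience is not None:
        opts = {'restore_best_weights': True, 'verbose': 1}
        opts.update(earlystop_opts or {})
        opts['patience'] = patience
        callbacks.append(('EarlyStopping', opts))
    return callbacks

plan = plan_callbacks(filepath='efn.h5', patience=8)
```

With neither filepath nor patience set, the plan is empty, which matches the documented defaults of no saving and no early stopping.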
Flags
- name_layers=True : bool
- Whether to give the layers of the model explicit names or let them be named automatically. One reason to set this to False would be in order to use parts of this model in another model (all Keras layers in a model are required to have unique names).
- compile=True : bool
- Whether the model should be compiled or not.
- summary=True : bool
- Whether a summary should be printed or not.
fit
fit(*args, **kwargs)
Train the model by fitting the provided training dataset and labels.
Transparently calls the .fit() method of the underlying model.
Arguments
- *args : numpy.ndarray or tensorflow.data.Dataset
- Either the X_train and Y_train NumPy arrays or a TensorFlow dataset.
- kwargs : dict
- Keyword arguments passed on to the .fit() method of the underlying model. Most relevant for neural network models, where the TensorFlow/Keras model docs contain detailed information on the possible arguments.
Returns
- The return value of the underlying model's .fit() method.
predict
predict(X_test, **kwargs)
Evaluate the model on a dataset. Note that for the LinearClassifier this corresponds to the predict_proba method of the underlying scikit-learn model.
Arguments
- X_test : numpy.ndarray
- The dataset to evaluate the model on.
- kwargs : dict
- Keyword arguments passed on to the underlying model when predicting on a dataset.
Returns
- numpy.ndarray
- The value of the model on the input dataset.
properties
model
model
The underlying model held by this architecture. Note that accessing an attribute that the architecture does not have will result in an attempt to retrieve the attribute from this model. This allows for interrogation of the EnergyFlow architecture in the same manner as the underlying model.
Examples
- For neural network models: model.layers will return a list of the layers, where model is any EnergyFlow neural network.
- For linear models: model.coef_ will return the coefficients, where model is any EnergyFlow LinearClassifier instance.
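The attribute forwarding described for the model property is a standard Python pattern. The sketch below uses hypothetical classes `ArchSketch` and `ToyModel` to show the idea; it is not the ArchBase code.

```python
class ArchSketch:
    # Hypothetical wrapper illustrating attribute forwarding.
    def __init__(self, model):
        self._model = model

    @property
    def model(self):
        # The underlying model held by the wrapper.
        return self._model

    def __getattr__(self, name):
        # Python calls __getattr__ only when normal lookup fails, so
        # attributes defined on the wrapper win and everything else is
        # looked up on the underlying model.
        return getattr(self._model, name)

class ToyModel:
    # Stand-in for a Keras or scikit-learn model.
    layers = ['dense_0', 'dense_1']

arch = ArchSketch(ToyModel())
layers = arch.layers  # not defined on ArchSketch, so taken from ToyModel
```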
EFN
Energy Flow Network (EFN) architecture.
energyflow.archs.EFN(*args, **kwargs)
See ArchBase for how to pass in hyperparameters as well as defaults common to all EnergyFlow neural network models.
Required EFN Hyperparameters
- input_dim : int
- The number of features for each particle.
- Phi_sizes (formerly ppm_sizes) : {tuple, list} of int
- The sizes of the dense layers in the per-particle frontend module \Phi. The last element will be the number of latent observables that the model defines.
- F_sizes (formerly dense_sizes) : {tuple, list} of int
- The sizes of the dense layers in the backend module F.
Default EFN Hyperparameters
- Phi_acts='relu' (formerly ppm_acts) : {tuple, list} of str or Keras activation
- Activation function(s) for the dense layers in the per-particle frontend module \Phi. A single string or activation layer will apply the same activation to all layers. Keras advanced activation layers are also accepted, either as strings (which use the default arguments) or as Keras Layer instances. If passing a single Layer instance, be aware that this layer will be used for all activations and may introduce weight sharing (such as with PReLU); it is recommended in this case to pass as many activations as there are layers in the model. See the Keras activations docs for more detail.
- F_acts='relu' (formerly dense_acts) : {tuple, list} of str or Keras activation
- Activation function(s) for the dense layers in the backend module F. A single string or activation layer will apply the same activation to all layers.
- Phi_k_inits='he_uniform' (formerly ppm_k_inits) : {tuple, list} of str or Keras initializer
- Kernel initializers for the dense layers in the per-particle frontend module \Phi. A single string will apply the same initializer to all layers. See the Keras initializer docs for more detail.
- F_k_inits='he_uniform' (formerly dense_k_inits) : {tuple, list} of str or Keras initializer
- Kernel initializers for the dense layers in the backend module F. A single string will apply the same initializer to all layers.
- latent_dropout=0 : float
- Dropout rate for the summation layer that defines the value of the latent observables on the inputs. See the Keras Dropout layer for more detail.
- F_dropouts=0 (formerly dense_dropouts) : {tuple, list} of float
- Dropout rates for the dense layers in the backend module F. A single float will apply the same dropout rate to all dense layers.
- Phi_l2_regs=0 : {tuple, list} of float
- L_2-regularization strength for both the weights and biases of the layers in the \Phi network. A single float will apply the same L_2-regularization to all layers.
- F_l2_regs=0 : {tuple, list} of float
- L_2-regularization strength for both the weights and biases of the layers in the F network. A single float will apply the same L_2-regularization to all layers.
- mask_val=0 : float
- The value for which particles with all features set equal to this value will be ignored. The Keras Masking layer appears to have issues masking the biases of a network, so this has been implemented in a custom (and correct) manner since version 0.12.0.
- num_global_features=None : int
- Number of additional features to be concatenated with the latent space observables to form the input to F. If not None, then the features are to be provided at the end of the list of inputs.
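Putting the EFN hyperparameters together, a training call might look like the sketch below. The hyperparameter names come from the lists above, while the values, the function name `train_efn`, and the use of Keras' validation_split are illustrative assumptions. The import is placed inside the function because energyflow.archs requires TensorFlow.

```python
# Hyperparameters named in the lists above; the values are placeholders.
efn_hps = {
    'input_dim': 2,                # e.g. (y, phi) for each particle
    'Phi_sizes': (100, 100, 128),  # last entry = number of latent observables
    'F_sizes': (100, 100, 100),
    'latent_dropout': 0.1,
    'patience': 5,
}

def train_efn(zs, phats, labels, hps, epochs=5, batch_size=500):
    # Hypothetical wrapper. Assumed array shapes: zs is (events, particles),
    # phats is (events, particles, input_dim), and labels is one-hot with
    # shape (events, output_dim).
    from energyflow.archs import EFN  # lazy import: needs TensorFlow

    # Positional dictionary plus a keyword hyperparameter.
    efn = EFN(hps, summary=False)
    # The two EFN inputs are passed as a list: zs first, then phats.
    efn.fit([zs, phats], labels, epochs=epochs, batch_size=batch_size,
            validation_split=0.1, verbose=1)
    return efn, efn.predict([zs, phats], batch_size=batch_size)
```

A PFN would be built the same way from energyflow.archs.PFN, with a single input array in place of the [zs, phats] pair.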
eval_filters
eval_filters(patch, n=100, prune=True)
Evaluates the latent space filters of this model on a patch of the two-dimensional geometric input space.
Arguments
- patch : {tuple, list} of float
- Specifies the patch of the geometric input space to be evaluated. A list of length 4 is interpreted as [xmin, ymin, xmax, ymax]. Passing a single float R is equivalent to [-R,-R,R,R].
- n : {tuple, list} of int
- The number of grid points on which to evaluate the filters. A list of length 2 is interpreted as [nx, ny] where nx is the number of points along the x (or first) dimension and ny is the number of points along the y (or second) dimension.
- prune : bool
- Whether to remove filters that are all zero (which happens sometimes due to dying ReLUs).
Returns
- (numpy.ndarray, numpy.ndarray, numpy.ndarray)
- Returns three arrays, (X, Y, Z), where X and Y have shape (nx, ny) and are arrays of the values of the geometric inputs in the specified patch. Z has shape (num_filters, nx, ny) and is the value of the different filters at each point.
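The patch and n conventions can be made concrete with a small grid builder. The helper `make_grid` below is hypothetical and written in plain Python; it assumes that the grid includes both endpoints of the patch and that a single integer n means n points along each dimension.

```python
def make_grid(patch, n=100):
    # Hypothetical helper mirroring the documented argument conventions.
    if isinstance(patch, (int, float)):
        # A single number R stands for [-R, -R, R, R].
        xmin, ymin, xmax, ymax = -patch, -patch, patch, patch
    else:
        xmin, ymin, xmax, ymax = patch
    # Assumption: a single integer n is used for both dimensions.
    nx, ny = (n, n) if isinstance(n, int) else n

    def linspace(lo, hi, num):
        # Evenly spaced points including both endpoints (an assumption).
        if num == 1:
            return [lo]
        step = (hi - lo) / (num - 1)
        return [lo + k * step for k in range(num)]

    xs, ys = linspace(xmin, xmax, nx), linspace(ymin, ymax, ny)
    # Shape (nx, ny): the first index runs over x, the second over y.
    X = [[x for _ in ys] for x in xs]
    Y = [[y for y in ys] for _ in xs]
    return X, Y

X, Y = make_grid(0.4, n=(3, 5))
```

On a trained model, eval_filters(0.4, n=(3, 5)) would additionally return Z with shape (num_filters, 3, 5), as described in the Returns entry.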
properties
inputs
inputs
List of input tensors to the model. EFNs have two input tensors: inputs[0] corresponds to the zs input and inputs[1] corresponds to the phats input.
weights
weights
Weight tensor for the model. This is the zs input where entries equal to mask_val have been set to zero.
Phi
Phi
List of tensors corresponding to the layers in the \Phi network.
latent
latent
List of tensors corresponding to the summation layer in the network, including any dropout layer if present.
F
F
List of tensors corresponding to the layers in the F network.
output
output
Output tensor for the model.
PFN
Particle Flow Network (PFN) architecture. Accepts the same hyperparameters as the EFN.
energyflow.archs.PFN(*args, **kwargs)
properties
inputs
inputs
List of input tensors to the model. PFNs have one input tensor corresponding to the ps input.
weights
weights
Weight tensor for the model. A weight of 0 is assigned to any particle which has all features equal to mask_val, and 1 is assigned otherwise.
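The two weight definitions (for the EFN and the PFN) can be compared on a zero-padded event. The functions below are a plain-Python sketch of the documented behavior, with hypothetical names, and do not reproduce the tensor operations used inside the models.

```python
MASK_VAL = 0.0

def efn_weights(zs, mask_val=MASK_VAL):
    # EFN: the weights are the zs, with entries equal to mask_val set to zero.
    return [0.0 if z == mask_val else z for z in zs]

def pfn_weights(particles, mask_val=MASK_VAL):
    # PFN: weight 0 if every feature of the particle equals mask_val, else 1.
    return [0.0 if all(x == mask_val for x in p) else 1.0 for p in particles]

# One event padded to four particle slots; the last two slots are padding.
# The second particle has one feature equal to zero but is not masked,
# because not all of its features equal mask_val.
particles = [(0.6, 0.1, -0.2), (0.4, 0.0, 0.3), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
zs = [p[0] for p in particles]

w_pfn = pfn_weights(particles)
w_efn = efn_weights(zs)
```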
Phi
Phi
List of tensors corresponding to the layers in the \Phi network.
latent
latent
List of tensors corresponding to the summation layer in the network, including any dropout layer if present.
F
F
List of tensors corresponding to the layers in the F network.
output
output
Output tensor for the model.
CNN
Convolutional Neural Network architecture.
energyflow.archs.CNN(*args, **kwargs)
See ArchBase for how to pass in hyperparameters as well as defaults common to all EnergyFlow neural network models.
Required CNN Hyperparameters
- input_shape : {tuple, list} of int
- The shape of a single jet image. Assuming that data_format is set to channels_first, this is (nb_chan,npix,npix).
- filter_sizes : {tuple, list} of int
- The size of the filters, which are taken to be square, in each convolutional layer of the network. The length of the list will be the number of convolutional layers in the network.
- num_filters : {tuple, list} of int
- The number of filters in each convolutional layer. The length of num_filters must match that of filter_sizes.
Default CNN Hyperparameters
- dense_sizes=None : {tuple, list} of int
- The sizes of the dense layer backend. A value of None is equivalent to an empty list.
- pool_sizes=0 : {tuple, list} of int
- Size of maxpooling filter, taken to be a square. A value of 0 will not use maxpooling.
- conv_acts='relu' : {tuple, list} of str or Keras activation
- Activation function(s) for the conv layers. A single string or activation layer will apply the same activation to all conv layers. Keras advanced activation layers are also accepted, either as strings (which use the default arguments) or as Keras Layer instances. If passing a single Layer instance, be aware that this layer will be used for all activations and may introduce weight sharing (such as with PReLU); it is recommended in this case to pass as many activations as there are layers in the model. See the Keras activations docs for more detail.
- dense_acts='relu' : {tuple, list} of str or Keras activation
- Activation function(s) for the dense layers. A single string or activation layer will apply the same activation to all dense layers.
- conv_k_inits='he_uniform' : {tuple, list} of str or Keras initializer
- Kernel initializers for the convolutional layers. A single string will apply the same initializer to all layers. See the Keras initializer docs for more detail.
- dense_k_inits='he_uniform' : {tuple, list} of str or Keras initializer
- Kernel initializers for the dense layers. A single string will apply the same initializer to all layers.
- conv_dropouts=0 : {tuple, list} of float
- Dropout rates for the convolutional layers. A single float will apply the same dropout rate to all conv layers. See the Keras Dropout layer for more detail.
- num_spatial2d_dropout=0 : int
- The number of convolutional layers, starting from the beginning of the model, for which to apply SpatialDropout2D instead of Dropout.
- dense_dropouts=0 : {tuple, list} of float
- Dropout rates for the dense layers. A single float will apply the same dropout rate to all dense layers.
- paddings='valid' : {tuple, list} of str
- Controls how the filters are convolved with the inputs. See the Keras Conv2D layer for more detail.
- data_format='channels_last' : {'channels_first', 'channels_last'}
- Sets which axis is expected to contain the different channels. 'channels_first' appears to have issues with newer versions of TensorFlow, so prefer 'channels_last'.
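When choosing filter_sizes and pool_sizes, it can help to track how the image side length shrinks from layer to layer. The helper `conv_output_sizes` below is a hypothetical sketch that assumes stride-1 convolutions, each followed by optional max pooling whose stride equals its size (the Keras default). That layer ordering is an assumption and is not taken from the EnergyFlow source.

```python
def conv_output_sizes(npix, filter_sizes, pool_sizes=0, padding='valid'):
    # Hypothetical helper: side length of the square image after each
    # convolution (stride 1) and optional max pooling (stride = pool size).
    if isinstance(pool_sizes, int):
        pool_sizes = [pool_sizes] * len(filter_sizes)
    sizes = []
    for k, pool in zip(filter_sizes, pool_sizes):
        if padding == 'valid':
            # No padding: a k x k filter removes k - 1 pixels per side length.
            npix = npix - k + 1
        # Any other value is treated as 'same', which keeps the size.
        if pool > 0:
            npix = npix // pool
        sizes.append(npix)
    return sizes

# A 33 x 33 jet image with three convolutional layers and 2 x 2 pooling.
sizes = conv_output_sizes(33, filter_sizes=[8, 4, 4], pool_sizes=2)
```

Under these assumptions the side lengths come out as 13, 5, and 1, so a fourth convolutional layer with 'valid' padding would not fit.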
DNN
Dense Neural Network architecture.
energyflow.archs.DNN(*args, **kwargs)
See ArchBase for how to pass in hyperparameters as well as defaults common to all EnergyFlow neural network models.
Required DNN Hyperparameters
- input_dim : int
- The number of inputs to the model.
- dense_sizes : {tuple, list} of int
- The number of nodes in the dense layers of the model.
Default DNN Hyperparameters
- acts='relu' : {tuple, list} of str or Keras activation
- Activation function(s) for the dense layers. A single string or activation layer will apply the same activation to all dense layers. Keras advanced activation layers are also accepted, either as strings (which use the default arguments) or as Keras Layer instances. If passing a single Layer instance, be aware that this layer will be used for all activations and may introduce weight sharing (such as with PReLU); it is recommended in this case to pass as many activations as there are layers in the model. See the Keras activations docs for more detail.
- k_inits='he_uniform' : {tuple, list} of str or Keras initializer
- Kernel initializers for the dense layers. A single string will apply the same initializer to all layers. See the Keras initializer docs for more detail.
- dropouts=0 : {tuple, list} of float
- Dropout rates for the dense layers. A single float will apply the same dropout rate to all layers. See the Keras Dropout layer for more detail.
- l2_regs=0 : {tuple, list} of float
- L_2-regularization strength for both the weights and biases of the dense layers. A single float will apply the same L_2-regularization to all layers.
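Several of the hyperparameters above accept either a single value or one value per layer. A minimal sketch of that convention follows; the helper `per_layer` is hypothetical and is not the EnergyFlow implementation.

```python
def per_layer(value, num_layers):
    # Hypothetical helper: expand a single setting to one entry per layer,
    # or check that an explicit sequence has the right length.
    if isinstance(value, (list, tuple)):
        if len(value) != num_layers:
            raise ValueError('expected %d values, got %d'
                             % (num_layers, len(value)))
        return list(value)
    # The same object is repeated for every layer.
    return [value] * num_layers

dense_sizes = (100, 100, 50)
acts = per_layer('relu', len(dense_sizes))
dropouts = per_layer((0.0, 0.1, 0.2), len(dense_sizes))
```

Because a single value is repeated as the same object, a single Keras Layer instance passed this way would be shared by every layer, which is the weight-sharing caveat mentioned for activations such as PReLU.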
LinearClassifier
Linear classifier that can be either Fisher's linear discriminant or logistic regression. Relies on the scikit-learn implementations of these classifiers.
energyflow.archs.LinearClassifier(*args, **kwargs)
See ArchBase for how to pass in hyperparameters.
Default Hyperparameters
- linclass_type='lda' : {'lda', 'lr'}
- Controls which type of linear classifier is used. 'lda' corresponds to LinearDiscriminantAnalysis and 'lr' to LogisticRegression. If using 'lr' all arguments are passed on directly to the scikit-learn class.
Linear Discriminant Analysis Hyperparameters
- solver='svd' : {'svd', 'lsqr', 'eigen'}
- Which LDA solver to use.
- tol=1e-12 : float
- Threshold used for rank estimation. Notably not a convergence parameter.
Logistic Regression Hyperparameters
- LR_hps={} : dict
- Dictionary of keyword arguments to pass on to the underlying LogisticRegression model.
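The mapping from these hyperparameters to scikit-learn estimators can be sketched as follows. The function `make_linear_model` is hypothetical and only illustrates the selection described above; the scikit-learn imports are placed inside the branches so that the sketch can be read and loaded without scikit-learn installed.

```python
def make_linear_model(linclass_type='lda', solver='svd', tol=1e-12, LR_hps=None):
    # Hypothetical helper: pick the scikit-learn estimator named by
    # linclass_type, using the documented default values.
    if linclass_type == 'lda':
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
        return LinearDiscriminantAnalysis(solver=solver, tol=tol)
    if linclass_type == 'lr':
        from sklearn.linear_model import LogisticRegression
        # LR_hps is forwarded unchanged to the scikit-learn class.
        return LogisticRegression(**(LR_hps or {}))
    raise ValueError("linclass_type must be 'lda' or 'lr'")
```

Both estimators provide fit and predict_proba, which is what the predict method of the architecture calls for a LinearClassifier.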