### Particle Tools

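Tools for dealing with particle momentum four-vectors. A four-vector can be expressed either in Cartesian coordinates, `[e,px,py,pz]` (energy, momentum in the `x` direction, momentum in the `y` direction, momentum in the `z` direction), or in hadronic coordinates, `[pt,y,phi,m]` (transverse momentum, rapidity, azimuthal angle, mass), which are related via:

$$p_T=\sqrt{p_x^2+p_y^2},\quad y=\text{arctanh}\,\frac{p_z}{E},\quad \phi=\arctan_2\frac{p_y}{p_x},\quad m=\sqrt{E^2-p_x^2-p_y^2-p_z^2}$$

and inversely:

$$E=\cosh y\sqrt{p_T^2+m^2},\quad p_x=p_T\cos\phi,\quad p_y=p_T\sin\phi,\quad p_z=\sinh y\sqrt{p_T^2+m^2}.$$

The pseudorapidity `eta` can be obtained from a Cartesian four-momentum as:

$$\eta=\text{arctanh}\,\frac{p_z}{|\vec p|},\quad |\vec p|\equiv\sqrt{p_x^2+p_y^2+p_z^2},$$

and is related to the rapidity via

$$\eta=\text{arcsinh}\left(\sinh y\,\left(1+m^2/p_T^2\right)^{1/2}\right),\quad y=\text{arcsinh}\left(\sinh\eta\,\left(1+m^2/p_T^2\right)^{-1/2}\right).$$

Note that the above formulas are numerically stable up to rapidity or pseudorapidity values of a few hundred. Beyond that range, different but equivalent formulas that remain numerically stable are used. In all cases, the $p_T\to0$ limit produces infinite values.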
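As a concrete illustration of these coordinate relations, here is a minimal NumPy sketch for a single particle. The helper names `to_hadronic` and `to_cartesian` are hypothetical and not part of the package. The sketch applies the textbook formulas directly and does not reproduce the numerically stable variants used at large rapidity.

```python
import numpy as np

def to_hadronic(p4):
    """Convert one Cartesian four-vector [e,px,py,pz] to [pt,y,phi,m] (hypothetical helper)."""
    e, px, py, pz = p4
    pt = np.hypot(px, py)
    y = np.arctanh(pz / e)
    phi = np.arctan2(py, px) % (2 * np.pi)  # map the angle into [0, 2pi)
    m = np.sqrt(max(e**2 - px**2 - py**2 - pz**2, 0.0))  # clip tiny negative m^2
    return np.array([pt, y, phi, m])

def to_cartesian(ptyphim):
    """Convert one hadronic four-vector [pt,y,phi,m] back to [e,px,py,pz] (hypothetical helper)."""
    pt, y, phi, m = ptyphim
    mt = np.sqrt(pt**2 + m**2)  # transverse mass
    return np.array([mt * np.cosh(y), pt * np.cos(phi), pt * np.sin(phi), mt * np.sinh(y)])

p4 = np.array([10.0, 3.0, 4.0, 5.0])
roundtrip = to_cartesian(to_hadronic(p4))  # should reproduce p4 up to rounding
```

The round trip is a quick sanity check that the forward and inverse relations agree.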

In the context of this package, an "event" is a two-dimensional numpy array with shape `(M,4)`, where `M` is the multiplicity. An array of events is a three-dimensional array with shape `(N,M,4)`, where `N` is the number of events. The valid inputs and outputs of the functions here are described using this terminology.

#### ptyphims_from_p4s

```
energyflow.ptyphims_from_p4s(p4s, phi_ref=None)
```

Convert to hadronic coordinates `[pt,y,phi,m]` from Cartesian coordinates. All-zero four-vectors are left alone.

**Arguments**

**p4s** : *numpy.ndarray* or *list* - A single particle, event, or array of events in Cartesian coordinates.

**phi_ref** : {`None`, `'hardest'`, *float*, *numpy.ndarray*} - Used to help deal with the fact that $\phi$ is a periodic coordinate. If a float (which should be in $[0,2\pi)$), all phi values will be within $\pm\pi$ of this reference value. If `'hardest'`, the phi of the hardest particle is used as the reference value. If `None`, all phis will be in the range $[0,2\pi)$. An array is accepted in the case that `p4s` is an array of events, in which case the `phi_ref` array should have shape `(N,)` where `N` is the number of events.

**Returns**

*numpy.ndarray* - An array of hadronic four-momenta with the same shape as the input.

#### pts_from_p4s

```
energyflow.pts_from_p4s(p4s)
```

Calculate the transverse momenta of a collection of four-vectors.

**Arguments**

**p4s** : *numpy.ndarray* or *list* - A single particle, event, or array of events in Cartesian coordinates.

**Returns**

*numpy.ndarray* - An array of transverse momenta with shape `p4s.shape[:-1]`.

#### pt2s_from_p4s

```
energyflow.pt2s_from_p4s(p4s)
```

Calculate the squared transverse momenta of a collection of four-vectors.

**Arguments**

**p4s** : *numpy.ndarray* or *list* - A single particle, event, or array of events in Cartesian coordinates.

**Returns**

*numpy.ndarray* - An array of squared transverse momenta with shape `p4s.shape[:-1]`.

#### ys_from_p4s

```
energyflow.ys_from_p4s(p4s)
```

Calculate the rapidities of a collection of four-vectors. Returns zero for all-zero particles.

**Arguments**

**p4s** : *numpy.ndarray* or *list* - A single particle, event, or array of events in Cartesian coordinates.

**Returns**

*numpy.ndarray* - An array of rapidities with shape `p4s.shape[:-1]`.

#### etas_from_p4s

```
energyflow.etas_from_p4s(p4s)
```

Calculate the pseudorapidities of a collection of four-vectors. Returns zero for all-zero particles.

**Arguments**

**p4s** : *numpy.ndarray* or *list* - A single particle, event, or array of events in Cartesian coordinates.

**Returns**

*numpy.ndarray* - An array of pseudorapidities with shape `p4s.shape[:-1]`.

#### phis_from_p4s

```
energyflow.phis_from_p4s(p4s, phi_ref=None)
```

Calculate the azimuthal angles of a collection of four-vectors.

**Arguments**

**p4s** : *numpy.ndarray* or *list* - A single particle, event, or array of events in Cartesian coordinates.

**phi_ref** : {*float*, *numpy.ndarray*, `None`, `'hardest'`} - Used to help deal with the fact that $\phi$ is a periodic coordinate. If a float (which should be in $[0,2\pi)$), all phi values will be within $\pm\pi$ of this reference value. If `'hardest'`, the phi of the hardest particle is used as the reference value. If `None`, all phis will be in the range $[0,2\pi)$. An array is accepted in the case that `p4s` is an array of events, in which case the `phi_ref` array should have shape `(N,)` where `N` is the number of events.

**Returns**

*numpy.ndarray* - An array of azimuthal angles with shape `p4s.shape[:-1]`.

#### m2s_from_p4s

```
energyflow.m2s_from_p4s(p4s)
```

Calculate the squared masses of a collection of four-vectors.

**Arguments**

**p4s** : *numpy.ndarray* or *list* - A single particle, event, or array of events in Cartesian coordinates.

**Returns**

*numpy.ndarray* - An array of squared masses with shape `p4s.shape[:-1]`.

#### ms_from_p4s

```
energyflow.ms_from_p4s(p4s)
```

Calculate the masses of a collection of four-vectors.

**Arguments**

**p4s** : *numpy.ndarray* or *list* - A single particle, event, or array of events in Cartesian coordinates.

**Returns**

*numpy.ndarray* - An array of masses with shape `p4s.shape[:-1]`.

#### ms_from_ps

```
energyflow.ms_from_ps(ps)
```

Calculate the masses of a collection of Lorentz vectors in two or more spacetime dimensions.

**Arguments**

**ps** : *numpy.ndarray* or *list* - A single particle, event, or array of events in Cartesian coordinates in $d\ge2$ spacetime dimensions.

**Returns**

*numpy.ndarray* - An array of masses with shape `ps.shape[:-1]`.

#### etas_from_pts_ys_ms

```
energyflow.etas_from_pts_ys_ms(pts, ys, ms)
```

Calculate pseudorapidities from transverse momenta, rapidities, and masses. All input arrays should have the same shape.

**Arguments**

**pts** : *numpy.ndarray* - Array of transverse momenta.

**ys** : *numpy.ndarray* - Array of rapidities.

**ms** : *numpy.ndarray* - Array of masses.

**Returns**

*numpy.ndarray* - Array of pseudorapidities with the same shape as `ys`.

#### ys_from_pts_etas_ms

```
energyflow.ys_from_pts_etas_ms(pts, etas, ms)
```

Calculate rapidities from transverse momenta, pseudorapidities, and masses. All input arrays should have the same shape.

**Arguments**

**pts** : *numpy.ndarray* - Array of transverse momenta.

**etas** : *numpy.ndarray* - Array of pseudorapidities.

**ms** : *numpy.ndarray* - Array of masses.

**Returns**

*numpy.ndarray* - Array of rapidities with the same shape as `etas`.
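The conversion between rapidity and pseudorapidity follows from writing the longitudinal momentum in two ways, $p_z=p_T\sinh\eta=\sqrt{p_T^2+m^2}\,\sinh y$. The sketch below implements both directions with plain NumPy. The function names are hypothetical, and the code does not include the large-rapidity safeguards described earlier in this section.

```python
import numpy as np

def etas_from_ys_sketch(pts, ys, ms):
    """eta = arcsinh(sinh(y) * sqrt(1 + m^2/pt^2)); assumes pt > 0."""
    return np.arcsinh(np.sinh(ys) * np.sqrt(1.0 + (ms / pts)**2))

def ys_from_etas_sketch(pts, etas, ms):
    """y = arcsinh(sinh(eta) / sqrt(1 + m^2/pt^2)); assumes pt > 0."""
    return np.arcsinh(np.sinh(etas) / np.sqrt(1.0 + (ms / pts)**2))

pts = np.array([10.0, 20.0, 5.0])
ys = np.array([0.5, -1.2, 2.0])
ms = np.array([0.0, 0.14, 0.94])

etas = etas_from_ys_sketch(pts, ys, ms)
ys_back = ys_from_etas_sketch(pts, etas, ms)  # round trip back to the rapidities
```

For a massless particle the two quantities coincide, and for a massive particle the pseudorapidity is larger in magnitude than the rapidity.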

#### p4s_from_ptyphims

```
energyflow.p4s_from_ptyphims(ptyphims)
```

Calculate Cartesian four-vectors from transverse momenta, rapidities, azimuthal angles, and (optionally) masses for each input.

**Arguments**

**ptyphims** : *numpy.ndarray* or *list* - A single particle, event, or array of events in hadronic coordinates. The mass is optional and if left out will be taken to be zero.

**Returns**

*numpy.ndarray* - An array of Cartesian four-vectors.

#### p4s_from_ptyphipids

```
energyflow.p4s_from_ptyphipids(ptyphipids, error_on_unknown=False)
```

Calculate Cartesian four-vectors from transverse momenta, rapidities, azimuthal angles, and particle IDs for each input. The particle IDs are used to look up the mass of the particle. Transverse momenta should have units of GeV when using this function.

**Arguments**

**ptyphipids** : *numpy.ndarray* or *list* - A single particle, event, or array of events in hadronic coordinates where the mass is replaced by the PDG ID of the particle.

**error_on_unknown** : *bool* - See the corresponding argument of `pids2ms`.

**Returns**

*numpy.ndarray* - An array of Cartesian four-vectors with the same shape as the input.

#### sum_ptyphims

```
energyflow.sum_ptyphims(ptyphims)
```

Add a collection of four-vectors that are expressed in hadronic coordinates by first converting to Cartesian coordinates and then summing.

**Arguments**

**ptyphims** : *numpy.ndarray* or *list* - An event or array of events in hadronic coordinates. The mass is optional and if left out will be taken to be zero. Passing a single particle does nothing.

**Returns**

*numpy.ndarray* - Array of summed four-vectors, in hadronic coordinates.
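Hadronic coordinates do not add linearly, which is why the sum goes through Cartesian coordinates. The following is a rough single-event sketch of that procedure in plain NumPy. The helper `sum_hadronic` is hypothetical, and the library's own handling of edge cases may differ.

```python
import numpy as np

def sum_hadronic(ptyphims):
    """Sum an event of [pt,y,phi,m] rows by going through Cartesian coordinates."""
    pts, ys, phis, ms = np.asarray(ptyphims, dtype=float).T
    mts = np.sqrt(pts**2 + ms**2)  # transverse masses
    # Cartesian components of the summed four-vector
    e = np.sum(mts * np.cosh(ys))
    px = np.sum(pts * np.cos(phis))
    py = np.sum(pts * np.sin(phis))
    pz = np.sum(mts * np.sinh(ys))
    # convert the total back to hadronic coordinates
    pt = np.hypot(px, py)
    y = 0.5 * np.log((e + pz) / (e - pz))
    phi = np.arctan2(py, px) % (2 * np.pi)
    m = np.sqrt(max(e**2 - px**2 - py**2 - pz**2, 0.0))
    return np.array([pt, y, phi, m])

# two back-to-back massless particles at central rapidity
event = np.array([[50.0, 0.0, 0.0, 0.0],
                  [50.0, 0.0, np.pi, 0.0]])
total = sum_hadronic(event)
```

In this example the transverse momenta cancel, so the summed vector has essentially zero `pt` and a mass equal to the total energy.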

#### sum_ptyphipids

```
energyflow.sum_ptyphipids(ptyphipids, error_on_unknown=False)
```

Add a collection of four-vectors that are expressed as `[pT,y,phi,pdgid]`.

**Arguments**

**ptyphipids** : *numpy.ndarray* or *list* - A single particle, event, or array of events in hadronic coordinates where the mass is replaced by the PDG ID of the particle.

**error_on_unknown** : *bool* - See the corresponding argument of `pids2ms`.

**Returns**

*numpy.ndarray* - Array of summed four-vectors, in hadronic coordinates.

#### pids2ms

```
energyflow.pids2ms(pids, error_on_unknown=False)
```

Map an array of Particle Data Group IDs to an array of the corresponding particle masses (in GeV).

**Arguments**

**pids** : *numpy.ndarray* or *list* - An array of numeric (float or integer) PDG ID values.

**error_on_unknown** : *bool* - Controls whether a `KeyError` is raised if an unknown PDG ID is encountered. If `False`, unknown PDG IDs will map to zero.

**Returns**

*numpy.ndarray* - An array of masses in GeV.
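The lookup behaviour can be pictured with a small dictionary-based sketch. The table `MASSES` below is a tiny illustrative subset, and both it and the helper `pids_to_masses` are hypothetical. Treating antiparticles via the absolute value of the ID is an assumption of this sketch, not a statement about the library's internals.

```python
import numpy as np

# Tiny illustrative table of PDG ID -> mass in GeV (photon, electron, charged pion, proton)
MASSES = {22: 0.0, 11: 0.000511, 211: 0.13957, 2212: 0.938272}

def pids_to_masses(pids, error_on_unknown=False):
    """Map PDG IDs to masses; unknown IDs give 0 unless error_on_unknown is True."""
    flat = np.asarray(pids).astype(int).ravel()
    masses = np.zeros(len(flat))
    for i, pid in enumerate(flat):
        key = abs(int(pid))  # assumption: antiparticles share the particle's mass
        if key in MASSES:
            masses[i] = MASSES[key]
        elif error_on_unknown:
            raise KeyError('unknown PDG ID {}'.format(pid))
    return masses.reshape(np.shape(pids))

masses = pids_to_masses([22, -211, 999999])
```

With the default `error_on_unknown=False`, the unrecognised ID in the last slot silently maps to zero.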

#### phi_fix

```
energyflow.phi_fix(phis, phi_ref, copy=True)
```

A function to ensure that all phis are within $\pi$ of `phi_ref`. It is assumed that all starting phi values are within $\pm 2\pi$ of `phi_ref`.

**Arguments**

**phis** : *numpy.ndarray* or *list* - Array of phi values.

**phi_ref** : {*float* or *numpy.ndarray*} - A reference value used so that all phis will be within $\pm\pi$ of this value. Should have a shape of `phis.shape[:-1]`.

**copy** : *bool* - Determines if `phis` are copied or not. If `False` then `phis` is modified in place.

**Returns**

*numpy.ndarray* - An array of the fixed phi values.
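To make the periodic wrapping concrete, here is a small sketch based on modular arithmetic. The helper `wrap_phis` is hypothetical and is not the library's implementation. It shifts each angle by a multiple of $2\pi$ and, unlike the documented function, does not rely on the inputs starting within $2\pi$ of the reference.

```python
import numpy as np

def wrap_phis(phis, phi_ref):
    """Shift each phi by a multiple of 2pi so that it lies in [phi_ref - pi, phi_ref + pi)."""
    phis = np.asarray(phis, dtype=float)
    return phi_ref + (phis - phi_ref + np.pi) % (2 * np.pi) - np.pi

phis = np.array([0.1, 6.2, 3.0])
fixed = wrap_phis(phis, phi_ref=0.0)
```

Here the value `6.2`, which sits just below $2\pi$, is moved to a small negative angle next to the reference, while the other two values are unchanged.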

#### flat_metric

```
energyflow.flat_metric(dim)
```

The Minkowski metric in `dim` spacetime dimensions in the mostly-minus convention.

**Arguments**

**dim** : *int* - The number of spacetime dimensions (thought to be four in our universe).

**Returns**

*1-d numpy.ndarray* - A `dim`-length, one-dimensional (not matrix) array equal to `[+1,-1,...,-1]`.
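A one-dimensional metric array of this kind is convenient because the invariant mass in any number of dimensions reduces to a weighted sum of squared components. The sketch below shows the idea. The names `mostly_minus_metric` and `masses_sketch` are hypothetical stand-ins, and clipping negative squared masses to zero is a choice made for this example.

```python
import numpy as np

def mostly_minus_metric(dim):
    """Return the 1-d array [+1, -1, ..., -1] of length dim."""
    metric = -np.ones(dim)
    metric[0] = 1.0
    return metric

def masses_sketch(ps):
    """Masses of vectors [E, p1, p2, ...] along the last axis, in any dimension >= 2."""
    ps = np.asarray(ps, dtype=float)
    m2 = np.sum(mostly_minus_metric(ps.shape[-1]) * ps**2, axis=-1)
    return np.sqrt(np.maximum(m2, 0.0))  # clip small negative values from rounding

event = np.array([[5.0, 3.0, 0.0, 0.0],   # mass 4
                  [1.0, 0.0, 0.0, 1.0]])  # massless
ms = masses_sketch(event)
```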

### Random Events

Functions to generate random sets of four-vectors. Includes an implementation of the RAMBO algorithm for sampling uniform M-body massless phase space. Also includes other functions for various random, non-center of momentum, and non-uniform sampling.

#### gen_random_events

```
energyflow.gen_random_events(nevents, nparticles, dim=4, mass=0.0)
```

Generate random events with a given number of particles in a given spacetime dimension. The spatial components of the momenta are distributed uniformly in $[-1,+1]$. These events are not guaranteed to uniformly sample phase space.

**Arguments**

**nevents** : *int* - Number of events to generate.

**nparticles** : *int* - Number of particles in each event.

**dim** : *int* - Number of spacetime dimensions.

**mass** : *float* or `'random'` - Mass of the particles to generate. Can be set to `'random'` to obtain a different random mass for each particle.

**Returns**

*numpy.ndarray* - An `(nevents,nparticles,dim)` array of events. The particles are specified as `[E,p1,p2,...]`. If `nevents` is 1 then that axis is dropped.
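The generation scheme described for this function can be approximated in a few lines of NumPy: draw the spatial components uniformly and fix the energy with the mass-shell condition. The function `random_events_sketch` and its `seed` parameter are hypothetical, and the sketch handles only a fixed numeric mass, not the `'random'` option or the dropping of a length-one event axis.

```python
import numpy as np

def random_events_sketch(nevents, nparticles, dim=4, mass=0.0, seed=0):
    """Events of shape (nevents, nparticles, dim) with spatial momenta uniform in [-1, 1)."""
    rng = np.random.default_rng(seed)
    spatial = rng.uniform(-1.0, 1.0, size=(nevents, nparticles, dim - 1))
    # on-shell energy: E = sqrt(m^2 + |p|^2)
    energies = np.sqrt(mass**2 + np.sum(spatial**2, axis=-1))
    return np.concatenate([energies[..., np.newaxis], spatial], axis=-1)

events = random_events_sketch(3, 5, dim=4, mass=0.0)
```

Because each component is drawn independently, the events are not in the center-of-momentum frame and do not sample phase space uniformly.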

#### gen_random_events_mcom

```
energyflow.gen_random_events_mcom(nevents, nparticles, dim=4)
```

Generate random events with a given number of massless particles in a given spacetime dimension. The momenta are made to sum to zero. These events are not guaranteed to uniformly sample phase space.

**Arguments**

**nevents** : *int* - Number of events to generate.

**nparticles** : *int* - Number of particles in each event.

**dim** : *int* - Number of spacetime dimensions.

**Returns**

*numpy.ndarray* - An `(nevents,nparticles,dim)` array of events. The particles are specified as `[E,p1,p2,...]`.

#### gen_massless_phase_space

```
energyflow.gen_massless_phase_space(nevents, nparticles, energy=1.0)
```

Implementation of the RAMBO algorithm for uniformly sampling massless M-body phase space for any center of mass energy.

**Arguments**

**nevents** : *int* - Number of events to generate.

**nparticles** : *int* - Number of particles in each event.

**energy** : *float* - Total center of mass energy of each event.

**Returns**

*numpy.ndarray* - An `(nevents,nparticles,4)` array of events. The particles are specified as `[E,p_x,p_y,p_z]`. If `nevents` is 1 then that axis is dropped.
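For readers unfamiliar with RAMBO, the sketch below outlines the standard massless algorithm for a single event: draw isotropic massless momenta with energies distributed as $E\,e^{-E}$, then boost and rescale so that the total momentum vanishes and the total energy equals the requested value. The function `rambo_sketch` and its `seed` argument are hypothetical, and this is an independent illustration rather than the package's code.

```python
import numpy as np

def rambo_sketch(nparticles, energy=1.0, seed=0):
    """One event of massless particles (nparticles >= 2) as an (nparticles, 4) array."""
    rng = np.random.default_rng(seed)
    r = 1.0 - rng.random((nparticles, 4))  # uniform in (0, 1], avoids log(0)
    # step 1: isotropic directions, energies q0 = -log(r3 * r4)
    cos_t = 2.0 * r[:, 0] - 1.0
    sin_t = np.sqrt(1.0 - cos_t**2)
    phi = 2.0 * np.pi * r[:, 1]
    q0 = -np.log(r[:, 2] * r[:, 3])
    directions = np.stack([sin_t * np.cos(phi), sin_t * np.sin(phi), cos_t], axis=1)
    q = q0[:, np.newaxis] * directions
    # step 2: boost to the rest frame of the total momentum and rescale the energy
    Q0, Q = q0.sum(), q.sum(axis=0)
    M = np.sqrt(Q0**2 - Q @ Q)
    b, x, gamma = -Q / M, energy / M, Q0 / M
    a = 1.0 / (1.0 + gamma)
    bq = q @ b
    p0 = x * (gamma * q0 + bq)
    p = x * (q + np.outer(q0 + a * bq, b))
    return np.column_stack([p0, p])

event = rambo_sketch(4, energy=10.0)
```

The resulting momenta stay massless because the second step is a Lorentz boost followed by a uniform rescaling.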

### Data Tools

Functions for dealing with datasets. These are not importable from the top level `energyflow` module, but must instead be imported from `energyflow.utils`.

#### get_examples

```
energyflow.utils.get_examples(path='~/.energyflow', which='all', overwrite=False)
```

Pulls examples from GitHub. To ensure availability of all examples update EnergyFlow to the latest version.

**Arguments**

**path** : *str* - The destination for the downloaded files. Note that `examples` is automatically appended to the end of this path.

**which** : {*list*, `'all'`} - List of examples to download, or the string `'all'` in which case all the available examples are downloaded.

**overwrite** : *bool* - Whether to overwrite existing files or not.

#### data_split

```
energyflow.utils.data_split(*args, train=-1, val=0.0, test=0.1, shuffle=True)
```

A function to split a dataset into train, test, and optionally validation datasets.

**Arguments**

**\*args** : arbitrary *numpy.ndarray* datasets - An arbitrary number of datasets, each required to have the same number of elements, as numpy arrays.

**train** : {*int*, *float*} - If a float, the fraction of elements to include in the training set. If an integer, the number of elements to include in the training set. The value `-1` is special and means include the remaining part of the dataset in the training dataset after the test and (optionally) val parts have been removed.

**val** : {*int*, *float*} - If a float, the fraction of elements to include in the validation set. If an integer, the number of elements to include in the validation set. The value `0` is special and means do not form a validation set.

**test** : {*int*, *float*} - If a float, the fraction of elements to include in the test set. If an integer, the number of elements to include in the test set.

**shuffle** : *bool* - A flag to control whether the dataset is shuffled prior to being split into parts.

**Returns**

*list* - A list of the split datasets in train, [val], test order. If datasets `X`, `Y`, and `Z` were given as `args` (and assuming a non-zero `val`), then [`X_train`, `X_val`, `X_test`, `Y_train`, `Y_val`, `Y_test`, `Z_train`, `Z_val`, `Z_test`] will be returned.
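The key property of such a split is that every dataset is indexed with the same permutation, so corresponding rows stay aligned. The sketch below illustrates this with a hypothetical helper `split_sketch` that accepts only fractional `val` and `test` values and always assigns the remainder to the training set. Its `seed` argument and the order in which indices are assigned are assumptions of the example.

```python
import numpy as np

def split_sketch(*arrays, val=0.0, test=0.1, shuffle=True, seed=0):
    """Split each array into train, [val], test parts using one shared permutation."""
    n = len(arrays[0])
    num_val, num_test = int(round(val * n)), int(round(test * n))
    num_train = n - num_val - num_test  # the remainder goes to training
    perm = np.random.default_rng(seed).permutation(n) if shuffle else np.arange(n)
    train_idx, val_idx, test_idx = np.split(perm, [num_train, num_train + num_val])
    out = []
    for arr in arrays:
        out.append(arr[train_idx])
        if num_val > 0:
            out.append(arr[val_idx])
        out.append(arr[test_idx])
    return out

X = np.arange(100)
Y = 2 * X
X_train, X_val, X_test, Y_train, Y_val, Y_test = split_sketch(X, Y, val=0.2, test=0.1)
```

The returned list follows the train, [val], test order per dataset, matching the layout described in the Returns entry.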

#### to_categorical

```
energyflow.utils.to_categorical(labels, num_classes=None)
```

One-hot encodes class labels.

**Arguments**

**labels** : *1-d numpy.ndarray* - Labels in the range `[0,num_classes)`.

**num_classes** : {*int*, `None`} - The total number of classes. If `None`, taken to be the maximum label plus one.

**Returns**

*2-d numpy.ndarray* - The one-hot encoded labels.
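One-hot encoding is easy to express with NumPy fancy indexing, as in the following sketch. The function `one_hot_sketch` is a hypothetical stand-in written for illustration.

```python
import numpy as np

def one_hot_sketch(labels, num_classes=None):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    if num_classes is None:
        num_classes = labels.max() + 1  # maximum label plus one
    encoded = np.zeros((len(labels), num_classes))
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

encoded = one_hot_sketch([0, 2, 1])
```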

#### remap_pids

```
energyflow.utils.remap_pids(events, pid_i=3)
```

Remaps PDG id numbers to small floats for use in a neural network. `events` are modified in place and nothing is returned.

**Arguments**

**events** : *3-d numpy.ndarray* - The events as an array of arrays of particles.

**pid_i** : *int* - The index corresponding to pid information along the last axis of `events`.

### Image Tools

Functions for dealing with image representations of events. These are not importable from the top level `energyflow` module, but must instead be imported from `energyflow.utils`.

#### pixelate

```
energyflow.utils.pixelate(jet, npix=33, img_width=0.8, nb_chan=1, norm=True, charged_counts_only=False)
```

A function for creating a jet image from an array of particles.

**Arguments**

**jet** : *numpy.ndarray* - An array of particles where each particle is of the form `[pt,y,phi,pid]` where the particle id column is only used if `nb_chan=2` and `charged_counts_only=True`.

**npix** : *int* - The number of pixels on one edge of the jet image, which is taken to be a square.

**img_width** : *float* - The size of one edge of the jet image in the rapidity-azimuth plane.

**nb_chan** : {`1`, `2`} - The number of channels in the jet image. If `1`, then only a $p_T$ channel is constructed (grayscale). If `2`, then both a $p_T$ channel and a count channel are formed (color).

**norm** : *bool* - Whether to normalize the $p_T$ pixels to sum to `1`.

**charged_counts_only** : *bool* - If making a count channel, whether to only include charged particles. Requires that `pid` information be given.

**Returns**

*3-d numpy.ndarray* - The jet image as a `(nb_chan, npix, npix)` array.
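The core of building a $p_T$ channel is a weighted two-dimensional histogram in the rapidity-azimuth plane. The sketch below shows this step for one channel. The function `pt_image_sketch` is hypothetical, it assumes the particle coordinates are already centered on the jet axis, and it omits the count channel and any centering that the library function may perform.

```python
import numpy as np

def pt_image_sketch(jet, npix=33, img_width=0.8, norm=True):
    """Bin particle pT into an (npix, npix) grid; assumes (y, phi) are centered at (0, 0)."""
    jet = np.asarray(jet, dtype=float)
    pts, ys, phis = jet[:, 0], jet[:, 1], jet[:, 2]
    edges = np.linspace(-img_width / 2, img_width / 2, npix + 1)
    # particles outside the image window are dropped by the histogram
    image, _, _ = np.histogram2d(ys, phis, bins=(edges, edges), weights=pts)
    if norm and image.sum() > 0:
        image = image / image.sum()
    return image

jet = np.array([[10.0, 0.0, 0.0],
                [5.0, 0.1, -0.1],
                [1.0, 2.0, 0.0]])  # the last particle lies outside the window
image = pt_image_sketch(jet)
```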

#### standardize

```
energyflow.utils.standardize(*args, channels=None, copy=False, reg=10**-10)
```

Normalizes each argument by the standard deviation of the pixels in `args[0]`. The expected use case would be `standardize(X_train, X_val, X_test)`.

**Arguments**

**\*args** : arbitrary *numpy.ndarray* datasets - An arbitrary number of datasets, each required to have the same shape in all but the first axis.

**channels** : *int* - A list of which channels (assumed to be the second axis) to standardize. `None` is interpreted to mean every channel.

**copy** : *bool* - Whether or not to copy the input arrays before modifying them.

**reg** : *float* - Small parameter used to avoid dividing by zero. It's important that this be kept consistent for images used with a given model.

**Returns**

*list* - A list of the now-standardized arguments.
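The reason all datasets are scaled by statistics of the first one is that validation and test images must be transformed exactly like the training images. A minimal sketch of that idea follows. The function `standardize_sketch` is hypothetical, and computing the standard deviation per pixel over the first (sample) axis is an assumption of this example.

```python
import numpy as np

def standardize_sketch(*args, reg=1e-10):
    """Divide every dataset by the per-pixel standard deviation of args[0]."""
    # assumption: statistics are taken over the first (sample) axis of args[0]
    std = np.std(args[0], axis=0) + reg  # reg guards against division by zero
    return [arg / std for arg in args]

rng = np.random.default_rng(0)
X_train = rng.normal(scale=3.0, size=(1000, 1, 4, 4))
X_test = rng.normal(scale=3.0, size=(200, 1, 4, 4))
X_train_std, X_test_std = standardize_sketch(X_train, X_test)
```

After this step the training pixels have unit standard deviation, while the test pixels are only approximately unit-variance because they were scaled with training statistics.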

#### zero_center

```
energyflow.utils.zero_center(*args, **kwargs)
```

Subtracts the mean of `args[0]` from the arguments. The expected use case would be `zero_center(X_train, X_val, X_test)`.

**Arguments**

**\*args** : arbitrary *numpy.ndarray* datasets - An arbitrary number of datasets, each required to have the same shape in all but the first axis.

**channels** : *int* - A list of which channels (assumed to be the second axis) to zero center. `None` is interpreted to mean every channel.

**copy** : *bool* - Whether or not to copy the input arrays before modifying them.

**Returns**

*list* - A list of the zero-centered arguments.
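Zero centering can be sketched in the same spirit: the mean is computed once from the first dataset and subtracted from all of them. The function `zero_center_sketch` is hypothetical, and taking the mean per pixel over the first (sample) axis is an assumption of this example.

```python
import numpy as np

def zero_center_sketch(*args):
    """Subtract the per-pixel mean of args[0] from every dataset."""
    # assumption: the mean is taken over the first (sample) axis of args[0]
    mean = np.mean(args[0], axis=0)
    return [arg - mean for arg in args]

rng = np.random.default_rng(1)
X_train = rng.normal(loc=5.0, size=(500, 1, 4, 4))
X_val = rng.normal(loc=5.0, size=(100, 1, 4, 4))
X_train_c, X_val_c = zero_center_sketch(X_train, X_val)
```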