ParticleDA.jl

ParticleDA.jl is a Julia package to perform data assimilation with particle filters, distributed using MPI.

Installation

To install the latest stable version of the package, open the Julia REPL, enter the package manager with ], then run the command

add ParticleDA

If you plan to develop the package (make changes, submit pull requests, etc.), run this command in package manager mode instead:

dev ParticleDA

This will automatically clone the repository to your local directory ~/.julia/dev/ParticleDA.

You can exit from the package manager mode by pressing CTRL + C or, alternatively, the backspace key when there is no input in the prompt.

Usage

After installing the package, you can start using it in Julia's REPL with

using ParticleDA

To run the particle filter you can use the function run_particle_filter:

ParticleDA.run_particle_filter — Function

run_particle_filter(
    init_model,
    input_file_path,
    observation_file_path,
    filter_type=BootstrapFilter,
    summary_stat_type=MeanAndVarSummaryStat;
    rng=Random.TaskLocalRNG()
) -> Tuple{Matrix, Union{NamedTuple, Nothing}}

Run particle filter. init_model is the function which initialises the model, and input_file_path is the path to the YAML file with the input parameters. observation_file_path is the path to the HDF5 file containing the observation sequence to perform filtering for. filter_type is the particle filter type to use; see ParticleFilter for the possible values. summary_stat_type is a type specifying the summary statistics of the particles to compute at each time step; see AbstractSummaryStat for the possible values. rng is the random number generator used to generate random variates while filtering; a seeded random number generator may be specified to ensure reproducible results. If running with multiple threads, a thread-safe generator such as Random.TaskLocalRNG (the default) must be used.

Returns a tuple containing the state particles representing an estimate of the filtering distribution at the final observation time (with each particle a column of the returned matrix) and a named tuple containing the estimated summary statistics of this final filtering distribution. If running on multiple ranks using MPI, the returned states array will correspond only to the particles local to this rank and the summary statistics will be returned only on the master rank with all other ranks returning nothing for their second return value.
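
As a minimal sketch of how the call and its return values fit together, the wrapper below runs the bootstrap filter and handles the case where the summary statistics are nothing on non-master ranks. The wrapper name run_filter_example and the seed value are assumptions made for illustration; init_model and the two file paths must be supplied by your own model and data.

```julia
using ParticleDA
using Random

# Hypothetical wrapper: `init_model` is whatever initialisation function your
# model provides, and the two paths point at your own YAML and HDF5 files.
function run_filter_example(init_model, input_file_path, observation_file_path)
    # Seed the task-local generator (arbitrary seed) to aim for repeatable runs.
    Random.seed!(1234)
    final_states, final_statistics = ParticleDA.run_particle_filter(
        init_model,
        input_file_path,
        observation_file_path,
        ParticleDA.BootstrapFilter,
        ParticleDA.MeanAndVarSummaryStat;
        rng=Random.TaskLocalRNG(),
    )
    # Each column of `final_states` is one particle local to this MPI rank.
    println("Local particles: ", size(final_states, 2))
    # Only the master rank receives the statistics; other ranks get `nothing`.
    if final_statistics !== nothing
        println("Statistics available: ", keys(final_statistics))
    end
    return final_states, final_statistics
end
```

Checking for nothing before using the second return value keeps the same script usable both in a single process and when launched on several MPI ranks.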

To simulate observations from the model (which can be used, for example, to test the filtering algorithms) you can use the function simulate_observations_from_model:

ParticleDA.simulate_observations_from_model — Function

simulate_observations_from_model(
    init_model, input_file_path, output_file_path; rng=Random.TaskLocalRNG()
) -> Matrix

Simulate observations from the state space model initialised by the init_model function, with parameters specified by the model key in the input YAML file at input_file_path, and save the simulated observation and state sequences to an HDF5 file at output_file_path. rng is the random number generator used to generate random variates while simulating from the model; a seeded random number generator may be specified to ensure reproducible results.

The input YAML file at input_file_path should have a simulate_observations key whose value is a dictionary with keys n_time_step and seed, corresponding respectively to the number of time steps to generate observations for and the seed used to initialise the state of the random number generator used to simulate the observations.

The simulated observation sequence is returned as a matrix with columns corresponding to the observation vectors at each time step.
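
The layout of the input file described above can be sketched by writing one from Julia. The simulate_observations, n_time_step, seed and model keys come from the description above; the numeric values and the state_dimension entry under model are hypothetical, since the model parameters depend entirely on what your init_model function expects.

```julia
# Contents of a hypothetical input file. Only the key names
# `simulate_observations`, `n_time_step`, `seed` and `model` are taken from the
# documentation; the values and `state_dimension` are illustrative.
input_yaml = """
simulate_observations:
  n_time_step: 100
  seed: 1234
model:
  state_dimension: 4
"""

# Write the file to a temporary directory so the sketch leaves no clutter.
input_file_path = joinpath(mktempdir(), "input.yaml")
write(input_file_path, input_yaml)

# With a model available, observations could then be simulated with
#   ParticleDA.simulate_observations_from_model(
#       init_model, input_file_path, "observations.h5"
#   )
```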

The filter_type argument to run_particle_filter should be a concrete subtype of the ParticleFilter abstract type.

ParticleDA.ParticleFilter — Type

ParticleFilter

Abstract type for particle filters. Currently implemented subtypes are:

BootstrapFilter
OptimalFilter

ParticleDA.BootstrapFilter — Type

BootstrapFilter <: ParticleFilter

Singleton type BootstrapFilter. This can be passed as an argument to run_particle_filter to select the bootstrap filter.

ParticleDA.OptimalFilter — Type

OptimalFilter <: ParticleFilter

Singleton type OptimalFilter. This can be passed as an argument to run_particle_filter to select the optimal proposal filter (for conditionally linear-Gaussian models).

The summary_stat_type argument to run_particle_filter should be a concrete subtype of the AbstractSummaryStat abstract type.

ParticleDA.AbstractSummaryStat — Type

AbstractSummaryStat

Abstract type for summary statistics of the particle ensemble. Concrete subtypes can be passed as the summary_stat_type argument to run_particle_filter to specify which summary statistics to record and how they are computed.

ParticleDA.AbstractCustomReductionSummaryStat — Type

AbstractCustomReductionSummaryStat <: AbstractSummaryStat

Abstract type for summary statistics computed using custom MPI reductions. Allows greater flexibility in computing statistics, which can support more numerically stable implementations, but at the cost of not being compatible with all CPU architectures. In particular, MPI.jl does not currently support custom operators on PowerPC and ARM architectures.

ParticleDA.AbstractSumReductionSummaryStat — Type

AbstractSumReductionSummaryStat <: AbstractSummaryStat

Abstract type for summary statistics computed using standard MPI sum reductions. Compatible with a wider range of CPU architectures but may require less numerically stable implementations.

ParticleDA.MeanAndVarSummaryStat — Type

MeanAndVarSummaryStat <: AbstractCustomReductionSummaryStat

Custom reduction based summary statistic type which computes the means and variances of the particle ensemble for each state dimension. On CPU architectures which do not support custom reductions NaiveMeanAndVarSummaryStat can be used instead.
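
To illustrate why a custom reduction helps, here is a sketch of one standard way to combine per-rank means and variances: the pairwise update of Chan et al. It is not taken from the ParticleDA source, and the names PartialStats, merge_stats and variance are hypothetical. Each rank is represented by a count, a mean and a sum of squared deviations, and two such records merge associatively, which is the kind of operation a custom MPI reduction operator can express but a plain sum cannot.

```julia
# Partial statistics for one group of values: count, mean and the sum of
# squared deviations from the mean (often called M2).
struct PartialStats
    n::Int
    mean::Float64
    m2::Float64
end

# Statistics of a single value.
PartialStats(x::Real) = PartialStats(1, float(x), 0.0)

# Associative merge of two partial results (pairwise update of Chan et al.).
function merge_stats(a::PartialStats, b::PartialStats)
    n = a.n + b.n
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta^2 * a.n * b.n / n
    return PartialStats(n, mean, m2)
end

# Unbiased sample variance from the merged record.
variance(s::PartialStats) = s.m2 / (s.n - 1)

# Two pretend "ranks", each reducing its local particles before the merge.
rank_a = reduce(merge_stats, PartialStats.([1.0e9 + 4, 1.0e9 + 7]))
rank_b = reduce(merge_stats, PartialStats.([1.0e9 + 13, 1.0e9 + 16]))
combined = merge_stats(rank_a, rank_b)
println("mean = ", combined.mean, ", variance = ", variance(combined))
```

Because only differences between means enter the update, the large common offset of the values does not degrade the result.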

ParticleDA.MeanSummaryStat — Type

MeanSummaryStat <: AbstractCustomReductionSummaryStat

Custom reduction based summary statistic type which computes the means of the particle ensemble for each state dimension. On CPU architectures which do not support custom reductions NaiveMeanSummaryStat can be used instead.

ParticleDA.NaiveMeanAndVarSummaryStat — Type

NaiveMeanAndVarSummaryStat <: AbstractSumReductionSummaryStat

Sum reduction based summary statistic type which computes the means and variances of the particle ensemble for each state dimension. The mean and variance are computed by directly accumulating the sums of the particle values, the squared particle values and the number of particles on each rank, with the variance computed as the scaled difference between the sum of the squares and the square of the sum. This 'naive' implementation avoids custom MPI reductions but can be numerically unstable for large ensembles or state components with large values. If custom reductions are supported by the CPU architecture in use, the more numerically stable MeanAndVarSummaryStat should be used instead.
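
The instability mentioned above can be reproduced in a few lines. The sketch below is independent of ParticleDA and uses hypothetical function names: it compares the sum-of-squares formula with a two-pass calculation on four values that share a large offset. The exact sample variance of these values is 30.

```julia
# Naive approach: accumulate the sum, the sum of squares and the count, then
# take the scaled difference between the sum of squares and the squared sum.
function naive_variance(x)
    n = length(x)
    s = sum(x)
    s2 = sum(abs2, x)
    return (s2 - s^2 / n) / (n - 1)
end

# Two-pass approach: compute the mean first, then sum squared deviations.
function two_pass_variance(x)
    n = length(x)
    m = sum(x) / n
    return sum(xi -> (xi - m)^2, x) / (n - 1)
end

# Values with a large common offset: the squares are around 1e18, where
# adjacent Float64 numbers are more than 100 apart.
x = 1.0e9 .+ [4.0, 7.0, 13.0, 16.0]
println("naive:    ", naive_variance(x))
println("two-pass: ", two_pass_variance(x))
```

The two-pass result is exact here, while the naive result is dominated by rounding error in the squared terms.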

ParticleDA.NaiveMeanSummaryStat — Type

NaiveMeanSummaryStat <: AbstractSumReductionSummaryStat

Sum reduction based summary statistic type which computes the means of the particle ensemble for each state dimension. The mean is computed by directly accumulating the sums of the particle values and number of particles on each rank. If custom reductions are supported by the CPU architecture in use the more numerically stable MeanSummaryStat should be used instead.
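
A script that must run on several machines could pick between the custom reduction and naive types at run time. The sketch below is a heuristic only: the function names are hypothetical, and the rule that Julia builds whose Sys.ARCH name starts with arm, aarch, powerpc or ppc lack custom reduction support is an assumption based on the note above, not something queried from MPI.jl.

```julia
# Heuristic guess (an assumption, not a ParticleDA or MPI.jl API): treat ARM
# and PowerPC builds of Julia as lacking custom MPI reduction support.
function custom_reductions_assumed_supported(arch::Symbol=Sys.ARCH)
    name = string(arch)
    unsupported_prefixes = ("arm", "aarch", "powerpc", "ppc")
    return !any(prefix -> startswith(name, prefix), unsupported_prefixes)
end

# Name of the mean-and-variance summary statistic type to request.
function summary_stat_name(arch::Symbol=Sys.ARCH)
    if custom_reductions_assumed_supported(arch)
        return :MeanAndVarSummaryStat
    else
        return :NaiveMeanAndVarSummaryStat
    end
end

println("Suggested summary statistic type: ", summary_stat_name())
```

With ParticleDA loaded, the type itself could then be looked up with getfield(ParticleDA, summary_stat_name()) and passed as the summary_stat_type argument.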

The next section details how to write the interface between the model and the particle filter.

Interfacing the model

The model needs to define a custom data structure and a few functions that will be used by run_particle_filter:

a custom structure which holds the data about the model. This will be used to dispatch the methods to be defined, listed below;
an initialisation function with the following signature:
```
init(params_dict::Dict, n_tasks::Integer) -> model
```
with params_dict a dictionary with the parameters of the model and n_tasks an integer specifying the maximum number of tasks (coroutines) that parallelisable operations will be scheduled over. This initialisation function should create an instance of the model data structure and return it. The value of n_tasks can be used to create task-specific buffers for writing to when computing the model updates, to avoid reallocating data structures on each function call. As tasks may be run in parallel over multiple threads, any buffers used in functions called within tasks should be unique to the task. To facilitate this, functions in the model interface (see below) which may be called within tasks scheduled in parallel are passed a task_index argument, which is an integer index in 1:n_tasks that is guaranteed to be unique to a particular task and so can be used to index into task-specific buffers.
The model needs to extend the following methods, using the model type for dispatch:

ParticleDA.get_state_dimension — Function

ParticleDA.get_state_dimension(model) -> Integer

Return the positive integer dimension of the state vector which is assumed to be fixed for all time steps.
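
The pieces described so far can be put together in a minimal sketch. The struct name ToyModel, its fields, the init_toy_model function, the task_buffer helper and the "state_dimension" parameter key are all hypothetical; only the init signature and ParticleDA.get_state_dimension come from the text above. A complete model would also need the remaining interface methods.

```julia
import ParticleDA

# Hypothetical toy model: the struct name, its fields and the
# "state_dimension" parameter key are illustrative, not part of ParticleDA.
struct ToyModel
    state_dimension::Int
    # One scratch column per task, indexed by `task_index` in 1:n_tasks.
    task_buffers::Matrix{Float64}
end

# Initialisation function following the init(params_dict, n_tasks) signature.
function init_toy_model(params_dict::Dict, n_tasks::Integer)
    state_dimension = get(params_dict, "state_dimension", 4)
    task_buffers = zeros(Float64, state_dimension, n_tasks)
    return ToyModel(state_dimension, task_buffers)
end

# Extend the interface function, dispatching on the model type.
ParticleDA.get_state_dimension(model::ToyModel) = model.state_dimension

# Sketch of how code running inside a task could select its own buffer.
function task_buffer(model::ToyModel, task_index::Integer)
    return view(model.task_buffers, :, task_index)
end
```

Allocating the buffers once in the initialisation function, one column per task, means that functions called within tasks can write to their own column without allocating and without racing against other tasks.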