atomica.plotting

Functions for generating plots from model outputs

This module implements Atomica’s plotting library, which is used to generate various plots from model outputs.

Functions

plot_bars(plotdata[, stack_pops, …]) – Produce a bar plot
plot_legend(entries[, plot_type, fig]) – Render a new legend
plot_series(plotdata[, plot_type, axis, …]) – Produce a time series plot
relabel_legend(figs, labels) – Change the labels on an existing legend
reorder_legend(figs[, order]) – Change the order of an existing legend
save_figs(figs[, path, prefix, fnames]) – Save figures to disk as PNG files

Classes

PlotData(results[, outputs, pops, …]) – Process model outputs into plottable quantities
Series(tvec, vals[, result, pop, output, …]) – Represent a plottable time series
class atomica.plotting.PlotData(results, outputs=None, pops=None, output_aggregation=None, pop_aggregation=None, project=None, time_aggregation=None, t_bins=None, accumulate=None)[source]

Process model outputs into plottable quantities

This is what gets passed into a plotting function, which displays a view of the data. Conceptually, we are applying visuals to the data. However, an extraction step is performed first, rather than plotting the results directly, because things like labels, colours, and groupings apply only to plots, not to results, and there could be several different views of the same data.

Operators for - and / are defined to facilitate looking at differences and relative differences of derived quantities (quantities computed using PlotData operations) across individual results. To keep the implementation tractable, they don’t generalize further than that, and operators + and * are not implemented because these operations rarely make sense for the data being operated on.

Parameters:
  • results – Which results to plot. Can be
    • a Result
    • a list of Results
    • a dict/odict of Results (the name of the result is taken from the Result, not the dict)
  • outputs – The name of an output compartment, characteristic, or parameter, or list of names. Inside a list, a dict can be given to specify an aggregation e.g. outputs=['sus',{'total':['sus','vac']}] where the key is the new name. Or, a formula can be given which will be evaluated by looking up labels within the model object. Links will automatically be summed over.
  • pops – The name of an output population, or list of names. Like outputs, can specify a dict with a list of pops to aggregate over them.
  • output_aggregation

    If an output aggregation is requested, combine the outputs listed using one of

    • ’sum’ - just add values together
    • ’average’ - unweighted average of quantities
    • ’weighted’ - weighted average where the weight is the compartment size, characteristic value, or link source compartment size (summed over duplicate links). ‘weighted’ method cannot be used with non-transition parameters and a KeyError will result in that case
  • pop_aggregation – Same as output_aggregation, except that ‘weighted’ uses population sizes. Note that output aggregation is performed before population aggregation. This also means that population aggregation can be used to combine already aggregated outputs (e.g. can first sum ‘sus’+’vac’ within populations, and then take weighted average across populations)
  • project – Optionally provide a Project object, which will be used to convert names to labels in the outputs for plotting.
  • time_aggregation – Optionally specify time aggregation method. Supported methods are ‘integrate’ and ‘average’ (no weighting). When aggregating times, non-annualized flow rates will be used.
  • t_bins

    Optionally specify time bins, which will enable time aggregation. Supported inputs are

    • A vector of bin edges. Time points are included if the time is >= the lower bin value and < the upper bin value
    • A scalar bin size (e.g. 5), which will be expanded to a vector spanning the data
    • The string ‘all’, which maps to bin edges [-inf, inf], aggregating over all time
  • accumulate – Optionally accumulate outputs over time. Can be ‘sum’ or ‘integrate’ to either sum quantities or integrate by multiplying by the timestep. Accumulation happens after time aggregation. The logic is extremely simple - the quantities in the Series pass through cumsum. If ‘integrate’ is selected, then the quantities are multiplied by dt and the units are multiplied by years
Returns:

A PlotData instance that can be passed to plot_series() or plot_bars()
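
The three aggregation methods described above can be illustrated with a small, self-contained sketch. The helper `aggregate` and its arguments are hypothetical and are not part of Atomica. The weights stand in for compartment or population sizes, and the sketch makes no attempt to reproduce Atomica's handling of links or units.

```python
def aggregate(values, method="sum", weights=None):
    """Combine several equal-length series point by point (illustrative only)."""
    n_series = len(values)
    n_times = len(values[0])
    if method == "sum":
        return [sum(v[i] for v in values) for i in range(n_times)]
    if method == "average":
        # Unweighted average across the series
        return [sum(v[i] for v in values) / n_series for i in range(n_times)]
    if method == "weighted":
        if weights is None:
            raise ValueError("'weighted' aggregation requires weights")
        out = []
        for i in range(n_times):
            total_weight = sum(w[i] for w in weights)
            weighted_sum = sum(v[i] * w[i] for v, w in zip(values, weights))
            out.append(weighted_sum / total_weight)
        return out
    raise ValueError(f"Unknown aggregation method: {method}")


# Two populations, two time points: prevalence values and population sizes
prevalence = [[0.25, 0.5], [0.75, 0.5]]
pop_size = [[100, 100], [300, 100]]

total = aggregate(prevalence, "sum")                    # [1.0, 1.0]
mean = aggregate(prevalence, "average")                 # [0.5, 0.5]
weighted = aggregate(prevalence, "weighted", pop_size)  # [0.625, 0.5]
```

At the first time point the larger population dominates the weighted average, which is why it differs from the unweighted value.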

__getitem__(key)[source]

Implement custom indexing

The Series objects stored within PlotData are each bound to a single result, population, and output. This operator makes it possible to easily retrieve a particular Series instance. For example,

>>> d = PlotData(results)
>>> d['default','0-4','sus']
Parameters:key (tuple) – A tuple of (result,pop,output)
Returns:A Series instance
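
As a rough illustration of tuple-based lookup, the sketch below uses two hypothetical stand-in classes (`SeriesStub` and `PlotDataStub`, neither of which exists in Atomica). It only shows the idea of matching on the (result, pop, output) triple and does not reflect how PlotData stores its Series internally.

```python
class SeriesStub:
    """Hypothetical stand-in for a Series, holding only its identifying names."""

    def __init__(self, result, pop, output):
        self.result, self.pop, self.output = result, pop, output


class PlotDataStub:
    """Hypothetical container demonstrating lookup by (result, pop, output)."""

    def __init__(self, series):
        self.series = list(series)

    def __getitem__(self, key):
        result, pop, output = key  # the key must be a 3-tuple
        for s in self.series:
            if (s.result, s.pop, s.output) == (result, pop, output):
                return s
        raise KeyError(key)


d = PlotDataStub([
    SeriesStub("default", "0-4", "sus"),
    SeriesStub("default", "5-14", "sus"),
])
match = d["default", "5-14", "sus"]
```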
_accumulate(accumulation_method)[source]

Internal method to accumulate values over time

Parameters:accumulation_method

Select whether to add or integrate. Supported methods are:

  • ‘sum’ : runs cumsum on all quantities. This should not be used if the units are flow rates (so it will check for ‘/year’). Summation should be used for compartment-based quantities, such as DALYs
  • ‘integrate’ : integrate using the trapezoidal rule, assuming an initial value of 0. Note that here there is no concept of ‘dt’ because we might have non-uniform time aggregation bins. Therefore, we need to use the time vector actually contained in the Series object (via cumtrapz())
Return type:None
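
The difference between the two accumulation methods can be seen in a minimal pure-Python sketch. The function names below are hypothetical. The second function follows the same idea as a cumulative trapezoidal integral with an initial value of 0, using the actual time vector rather than a fixed dt.

```python
def cumulative_sum(vals):
    """Running total of the values, ignoring the time vector."""
    out, total = [], 0.0
    for v in vals:
        total += v
        out.append(total)
    return out


def cumulative_trapezoid(tvec, vals):
    """Running trapezoidal integral over tvec, starting from 0."""
    out = [0.0]
    for i in range(1, len(vals)):
        width = tvec[i] - tvec[i - 1]  # works for non-uniform spacing
        out.append(out[-1] + 0.5 * (vals[i] + vals[i - 1]) * width)
    return out


tvec = [2020.0, 2021.0, 2023.0]  # note the uneven spacing
vals = [10.0, 20.0, 20.0]

summed = cumulative_sum(vals)                  # [10.0, 30.0, 50.0]
integrated = cumulative_trapezoid(tvec, vals)  # [0.0, 15.0, 55.0]
```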
_time_aggregate(t_bins, time_aggregation=None)[source]

Internal method for time aggregation

Note that accumulation is a running total, whereas aggregation refers to binning. The two can both be applied (with aggregation occurring prior to accumulation).

Parameters:
  • t_bins – Vector of bin edges OR a scalar bin size, which will be automatically expanded to a vector of bin edges
  • time_aggregation – can be ‘sum’ or ‘average’. Note that for quantities that have a timescale, ‘sum’ behaves like integration so flow parameters in number units will be adjusted accordingly (e.g. a parameter in units of ‘people/day’ aggregated over a 1 year period will display as the equivalent number of people that year)
Return type:

None
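
The binning rule (a time point belongs to a bin if it is >= the lower edge and < the upper edge) can be sketched as follows. The function `bin_values` is hypothetical, it works on plain numbers only, and it ignores the timescale adjustment described above. Returning NaN for an empty bin is an assumption made for this sketch.

```python
import math


def bin_values(tvec, vals, edges, method="sum"):
    """Aggregate vals into bins defined by consecutive pairs of edges."""
    out = []
    for lower, upper in zip(edges[:-1], edges[1:]):
        in_bin = [v for t, v in zip(tvec, vals) if lower <= t < upper]
        if not in_bin:
            out.append(math.nan)  # assumed behaviour for empty bins
        elif method == "sum":
            out.append(sum(in_bin))
        elif method == "average":
            out.append(sum(in_bin) / len(in_bin))
        else:
            raise ValueError(f"Unknown method: {method}")
    return out


tvec = [2020, 2021, 2022, 2023]
vals = [1.0, 2.0, 3.0, 4.0]

by_bin = bin_values(tvec, vals, [2020, 2022, 2024])                    # [3.0, 7.0]
overall = bin_values(tvec, vals, [-math.inf, math.inf], "average")     # [2.5]
```

The second call mirrors the ‘all’ option, where a single bin with edges [-inf, inf] covers every time point.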

interpolate(new_tvec)[source]

Interpolate all Series onto new time values

This will modify all of the contained Series objects in-place. The modified PlotData instance is also returned, so that interpolation and construction can be performed in one line, i.e. both

>>> d = PlotData(result)
>>> d.interpolate(tvals)

and

>>> vals = PlotData(result).interpolate(tvals)

will work as intended.

Parameters:new_tvec – Vector of new time values
Returns:The modified PlotData instance
static programs(results, outputs=None, t_bins=None, quantity='spending', accumulate=None, nan_outside=False)[source]

Constructs a PlotData instance from program values

This alternate constructor can be used to plot program-related quantities such as spending or coverage.

Parameters:
  • results – single Result, or list of Results
  • outputs

    Specification of which programs to plot spending for. Can be:

    • the name of a single program
    • a list of program names
    • an aggregation dict e.g. {‘treatment’:[‘tx-1’,’tx-2’]} or a list of such dicts. Output aggregation type is automatically ‘sum’ for program spending, and aggregation is NOT permitted for coverages (due to modality interactions)
  • t_bins – aggregate over time, using summation for spending and number coverage, and average for fraction/proportion coverage. Notice that unlike the PlotData() constructor, this function does _not_ allow the time aggregation method to be manually set.
  • quantity – can be ‘spending’, ‘coverage_number’, ‘coverage_eligible’, or ‘coverage_fraction’. The ‘coverage_eligible’ is the sum of compartments reached by a program, such that coverage_fraction = coverage_number/coverage_eligible
  • accumulate – can be ‘sum’ or ‘integrate’
  • nan_outside – If True, then values will be NaN outside the program start/stop year
Returns:

A new PlotData instance

set_colors(colors=None, results='all', pops='all', outputs='all', overwrite=False)[source]

Assign colors to quantities

This function facilitates assigning colors to the Series objects contained in this PlotData instance.

Parameters:
  • colors – Specify the colours to use. This can be
    • A list of colours that applies to the list of all matching items
    • A single colour to use for all matching items
    • The name of a colormap to use (e.g., ‘Blues’)
  • results – A list of results to set colors for, or a dict of results where the key names the results (e.g. PlotData.results)
  • pops – A list of pops to set colors for, or a dict of pops where the key names the pops (e.g. PlotData.pops)
  • outputs – A list of outputs to set colors for, or a dict of outputs where the key names the outputs (e.g. PlotData.outputs)
  • overwrite – False (default) or True. If True, then any existing manually set colours will be overwritten
Returns:The PlotData instance (also modified in-place)

Essentially, the lists of results, pops, and outputs are used to filter the Series resulting in a list of Series to operate on. Then, the colors argument is applied to that list.

tvals()[source]

Return vector of time values

This method returns a vector of time values for the PlotData object, if all of the series have the same time axis (otherwise it will throw an error). All series must have the same number of timepoints. This will always be the case for a PlotData object unless the instance has been manually modified after construction.

Returns:Tuple with (array of time values, array of time labels)
class atomica.plotting.Series(tvec, vals, result='default', pop='default', output='default', data_label='', color=None, units='', timescale=None, data_pop='')[source]

Represent a plottable time series

A Series represents a quantity available for plotting. It is like a TimeSeries but contains additional information only used for plotting, such as color.

Parameters:
  • tvec – array of time values
  • vals – array of values
  • result – name of the result associated with this data
  • pop – name of the pop associated with the data
  • output – name of the output associated with the data
  • data_label – name of a quantity in project data to plot in conjunction with this Series
  • color – the color to render the Series with
  • units – the units for the values
  • timescale – For Number, Probability and Duration units, there are timescales associated with them
color = None

the color to render the Series with

data_label = None

Used to identify data for plotting - should match the name of a data TDVE

data_pop = None

Used to identify which population in the TDVE (specified by data_label) to look up

interpolate(new_tvec)[source]

Return interpolated vector of values

This function returns an np.array() with the values of this series interpolated onto the requested time array new_tvec. To ensure results are not misleading, extrapolation is disabled and will return NaN if new_tvec contains values outside the original time range.

Note that unlike PlotData.interpolate(), Series.interpolate() does not modify the object but instead returns the interpolated values. This makes the Series object more versatile (PlotData is generally used only for plotting, but the Series object can be a convenient way to work with values computed using the sophisticated aggregations within PlotData).

Parameters:new_tvec – array of new time values
Returns:array with interpolated values (same size as new_tvec)
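
The no-extrapolation behaviour can be sketched in pure Python. This is not Atomica's implementation: the function name is hypothetical, linear interpolation is assumed, and the time vector is assumed to be strictly increasing.

```python
import math


def interpolate_without_extrapolation(tvec, vals, new_tvec):
    """Linearly interpolate vals onto new_tvec, returning NaN outside the data range."""
    out = []
    for t in new_tvec:
        if t < tvec[0] or t > tvec[-1]:
            out.append(math.nan)  # never extrapolate
            continue
        for i in range(len(tvec) - 1):
            if tvec[i] <= t <= tvec[i + 1]:
                frac = (t - tvec[i]) / (tvec[i + 1] - tvec[i])
                out.append(vals[i] + frac * (vals[i + 1] - vals[i]))
                break
        else:
            # Only reached when tvec has a single point and t equals it
            out.append(vals[-1])
    return out


tvec = [2000.0, 2010.0]
vals = [0.0, 10.0]
result = interpolate_without_extrapolation(tvec, vals, [1995.0, 2005.0, 2010.0, 2015.0])
# result is [nan, 5.0, 10.0, nan]
```

Because the input lists are left untouched and a new list is returned, the sketch also mirrors the non-mutating behaviour described for Series.interpolate().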
output = None

name of the output associated with the data

pop = None

name of the pop associated with the data

result = None

name of the result associated with this data

t_labels = None

Iterable array of time labels - could be set to strings like [2010-2014]

timescale = None

If the quantity has a time-like denominator (e.g. number/year, probability/day) then the denominator is stored here (in units of years). This enables quantities to be time-aggregated correctly (e.g. number/day must be converted to number/timestep prior to summation or integration). For links, the timescale is normally just dt. This also enables more rigorous checking for quantities with time denominators than checking for a string like '/year', because users may not set this specifically.
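
The conversion to per-timestep amounts is simple arithmetic, shown here as a sketch. The helper `per_timestep` is hypothetical, and both `timescale` and `dt` are assumed to be expressed in years, as described above.

```python
def per_timestep(value, timescale, dt):
    """Convert a rate per `timescale` years into an amount per timestep of `dt` years."""
    return value * dt / timescale


# A flow of 10 people per half-year (timescale=0.5) simulated with dt=0.25 years
flow_per_step = per_timestep(10.0, timescale=0.5, dt=0.25)  # 5.0 people per timestep

# Summing four timesteps then gives the number of people over one year
annual_total = sum([flow_per_step] * 4)  # 20.0
```

Summing the raw values of 10.0 over four timesteps would instead give 40, which overstates the annual total by a factor of two.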

tvec = None

array of time values

unit_string

Return the units for the quantity including timescale

When making plots, it is useful for the axis label to have the units of the quantity. The units should also include the time scale e.g. “Death rate (probability per day)”. However, if the timescale changes due to aggregation or accumulation, then the value might be different. In that case, the unit of the quantity is interpreted as a numerator if the timescale is not None. For example, Compartments have units of ‘number’, while Links have units of ‘number/timestep’, which is stored as Series.units='number' and Series.timescale=0.25 (if dt=0.25). The unit_string attribute

Returns:
units = None

The units for the quantity to display on the plot

vals = None

array of values

atomica.plotting._apply_series_formatting(ax, plot_type)[source]
Return type:None
atomica.plotting._expand_dict(x)[source]

Expand a dict with multiple keys into a list of single-key dicts

An aggregation is defined as a mapping of multiple outputs into a single variable with a single label. This is represented by a dict with a single key, where the key is the label of the new quantity, and the value represents the instructions for how to compute the quantity. Sometimes outputs and pops are used directly, without renaming, so in this case, only the string representing the name of the quantity is required. Therefore, the format used internally by PlotData is that outputs/pops are represented as lists with length equal to the total number of quantities being returned/computed, and that list can contain dictionaries with single keys whenever an aggregation is required.

For ease of use, it is convenient for users to enter multiple aggregations as a single dict with multiple keys. This function processes such a dict into the format used internally by PlotData.

Parameters:x (list) – A list of inputs, containing strings or dicts that might have multiple keys
Return type:list
Returns:A list containing strings or dicts where any dicts have only one key

Example usage:

>>> _expand_dict(['a',{'b':1,'c':2}])
['a', {'b': 1}, {'c': 2}]
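
The expansion described above could be written along the following lines. This is a sketch that reproduces the documented example, not the actual source of _expand_dict, and the function name is hypothetical.

```python
def expand_dict_sketch(x):
    """Split any multi-key dicts in a list into single-key dicts, preserving order."""
    out = []
    for item in x:
        if isinstance(item, dict):
            # One single-key dict per aggregation
            out.extend({key: value} for key, value in item.items())
        else:
            out.append(item)
    return out


expanded = expand_dict_sketch(['a', {'b': 1, 'c': 2}])
# expanded is ['a', {'b': 1}, {'c': 2}]
```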
atomica.plotting._extract_labels(input_arrays)[source]

Extract all quantities from list of dicts

The inputs supported by outputs and pops can contain lists of optional aggregations. The first step in PlotData is to extract all of the quantities in the Model object that are required to compute the requested aggregations.

Parameters:input_arrays – Input string, list, or dict specifying aggregations
Return type:set
Returns:Set of unique string values that correspond to model quantities

Example usage:

>>> _extract_labels(['vac',{'a':['vac','sus']}])
set(['vac','sus'])

The main workflow is:

[‘vac’,{‘a’:[‘vac’,’sus’]}] -> [‘vac’,’vac’,’sus’] -> set([‘vac’,’sus’])

i.e. first a flat list is constructed by replacing any dicts with their values and concatenating, then the list is converted into a set
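
The flatten-then-deduplicate workflow can be sketched as follows. The function name is hypothetical, and the sketch handles only names and aggregation dicts. Formula strings, which PlotData also accepts as outputs, are not parsed here.

```python
def extract_labels_sketch(inputs):
    """Collect the unique quantity names referenced by a string, list, or dict."""
    if isinstance(inputs, str):
        return {inputs}
    if isinstance(inputs, dict):
        inputs = [inputs]
    flat = []
    for item in inputs:
        if isinstance(item, dict):
            # Replace each dict with the names it aggregates over
            for members in item.values():
                flat.extend([members] if isinstance(members, str) else members)
        else:
            flat.append(item)
    return set(flat)


labels = extract_labels_sketch(['vac', {'a': ['vac', 'sus']}])
# labels contains exactly 'vac' and 'sus'
```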

atomica.plotting._get_full_name(code_name, proj=None)[source]

Return the label of an object retrieved by name

If a Project has been provided, code names can be converted into labels for plotting. This function is different to framework.get_label() though, because it supports converting population names to labels as well (this information is in the project’s data, not in the framework), and it also supports converting link syntax (e.g. sus:vac) into full names as well. Note also that this means that the strings returned by _get_full_name can be as specific as necessary for plotting.

Parameters:
  • code_name (str) – The code name for a variable (e.g. ‘sus’, ‘pris’, ‘sus:vac’)
  • proj – Optionally specify a Project instance
Return type:

str

Returns:

If a project was provided, returns the full name. Otherwise, just returns the code name

atomica.plotting._render_data(ax, data, series, baseline=None, filled=False)[source]

Renders a scatter plot for a single variable in a single population

Parameters:
  • ax – axis object that data will be rendered in
  • data – a ProjectData instance containing the data to render
  • series – a Series object, the ‘pop’ and ‘data_label’ attributes are used to extract the TimeSeries from the data
  • baseline – adds an offset to the data e.g. for stacked plots
  • filled – fill the marker with a solid fill e.g. for stacked plots
Return type:

None

atomica.plotting._render_legend(ax, plot_type=None, handles=None)[source]

Internal function to render a legend

Parameters:
  • ax – Axis in which to create the legend
  • plot_type – Used to decide whether to reverse the legend order for stackplots
  • handles – The handles of the objects to enter in the legend. Labels should be stored in the handles
Return type:

None

atomica.plotting._stack_data(ax, data, series)[source]

Internal function to stack series data

Used by plot_series when rendering stacked plots and also showing data.

Return type:None
atomica.plotting._turn_off_border(ax)[source]

Turns off top and right borders.

Note that this function will leave the bottom and left borders on.

Parameters:ax – An axis object
Return type:None
Returns:None
atomica.plotting.plot_bars(plotdata, stack_pops=None, stack_outputs=None, outer=None, legend_mode=None, show_all_labels=False, orientation='vertical')[source]

Produce a bar plot

Parameters:
  • plotdata – a PlotData instance to plot
  • stack_pops – A list of lists with populations to stack. A bar is rendered for each item in the list. For example, [[‘0-4’,‘5-14’],[‘15-64’]] will render two bars, with two populations stacked in the first bar, and only one population in the second bar. Items not appearing in this list will be rendered unstacked.
  • stack_outputs – Same as stack_pops, but for outputs.
  • outer – Optionally select whether the outermost/highest level of grouping is by ‘times’ or by ‘results’
  • legend_mode – override the default legend mode in settings
  • show_all_labels – If True, then inner/outer labels will be shown even if there is only one label
  • orientation – ‘vertical’ (default) or ‘horizontal’
Return type:

list

Returns:

A list of newly created Figures

atomica.plotting.plot_legend(entries, plot_type='patch', fig=None)[source]

Render a new legend

Parameters:
  • entries (dict) – Dict where key is the label and value is the colour e.g. {‘sus’:’blue’,’vac’:’red’}
  • plot_type – can be ‘patch’ or ‘line’
  • fig – Optionally takes in the figure to render the legend in. If not provided, a new figure will be created
Returns:

The matplotlib Figure object containing the legend

atomica.plotting.plot_series(plotdata, plot_type='line', axis=None, data=None, legend_mode=None, lw=None)[source]

Produce a time series plot

Parameters:
  • plotdata – a PlotData instance to plot
  • plot_type – ‘line’, ‘stacked’, or ‘proportion’ (stacked, normalized to 1)
  • axis – Specify which quantity to group outputs on plots by - can be ‘outputs’, ‘results’, or ‘pops’. A line will be drawn for each of the selected quantity, and any other quantities will appear as separate figures.
  • data – Draw scatter points for data wherever the output label matches a data label. Only draws data if the plot_type is ‘line’
  • legend_mode – override the default legend mode in settings
  • lw – override the default line width
Return type:

list

Returns:

A list of newly created Figures

atomica.plotting.relabel_legend(figs, labels)[source]

Change the labels on an existing legend

Parameters:
  • figs – Figure, or list of figures, to change labels in
  • labels – list of labels the same length as the number of legend labels OR a dict of labels where the key is the index of the labels to change. The dict input option makes it possible to change only a subset of the labels.

Return type:None
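
The two accepted forms of `labels` can be illustrated on a plain list of strings. The helper `apply_relabel` is hypothetical and does not touch any matplotlib objects. It only shows how a full list and an index-keyed dict would each map onto the existing labels.

```python
def apply_relabel(current_labels, labels):
    """Return the new label list given either a full list or an {index: label} dict."""
    if isinstance(labels, dict):
        updated = list(current_labels)
        for index, label in labels.items():
            updated[index] = label  # change only the selected entries
        return updated
    if len(labels) != len(current_labels):
        raise ValueError("A list of labels must match the number of legend entries")
    return list(labels)


existing = ['sus', 'vac', 'inf']
all_changed = apply_relabel(existing, ['Susceptible', 'Vaccinated', 'Infected'])
one_changed = apply_relabel(existing, {1: 'Vaccinated'})
# one_changed is ['sus', 'Vaccinated', 'inf']
```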
atomica.plotting.reorder_legend(figs, order=None)[source]

Change the order of an existing legend

Parameters:
  • figs – Figure, or list of figures, containing legends for which the order should be changed
  • order

    Specification of the order in which to render the legend entries. This can be

    • The string ‘reverse’, which will reverse the order of the legend
    • A list of indices mapping old position to new position. For example, if the original label order was [‘a’,’b’,’c’], then order=[1,0,2] would result in [‘b’,’a’,’c’]. If a partial list is provided, then only a subset of the legend entries will appear. This allows this function to be used to remove legend entries as well.
Return type:

None
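
The effect of `order` can be illustrated on a plain list of labels. The helper `apply_order` is hypothetical and assumes that each entry in `order` is the original index of the legend entry to place at that position, which is consistent with the example given above.

```python
def apply_order(labels, order):
    """Return labels rearranged according to 'reverse' or a list of original indices."""
    if order == 'reverse':
        return list(reversed(labels))
    return [labels[i] for i in order]


labels = ['a', 'b', 'c']
swapped = apply_order(labels, [1, 0, 2])     # ['b', 'a', 'c']
subset = apply_order(labels, [2, 0])         # ['c', 'a'], entry 'b' is dropped
reversed_labels = apply_order(labels, 'reverse')  # ['c', 'b', 'a']
```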

atomica.plotting.save_figs(figs, path='.', prefix='', fnames=None)[source]

Save figures to disk as PNG files

Functions like plot_series and plot_bars can generate multiple figures, depending on the data and legend options. This function facilitates saving those figures together. The name for the file can be automatically selected when saving figures generated by plot_series and plot_bars. This function also deals with cases where the figure list may or may not contain a separate legend (so saving figures with this function means the legend mode can be changed freely without having to change the figure saving code).

Parameters:
  • figs – A figure or list of figures
  • path – Optionally append a path to the figure file name
  • prefix – Optionally prepend a prefix to the file name
  • fnames – Optionally an array of file names. By default, each figure is named using its ‘label’ property. If a figure has an empty ‘label’ string it is assumed to be a legend and will be named based on the name of the figure immediately before it. If you provide an empty string in the fnames argument this same operation will be carried out. If the last figure name is omitted, an empty string will automatically be added.

Return type:None
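
The naming rules above can be sketched without any figures by working on label strings alone. The helper `resolve_filenames` is hypothetical, and the `_legend` suffix is an assumption chosen for illustration. The source only states that a legend is named based on the preceding figure.

```python
import os


def resolve_filenames(labels, path='.', prefix='', fnames=None):
    """Work out one PNG path per figure label, treating empty names as legends."""
    names = list(labels) if fnames is None else list(fnames)
    if len(names) == len(labels) - 1:
        names.append('')  # an omitted final name is treated as an empty string
    if len(names) != len(labels):
        raise ValueError('Need one file name per figure')
    base_names = []
    for name in names:
        if not name:
            if not base_names:
                raise ValueError('A legend must follow a named figure')
            name = base_names[-1] + '_legend'  # assumed suffix, for illustration
        base_names.append(name)
    return [os.path.join(path, prefix + name + '.png') for name in base_names]


# Two plots followed by a separate legend figure with an empty label
paths = resolve_filenames(['sus', 'vac', ''], path='out', prefix='run1_')
```

With these inputs the third path is derived from the second figure's name, so the figure-saving code does not need to change when the legend mode changes.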