atomica.data

Implementation of Databook functionality

This module defines the ProjectData class, which serves as a Python-based representation of the Databook, as well as providing methods for reading Databooks into ProjectData instances, and saving ProjectData back to Excel files.

Classes

ProjectData(framework) Store project data: class-equivalent of Databooks
class atomica.data.ProjectData(framework)[source]

Store project data: class-equivalent of Databooks

This class is used to load and work with data that is entered in databooks. It provides the interface for

  • Loading data
  • Modifying data (values, adding/removing populations etc.)
  • Saving modified data
  • Writing new databooks

To instantiate, the ProjectData constructor is normally not used. Instead, use the static methods

  • ProjectData.new() to create a new instance/databook given a ProjectFramework
  • ProjectData.from_spreadsheet() to load a databook
_book = None

Temporary storage for the workbook while writing a databook

_formats = None

Temporary storage for the Excel formatting while writing a databook

_pop_types = None

Store set of valid population types from framework

_read_interpops(sheet)[source]

Reads the ‘Interactions’ sheet

Return type:None
_read_pops(sheet)[source]

Reads the ‘Population Definitions’ sheet

Return type:None
_read_transfers(sheet)[source]

Reads the ‘Transfers’ sheet

Return type:None
_references = None

Temporary storage for cell references while writing a databook

_write_interpops()[source]

Writes the ‘Interactions’ sheet

Return type:None
_write_pops()[source]

Writes the ‘Population Definitions’ sheet

Return type:None
_write_tdve()[source]

Writes the TDVE tables

This method will create multiple sheets, one for each custom page specified in the Framework.

Return type:None
_write_transfers()[source]

Writes the ‘Transfers’ sheet

Return type:None
add_interaction(code_name, full_name, from_pop_type=None, to_pop_type=None)[source]

Add a new empty interaction

Normally this method would only be manually called if a framework had been updated to contain a new interaction, and the databook now required updating. Therefore, this method would generally only be used when an interaction with given code name, full name, and pop type had already been added to a framework.

Parameters:
  • code_name (str) – The code name of the interaction to create
  • full_name (str) – The full name of the interaction to create
  • from_pop_type (Optional[str]) – The name of a population type, which will identify the populations on the ‘from’ side of the interaction. Default is first population type in the framework
  • to_pop_type (Optional[str]) – The name of a population type, which will identify the populations on the ‘to’ side of the interaction. Default is first population type in the framework
Return type:

TimeDependentConnections

Returns:

Newly instantiated TimeDependentConnections object (also added to ProjectData.interpops)

add_pop(code_name, full_name, pop_type=None)[source]

Add a population

This will add a population to the databook. The population type should match one of the population types in the framework

Parameters:
  • code_name (str) – The code name for the new population
  • full_name (str) – The full name/label for the new population
  • pop_type (Optional[str]) – String with the population type code name
Return type:

None

add_transfer(code_name, full_name, pop_type=None)[source]

Add a new empty transfer

Parameters:
  • code_name (str) – The code name of the transfer to create
  • full_name (str) – The full name of the transfer to create
  • pop_type (Optional[str]) – Code name of the population type. Default is first population type in the framework
Return type:

TimeDependentConnections

Returns:

Newly instantiated TimeDependentConnections object (also added to ProjectData.transfers)

change_tvec(tvec)[source]

Change the databook years

This function can be used to change the time vector in all of the TDVE/TDC tables. There are two ways to change the time arrays:

  • Setting ProjectData.tvec directly will only affect newly added tables, and will keep existing tables as they are
  • Calling ProjectData.change_tvec() will modify all existing tables

Note that the TDVE/TDC tables store time/value pairs sparsely within their TimeSeries objects. Therefore, changing the time array won’t modify any of the data - it will only have an effect the next time a databook is written (so typically this method would be called as part of preparing a modified databook).
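The sparse storage described above can be illustrated with a minimal sketch. It does not use the Atomica API: SparseSeries and render_row are hypothetical stand-ins, and the only assumption is that each series keeps just the time/value pairs that were entered, while the time vector decides which columns appear when a table is written.

```python
class SparseSeries:
    """Hypothetical stand-in for a TimeSeries: keeps only the entered (time, value) pairs."""

    def __init__(self):
        self.data = {}  # year -> value

    def insert(self, t, v):
        self.data[float(t)] = v


def render_row(series, tvec):
    """Lay stored values out against the columns of a time vector (None = blank cell)."""
    return [series.data.get(float(t)) for t in tvec]


ts = SparseSeries()
ts.insert(2015, 100.0)
ts.insert(2018, 130.0)

old_tvec = [2015, 2016, 2017, 2018]
new_tvec = [2015, 2016, 2017, 2018, 2019, 2020]  # extended time vector

row_old = render_row(ts, old_tvec)
row_new = render_row(ts, new_tvec)  # more columns, same stored data
stored = dict(ts.data)              # unchanged by the choice of time vector
```

In this sketch the new time vector only adds blank columns to the written row; the stored pairs are identical before and after.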

Parameters:tvec (array) – A float, list, or array containing time values (in years) for the databook
Return type:None
end_year

Return the end year from the databook

The ProjectData end year is defined as the latest time point in any of the TDVE/TDC tables (noting that it is possible for the TDVE tables to have different time values). This quantity should be used when changing the simulation end year, if using all of the data in the databook is desired.
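As a rough illustration of this definition, the sketch below takes the latest time point across several tables whose time values differ. The tables dict is a hypothetical stand-in (table code name mapped to that table's time points), not the actual ProjectData storage.

```python
# Hypothetical stand-in: table code name -> time points used by that table
tables = {
    "alive": [2010.0, 2015.0, 2020.0],
    "deaths": [2008.0, 2019.0],
    "unused": [],  # a table with no time points is skipped
}

# Flatten every table's time points into one list
all_times = [t for times in tables.values() for t in times]

end_year = max(all_times)    # latest time point in any table
start_year = min(all_times)  # the start year is the mirror image
```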

Return type:float
Returns:The latest year in the databook
static from_spreadsheet(spreadsheet, framework)[source]

Construct ProjectData from spreadsheet

The framework is needed because the databook does not read in or otherwise store
  • The valid units for quantities
  • Which population type is associated with TDVE tables
Parameters:
  • spreadsheet – The name of a spreadsheet, or a sc.Spreadsheet
  • framework – A ProjectFramework instance
Returns:

A new ProjectData instance

get_tdve_page(code_name)[source]

Given a code name for a TDVE quantity, find which page it is on

Parameters:code_name – The code name for a TDVE quantity
Return type:str
Returns:The sheet that it appears on
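The lookup can be pictured as a search through a mapping of sheet name to TDVE code names, which is how tdve_pages is described in this class. The sketch below is a stand-in using a plain dict; find_page is a hypothetical helper, and raising KeyError for an unknown code name is an assumption rather than documented behaviour.

```python
# Hypothetical stand-in for ProjectData.tdve_pages: sheet name -> ordered TDVE code names
tdve_pages = {
    "Demographics": ["alive", "births"],
    "Epidemiology": ["prevalence"],
}


def find_page(code_name):
    """Return the name of the sheet that lists the given TDVE code name."""
    for sheet, code_names in tdve_pages.items():
        if code_name in code_names:
            return sheet
    # Assumption: the behaviour for an unknown code name is not stated in the docs
    raise KeyError(f"No sheet contains '{code_name}'")


page = find_page("births")
```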
get_ts(name, key=None)[source]

Extract a TimeSeries from a TDVE table or TDC table

Parameters:
  • name (str) – The code name for the container storing the TimeSeries:
      - The code name of a transfer, interaction, or compartment/characteristic/parameter
      - The name of a transfer parameter instantiated in model.build, e.g. ‘age_0-4_to_5-14’. This is mainly useful when retrieving data for plotting, where variables are organized according to names like ‘age_0-4_to_5-14’
  • key – Specify the identifier for the TimeSeries:
      - If name is a comp/charac/par, then key should be a pop name
      - If name is a transfer or interaction, then key should be a tuple (from_pop,to_pop)
      - If name is the name of a model transfer parameter, then key should be left as None
Returns:

A TimeSeries, or None if there were no matches

Regarding the specification of the key - the same transfer could be specified as

  • name='age', key=('0-4','5-14')
  • name='age_0-4_to_5-14', key=None

where the former is typically used when working with data and calibrations, and the latter is used in Model and is therefore encountered on the Result and plotting side.
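The equivalence of the two forms can be sketched with plain dicts. This is not the Atomica implementation: the transfers structure and the lookup helper are hypothetical, and the only thing taken from the text above is the naming pattern <code_name>_<from_pop>_to_<to_pop>.

```python
# Hypothetical stand-in: transfer code name -> {(from_pop, to_pop): series}
transfers = {
    "age": {
        ("0-4", "5-14"): [(2015.0, 0.2)],
        ("5-14", "15-64"): [(2015.0, 0.1)],
    },
}


def lookup(name, key=None):
    """Return the series matching name/key, or None if there were no matches."""
    if key is not None:
        # Data-side form: transfer code name plus a (from_pop, to_pop) tuple
        return transfers.get(name, {}).get(key)
    # Model-side form: a full parameter name such as 'age_0-4_to_5-14'
    for code_name, links in transfers.items():
        for (from_pop, to_pop), series in links.items():
            if name == f"{code_name}_{from_pop}_to_{to_pop}":
                return series
    return None


by_key = lookup("age", key=("0-4", "5-14"))
by_name = lookup("age_0-4_to_5-14")
```

Both calls resolve to the same stored series, and a name that matches nothing yields None.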

interpops = None

This stores a list of TimeDependentConnections instances for interactions

static new(framework, tvec, pops, transfers)[source]

Make a new databook/ProjectData instance

This method should be used (instead of the standard constructor) to produce a new class instance (e.g. if creating a new databook).

Parameters:
  • framework – A ProjectFramework instance
  • tvec – A scalar, list, or array of times (typically would be generated with numpy.arange())
  • pops – A number of populations, or a dict with either {name:label} or {name:{label:label,type:type}}. Type defaults to the first population type in the framework
  • transfers – A number of transfers, or a dict with either {name:label} or {name:{label:label,type:type}}. The type defaults to the first population type in the framework. Transfers can only take place between populations of the same type.
Returns:

A new ProjectData instance
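The accepted forms of the pops argument can be illustrated with a small normalization sketch. normalize_pops is a hypothetical helper, not part of Atomica, and the placeholder names generated for an integer count (pop_0, Population 0, ...) are an assumption made for illustration.

```python
def normalize_pops(pops, default_type):
    """Convert any accepted pops form into {name: {'label': ..., 'type': ...}}."""
    if isinstance(pops, int):
        # Assumption: placeholder names/labels for a bare count of populations
        pops = {f"pop_{i}": f"Population {i}" for i in range(pops)}
    normalized = {}
    for name, spec in pops.items():
        if isinstance(spec, dict):
            # {name: {label: ..., type: ...}} form; type is optional
            normalized[name] = {"label": spec["label"], "type": spec.get("type", default_type)}
        else:
            # {name: label} form; type falls back to the default
            normalized[name] = {"label": spec, "type": default_type}
    return normalized


from_count = normalize_pops(2, default_type="default")
from_labels = normalize_pops({"adults": "Adults"}, default_type="default")
from_full = normalize_pops({"env": {"label": "Environment", "type": "environment"}}, default_type="default")
```

Here default_type plays the role of the first population type in the framework.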

pops = None

This is an odict mapping code_name:{‘label’:full_name, ‘type’:pop_type}
remove_interaction(code_name)[source]

Remove an interaction

Parameters:code_name (str) – Code name of the interaction to remove
Return type:None
remove_pop(pop_name)[source]
remove_transfer(code_name)[source]

Remove a transfer

Parameters:code_name (str) – Code name of the transfer to remove
Return type:None
rename_pop(existing_code_name, new_code_name, new_full_name)[source]

Rename a population

Parameters:
  • existing_code_name (str) – Existing code name of a population
  • new_code_name (str) – New code name to assign
  • new_full_name (str) – New full name/label to assign
Return type:

None

rename_transfer(existing_code_name, new_code_name, new_full_name)[source]

Rename an existing transfer

Parameters:
  • existing_code_name (str) – The existing code name to change
  • new_code_name (str) – The new code name
  • new_full_name (str) – The new full name
Return type:

None

save(fname)[source]

Save databook to disk

This function provides a shortcut to generate a spreadsheet and immediately save it to disk.

Parameters:fname – File name to write on disk
Return type:None
start_year

Return the start year from the databook

The ProjectData start year is defined as the earliest time point in any of the TDVE/TDC tables (noting that it is possible for the TDVE tables to have different time values). This quantity should be used when changing the simulation start year, if using all of the data in the databook is desired.

Return type:float
Returns:The earliest year in the databook
tdve = None

This is an odict storing TimeDependentValuesEntry instances keyed by the code name of the TDVE

tdve_pages = None

This is an odict mapping worksheet name to an (ordered) list of TDVE code names appearing on that sheet

to_spreadsheet()[source]

Return content as an AtomicaSpreadsheet

Returns:An AtomicaSpreadsheet instance
transfers = None

This stores a list of TimeDependentConnections instances for transfers

tvec = None

This is the data’s tvec used when instantiating new tables. Not _guaranteed_ to be the same for every TDVE/TDC table

validate(framework)[source]

Check if the ProjectData instance can be used to run simulations

A databook can be ‘valid’ in two senses

  • The Excel file adheres to the correct syntax and it can be parsed into a ProjectData object
  • The resulting ProjectData object contains sufficient information to run a simulation

Sometimes it is desirable for ProjectData to be valid in one sense rather than the other. For example, in order to run a simulation, the ProjectData needs to contain at least one value for every TDVE table. However, the TDVE table does _not_ need to contain values if all we want to do is add another key population. Thus, the first stage of validation is the ProjectData constructor: if that runs, then users can access methods like ‘add_pop’, ‘remove_transfer’ etc.

On the other hand, to actually run a simulation, the _contents_ of the databook need to satisfy various conditions. These tests are implemented here. The typical workflow would be that ProjectData.validate() should be used if a simulation is going to be run. In the first instance, this can be done in Project.load_databook, but the front end (FE) might want to perform this check at a different point if the databook manipulation methods, e.g. add_pop, are going to be exposed in the interface.

This function raises an informative error if any problems are identified, and otherwise returns True
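The raise-or-return-True contract can be sketched as follows. This is a stand-in that checks only the one condition named above (every table needs at least one value); the validate function here, its tables argument, and the use of ValueError are all assumptions for illustration, not the actual checks or exception type used by ProjectData.validate().

```python
def validate(tables):
    """Return True if every table has at least one value, else raise an informative error."""
    empty = [name for name, values in tables.items() if not values]
    if empty:
        # Assumption: ValueError is used here only for illustration
        raise ValueError(f"No data entered for: {', '.join(empty)}")
    return True


# A parsed but incomplete databook: editable, but not ready for simulation
incomplete = {"alive": [1000.0], "births": []}

try:
    ok = validate(incomplete)
    message = ""
except ValueError as err:
    ok = False
    message = str(err)
```

A caller that intends to run a simulation would perform this check first and surface the message to the user; a caller that only wants to edit the databook can skip it.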

Parameters:framework – A ProjectFramework instance to validate the data against
Return type:bool
Returns:True if ProjectData is valid. An error will be raised otherwise