Data Classes¶
All emdfile data classes inherit from the Node class, which adds core tree-building and metadata storage functionality. Each additionally has its own data and built-in metadata interface.
Array¶
- class emdfile.Array(data: ndarray, name: str | None = 'array', units: str | None = '', dims: list | None = None, dim_names: list | None = None, dim_units: list | None = None, slicelabels=None)¶
Array instances store N-dimensional array-like data.
- __init__(data: ndarray, name: str | None = 'array', units: str | None = '', dims: list | None = None, dim_names: list | None = None, dim_units: list | None = None, slicelabels=None)¶
- Parameters:
data (np.ndarray)
name (str)
units (str) – units for the pixel values
dims (variable) – specifies calibration vectors for each axis of the data array. Valid values for each element of the list are None, a number, a 2-element list/array, or an M-element list/array where M is the extent of the corresponding array dimension. If None is passed, the dim is populated with integer values starting at 0 and its units are set to pixels. If a number is passed, the dim is populated with a vector beginning at zero and increasing linearly by this step size. If a 2-element list/array is passed, the dim is populated with a linear vector with these two numbers as the first two elements. If a list/array of length M is passed, it is used as the dim vector. If dims receives a list of fewer than N arguments for an N-dimensional data array, the extra dimensions are populated as if None were passed, using integer pixel values. If the dims parameter is not passed, all dim vectors are populated this way.
dim_units (list) – the units for the calibration dim vectors. If nothing is passed, dim vectors which have been populated automatically with integers corresponding to pixel numbers are assigned units of ‘pixels’, and any other dim vectors are assigned units of ‘unknown’. If a list shorter than the number of array dimensions is passed, its values are applied to the leading dimensions in order, and the remaining dimensions are populated with ‘pixels’ or ‘unknown’ as above.
dim_names (list) – labels for each axis of the data array. Values which are not passed will be autopopulated with the name “dim#” where # is the axis number.
slicelabels (None or True or list) – if not None, array will be promoted to a stack array - see object docstring for details. If a list is passed it should specify the sub-array names.
- Return type:
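The rules for the dims parameter can be illustrated with a short NumPy-only sketch. The helpers make_dim and make_dims below are hypothetical names introduced for illustration; they are not part of emdfile and only mirror the behavior described in the parameter list above.

```python
import numpy as np

def make_dim(spec, length):
    """Build one calibration vector of `length` elements from `spec`.

    Hypothetical helper mirroring the documented rules; not emdfile code.
    """
    if spec is None:
        # No calibration given: integer pixel indices 0, 1, 2, ...
        return np.arange(length)
    if np.isscalar(spec):
        # A single number is a step size; the vector starts at zero.
        return np.arange(length) * spec
    spec = np.asarray(spec)
    if spec.size == length:
        # A full-length vector is used as given.
        return spec
    if spec.size == 2:
        # Two numbers are the first two elements of a linear vector.
        start, step = spec[0], spec[1] - spec[0]
        return start + np.arange(length) * step
    raise ValueError("spec must be None, a number, or have 2 or `length` elements")

def make_dims(shape, dims=None):
    """Pad `dims` with None so that every axis of `shape` gets a vector."""
    dims = list(dims) if dims is not None else []
    dims = dims + [None] * (len(shape) - len(dims))
    return [make_dim(spec, n) for spec, n in zip(dims, shape)]

# A 3D array shape with calibrations given for the first two axes only.
vectors = make_dims((4, 5, 3), dims=[0.5, [10, 12]])
```

Here the first axis is built from a step size, the second from its first two elements, and the third falls back to integer pixel values because no entry was supplied for it.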
- dim(n)¶
Return the n’th dim vector
- get_dim(n)¶
Return the n’th dim vector
- get_dim_name(n)¶
Get the n’th dim vector name
- get_dim_units(n)¶
Return the n’th dim vector units
- set_dim(n: int, dim: list | ndarray, units: str | None = None, name: str | None = None)¶
Sets the n’th dim vector, using dim as described in the Array documentation. If units and/or name are passed, sets these values for the n’th dim vector.
- Parameters:
n (int) – specifies which dim vector
dim (list or array) – length must be either 2, or match the length of the n’th axis
units (str)
name (str)
- set_dim_name(n: int, name: str)¶
Sets the n’th dim vector name to name.
- Parameters:
n (int) – which dim vector
name (str) – new name
- set_dim_units(n: int, units: str)¶
Sets the n’th dim vector units to units.
- Parameters:
n (int) – which dim vector
units (str) – new units
- to_h5(group)¶
Calls Node.to_h5 to create the group’s node and write its metadata. Then writes Array data, calibration vectors, units, and any stack/label info.
- Parameters:
group (h5py Group)
- Return type:
(h5py Group) the new array’s Group
PointList¶
- class emdfile.PointList(data: ndarray, name: str | None = 'pointlist')¶
PointList instances represent sets of points in some M dimensional space. Each dimension is given by a named field and has its own dtype. See also the documentation for numpy structured arrays.
- __init__(data: ndarray, name: str | None = 'pointlist')¶
- Parameters:
data (structured numpy ndarray) – the data
name (str) – name for the PointList
- Return type:
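For readers unfamiliar with NumPy structured arrays, the following sketch builds one of the kind the data parameter describes. The field names and dtypes are arbitrary choices for illustration, and the example uses NumPy only.

```python
import numpy as np

# Each field has its own name and dtype; this layout is only an example.
dtype = np.dtype([("x", float), ("y", float), ("intensity", np.int64)])

# Three points in a 3-dimensional space, one dimension per field.
points = np.array([(1.0, 2.0, 10), (3.5, 0.5, 7), (2.0, 2.0, 12)], dtype=dtype)

field_names = points.dtype.names  # the names of the fields, in order
xs = points["x"]                  # all x values, as a float array
```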
- add(data)¶
Appends a numpy structured array. Its dtypes must agree with the existing data.
- add_data_by_field(data, fields=None)¶
Add a list of data arrays to the PointList, in the fields given by fields. If fields is not specified, assumes the data arrays are in the same order as self.fields.
- Parameters:
data (list) – arrays of data to add to each field
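The idea of adding data field by field can be sketched on a plain structured array. The helper rows_from_fields is a hypothetical name, and the approach shown (packing the per-field arrays into new rows, then concatenating) is an assumption about one reasonable way to do this, not emdfile's implementation.

```python
import numpy as np

def rows_from_fields(arrays, dtype, fields=None):
    """Pack per-field arrays into one structured array (hypothetical helper)."""
    # With no explicit field list, assume the arrays follow the dtype's field order.
    fields = list(fields) if fields is not None else list(dtype.names)
    rows = np.zeros(len(arrays[0]), dtype=dtype)
    for name, values in zip(fields, arrays):
        rows[name] = values
    return rows

dtype = np.dtype([("x", float), ("y", float)])
existing = np.array([(0.0, 0.0)], dtype=dtype)

# Two new points, given as one array of x values and one of y values.
new_rows = rows_from_fields([[1.0, 2.0], [5.0, 6.0]], dtype)
combined = np.concatenate([existing, new_rows])
```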
- add_fields(new_fields, name='')¶
Creates a copy of the PointList, but with additional fields given by new_fields.
- Parameters:
new_fields (list of 2-tuples, ('name', dtype))
name (string)
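A copy with additional fields can be sketched in NumPy as follows. The helper with_new_fields is hypothetical, and filling the new fields with zeros is an assumption made for this illustration; the source does not state how new fields are initialized.

```python
import numpy as np

def with_new_fields(data, new_fields):
    """Return a copy of `data` with extra fields appended (hypothetical helper).

    `new_fields` is a list of ('name', dtype) 2-tuples. The new fields are
    zero-filled here, which is an assumption of this sketch.
    """
    new_dtype = np.dtype(data.dtype.descr + list(new_fields))
    out = np.zeros(data.shape, dtype=new_dtype)
    for name in data.dtype.names:
        out[name] = data[name]  # copy the existing fields unchanged
    return out

data = np.array([(1.0, 2.0), (3.0, 4.0)], dtype=[("x", float), ("y", float)])
extended = with_new_fields(data, [("intensity", float)])
```

The original array is left untouched, matching the description of add_fields as creating a copy.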
- copy(name=None)¶
Returns a copy of the PointList. If name is None, the copy is named {name}_copy, where {name} is the name of the original PointList.
- remove(mask)¶
Removes points wherever mask==True
- sort(field, order='ascending')¶
Sorts the point list according to field, which must be a field in self.dtype. order should be ‘descending’ or ‘ascending’.
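The semantics of remove and sort can be mimicked on a plain structured array. This is a NumPy-only sketch of the described behavior, not emdfile's own code, and the field names are illustrative.

```python
import numpy as np

data = np.array([(1.0, 30), (2.0, 10), (3.0, 20)],
                dtype=[("x", float), ("intensity", np.int64)])

# remove: drop the points where the mask is True.
mask = data["intensity"] < 15
kept = data[~mask]

# sort: order by one field; reverse the result for descending order.
ascending = np.sort(kept, order="intensity")
descending = ascending[::-1]
```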
- to_h5(group)¶
Calls Node.to_h5 to create the group’s node and write its metadata. Then writes PointList data including the structured data array and field names and dtypes.
- Parameters:
group (h5py Group)
- Return type:
(h5py Group) the new pointlist’s group
PointListArray¶
- class emdfile.PointListArray(dtype, shape, name: str | None = 'pointlistarray')¶
A PointListArray instance comprises a 2D grid of PointLists, each sharing a single dtype and set of fields, and each of variable length. It therefore represents a “ragged array” in 2+1 dimensions, i.e. with two dimensions of a fixed shape and one of variable length, embedded in an M dimensional space for PointLists with M fields.
- __init__(dtype, shape, name: str | None = 'pointlistarray')¶
Creates an empty PointListArray.
- Parameters:
dtype (dtype) – the dtype of the data comprising each PointList
shape (2-tuple of ints) – the shape of the array of PointLists
name (str)
- Return type:
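The “ragged array” layout can be pictured with a small NumPy-only sketch: a fixed 2D grid whose cells each hold a structured array of the same dtype but of independent length. Using nested lists as the grid container is an illustrative choice, not a description of how emdfile stores the data.

```python
import numpy as np

dtype = np.dtype([("x", float), ("y", float)])
shape = (2, 3)

# Start with one empty structured array per grid position.
grid = [[np.zeros(0, dtype=dtype) for _ in range(shape[1])]
        for _ in range(shape[0])]

# Cells may hold different numbers of points.
grid[0][1] = np.array([(1.0, 2.0), (3.0, 4.0)], dtype=dtype)
grid[1][2] = np.array([(5.0, 6.0)], dtype=dtype)

# The number of points in each cell of the grid.
lengths = [[len(cell) for cell in row] for row in grid]
```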
- add_fields(new_fields, name='')¶
Creates a copy of the PointListArray, but with additional fields given by new_fields.
- Parameters:
new_fields (list of 2-tuples, ('name', dtype))
name (string)
- copy(name='')¶
Returns a copy of itself.
- get_pointlist(i, j, name=None)¶
Returns the pointlist at i,j
- to_h5(group)¶
Calls Node.to_h5 to create the group’s node and write its metadata. Then writes PointListArray data including the data itself, array shape and the dtype.
- Parameters:
group (h5py Group)
- Return type:
(h5py Group) the new pointlistarray’s group