.. usage: Examples and Usage ================== Below are introductory examples on * :ref:`basics ` * :ref:`trees ` * :ref:`metadata ` * :ref:`nodes ` * :ref:`arrays ` * :ref:`more data classes ` * :ref:`append mode ` * :ref:`defining classes ` See also the `basic usage `_ notebook in the repo `sample-code `_ folder. .. _basics: ****** Basics ****** .. admonition:: Includes * :ref:`Save & read an array ` * :ref:`Save & read a dictionary ` * :ref:`Save & read several arrays & dicts ` After .. code-block:: >>> import emdfile as emd, numpy as np .. _save-read-array: save and read an array with .. code-block:: >>> ar = np.random.random((5,5)) >>> emd.save(path, ar) >>> _ar = emd.read(path) .. _save-read-dict: or a Python dictionary with .. code-block:: >>> dic = {'a':1, 'b':2} >>> emd.save(path, dic) >>> _dic = emd.read(path) .. _save-read-several: or save a combination of arrays ``ar_*`` and ``dic_*`` with .. code-block:: >>> emd.save(path, [ar_A,ar_B,ar_C,dic_X,dic_Y,dic_Z]) and read and unpack them with .. code-block:: >>> data = emd.read(path) >>> _ar_A = data.tree('array_0') >>> _ar_B = data.tree('array_1') >>> _ar_C = data.tree('array_2') >>> _dic_A = data.metadata['dictionary_0'] >>> _dic_B = data.metadata['dictionary_1'] >>> _dic_C = data.metadata['dictionary_2'] .. _build-trees: ***** Trees ***** .. admonition:: Includes * :ref:`Build a tree ` * :ref:`Inspect a Python tree ` * :ref:`Save a tree ` * :ref:`Inspect an HDF5 tree ` * :ref:`Read from an HDF5 tree ` .. _build-a-tree: ``emdfile`` classes can be composed into filetree-like heirarchies. For class instances ``A``, ``B``, and ``C`` with names ``'A'``, ``'B'`` and ``'C'`` build a tree with .. code-block:: >>> R = emd.Root() >>> R.tree(A) >>> R.tree(B) >>> B.tree(C) .. _inspect-python-tree: and display it with .. code-block:: >>> R.tree() which prints .. code-block:: / |---A |---B |---C .. _save-a-tree: Save the whole tree with .. code-block:: >>> emd.save(path, R) .. _inspect-hdf5-tree: then print the file contents with .. code-block:: >>> emd.printtree(path) which prints .. code-block:: / |---root |---A |---B |---C .. _read-hdf5-tree: Read the whole tree again with .. code-block:: >>> data = emd.read(path) or read some subset with .. code-block:: >>> data = emd.read(path, emdpath='root/A') # reads A >>> data = emd.read(path, emdpath='root/B') # reads B---C >>> data = emd.read(path, emdpath='root/B', tree=False) # reads B only .. _metadata-example: ******** Metadata ******** .. admonition:: Includes * :ref:`Read dictionaries to Metadata ` * :ref:`Use Metadata like dictionaries ` * :ref:`Store various data types ` .. _read-dict-to-metadata: When you save a Python dictionary and read it again, you get an ``emd.Metadata`` instance .. code-block:: >>> emd.save(path, {'a':1,'b':2}) >>> x = emd.read(path) >>> print(x) Metadata( A Metadata instance called 'dictionary', containing the following fields: a: 1 b: 2 ) .. _use-metadata-like-dict: You can access values like a normal Python dictionary .. code-block:: >>> x['a'] 1 as well as add data .. code-block:: >>> x['c'] = 3 .. _store-data-types: Nested dictionarys of any depth are premitted, as are various Python and numpy values. Doing .. code-block:: >>> m = emd.Metadata( name='my_metadata' ) >>> m['x'] = True >>> m['y'] = np.random.random((3,4,5)) >>> m['z'] = { >>> 'alpha' : None, >>> 'beta' : { >>> 'gamma' : [10,11,12] >>> } >>> } >>> emd.save(path, m) saves a dictionary and .. code-block:: >>> _m = emd.read(path) reads it again. Print its contents with .. code-block:: >> print(_m) Metadata( A Metadata instance called 'my_metadata', containing the following fields: x: True y: 3D-array z: {'alpha': None, 'beta': {'gamma': [10, 11, 12]}} ) Any number of Metadata instances can be stored in each emdfile node - see the :doc:`Metadata ` and :ref:`Node ` docstrings for more information. .. _data-nodes: ***** Nodes ***** .. admonition:: Includes * :ref:`Nodes have names ` * :ref:`Nodes hold arbitrary metadata ` * :ref:`Nodes have a versatile .tree method ` The :ref:`Node ` class is the base class that all :doc:`emdfile classes ` inherit from, allowing them to build and modify trees and store arbitrary metadata. Each node has a ``.name`` and ``.metadata`` attribute and a ``.tree`` method. .. _node-names: A node's name is used to find it in data trees and to save it to files, and can be assigned during instantiation .. code-block:: >>> node = emd.Node( name='my_node' ) .. _nodes-hold-metadata: The ``.metadata`` property has unique assignment behavior to allow storing many ``Metadata`` instances in a given node. Doing .. code-block:: >>> node.metadata = emd.Metadata('md1',{'x':1,'y':2}) >>> node.metadata = emd.Metadata('md2',{'a':1,'b':{'c':2,'d':3}}) will store *both* ``Metadata`` instances md1 and md2 in ``node`` (and not overwrite one of them, as you would expect in normal Python assignment). You can return all the ``Metadata`` instances in a node with .. code-block:: >>> node.metadata {'md1': Metadata( A Metadata instance called 'md1', containing the following fields: x: 1 y: 2 ), 'md2': Metadata( A Metadata instance called 'md2', containing the following fields: a: 1 b: {'c': 2, 'd': 3} )} and one of the ``Metadata`` instances can be retrieved by .. code-block:: >>> node.metadata['md1'] Metadata( A Metadata instance called 'md1', containing the following fields: x: 1 y: 2 ) .. _node-tree-method: Basic EMD ``.tree`` usage for building and printing tree structures is :ref:`shown above `. Using ``.tree`` you can also retrieve any tree node, split one tree into two with the ``cut`` operation, or merge two trees into one with the ``graft`` operation. EMD trees must begin with a ``Root`` instance, a special ``Node`` subtype intended for this purpose. See the :ref:`Node ` documentation. .. _array-example: ****** Arrays ****** .. admonition:: Includes * :ref:`Minimal Array instantiation ` * :ref:`Arrays with built-in calibrations ` * :ref:`Get or modify dimension vectors ` .. _minimal-array: The :ref:`Array ` class enables storage of array-like data. The minimal required argument to make a new instance is a numpy array .. code-block:: >>> array = emd.Array(np.random.random((3,3))) .. _array-calibrations: The ``Array`` class also natively stores some self-descriptive metadata specifying the data and its coordinate system. Instantiate an Array instance with this calibrating metadata included with e.g. .. code-block:: >>> ar = emd.Array( >>> np.ones((20,40,1000)), >>> name = '3ddatacube', >>> units = 'intensity', >>> dims = [ >>> [0,5], >>> [0,5], >>> [0,0.02], >>> ], >>> dim_units = [ >>> 'nm', >>> 'nm', >>> 'eV' >>> ], >>> dim_names = [ >>> 'x', >>> 'y', >>> 'E', >>> ], >>> ) where ``dims`` generates vectors which calibrate each of the array's axes. In the case above, the two numbers given (e.g. ``[0,5]`` for each of the first two dimensions) are linearly extrapolated, so the first dimension's first 5 pixels correspond to the locations ``[0,5,10,15,20...]``. Printing the array to standard output displays the calibration info .. code-block:: >>> print(array) Array( A 3-dimensional array of shape (20, 40, 1000) called '3ddatacube', with dimensions: x = [0,5,10,...] nm y = [0,5,10,...] nm E = [0.0,0.02,0.04,...] eV ) .. _dim-vectors: The dimension vectors, units, and names can all be retrieved or set after instantiation with various ``Array`` methods like .. code-block:: >>> ar.dims >>> ar.get_dim >>> ar.set_dim >>> ar.set_dim_units >>> ar.set_dim_name See the :ref:`Array ` docs for further discussion. ``Array`` instances have all the normal :ref:`Node ` functionality like ``.metadata`` and ``.tree``. .. _more-data-classes: ***************** More Data Classes ***************** In addition to ``Array``, the normal data-containing classes include ``PointList`` for a set of points in some M dimensional space, and ``PointListArray`` for "ragged array"-like data, with 2+1 dimensional data currently supported. For instantiation and usage, see the :ref:`PointList ` and :ref:`PointListArray ` docstrings. ``emdfile`` also includes a ``Custom`` class, designed for composition of the other class types into a single Node container. See the :ref:`defining classes ` section below. .. _append-examples: *********** Append Mode *********** .. admonition:: Includes * :ref:`Append two EMD trees to one file ` * :ref:`Append new data into an existing EMD tree ` * :ref:`Append-over mode to overwrite data ` .. _append-two-trees-one-file: In addition to writing new files, ``emdfile`` allows appending new data to existing files. If we first write some tree .. code-block:: >>> root1 = emd.Root('root1') >>> root1.tree( ) >>> emd.save(path, root1) and then later make a second tree of data .. code-block:: >>> root2 = emd.Root('root2') >>> root2.tree( ) the second tree can be added to the same file using "append" mode .. code-block:: >>> emd.save(path, root2, mode='a') The two trees will both be saved to the same file, each starting at their own root group just under the HDF5 root, provided that the ``Root`` instances have different names. .. _append-existing-tree: If we append to an existing file using a root with a name already in the file, ``emdfile`` will perform a diffmerge-like operation, i.e. it will compare the two trees, determine which nodes in the incoming tree are new and which already exist, and write the new nodes to the file. Already existing nodes will be skipped if ``mode='a'``, and overwritten if ``mode='ao'``. Note that comparison happens at the level of node *names*: the contents of the nodes are not evaluated the the ``save`` function. For example, if we make a tree and save it .. code-block:: >>> root = emd.Root( 'my_root' ) >>> ar1 = emd.Array(np.ones((5,5)),'array1') >>> root.tree(ar1) >>> emd.save(path, root) then add more data later .. code-block:: >>> ar2 = emd.Array(np.zeros((3,3,3)),'array2') >>> ar1.tree(ar2) then we can grow the tree saved to the filesystem at ``path`` with .. code-block:: >>> emd.save(path, root, mode='a') After the first write operation, the file tree will look like .. code-block:: my_root |---ar1 and after the second operation it will be .. code-block:: my_root |---ar1 |---ar2 .. _append-over-mode: What if the data in ``ar1`` is changed some time after its been written to file? E.g. .. code-block:: >>> ar1.data += np.random.rand((5,5)) In this case, this change will *not* be reflected in the file if we perform a normal append operation like .. code-block:: >>> emd.save(path, root, mode='a') but *will* be reflected in the file if we perform an "append-over" operation, e.g. .. code-block:: >>> emd.save(path, root, mode='ao') Note, however, that this append-over will overwrite every node appearing in both the runtime and filesystem trees (in this case, just ``'ar1'`` and ``'ar2'``). Moreover, the system storage that's been overwritten is not freed by this operation, so overwriting large data blocks is not recommended, unless followed up by re-packing the files, e.g. by subsequently copying then deleting the original file. More targetted save operations - e.g. adding or overwriting a single node, or appending a specific tree branch downstream of a selected node - are also possible. See the :ref:`save ` docs for more info. .. _define-classes: **************** Defining Classes **************** ``emdfile`` is designed for downstream integration, that is, you can build your own Python scripts, modules, and packages which import ``emdfile`` and use it to handle reading and writing operations. For more info, see the :doc:`subclassing guidelines `.