Parsing Raw Data

fogdb.raw

Module providing raw data parsing capabilities.

fogdb.raw.to_dict(handler)[source]

Read in raw data.

Parameters

handler – Handler class providing key interfacing capabilities. See fogdb.raw.lcl or fogdb.raw.smb for examples.

Returns

Tuple of nested dicts holding the data, keyed by (found) category, then by subcategory, then by filename without its ending, and finally by data field, as in:

returned_tuple = (
    {
        "crawford": {  # category
            "common_fruiting_trees": {  # subcategory
                "Cydonia_oblonga": {  # filename - ending
                    "common_names": "Quince",
                    "USDA_hardiness": 4,
                    # ...,
                },
            },
            "less_common_fruiting_trees": {  # another subcategory
                "Armelancher_canadensis": {  # filename - ending
                    "common_names": "Juneberry",
                    "USDA_hardiness": 4,
                    # ...,
                },
            },
        },
    },
    {
        "jacke": {  # another category
            "plant_matrix": {  # source specific subcategory
                "Cydonia_oblonga": {  # filename - ending
                    "common_names": "Quince",
                    "USDA_hardiness": 4,
                    # ...,
                },
            },
        },
    },
)

Return type

tuple
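
As an illustration of how such a return value might be consumed, the following minimal sketch flattens the nested structure into one row per raw data file. The helper name iter_records and the hand-written sample are assumptions made for this example; they are not part of fogdb.

```python
def iter_records(parsed):
    """Yield (category, subcategory, name, record) for every leaf record.

    ``parsed`` is expected to be shaped like the documented return value:
    a tuple of dicts nested as category -> subcategory -> filename -> fields.
    """
    for category_dict in parsed:
        for category, subcategories in category_dict.items():
            for subcategory, files in subcategories.items():
                for name, record in files.items():
                    yield category, subcategory, name, record


# Hand-written sample mirroring the documented structure (not real output).
sample = (
    {
        "crawford": {
            "common_fruiting_trees": {
                "Cydonia_oblonga": {"common_names": "Quince", "USDA_hardiness": 4},
            },
        },
    },
    {
        "jacke": {
            "plant_matrix": {
                "Cydonia_oblonga": {"common_names": "Quince", "USDA_hardiness": 4},
            },
        },
    },
)

rows = list(iter_records(sample))
for category, subcategory, name, record in rows:
    print(category, subcategory, name, record["common_names"])
```

A generator keeps the traversal lazy, which may help if the raw data set is large.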

class fogdb.raw.BaseHandler(categories='all', dtype='txt', excl_dirs=('FRITZ',))[source]

Bases: object

Partly abstract Handler base class for mapping raw data.

Parameters
  • categories (str, list, default="all") –

    String or list of strings specifying which categories (i.e. sublevel folders) are used for reading in the data. If "all" is used, all sublevel folders are traversed.

    Can be something like "crawford", "jacke", "myRand0mSUBf0lder", …

  • dtype (str, default="txt") –

    String specifying the data type of the raw datafiles. If "all" is used, data type is not filtered.

    Can be something like "rst", "cfg", …

  • excl_dirs (Container) – Container of strings specifying folder names to exclude during the mapping.

map_raw_data_file_tree()[source]

Return mapped file tree of the expected Raw Data structure.
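
To make the roles of categories, dtype and excl_dirs more concrete, here is a rough sketch of how such filters could be applied to a local folder tree. It is not the fogdb implementation: the function name sketch_map_file_tree, the returned layout (category name mapped to a list of relative file paths) and the exact matching rules are all assumptions chosen for illustration.

```python
from pathlib import Path


def sketch_map_file_tree(top_level_folder, categories="all", dtype="txt",
                         excl_dirs=("FRITZ",)):
    """Hypothetical sketch: map category folders to their raw data files."""
    top = Path(top_level_folder)
    # Accept a single category given as a plain string.
    if isinstance(categories, str) and categories != "all":
        categories = [categories]

    tree = {}
    for category_dir in sorted(p for p in top.iterdir() if p.is_dir()):
        # Skip excluded folders and categories that were not requested.
        if category_dir.name in excl_dirs:
            continue
        if categories != "all" and category_dir.name not in categories:
            continue

        files = []
        for path in sorted(category_dir.rglob("*")):
            if not path.is_file():
                continue
            rel = path.relative_to(top)
            # Skip files located below an excluded folder.
            if any(part in excl_dirs for part in rel.parts[:-1]):
                continue
            # Assumption: dtype is compared against the file ending.
            if dtype != "all" and path.suffix != f".{dtype}":
                continue
            files.append(rel.as_posix())
        tree[category_dir.name] = files
    return tree
```

Calling sketch_map_file_tree("raw", categories="jacke") would, under these assumptions, return only the files below raw/jacke that end in .txt.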

fogdb.raw.lcl

Module providing a handler class for mapping locally stored raw data.

class fogdb.raw.lcl.Handler(top_level_folder, categories='all', dtype='txt', excl_dirs=('FRITZ',))[source]

Bases: fogdb.raw.BaseHandler

Handle data mapping for locally stored raw data.

Parameters
  • top_level_folder (str, pathlib.Path) – String/Path specifying the path of the toplevel folder where the raw data is found.

  • categories (str, list, default="all") –

    String or list of strings specifying which categories (i.e. sublevel folders) are used for reading in the data. If "all" is used, all sublevel folders are traversed.

    Can be something like "crawford", "jacke", "myRand0mSUBf0lder", …

  • dtype (str, default="txt") –

    String specifying the data type of the raw datafiles. If "all" is used, data type is not filtered.

    Can be something like "rst", "cfg", …

  • excl_dirs (Container) – Container of strings specifying folder names to exclude during the mapping.

map_source_file_data(relative_file_path)[source]

Return source file data mappings.
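
A possible way to combine the local handler with fogdb.raw.to_dict is sketched below. It relies only on the signatures documented on this page and assumes that to_dict accepts a handler instance; the wrapper name load_local_raw_data and the folder name in the usage comment are hypothetical.

```python
def load_local_raw_data(top_level_folder, categories="all"):
    """Hypothetical wrapper: parse locally stored raw data into nested dicts."""
    # Imported lazily so this sketch can be defined without fogdb installed.
    from fogdb.raw import to_dict
    from fogdb.raw.lcl import Handler

    handler = Handler(
        top_level_folder,
        categories=categories,
        dtype="txt",
        excl_dirs=("FRITZ",),
    )
    # Assumption: to_dict takes a handler instance, not the class itself.
    return to_dict(handler)


# Usage (the folder name is an assumption):
# data = load_local_raw_data("raw_data", categories=["crawford", "jacke"])
```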

fogdb.raw.smb

Module providing a handler class for mapping raw data stored on a samba shared network service.

class fogdb.raw.smb.Handler(connection, sharename, top_level_folder, categories='all', dtype='txt', excl_dirs=('FRITZ',))[source]

Bases: fogdb.raw.BaseHandler

Handle data mapping for raw data stored on a samba shared network service.

Parameters
  • connection

    pysmb SMBConnection instance providing the connection to the samba shared network service labeled by sharename.

    This is usually something like:

    from smb.SMBConnection import SMBConnection
    conn = SMBConnection(
        username="MyUserName",
        password="MyPassword",
        my_name="",
        remote_name="192.168.178.1",
    )
    

  • sharename (str) –

    String labeling the samba shared network service, provided by the connection.

    For a Fritz!Box, this usually is:

    "fritz.nas"
    

  • top_level_folder (str) – String specifying the name of the toplevel folder where the raw data is found.

  • categories (str, list, default="all") –

    String or list of strings specifying which categories (i.e. sublevel folders) are used for reading in the data. If "all" is used, all sublevel folders are traversed.

    Can be something like "crawford", "jacke", "myRand0mSUBf0lder", …

  • dtype (str, default="txt") –

    String specifying the data type of the raw datafiles. If "all" is used, data type is not filtered.

    Can be something like "rst", "cfg", …

  • excl_dirs (Container) – Container of strings specifying folder names to exclude during the mapping.

map_source_file_data(relative_file_path)[source]

Return source file data mappings.
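
Putting the parameters above together, reading raw data from a samba share might look roughly like the following sketch. The wrapper name load_smb_raw_data, the credentials and the folder name are placeholders. Whether the handler expects an already connected SMBConnection is an assumption, as is passing a handler instance to to_dict.

```python
def load_smb_raw_data(ip, username, password, top_level_folder,
                      sharename="fritz.nas"):
    """Hypothetical wrapper: parse raw data stored on a samba share."""
    # Imported lazily so this sketch can be defined without pysmb or fogdb.
    from smb.SMBConnection import SMBConnection
    from fogdb.raw import to_dict
    from fogdb.raw.smb import Handler

    conn = SMBConnection(
        username=username,
        password=password,
        my_name="",
        remote_name=ip,
    )
    # Assumption: the handler needs an established connection.
    if not conn.connect(ip):
        raise ConnectionError(f"Could not connect to {ip}")
    try:
        handler = Handler(conn, sharename, top_level_folder)
        return to_dict(handler)
    finally:
        conn.close()


# Usage (all values are placeholders):
# data = load_smb_raw_data("192.168.178.1", "MyUserName", "MyPassword", "raw_data")
```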