32 D. S. Brennan et al.

    for document in collection.find(
        {"population": "index-example"},
        {"_id": 0}
    ):
        pp.pprint(document)

Fig. 8 Retrieving documents via the find method

‘1588007839000000000’ on the structure ‘structure-1’ in the population ‘population-1’; however, another channel called ‘world’ would be allowed. The section of the script that creates this index is provided below.

    db["collection"].create_index([
        ("population", pymongo.ASCENDING),
        ("name", pymongo.ASCENDING),
        ("timestamp", pymongo.ASCENDING),
        ("channels.name", pymongo.ASCENDING)
    ], unique=True)

The index created in the example script (see Appendix 4) also enables Channel data for a single timestamp on a Structure within a Population to be split across multiple documents (see Sect. 4.3.6 for the cohesion with the PBSHM Schema Requirements). The division of data across multiple documents can provide infrastructure benefits; however, the main advantage to the PBSHM Frameworks is data security and integrity. Because Channel data are split across multiple documents, PBSHM Framework users uploading data only require insert and find permissions; this means the database can operate as an immutable data store.

The advantage to the PBSHM Framework is that data from an end user can be uploaded into the PBSHM database as soon as they become available; this becomes apparent when considering that multiple subsystems may be involved in data capture on a structure. If only a single document were permitted for a timestamp on a Structure within the Population, the end user would create a document when the first set of data was available, and would then have to find and update the existing document to include additional data. When multiple documents are permitted for a timestamp on a Structure within the Population, the end user is no longer required to find and update existing documents; the end user instead creates a new document for the additional data.
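The splitting described above can be illustrated with a minimal sketch. It is not the script from the appendices: the helper names `split_by_channel` and `index_keys`, the channel name ‘example’, the `value` field and the integer form of the timestamp are all assumptions made for illustration; only the field names `population`, `name`, `timestamp` and `channels.name` are taken from the index shown in the text.

```python
from copy import deepcopy

# Top-level fields covered by the unique index described in the text.
INDEX_FIELDS = ("population", "name", "timestamp")


def split_by_channel(document):
    """Return one new document per Channel found in `document`.

    Hypothetical helper: each returned document repeats the shared
    top-level fields and carries exactly one nested Channel object.
    """
    shared = {key: value for key, value in document.items() if key != "channels"}
    return [
        {**deepcopy(shared), "channels": [deepcopy(channel)]}
        for channel in document.get("channels", [])
    ]


def index_keys(document):
    """Approximate the (population, name, timestamp, channels.name)
    combinations that the unique index would record for `document`."""
    base = tuple(document[field] for field in INDEX_FIELDS)
    return [base + (channel["name"],) for channel in document["channels"]]


# Illustrative Structure document; the channel name 'example' and the
# 'value' fields are assumptions, not values taken from the paper.
structure = {
    "name": "structure-1",
    "population": "population-1",
    "timestamp": 1588007839000000000,
    "channels": [
        {"name": "example", "value": 1.0},
        {"name": "world", "value": 2.0},
    ],
}

parts = split_by_channel(structure)
keys = [key for part in parts for key in index_keys(part)]
```

With a live connection, each element of `parts` could be passed to the collection's insert methods by a user holding only insert and find permissions; an insert that repeated an existing combination of population, name, timestamp and channel name would be rejected by the unique index.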
An example outlining how the same Channel data can be split across multiple files is provided in this paper (see Appendix 5).

4.3.6 Querying

The PBSHM Schema Requirements state that retrieval of information from the PBSHM database must be easy to understand; however, because Channel data may be stored as either a single document or multiple documents for a given timestamp on a Structure within a Population, using the standard MongoDB find method (see Fig. 8) for querying the database would produce different results depending on the setup of the environment. MongoDB provides an aggregate method which enables data to be processed before results are returned. Using the aggregate method, results can be standardised regardless of the single/multiple document environment.

The aggregate method takes an array of aggregate operations (a pipeline) to perform the data processing. The match operation of the aggregate method accepts the same query syntax as the query parameter of the find method. The aggregate example included (see Fig. 9) performs the following operations: it unwinds the Structure documents into virtual Structure documents, creating a document for every Channel within a Structure document such that each virtual Structure document contains only one nested Channel object; it groups all of the virtual documents based upon the population, name and timestamp values; and it then removes from the returned document the id that was created to group the documents together. A full example detailing how to connect to the server and query the PBSHM Database via both the find and aggregate methods is provided in this paper (see Appendix 6).
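The operations listed above can be sketched as a pipeline of standard MongoDB stages. This is a minimal illustration rather than a reproduction of Fig. 9: the function name `build_pipeline` is hypothetical, and the choice of `$first` and `$push` to rebuild the grouped document is an assumption about how the grouping might be expressed.

```python
def build_pipeline(population):
    """Build an aggregate pipeline that standardises results for one population.

    Hypothetical helper: the stages follow the match, unwind, group and
    id-removal steps described in the text.
    """
    return [
        # Same query syntax as the query parameter of the find method.
        {"$match": {"population": population}},
        # One virtual document per nested Channel object.
        {"$unwind": "$channels"},
        # Regroup the virtual documents by population, name and timestamp.
        {"$group": {
            "_id": {
                "population": "$population",
                "name": "$name",
                "timestamp": "$timestamp",
            },
            "population": {"$first": "$population"},
            "name": {"$first": "$name"},
            "timestamp": {"$first": "$timestamp"},
            "channels": {"$push": "$channels"},
        }},
        # Remove the id that was created only to group the documents.
        {"$project": {"_id": 0}},
    ]


pipeline = build_pipeline("index-example")

# With a live connection (not created here), the pipeline would be used as:
#     for document in collection.aggregate(pipeline):
#         pp.pprint(document)
```

Because the grouping key matches the population, name and timestamp fields, a pipeline of this shape should return one document per timestamp on a Structure whether the Channel data were stored in one document or in several.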