Infrastructure Data

The Project and Asset Information Models are a combination of geometrical (graphical) data, non-geometrical data and documents. This means that every component, product, material and system that makes up any infrastructure and its associated physical assets is expected to have some level of geometrical data, non-geometrical data and associated documents. Figure 18: Information Models and Data illustrates this relationship between information models, information types, applications and file formats. Note that the file types shown are indicative only and do not represent endorsement of specific products over open file formats. A list of file formats accepted by the NSW Spatial Data Platform is provided in Appendix C – Standards.

Figure 18: Information Models and Data

A key concept in the development of infrastructure data is the level of data definition required across the different lifecycle stages. ISO 19650 defines this as the “level of information need” and states that “the quality of each information deliverable should be defined in terms of its granularity to serve the purpose for which the information is required and no more”. From an industry perspective, this information need is referred to as Level of Definition, where the amount of:

  • Geometrical information developed for a given stage is termed “Level of Detail” or LOD, and
  • Non-geometrical information developed is termed “Level of Information” or LOI.

AS ISO 19650-2, Organization and digitization of information about buildings and civil engineering works, including building information modelling (BIM) – Information management using building information modelling – Part 2: Delivery phase of the assets, details the typical requirements for each Level of Definition across the asset lifecycle. It explains what the information model can be relied upon for at each stage of the development process, as may be required to support coordination activities, logistics planning, programming and cost planning. This in turn determines the detail required within the 3D models developed by the project.

Figure 19: Levels of Definition

Agencies need to be specific about the expected minimum levels of definition for project phases, as well as what is required for Operations and Maintenance. Note that LOD500 may not be required for all data developed during a project phase, as this level of information may not be required for operations and maintenance.
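As a rough illustration of how an agency might record its minimum levels of definition and check a deliverable against them, the Python sketch below uses a simple lookup table. The stage names, the LOD values and the function `lod_shortfalls` are hypothetical placeholders chosen for the example; they are not taken from the IDMF or from ISO 19650.

```python
from __future__ import annotations

# Hypothetical minimum Level of Detail (LOD) per lifecycle stage.
# These numbers are placeholders; each agency would define its own.
MINIMUM_LOD: dict[str, int] = {
    "Planning": 100,
    "Design": 300,
    "Construction": 400,
    # Deliberately below 500: not every element needs LOD500 for O&M.
    "Operations and Maintenance": 300,
}


def lod_shortfalls(stage: str, elements: dict[str, int],
                   minimums: dict[str, int] = MINIMUM_LOD) -> list[str]:
    """Return the names of model elements below the stage's minimum LOD."""
    required = minimums[stage]
    return [name for name, lod in elements.items() if lod < required]


# Hypothetical design-stage model: element name -> LOD actually delivered.
design_model = {"bridge_deck": 300, "handrail": 200, "drainage_pit": 350}
shortfalls = lod_shortfalls("Design", design_model)
print(shortfalls)
```

A similar table could be kept for Level of Information (LOI), so that geometrical and non-geometrical requirements are checked in the same way.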

It is well understood that not all data is created equal – some data is structured, but from a data volume perspective, most is unstructured. The way the data is collected, processed and analysed all depends on its structure and format.

Structured data comprises clearly defined data types whose pattern makes them easily searchable, while unstructured data (“everything else”) comprises data that has no pre-defined format or organisation and is usually not as easily searchable. Unstructured data includes formats like audio, video and social media postings. In addition to being collected, processed and analysed in different ways, unstructured data typically resides in different databases to structured data. Figure 20 illustrates the relationships between structured, semi-structured and unstructured data.

Figure 20: Structured, Semi-Structured and Unstructured Data
Source: Adapted from “Non-Geometric Information Visualization in BIM: An Approach to Improve Project Team Communication” by Paula Gomez Zamora

There is no conflict between the use of structured and unstructured data. However, agencies must be clear about their information requirements in order to define the most appropriate data structure, including the applications that use the data, e.g. relational databases for structured data, and many other types of applications for unstructured data.

Data management is becoming increasingly complex because of the many disparate data sources and the continuing rise in the volume of both structured and unstructured data. As a result, agencies increasingly need to handle both large volumes of data and large individual files.

The majority of infrastructure data currently available to agencies is unstructured. Unfortunately, most of the data currently captured and managed is not machine readable, not interoperable, and poorly structured, if it is structured at all.

However, regardless of whether data is structured or unstructured, having the most accurate and relevant data available will be key for agencies looking to make better whole-of-life decisions on their infrastructure. For overall success, agencies need to properly and effectively analyse all their data, regardless of its source or type, to understand how best to maximise the value of infrastructure data.

Structured Data

Structured data is essential in all the stages of a built asset’s lifecycle and the quality of the data must be consistently validated.

During the early stages of the asset lifecycle (Strategic Planning, Planning and Design), quality data is used to assist in decision making. Developing information models with structured data produces value-driven results that can be adapted, and from which further data can be derived, to provide the best possible outcome.

During the construction stage, structured data is used to ensure that the values defining the performance of the products installed in an infrastructure asset meet the design and technical design criteria. Structured data is also key to developing the as-built model of an asset.

Given the time and cost associated with infrastructure during the Operations and Maintenance stage, quality data is critical to support activities such as maintenance scheduling, increasing efficiencies when replacing and upgrading parts, and measuring actual performance against proposed requirements over time.

Geometrical and Non-Geometrical Data

Structured geometrical data is spatial or object-based data (3D model or graphical representation) of the physical asset, while structured non-geometrical data (e.g. a construction schedule) is derived from, or linked to, a geometrical model. Structured data for infrastructure includes the following types of information:

  • Geometrical data:
    • 2D CAD models
    • 3D Models (design, construction and as-builts, etc.)
    • GIS data sets
  • Non-geometrical data (when associated with model geometry):
    • 4D schedule (time)
    • 5D cost (e.g. estimates)
    • 6D asset (for operations and maintenance)
    • Other linked data may include risk, health and safety, sustainability, etc.

Spatial and 3D model data is commonly visualised as geometry (lines, surfaces and solids) with parameters and other aspects of the model linked to it. Non-geometrical data could be derived directly from the model (e.g. areas) and stored in a database, or it could be extracted from an external database (e.g. from suppliers) and be stored in a dataset that is dependent on the geometry (e.g. materials for cost estimation).
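The two routes described above can be sketched in Python: one value (an area) is derived directly from the geometry, and another (a unit rate) is looked up in an external dataset and combined with it. This is a minimal sketch under stated assumptions: the `SlabElement` record, the `asset_id` value, the `SUPPLIER_RATES` table and its rates are all hypothetical and stand in for a real model and a real supplier database.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SlabElement:
    """Geometrical record: a flat slab outlined by (x, y) vertices in metres."""
    asset_id: str    # hypothetical common identifier
    material: str
    outline: tuple[tuple[float, float], ...]


def polygon_area(outline: tuple[tuple[float, float], ...]) -> float:
    """Area of a simple polygon from its vertices (shoelace formula)."""
    total = 0.0
    n = len(outline)
    for i in range(n):
        x1, y1 = outline[i]
        x2, y2 = outline[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Stand-in for an external supplier database: made-up cost per square metre.
SUPPLIER_RATES = {"concrete": 120.0, "asphalt": 45.0}


def cost_estimate(element: SlabElement, rates: dict[str, float]) -> dict:
    """Combine a geometry-derived area with an externally sourced rate."""
    area = polygon_area(element.outline)
    return {
        "asset_id": element.asset_id,
        "area_m2": area,
        "cost": area * rates[element.material],
    }


slab = SlabElement("BR-001-SLAB-01", "concrete",
                   ((0.0, 0.0), (10.0, 0.0), (10.0, 4.0), (0.0, 4.0)))
estimate = cost_estimate(slab, SUPPLIER_RATES)
print(estimate)
```

Because the estimate is computed from the outline rather than typed in separately, it stays dependent on the geometry: changing the slab outline changes the area and the cost together.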

What good looks like

  • Clear, prescriptive information requirements (including data required) for infrastructure (for Projects and Operations and Maintenance);
  • Identification of a common identifier to link all information, e.g. through adoption of consistent asset and location classification schemas, ideally compliant with ISO 12006-2:2015, Building construction – Organization of information about construction works – Part 2: Framework for classification, and with the NSW Standard for Spatially Enabling Information;
  • Clear specifications on formats of geometrical and non-geometrical information deliverables, compliant with open international standards where available (see Appendix C – Standards);
  • Consistent requirements for the exchange of information deliverables at a NSW Government and agency level, which can be consistently communicated to industry and service providers;
  • Appropriate technology infrastructure that supports good data management practices, using open standards and architecture, open data exchange, access control and back up; and
  • Internal data capabilities at NSW Government, agency and project level to view, review, share and store the structured information deliverables (e.g. via Common Data Environments).
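The value of a common identifier can be shown with a short Python sketch that groups information deliverables by asset and flags any that cannot be linked. The register, the identifiers, the file references and the function `link_by_asset` are hypothetical examples, not an IDMF schema.

```python
from __future__ import annotations

from collections import defaultdict

# Hypothetical asset register: common identifier -> asset description.
ASSET_REGISTER = {"BR-001": "Bridge deck", "PS-014": "Pump station"}

# Hypothetical deliverables from different sources, each tagged with an identifier.
DELIVERABLES = [
    {"asset_id": "BR-001", "kind": "3d_model", "ref": "bridge_deck.ifc"},
    {"asset_id": "BR-001", "kind": "schedule", "ref": "deck_pour_programme"},
    {"asset_id": "PS-014", "kind": "cost", "ref": "pump_station_estimate"},
    {"asset_id": "XX-999", "kind": "document", "ref": "unlabelled_scan.pdf"},
]


def link_by_asset(register: dict[str, str],
                  deliverables: list[dict[str, str]]
                  ) -> tuple[dict[str, list[str]], list[str]]:
    """Group deliverable references by asset; collect those with unknown identifiers."""
    linked: dict[str, list[str]] = defaultdict(list)
    orphans: list[str] = []
    for item in deliverables:
        if item["asset_id"] in register:
            linked[item["asset_id"]].append(item["ref"])
        else:
            orphans.append(item["ref"])
    return dict(linked), orphans


linked, orphans = link_by_asset(ASSET_REGISTER, DELIVERABLES)
print(linked)
print(orphans)
```

Deliverables that land in `orphans` are the ones a consistent classification schema is meant to prevent, since they cannot be tied back to any asset.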

How to achieve good practice

  • Develop an agreed approach for infrastructure data management that aligns with the IDMF;
  • Develop a standardised approach to structured (and unstructured) data across the asset lifecycle, noting that the approach depends on specific stage requirements;
  • Incorporate data requirements in procurement processes;
  • Utilise guidance and technical support for infrastructure data procurement; and
  • Utilise data expertise to support projects, information handover and O&M.

Semi-structured Data

Some data used in an infrastructure context is neither structured nor unstructured. Semi-structured data maintains internal tags and markings that identify separate data elements, which in turn enables information grouping and hierarchies. Both documents and databases can be semi-structured. This type of data, which has critical business usage and value, is typically about 5-10% of the total volume of data (structured, semi-structured and unstructured combined). Examples of semi-structured data include:

  • Email: Although more advanced analysis tools are necessary for thread tracking or concept searching, the native email metadata enables classification and keyword searching without any additional tools.
  • IoT sensor data: This type of data will increasingly require more attention from agencies to be better prepared for management of the large volumes of data generated by sensors. See the NSW IoT Policy for more guidance on the management of IoT data.
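Both examples can be sketched with the Python standard library. In the sketch below, the email headers act as the internal tags that allow keyword searching, and a sensor message is assumed to arrive as JSON with nested readings. The addresses, the sensor identifier, the field names and the message layout are all hypothetical; real IoT payloads vary by device and platform.

```python
from __future__ import annotations

import json
from email import message_from_string

# Hypothetical email: tagged headers, followed by a free-text body.
RAW_EMAIL = """\
From: site.engineer@example.com
To: asset.manager@example.com
Subject: Pump station inspection report
Date: Tue, 24 Nov 2020 09:30:00 +1100

Inspection completed. Minor corrosion noted on the discharge valve.
"""


def email_metadata(raw: str) -> dict[str, str]:
    """Extract the native header fields that make an email searchable."""
    msg = message_from_string(raw)
    return {"from": msg["From"], "to": msg["To"],
            "subject": msg["Subject"], "date": msg["Date"]}


def matches_keyword(metadata: dict[str, str], keyword: str) -> bool:
    """Case-insensitive keyword search over the subject line only."""
    return keyword.lower() in metadata["subject"].lower()


# Hypothetical sensor message with nested readings (assumed JSON layout).
RAW_READING = (
    '{"sensor_id": "VIB-0042", "timestamp": "2020-11-24T09:30:00+11:00", '
    '"readings": {"vibration_mm_s": 2.7, "temperature_c": 41.5}}'
)


def flatten_reading(raw: str) -> list[dict]:
    """Turn one nested sensor message into flat rows, one per measure."""
    doc = json.loads(raw)
    return [
        {"sensor_id": doc["sensor_id"], "timestamp": doc["timestamp"],
         "measure": name, "value": value}
        for name, value in doc["readings"].items()
    ]


meta = email_metadata(RAW_EMAIL)
rows = flatten_reading(RAW_READING)
print(meta["subject"], matches_keyword(meta, "inspection"))
print(rows)
```

Flattening the nested readings into rows is one way semi-structured sensor data could be prepared for loading into a structured store; the email body, by contrast, remains unstructured text.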

What good looks like

  • Clarity on data governance for semi-structured data;
  • Consistent requirements for the exchange of semi-structured information to ensure that agencies can engage appropriately with suppliers;
  • Appropriate technology infrastructure to support management of larger volumes of semi-structured data, for instance, data lakes; and
  • Data capabilities to manage, analyse and interpret semi-structured data.

How to achieve good practice

  • Develop an agency approach to increase the value of semi-structured data by improving its classification and metadata. This will make the data less prone to the “garbage in, garbage out” maxim.

Unstructured Data

Unstructured data is most often categorised as qualitative data and is difficult to process and analyse using conventional tools and methods. Unstructured data includes word processing documents, multimedia, video, PDF files, spreadsheets, messaging content, digital pictures and graphics, mobile phone GPS records, satellite imagery, and surveillance imagery. The challenge is that most of this data is used inefficiently. Significant industry effort is devoted to development of automated processes to make sense of the large amounts of data the construction industry produces.

Unstructured data is difficult to deconstruct because it has no pre-defined model, meaning it cannot be organised in relational databases. More than 80 percent of all data generated today is considered unstructured, and this proportion will continue to rise with the growth of technologies such as the Internet of Things. Finding the insight buried within unstructured data is often complex, requiring advanced analytics (e.g. Artificial Intelligence) and a high level of technical expertise to really make a difference.

What good looks like

  • Clarity on the uses and value of unstructured data to support agency infrastructure;
  • Consistent requirements for the exchange of unstructured information to ensure that agencies can engage appropriately with suppliers;
  • Appropriate technology infrastructure to support management of much larger volumes of unstructured data; and
  • Data capabilities to manage, analyse and interpret unstructured data.

How to achieve good practice

  • Develop an agency approach to increase the value of unstructured data by adopting appropriate methods and technologies, such as Artificial Intelligence, to support infrastructure management.

 



Last updated 24 Nov 2020