Data integration is the process of combining data from different sources into a single, unified view. To support the NSW Government's aims for Smart Places and a NSW Digital Twin, agencies need to be aware of the requirements for integrating data from different sources. For instance, a typical GIS environment used to host a digital twin may not be able to consume and display a 3D infrastructure model without some form of transformation. Integration thus begins with the ingestion process and includes steps such as cleansing, mapping and transformation. Ultimately, data integration enables the analysis of federated information models, supporting analytics tools that can produce effective, actionable business intelligence.
For infrastructure data, integration is the process of taking data from a number of disparate sources and making it usable. As the number of sources continues to grow, the need for effective data integration becomes more important.
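The cleansing, mapping and transformation steps mentioned above can be illustrated with a minimal sketch. The field names, the target schema and the feet-to-metres conversion are assumptions chosen for illustration only; they do not come from any NSW Government standard.

```python
from typing import List, Optional

# Hypothetical mapping from source field names to a common schema.
FIELD_MAP = {"AssetID": "asset_id", "Type": "asset_type", "Len_ft": "length_m"}
FEET_TO_METRES = 0.3048


def cleanse(record: dict) -> Optional[dict]:
    """Strip stray whitespace and reject records without an identifier."""
    cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
    return cleaned if cleaned.get("AssetID") else None


def map_fields(record: dict) -> dict:
    """Rename source fields to the common schema, dropping unmapped fields."""
    return {FIELD_MAP[k]: v for k, v in record.items() if k in FIELD_MAP}


def transform(record: dict) -> dict:
    """Convert the length value from feet to metres."""
    out = dict(record)
    out["length_m"] = round(float(out["length_m"]) * FEET_TO_METRES, 2)
    return out


def integrate(records: List[dict]) -> List[dict]:
    """Run each ingested record through cleanse, map and transform."""
    unified = []
    for raw in records:
        cleaned = cleanse(raw)
        if cleaned is None:
            continue  # skip records that fail cleansing
        unified.append(transform(map_fields(cleaned)))
    return unified


# Illustrative ingested records; the second has no identifier and is dropped.
source = [
    {"AssetID": " BR-001 ", "Type": "bridge", "Len_ft": "100"},
    {"AssetID": "", "Type": "culvert", "Len_ft": "12"},
]
unified = integrate(source)
```

Keeping each step as a separate function makes it easier to swap in a different rule set for each source system without changing the overall flow.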
There are several key components of data integration relevant to infrastructure data, including:
Data migration
Data migration is the process of moving data between locations, formats or applications. It is often caused by the introduction of a new system or location for the data. One common cause today is the shift from on-premises to cloud-based storage and applications.
Data migration is also relevant to agencies when data is transferred to and from service providers, whether for a project (short duration) or for a services contract to outsource the operation and/or maintenance of state-owned assets (longer duration).
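As a small illustration of moving data between formats, the sketch below converts a CSV export to JSON and then checks that no records were lost or altered. The function names and the sample asset register are hypothetical; a real migration would also need to address data types, encodings and access controls.

```python
import csv
import io
import json


def migrate_csv_to_json(csv_text: str) -> str:
    """Convert CSV rows into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)


def verify_migration(csv_text: str, json_text: str) -> bool:
    """Confirm that the migrated records match the source records."""
    source_rows = list(csv.DictReader(io.StringIO(csv_text)))
    return source_rows == json.loads(json_text)


# Illustrative export from a legacy asset register.
legacy_export = "asset_id,asset_type\nBR-001,bridge\nTN-002,tunnel\n"
migrated = migrate_csv_to_json(legacy_export)
is_complete = verify_migration(legacy_export, migrated)
```

A verification step of this kind is particularly useful when data is handed over at the start or end of a contract, because it gives both parties evidence that the transfer was complete.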
Application integration
Application integration is one approach to achieving interoperability between different business systems. Specifically, it requires addressing problems that stem from the organisational structure of an agency and its arrangements with specific business partners. Some key factors to consider include:
- Interoperability – managing different operating systems and data formats so that they can be connected;
- Integration – creation of a standard process for managing the flow of data between applications and systems to ensure consistency; and
- Robustness, stability, scalability – whatever solutions are implemented, they need to be able to adapt to changes within the business environment.
Typical solutions also include middleware to help with centralisation and standardisation of data management.
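One way to picture the role of middleware is as a hub that converts each application's messages into a shared format before passing them on. The sketch below is a simplified, in-memory illustration under that assumption; the class, the application names and the message fields are all hypothetical and do not represent any particular middleware product.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class IntegrationHub:
    """Toy middleware: standardises messages, then routes them by topic."""

    def __init__(self) -> None:
        self._adapters: Dict[str, Callable[[dict], dict]] = {}
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def register_adapter(self, app: str, adapter: Callable[[dict], dict]) -> None:
        """Record how to convert one application's messages to the shared format."""
        self._adapters[app] = adapter

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        """Register a receiving application for a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, app: str, topic: str, message: dict) -> None:
        """Standardise a message and deliver it to every subscriber."""
        canonical = self._adapters[app](message)
        for handler in self._subscribers[topic]:
            handler(canonical)


hub = IntegrationHub()
# Hypothetical maintenance system that uses its own field names.
hub.register_adapter(
    "maintenance_app",
    lambda m: {"asset_id": m["AssetRef"], "status": m["State"].lower()},
)

received: List[dict] = []  # stands in for a second business system
hub.subscribe("asset_status", received.append)
hub.publish("maintenance_app", "asset_status", {"AssetRef": "BR-001", "State": "CLOSED"})
```

Because each application only needs an adapter to the shared format, adding a new system does not require changes to the systems already connected.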
Data aggregation
Data federation
Data federation is becoming increasingly important within the infrastructure space as it supports design activities such as clash detection, and O&M activities such as wayfinding when looking for specific assets to maintain.
Data federation typically creates a virtual database that does not store the source data, but contains information about where the actual data is. Regardless of how and where data is stored, it should be presented as one integrated data set. In practice, this often means that data federation involves transformation, cleansing and, if necessary, enrichment of data.
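The idea of a virtual database that records where data lives, rather than holding a copy of it, can be sketched as follows. The catalogue class and the two in-memory sources are hypothetical stand-ins for real systems such as a GIS and a BIM model store.

```python
from typing import Callable, Dict, Iterator, List


class FederatedCatalogue:
    """Holds references to data sources, not the data itself."""

    def __init__(self) -> None:
        self._sources: Dict[str, Callable[[], List[dict]]] = {}

    def register(self, name: str, fetch: Callable[[], List[dict]]) -> None:
        """Record how to reach a source; no data is copied at this point."""
        self._sources[name] = fetch

    def query(self, asset_type: str) -> Iterator[dict]:
        """Fetch from every source on demand and present one combined result."""
        for name, fetch in self._sources.items():
            for record in fetch():
                if record.get("asset_type") == asset_type:
                    yield {**record, "source": name}


# Illustrative stand-ins for two separately managed systems.
def gis_source() -> List[dict]:
    return [{"asset_id": "BR-001", "asset_type": "bridge"}]


def bim_source() -> List[dict]:
    return [
        {"asset_id": "BR-002", "asset_type": "bridge"},
        {"asset_id": "TN-001", "asset_type": "tunnel"},
    ]


catalogue = FederatedCatalogue()
catalogue.register("gis", gis_source)
catalogue.register("bim", bim_source)
bridges = list(catalogue.query("bridge"))
```

Because the sources are queried only when a request is made, the combined view reflects the current state of each system, at the cost of depending on every source being available at query time.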
Data warehousing
Data warehousing aggregates structured data from one or multiple sources in order to compare and analyse the data to achieve greater business intelligence. It is effective for getting a better understanding of the overall performance of infrastructure and associated assets because it makes a wide range of data available for analysis.
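In contrast to federation, a warehouse stores a copy of the data in one place so that it can be compared and analysed. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table name, columns and condition scores are assumptions made for illustration only.

```python
import sqlite3

# In-memory database standing in for a data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE condition_facts ("
    "asset_id TEXT, asset_type TEXT, source TEXT, condition_score REAL)"
)

# Illustrative structured data from two hypothetical source systems.
inspections = [("BR-001", "bridge", 3.0), ("BR-002", "bridge", 4.0)]
sensors = [("TN-001", "tunnel", 2.0), ("BR-001", "bridge", 5.0)]

# Load every source into the same table, tagging each row with its origin.
for source_name, rows in (("inspections", inspections), ("sensors", sensors)):
    conn.executemany(
        "INSERT INTO condition_facts VALUES (?, ?, ?, ?)",
        [(asset_id, asset_type, source_name, score) for asset_id, asset_type, score in rows],
    )

# Analyse across sources: asset count and average condition score by type.
summary = conn.execute(
    "SELECT asset_type, COUNT(DISTINCT asset_id), AVG(condition_score) "
    "FROM condition_facts GROUP BY asset_type ORDER BY asset_type"
).fetchall()
```

Recording the originating system alongside each row keeps the aggregated figures traceable back to their source.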
Last updated 12 Nov 2020