Working with Legacy Relational Data in a Graph-Based World
Abstract
The Next Generation Linkage Management System (NGLMS) was designed around keeping all data in a graph database. However, this constraint, while easily achievable for greenfield projects and/or new data linkage units, may not be easily met where legacy data exists.
Objectives and Approach
The NGLMS was extended to encompass systems where data is held partially or even completely in a relational database. The data managed by the NGLMS is grouped into system metadata, record link data and record data, and system metadata and record data can be stored separately and independently in either a relational or a graph database. This allows hybrid installations of mixed graph and relational data and, with some loss of functionality, purely relational installations.
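The grouping described above can be illustrated with a minimal sketch. This is not the NGLMS configuration format; the names `Backend`, `StorageConfig` and `installation_type` are hypothetical and chosen only to show how assigning a backend to each data group yields the three kinds of installation.

```python
from dataclasses import dataclass
from enum import Enum


class Backend(Enum):
    GRAPH = "graph"
    RELATIONAL = "relational"


@dataclass(frozen=True)
class StorageConfig:
    """Hypothetical per-group storage choice (an assumption, not the NGLMS schema)."""
    system_metadata: Backend
    record_links: Backend
    record_data: Backend


def installation_type(config: StorageConfig) -> str:
    """Classify an installation by the set of backends it uses."""
    backends = {config.system_metadata, config.record_links, config.record_data}
    if backends == {Backend.GRAPH}:
        return "graph-only"
    if backends == {Backend.RELATIONAL}:
        return "relational-only"
    return "hybrid"


# Legacy record data stays relational while links live in the graph.
hybrid = StorageConfig(
    system_metadata=Backend.GRAPH,
    record_links=Backend.GRAPH,
    record_data=Backend.RELATIONAL,
)
print(installation_type(hybrid))
```

In this sketch, any mixture of backends counts as hybrid; a real deployment would presumably constrain which combinations are supported.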
Results
The functionality of the NGLMS was expanded to allow review of existing legacy data stored in a relational database system. Through minor changes to the server used by the NGLMS Clerical Review tool (NGLMS-CR), the review tool was able to present the same interface and allow the same integration as for projects stored completely within the graph database.
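One way a review server could present the same interface over either store is to hide the backend behind a common record-access interface. The sketch below is an assumption for illustration, not the NGLMS-CR server code: `RecordStore`, `RelationalRecordStore`, `GraphRecordStore`, `records_for_review` and the `records` table layout are all hypothetical, and the graph store is a plain dictionary standing in for a real graph database client.

```python
import sqlite3
from typing import Protocol


class RecordStore(Protocol):
    """Hypothetical interface the review server would depend on."""

    def get_record(self, record_id: str) -> dict: ...


class RelationalRecordStore:
    """Reads legacy records from a relational table (assumed schema)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def get_record(self, record_id: str) -> dict:
        row = self.conn.execute(
            "SELECT id, name, dob FROM records WHERE id = ?", (record_id,)
        ).fetchone()
        if row is None:
            raise KeyError(record_id)
        return {"id": row[0], "name": row[1], "dob": row[2]}


class GraphRecordStore:
    """Stand-in for a graph store: node properties keyed by record id."""

    def __init__(self, nodes: dict[str, dict]) -> None:
        self.nodes = nodes

    def get_record(self, record_id: str) -> dict:
        return {"id": record_id, **self.nodes[record_id]}


def records_for_review(store: RecordStore, cluster: list[str]) -> list[dict]:
    """Fetch the records of one cluster, regardless of where they are stored."""
    return [store.get_record(record_id) for record_id in cluster]
```

With this shape, the review code calls `records_for_review` without knowing which store it was given, so switching a project from relational to graph storage would only change which store object is constructed.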
Hybrid projects where link information was stored in a graph could be accommodated with no loss of functionality.
Relational-only projects allowed clerical review of identified clusters in a manner identical to the graph-only NGLMS, but with some curtailment of the advanced clustering, time-slicing, compositional and concurrent functionality of the NGLMS, because the capabilities of a graph database were not available. They nonetheless provide an upgrade pathway to hybrid projects and then to graph-only projects.
Conclusion / Implications
Allowing the NGLMS to be configured as a hybrid system enables a gradual adoption of the NGLMS toolset and software.
- purely relational data still allows the use of the NGLMS-CR with customisable workpools and workflows
- a hybrid relational/graph system, where record data remains in a relational store but cluster and linkage information is kept in the graph store, allows the use of legacy data without major disruption and provides a pathway to full adoption