A big data architecture for integration of legacy systems and data
Thesis posted on 05.10.2021, 01:28, authored by Sanjay Jha
Storing, analysing, and accessing data is a growing problem for organisations. Competitive pressures and new regulations require organisations to handle increasing volumes and varieties of data efficiently, but this does not come cheap. Data sets grow rapidly in part because they are increasingly gathered by cheap and numerous information-sensing Internet of Things devices and sources such as mobile devices, aerial remote sensing, software logs, cameras, microphones, radio-frequency identification (RFID) readers, and wireless sensor networks. Such data sets are referred to as big data and are too large or complex for traditional data-processing application software to handle adequately. As the demands of big data exceed the constraints of traditional relational databases, evaluating legacy data and assessing new technology has become a necessity for most organisations, not only to gain competitive advantage but also for compliance purposes. The challenge is how well an organisation's legacy data and processes can be integrated into big data solutions. Big data must be accommodated, and the integration of legacy systems and processes into big data solutions must be dealt with. Legacy systems contain significant and invaluable business logic of the organisation, representing many years of coding, development, real-life experience, enhancements, modifications, and debugging, among other activities. Most legacy systems were developed without process or data models, which are now needed to support integration into big data solutions. To integrate a legacy system into a big data solution, re-engineering of the legacy processes is required, depending on the data used from the legacy system. Many approaches to re-engineering legacy systems have been developed, but none focus on integrating legacy systems with big data solutions (Vijaya & Venkataraman, 2018).
Integrating legacy systems with big data solutions may change an organisation’s Enterprise Architecture (EA), as EA describes the application, data, technology, and business architectures of an organisation. However, addressing the issues and scope related to incorporating legacy systems into big data allows mature legacy systems to become part of overall organisational change, so that big data solutions can be implemented in the organisation. This research addresses issues and concerns of existing legacy systems within an organisation for decision making. It further focuses on identifying current issues and concerns of integrating big data solutions with legacy systems in organisations and proposes a Big Data Architecture for Integration of Legacy Systems and Data. The research was carried out using a combination of quantitative and qualitative studies. To understand the issues and concerns around integrating big data solutions with legacy systems, a survey was conducted on how organisations are using big data for different use cases and what practices they employ to integrate big data solutions. The survey results were used to develop the Big Data Architecture for Integration of Legacy Systems and Data, which addresses the identified issues and concerns. The architecture was applied in industry to demonstrate the usefulness of the developed artifact and was evaluated at a higher education institution.
The key results and contributions emerging from this thesis are listed below, divided into managerial implications, theoretical contributions, and future directions:
• Managerial implication: identifying the requirements of organisations trying to achieve Big Data solutions and what they look for when integrating legacy systems with Big Data solutions.
• Theoretical contribution: identifying the Analytics Value Chain as one of the building blocks for streaming Big Data into EA.
• Theoretical contribution: developing an e-business process model for a Big Data architecture that integrates data from different sources, contributing to a theoretical model that organisations can use.
• Applying the Big Data architecture using open-source technologies contributes to verifying the theoretical model.
• Applying the Big Data architecture in Higher Education Institutions for Learning Analytics contributes to validating the theoretical model.
• Future direction: the Big Data architecture developed in this research should be applied to other case studies in different domains, such as the financial and health sectors, to address real-time analytics. However, the possible obstacles organisations can face are: insufficient understanding of Big Data technologies; complexity of Big Data technologies; complexity of managing data quality; and Big Data security issues.