Most organizations use multiple applications for various purposes
(ERP, Vendor Management System, CRM, Loyalty, IoT Sensors, POS Systems,
etc.). These applications may be built on different technologies and
maintain their own data sets. To extract insights from this data, it
must be integrated seamlessly. Data integration is the step in which
data from different sources is consolidated. It is a foundational step
in a Big Data Analytics ecosystem: data coming from different sources
is ingested, cleansed, transformed and loaded into a common data store
(data lake, data warehouse, data hub, etc.).
Depending on the goals and objectives of the organization, either an
ETL (extract, transform, load) or an ELT (extract, load, transform)
process can be adopted for data integration. If a huge volume of data
needs to be stored in a data lake in its raw format, an ELT process is
suitable, since the transformation and treatment of the data happen
after it is loaded. On the other hand, for a traditional data warehouse
architecture, an ETL process is more suitable, with the transformation
happening before the data is loaded.
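
The difference in ordering between the two processes can be made
concrete with a small sketch. The example below is illustrative only:
it uses Python's built-in sqlite3 module as a stand-in for the common
data store, and the sample records, table names (sales_raw, sales_clean)
and helper functions are all hypothetical. In run_etl the rows are
cleaned in application code before loading; in run_elt the raw rows are
loaded first and cleaned afterwards with SQL inside the store.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from a POS system:
# inconsistent store names and amounts stored as text.
RAW_SALES = [
    ("  Store-1 ", "20.50"),
    ("store-2", "5.00"),
    ("STORE-1", "4.50"),
]


def clean(store, amount):
    """Normalize the store name and convert the amount to a float."""
    return store.strip().lower(), float(amount)


def run_etl(conn):
    """ETL: transform in application code, then load only cleaned rows."""
    conn.execute("CREATE TABLE sales_clean (store TEXT, amount REAL)")
    cleaned = [clean(store, amount) for store, amount in RAW_SALES]
    conn.executemany("INSERT INTO sales_clean VALUES (?, ?)", cleaned)


def run_elt(conn):
    """ELT: load raw rows unchanged, then transform inside the store."""
    conn.execute("CREATE TABLE sales_raw (store TEXT, amount TEXT)")
    conn.executemany("INSERT INTO sales_raw VALUES (?, ?)", RAW_SALES)
    # The raw table is kept, so it can be re-transformed later if needed.
    conn.execute(
        "CREATE TABLE sales_clean AS "
        "SELECT lower(trim(store)) AS store, "
        "CAST(amount AS REAL) AS amount "
        "FROM sales_raw"
    )


def totals(conn):
    """Return total sales per store from the cleaned table."""
    query = (
        "SELECT store, SUM(amount) FROM sales_clean "
        "GROUP BY store ORDER BY store"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    etl_conn = sqlite3.connect(":memory:")
    run_etl(etl_conn)
    elt_conn = sqlite3.connect(":memory:")
    run_elt(elt_conn)
    print(totals(etl_conn))
    print(totals(elt_conn))
```

Both paths end with the same cleaned table. The practical difference is
that the ELT path also keeps the untouched raw data in the store, which
is why it tends to fit a data lake, whereas the ETL path stores only the
transformed result, as a traditional data warehouse would.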
