Data loading
Data loading, or simply loading, is a part of data processing where data is moved between two systems so that it ends up in a staging area on the target system.
With the traditional extract, transform and load (ETL) method, the load job is the last step, and the data that is loaded has already been transformed. With the alternative extract, load and transform (ELT) method, the load job is the middle step, and the extracted data is loaded in its original format so that it can be transformed in the target system.
Traditionally, loading jobs on large systems have taken a long time, and have typically been run at night outside a company's opening hours.
Purpose
The two main goals of data loading are that the target system holds fresher data after the load, and that the load is fast enough for the data to be updated frequently. For a full data refresh, loading can be sped up by turning off referential integrity, secondary indexes and logging, but this is usually not allowed with incremental updates or trickle feed.
Types
Data loading can be done as a full refresh, as an incremental load and update, or as a trickle feed. The choice of technique may depend on the amount of data that is updated, changed or added, and on how up-to-date the data must be. The type of data delivered by the source system, and whether historical data delivered by the source system can be trusted, are also important factors.
Full refresh
Full data refresh means that the existing data in the target table is deleted first. All data from the source is then loaded into the target table, new indexes are created on the target table, and new measures are calculated for the updated table.
Full refresh is easy to implement, but it involves moving a large amount of data, which can take a long time, and it can make it challenging to keep historical data.
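The full refresh steps described above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table name sales, the index name idx_sales_region, the column layout and the helper functions are all hypothetical and chosen only for illustration. The "measure" is assumed here to be a simple sum. Dropping the secondary index before the load and recreating it afterwards is one possible way of avoiding index maintenance during the load, not a prescribed method.

```python
import sqlite3


def make_target() -> sqlite3.Connection:
    """Create a hypothetical in-memory target table holding one stale row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.execute("CREATE INDEX idx_sales_region ON sales (region)")
    conn.execute("INSERT INTO sales VALUES (1, 'north', 10.0)")
    conn.commit()
    return conn


def full_refresh(target: sqlite3.Connection, rows: list) -> float:
    """Replace the contents of `sales` with `rows` and return the new total."""
    # The connection context manager commits on success and rolls back on error.
    with target:
        # 1. Delete the existing data in the target table.
        target.execute("DELETE FROM sales")
        # Drop the secondary index so it is not maintained row by row (assumption).
        target.execute("DROP INDEX IF EXISTS idx_sales_region")
        # 2. Load all rows delivered by the source.
        target.executemany(
            "INSERT INTO sales (id, region, amount) VALUES (?, ?, ?)", rows
        )
        # 3. Create the index anew on the freshly loaded table.
        target.execute("CREATE INDEX idx_sales_region ON sales (region)")
    # 4. Recalculate a measure for the refreshed table (here: the total amount).
    (total,) = target.execute(
        "SELECT COALESCE(SUM(amount), 0.0) FROM sales"
    ).fetchone()
    return total


if __name__ == "__main__":
    target = make_target()
    source_rows = [(1, "north", 12.5), (2, "south", 7.5)]
    print(full_refresh(target, source_rows))  # prints 20.0
```

Note that the stale row with amount 10.0 does not survive the refresh. This illustrates why keeping historical data is difficult with this technique unless the history is stored elsewhere.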