Extract, load, transform


Extract, load, transform (ELT) is a data pipeline model used with data lake implementations as an alternative to extract, transform, load (ETL). In contrast to ETL, ELT does not transform the data on entry to the data lake; the data is stored in its original raw format, which enables faster loading. However, ELT requires sufficient processing power within the data processing engine to carry out the transformation on demand and return results in a timely manner. Because the data is not processed on entry to the data lake, the query and schema do not need to be defined in advance.
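The load-then-transform order described above can be sketched in a few lines. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the data lake and its processing engine, and the table name `raw_orders`, the sample rows, and the helper functions are hypothetical rather than part of any particular ELT product.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from a source system.
# Every field is kept as text; nothing is cleaned or typed before loading.
RAW_ROWS = [
    ("2024-01-01", "alice", "12.50"),
    ("2024-01-01", "bob", "7.25"),
    ("2024-01-02", "alice", "3.00"),
]


def load_raw(conn, rows):
    """Load step: store the records untouched, in their raw text form."""
    conn.execute(
        "CREATE TABLE raw_orders (order_date TEXT, customer TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)


def total_per_customer(conn):
    """Transform step: cast and aggregate inside the engine, at query time."""
    query = """
        SELECT customer, SUM(CAST(amount AS REAL)) AS total
        FROM raw_orders
        GROUP BY customer
        ORDER BY customer
    """
    return conn.execute(query).fetchall()


# SQLite in memory is an assumption standing in for the lake and its engine.
conn = sqlite3.connect(":memory:")
load_raw(conn, RAW_ROWS)
totals = total_per_customer(conn)
print(totals)
```

Because the raw table is never altered, a different question (for example, totals per day) needs only a new query, not a new load, which is the sense in which the query and schema need not be fixed beforehand.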

Benefits

Benefits of an ELT process include faster loading, since data is stored without an upfront transformation step, and the ability to handle both structured and unstructured data.

Cloud data lake components

Common storage options

  • AWS
      • Amazon S3 (Simple Storage Service)
      • Amazon RDS
  • Azure
      • Azure Blob Storage
  • GCP
      • Google Cloud Storage

Querying

  • AWS
      • Redshift Spectrum
      • Athena
      • EMR
  • Azure
      • Azure Data Lake
  • GCP
      • BigQuery