Power Query
Power Query is an ETL tool created by Microsoft for data extraction, loading and transformation, and is used to retrieve data from sources, process it, and load them into one or more target systems. Power Query is available in several variations within the Microsoft Power Platform, and is used for business intelligence on fully or partially self-service platforms. It is found in software such as Excel, Power BI, Analysis Services, Dataverse, Power Apps, Azure Data Factory, SSIS, Dynamics 365, and in cloud services such as Microsoft Dataflows, including Power BI Dataflow used with the online Power BI Service or the somewhat more generic version of Microsoft Dataflow used with Power Automate.
ETL is closely related to data modeling, and for transformation, Power Query can be used to develop a logical data model in those cases where the data does not already have one, or where there is a need to further develop the data model.
History
Power Query was first announced in 2011 under the codename "Data Explorer" as part of Azure SQL Labs. In 2013, in order to expand on the self-service business intelligence capabilities of Microsoft Excel, the project was redesigned to be packaged as an add-in Excel and was renamed "Data Explorer Preview for Excel", and was made available for Excel 2010 and Excel 2013. In July 2013 the add-in was removed from preview and renamed to "Power Query". Monthly updates for the add-in were released until 2016, when Power Query was included in Excel natively. In Excel 2016, the function was renamed "Get & Transform" for a short time, but has since been changed back to Power Query.In April 2017, Power Query was made available in Microsoft Analysis Services. With the launch of the Common Data Service in March 2018, Power Query was included as its main data import tool.
M Formula language
Power Query is built on what was then a new query language called M. It is a mashup language designed to create queries that mix together data. It is similar to the F# programming language, and according to Microsoft it is a "mostly pure, higher-order, dynamically typed, partially lazy, functional language." The M language is case-sensitive.Much of the user interaction with Power Query can be done via graphical user interfaces with wizards, and this can be used for many common or basic tasks. It is also possible to use the advanced editing mode where the developer can write in the M formula language; this gives greater expressive power, more possibilities, and can also be used to change the code generated by the graphical wizards.
Let expression
User queries are typically written with a top level let expression. The let expression contains a list of comma-separated named reference bindings and an in expression which is what the let expression evaluates to. The in expression can reference the variables and the variables can reference each other. Backwards and forward referencing is allowed, and self-referencing is allowed by prefixing the @ on the variable. Variables are recursively evaluated as needed to evaluate the in expression. No variable is evaluated more than once.Examples
let
a = "Hello",
b = "World",
result = a & " " & b
in
result
let
result = Fib,
Fib = =>
if iteration = 0 or iteration = 1 then
1
else
let
a = @Fib,
b = @Fib
in
a + b
in
result
Assertions and datatypes
Variables are not typed in Power Query. Instead, an expression can have a type assertion which will evaluate to an error when the expression does not evaluate to a value compatible with the assertion. Assertions can be preceded by nullable to include null in the allowed values.| Name | Description | Datatype | Assertion |
| number | Assertion for integer and floating-point numbers | ||
| int | Signed 32-bit integer | ||
| long | Signed 64-bit integer | ||
| double | IEEE 754 float | ||
| decimal | 128-bit float. Same as C#'s decimal | ||
| time | Time of day | ||
| date | A calendar date ranging from 1 CE to 9999 CE in the Georgian Calendar | ||
| datetime | A composite of the date and time datatypes | ||
| duration | A measurement of elapsed time | ||
| logical | Represents a Boolean true or false value | ||
| text | A Unicode string | ||
| guid | A Globally Unique Identifier | ||
| list | An ordered list of values | ||
| record | An ordered map from text to any value | ||
| table | A 2D matrix where each column has a unique name and type | ||
| function | A power query function | ||
| type | Represents a datatype and may contain assertion information | ||
| action | An internally used datatype | ||
| null | The null singleton | ||
| any | Represents all values | ||
| anynonnull | Represents all values except null | ||
| none | Represents no values and always fails as an assertion | ||
| error | A pseudo value representing an error |