DAX is a simple language. You can start using DAX in a matter of hours. When it comes to an understanding of the advanced concepts like evaluation contexts, iterations, and context transitions, life with DAX becomes tricky. Especially for those whose most significant coding achievement is writing some Excel formulas. I was one of them.
After a while, I understood that one needs to be patient with it. It just takes some time and practice getting used to along with finding the right resource for reference. I found it in the form of a book: The Definitive Guide to DAX (by Marco Russo and Alberto Ferrari)
After going through the book, I realized that to unleash DAX's full power, a sound understanding of the evaluation context is essential.
The following are the excerpts from the book, along with my understanding.
What is the evaluation context?
The evaluation context is the "environment" within which the DAX calculates the expression.
There are two evaluation contexts in DAX: The Filter context and the Row context.
The filter context filters data, and the row context iterates table.
DAX evaluates all formulas within a respective context. Even though the equation is the same, the result is different because DAX executes the same code against different subsets of data.
We use DAX for writing Measures and Calculated Columns. A DAX measure evaluates in the context of a visual element, or a DAX query. A Calculated Column processed at the row level of the table it belongs to.
Understanding Filter Context:
Consider a simple measure:
Total Orders = SUM(Orders[Order Quantity])
This formula computes the sum of column Order Quantity. We can use this measure in a visual to get the result:
The output above looks obvious, but precisely this is what we asked DAX to do.
We can further slice this value by different dimensions (let’s say, Product Category) to get interesting insights:
Although it is intuitive, instead of showing the total sum of Order Quantity, it is now giving the sum of sales for each product category along with the total. We did not apply the filter in the DAX. This filtering happens outside of the formula.
Essentially, each cell in the visual is creating a filter context(environment) and filter the table by each Product Category and then applying the SUM function on each filter context.
The DAX engine might have some internal optimization to improve speed, but every cell has an independent and autonomous evaluation of underlying DAX expression.
Hence, the Total row in the table above is not a result of summing other cells in the visual.
It is the aggregation of all the rows of the Order table.
Depending on the DAX expression, the total row's result might display a different result, unrelated to the other rows in the same report.
When the Product Category is on the rows, the filter context filters one product category for each cell. We can increase the complexity of the visual by adding Ship Mode on the columns:
Each cell shows a subset of data related to one product category and one ship mode. Now the filter context is filtering the Order table by each Product Category and Ship Mode, and then applying the DAX expression.
The rule is simple: The more dimensions we use to slice and dice, the more columns are being filtered by the filter context in each cell of the matrix visual.
Whether a dimension is on the rows or the columns of the visual, the slicer, part of any filter(visual/report/page), or any other kind of filter we can create with a report doesn't change the fundamental behavior of filter context. All these filters contribute to defining a single filter context that DAX uses to evaluate the formula.
Filter context filters (tables).
Understanding Row Context
Row context iterates over tables and evaluates column values. The simplest way of understanding a row context is to create a calculated column. Consider the following DAX expression:
Sales Amount = Orders[Order Quantity] * Orders[Unit Price]
The expression above creates an additional column with a different value for each row:
Though, this behavior is expected, especially by Excel users, where native Excel tables display similar behavior. It comes as a natural consequence that the final expression computes different values.
As it happened with the filter context, the reason is the presence of an evaluation context. This time, the context does not filter a table. Instead, it identifies the row for which the calculation happens.
The row context references a row in the result of a DAX table expression. It should not be confused with a row in the report. DAX does not have a way to reference a row or a column in the report directly.
The values displayed in a matrix in Power BI and a Pivot Table in Excel are the results of DAX measures computed in a filter context or the values stored in the table as native or calculated columns.
We realized that a calculated column is computed row by row, but how does DAX know which row it is currently iterating? It does because of the row context, which provides the evaluation context(environment) as a row.
With the creation of a calculated column with multiple records present in the table, DAX creates a row context that evaluates the expression iterating over the table by row, using the row context as the cursor.
When we create a calculated column, DAX creates a row context automatically. How can we create the row context manually? The answer is by using iterators.
Sales Amount (Measure) =
SUMX(
Orders,
Orders[Order Quantity] * Orders[Unit Price]
)
In this case, since it is a measure, there is not an automatic row context. SUMX, being an iterator, creates a row context that starts iterating over the Order table, row by row. During the iteration, it executes the second argument of SUMX insider the row context.
It is how the DAX knows which value to use for the two columns used in the expression during each step of the iteration.
Row context exists when:
Creating a calculated column
Computing an expression inside an iteration
There is no other way of creating a row context.
Sales Amount = Orders[Order Quantity] * Orders[Unit Price]
The expression above is valid when executed as a calculated column and invalid if used in a measure. The reason is that the calculated column has an automatic row context, whereas a DAX measure needs an iterator to create a row context.
Iterators in DAX follow below-mentioned steps:
Evaluates the first parameter in the existing context(read filter contexts) to determine the rows to scan
Creates a new row context for each row of the table evaluated in the previous step
Iterates the table and evaluates the second parameter in the existing evaluation context, including the newly created row context.
And last, aggregates the values computed during the previous step.
To summarize: A row context is not filtering the model in any way. It only tells the DAX which row to use from the table. If one wants to apply a filter to the model, then we use the filter context. In the case of row-level calculation, we use the row context.
Filter context filters, row context iterates.
コメント