NM-method
The NM-method or Naszodi–Mendonca method is an operation that can be applied in statistics, econometrics, economics, sociology, and demography to construct counterfactual contingency tables. The method finds the matrix $X$ that is "closest" to the seed matrix $Z$ in the sense of being ranked the same, but with the row and column totals of a target matrix $Y$. While the row totals and column totals of $Y$ are known, matrix $Y$ itself may not be known.
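Since the solution for matrix $X$ is unique, the NM-method is a function: $X = \text{NM}(Z, Y e_m^T, e_n Y)$, where, for matrices of size $n \times m$, $e_n$ is a row vector of ones of size $1 \times n$, while $e_m^T$ is a column vector of ones of size $m \times 1$.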
The NM-method was developed by Naszodi and Mendonca to solve for matrix $X$ in problems where the seed matrix $Z$ is not a sample from the population characterized by the row totals and column totals of matrix $Y$, but represents another population.
Their application aimed at quantifying intergenerational changes in the strength of educational homophily and thus measuring the historical change in social inequality between different educational groups in the US between 1980 and 2010. The trend in inequality was found to be U-shaped, supporting the view that with appropriate social and economic policies inequality can be reduced.
Definition of matrix ranking
The closeness between two matrices of the same size can be defined in several ways. The Euclidean distance and the Kullback–Leibler divergence are two well-known examples. The NM-method is consistent with a definition relying on the ordinal Liu–Lu index, which is a slightly modified version of the Coleman index defined by Eq. in Coleman. According to this definition, matrix $X$ is "closest" to matrix $Z$ if their Liu–Lu values are the same, in other words, if they are ranked the same by the ordinal Liu–Lu index.
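For contrast with the ordinal definition, the two cardinal measures mentioned above can be sketched in Python. This is only an illustration: the function names are hypothetical, and applying the Kullback–Leibler divergence to the cell shares of each table (cells divided by the grand total) is an assumption made here, not something the source specifies.

```python
import math


def euclidean_distance(a, b):
    """Euclidean (Frobenius) distance between two equally sized tables."""
    return math.sqrt(
        sum((x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    )


def kl_divergence(a, b):
    """Kullback-Leibler divergence of the cell shares of table a from those of table b.

    Assumption: each table is first normalized so that its cells sum to one.
    """
    total_a = sum(map(sum, a))
    total_b = sum(map(sum, b))
    result = 0.0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            p, q = x / total_a, y / total_b
            if p == 0:
                continue  # a zero share contributes nothing
            if q == 0:
                return math.inf  # positive share where the reference has none
            result += p * math.log(p / q)
    return result
```

Both functions return zero when the two tables coincide, whereas the NM-method only requires the two tables to share the same Liu–Lu ranking.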
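If $Z$ is a 2×2 matrix, its scalar-valued Liu–Lu index is defined as
$$\text{LL}(Z) = \frac{Z_{1,1} - \text{int}(Q)}{\min(Z_{1,\cdot}, Z_{\cdot,1}) - \text{int}(Q)},$$
where $Q = Z_{1,\cdot} Z_{\cdot,1} / Z_{\cdot,\cdot}$; $Z_{1,\cdot} = Z_{1,1} + Z_{1,2}$ is the first row total, $Z_{\cdot,1} = Z_{1,1} + Z_{2,1}$ is the first column total, $Z_{\cdot,\cdot}$ is the grand total, and $\text{int}(Q)$ is the largest integer smaller than or equal to $Q$.
Following Coleman, this index is interpreted as the "actual minus expected over maximum minus minimum", where $Z_{1,1}$ is the actual value of the $(1,1)$ entry of the seed matrix $Z$; $\text{int}(Q)$ is its expected (integer) value under the counterfactual assumption that the corresponding row total and column total of $Z$ are predetermined, while its interior is random. Also, $\text{int}(Q)$ is its minimum value if the association between the row variable and the column variable of $Z$ is non-negative. Finally, $\min(Z_{1,\cdot}, Z_{\cdot,1})$ is the maximum value of $Z_{1,1}$ for the given row total and column total.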
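The scalar index can be illustrated with a short Python sketch. It assumes the non-negative association case and takes the expected value to be the integer part (floor) of $Q = Z_{1,\cdot} Z_{\cdot,1} / Z_{\cdot,\cdot}$; the function name `liu_lu` and the example tables are hypothetical and not taken from the source.

```python
import math


def liu_lu(z):
    """Scalar Liu-Lu index of a 2x2 table (non-negative association assumed)."""
    (a, b), (c, d) = z
    row1 = a + b              # first row total
    col1 = a + c              # first column total
    total = a + b + c + d     # grand total
    q_int = math.floor(row1 * col1 / total)   # expected value, rounded down
    maximum = min(row1, col1)                 # largest feasible (1,1) entry
    if maximum == q_int:
        raise ValueError("index undefined: maximum equals expected value")
    # "actual minus expected over maximum minus minimum"
    return (a - q_int) / (maximum - q_int)


# Hypothetical tables sharing the same margins (40, 60) in rows and columns.
independent = [[16, 24], [24, 36]]   # (1,1) entry equals its expected value
sorted_table = [[40, 0], [0, 60]]    # (1,1) entry equals its maximum
mixed = [[30, 10], [10, 50]]
```

With these tables the index runs from 0 for the independent table to 1 for the fully sorted one, with the mixed table in between.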
For a matrix $Z$ of size n×m, the Liu–Lu index was generalized by Naszodi and Mendonca to a matrix-valued index. One of the preconditions for the generalization is that the row variable and the column variable of matrix $Z$ have to be ordered. Equating the generalized, matrix-valued Liu–Lu index of $Z$ with that of matrix $X$ is equivalent to dichotomizing their ordered row variable and ordered column variable in $(n-1) \times (m-1)$ ways by exploiting the ordered nature of the row and column variables, and then equating the original, scalar-valued Liu–Lu indices of the 2×2 matrices obtained with the dichotomizations. That is, for any pair $(i, j)$ with $i \in \{1, \ldots, n-1\}$ and $j \in \{1, \ldots, m-1\}$, the restriction $\text{LL}(V_i Z W_j) = \text{LL}(V_i X W_j)$ is imposed, where $V_i$ is the $2 \times n$ matrix whose first row consists of a block of ones of size $1 \times i$ followed by zeros, and whose second row consists of zeros followed by a block of ones of size $1 \times (n-i)$. Similarly, $W_j$ is the $m \times 2$ matrix given by the transpose of the analogous $2 \times m$ matrix, with its blocks of ones being of size $1 \times j$ and $1 \times (m-j)$.
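The dichotomization can be sketched in plain Python as a pair of matrix products that collapse an n×m table into a 2×2 table. The helper names (`v_matrix`, `dichotomize`) and the 3×3 example table are hypothetical; the sketch assumes $V_i$ is the 2×n indicator matrix with $i$ ones in its first row and $W_j$ is the transpose of the analogous 2×m matrix.

```python
def v_matrix(i, n):
    """2 x n matrix: first row marks categories 1..i, second row marks i+1..n."""
    return [[1] * i + [0] * (n - i), [0] * i + [1] * (n - i)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def dichotomize(z, i, j):
    """Collapse table z into the 2x2 table V_i Z W_j (cut after row i, column j)."""
    n, m = len(z), len(z[0])
    w_j = transpose(v_matrix(j, m))   # m x 2
    return matmul(matmul(v_matrix(i, n), z), w_j)


# Hypothetical 3x3 seed table with ordered row and column categories.
Z = [[5, 2, 1],
     [2, 6, 2],
     [1, 3, 8]]

# All (n-1) x (m-1) = 4 dichotomizations of Z.
collapsed = {(i, j): dichotomize(Z, i, j) for i in (1, 2) for j in (1, 2)}
```

Each collapsed table keeps the grand total of `Z`; only the cut points between "low" and "high" categories differ.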
Constraints on the row totals and column totals
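Matrix $X$ should satisfy not only the equality of its generalized Liu–Lu index with that of $Z$, but also the pair of constraints on its row totals and column totals: $X e_m^T = Y e_m^T$ and $e_n X = e_n Y$.
Solution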
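Assuming that $\text{LL}(V_i Z W_j) \ge 0$ for all pairs $(i, j)$, the solution for $X$ is unique, deterministic, and given by a closed-form formula.
For matrices $Z$ and $X$ of size 2×2, the solution is
$$X_{1,1} = \frac{\left[Z_{1,1} - \text{int}\!\left(\frac{Z_{1,\cdot} Z_{\cdot,1}}{Z_{\cdot,\cdot}}\right)\right] \left[\min(X_{1,\cdot}, X_{\cdot,1}) - \text{int}\!\left(\frac{X_{1,\cdot} X_{\cdot,1}}{X_{\cdot,\cdot}}\right)\right]}{\min(Z_{1,\cdot}, Z_{\cdot,1}) - \text{int}\!\left(\frac{Z_{1,\cdot} Z_{\cdot,1}}{Z_{\cdot,\cdot}}\right)} + \text{int}\!\left(\frac{X_{1,\cdot} X_{\cdot,1}}{X_{\cdot,\cdot}}\right),$$
which is obtained by solving $\text{LL}(X) = \text{LL}(Z)$ for $X_{1,1}$, with $\text{int}(\cdot)$ denoting the largest integer smaller than or equal to its argument.
The other 3 cells of $X$ are uniquely determined by the row totals and column totals. This is how the NM-method works for 2×2 seed tables.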
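The 2×2 case can be sketched in Python as follows. The sketch assumes the non-negative association case and solves $\text{LL}(X) = \text{LL}(Z)$ for the $(1,1)$ cell, with the integer part taken as the floor; the function name `nm_2x2` and the numbers are hypothetical and not taken from the source.

```python
import math


def nm_2x2(z, row_targets, col_targets):
    """Counterfactual 2x2 table with the Liu-Lu index of z and the target margins."""
    (a, b), (c, d) = z
    z_row1, z_col1, z_total = a + b, a + c, a + b + c + d
    x_row1, x_row2 = row_targets
    x_col1, _ = col_targets
    x_total = x_row1 + x_row2

    q_z = math.floor(z_row1 * z_col1 / z_total)   # expected (1,1) cell of z
    q_x = math.floor(x_row1 * x_col1 / x_total)   # expected (1,1) cell of x
    if a < q_z:
        raise ValueError("negative association: this sketch does not cover it")

    # Rescale "actual minus expected" from the seed range to the target range.
    x11 = (a - q_z) * (min(x_row1, x_col1) - q_x) / (min(z_row1, z_col1) - q_z) + q_x

    # The other three cells follow from the target row and column totals.
    return [[x11, x_row1 - x11],
            [x_col1 - x11, x_row2 - (x_col1 - x11)]]


# Hypothetical seed table and target margins.
seed = [[30, 10], [10, 50]]
counterfactual = nm_2x2(seed, row_targets=(60, 40), col_targets=(60, 40))
```

When the target margins equal the margins of the seed table, the function returns the seed table itself, which is a quick sanity check on the formula.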
For matrices $Z$ and $X$ of size n×m, the solution is obtained by dichotomizing their ordered row variable and ordered column variable in all possible meaningful ways before solving $(n-1) \times (m-1)$ problems of 2×2 form. Each problem is defined for an $(i, j)$ pair with the 2×2 seed matrix $V_i Z W_j$ and the target row totals and column totals $V_i Y e_m^T$ and $e_n Y W_j$, respectively. Each problem is to be solved separately by the formula for $X_{1,1}$. The set of solutions determines $(n-1) \times (m-1)$ entries of matrix $X$. Its remaining elements are uniquely determined by the target row totals and column totals.
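The general procedure can be sketched in Python under stated assumptions. Each 2×2 problem yields the sum of the counterfactual table over the first $i$ rows and first $j$ columns; recovering the individual cells by differencing these cumulative sums is one way to implement the last step and is an assumption of this sketch, as are the function names and the 3×3 example. Only the non-negative association case is covered.

```python
import math


def solve_x11(z2, x_row1, x_col1, x_total):
    """(1,1) cell of the 2x2 counterfactual, non-negative association assumed."""
    (a, b), (c, d) = z2
    z_row1, z_col1, z_total = a + b, a + c, a + b + c + d
    q_z = math.floor(z_row1 * z_col1 / z_total)
    q_x = math.floor(x_row1 * x_col1 / x_total)
    if a < q_z:
        raise ValueError("negative association: this sketch does not cover it")
    return (a - q_z) * (min(x_row1, x_col1) - q_x) / (min(z_row1, z_col1) - q_z) + q_x


def nm_method(z, row_targets, col_targets):
    """Counterfactual n x m table built from (n-1) x (m-1) problems of 2x2 form."""
    n, m = len(z), len(z[0])
    total = sum(row_targets)

    # cz[i][j]: sum of z over its first i rows and first j columns.
    cz = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            cz[i + 1][j + 1] = z[i][j] + cz[i][j + 1] + cz[i + 1][j] - cz[i][j]

    # cx: the same cumulative sums for the unknown table; the borders are targets.
    cx = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cx[i][m] = sum(row_targets[:i])
    for j in range(1, m + 1):
        cx[n][j] = sum(col_targets[:j])

    for i in range(1, n):
        for j in range(1, m):
            a = cz[i][j]                  # dichotomized seed table for cut (i, j)
            b = cz[i][m] - a
            c = cz[n][j] - a
            d = cz[n][m] - a - b - c
            cx[i][j] = solve_x11([[a, b], [c, d]], cx[i][m], cx[n][j], total)

    # Difference the cumulative sums to recover the individual cells.
    return [[cx[i + 1][j + 1] - cx[i][j + 1] - cx[i + 1][j] + cx[i][j]
             for j in range(m)] for i in range(n)]


# Hypothetical 3x3 seed table and target margins (both sets of targets sum to 30).
Z = [[5, 2, 1], [2, 6, 2], [1, 3, 8]]
X = nm_method(Z, row_targets=[10, 10, 10], col_targets=[12, 9, 9])
```

By construction the rows and columns of `X` add up to the targets, and feeding the margins of `Z` back in as targets returns `Z` itself.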
Next, let us see how the NM-method works if matrix $Z$ is such that the second precondition, $\text{LL}(V_i Z W_j) \ge 0$, is not met.
If $\text{LL}(V_i Z W_j) \le 0$ for all pairs $(i, j)$, the solution for $X$ is also unique, deterministic, and given by a closed-form formula. However, the corresponding concept of matrix ranking is slightly different from the one discussed above. Liu and Lu define the index for this case in terms of the smallest integer larger than or equal to $Q = Z_{1,\cdot} Z_{\cdot,1} / Z_{\cdot,\cdot}$.
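Finally, neither the NM-method nor the generalized Liu–Lu index of $Z$ is defined if there is a pair $(i, j)$ such that $\text{LL}(V_i Z W_j) > 0$, while $\text{LL}(V_{i'} Z W_{j'}) < 0$ for another pair $(i', j')$.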
A numerical example
Consider the following matrix $Z$, complemented with its row totals and column totals and the targets, i.e., the row totals and column totals of matrix $X$:
| Z | 1 | 2 | 3 | 4 | TOTAL | TARGET |
| 1 | | | | | 240 | |
| 2 | | | | | 235 | |
| 3 | | | | | 185 | |
| 4 | | | | | 140 | |
| TOTAL | 210 | 230 | 185 | 175 | 800 | |
| TARGET | | | | | | 1,000 |
As a first step of the NM-method, matrix $Z$ is multiplied by the $V_i$ and $W_j$ matrices for each pair $(i, j)$ with $i, j \in \{1, 2, 3\}$. This yields the following 9 matrices of size 2×2 with their target row totals and column totals:
The next step is to calculate the generalized, matrix-valued Liu–Lu index by applying the formula of the original scalar-valued Liu–Lu index to each of the 9 matrices:
Evidently, every entry of the matrix-valued index is positive. Therefore, the NM-method is defined. Solving each of the 9 problems of the 2×2 form yields 9 entries of the matrix $X$. Its other 7 entries are uniquely determined by the target row totals and column totals. The solution for $X$ is:
Another numerical example taken from Abbott et al. (2019)
Consider the following matrix $Z$, complemented with its row totals and column totals and the targets, i.e., the row totals and column totals of matrix $X$:
As a first step of the NM-method, matrix $Z$ is multiplied by the $V_i$ and $W_j$ matrices for each pair $(i, j)$. This yields the following 4 matrices of size 2×2 with their target row totals and column totals: