Measures of Departure from Local Marginal Homogeneity for Square Contingency Tables

Saito, Ken; Takakubo, Nozomi; Ishii, Aki; Nakagawa, Tomoyuki; Tomizawa, Sadao

doi:10.3390/sym14061075

¹ Department of Information Science, Graduate School of Science and Technology, Tokyo University of Science, Noda City 278-8510, Japan

² Department of Information Science, Faculty of Science and Technology, Tokyo University of Science, Noda City 278-8510, Japan

* Author to whom correspondence should be addressed.

Symmetry 2022, 14(6), 1075; https://0-doi-org.brum.beds.ac.uk/10.3390/sym14061075

Submission received: 24 March 2022 / Revised: 27 April 2022 / Accepted: 19 May 2022 / Published: 24 May 2022

(This article belongs to the Special Issue Advances in Quasi-Symmetry Models)

Abstract

When focusing on changes in political party support, it is crucial to determine whether or not there has been a change in the aggregate. From this perspective, various types of marginal homogeneity models have been proposed. We propose local marginal homogeneity models, which indicate that there are symmetric structures of probabilities for only one pair of symmetric marginal probabilities or cumulative probabilities. In addition, we propose two measures, one for nominal categories and one for ordered categories, to express the degree of departure from local marginal homogeneity models. We also apply the measures to data and confirm that the measures help compare the degree of departure from the model in several tables.

Keywords:

harmonic mean; marginal homogeneity; nominal category; ordered category; square contingency table

1. Introduction

Let us consider

r \times r

contingency tables with the same row and column classifications. In such tables, a test of independence is rarely informative because the observations tend to concentrate on the main diagonal cells. We therefore analyze the symmetry of the table instead. Let

p_{i j}

denote the probability that an observation will fall in the (

i, j

)th cell of the table (

i = 1, \dots, r; j = 1, \dots, r

). For nominal contingency tables, several symmetry models with respect to the main diagonal are considered. The symmetry (S) model (Bowker [1] and Bishop et al. [2]) is defined as

p_{i j} = p_{j i} for all (i, j; i \neq j) .

The partial symmetry (PS) model (Saigusa et al. [3]) is defined as

p_{i j} = p_{j i} for at least one (i, j; i \neq j) .

The local symmetry (LS) model (Saigusa et al. [4]) is defined as

p_{i j} = p_{j i} for only one (i, j; i \neq j) .

The LS model indicates that the cell probability that an observation falls in the ith row category and the jth (

> i

) column category is equal to the probability that the observation falls in the jth row category and the ith column category, for only one (

i, j

). Because of the strong constraints of the S model, various models using marginal probabilities have been proposed to loosen the constraints. The marginal homogeneity (MH) model (Stuart [5]) is defined as

p_{i \cdot} = p_{\cdot i} for all i = 1, \dots, r,

where

p_{i \cdot} = \sum_{t = 1}^{r} p_{i t}

, and

p_{\cdot i} = \sum_{s = 1}^{r} p_{s i}

. The partial marginal homogeneity (PMH) model (Saigusa et al. [6]) is defined as

p_{i \cdot} = p_{\cdot i} for at least one i = 1, \dots, r .

In addition to these, other symmetry models (e.g., quasi symmetry [7]) and asymmetry models (e.g., conditional symmetry [8], diagonals-parameter symmetry [9], and linear diagonals-parameter symmetry [10]) have been proposed.

Symmetry models have also been proposed for square contingency tables with ordered categories; these use cumulative probabilities taken from the upper-right and lower-left corners of the table. Let us denote the row and column variables by X and Y, respectively. The cumulative probability is defined as

C_{i j} = \{\begin{matrix} P (X \leq i, Y \geq j) = \sum_{s = 1}^{i} \sum_{t = j}^{r} p_{s t} when i < j, \\ P (X \geq i, Y \leq j) = \sum_{s = i}^{r} \sum_{t = 1}^{j} p_{s t} when i > j . \end{matrix}

Then, the S model can also be expressed as

C_{i j} = C_{j i} for all (i, j; i \neq j) .

The cumulative partial symmetry (CPS) model (Saigusa et al. [11]) is defined as

C_{i j} = C_{j i} for at least one (i, j; i \neq j) .

The cumulative local symmetry (CLS) model (Saigusa et al. [12]) is defined as

C_{i j} = C_{j i} for only one (i, j; i \neq j) .

The CLS model states that the probability that an observation falls in the ith row category or below and the jth (

> i

) column category or above (upper-right corner) equals the probability that the observation falls in the jth row category or above and the ith column category or below (lower-left corner), for only one (

i, j

). Marginal homogeneity models based on cumulative probabilities have also been proposed. The cumulative probabilities are defined as

\begin{matrix} G_{1 (i)} & = P (X \leq i, Y \geq i + 1) = \sum_{s = 1}^{i} \sum_{t = i + 1}^{r} p_{s t}, \\ G_{2 (i)} & = P (X \geq i + 1, Y \leq i) = \sum_{s = i + 1}^{r} \sum_{t = 1}^{i} p_{s t} . \end{matrix}

Then, the MH model is expressed as

G_{1 (i)} = G_{2 (i)} for all i = 1, \dots, r - 1 .

The cumulative partial marginal homogeneity (CPMH) model (Nakagawa et al. [13]) is defined as

G_{1 (i)} = G_{2 (i)} for at least one i = 1, \dots, r - 1 .
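The equivalence between the marginal form and the cumulative form of MH can be checked numerically. The following is a minimal sketch under stated assumptions: the function names and the 3 × 3 table of counts are hypothetical and are not taken from the paper. It computes G_1(i) and G_2(i) from the definitions above and compares G_1(i) − G_2(i) with the difference of the cumulative row and column totals, P(X ≤ i) − P(Y ≤ i), which is why the two forms of MH coincide.

```python
def marginals(table):
    """Return (row totals, column totals) of a square table."""
    r = len(table)
    rows = [sum(table[i]) for i in range(r)]
    cols = [sum(table[s][i] for s in range(r)) for i in range(r)]
    return rows, cols


def cumulative_pairs(table):
    """Return ([G_1(1), ..., G_1(r-1)], [G_2(1), ..., G_2(r-1)])."""
    r = len(table)
    # G_1(i): rows 1..i and columns i+1..r (upper-right block).
    g1 = [sum(table[s][t] for s in range(i) for t in range(i, r)) for i in range(1, r)]
    # G_2(i): rows i+1..r and columns 1..i (lower-left block).
    g2 = [sum(table[s][t] for s in range(i, r) for t in range(i)) for i in range(1, r)]
    return g1, g2


# Hypothetical counts, chosen only for illustration.
counts = [[10, 5, 5],
          [3, 10, 2],
          [7, 1, 10]]

rows, cols = marginals(counts)
g1, g2 = cumulative_pairs(counts)

# Differences of cumulative marginal totals versus G_1(i) - G_2(i).
marginal_gap = [sum(rows[:i]) - sum(cols[:i]) for i in range(1, len(counts))]
cumulative_gap = [a - b for a, b in zip(g1, g2)]
print(marginal_gap, cumulative_gap)
```

Working with integer counts keeps the comparison exact; dividing every entry by the table total gives the same identity for probabilities.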

Some statistics for testing the goodness of fit of the MH model are provided by, for example, Stuart [5], Bhapkar [14], Fleiss and Everitt [15], Bishop et al. [2] and Agresti [16]. Let us now consider several square tables. When there is no structure of MH in any of these tables, we are interested in measuring and comparing the degrees of departure from MH in the tables. The test statistic can be used for testing the goodness-of-fit of the MH model, but the test statistic is not suitable for comparing the degrees of departure from the MH model in several square tables. See Tomizawa et al. [17] for details.

A test statistic assesses goodness of fit, but it cannot measure the degree of departure from a model for contingency tables that do not fit the model. Measures of departure have therefore been proposed. In the analysis of two-way contingency tables, the degree of departure from independence is assessed by measures of association between the row and column variables, for example, Yule’s coefficients of association and colligation [18,19], Cramér’s coefficient [20], and Goodman and Kruskal’s coefficient [21]. For contingency tables with nominal categories, measures of the degree of departure from the S, PS, and LS models have been developed (Tomizawa et al. [22], Saigusa et al. [3], and Saigusa et al. [4]). These measures are weighted arithmetic, geometric, and harmonic means of the diversity index of Patil and Taillie [23] applied to cell probabilities. Because their values do not depend on the order of the categories, these measures may not be suitable for ordered contingency tables. For square contingency tables with ordered categories, several measures based on cumulative probabilities have been proposed that incorporate information about the order of the categories. The measures for the S, CPS, and CLS models are given as weighted arithmetic, geometric, and harmonic means of the diversity index consisting of the cumulative probabilities

C_{i j}

(Tomizawa et al. [24], Saigusa et al. [11], and Saigusa et al. [12]). Similarly, measures to represent the degree of departure from several MH models are proposed. For square contingency tables with nominal categories, the measures for the MH and PMH models are given as weighted arithmetic and geometric means of the diversity index consisting of marginal probabilities (Tomizawa and Makii [25], Altun and Aktaş [26], and Saigusa et al. [6]). The values of these measures do not depend on the order of the categories. For square contingency tables with ordered categories, the measures for the MH and CPMH models are given as weighted arithmetic and geometric means of the diversity index consisting of the cumulative probabilities

G_{1 (i)}

and

G_{2 (i)}

(Tomizawa et al. [17] and Nakagawa et al. [13]).

On the other hand, the Rand index [27] is proposed as a correspondence measure between different partitions. Hubert and Arabie [28] introduce an extension of the Rand index and its application to the rows and columns of contingency tables. The application to contingency tables is based on dividing the entire sample with respect to row and column categories to form a contingency table. Therefore, the symmetry-related measures and Rand index have different objectives. In addition, the Rand index is calculated based on the number of samples in each contingency table cell, while the measures proposed in prior studies and this paper are not.

This paper aims to propose local marginal homogeneity models for marginal probabilities and cumulative probabilities. Moreover, we propose weighted harmonic mean measures for the proposed models. Section 2 proposes new measures for the local homogeneity of marginal probabilities

p_{i \cdot}

and

p_{\cdot i}

with nominal categories and cumulative probabilities

G_{1 (i)}

and

G_{2 (i)}

with ordered categories. Section 3 provides approximate confidence intervals for the measures. Section 4 examines the properties of the measures using artificial data sets. Section 5 evaluates the coverage of the confidence intervals by simulation. Section 6 applies the measures to real data.

2. New Models and Measures

In Section 2.1, we propose a new model that has the structure of local marginal homogeneity for a square contingency table with nominal categories; we also propose its measure, which expresses the degree of departure from the model. In Section 2.2, we define another model with cumulative local marginal homogeneity structure for a square contingency table with ordered categories; we also provide its measure.

2.1. For the Nominal Category

For square contingency tables with nominal categories, we propose a local marginal homogeneity (LMH) model defined by

p_{i \cdot} = p_{\cdot i} for only one i (i = 1, \dots, r) .

The LMH model states that the probability that an observation falls in the ith row category is equal to the probability that it falls in the ith column category, for only one i.

Let us assume that

p_{i \cdot} + p_{\cdot i} \neq 0

(i = 1, \dots, r)

and

p_{i \cdot} \neq p_{\cdot i}

for every i except at most one index a. We propose the following measure:

ψ_{M H (H)}^{(λ)} = \frac{\prod_{s = 1}^{r} ψ_{s}^{(λ)}}{\sum_{i = 1}^{r} (π_{i} \prod_{\begin{matrix} s = 1 \\ s \neq i \end{matrix}}^{r} ψ_{s}^{(λ)})} (λ > - 1),

where

π_{i} = (p_{i \cdot} + p_{\cdot i}) / 2

,

p_{1 (i)} = p_{i \cdot} / (p_{i \cdot} + p_{\cdot i})

,

p_{2 (i)} = p_{\cdot i} / (p_{i \cdot} + p_{\cdot i})

,

ψ_{i}^{(λ)} = 1 - \frac{λ 2^{λ}}{2^{λ} - 1} I_{i}^{(λ)},

I_{i}^{(λ)} = \frac{1}{λ} \{1 - {(p_{1 (i)})}^{λ + 1} - {(p_{2 (i)})}^{λ + 1}\} .

For

λ = 0

, we define that

ψ_{M H (H)}^{(0)} = {lim}_{λ \to 0} ψ_{M H (H)}^{(λ)}

. Note that

λ

is a real value chosen by users. The index

I_{i}^{(λ)}

is a diversity index of degree-

λ

for

{p_{1 (i)}, p_{2 (i)}}

. We note that the diversity index includes the Shannon entropy (when

λ = 0

) and the Gini concentration (when

λ = 1

) in special cases. For more details of this diversity index, see Patil and Taillie [23]. We can rewrite the submeasure

ψ_{i}^{(λ)}

as follows:

ψ_{i}^{(λ)} = \frac{λ (λ + 1)}{2^{λ} - 1} D_{i}^{(λ)} ({p_{k (i)}}; \{\frac{1}{2}\}),

D_{i}^{(λ)} ({p_{k (i)}}; \{\frac{1}{2}\}) = \frac{1}{λ (λ + 1)} [p_{1 (i)} \{{(\frac{p_{1 (i)}}{1 / 2})}^{λ} - 1\} + p_{2 (i)} \{{(\frac{p_{2 (i)}}{1 / 2})}^{λ} - 1\}] .

D_{i}^{(λ)}

is a power divergence between two distributions:

{p_{1 (i)}, p_{2 (i)}}

and

{1 / 2, 1 / 2}

. We note that the power divergence includes the Kullback–Leibler (KL) information (when

λ = 0

) and the Pearson chi-squared type discrepancy (when

λ = 1

) in special cases. For more details of the power divergence, see Cressie and Read [29] and Read and Cressie [30]. For any

λ > - 1

, the

ψ_{M H (H)}^{(λ)}

has the following characteristics:

1. The measure $ψ_{M H (H)}^{(λ)}$ lies between 0 and 1.
2. $ψ_{M H (H)}^{(λ)} = 0$ if and only if the LMH model holds.
3. $ψ_{M H (H)}^{(λ)} = 1$ if and only if the degree of departure from LMH is the maximum, in the sense that $p_{i \cdot} = 0$ (then $p_{\cdot i} > 0$ ) or $p_{\cdot i} = 0$ (then $p_{i \cdot} > 0$ ) for all $i = 1, \dots, r$ .

When the LMH model does not hold, it is easy to see that

ψ_{M H (H)}^{(λ)} = {(\sum_{i = 1}^{r} \frac{π_{i}}{ψ_{i}^{(λ)}})}^{- 1} .

Namely, the measure is expressed as the weighted harmonic mean of {

ψ_{i}^{(λ)}

}.

The measure

ψ_{M H (H)}^{(λ)}

is appropriate for analyzing data on a nominal scale because the value of

ψ_{M H (H)}^{(λ)}

is invariant under the same arbitrary permutation of the row and column categories.
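The definition of the measure for the LMH model can be turned into a few lines of code. The sketch below is illustrative only: the function names `submeasure` and `psi_mh_h` and the example tables are assumptions, not part of the paper. It evaluates the product form of the measure given above, which stays well defined when exactly one submeasure is zero, and uses the Shannon-entropy limit for λ = 0.

```python
import math


def submeasure(a, b, lam):
    """psi_i^(lambda) for the pair {a/(a+b), b/(a+b)}, with lambda > -1."""
    p1, p2 = a / (a + b), b / (a + b)
    if abs(lam) < 1e-12:
        # Limit lambda -> 0: one minus the Shannon entropy divided by log 2.
        entropy = -sum(q * math.log(q) for q in (p1, p2) if q > 0)
        return 1.0 - entropy / math.log(2)
    diversity = (1.0 - p1 ** (lam + 1) - p2 ** (lam + 1)) / lam
    return 1.0 - lam * 2 ** lam / (2 ** lam - 1) * diversity


def psi_mh_h(table, lam):
    """Weighted harmonic-mean measure of departure from LMH.

    `table` may hold counts or probabilities; it is normalised internally.
    Assumes p_i. + p_.i > 0 for every i and p_i. = p_.i for at most one i.
    """
    r = len(table)
    total = sum(map(sum, table))
    row = [sum(table[i]) / total for i in range(r)]
    col = [sum(table[s][i] for s in range(r)) / total for i in range(r)]
    pi = [(row[i] + col[i]) / 2 for i in range(r)]
    psi = [submeasure(row[i], col[i], lam) for i in range(r)]
    numerator = math.prod(psi)
    denominator = sum(
        pi[i] * math.prod(psi[s] for s in range(r) if s != i) for i in range(r)
    )
    return numerator / denominator


# Hypothetical tables, chosen only for illustration.
lmh_table = [[10, 5, 5], [3, 10, 2], [7, 1, 10]]  # p_1. = p_.1 only
extreme = [[0, 1], [0, 0]]                         # one marginal is 0 for each i
general = [[10, 8, 2], [3, 10, 9], [4, 2, 10]]     # no marginal pair is equal

for name, tab in [("LMH", lmh_table), ("extreme", extreme), ("general", general)]:
    print(name, psi_mh_h(tab, 1.0))
```

The first table satisfies the LMH model and the second has the maximal departure described in property 3, so they illustrate the two ends of the range. If two or more marginal pairs are equal, the denominator vanishes, which is why the sketch relies on the assumption stated before the definition.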

2.2. For the Ordered Category

For square contingency tables with ordered categories, we propose the cumulative local marginal homogeneity (CLMH) model defined by

G_{1 (i)} = G_{2 (i)} for only one (i = 1, \dots, r - 1) .

The CLMH model states that the probability that an observation falls in the ith row category or below and the

(i + 1)

th column category or above is equal to the probability that the observation falls in the

(i + 1)

th row category or above and the ith column category or below, for only one i.

Assume that

G_{1 (i)} + G_{2 (i)} \neq 0 (i = 1, \dots, r - 1)

and

G_{1 (i)} \neq G_{2 (i)}

for every i except at most one index a. We propose the following measure:

τ_{M H (H)}^{(λ)} = \frac{\prod_{s = 1}^{r - 1} ω_{s}^{(λ)}}{\sum_{i = 1}^{r - 1} \{(G_{1 (i)}^{*} + G_{2 (i)}^{*}) \prod_{\begin{matrix} s = 1 \\ s \neq i \end{matrix}}^{r - 1} ω_{s}^{(λ)}\}} (λ > - 1),

where

G_{s (i)}^{*} = G_{s (i)} / Δ

(Δ = \sum_{i = 1}^{r - 1} (G_{1 (i)} + G_{2 (i)}))

,

G_{s (i)}^{c} = G_{s (i)} / (G_{1 (i)} + G_{2 (i)})

,

\begin{matrix} ω_{i}^{(λ)} & = 1 - \frac{λ 2^{λ}}{2^{λ} - 1} H_{i}^{(λ)}, \\ H_{i}^{(λ)} & = \frac{1}{λ} \{1 - {(G_{1 (i)}^{c})}^{λ + 1} - {(G_{2 (i)}^{c})}^{λ + 1}\} . \end{matrix}

For

λ = 0

, we define that

τ_{M H (H)}^{(0)} = {lim}_{λ \to 0} τ_{M H (H)}^{(λ)}

. The measure has the following properties, which parallel those of the measure for the LMH model in Section 2.1. For any

λ > - 1

:

(1) The measure $τ_{M H (H)}^{(λ)}$ lies between 0 and 1.
(2) $τ_{M H (H)}^{(λ)} = 0$ if and only if the probability table has the structure of CLMH.
(3) $τ_{M H (H)}^{(λ)} = 1$ if and only if the probability table has the structure of complete marginal inhomogeneity in the sense that $G_{1 (i)} = 0$ (then $G_{2 (i)} \neq 0$ ) or $G_{2 (i)} = 0$ (then $G_{1 (i)} \neq 0$ ) for all $i = 1, \dots, r - 1$ .

It should be noted that the measure

τ_{M H (H)}^{(λ)}

is expressed as the weighted harmonic mean of

{ω_{s}^{(λ)}}

.
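The measure for the CLMH model can be sketched in the same way. The code below is an illustration under stated assumptions: `submeasure`, `tau_mh_h`, `swap_categories` and the example tables are hypothetical names and data, not part of the paper. It also interchanges two categories of a table to show that the value of the measure depends on the category order.

```python
import math


def submeasure(a, b, lam):
    """omega_i^(lambda) for the pair {a/(a+b), b/(a+b)}, with lambda > -1."""
    p1, p2 = a / (a + b), b / (a + b)
    if abs(lam) < 1e-12:
        # Limit lambda -> 0: one minus the Shannon entropy divided by log 2.
        entropy = -sum(q * math.log(q) for q in (p1, p2) if q > 0)
        return 1.0 - entropy / math.log(2)
    diversity = (1.0 - p1 ** (lam + 1) - p2 ** (lam + 1)) / lam
    return 1.0 - lam * 2 ** lam / (2 ** lam - 1) * diversity


def tau_mh_h(table, lam):
    """Weighted harmonic-mean measure of departure from CLMH.

    Works on counts or probabilities, since only ratios of the G's are used.
    Assumes G_1(i) + G_2(i) > 0 for every i and G_1(i) = G_2(i) for at most one i.
    """
    r = len(table)
    g1 = [sum(table[s][t] for s in range(i) for t in range(i, r)) for i in range(1, r)]
    g2 = [sum(table[s][t] for s in range(i, r) for t in range(i)) for i in range(1, r)]
    delta = sum(g1) + sum(g2)
    weight = [(g1[i] + g2[i]) / delta for i in range(r - 1)]  # G*_1(i) + G*_2(i)
    omega = [submeasure(g1[i], g2[i], lam) for i in range(r - 1)]
    numerator = math.prod(omega)
    denominator = sum(
        weight[i] * math.prod(omega[s] for s in range(r - 1) if s != i)
        for i in range(r - 1)
    )
    return numerator / denominator


def swap_categories(table, a, b):
    """Interchange categories a and b (0-based) in both rows and columns."""
    order = list(range(len(table)))
    order[a], order[b] = order[b], order[a]
    return [[table[i][j] for j in order] for i in order]


# Hypothetical tables, chosen only for illustration.
clmh_table = [[10, 5, 5], [3, 10, 2], [7, 1, 10]]  # G_1(1) = G_2(1) only
general = [[10, 8, 2], [3, 10, 9], [4, 2, 10]]

print(tau_mh_h(clmh_table, 1.0))
print(tau_mh_h(general, 1.0), tau_mh_h(swap_categories(general, 0, 1), 1.0))
```

The last line prints two different values for the same data with categories 1 and 2 interchanged, in contrast to the measure for nominal categories, which is unchanged by such a permutation.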

3. Approximate Confidence Interval of the Measures

In this section, we construct an approximate confidence interval for

ψ_{M H (H)}^{(λ)}

and

τ_{M H (H)}^{(λ)}

. As seen in Section 2, the measures

ψ_{M H (H)}^{(λ)}

and

τ_{M H (H)}^{(λ)}

are functions of

p_{i j}

. For the sake of general discussion, we first consider

Φ^{(λ)}

as a function of

p_{i j}

and construct an approximate confidence interval for it. Then, we obtain the approximate confidence intervals of the measures

ψ_{M H (H)}^{(λ)}

and

τ_{M H (H)}^{(λ)}

by replacing

Φ^{(λ)}

with

ψ_{M H (H)}^{(λ)}

and

τ_{M H (H)}^{(λ)}

. Let

n_{i j}

denote the observed frequency in the (

i, j

)th cell of the table (

i = 1, \dots, r; j = 1, \dots, r

). Assuming that a multinomial distribution applies to the

r \times r

table, we consider the approximate standard error and the large-sample confidence interval of the measure

Φ^{(λ)}

using the delta method, the description of which is given by, for example, Bishop et al. [2] and Agresti [31]. The sample version of

Φ^{(λ)}

, i.e.,

{\hat{Φ}}^{(λ)}

, is given by

Φ^{(λ)}

with {

p_{i j}

} replaced by {

{\hat{p}}_{i j}

}, where

{\hat{p}}_{i j} = n_{i j} / N

and

N = \sum_{i = 1}^{r} \sum_{j = 1}^{r} n_{i j}

. Using the delta method,

\sqrt{N} ({\hat{Φ}}^{(λ)} - Φ^{(λ)})

asymptotically (as

N \to \infty

) has a normal distribution with a mean of zero and a variance of

σ^{2}

, where

σ^{2} = \sum_{i = 1}^{r} \sum_{j = 1}^{r} p_{i j} {(\frac{\partial Φ^{(λ)}}{\partial p_{i j}})}^{2} - {(\sum_{i = 1}^{r} \sum_{j = 1}^{r} p_{i j} \frac{\partial Φ^{(λ)}}{\partial p_{i j}})}^{2} (λ > - 1) .

Let

{\hat{σ}}^{2}

denote

σ^{2}

with

{p_{i j}}

replaced by

{{\hat{p}}_{i j}}

. Then,

\hat{σ} / \sqrt{N}

is an estimated approximate standard error for

{\hat{Φ}}^{(λ)}

, and

{\hat{Φ}}^{(λ)} \pm z_{α / 2} \hat{σ} / \sqrt{N}

is an approximate

100 (1 - α) \%

confidence interval for

Φ^{(λ)}

, where

z_{α / 2}

is the upper

α / 2

point of the standard normal distribution.
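The delta-method interval can be sketched for any smooth function of the cell probabilities. The code below is a minimal illustration and not the authors' implementation: the names `delta_method_ci`, `psi_mh_h` and `submeasure`, the step size, and the table of counts are assumptions, and the partial derivatives are approximated by central differences instead of closed-form expressions.

```python
import math
from statistics import NormalDist


def submeasure(a, b, lam):
    """psi_i^(lambda) for the pair {a/(a+b), b/(a+b)}, with lambda > -1."""
    p1, p2 = a / (a + b), b / (a + b)
    if abs(lam) < 1e-12:
        entropy = -sum(q * math.log(q) for q in (p1, p2) if q > 0)
        return 1.0 - entropy / math.log(2)
    diversity = (1.0 - p1 ** (lam + 1) - p2 ** (lam + 1)) / lam
    return 1.0 - lam * 2 ** lam / (2 ** lam - 1) * diversity


def psi_mh_h(p, lam):
    """Measure for LMH, evaluated on a table of probabilities p."""
    r = len(p)
    row = [sum(p[i]) for i in range(r)]
    col = [sum(p[s][i] for s in range(r)) for i in range(r)]
    psi = [submeasure(row[i], col[i], lam) for i in range(r)]
    denom = sum(
        (row[i] + col[i]) / 2 * math.prod(psi[s] for s in range(r) if s != i)
        for i in range(r)
    )
    return math.prod(psi) / denom


def delta_method_ci(phi, counts, alpha=0.05, step=1e-6):
    """Return (estimate, lower, upper) for phi(p) under multinomial sampling."""
    r = len(counts)
    n_total = sum(map(sum, counts))
    p_hat = [[c / n_total for c in row] for row in counts]
    estimate = phi(p_hat)
    mean_sq = mean = 0.0
    for i in range(r):
        for j in range(r):
            if p_hat[i][j] == 0:
                continue  # a zero cell contributes nothing to either sum
            up = [row[:] for row in p_hat]
            down = [row[:] for row in p_hat]
            up[i][j] += step
            down[i][j] -= step
            deriv = (phi(up) - phi(down)) / (2 * step)  # numerical d(phi)/d(p_ij)
            mean_sq += p_hat[i][j] * deriv ** 2
            mean += p_hat[i][j] * deriv
    variance = max(mean_sq - mean ** 2, 0.0)  # sigma^2 with p replaced by p-hat
    std_err = math.sqrt(variance / n_total)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate, estimate - z * std_err, estimate + z * std_err


counts = [[10, 8, 2], [3, 10, 9], [4, 2, 10]]  # hypothetical counts
estimate, lower, upper = delta_method_ci(lambda p: psi_mh_h(p, 1.0), counts)
print(estimate, lower, upper)
```

The interval is not truncated, so for small samples its limits can fall outside the range [0, 1] of the measure.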

The confidence interval of the measure

ψ_{M H (H)}^{(λ)}

is obtained by replacing

\partial Φ^{(λ)} / \partial p_{i j}

in the variance formula with

γ_{i j}^{(λ)}

, where

γ_{i j}^{(λ)} = - {(ψ_{M H (H)}^{(λ)})}^{2} \{\frac{1}{{(ψ_{i}^{(λ)})}^{2}} A_{12} (i) + \frac{1}{{(ψ_{j}^{(λ)})}^{2}} A_{21} (j)\} (λ \neq 0),

with

\begin{matrix} A_{12} (i) & = \frac{ψ_{i}^{(λ)}}{2} - \frac{2^{λ - 1} (λ + 1)}{2^{λ} - 1} p_{2 (i)} \{{(p_{1 (i)})}^{λ} - {(p_{2 (i)})}^{λ}\}, \\ A_{21} (i) & = \frac{ψ_{i}^{(λ)}}{2} + \frac{2^{λ - 1} (λ + 1)}{2^{λ} - 1} p_{1 (i)} \{{(p_{1 (i)})}^{λ} - {(p_{2 (i)})}^{λ}\}, \end{matrix}

and the confidence interval of the measure

τ_{M H (H)}^{(λ)}

is likewise obtained by replacing

\partial Φ^{(λ)} / \partial p_{i j}

in the variance formula with

β_{i j}^{(λ)}

, where

β_{i j}^{(λ)} = \{\begin{matrix} \frac{{(τ_{M H (H)}^{(λ)})}^{2}}{Δ} \sum_{k = i}^{j - 1} B_{12} (k) + (j - i) \frac{τ_{M H (H)}^{(λ)}}{Δ} (i < j), \\ \frac{{(τ_{M H (H)}^{(λ)})}^{2}}{Δ} \sum_{k = j}^{i - 1} B_{21} (k) + (i - j) \frac{τ_{M H (H)}^{(λ)}}{Δ} (i > j), \end{matrix}

with

\begin{matrix} B_{12} (k) & = \frac{2^{λ} (λ + 1) G_{2 (k)}^{c}}{(2^{λ} - 1) {(ω_{k}^{(λ)})}^{2}} \{{(G_{1 (k)}^{c})}^{λ} - {(G_{2 (k)}^{c})}^{λ}\} - \frac{1}{ω_{k}^{(λ)}}, \\ B_{21} (k) & = \frac{2^{λ} (λ + 1) G_{1 (k)}^{c}}{(2^{λ} - 1) {(ω_{k}^{(λ)})}^{2}} \{{(G_{2 (k)}^{c})}^{λ} - {(G_{1 (k)}^{c})}^{λ}\} - \frac{1}{ω_{k}^{(λ)}}, \end{matrix}

and

γ_{i j}^{(0)} = {lim}_{λ \to 0} γ_{i j}^{(λ)}

,

β_{i j}^{(0)} = {lim}_{λ \to 0} β_{i j}^{(λ)}

.

4. Properties of Measures

In this section, we use artificial data to check the properties of the proposed measures and their relationship to the measures proposed in previous studies. First, we show that the proposed measure takes the smallest value among the measures for nominal contingency tables, and likewise among those for ordered contingency tables. Let us denote the measures for the MH and PMH models for nominal contingency tables by

ψ_{M H (A)}

and

ψ_{M H (G)}

, respectively (see Appendix A). Since a weighted harmonic mean never exceeds the corresponding weighted geometric mean, which in turn never exceeds the weighted arithmetic mean, it holds that

ψ_{M H (H)}^{(λ)} \leq ψ_{M H (G)}^{(λ)} \leq ψ_{M H (A)}^{(λ)}

(1)

and equality holds if and only if

ψ_{1}^{(λ)} = ψ_{2}^{(λ)} = \dots = ψ_{r}^{(λ)} .

From the formula for

ψ_{i}^{(λ)}

, this means that the ratio of

p_{1 (i)}

and

p_{2 (i)}

, or its reciprocal, takes the same value for all i.

Let us denote the measure for MH and CPMH for ordered contingency tables

τ_{M H (A)}

and

τ_{M H (G)}

, respectively (see Appendix A). In the same manner as in the discussion above, it holds that

τ_{M H (H)}^{(λ)} \leq τ_{M H (G)}^{(λ)} \leq τ_{M H (A)}^{(λ)}

(2)

and equality holds if and only if

ω_{1}^{(λ)} = ω_{2}^{(λ)} = \dots = ω_{r - 1}^{(λ)} .

From the formula for

ω_{i}^{(λ)}

, this likewise means that the ratio of

G_{1 (i)}^{c}

and

G_{2 (i)}^{c}

, or its reciprocal, takes the same value for all i.
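The ordering of the three measures can be illustrated for the nominal case with a short numerical check. This is a sketch under stated assumptions: the helper names and both tables are hypothetical, and the three means are computed directly from the submeasures and the weights, which requires every submeasure to be positive.

```python
import math


def submeasure(a, b, lam):
    """psi_i^(lambda) for the pair {a/(a+b), b/(a+b)}, with lambda > -1."""
    p1, p2 = a / (a + b), b / (a + b)
    if abs(lam) < 1e-12:
        entropy = -sum(q * math.log(q) for q in (p1, p2) if q > 0)
        return 1.0 - entropy / math.log(2)
    diversity = (1.0 - p1 ** (lam + 1) - p2 ** (lam + 1)) / lam
    return 1.0 - lam * 2 ** lam / (2 ** lam - 1) * diversity


def nominal_components(table, lam):
    """Return (submeasures psi_i, weights pi_i) for a table of counts."""
    r = len(table)
    total = sum(map(sum, table))
    row = [sum(table[i]) / total for i in range(r)]
    col = [sum(table[s][i] for s in range(r)) / total for i in range(r)]
    psi = [submeasure(row[i], col[i], lam) for i in range(r)]
    pi = [(row[i] + col[i]) / 2 for i in range(r)]
    return psi, pi


def three_means(values, weights):
    """Weighted harmonic, geometric and arithmetic means (values > 0)."""
    harmonic = 1.0 / sum(w / v for v, w in zip(values, weights))
    geometric = math.prod(v ** w for v, w in zip(values, weights))
    arithmetic = sum(w * v for v, w in zip(values, weights))
    return harmonic, geometric, arithmetic


# Hypothetical tables, chosen only for illustration.
general = [[10, 8, 2], [3, 10, 9], [4, 2, 10]]
equal_ratio = [[0, 3], [1, 0]]  # p_1(i) / p_2(i) is 3 or 1/3 for both categories

h, g, a = three_means(*nominal_components(general, 1.0))
h_eq, g_eq, a_eq = three_means(*nominal_components(equal_ratio, 1.0))
print(h, g, a)
print(h_eq, g_eq, a_eq)
```

In the second table the two submeasures coincide, so the three means agree, which is the equality case described above.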

We now check these properties using the artificial data in Table 1 and the corresponding values in Table 2. Table 2 shows that inequalities (1) and (2) are satisfied. Table 1a is a table with

p_{1 \cdot} = p_{\cdot 1}

and

G_{1 (1)} = G_{2 (1)}

. From Table 2a(a) and Table 2b(a), it can be confirmed that

ψ_{M H (H)}^{(λ)} = τ_{M H (H)}^{(λ)} = 0

. In Table 1c,d, as we can see from the actual calculation,

G_{1 (i)}^{c} / G_{2 (i)}^{c}

is equivalent to

1 / 2 or 2

,

ω_{1}^{(λ)} = ω_{2}^{(λ)} = ω_{3}^{(λ)}

and

p_{1 (i)} / p_{2 (i)}

are equal to

1 / 3 or 3

,

ψ_{1}^{(λ)} = ψ_{2}^{(λ)} = ψ_{3}^{(λ)} = ψ_{4}^{(λ)}

, respectively. Therefore, it can be confirmed that

ψ_{M H (H)}^{(λ)} = ψ_{M H (G)}^{(λ)} = ψ_{M H (A)}^{(λ)}

and

τ_{M H (H)}^{(λ)} = τ_{M H (G)}^{(λ)} = τ_{M H (A)}^{(λ)}

from Table 2a(c,d). Table 1b and Table 1c differ in that categories (1) and (4) are interchanged.

ψ_{M H (H)}^{(λ)}

is unchanged by this interchange, as Table 2a(b,c) shows, but

τ_{M H (H)}^{(λ)}

changes, as Table 2b(b,c) shows. This confirms that

τ_{M H (H)}^{(λ)}

takes the order of the categories into account. Table 1e,f provides examples of contingency tables with the greatest departures from CLMH and LMH, respectively. The two structures do not necessarily coincide.

5. Simulation

This section uses simulation to examine the coverage probability of the confidence intervals for the LMH and CLMH measures.

Simulations were performed on

4 \times 4

randomly generated contingency tables. Tables with sample sizes of 200, 500, and 1000 were generated 1000 times according to the probability structure of the contingency tables. Confidence intervals for the LMH and CLMH measures were calculated with eight values of λ (−0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5, and 3.0) to estimate the probability that the true value of each measure falls within the 95% confidence interval.

The confidence intervals are sufficiently reliable, since the coverage probability exceeds 90% in most of the cells in Table 3. The coverage probability generally increases with the sample size, but not in every cell, e.g., the sample size 1000 for

λ = 0.0

in Table 3a. This may be because, when the sample size is large, the simulation completes without problems even when the measure takes extreme values.
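A coverage simulation of this kind can be sketched as follows. The code is an illustration only and does not reproduce the setting behind Table 3: the 3 × 3 probability table, the seed, the number of replications, and all function names are assumptions, and the derivatives in the delta method are approximated numerically.

```python
import math
import random
from statistics import NormalDist


def submeasure(a, b, lam):
    """psi_i^(lambda) for the pair {a/(a+b), b/(a+b)}, with lambda > -1."""
    p1, p2 = a / (a + b), b / (a + b)
    if abs(lam) < 1e-12:
        entropy = -sum(q * math.log(q) for q in (p1, p2) if q > 0)
        return 1.0 - entropy / math.log(2)
    diversity = (1.0 - p1 ** (lam + 1) - p2 ** (lam + 1)) / lam
    return 1.0 - lam * 2 ** lam / (2 ** lam - 1) * diversity


def psi_mh_h(p, lam):
    """Measure for LMH, evaluated on a table of probabilities p."""
    r = len(p)
    row = [sum(p[i]) for i in range(r)]
    col = [sum(p[s][i] for s in range(r)) for i in range(r)]
    psi = [submeasure(row[i], col[i], lam) for i in range(r)]
    denom = sum(
        (row[i] + col[i]) / 2 * math.prod(psi[s] for s in range(r) if s != i)
        for i in range(r)
    )
    return math.prod(psi) / denom


def delta_ci(phi, counts, alpha=0.05, step=1e-6):
    """Delta-method interval with central-difference derivatives."""
    r, n = len(counts), sum(map(sum, counts))
    p_hat = [[c / n for c in row] for row in counts]
    m2 = m1 = 0.0
    for i in range(r):
        for j in range(r):
            if p_hat[i][j] == 0:
                continue  # a zero cell has zero weight in both sums
            up = [row[:] for row in p_hat]
            down = [row[:] for row in p_hat]
            up[i][j] += step
            down[i][j] -= step
            d = (phi(up) - phi(down)) / (2 * step)
            m2 += p_hat[i][j] * d * d
            m1 += p_hat[i][j] * d
    half = NormalDist().inv_cdf(1 - alpha / 2) * math.sqrt(max(m2 - m1 * m1, 0.0) / n)
    est = phi(p_hat)
    return est - half, est + half


def coverage(p_true, lam, n, reps, seed):
    """Fraction of simulated intervals that contain the true measure."""
    rng = random.Random(seed)
    r = len(p_true)
    cells = [(i, j) for i in range(r) for j in range(r)]
    weights = [p_true[i][j] for i, j in cells]
    truth = psi_mh_h(p_true, lam)
    hits = 0
    for _ in range(reps):
        counts = [[0] * r for _ in range(r)]
        for i, j in rng.choices(cells, weights=weights, k=n):  # multinomial draw
            counts[i][j] += 1
        lower, upper = delta_ci(lambda p: psi_mh_h(p, lam), counts)
        hits += lower <= truth <= upper
    return hits / reps


# Hypothetical probability table with clearly unequal marginals.
p_true = [[0.10, 0.25, 0.15],
          [0.05, 0.10, 0.10],
          [0.05, 0.10, 0.10]]
cov = coverage(p_true, lam=1.0, n=1000, reps=200, seed=1)
print(cov)
```

With only 200 replications the estimated coverage has a Monte Carlo standard error of roughly 0.02, so the printed value is a rough indication and not a precise estimate.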

6. Example

In this section, we show examples of the application of each measure to nominal or ordered contingency tables.

The first set of data provides an example of a contingency table with nominal categories taken from Upton [32], showing the changes in choice of voting party for the three parties (Conservative, Labour, and Liberal) and abstentions in 1964, 1966, and 1970. Table 4a shows the results of estimating the measure

ψ_{M H (H)}^{(λ)}

for the change in voting party from 1964 to 1966, and Table 4b shows the estimates for the change from 1966 to 1970, to assess the degree of departure from the LMH model. Table 4a shows that the changes between 1964 and 1966 fit the LMH model well. Table 4b shows that the degree of departure from the LMH model is larger for the changes in voting party between 1966 and 1970 than for those between 1964 and 1966.

The second set of data provides an example of a contingency table with ordered categories and is taken from Tominaga [33]; the data show the cross-classifications of occupational statuses for Japanese fathers and their sons in 1955 and 1975. Although ordering occupational classes may seem questionable in modern society, we treat them as ordered categories, following the reference. The statuses of the category numbers are as follows: (1) professional and managers; (2) clerical and sales; (3) skilled manual, semiskilled manual, and unskilled manual; and (4) farmers. Table 5a shows the results of estimating the measure

τ_{M H (H)}^{(λ)}

for the occupation class of a father and son as of 1955, and Table 5b estimates the measure for the occupational class of a father and son as of 1975 to see the degree of departure from the CLMH model. From Table 5, the values in the confidence interval of

τ_{M H (H)}^{(λ)}

are greater for Table 5b than for Table 5a. Therefore, the degree of departure from the CLMH model for father and son pairs is estimated to be larger in 1975 than in 1955.

7. Concluding Remarks

For

r \times r

square contingency tables, we proposed an LMH model for nominal categories and a CLMH model for ordered categories. In addition, we proposed harmonic mean-type measures of departure from these models. As shown in the example in Section 6, there are two types of categories, namely, nominal and ordered. If we applied an ordered measure to a nominal contingency table, we would introduce extra information; if we used a nominal measure for an ordered contingency table, information about the order would be lost. Therefore, to analyze a contingency table, it is necessary to consider whether the elements of the categories are ordered or not.

As described in Section 1, the measures of MH, PMH, and LMH models are constructed using arithmetic, geometric, and harmonic means, respectively. We seek to express these three measures in a single formula.

Author Contributions

All authors contributed to the writing and reviewing of the paper. Additionally, K.S. and N.T. implemented the method, contributed the original draft, and co-wrote and revised the paper. A.I. and T.N. contributed to the validation and co-wrote the original and revised versions of the paper. S.T. defined and reviewed the methodology and supervised the whole study and the writing of the paper. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available in [32,33].

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Measures Proposed in Previous Studies

The measures for the MH and PMH models for nominal contingency tables and for the MH and CPMH models for ordered contingency tables are summarized here. Assuming that

p_{i \cdot} + p_{\cdot i} \neq 0

, Tomizawa and Makii [25] proposed a measure to represent the degree of departure from the MH model as follows:

ψ_{M H (A)}^{(λ)} = \sum_{i = 1}^{r} π_{i} ψ_{i}^{(λ)} for λ > - 1

where

π_{i} = \frac{p_{i \cdot} + p_{\cdot i}}{2}, p_{1 (i)} = \frac{p_{i \cdot}}{p_{i \cdot} + p_{\cdot i}}, p_{2 (i)} = \frac{p_{\cdot i}}{p_{i \cdot} + p_{\cdot i}},

ψ_{i}^{(λ)} = \{\begin{matrix} 1 - \frac{λ 2^{λ}}{2^{λ} - 1} I_{i}^{(λ)} for λ \neq 0, \\ 1 - \frac{1}{log 2} I_{i}^{(0)} for λ = 0, \end{matrix}

I_{i}^{(λ)} = \{\begin{matrix} \frac{1}{λ} \{1 - {(p_{1 (i)})}^{λ + 1} - {(p_{2 (i)})}^{λ + 1}\} for λ \neq 0, \\ - p_{1 (i)} log p_{1 (i)} - p_{2 (i)} log p_{2 (i)} for λ = 0 . \end{matrix}

Saigusa et al. [6] proposed a measure for the PMH model defined by

ψ_{M H (G)}^{(λ)} = \prod_{i = 1}^{r} {(ψ_{i}^{(λ)})}^{π_{i}} for λ > - 1 .

Assuming that

G_{1 (i)} + G_{2 (i)} \neq 0

, Tomizawa et al. [17] proposed a measure to represent the degree of departure from the MH model as follows:

τ_{M H (A)}^{(λ)} = \sum_{i = 1}^{r - 1} (G_{1 (i)}^{*} + G_{2 (i)}^{*}) ω_{i}^{(λ)} for λ > - 1

where

G_{s (i)}^{*} = \frac{G_{s (i)}}{Δ}, Δ = \sum_{i = 1}^{r - 1} (G_{1 (i)} + G_{2 (i)}), G_{s (i)}^{c} = \frac{G_{s (i)}}{G_{1 (i)} + G_{2 (i)}} (s = 1 or 2),

ω_{i}^{(λ)} = \{\begin{matrix} 1 - \frac{λ 2^{λ}}{2^{λ} - 1} H_{i}^{(λ)} for λ \neq 0, \\ 1 - \frac{1}{log 2} H_{i}^{(0)} for λ = 0, \end{matrix}

H_{i}^{(λ)} = \{\begin{matrix} \frac{1}{λ} \{1 - {(G_{1 (i)}^{c})}^{λ + 1} - {(G_{2 (i)}^{c})}^{λ + 1}\} for λ \neq 0, \\ - G_{1 (i)}^{c} log G_{1 (i)}^{c} - G_{2 (i)}^{c} log G_{2 (i)}^{c} for λ = 0 . \end{matrix}

Nakagawa et al. [13] proposed a measure for the CPMH model defined by

τ_{M H (G)}^{(λ)} = \prod_{i = 1}^{r - 1} {(ω_{i}^{(λ)})}^{(G_{1 (i)}^{*} + G_{2 (i)}^{*})} for λ > - 1 .

It can be seen that the measures

ψ_{M H (A)}^{(λ)}

and

τ_{M H (A)}^{(λ)}

are weighted arithmetic means of the submeasures

ψ_{i}^{(λ)}

and

ω_{i}^{(λ)}

, respectively.

ψ_{M H (G)}^{(λ)}

and

τ_{M H (G)}^{(λ)}

are weighted geometric means of the submeasures

ψ_{i}^{(λ)}

and

ω_{i}^{(λ)}

, respectively.

Appendix B. Differentiation of the Proposed Measures

Appendix B.1. Measure of LMH

Consider

p_{i j} (i = 1, \dots, r, j = 1, \dots, r)

. Differentiating

ψ_{M H (H)}^{(λ)}

with respect to

p_{i j}

, we obtain

\begin{matrix} \frac{\partial}{\partial p_{i j}} (ψ_{M H (H)}^{(λ)}) & = {[\sum_{i = 1}^{r} (π_{i} \prod_{\begin{matrix} s = 1 \\ s \neq i \end{matrix}}^{r} ψ_{s}^{(λ)})]}^{- 1} \cdot \frac{\partial}{\partial p_{i j}} \{\prod_{s = 1}^{r} ψ_{s}^{(λ)}\} \\ + \prod_{s = 1}^{r} ψ_{s}^{(λ)} \cdot \frac{\partial}{\partial p_{i j}} {[\sum_{i = 1}^{r} (π_{i} \prod_{\begin{matrix} s = 1 \\ s \neq i \end{matrix}}^{r} ψ_{s}^{(λ)})]}^{- 1} \\ = {(ψ_{M H (H)}^{(λ)})}^{2} \{\frac{π_{i}}{{(ψ_{i}^{(λ)})}^{2}} \cdot \frac{\partial ψ_{i}^{(λ)}}{\partial p_{i j}} + \frac{π_{j}}{{(ψ_{j}^{(λ)})}^{2}} \cdot \frac{\partial ψ_{j}^{(λ)}}{\partial p_{i j}}\} \\ - {(ψ_{M H (H)}^{(λ)})}^{2} \{\frac{1}{ψ_{i}^{(λ)}} \cdot \frac{\partial π_{i}}{\partial p_{i j}} + \frac{1}{ψ_{j}^{(λ)}} \cdot \frac{\partial π_{j}}{\partial p_{i j}}\} . \end{matrix}

Considering the derivative of

ψ_{i}^{(λ)}

and

ψ_{j}^{(λ)}

, we obtain

\begin{matrix} \frac{\partial ψ_{i}}{\partial p_{i j}} & = \frac{2^{λ - 1} (λ + 1)}{2^{λ} - 1} \frac{p_{2 (i)}}{π_{i}} \{{(p_{1 (i)})}^{λ} - {(p_{2 (i)})}^{λ}\}, \\ \frac{\partial ψ_{j}}{\partial p_{i j}} & = - \frac{2^{λ - 1} (λ + 1)}{2^{λ} - 1} \frac{p_{1 (j)}}{π_{j}} \{{(p_{1 (j)})}^{λ} - {(p_{2 (j)})}^{λ}\} . \end{matrix}

Because

\partial π_{i} / \partial p_{i j}

and

\partial π_{j} / \partial p_{i j}

are equal to 1/2, we obtain

\begin{matrix} \frac{\partial}{\partial p_{i j}} (ψ_{M H (H)}^{(λ)}) & = {(ψ_{M H (H)}^{(λ)})}^{2} \{\frac{π_{i}}{{(ψ_{i}^{(λ)})}^{2}} \cdot \frac{\partial ψ_{i}^{(λ)}}{\partial p_{i j}} + \frac{π_{j}}{{(ψ_{j}^{(λ)})}^{2}} \cdot \frac{\partial ψ_{j}^{(λ)}}{\partial p_{i j}}\} \\ - {(ψ_{M H (H)}^{(λ)})}^{2} \{\frac{1}{ψ_{i}^{(λ)}} \cdot \frac{\partial π_{i}}{\partial p_{i j}} + \frac{1}{ψ_{j}^{(λ)}} \cdot \frac{\partial π_{j}}{\partial p_{i j}}\} \\ = - {(ψ_{M H (H)}^{(λ)})}^{2} [\frac{1}{2 ψ_{i}^{(λ)}} - \frac{2^{λ - 1} (λ + 1)}{2^{λ} - 1} \frac{p_{2 (i)}}{{(ψ_{i}^{(λ)})}^{2}} \{{(p_{1 (i)})}^{λ} - {(p_{2 (i)})}^{λ}\}] \\ - {(ψ_{M H (H)}^{(λ)})}^{2} [\frac{1}{2 ψ_{j}^{(λ)}} + \frac{2^{λ - 1} (λ + 1)}{2^{λ} - 1} \frac{p_{1 (j)}}{{(ψ_{j}^{(λ)})}^{2}} \{{(p_{1 (j)})}^{λ} - {(p_{2 (j)})}^{λ}\}] . \end{matrix}
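The closed-form derivative can be compared with a numerical derivative as a sanity check. The sketch below is illustrative and makes several assumptions: the function names and the table of counts are hypothetical, only λ ≠ 0 is covered, and the closed form is coded as the expression for γ_ij^(λ) in Section 3, with A_12(i) and A_21(j) written out.

```python
import math


def components(p, lam):
    """Return (p1, p2, psi) per category for a probability table p, lambda != 0."""
    r = len(p)
    row = [sum(p[i]) for i in range(r)]
    col = [sum(p[s][i] for s in range(r)) for i in range(r)]
    p1 = [row[i] / (row[i] + col[i]) for i in range(r)]
    p2 = [col[i] / (row[i] + col[i]) for i in range(r)]
    psi = [
        1.0 - 2 ** lam / (2 ** lam - 1) * (1.0 - p1[i] ** (lam + 1) - p2[i] ** (lam + 1))
        for i in range(r)
    ]
    return p1, p2, psi, [(row[i] + col[i]) / 2 for i in range(r)]


def psi_mh_h(p, lam):
    """Product form of the measure for LMH."""
    _, _, psi, pi = components(p, lam)
    r = len(p)
    denom = sum(pi[i] * math.prod(psi[s] for s in range(r) if s != i) for i in range(r))
    return math.prod(psi) / denom


def gamma(p, lam, i, j):
    """Closed-form partial derivative of the measure with respect to p_ij."""
    p1, p2, psi, _ = components(p, lam)
    c = 2 ** (lam - 1) * (lam + 1) / (2 ** lam - 1)
    a12 = psi[i] / 2 - c * p2[i] * (p1[i] ** lam - p2[i] ** lam)
    a21 = psi[j] / 2 + c * p1[j] * (p1[j] ** lam - p2[j] ** lam)
    return -psi_mh_h(p, lam) ** 2 * (a12 / psi[i] ** 2 + a21 / psi[j] ** 2)


def numerical_derivative(p, lam, i, j, step=1e-6):
    """Central-difference approximation of the same partial derivative."""
    up = [row[:] for row in p]
    down = [row[:] for row in p]
    up[i][j] += step
    down[i][j] -= step
    return (psi_mh_h(up, lam) - psi_mh_h(down, lam)) / (2 * step)


counts = [[10, 8, 2], [3, 10, 9], [4, 2, 10]]  # hypothetical counts
n_total = sum(map(sum, counts))
p = [[c / n_total for c in row] for row in counts]

max_gap = max(
    abs(gamma(p, lam, i, j) - numerical_derivative(p, lam, i, j))
    for lam in (0.5, 1.0, 2.0)
    for i in range(3)
    for j in range(3)
)
print(max_gap)
```

Each cell probability is perturbed on its own, without renormalising the table, which matches the way the partial derivatives are taken in the derivation.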

Appendix B.2. Measure of CLMH

Consider

p_{s t} (s < t) (s = 1, \dots, r, t = 1, \dots, r)

. Differentiating

τ_{M H (H)}^{(λ)}

with respect to

p_{s t}

, we obtain

\begin{matrix} \frac{\partial}{\partial p_{s t}} (τ_{M H (H)}^{(λ)}) & = {[\sum_{i = 1}^{r - 1} ((G_{1 (i)}^{*} + G_{2 (i)}^{*}) \prod_{\begin{matrix} s = 1 \\ s \neq i \end{matrix}}^{r - 1} ω_{s}^{(λ)})]}^{- 1} \cdot \frac{\partial}{\partial p_{s t}} \{\prod_{s = 1}^{r - 1} ω_{s}^{(λ)}\} \\ + \prod_{s = 1}^{r - 1} ω_{s}^{(λ)} \cdot \frac{\partial}{\partial p_{s t}} {[\sum_{i = 1}^{r - 1} ((G_{1 (i)}^{*} + G_{2 (i)}^{*}) \prod_{\begin{matrix} s = 1 \\ s \neq i \end{matrix}}^{r - 1} ω_{s}^{(λ)})]}^{- 1} \\ = {(τ_{M H (H)}^{(λ)})}^{2} \{\frac{G_{1 (s)}^{*} + G_{2 (s)}^{*}}{{(ω_{s}^{(λ)})}^{2}} \cdot \frac{\partial ω_{s}^{(λ)}}{\partial p_{s t}} + \dots + \frac{G_{1 (t - 1)}^{*} + G_{2 (t - 1)}^{*}}{{(ω_{t - 1}^{(λ)})}^{2}} \cdot \frac{\partial ω_{t - 1}^{(λ)}}{\partial p_{s t}}\} \\ - {(τ_{M H (H)}^{(λ)})}^{2} \{\frac{1}{ω_{1}^{(λ)}} \cdot \frac{\partial (G_{1 (1)}^{*} + G_{2 (1)}^{*})}{\partial p_{s t}} + \dots + \frac{1}{ω_{r}^{(λ)}} \cdot \frac{\partial (G_{1 (r)}^{*} + G_{2 (r)}^{*})}{\partial p_{s t}}\} . \end{matrix}

Considering the derivative of

ω_{s}^{(λ)}

, we obtain

\frac{\partial ω_{s}^{(λ)}}{\partial p_{s t}} = \frac{2^{λ} (λ + 1) G_{2 (s)}^{c}}{(2^{λ} - 1) (G_{1 (s)} + G_{2 (s)})} ({(G_{1 (s)}^{c})}^{λ} - {(G_{2 (s)}^{c})}^{λ}) .

Next, consider the derivative of $G_{1 (i)}^{*} + G_{2 (i)}^{*}$. If $G_{1 (n)}^{*}$ contains $p_{s t}$ and $G_{1 (m)}^{*}$ does not, we have

\begin{matrix} \frac{\partial (G_{1 (n)}^{*} + G_{2 (n)}^{*})}{\partial p_{s t}} & = \frac{1}{Δ} \{1 - (t - s) (G_{1 (n)}^{*} + G_{2 (n)}^{*})\}, \\ \frac{\partial (G_{1 (m)}^{*} + G_{2 (m)}^{*})}{\partial p_{s t}} & = - (t - s) (\frac{1}{Δ}) (G_{1 (m)}^{*} + G_{2 (m)}^{*}) . \end{matrix}
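
A sketch of the quotient-rule step behind these derivatives, under the assumption (the definitions are not restated in this appendix) that $G_{k (i)}^{*} = G_{k (i)} / Δ$ with $Δ = \sum_{i = 1}^{r - 1} (G_{1 (i)} + G_{2 (i)})$. For $s < t$, the cell probability $p_{s t}$ enters $G_{1 (i)}$ exactly for $i = s, \dots, t - 1$, so $\partial Δ / \partial p_{s t} = t - s$, and

\begin{matrix} \frac{\partial}{\partial p_{s t}} (\frac{G_{1 (n)} + G_{2 (n)}}{Δ}) & = \frac{1}{Δ} - \frac{(t - s) (G_{1 (n)} + G_{2 (n)})}{Δ^{2}} = \frac{1}{Δ} \{1 - (t - s) (G_{1 (n)}^{*} + G_{2 (n)}^{*})\}, \\ \frac{\partial}{\partial p_{s t}} (\frac{G_{1 (m)} + G_{2 (m)}}{Δ}) & = - \frac{(t - s) (G_{1 (m)} + G_{2 (m)})}{Δ^{2}} = - \frac{t - s}{Δ} (G_{1 (m)}^{*} + G_{2 (m)}^{*}) . \end{matrix}

In the first line the numerator itself changes with $p_{s t}$, which gives the leading $1 / Δ$; in the second line only the normalizing constant $Δ$ changes.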

Substituting these derivatives into the derivative of $τ_{M H (H)}^{(λ)}$, we obtain

\begin{matrix} \frac{\partial}{\partial p_{s t}} (τ_{M H (H)}^{(λ)}) & = {(τ_{M H (H)}^{(λ)})}^{2} \{\frac{G_{1 (s)}^{*} + G_{2 (s)}^{*}}{{(ω_{s}^{(λ)})}^{2}} \cdot \frac{\partial ω_{s}^{(λ)}}{\partial p_{s t}} + \dots + \frac{G_{1 (t - 1)}^{*} + G_{2 (t - 1)}^{*}}{{(ω_{t - 1}^{(λ)})}^{2}} \cdot \frac{\partial ω_{t - 1}^{(λ)}}{\partial p_{s t}}\} \\ & - {(τ_{M H (H)}^{(λ)})}^{2} \{\frac{1}{ω_{1}^{(λ)}} \cdot \frac{\partial (G_{1 (1)}^{*} + G_{2 (1)}^{*})}{\partial p_{s t}} + \dots + \frac{1}{ω_{r - 1}^{(λ)}} \cdot \frac{\partial (G_{1 (r - 1)}^{*} + G_{2 (r - 1)}^{*})}{\partial p_{s t}}\} \\ & = \frac{{(τ_{M H (H)}^{(λ)})}^{2}}{Δ} \sum_{k = s}^{t - 1} (\frac{2^{λ} (λ + 1) G_{2 (k)}^{c}}{(2^{λ} - 1) {(ω_{k}^{(λ)})}^{2}} ({(G_{1 (k)}^{c})}^{λ} - {(G_{2 (k)}^{c})}^{λ}) - \frac{1}{ω_{k}^{(λ)}}) \\ & + (t - s) \frac{τ_{M H (H)}^{(λ)}}{Δ} . \end{matrix}
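
The closed form for $s < t$ can be checked numerically against a central finite difference. The following is a minimal sketch, not the authors' code. It assumes the definitions $G_{1 (i)} = \sum_{a \le i} \sum_{b > i} p_{a b}$, $G_{2 (i)} = \sum_{a > i} \sum_{b \le i} p_{a b}$, $G_{k (i)}^{c} = G_{k (i)} / (G_{1 (i)} + G_{2 (i)})$, $ω_{i}^{(λ)} = 1 - \frac{2^{λ}}{2^{λ} - 1} \{1 - {(G_{1 (i)}^{c})}^{λ + 1} - {(G_{2 (i)}^{c})}^{λ + 1}\}$ for $λ \neq 0$, and $τ_{M H (H)}^{(λ)}$ as the weighted harmonic mean of the $ω_{i}^{(λ)}$; these are restated here as assumptions because the appendix does not repeat them. All function names are hypothetical, and the input is panel (b) of Table 1.

```python
import numpy as np

# Cell probabilities from Table 1(b) of this article (r = 4).
P_B = np.array([
    [0.16, 0.12, 0.05, 0.03],
    [0.02, 0.10, 0.03, 0.02],
    [0.04, 0.01, 0.14, 0.02],
    [0.04, 0.10, 0.00, 0.12],
])

def cut_sums(p):
    """Assumed definitions of G_1(i), G_2(i) for cut points i = 1..r-1."""
    r = p.shape[0]
    g1 = np.array([p[:i, i:].sum() for i in range(1, r)])  # rows <= i, cols > i
    g2 = np.array([p[i:, :i].sum() for i in range(1, r)])  # rows > i, cols <= i
    return g1, g2

def omega(g1, g2, lam):
    """Assumed submeasure omega_i^(lam); requires lam > -1 and lam != 0."""
    a, b = g1 / (g1 + g2), g2 / (g1 + g2)
    return 1.0 - 2.0**lam / (2.0**lam - 1.0) * (1.0 - a**(lam + 1) - b**(lam + 1))

def tau_h(p, lam):
    """Weighted harmonic mean of omega_i with weights G*_1(i) + G*_2(i)."""
    g1, g2 = cut_sums(p)
    weights = (g1 + g2) / (g1 + g2).sum()
    return 1.0 / np.sum(weights / omega(g1, g2, lam))

def dtau_closed_form(p, s, t, lam):
    """Closed-form derivative w.r.t. p_st for s < t (1-based indices)."""
    g1, g2 = cut_sums(p)
    total = g1 + g2
    delta = total.sum()
    a, b = g1 / total, g2 / total
    om = omega(g1, g2, lam)
    tau = tau_h(p, lam)
    coef = 2.0**lam * (lam + 1) / (2.0**lam - 1.0)
    k = np.arange(s - 1, t - 1)  # cut points s..t-1 as 0-based positions
    inner = coef * b[k] / om[k]**2 * (a[k]**lam - b[k]**lam) - 1.0 / om[k]
    return tau**2 / delta * inner.sum() + (t - s) * tau / delta

def dtau_numeric(p, s, t, lam, eps=1e-6):
    """Central finite difference in the single cell p_st (no renormalization)."""
    up, down = p.copy(), p.copy()
    up[s - 1, t - 1] += eps
    down[s - 1, t - 1] -= eps
    return (tau_h(up, lam) - tau_h(down, lam)) / (2.0 * eps)

for s, t in ((1, 2), (1, 4), (2, 3)):
    print(s, t, dtau_closed_form(P_B, s, t, 1.0), dtau_numeric(P_B, s, t, 1.0))
```

Under these assumed definitions, `tau_h(P_B, 1.0)` evaluates to about 0.061, which agrees with the entry for table (b) at $λ = 1.00$ in Table 2, and the two derivative columns printed by the loop should agree up to finite-difference error.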

Similarly, consider $p_{s t}$ with $s > t$ $(s = 1, \dots, r;\ t = 1, \dots, r)$. Noting that the derivative of $ω_{s}^{(λ)}$ is

\begin{matrix} \frac{\partial ω_{s}^{(λ)}}{\partial p_{s t}} = \frac{2^{λ} (λ + 1) G_{1 (s)}^{c}}{(2^{λ} - 1) (G_{1 (s)} + G_{2 (s)})} ({(G_{2 (s)}^{c})}^{λ} - {(G_{1 (s)}^{c})}^{λ}), \end{matrix}

the derivative of $τ_{M H (H)}^{(λ)}$ is

\begin{matrix} \frac{\partial}{\partial p_{s t}} (τ_{M H (H)}^{(λ)}) & = {(τ_{M H (H)}^{(λ)})}^{2} \{\frac{G_{1 (t)}^{*} + G_{2 (t)}^{*}}{{(ω_{t}^{(λ)})}^{2}} \cdot \frac{\partial ω_{t}^{(λ)}}{\partial p_{s t}} + \dots + \frac{G_{1 (s - 1)}^{*} + G_{2 (s - 1)}^{*}}{{(ω_{s - 1}^{(λ)})}^{2}} \cdot \frac{\partial ω_{s - 1}^{(λ)}}{\partial p_{s t}}\} \\ & - {(τ_{M H (H)}^{(λ)})}^{2} \{\frac{1}{ω_{1}^{(λ)}} \cdot \frac{\partial (G_{1 (1)}^{*} + G_{2 (1)}^{*})}{\partial p_{s t}} + \dots + \frac{1}{ω_{r - 1}^{(λ)}} \cdot \frac{\partial (G_{1 (r - 1)}^{*} + G_{2 (r - 1)}^{*})}{\partial p_{s t}}\} \\ & = \frac{{(τ_{M H (H)}^{(λ)})}^{2}}{Δ} \sum_{k = t}^{s - 1} (\frac{2^{λ} (λ + 1) G_{1 (k)}^{c}}{(2^{λ} - 1) {(ω_{k}^{(λ)})}^{2}} ({(G_{2 (k)}^{c})}^{λ} - {(G_{1 (k)}^{c})}^{λ}) - \frac{1}{ω_{k}^{(λ)}}) \\ & + (s - t) \frac{τ_{M H (H)}^{(λ)}}{Δ} . \end{matrix}

References

1. Bowker, A.H. A test for symmetry in contingency tables. J. Am. Stat. Assoc. 1948, 43, 572–574.
2. Bishop, Y.M.; Fienberg, S.E.; Holland, P.W. Discrete Multivariate Analysis: Theory and Practice; The MIT Press: Cambridge, MA, USA, 1975.
3. Saigusa, Y.; Tahata, K.; Tomizawa, S. Measure of departure from partial symmetry for square contingency tables. J. Math. Stat. 2016, 12, 152–156.
4. Saigusa, Y.; Takami, M.; Ishii, A.; Tomizawa, S. Measure of departure from local symmetry for square contingency tables. Int. J. Stat. Probab. 2019, 8, 140–145.
5. Stuart, A. A test for homogeneity of the marginal distributions in a two-way classification. Biometrika 1955, 42, 412–416.
6. Saigusa, Y.; Kubo, Y.; Tahata, K.; Tomizawa, S. A measure of departure from partial marginal homogeneity for square contingency tables. J. Stat. Appl. Probab. Lett. 2020, 7, 1–7.
7. Caussinus, H. Contribution to correlation analysis of two qualitative variables. Ann. Fac. Des Sci. L’Univ. Toulouse 1965, 29, 77–182. (In French)
8. McCullagh, P. A class of parametric models for the analysis of square contingency tables with ordered categories. Biometrika 1978, 65, 413–418.
9. Goodman, L.A. Multiplicative models for square contingency tables with ordered categories. Biometrika 1979, 66, 413–418.
10. Agresti, A. A simple diagonals-parameter symmetry and quasi-symmetry model. Stat. Probab. Lett. 1983, 1, 313–316.
11. Saigusa, Y.; Takami, M.; Ishii, A.; Nakagawa, T.; Tomizawa, S. Measure for departure from cumulative partial symmetry square contingency tables with ordered categories. J. Stat. Adv. Theory Appl. 2019, 21, 53–70.
12. Saigusa, Y.; Takada, T.; Ishii, A.; Nakagawa, T.; Tomizawa, S. Measure of departure from cumulative local symmetry for square contingency tables having ordered categories. Biom. Lett. 2020, 57, 23–35.
13. Nakagawa, T.; Takei, T.; Ishii, A.; Tomizawa, S. Geometric mean type measure of marginal homogeneity for square contingency tables with ordered categories. J. Math. Stat. 2020, 16, 170–175.
14. Bhapkar, V.P. A note on the equivalence of two test criteria for hypotheses in categorical data. J. Am. Stat. Assoc. 1966, 61, 228–235.
15. Fleiss, J.L.; Everitt, B.S. Comparing the marginal totals of square contingency tables. Br. J. Math. Stat. Psychol. 1971, 24, 117–123.
16. Agresti, A. Testing marginal homogeneity for ordinal categorical variables. Biometrics 1983, 39, 505–510.
17. Tomizawa, S.; Miyamoto, N.; Ashihara, N. Measure of departure from marginal homogeneity for square contingency tables having ordered categories. Behaviormetrika 2003, 30, 173–193.
18. Yule, G.U. On the association of attributes in statistics. Philos. Trans. R. Soc. London. Ser. A Contain. Pap. Math. Phys. Character 1900, 194, 257–319.
19. Yule, G.U. On the methods of measuring association between two attributes. J. R. Stat. Soc. 1912, 75, 579–652.
20. Cramér, H. Mathematical Methods of Statistics; Princeton University Press: Princeton, NJ, USA, 1946.
21. Goodman, L.A.; Kruskal, W.H. Measures of association for cross classifications. J. Am. Stat. Assoc. 1954, 49, 732–764.
22. Tomizawa, S.; Seo, T.; Yamamoto, H. Power-divergence-type measure of departure from symmetry for square contingency tables that have nominal categories. J. Appl. Stat. 1998, 25, 387–398.
23. Patil, G.P.; Taillie, C. Diversity as a concept and its measurement. J. Am. Stat. Assoc. 1982, 77, 548–561.
24. Tomizawa, S.; Miyamoto, N.; Hatanaka, Y. Measure of asymmetry for square contingency tables having ordered categories. Aust. N. Z. J. Stat. 2001, 43, 335–349.
25. Tomizawa, S.; Makii, T. Generalized measures of departure from marginal homogeneity for contingency tables with nominal categories. J. Stat. Res. 2001, 35, 1–24.
26. Altun, G.; Aktaş, S. Measures of departure from marginal homogeneity model in square contingency tables. İstat. Derg. İstat. Aktüerya 2018, 11, 93–108.
27. Rand, W.M. Objective criteria for the evaluation of clustering methods. J. Am. Stat. Assoc. 1971, 66, 846–850.
28. Hubert, L.; Arabie, P. Comparing partitions. J. Classif. 1985, 2, 193–218.
29. Cressie, N.; Read, T.R. Multinomial goodness-of-fit tests. J. R. Stat. Soc. Ser. B (Methodol.) 1984, 46, 440–464.
30. Read, T.R.; Cressie, N.A. Goodness-of-Fit Statistics for Discrete Multivariate Data; Springer Science and Business Media: Berlin/Heidelberg, Germany, 2012.
31. Agresti, A. Categorical Data Analysis, 2nd ed.; Wiley: New York, NY, USA, 2002.
32. Upton, G.J.G. A memory model for voting transitions in British elections. J. R. Stat. Soc. Ser. A (Gen.) 1977, 140, 86–94.
33. Tominaga, K. Nippon no Kaisou Kouzou (Japanese Hierarchical Structure); University of Tokyo Press: Tokyo, Japan, 1979. (In Japanese)

Table 1. Artificial data.

(a)						(d)
	(1)	(2)	(3)	(4)	Total		(1)	(2)	(3)	(4)	Total
(1)	0.12	0.09	0.07	0.02	0.30	(1)	0.02	0.09	0.12	0.04	0.27
(2)	0.08	0.09	0.12	0.02	0.31	(2)	0.02	0.03	0.03	0.02	0.10
(3)	0.06	0.03	0.06	0.05	0.20	(3)	0.02	0.01	0.08	0.04	0.15
(4)	0.04	0.01	0.08	0.06	0.19	(4)	0.03	0.17	0.22	0.06	0.48
Total	0.30	0.22	0.33	0.15	1.00	Total	0.09	0.30	0.45	0.16	1.00
(b)						(e)
	(1)	(2)	(3)	(4)	Total		(1)	(2)	(3)	(4)	Total
(1)	0.16	0.12	0.05	0.03	0.36	(1)	0.00	0.20	0.00	0.10	0.30
(2)	0.02	0.10	0.03	0.02	0.17	(2)	0.00	0.00	0.30	0.05	0.35
(3)	0.04	0.01	0.14	0.02	0.21	(3)	0.00	0.00	0.00	0.35	0.35
(4)	0.04	0.10	0.00	0.12	0.26	(4)	0.00	0.00	0.00	0.00	0.00
Total	0.26	0.33	0.22	0.19	1.00	Total	0.00	0.20	0.30	0.50	1.00
(c)						(f)
	(1)	(2)	(3)	(4)	Total		(1)	(2)	(3)	(4)	Total
(1)	0.12	0.10	0.00	0.04	0.26	(1)	0.00	0.20	0.00	0.45	0.65
(2)	0.02	0.10	0.03	0.02	0.17	(2)	0.00	0.00	0.00	0.00	0.00
(3)	0.02	0.01	0.14	0.04	0.21	(3)	0.00	0.05	0.00	0.30	0.35
(4)	0.03	0.12	0.05	0.16	0.36	(4)	0.00	0.00	0.00	0.00	0.00
Total	0.19	0.33	0.22	0.26	1.00	Total	0.00	0.25	0.00	0.75	1.00

Table 2. Values of six measures for Table 1 that are related to various Marginal Homogeneity models.

(a) Measures of nominal categories
			Applied tables
			(a)	(b)	(c)	(d)	(e)	(f)
${\hat{ψ}}_{MH (A)}^{(λ)}$	$λ$	0.00	0.019	0.029	0.029	0.189	0.416	1.000
		0.50	0.024	0.036	0.036	0.230	0.420	1.000
		1.00	0.026	0.039	0.039	0.250	0.422	1.000
${\hat{ψ}}_{MH (G)}^{(λ)}$	$λ$	0.00	0.000	0.011	0.011	0.189	0.076	1.000
		0.50	0.000	0.014	0.014	0.230	0.087	1.000
		1.00	0.000	0.016	0.016	0.250	0.092	1.000
${\hat{ψ}}_{MH (H)}^{(λ)}$	$λ$	0.00	0.000	0.002	0.002	0.189	0.012	1.000
		0.50	0.000	0.002	0.002	0.230	0.015	1.000
		1.00	0.000	0.002	0.002	0.250	0.017	1.000
(b) Measures of ordered categories
			Applied tables
			(a)	(b)	(c)	(d)	(e)	(f)
${\hat{τ}}_{MH (A)}^{(λ)}$	$λ$	0.00	0.022	0.060	0.082	0.180	1.000	0.877
		0.50	0.028	0.075	0.101	0.217	1.000	0.897
		1.00	0.031	0.082	0.111	0.234	1.000	0.905
${\hat{τ}}_{MH (G)}^{(λ)}$	$λ$	0.00	0.000	0.052	0.082	0.046	1.000	0.847
		0.50	0.000	0.065	0.101	0.056	1.000	0.878
		1.00	0.000	0.071	0.111	0.060	1.000	0.889
${\hat{τ}}_{MH (H)}^{(λ)}$	$λ$	0.00	0.000	0.044	0.082	0.004	1.000	0.811
		0.50	0.000	0.055	0.101	0.005	1.000	0.855
		1.00	0.000	0.061	0.111	0.006	1.000	0.871

Table 3. Simulation results for LMH and CLMH.

(a) Results for LMH				(b) Results for CLMH
	Sample Size				Sample Size
$λ$	200	500	1000	$λ$	200	500	1000
−0.5	0.941	0.955	0.949	−0.5	0.874	0.885	0.940
0.0	0.939	0.929	0.897	0.0	0.946	0.951	0.954
0.5	0.874	0.890	0.918	0.5	0.906	0.948	0.885
1.0	0.949	0.941	0.965	1.0	0.942	0.940	0.947
1.5	0.940	0.956	0.910	1.5	0.937	0.950	0.952
2.0	0.962	0.940	0.951	2.0	0.934	0.962	0.917
2.5	0.939	0.851	0.923	2.5	0.939	0.934	0.948
3.0	0.934	0.948	0.943	3.0	0.936	0.927	0.875

Table 4. The estimated measures, estimated approximate standard errors, and approximate 95% confidence intervals for $ψ_{M H (H)}^{(λ)}$, applied to voting changes in the 1964, 1966, and 1970 British elections; data taken from Upton [32].

(a) Result of voting changes between the 1966 and 1964 British elections
$λ$	Estimated measure	Standard error	Confidence interval
−0.5	0.0000	0.0005	(−0.0009, 0.0010)
0.0	0.0001	0.0008	(−0.0015, 0.0016)
0.5	0.0001	0.0010	(−0.0019, 0.0021)
1.0	0.0001	0.0011	(−0.0021, 0.0023)
1.5	0.0001	0.0011	(−0.0021, 0.0023)
2.0	0.0001	0.0011	(−0.0021, 0.0023)
2.5	0.0001	0.0010	(−0.0019, 0.0021)
3.0	0.0001	0.0009	(−0.0018, 0.0020)
(b) Result of voting changes between the 1966 and 1970 British elections
$λ$	Estimated measure	Standard error	Confidence interval
−0.5	0.0079	0.0033	(0.0014, 0.0144)
0.0	0.0133	0.0056	(0.0024, 0.0243)
0.5	0.0167	0.0070	(0.0030, 0.0304)
1.0	0.0184	0.0077	(0.0033, 0.0335)
1.5	0.0188	0.0079	(0.0034, 0.0343)
2.0	0.0184	0.0077	(0.0033, 0.0335)
2.5	0.0173	0.0072	(0.0031, 0.0315)
3.0	0.0158	0.0066	(0.0028, 0.0288)
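
The intervals in Table 4 appear consistent with a Wald-type interval, that is, the estimate plus or minus 1.96 times the estimated standard error. The following is a minimal sketch under that assumption; the function name `wald_interval` is hypothetical, and the inputs are the $λ = 1.0$ row of panel (b).

```python
def wald_interval(estimate, std_error, z=1.96):
    """Approximate 95% interval: estimate +/- z * standard error (assumed form)."""
    half_width = z * std_error
    return estimate - half_width, estimate + half_width

# Row lambda = 1.0 of Table 4(b): estimate 0.0184, standard error 0.0077.
lower, upper = wald_interval(0.0184, 0.0077)
print(f"({lower:.4f}, {upper:.4f})")  # prints (0.0033, 0.0335)
```

Because the tabulated estimates and standard errors are themselves rounded, intervals recomputed this way can differ from the published ones in the last digit for some rows.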

Table 5. The estimated measures, estimated approximate standard errors, and approximate 95% confidence intervals for $τ_{M H (H)}^{(λ)}$, applied to cross-classifications of the occupational statuses of Japanese fathers and sons in 1955 and 1975 (Tominaga [33]).

(a) Result in 1955
$λ$	Estimated measure	Standard error	Confidence interval
−0.5	0.0032	0.0094	(−0.0151, 0.0216)
0.0	0.0055	0.0158	(−0.0255, 0.0364)
0.5	0.0068	0.0198	(−0.0319, 0.0456)
1.0	0.0076	0.0218	(−0.0352, 0.0504)
1.5	0.0078	0.0224	(−0.0361, 0.0516)
2.0	0.0076	0.0218	(−0.0352, 0.0504)
2.5	0.0071	0.0205	(−0.0331, 0.0474)
3.0	0.0065	0.0188	(−0.0303, 0.0433)
(b) Result in 1975
$λ$	Estimated measure	Standard error	Confidence interval
−0.5	0.0713	0.0196	(0.0328, 0.1098)
0.0	0.1172	0.0314	(0.0556, 0.1788)
0.5	0.1443	0.0379	(0.0700, 0.2187)
1.0	0.1576	0.0410	(0.0773, 0.2379)
1.5	0.1611	0.0417	(0.0793, 0.2428)
2.0	0.1576	0.0410	(0.0773, 0.2379)
2.5	0.1495	0.0392	(0.0726, 0.2265)
3.0	0.1385	0.0369	(0.0662, 0.2109)

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
