# Bivariate Distributions Underlying Responses to Ordinal Variables

*Reviewer 1:* Anonymous

*Reviewer 2:* Anonymous

*Reviewer 3:* Anonymous

**Round 1**

*Reviewer 1 Report*

This paper is well-written and easy to follow. The technical details seem correct, and also the code is a useful supplement.

The paper is concerned with the estimation of a latent correlation underlying two ordinal variables.

The main contribution is to supply code and results for fitting normal mixture distributions to empirical data.

The topic is important, since much of social science data is on the ordinal scale, often analyzed using polychorics.

Although I like the paper, I have several questions and remarks. In addition, some recent literature is missing that I think should be discussed in the manuscript.

MAJOR REMARKS

At the end of the 2nd paragraph it is questioned whether the NT polychoric estimator is robust to departures from the normality assumption, and whether distributions other than the normal might fit empirical datasets better.

The first question has been negatively answered in e.g., [1] (see below), a central paper which should be referenced in the manuscript. This paper also contains a bootstrap test for underlying multivariate normality, implemented in package discnorm on CRAN.

It seems that the author is unaware of the recent theoretical findings in [2] (see below). That paper (which is to be followed by a general ordinal data paper) shows that with 2×2 tables, all correlations in (-1, 1) are compatible with any given 2×2 contingency table. This means that the latent correlation can be anything! So even though a mixture distribution fits the data table well, there are many other distributions from other classes that also fit the table perfectly, and the correlations of these distributions span from -1 to 1! This shows that it is really hard to determine the class of underlying distributions. More assumptions outside of the data are needed (what we call substantive knowledge). For instance, we could assume that the marginal distributions are known, but the range of possible correlations is still very large, as illustrated in the paper. This extends to the general ordinal case, but that paper is currently under review. So a more pessimistic or sober view of the task at hand should be written into the paper. Of course, it is still interesting to look at tests that may rule out some distributional classes.

So, to estimate the latent correlations, some assumptions must be made, and the correlation is then estimated under those assumptions.

The assumptions should be tested. But even if the test does not reject them, the user must be aware that the correlation could in theory be of the opposite sign (given that there exist distributions that are perfectly consistent with the data and whose correlation has the opposite sign to the one that was estimated). This is quite troublesome, in my opinion.
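
The possibility of a sign reversal can be illustrated with a deliberately crude toy construction in base R. It is only a sketch and not the construction used in [2]: the table `p`, the helper `latent_cor`, and the single-point latent distributions are all assumptions made for illustration. Each latent distribution below reproduces the same 2×2 table exactly (thresholds at 0), yet the latent correlation moves from strongly negative to strongly positive depending on where the mass sits inside each quadrant.

```r
# Toy 2x2 table of cell probabilities (hypothetical values, not from the manuscript).
# Names: first letter = X low/high, second letter = Y low/high; thresholds at 0.
p <- c(ll = 0.4, lh = 0.1, hl = 0.1, hh = 0.4)

# Correlation of a latent distribution that concentrates each cell's mass on a
# single point inside the matching quadrant.
#   conc: distance from 0 of the points for the concordant cells (ll, hh)
#   disc: distance from 0 of the points for the discordant cells (lh, hl)
latent_cor <- function(conc, disc, prob = p) {
  stopifnot(conc > 0, disc > 0)
  x <- c(-conc, -disc,  disc, conc)
  y <- c(-conc,  disc, -disc, conc)
  mx <- sum(prob * x)
  my <- sum(prob * y)
  cxy <- sum(prob * (x - mx) * (y - my))
  vx <- sum(prob * (x - mx)^2)
  vy <- sum(prob * (y - my)^2)
  cxy / sqrt(vx * vy)
}

r_same <- latent_cor(conc = 1,   disc = 1)    # equals the phi coefficient of the table
r_neg  <- latent_cor(conc = 0.1, disc = 10)   # discordant mass far out: strongly negative
r_pos  <- latent_cor(conc = 10,  disc = 0.1)  # concordant mass far out: strongly positive
```

All three latent distributions put probability 0.4, 0.1, 0.1, 0.4 in the four quadrants, so the observed table cannot distinguish between them.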

Concerning the mixture distributions, is the interpretation in terms of two subpopulations to be taken seriously? Or are two densities combined merely to offer mathematical distributional flexibility? On page 7, around line 210, you discuss two subpopulations as real. I could not find in the manuscript how the polychoric correlation is actually estimated for mixture densities. You report on p. 7 a positive (+0.85) and a negative (-0.33) correlation. But to be used in SEM, must these not be replaced by a single value? Is the way to do that clearly described in the manuscript?
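
One candidate for such a single value is the correlation implied by the moments of the mixture as a whole. The sketch below uses the standard formula for the covariance matrix of a two-component mixture. Whether this matches what the manuscript does is not known; the function name `mixture_cor`, the mixing weight, the component means, and the unit variances are assumptions chosen for illustration. Only the two component correlations (+0.85 and -0.33) come from the text above.

```r
# Overall correlation implied by a two-component bivariate normal mixture.
#   w:       mixing weight of component 1 (component 2 has weight 1 - w)
#   mu1/mu2: component mean vectors (length 2)
#   S1/S2:   component covariance matrices (2 x 2)
mixture_cor <- function(w, mu1, S1, mu2, S2) {
  mu <- w * mu1 + (1 - w) * mu2            # overall mean
  d1 <- mu1 - mu
  d2 <- mu2 - mu
  # overall covariance = within-component part + between-component part
  S <- w * (S1 + outer(d1, d1)) + (1 - w) * (S2 + outer(d2, d2))
  S[1, 2] / sqrt(S[1, 1] * S[2, 2])
}

S1 <- matrix(c(1,  0.85,  0.85, 1), nrow = 2)  # component correlation +0.85
S2 <- matrix(c(1, -0.33, -0.33, 1), nrow = 2)  # component correlation -0.33

# Hypothetical equal weights and equal means: weighted average of the two values.
rho_equal_means <- mixture_cor(0.5, c(0, 0), S1, c(0, 0), S2)

# Hypothetical separated means: the between-component part raises the correlation.
rho_shifted <- mixture_cor(0.5, c(-1, -1), S1, c(1, 1), S2)
```

The second call shows why the component means matter: the overall correlation depends on how far apart the components are, not only on the two within-component correlations.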

What happens to the asymptotic covariance matrix (GAMMA) of the polychoric correlations under other classes of distributions? I did not see any discussion of this. To be used in SEM this matrix is needed for DWLS and ULS. Scott Monroe in 2018 (MBR) suggested a simulation approach that might be extended also to other distributional classes.

This relates to the question: what if some bivariate normality tests fail, but not all? What should be done then? To overcome this, Maydeu proposed his multivariate test in 2009, and later Foldnes and Grønneberg (2019) proposed a bootstrap test in [1].

So some discussion on going from the bivariate case to the general multivariate case would be useful.

In general, the bootstrap test in the discnorm package could also be used instead of the LRT. I am not sure which of these has the best power and Type I error control. I also ran your empirical test with discnorm and got similar results (in case this is of interest):

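    # DS14 must be available from an attached package (the review does not say which one).
    data("DS14")
    library(discnorm)

    # Bootstrap test of underlying bivariate normality for every pair of columns 3 to 16.
    pvals <- NULL
    for (i in 3:15) {
      for (j in (i + 1):16) {
        pvals <- c(pvals, bootTest(DS14[, c(i, j)]))
      }
    }

    mean(pvals < .05)
    # [1] 0.7362637

    mean(p.adjust(pvals) > .05)
    # reviewer's note: 70 (similar to paper?)

    # Joint test of underlying multivariate normality for all 14 items.
    bootTest(DS14[, 3:16])
    # reviewer's note: 0, so normality is rejected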
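
One detail is worth keeping in mind when comparing these numbers with a Bonferroni-based procedure: `p.adjust()` in base R uses Holm's method unless told otherwise. The small sketch below, with made-up p-values, shows that the two adjustments can lead to a different number of rejections.

```r
# Hypothetical raw p-values, chosen only to show the difference.
p_raw <- c(0.001, 0.012, 0.020, 0.030, 0.200)

p_default <- p.adjust(p_raw)                         # default method is "holm"
p_holm    <- p.adjust(p_raw, method = "holm")
p_bonf    <- p.adjust(p_raw, method = "bonferroni")

n_reject_holm <- sum(p_holm < .05)
n_reject_bonf <- sum(p_bonf < .05)
```

Holm's procedure is never less powerful than Bonferroni's, so rejection rates computed with the default are not directly comparable with rates from a plain Bonferroni correction.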

MINOR REMARKS:

-p1, line 14 (scienceS)

-p2, line 43, maybe add the reference [1] to reference [18], as these are related

-p2, line 48. Remove reference to [12] and the whole sentence. In that paper there was NO departure from normality, as shown by [17]

-p4, line 175. Given [2] below, it is not possible from data alone to determine the most plausible distribution.

-p4, line 140. In eq (7) I got confused. Gamma contains the thresholds. But then, inside the integral, gamma should not occur inside the skew-normal dist. The skew-normal dist does not care about the thresholds!

-p4, line 140. Eq(8): Please define \pi in the formula.

-p5, line 154. Eq(10). Again, mixture density h() is not dependent upon the thresholds, so do not use gamma inside the integral.

-p5, line 161. Maybe here is the place to define what the overall rho is, that is, how to combine the two correlations into one correlation to be used as the underlying correlation in the mixture dist.

-p6, line 184, please also give the sample size somewhere when introducing DS14

-p10, line 275. I am confused by the many free parameters. If the thresholds are freely estimated, the variances in the mixture density must be restricted? Otherwise the thresholds are not identified? But there is no restriction on the variances in the two normal distributions??

-p10, line 284: I think that once the thresholds have been estimated, e.g. for the first variable, they must be held fixed when considering other pairs of variables. E.g., should the thresholds for var1 in pair 1-2 not be equal to the thresholds for var1 in pair 1-3?

-p10, line 288. The Bonferroni adjustment has been proposed and studied in [3] below, I believe. Please explain if your approach is different from [3].
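
The identification concern raised for line 275 can be made concrete for a single normal latent variable. In the base R sketch below (the helper `cell_probs` and all numbers are illustrative assumptions), shifting and rescaling the latent variable together with its thresholds leaves every category probability unchanged, so thresholds, mean, and variance cannot all be estimated freely from the observed frequencies.

```r
# Category probabilities for an ordinal variable obtained by cutting a
# normal latent variable at the given thresholds.
cell_probs <- function(thresholds, mean = 0, sd = 1) {
  diff(pnorm(c(-Inf, thresholds, Inf), mean = mean, sd = sd))
}

tau <- c(-1, 0.2, 1.5)                            # hypothetical thresholds

p_a <- cell_probs(tau, mean = 0, sd = 1)          # standard normal latent variable
p_b <- cell_probs(2 * tau + 3, mean = 3, sd = 2)  # rescaled latent variable and thresholds

same_probs <- isTRUE(all.equal(p_a, p_b))
```

The same argument applies to a mixture if both components and the thresholds are rescaled jointly, which suggests that at least one location and one scale have to be fixed somewhere.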

References

[1] Foldnes, N., & Grønneberg, S. (2020). Pernicious Polychorics: The Impact and Detection of Underlying Non-normality. Structural Equation Modeling: A Multidisciplinary Journal, 27(4), 525-543. DOI: 10.1080/10705511.2019.1673168

[2] Grønneberg, S., Moss, J., & Foldnes, N. (2020). Partial Identification of Latent Correlations with Binary Data. Psychometrika, 85(4), 1028-1051. DOI: 10.1007/s11336-020-09737-y. PMID: 33346887.

[3] Raykov, T., & Marcoulides, G. A. (2015). On Examining the Underlying Normal Variable Assumption in Latent Variable Models With Categorical Indicators. Structural Equation Modeling: A Multidisciplinary Journal, 22(4), 581-587. DOI: 10.1080/10705511.2014.937846

*Author Response*

Please see attachment (the attachment contains the responses to all three reviewers)

Author Response File: Author Response.pdf

*Reviewer 2 Report*

The manuscript "Bivariate Distributions Underlying Responses to Ordinal Variables" applies polychoric correlations with various underlying distributions (normal, skew-normal, and mixture) to two empirical datasets. The paper is well-written and easy to understand; however, I have two major conceptual issues that I would like to see the authors address.

(1) My first concern is that I am not sure that the problem of discovering the "correct" underlying distribution is well-defined or mathematically identified.

- For example, suppose that a bivariate normal underlying joint distribution holds - then take a log transformation of one of the two variables. The correlation between the two underlying variables will change as a result of the nonlinear transformation, but the same bivariate counts of polychotomized responses could be observed, so that multiple underlying distributions can lead to the same pattern of data. This is only my intuition, but I would like to see the authors address this topic before using language such as the "correct underlying distribution" or strongly interpreting a finding of (e.g.) a positive shape parameter to mean that the underlying distribution must be right skewed (e.g., p. 6, lines 193-195).

- Notably, the availability of fit statistics is not evidence that the problem is identified. Just because we can reject a bivariate normal doesn't mean that the population distribution is knowable with enough data. Conversely, failure to reject a certain distribution does not imply that that is the most likely underlying distribution.
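
The transformation argument in point (1) can be checked with a small simulation in base R. This is only a sketch: the sample size, correlation, and thresholds are made up, and `exp()` is used instead of a log transformation because the log is undefined for the negative values of a normal variable. Any strictly increasing transformation makes the same point.

```r
set.seed(1)
n   <- 1e5
rho <- 0.6

# Bivariate normal latent variables with correlation rho.
x <- rnorm(n)
y <- rho * x + sqrt(1 - rho^2) * rnorm(n)

# Strictly increasing transformation of the first latent variable.
z <- exp(x)

# Discretize: x at tau, z at the transformed thresholds exp(tau), y at tau.
tau   <- c(-0.5, 0.5)
cat_x <- cut(x, breaks = c(-Inf, tau, Inf), labels = FALSE)
cat_z <- cut(z, breaks = c(-Inf, exp(tau), Inf), labels = FALSE)
cat_y <- cut(y, breaks = c(-Inf, tau, Inf), labels = FALSE)

tab_x <- table(cat_x, cat_y)
tab_z <- table(cat_z, cat_y)
same_table <- all(tab_x == tab_z)  # identical observed contingency tables

cor_x <- cor(x, y)  # close to rho
cor_z <- cor(z, y)  # noticeably smaller; in theory rho / sqrt(exp(1) - 1)
```

The two latent pairs produce the same ordinal data but have different latent correlations, which is why the observed table alone cannot single out one underlying distribution.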

(2) The authors rely almost exclusively on the use of fit statistics (p-values greater than vs. lower than a threshold) to determine which underlying distribution is most plausible. However, fit statistics are a poor arbiter of truth. For one, fit statistics are highly sensitive to sample size.

- Another criterion that the authors could consider (if finding the true underlying distribution is their main concern) is the positive definiteness of the resulting correlation *matrices*, in addition to the pairwise correlations. In my experience, most matrices of polychoric correlations are not positive semi-definite, and one plausible cause of this phenomenon may be the misspecification of the underlying distribution.

- The authors might also look at the extent to which the same marginal underlying distribution is identified for the same item across different paired items.
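
Checking positive semi-definiteness of a matrix assembled from pairwise estimates is straightforward in base R. The sketch below is illustrative only: the helper `is_psd`, the tolerance, and both matrices are assumptions, not results from the manuscript.

```r
# TRUE if the symmetric matrix R has no eigenvalue below -tol.
is_psd <- function(R, tol = 1e-8) {
  ev <- eigen(R, symmetric = TRUE, only.values = TRUE)$values
  min(ev) >= -tol
}

# Hypothetical matrix of pairwise estimates that cannot be a correlation matrix:
# items 2 and 3 both correlate 0.9 with item 1 but -0.9 with each other.
R_bad <- matrix(c( 1.0,  0.9,  0.9,
                   0.9,  1.0, -0.9,
                   0.9, -0.9,  1.0), nrow = 3, byrow = TRUE)

# Hypothetical well-behaved matrix for comparison.
R_ok <- matrix(0.5, nrow = 3, ncol = 3)
diag(R_ok) <- 1

bad_is_psd <- is_psd(R_bad)
ok_is_psd  <- is_psd(R_ok)
```

Because each entry is estimated from a different bivariate model, nothing forces the assembled matrix to be a valid correlation matrix, so a check of this kind could be reported per underlying distribution.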

Other issues:

- What sample sizes are recommended for each type of polychoric correlation? How might that affect how you interpret your results and the recommendations you make?

- Section 4.4.2 demonstrates that different choices of distribution can sometimes have a very large effect - what should the reader take away from this?

- Appendix B, line 533 runs off the page

- Your current code seems to depend on packages that you may not control. I would recommend that the authors make clear in their code where and how these packages are used (in case future updates break parts of the code). Citing package versions can also help toward this end.
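
As one possible way to follow the last suggestion, package versions can be recorded from within the analysis script itself. The helper `report_versions` below is a hypothetical name, and the package names passed to it are placeholders; the actual dependencies of the authors' code are not listed in this report.

```r
# Return the installed version of each package, or NA if it is not installed.
report_versions <- function(pkgs) {
  vapply(pkgs, function(pkg) {
    if (requireNamespace(pkg, quietly = TRUE)) {
      as.character(utils::packageVersion(pkg))
    } else {
      NA_character_
    }
  }, character(1))
}

# Placeholder package names; replace with the packages the code depends on.
versions <- report_versions(c("stats", "utils"))

# R version string to report alongside the package versions.
r_version <- R.version.string
```

Calling `sessionInfo()` at the end of a script gives a fuller record, including the platform and all attached packages.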

*Author Response*

Please see attachment (includes responses to all three review reports).

Author Response File: Author Response.pdf

*Reviewer 3 Report*

Thank you for the opportunity to review this manuscript. I really enjoyed reading it and I think it is a very useful publication. The manuscript is well written and easy to follow. In the text that follows I only have some comments and suggestions that the authors may address or not as they see fit. I hope my feedback will be useful.

- In the discussion section, the authors could try to anticipate some lines of future work. For example, what are the implications of the results obtained for analyses that often make use of polychoric correlations, such as factor analysis?
- At the same time, it would be good to comment on the sample size required to estimate this type of correlation with sufficient precision. Commenting on this and comparing it with Pearson correlations might be helpful to readers. A sentence on tetrachoric correlations, which are calculated for dichotomous items and are relatively frequent in psychology (e.g., right/wrong and true/false items), could also be included.
- In Table 4, would it be interesting to see the bias in addition to the absolute bias? In this same section, to increase the didactic value of the article, it might be helpful to take a pair of items, show their wording and the estimated values for each type of correlation, and try to assess from a substantive point of view which value makes more sense. As it is a simple illustration, it could be done with only one of the datasets. The comparison between "normal" and "mixture" with free rho is of particular interest since this is where the biggest differences occur.
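
For the dichotomous case mentioned above, a tetrachoric correlation under an assumed bivariate normal latent distribution can be sketched in base R without extra packages. This is a minimal two-step version (thresholds from the margins, then the correlation from the first cell) and not necessarily the estimator used in the manuscript; the function names and the table of counts are assumptions.

```r
# P(X <= a, Y <= b) for a standard bivariate normal with correlation rho,
# computed by one-dimensional numerical integration over x.
biv_norm_cdf <- function(a, b, rho) {
  integrand <- function(x) dnorm(x) * pnorm((b - rho * x) / sqrt(1 - rho^2))
  integrate(integrand, lower = -Inf, upper = a)$value
}

# Tetrachoric correlation for a 2x2 table of counts
# (rows: first item low/high, columns: second item low/high).
tetrachoric <- function(tab) {
  p  <- tab / sum(tab)
  t1 <- qnorm(sum(p[1, ]))  # threshold of the row variable
  t2 <- qnorm(sum(p[, 1]))  # threshold of the column variable
  f  <- function(rho) biv_norm_cdf(t1, t2, rho) - p[1, 1]
  uniroot(f, interval = c(-0.99, 0.99), tol = 1e-8)$root
}

tab     <- matrix(c(40, 10, 10, 40), nrow = 2)  # hypothetical counts
rho_hat <- tetrachoric(tab)
```

With both thresholds at zero, as in this table, the result can be checked against the closed form sin(2π(p11 - 1/4)). The search interval is limited to (-0.99, 0.99), so tables implying a more extreme correlation would need a different treatment.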

*Author Response*

Please see attachment (includes responses to all three review reports).

Author Response File: Author Response.pdf

**Round 2**

*Reviewer 1 Report*

The paper has been improved a great deal. My minor remarks are as follows:

- Please check language (is/are).

- The “gamma” problem I pointed to in my previous review has not been solved. For instance, line 138 says that gamma contains the thresholds. In eq gamma appears as the parameter vector for the normal distribution (stated in line 137). But the normal distribution does not care about thresholds. Same problem in eq 5.

- line 91: “it may be difficult”. Here the truth is “impossible”.
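
The notational point about gamma can be written out explicitly. The following is a sketch in generic notation, since the manuscript's equations are not reproduced in this report: the symbols $p_{ij}$, $\tau$, $\boldsymbol{\theta}$, $w$, and $\phi_2$ are assumptions, not the authors' own. The thresholds enter only through the limits of integration, while the density depends only on its distributional parameters.

```latex
% Probability of observing category i on item 1 and category j on item 2:
% thresholds appear only in the integration limits.
p_{ij} = \int_{\tau_{1,i-1}}^{\tau_{1,i}} \int_{\tau_{2,j-1}}^{\tau_{2,j}}
         f(x, y; \boldsymbol{\theta}) \, dy \, dx ,
\qquad \tau_{k,0} = -\infty , \quad \tau_{k,C_k} = +\infty .

% Example: two-component normal mixture density with mixing weight w.
% Its parameter vector contains no thresholds.
f(x, y; \boldsymbol{\theta}) = w \, \phi_2(x, y; \boldsymbol{\mu}_1, \boldsymbol{\Sigma}_1)
  + (1 - w) \, \phi_2(x, y; \boldsymbol{\mu}_2, \boldsymbol{\Sigma}_2),
\qquad \boldsymbol{\theta} = (w, \boldsymbol{\mu}_1, \boldsymbol{\Sigma}_1,
  \boldsymbol{\mu}_2, \boldsymbol{\Sigma}_2).
```

Here $C_k$ denotes the number of response categories of item $k$, and $\phi_2$ is the bivariate normal density.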

*Author Response*

*Thank you very much for reading and reviewing the revision of our manuscript.*

- Please check language (is/are)

*We carefully read the manuscript and let a colleague read it, and we corrected several mistakes indeed.*

- The “gamma” problem I pointed to in my previous review has not been solved. For instance, line 138 says that gamma contains the thresholds. In eq gamma appears as the parameter vector for the normal distribution (stated in line 137). But the normal distribution does not care about thresholds. Same problem in eq 5.

*Our apologies for not addressing this issue in the previous round of revisions. We misunderstood the reviewer’s point, but now we see that indeed we cannot refer to the parameter vector gamma in these equations. We corrected the issue in equations 3, 5, and 9, so now we only name the parameters that are part of the continuous distribution under consideration. Thank you for spotting these errors.*

- line 91: “it may be difficult” . Here the truth is “impossible”.

*We agree and changed this sentence to:*

*“Although it is clear that the polychoric correlation coefficient can be accurately estimated as long as the underlying distribution giving rise to the observed ordinal responses is known, it is impossible to identify the correct underlying distribution for empirical data.”*

*Reviewer 2 Report*

I am satisfied with the authors' changes and recommend publication.

*Author Response*

Thank you for your constructive review and positive recommendation.