Discussiones Mathematicae Probability and Statistics 24(2) (2004) 255-280

Application of the Rasch model in categorical pedigree analysis using MCEM: I. Binary data

G. Qian, R.M. Huggins

Department of Statistical Science, La Trobe University
VIC, 3086, Australia

D.Z. Loesch

School of Psychological Science, La Trobe University
VIC, 3086, Australia

Abstract

An extension of the Rasch model with correlated latent variables is proposed to model correlated binary data within families. The latent variables have the classical correlation structure of Fisher (1918), so the model parameters have genetic interpretations. The proposed model is fitted using a hybrid of the Metropolis-Hastings algorithm and the MCEM modification of the EM algorithm. It is illustrated using genotype-phenotype data on a psychological subtest in families where some members are affected by the genetic disorder fragile X. In addition, hypothesis testing and model selection methods based on the Wald statistic are discussed.

Keywords: pedigree analysis, binary data, MCEM algorithm, Metropolis-Hastings algorithm.

2000 Mathematics Subject Classification: 62F10, 62F03, 92D30.
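
The sampling step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard Rasch form for the response probability, a zero-mean normal distribution for the latent variables of one family, and a random-walk Metropolis-Hastings proposal. The names `rasch_prob`, `log_target` and `mh_sample`, the covariance matrix `sigma`, the item difficulties `beta` and the step size are all hypothetical choices made for illustration.

```python
import numpy as np


def rasch_prob(theta, beta):
    """Standard Rasch success probability for ability theta and item difficulty beta."""
    return 1.0 / (1.0 + np.exp(-(theta - beta)))


def log_target(theta, y, beta, sigma_inv):
    """Unnormalised log density of the latent vector theta given binary responses y.

    theta: (n,) latent variables of the n family members
    y: (n, k) array of 0/1 responses to k items
    beta: (k,) item difficulties
    sigma_inv: (n, n) inverse of the assumed within-family covariance matrix
    """
    eta = theta[:, None] - beta[None, :]
    # Bernoulli log-likelihood with logit eta: y * eta - log(1 + exp(eta)).
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
    # Zero-mean multivariate normal log density, up to an additive constant.
    logprior = -0.5 * theta @ sigma_inv @ theta
    return loglik + logprior


def mh_sample(y, beta, sigma, n_draws=500, step=0.5, seed=0):
    """Random-walk Metropolis-Hastings draws of theta for one family."""
    rng = np.random.default_rng(seed)
    sigma_inv = np.linalg.inv(sigma)
    theta = np.zeros(y.shape[0])
    current = log_target(theta, y, beta, sigma_inv)
    draws = np.empty((n_draws, theta.size))
    for t in range(n_draws):
        proposal = theta + step * rng.standard_normal(theta.size)
        candidate = log_target(proposal, y, beta, sigma_inv)
        # Symmetric proposal, so the acceptance ratio is the target ratio.
        if np.log(rng.uniform()) < candidate - current:
            theta, current = proposal, candidate
        draws[t] = theta
    return draws


if __name__ == "__main__":
    # Hypothetical family of three members answering two items.
    y = np.array([[1, 0], [1, 1], [0, 0]])
    beta = np.array([0.0, 0.5])
    sigma = np.array([[1.0, 0.5, 0.5],
                      [0.5, 1.0, 0.5],
                      [0.5, 0.5, 1.0]])
    draws = mh_sample(y, beta, sigma)
    print(draws.mean(axis=0))
```

In a Monte Carlo EM scheme, draws of this kind would replace the intractable expectation in the E-step by an average over the sampled latent variables, and the M-step would then update the parameters given that average.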

References

[1] J.H. Albert and S. Chib, Bayesian analysis of binary and polychotomous response data, Journal of the American Statistical Association, 88 (1993), 669-679.
[2] J. Albert and M. Ghosh, Item response modelling, in: Generalized Linear Models, A Bayesian Perspective, Ed. Dey, D.K., Ghosh, S.K., Mallick, B.K. Marcel Dekker, New York (2000), 173-193.
[3] G.E. Bonney, Regressive logistic models for familial disease and other binary traits, Biometrics 42 (1986), 611-625.
[4] K.S. Chan and J. Ledolter, Monte Carlo EM estimation for time series models involving counts, J. Amer. Stat. Assoc. 90 (1995), 242-252.
[5] S. Chib, Bayesian methods for correlated binary data, in: Generalized Linear Models, A Bayesian Perspective, Ed. Dey, D.K., Ghosh, S.K., Mallick, B.K. Marcel Dekker, New York (2000), 113-131.
[6] S. Chib and E. Greenberg, Understanding the Metropolis-Hastings algorithm, American Statistician 49 (1995), 327-335.
[7] A.P. Dempster, N. Laird and D.B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, J. Royal Stat. Soc. B 39 (1977), 1-38.
[8] G.H. Fischer and I.W. Molenaar, Rasch Models, Foundations, Recent Developments, and Applications, Springer-Verlag, New York 1995.
[9] R.A. Fisher, The correlation between relatives on the supposition of Mendelian inheritance, Trans. of the Royal Society of Edinburgh 52 (1918), 399-433.
[10] P.E.B. FitzGerald and M.W. Knuiman, Interpretation of regressive logistic regression coefficients in analyses of familial data, Biometrics 54 (1998), 909-920.
[11] A. Gelman and D.B. Rubin, Inference from iterative simulation using multiple sequences, Statistical Science 7 (1992), 457-472.
[12] S.W. Guo and E.A. Thompson, Monte Carlo estimation of mixed models for large complex pedigrees, Biometrics 50 (1994), 417-432.
[13] J.L. Hopper, Variance components for statistical genetics: applications in medical research to characteristics related to human diseases and health, Statistical Methods in Medical Research 2 (1993), 199-223.
[14] J.L. Hopper and J.D. Mathews, Extensions to multivariate normal models for pedigree analysis, Ann. Hum. Genet. 46 (1982), 373-383.
[15] R.M. Huggins, On robust analysis of pedigree data, Aust. J. Stat. 35 (1993), 43-57.
[16] K.L. Lange, J. Westlake and M.A. Spence, Extensions to pedigree analysis, III, Variance components by the scoring method, Ann. Hum. Genet. 39 (1976), 485-491.
[17] D.Z. Loesch, Q.M. Bui, J. Grigsby, E. Butler, J. Epstein, R.M. Huggins and A.K. Taylor, Effect of the fragile X status categories and the FMRP levels on executive functioning in fragile X males and females, Neuropsychology (2002) (in press).
[18] T.A. Louis, Finding observed information using the EM algorithm, J. Royal Stat. Soc. B 44 (1982), 226-233.
[19] X.L. Meng and D.B. Rubin, Using EM to obtain asymptotic variance-covariance matrices: The SEM algorithm, J. Amer. Stat. Assoc. 86 (1991), 899-909.
[20] G. Rasch, Probabilistic Models for some Intelligence and Attainment Tests, University of Chicago Press, Chicago 1980.
[21] D. Sinha, M.A. Tanner and W.J. Hall, Maximization of the marginal likelihood of grouped survival data, Biometrika 81 (1994), 53-60.
[22] S. Sommer and R.M. Huggins, Variable selection using the Wald test and a robust Cp, Applied Statistics 45 (1996), 15-29.
[23] M.A. Tanner, Tools for Statistical Inference: Methods for the Exploration of Posterior Distributions and Likelihood Functions, 3rd Ed., Springer, New York 1996.
[24] G.C.G. Wei and M.A. Tanner, A Monte Carlo implementation of the EM algorithm and the poor man's data augmentation algorithm, J. Amer. Stat. Assoc. 85 (1990), 699-704.

Received 14 March 2004