Genome-Wide Linkage and Association Study of Childhood Gender Nonconformity in Males. Alan R. Sanders, Gary W. Beecham, Shengru Guo, Khytam Dawood, Gerulf Rieger, Ritesha S. Krishnappa, Alana B. Kolundzija, J. Michael Bailey & Eden R. Martin. Archives of Sexual Behavior, Sep 13 2021. https://link.springer.com/article/10.1007%2Fs10508-021-02146-x
Abstract: Male sexual orientation is influenced by environmental and complex genetic factors. Childhood gender nonconformity (CGN) is one of the strongest correlates of homosexuality with substantial familiality. We studied brothers in families with two or more homosexual brothers (409 concordant sibling pairs in 384 families, as well as their heterosexual brothers), who self-recalled their CGN. To map loci for CGN, we conducted a genome-wide linkage scan (GWLS) using SNP genotypes. The strongest linkage peaks, each with significant or suggestive two-point LOD scores and multipoint LOD score support, were on chromosomes 5q31 (maximum two-point LOD = 4.45), 6q12 (maximum two-point LOD = 3.64), 7q33 (maximum two-point LOD = 3.09), and 8q24 (maximum two-point LOD = 3.67), with the latter not overlapping with the previously reported strongest linkage region for male sexual orientation on pericentromeric chromosome 8. Family-based association analyses were used to identify associated variants in the linkage regions, with a cluster of SNPs (minimum association p = 1.3 × 10⁻⁸) found at the 5q31 linkage peak. Genome-wide, clusters of multiple SNPs in the 10⁻⁶ to 10⁻⁸ p-value range were found at chromosomes 5p13, 5q31, 7q32, 8p22, and 10q23, highlighting glutamate-related genes. This is the first reported GWLS and genome-wide association study on CGN. Further increasing genetic knowledge about CGN and its relationships to male sexual orientation should help advance our understanding of the biology of these associated traits.
Discussion
In this first GWLS on CGN in males, we found genome-wide significant linkage with multipoint support for several linkage regions, most notably at chromosomes 5q31 and 8q24 (Fig. 2, Supplementary Table 1). The strongest multipoint linkage peaks for CGN (Fig. 2) did not align with the strongest such signals from earlier GWLS on male sexual orientation (Sanders et al., 2015). This was not unexpected: although CGN and sexual orientation are associated phenotypes, they are far from identical, and both are traits with complex genetics, so one would not necessarily expect largely overlapping linkage or association patterns. We note that one of the top multipoint peaks from the GWLS (chromosome 5q31, Supplementary Fig. 2) also contains a cluster of 10 associated SNPs (10⁻⁸ < p < 10⁻⁶) from the GWAS, 2 of which are genome-wide significant associations, so this region has both linkage and association positional evidence. However, none of the genes in the immediate region of this cluster have obvious putative connections to CGN.
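As a rough guide to the scale of the LOD scores reported for these peaks, the sketch below converts a LOD score into an approximate pointwise p-value. It assumes the classical large-sample approximation in which 2 ln(10) × LOD follows a 50:50 mixture of a point mass at zero and a chi-square distribution with one degree of freedom; this is a textbook approximation, not necessarily the procedure used in this study, and it gives pointwise rather than genome-wide p-values. The function name `lod_to_pointwise_p` is hypothetical.

```python
import math

LN10 = math.log(10.0)


def lod_to_pointwise_p(lod: float) -> float:
    """Approximate pointwise p-value for a LOD score.

    Assumption (not from the source): 2*ln(10)*LOD is treated as a
    one-sided, 1-df chi-square statistic.
    """
    if lod < 0:
        raise ValueError("LOD score must be non-negative for this approximation")
    chi2 = 2.0 * LN10 * lod
    # Survival function of a 1-df chi-square is erfc(sqrt(x / 2));
    # halving it gives the one-sided (mixture) p-value.
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


# Maximum two-point LOD scores quoted in the abstract.
peaks = {"5q31": 4.45, "6q12": 3.64, "7q33": 3.09, "8q24": 3.67}
for region, lod in peaks.items():
    print(f"{region}: LOD = {lod:.2f}, approximate pointwise p = {lod_to_pointwise_p(lod):.1e}")
```

Under this approximation, a LOD of 3 corresponds to a pointwise p-value of about 10⁻⁴, which is why genome-wide thresholds for linkage are set at higher LOD values once multiple testing across the genome is taken into account.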
This initial GWAS report on CGN had some interesting findings as well. Compared to the previous GWAS on male sexual orientation on the same dataset (Sanders et al., 2017), the current CGN GWAS had substantially more regions with SNPs associated at a level of 10⁻⁸ < p < 10⁻⁶, including two loci (5q31, 10q23) breaching genome-wide significance (Fig. 3, Supplementary Table 2). Possible explanations include a potentially stronger genetic contribution for CGN (versus sexual orientation) and enhanced statistical power from a quantitative measure of CGN (versus a categorical approach for sexual orientation). A recent large association meta-analysis of same-sex sexual behavior found five genome-wide significant loci (Ganna et al., 2019); however, none of those loci overlap with the top GWLS or GWAS findings for CGN in the current study.
We found two loci (5q31, 10q23) with SNPs reaching genome-wide significance (p < 5 × 10⁻⁸) for association with CGN and detected several additional regions (Fig. 3, Supplementary Table 2) with promising findings (association p-values of 10⁻⁸ < p < 10⁻⁶). These regions contain a number of genes of putative relevance to the trait, some of which we highlight next. At the 5p13 SNP cluster, the nearest gene is SLC1A3, a brain-expressed glutamate transporter which has been implicated in some behavioral phenotypes, e.g., attention deficit hyperactivity disorder, mood disorders, and cortico-limbic connectivity during affective regulation (Huang et al., 2019; Medina et al., 2016; Poletti et al., 2018; van Amen-Hellebrekers et al., 2016). The 10q23 SNP cluster overlaps with GRID1, which encodes a glutamate receptor channel subunit, has also been implicated in various behavioral phenotypes (e.g., mood disorders; Fallin et al., 2005; Zhang et al., 2018), and, when deleted in the mouse, leads to changes in emotional and social behaviors (Yadav et al., 2012). The SNPs in the 7q32 cluster fall within (3′ UTR, synonymous coding) and near LRRC4, which has been implicated in autism spectrum disorders (Du et al., 2020; Um et al., 2018). When the gene was deleted in the mouse (Lrrc4−/−), N-methyl-D-aspartate receptor (NMDAR, an ionotropic glutamate receptor)-dependent synaptic plasticity in the hippocampus was decreased, and these mice displayed mild social interaction deficits, increased self-grooming, and modest anxiety-like behaviors, which were reversed by pharmacological NMDAR activation (Um et al., 2018). Thus, three of the top associated SNP clusters involve glutamate-related genes which have separate evidence of relevance to other behavioral traits, some of which vary in prevalence by gender in the general population (e.g., mood disorder; Sanders et al., 2010, and references therein).
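The two association tiers used in this paragraph can be made explicit with a minimal sketch. The thresholds (5 × 10⁻⁸ for genome-wide significance and 10⁻⁶ as the upper bound of the promising range) are taken from the text; the function name `classify_association`, the labels, and the second and third example p-values are hypothetical and serve only as illustration. Because the two ranges overlap between 10⁻⁸ and 5 × 10⁻⁸, the sketch assumes that genome-wide significance takes precedence.

```python
GENOME_WIDE_P = 5e-8  # genome-wide significance threshold stated in the text
PROMISING_P = 1e-6    # upper bound of the "promising" range stated in the text


def classify_association(p: float) -> str:
    """Assign an association p-value to one of the tiers discussed above."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p-value must lie in (0, 1]")
    if p < GENOME_WIDE_P:
        return "genome-wide significant"
    if p < PROMISING_P:
        return "promising"
    return "not highlighted"


# 1.3e-8 is the minimum association p-value quoted in the abstract;
# the other two values are made up for illustration.
for p in (1.3e-8, 2.0e-7, 1.0e-3):
    print(f"p = {p:.1e}: {classify_association(p)}")
```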
Gene mapping challenges include those inherent to GWLS and GWAS of traits manifesting complex genetics such as CGN, as well as limitations in statistical power given the sample size. We discuss power limitations further in the supplementary text but note here that for traits manifesting complex genetics (such as CGN), contributory genetic variants generally have individually small effects, leading to challenges in generating replicable findings. Other limitations include the current study being on a predominantly European ancestry sample and only on males, using retrospective recall of CGN rather than prospective ratings, and not including a replication sample. Replication and extension efforts are somewhat hampered because survey questions relevant to traits such as CGN are often not included in large biobank samples; however, more sexuality data points are becoming available in some instances (e.g., sexual orientation and gender identity questions in allofus.nih.gov). Additional and larger studies in the future should provide further insight into genetic contributions to CGN and also to its relationship with sexual orientation.