Monday, January 13, 2020

The developing neonatal cortical folding is unique enough to be considered as a “fingerprint” that can reliably identify an individual within a cohort of infants, even monozygotic twins with similar developmental environments

Individual identification and individual variability analysis based on cortical folding features in developing infant singletons and twins. Dingna Duan et al. Human Brain Mapping, January 12 2020.

Abstract: Studying the early dynamic development of cortical folding with remarkable individual variability is critical for understanding normal early brain development and related neurodevelopmental disorders. This study focuses on the fingerprinting capability and the individual variability of cortical folding during early brain development. Specifically, we aim to explore (a) whether the developing neonatal cortical folding is unique enough to be considered as a “fingerprint” that can reliably identify an individual within a cohort of infants; (b) which cortical regions manifest more individual variability and thus contribute more to infant identification; (c) whether infant twins can be distinguished by cortical folding. Hence, for the first time, we conduct infant individual identification and individual variability analysis involving twins based on the developing cortical folding features (mean curvature, average convexity, and sulcal depth) in 472 neonates with 1,141 longitudinal MRI scans. Experimental results show that the infant individual identification achieves 100% accuracy when using the neonatal cortical folding features to predict the identities of 1‐ and 2‐year‐olds. Besides, we observe high identification capability in the high‐order association cortices (i.e., prefrontal, lateral temporal, and inferior parietal regions) and two unimodal cortices (i.e., precentral gyrus and lateral occipital cortex), which largely overlap with the regions encoding remarkable individual variability in cortical folding during the first 2 years. For the twins study, we show that even for monozygotic twins with identical genes and similar developmental environments, their cortical folding features are unique enough for accurate individual identification; and in some high‐order association cortices, the differences between monozygotic twin pairs are significantly lower than those between dizygotic twins. This study thus provides important insights into individual identification and individual variability based on cortical folding during infancy.


4.1 Infant identification based on cortical folding features

The first main contribution of this study is the finding that cortical folding morphology fingerprints the dynamically developing infant brain and is reliable for individual identification during the first postnatal years. Despite the dramatic global change in cortical size, shape and folding features between birth and 2 years of age (Li, Nie, Wang, Shi, Lin, et al., 2013a; Li, Wang, Shi, Lyall et al., 2014; Lyall et al., 2014; Meng et al., 2014), as also shown in Figure 1, we achieved promising accuracies in identifying 1‐ and 2‐year‐old brains from neonatal cortices using the combinations of cortical folding features (Table 2). More importantly, we anticipate that the fingerprinting power of a subject's neonatal brain may carry over across the whole lifespan. Two reasons support this assumption. First, all major cortical folds and individual variability patterns of the human brain are established at term birth (Duan et al., 2018; Hill et al., 2010; Li, Wang, Shi, Lin, & Shen, 2014a). Second, the most dynamic phase of postnatal brain development is the first 2 years of life, and the folding patterns undergo only minor changes during later childhood and adulthood, so the 2‐year‐olds' brains largely resemble adult brains in cortical folding (Gilmore et al., 2007; Li, Nie, Wang, Shi, Lin, et al., 2013a; Li, Nie, Wang, Shi, Gilmore, et al., 2014). Hence, once the 2‐year‐olds can be correctly identified, the possibility of identifying adult brains from their neonatal cortical folding patterns would be very high. However, further investigations are required to validate this assumption using a larger longitudinal dataset covering both developing and adult periods.
Table 2 provides further insights into the infant identification tasks from neonatal cortical folding. First, the combination of three kinds of cortical folding features slightly improves identification accuracies compared to any single feature. Though the improvement is not significant, we prefer to adopt the combination of three features in the proposed individual identification framework for two reasons: (a) the mean curvature, average convexity, and sulcal depth provide complementary morphological information of cortical folding from different aspects, as described at the beginning of Section 2.3; (b) the accuracies are 100% in all tasks, outperforming any single feature and other feature combinations. Second, the identification accuracies in the first two tasks, which use neonatal brains to identify 1‐ and 2‐year‐olds, are lower than those in the third task, which uses 1‐year‐olds to identify 2‐year‐olds. Compared to the first two tasks, the third task is more similar to adult individual identification, owing to the moderate change of cortical folding from Year 1 to Year 2. Thus, these results indirectly validate that infant individual identification involving neonates is much more challenging than adult individual identification. Furthermore, the identification accuracies in the first task (i.e., using the dataset with scans at Year 0 to predict the identities of scans at Year 1) are sometimes lower than those of the second task (i.e., using the dataset with scans at Year 0 to predict the identities of scans at Year 2). This might seem counterintuitive, since the first task should be easier than the second one: the brain changes less during the first year than over the first 2 years. To analyze whether this result is caused by the imbalanced datasets in the first two tasks, we repeated experiments with balanced datasets based on both ROI‐based and global‐based methods, as shown in Table S3.
Here, to obtain balanced Year 1 and Year 2 datasets, we randomly selected 200 subjects from each original dataset, repeating the selection 10 times. Table S3 shows the accuracies averaged over the 10 runs. As we can see, the accuracies in Task 1 are still lower than those in Task 2 in some experiments. Having ruled out dataset imbalance as the cause, we speculate that the different surface quality in the Year 1 and Year 2 datasets might be responsible for this unexpected result in Table 2. Specifically, in the image processing pipeline, the surfaces in the Year 0 dataset are reconstructed from segmentation results obtained from T2‐weighted images, which show better tissue contrast than the T1‐weighted images of neonatal brains, while the surfaces in the Year 1 and Year 2 datasets are reconstructed from T1‐weighted images. Because T1‐weighted images at Year 1 have poorer contrast than those at Year 2, the surface quality in the Year 1 dataset used in the first task is poorer than that in the Year 2 dataset used in the second task, which might lead to the unexpected, slightly lower identification accuracies in the first task.
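The balanced-dataset check described above can be sketched as follows. This is only a minimal illustration, assuming a hypothetical `evaluate` callable that runs the identification task on a list of subject IDs and returns an accuracy; the function name, the seed, and the ID representation are assumptions and not part of the paper's pipeline.

```python
import random
from statistics import mean


def balanced_mean_accuracy(subject_ids, evaluate, n_subjects=200, n_repeats=10, seed=0):
    """Average accuracy over repeated random subsamples of equal size.

    subject_ids: list of subject identifiers in the original (imbalanced) dataset.
    evaluate:    hypothetical callable mapping a list of subject IDs to an accuracy.
    n_subjects:  subsample size (200 in the text).
    n_repeats:   number of random draws (10 in the text).
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    accuracies = []
    for _ in range(n_repeats):
        # Draw n_subjects distinct subjects without replacement.
        subset = rng.sample(subject_ids, n_subjects)
        accuracies.append(evaluate(subset))
    return mean(accuracies)
```

Running the same function on the Year 1 and Year 2 datasets gives two averages computed on equally sized samples, so any remaining gap cannot be attributed to dataset size.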
To handle the case where the subject to be identified has no corresponding scan in the dataset, we set a threshold on the ratio between the frequencies of the first‐ranked and the second‐ranked potential identities. We recorded the ratios during all subjects' identification procedures, and found that the minimum ratios in the first two tasks are 2.0 and 2.2, respectively. The distributions of the ratios are displayed in histograms in Figure 7. Of note, choosing a proper threshold for the ratio is important for the individual identification method. If the threshold is too large, the false positive rate (FPR) would be 0, but the false negative rate (FNR) would be large; if it is too small, the FPR would be large, and the FNR would be 0. In both situations with improper thresholds, the identification accuracies would be low. The accuracies, FPRs and FNRs based on different thresholds in inverse tasks are displayed in Table S1. Here, we set the threshold to 2.0 according to the above minimum ratio in the first two tasks. With this threshold, if a newly arriving scan has no corresponding scan in the early dataset, we would reject it, thus controlling the false discovery rate.
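The rejection rule above can be illustrated with a short sketch. It assumes that identification yields a list of candidate identity "votes" (for example, one per ROI), which is a hypothetical input format; the function name and the choice to accept a ratio exactly equal to the threshold are also assumptions, the latter made so that the minimum observed ratio of 2.0 is still accepted.

```python
from collections import Counter


def identify_with_rejection(votes, threshold=2.0):
    """Return the top-ranked identity, or None if the scan should be rejected.

    votes:     list of candidate identity labels (hypothetical input format).
    threshold: minimum ratio between the frequencies of the first-ranked and
               second-ranked identities (2.0 in the text).
    """
    ranked = Counter(votes).most_common(2)
    if not ranked:
        return None  # nothing to identify
    top_id, top_freq = ranked[0]
    if len(ranked) == 1:
        return top_id  # no runner-up, so the ratio is unbounded
    second_freq = ranked[1][1]
    ratio = top_freq / second_freq
    # Accept only when the winner clearly dominates the runner-up.
    return top_id if ratio >= threshold else None
```

Raising `threshold` rejects more scans (fewer false positives, more false negatives), and lowering it does the opposite, which is the trade-off the paragraph describes.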

4.3 Twins study: Individual identification and cortical variability

The third main contribution of our study is the discovery that both MZ and DZ twins' brains can be correctly identified using the cortical folding features despite similar genetic and environmental influences. Table 5 demonstrates that the cortical folding features are reliable for identifying both infant MZ and DZ twins, and there is no statistically significant difference in difficulty between their identification. Besides, the accuracies are slightly higher than the corresponding identification accuracies in Table 2, but no significant improvement was found. The slight difference might be caused by the largely imbalanced datasets in the tasks of individual and twin identification. Moreover, Figure 5 and Figures S4‐S5 show that the discordance between MZ twin pairs in most cortical regions, especially the high‐order association cortices, is generally lower than that between DZ twin pairs. Figure 6 further validates that in most of these high‐order association cortices, the degree of discordance between MZ twin pairs is significantly lower than that between DZ twin pairs, which is in line with universal biological principles (Kaminsky et al., 2009). In a few regions in Figure 5, the differences between MZ twin pairs are slightly larger than the differences between DZ twin pairs, but no significance was found according to the results of the one‐tailed test. From these statistical tests, we can draw an interesting conclusion: only in some high‐order association cortices are the differences between MZ twin pairs significantly lower than those between DZ twin pairs, while in other ROIs, there is no statistically significant difference between the discordance of MZ and DZ twins.
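As an illustration of what a one‐tailed comparison of MZ and DZ discordance in a single ROI could look like, the sketch below uses a label‐permutation test. This excerpt does not state which one‐tailed test the authors used, so the permutation approach, the function name, and the per‐pair discordance inputs are all assumptions made for illustration.

```python
import random
from statistics import mean


def one_tailed_permutation_p(mz, dz, n_permutations=10000, seed=0):
    """P-value for the alternative hypothesis mean(mz) < mean(dz).

    mz, dz: per-pair discordance values for one ROI (hypothetical inputs).
    """
    rng = random.Random(seed)
    observed = mean(dz) - mean(mz)
    pooled = list(mz) + list(dz)
    n_mz = len(mz)
    hits = 0
    for _ in range(n_permutations):
        # Shuffle group labels and recompute the difference in means.
        rng.shuffle(pooled)
        diff = mean(pooled[n_mz:]) - mean(pooled[:n_mz])
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)
```

A small p‐value supports lower discordance in MZ pairs for that ROI; when MZ discordance is actually larger, the same one‐tailed test returns a p‐value near 1, consistent with the "no significance" outcome described for a few regions.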
It is interesting that MZ twins, who share identical genetic makeup (DNA) from a single fertilized egg (Jain, Prabhakar, & Pankanti, 2002; Patwari & Lee, 2008), present distinctive cortical folding patterns in infancy. Though the underlying reason is still unclear, recent studies found that the traits of genetically identical cells and organisms are not determined entirely by genes, but are influenced by both genetic and environmental factors in a dynamic and complex manner (Jha et al., 2018; Raser & O'Shea, 2005). Specifically, first, the variation in gene expression may contribute to the phenotypic variability (Patwari & Lee, 2008; Raser & O'Shea, 2005) of cortical folding patterns. Second, prenatal environmental factors (Patwari & Lee, 2008), including umbilical cord length, access to nutrition, blood pressure, and position in the womb, also play important roles during the prenatal dynamic development of cortical folding. We can therefore conclude that (a) both genetic and environmental factors could influence the early development of cortical folding morphology; (b) individual identification based on cortical folding is valid and promising, because no two brains are identical, even for MZ twins.

4.4 Additional considerations

Several issues require further consideration, as listed below.

4.4.1 Cortical parcellation choice

From a methodological perspective, the scale and definition of the ROI might influence the patterns of regions with high identification accuracy to some extent. For a cortical region showing high identification accuracy, it might be hard to know which specific part of this region is more critical, given the large sizes of some ROIs in the Desikan‐Killiany parcellation (Desikan et al., 2006). Future work may explore the patterns of identification accuracy with a finer‐scale ROI parcellation to better understand which sub‐regions contribute more or less to individual identification and to better inspect the relation between cortical identification capability and individual variability patterns.

4.4.2 Longitudinal individual cortical folding study across lifespan

It remains unclear to what extent individual cortical folding is consistent across the whole lifespan. This would ideally be explored using a longitudinal brain dataset with follow‐up MR images from birth to adulthood, which does not currently exist. Future studies should collect further follow‐up images and explore both the consistent and the developmental aspects of cortical folding during brain development. Additionally, since our experiments with promising identification accuracies were carried out on healthy subjects, it is still unclear whether specific neurodevelopmental disorders would influence the individual variability and fingerprinting power of cortical folding. However, it is promising, since some studies found that descriptors of brain morphology can be used to effectively identify adult individuals with Alzheimer's Disease (Peper et al., 2007; Wachinger et al., 2015). Further studies would be required to validate this assumption based on datasets including both healthy subjects and subjects with neurodevelopmental disorders. Demonstrating this would constitute a formidable step forward in brain development and maturation research.

4.4.3 Infant identification

Though our proposed method based on cortical folding features achieves promising identification accuracies (100%), we recognize that it is not a realistic way to identify infants at present, since MRI remains a relatively slow and expensive imaging examination. Of note, the main innovation of this study is not the practical application of infant identification but rather the three neuroscientific discoveries we reported. Thus, we briefly review the background of current infant identification methods as follows. To our knowledge, this is the first study to leverage the developing cortical folding as the biometric trait for infant identification. There are a few infant identification studies based on other conventional biometric traits, for example, fingerprint (Jain et al., 2016), footprint (E. Liu, 2017), face (Bharadwaj, Bhatt, Singh, Vatsa, & Singh, 2010), or iris (Corby et al., 2006). Compared to the cortical folding features, these biometric traits are more convenient to acquire. However, their performance is less promising, especially when involving neonates (typically < 90% in accuracy), due to the rapidly changing biometric traits during infancy. Besides, these exterior biometric traits are typically unstable and can easily be altered or imitated on purpose in real applications. In the future, once brain MRI becomes fast, convenient and cheap to acquire, cortical folding could potentially be a reliable biometric trait for infant identification.
