Fragmentomics of plasma mitochondrial and nuclear DNA inform prognosis in COVID-19 patients with critical symptoms
BMC Medical Genomics volume 17, Article number: 243 (2024)
Abstract
Background
The mortality rate of COVID-19 patients with critical symptoms is reported to be 40.5%. Early identification of patients with poor progression in the critical cohort is essential for timely clinical intervention and reduction of mortality. Although older age and chronic diseases have been recognized as risk factors for COVID-19 mortality, we still lack an accurate prediction method for individual patients. This study aimed to examine the cell-free DNA fragmentomics of critically ill patients and to develop promising new biomarkers for identifying patients at high risk of mortality.
Methods
We utilized whole genome sequencing on the plasma cell-free DNA (cfDNA) from 33 COVID-19 patients with critical symptoms, whose outcomes were classified as survival (n = 16) and death (n = 17). Mitochondrial DNA (mtDNA) abundance and fragmentomic properties of cfDNA, including size profiles, ends motif and promoter coverages were interrogated and compared between survival and death groups.
Results
Significantly decreased abundance (~ 76% reduction) and dramatically longer fragment size of cell-free mtDNA were observed in deceased patients. Likewise, the deceased patients exhibited distinct end-motif patterns of cfDNA with an enhanced preference for “CC”-started motifs, which are related to the activity of the nuclease DNASE1L3. Several dysregulated genes involved in COVID-19 progression-related pathways were further inferred from promoter coverages. These informative cfDNA features enabled a high PPV of 83.3% in predicting deceased patients in the critical cohort.
Conclusion
The dysregulated biological processes observed in COVID-19 patients with fatal outcomes may contribute to abnormal release and modifications of plasma cfDNA. Our findings support the feasibility of plasma cfDNA as a promising biomarker for prognosis prediction in critically ill COVID-19 patients in clinical practice.
Introduction
The COVID-19 pandemic is dramatically changing daily lives across the globe. As of September 2024, over 776 million confirmed cases, including over 7 million deaths, have been reported to the World Health Organization (https://covid19.who.int/) [1]. Among COVID-19 patients, those with critical symptoms may develop respiratory failure, viral sepsis, and even multiple organ dysfunction, which are the leading causes of death in COVID-19 [2,3,4,5]. According to Macedo et al.’s investigation of 33 studies involving 13,398 patients diagnosed with COVID-19 in 2020, the mortality rate of critically ill patients was 40.5% [6]. Although older age, chronic diseases, vital signs, and laboratory results, such as D-dimer [7], lactate dehydrogenase (LDH) level [8] and circulating ACE2 activity [9], have been used to predict mortality and disease severity in COVID-19 patients [10,11,12], complex factors and unknown characteristics still impede precise prediction of mortality risk for individual patients.
Plasma cell-free DNA (cfDNA) consists of short DNA molecules that are released from bodily cells and non-randomly fragmented in the blood circulation, reflecting the physiological state of an individual. Recently, certain characteristics of cfDNA in the plasma of COVID-19 patients have been investigated and demonstrated as reliable biomarkers for assessing the clinical status of COVID-19 patients. Andargie et al. and Cheng et al. revealed that elevated cfDNA concentration was associated with higher severity of COVID-19 [13, 14]. In addition, Scozzi et al. reported that the mtDNA level quantified by qPCR was also increased in COVID-19 patients with severe symptoms [15]. In our previous studies, we further investigated characteristics related to cfDNA fragmentation in COVID-19 patients. Through the analysis of cfDNA coverage surrounding the transcription start site (TSS), we revealed a set of genes and pathways significantly associated with COVID-19 severity [16]. By integrating plasma DNA fragmentomic features and clinical laboratory results, we improved the prediction of severity risk compared to the model using clinical indicators only [17].
Unlike previous studies, which explored the cfDNA differences between mild and severe COVID-19 patients, we conducted a more focused study only on critically ill COVID-19 patients whose outcome was either deceased or alive. The hypothesis is that by dissecting comprehensive fragmentomic features of plasma cfDNA, novel cfDNA biomarkers may be discovered for mortality risk prediction. Our findings demonstrated the feasibility of plasma cfDNA as a promising biomarker for prognosis prediction of critically ill COVID-19 patients, which extends the application scenarios of cfDNA fragmentomics in clinical practice.
Materials and methods
Patient sample collection
Thirty-three critical patients and nine non-COVID-19 controls were recruited from Union Hospital, Tongji Medical College, Huazhong University of Science and Technology between March 6 and March 14, 2020. The COVID-19 cases were admitted between January 30 and March 13, 2020. All cases were diagnosed as confirmed critical COVID-19 patients according to the Diagnosis and Treatment Protocol for Novel Coronavirus Pneumonia (Trial Version 7), released by the National Health Commission & State Administration of Traditional Chinese Medicine. Critical cases were defined as meeting any one of the following criteria: (1) respiratory failure requiring mechanical ventilation; (2) shock; (3) other organ failure requiring ICU care. The “alive” standard was defined as follows: two consecutive negative SARS-CoV-2 nucleic acid test results from respiratory specimens (sampling interval of at least 24 h), disappearance of clinical symptoms, and discharge from hospital. All “deceased” patients died during hospitalization.
Clinical data collection
We obtained the clinical data of the patients enrolled in this study from the electronic medical records of the hospital. We collected the patients’ clinical characteristics, including age, sex, clinical outcome, routine blood tests, liver and kidney function, blood glucose and lipids, coagulation profile, and myocardial enzymes. All data were double-checked by senior physicians to ensure the accuracy of the clinical information. The baseline demographics, complications, and laboratory test results during admission are summarized in Table 1.
Sample preparation
For each individual, 3 mL of blood was collected in an EDTA tube and centrifuged at 1600 × g for 10 min at 4 °C. The plasma supernatant was separated within 6 h after blood collection. The plasma was heat-inactivated (56 °C for 30 min) and stored at −80 °C until DNA extraction. Cell-free DNA was extracted from 200 µL of plasma using the MagPure Circulating DNA KF Kit (MD5432-02, Magen) on the MGISP-96XL (MGI) sample preparation system. All operations were performed in a Biosafety Level 2 (BSL-2) laboratory.
Library construction and DNA sequencing
Sequencing libraries were prepared using the MGIEasy Cell-free DNA Library Prep kit (1000012701, MGI) according to the manufacturer’s protocol. In brief, the cfDNA ends were repaired and ‘A’-tailed, and adapters were then ligated to the ‘A’-tailed fragments. After the adapter ligation reaction, 10 cycles of PCR amplification were performed. The PCR products were further treated to generate DNA nanoballs (DNBs). The DNBs were then sequenced on the DNBSEQ platform (MGI) in 100 bp paired-end mode.
Sequencing alignment
Raw sequencing reads were first processed with Fastp [18] to remove low-quality reads (i.e., Phred quality < 5, or “N” bases > 10) and sequencing adapters. The cleaned reads were then aligned to the human reference genome (GRCh38/hg38) with Minimap2 [19] using the default parameters. We kept only read pairs in which both reads aligned to the same chromosome in the proper orientation, and further removed pairs with an insert size of more than 600 bp. The retained reads (median: 401,422,638, range: 106,917,218 – 1,162,515,540) were converted to BED format and used for downstream analysis.
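The pair-level filter described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the `ReadPair` record, its field names and the `filter_pairs` function are hypothetical, alignments are assumed to have been parsed already into 0-based half-open coordinates, and "proper orientation" is assumed to mean inward-facing mates on opposite strands.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple

MAX_INSERT = 600  # bp; upper bound on insert size stated in the text


@dataclass(frozen=True)
class ReadPair:
    """Hypothetical record for one aligned read pair (0-based, half-open)."""
    chrom1: str
    start1: int
    end1: int
    strand1: str  # '+' or '-'
    chrom2: str
    start2: int
    end2: int
    strand2: str


def is_proper(pair: ReadPair) -> bool:
    """Assumed definition: same chromosome, opposite strands, mates facing inward."""
    if pair.chrom1 != pair.chrom2:
        return False
    if {pair.strand1, pair.strand2} != {"+", "-"}:
        return False
    fwd_start = pair.start1 if pair.strand1 == "+" else pair.start2
    rev_end = pair.end1 if pair.strand1 == "-" else pair.end2
    return fwd_start < rev_end


def filter_pairs(pairs: Iterable[ReadPair]) -> Iterator[Tuple[str, int, int]]:
    """Yield BED-like (chrom, start, end) fragments for the retained pairs."""
    for pair in pairs:
        if not is_proper(pair):
            continue
        # Fragment boundaries are the outermost aligned coordinates of the pair.
        start = min(pair.start1, pair.start2)
        end = max(pair.end1, pair.end2)
        if end - start > MAX_INSERT:
            continue
        yield (pair.chrom1, start, end)
```

Each yielded tuple corresponds to one line of a three-column BED file, so the fragment length used in downstream size analyses is simply `end - start`.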
MtDNA analysis
Sequencing reads that properly and uniquely aligned to the mitochondrial genome were classified as cell-free mitochondrial DNA (cfmtDNA) reads. The relative cfmtDNA abundance was determined as the ratio between the number of reads aligned to the mtDNA genome and the number aligned to the human autosomes. The size of each mtDNA molecule was determined from the outermost ending sites of the paired aligned reads. For each cfmtDNA length from 1 to 600 bp, the size frequency (SF) was calculated as the proportion of mtDNA fragments of that length among all mtDNA fragments between 1 and 600 bp. As the peaks of the cfmtDNA size profiles were around 80 bp and 165 bp for the alive and deceased groups, respectively (Fig. 1B), we quantified the cfmtDNA size score (MSS) as the difference between the average SF over the single-base-pair lengths within the 11-bp windows centered on the 165 bp and 80 bp peak positions, as follows:

$$\mathrm{MSS} = \frac{1}{11}\sum_{l=160}^{170} \mathrm{SF}_l \; - \; \frac{1}{11}\sum_{l=75}^{85} \mathrm{SF}_l$$

where $\mathrm{SF}_l$ is the size frequency of cfmtDNA fragments of length $l$ bp.
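As a minimal sketch of the two cfmtDNA metrics defined above (not the in-house script used in the study), the following Python computes the size frequency, the size score and the relative abundance from a list of fragment lengths and two read counts. The function names, the treatment of SF as a fraction rather than a percentage, and the example inputs are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def size_frequency(lengths: Iterable[int], max_len: int = 600) -> Dict[int, float]:
    """Proportion of fragments of each length among fragments of 1..max_len bp."""
    counts = Counter(length for length in lengths if 1 <= length <= max_len)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no fragments within the 1..max_len bp range")
    return {length: counts.get(length, 0) / total for length in range(1, max_len + 1)}


def window_mean(sf: Dict[int, float], center: int, half_width: int = 5) -> float:
    """Average SF over the 11-bp window center-5 .. center+5."""
    window = range(center - half_width, center + half_width + 1)
    return sum(sf[length] for length in window) / len(window)


def mtdna_size_score(lengths: Iterable[int],
                     long_peak: int = 165,
                     short_peak: int = 80) -> float:
    """Mean SF around the 165 bp peak minus mean SF around the 80 bp peak."""
    sf = size_frequency(lengths)
    return window_mean(sf, long_peak) - window_mean(sf, short_peak)


def relative_abundance(mt_reads: int, autosomal_reads: int) -> float:
    """Ratio of reads aligned to mtDNA to reads aligned to the autosomes."""
    return mt_reads / autosomal_reads
```

A sample dominated by fragments near 80 bp gives a negative score, and one dominated by fragments near 165 bp gives a positive score, which is the direction in which the score separates the two groups in Fig. 1C.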
Motif analysis
As defined in a previous study [20], the first four nucleotides at the 5’ end of each strand of plasma DNA molecules are referred to as the “ends motif” in this work. To obtain the motif occurrence, we calculated the frequency of each of the 256 possible motifs in each sample. To gauge the uniformity of the motif distribution, we further applied entropy analysis to the 256 motif frequencies of each sample to obtain the motif diversity score (MDS) [20]. A low MDS reflects a skewed distribution of ends motif frequencies, and a high MDS reflects a uniform distribution.
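The motif diversity score can be illustrated with a short sketch. It assumes the normalized Shannon entropy form, in which the entropy of the 256 motif frequencies is divided by log(256) so that the score lies between 0 and 1; the function names are illustrative, and motifs containing characters other than A, C, G and T are simply skipped, which is an assumption of this sketch.

```python
import math
from collections import Counter
from itertools import product
from typing import Dict, Iterable

# All 256 possible 4-mer motifs over the DNA alphabet.
ALL_MOTIFS = ["".join(bases) for bases in product("ACGT", repeat=4)]
MOTIF_SET = set(ALL_MOTIFS)


def motif_frequencies(end_motifs: Iterable[str]) -> Dict[str, float]:
    """Frequency of each of the 256 motifs among the observed 5' end 4-mers."""
    counts = Counter(m for m in end_motifs if m in MOTIF_SET)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no valid 4-mer motifs supplied")
    return {motif: counts[motif] / total for motif in ALL_MOTIFS}


def motif_diversity_score(freqs: Dict[str, float]) -> float:
    """Shannon entropy of the motif frequencies, normalized by log(256)."""
    entropy = -sum(p * math.log(p) for p in freqs.values() if p > 0)
    return entropy / math.log(len(ALL_MOTIFS))
```

With this normalization, a sample in which every motif is equally frequent scores 1, and a sample containing a single motif scores 0.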
Statistics
Statistical analyses and data visualization in this study were performed using in-house scripts written in Python 3. The Mann–Whitney U-test was used to test for differences between groups (the Benjamini–Hochberg adjustment was applied in Fig. S3). A P-value < 0.05 was considered statistically significant. P-values of less than 0.05, 0.01, and 0.001 are denoted by the symbols *, **, and ***, respectively.
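For readers who wish to reproduce the multiple-testing adjustment and the significance symbols, a minimal pure-Python sketch is given below. It implements the standard Benjamini–Hochberg step-up adjustment; it is not the in-house script used in the study, and the function names are illustrative.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P-value so that the adjusted
    # values stay monotone (step-up procedure).
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def significance_symbol(p: float) -> str:
    """Map a P-value to the *, **, *** notation described in the text."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""
```

For example, `significance_symbol(0.0011)` returns `"**"`, the annotation that corresponds to the MSS comparison reported in the Results.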
Results
Assignment of COVID-19 patients
Overall, a total of 33 critical patients were recruited for this study from March 6 to March 14, 2020, by Union Hospital, Tongji Medical College, Huazhong University of Science and Technology in Wuhan, China. The collected clinical data including demographics, complications and laboratory test indicators were summarized in Tables 1 and S1. The median follow-up after blood sampling for all subjects was 15 days. For those alive and deceased patients, the end of follow-up is the date they were discharged or died. The clinical outcomes in this study were determined according to the disease status at the end of our follow-up, and the criteria of “critical” followed the Diagnosis and Treatment Protocol for Novel Coronavirus Pneumonia (Trial Version 7) [21], issued by National Health Commission & National Administration of Traditional Chinese Medicine, China.
A total of 33 critically ill patients with COVID-19 were classified into an alive group and a deceased group. At the end of our follow-up, the 16 patients characterized as alive had recovered and been discharged, and 17 patients had died from COVID-19 (Fig. 1A). For all patients, blood samples were collected for the plasma cfDNA study, and the sampling timepoints during hospitalization are shown in Fig. 1A and Fig. S1A. To compare the hospitalization duration at the blood sampling timepoint between the alive and deceased groups, we calculated the ratio between this hospitalization duration and the total number of days from admission to the end of our follow-up, denoted as the hospitalization index. There was no significant difference between the alive and deceased groups (P-value = 0.30, Mann–Whitney U-test) in terms of the hospitalization index (Fig. S1B). In other words, the two groups did not differ in treatment duration at the time of sampling, but differed in disease progression and treatment response. To elucidate the potential of cfDNA to monitor disease progression, we focused on multiple characteristics of plasma cfDNA and systematically compared the two groups.
Fig. 1 Timepoint of sampling during hospitalization within follow-up and plasma cfmtDNA features in critical COVID-19 patients. A Sampling timepoint during hospitalization within follow-up. The green line indicates the COVID-19 patients who were finally alive, and the red line indicates those who were deceased in this study. Blue, orange, green, and red dots indicate the relative time points of admission, blood sampling, hospital discharge, and death, respectively. B Size profiles of plasma cell-free mitochondrial DNA (mtDNA) in alive and deceased patients. The blue and red lines indicate the median size profile of the patients in the alive and deceased groups, respectively. The shadow flanking the lines of median size indicates the range of standard deviation. Vertical dashed lines in purple and dark red indicate fragment sizes of 80 bp and 165 bp, respectively. The dark regions flanking the dashed lines represent the regions of 5 bp upstream and downstream of the 80 bp (i.e. 75 bp ~ 85 bp) and 165 bp (i.e. 160 bp ~ 170 bp), respectively. Box plots of (C) mtDNA size score and (D) relative mtDNA abundance in critical COVID-19 patients with different outcomes. E ROC analysis for the discrimination of COVID-19 patients with different outcomes based on plasma mtDNA
Size profiles and relative abundances of plasma mitochondrial DNA distinguish outcomes of critical COVID-19 patients
Size profile is one of the most valuable features of cfDNA fragmentomics [22, 23]. The altered size distribution of cell-free nuclear DNA in COVID-19 patients has previously been reported [14, 24]. On the other hand, the cell-free mtDNA levels quantified by qPCR were also discovered as an early indicator of mortality in COVID-19 patients [15]. Here, we investigated the size distribution of cell-free nuclear and mitochondrial DNA in COVID-19 patients at single-base resolution, through the analysis of high-throughput sequencing of cfDNA.
As shown in Fig. S1C, the size profiles of cell-free nuclear DNA were similar between the alive and deceased groups. However, a notable difference was observed in the size profiles of cell-free mtDNA between these groups. While the cfmtDNA size peak was approximately 80 bp for the alive group, a distinct peak was observed around 165 bp for the deceased group (Fig. 1B). In comparison to the alive group, the profiles of the deceased group were shifted to the right, revealing a dramatic increase in longer mtDNA fragments. To quantify the size distribution pattern, for each sample, we calculated the difference between the average mtDNA proportions for each single base pair length within 160 ~ 170 bp and 75 ~ 85 bp, termed the mtDNA size score (MSS) (see Methods). Of note, a significantly higher MSS (P-value = 0.0011, Mann–Whitney U-test) was observed in the deceased group (Fig. 1C), reflecting a higher frequency of fragments around the 165 bp peak (160 ~ 170 bp) and a lower frequency of fragments around the 80 bp peak (75 ~ 85 bp). In addition, we found a significantly lower relative mtDNA abundance in the deceased group (median: 0.00003, range: 0.000007 – 0.000056) than in the alive group (median: 0.000041, range: 0.000009 – 0.000645, P-value = 0.0348, Student’s t-test) (Fig. 1D). To discriminate these two groups of patients with different outcomes, receiver operating characteristic (ROC) curve analysis was performed based on the MSS and the relative abundance of plasma mtDNA. The area under the curve (AUC) was 0.83 for the MSS and 0.70 for the relative abundance (Fig. 1E), indicating that both features have power to distinguish the outcomes of COVID-19 patients.
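The AUC of a single feature can be read as the probability that a randomly chosen deceased patient has a higher feature value than a randomly chosen alive patient, with ties counted as one half. The sketch below computes the AUC in this pairwise form; it is an illustration under that standard interpretation, not the code used to produce Fig. 1E, and the function name is hypothetical.

```python
from typing import Sequence


def roc_auc(positive_scores: Sequence[float], negative_scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties contribute 0.5. Here 'positive' would be the deceased group and
    'negative' the alive group.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))
```

For a feature that is lower in the deceased group, such as the relative mtDNA abundance, the scores can be negated before calling `roc_auc` so that higher values again indicate the deceased outcome.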
Aberrant plasma DNA ends motif in deceased patients with critical COVID-19
The digestion of cfDNA in plasma has been shown to be a non-random process [25, 26]. The fragmentation signatures created by nucleases in plasma can be captured by profiling the nucleotides at the ends of cfDNA molecules, termed the cfDNA ends motif [20]. To depict the digestion pattern of cfDNA in COVID-19 patients with different prognoses, we studied the 4-mer motifs at the 5’ ends of plasma DNA in 16 patients of the alive group and 17 patients of the deceased group. As shown in Fig. 2A, the 4-mer motif patterns of plasma DNA in COVID-19 patients with alive and deceased outcomes clustered into two groups, suggesting that cfDNA fragmentation was highly distinctive in patients with different prognoses.
Fig. 2 End motifs of plasma DNA in patients with critical COVID-19. A Heatmap analysis of motif frequencies in patients with COVID-19 (alive: n = 16, deceased: n = 17). The frequencies were normalized by calculating z-scores for the rows. B Box plot of motif diversity score (MDS) among alive (n = 16) and deceased (n = 17) patients. C-F Box plot analysis of 4 representative motifs showing differences in terms of frequency between the alive and deceased groups. G ROC curve analysis between alive and deceased groups by using MDS and the occurrence of 4 representative motifs
To quantify the spectrum of ends motifs, we calculated the normalized Shannon entropy of the frequencies across the 256 motifs, which has been defined as the motif diversity score (MDS) [20], in the plasma DNA of the alive and deceased groups. We observed that the MDS in the deceased group (median: 0.94; range: 0.91 – 0.95) was significantly lower than that in the alive group (median: 0.95; range: 0.93 – 0.95) (P-value = 0.0052, Mann–Whitney U-test, Fig. 2B), indicating that the motif distribution was more skewed in deceased patients than in alive patients. As the motif profiles were distinct between alive and deceased patients, we asked whether these motifs carried discriminatory power for patients with different prognoses. To this end, we compared between the two groups the frequencies of four representative ends motifs (i.e., ‘CCCA’, ‘CCTG’, ‘CCAG’, and ‘CCCT’), which were the four most frequent motifs in both groups. As shown in Fig. 2C-F, the frequencies of these motifs were significantly higher in deceased patients than in alive patients. The median frequency of the CCCA motif was 2.90% (range: 1.95% – 4.64%) in the plasma DNA of deceased patients, whereas the median was 2.08% (range: 1.81% – 3.32%) in alive patients (P-value = 0.0020, Mann–Whitney U-test). In addition, compared with alive patients, the CCCT, CCAG and CCTG motifs exhibited significant median increases of 25.44%, 20.81% and 41.12% (P-values = 0.0090, 0.0073 and 0.0081, Mann–Whitney U-test), respectively, in the deceased group. As the abundance of “CC”-started motifs in plasma DNA has been reported to be associated with the activity of the nuclease DNASE1L3 [27], the elevated frequency of such motifs in deceased patients might indicate that the fragmentation pattern induced by DNASE1L3 was enhanced in patients with poor prognosis.
Furthermore, we performed ROC analysis to classify patients with different prognoses during our follow-up using the MDS and the four representative motifs. As shown in Fig. 2G, the motif ‘CCCA’ achieved the highest AUC of 0.82 in the discrimination of alive and deceased patients. In addition, the AUCs for the MDS and the ‘CCTG’, ‘CCAG’ and ‘CCCT’ motifs reached 0.79, 0.77, 0.78, and 0.77, respectively. These results demonstrated that the ends motif signatures carried by plasma DNA might serve as biomarkers in the assessment of clinical outcomes of critical COVID-19 patients.
Inferred dysregulated genes in critical COVID-19 patients with deceased outcome revealed by profiling cfDNA coverage in promoter regions
It has been reported that the cfDNA coverage of the promoter region is correlated with the gene expression level in the human body [28]. We envisioned that the gene expression pattern in deceased patients with critical COVID-19 might differ from that in patients who were alive, and that this difference could be reflected in the cfDNA coverage profile in regions surrounding the transcriptional start site (TSS).
Figure 3A shows our algorithm for measuring and quantifying the coverage pattern of cfDNA molecules around the TSS in COVID-19 patients. The genomic region within 2 kb of the TSS was defined as the TSS region in this study. The cfDNA coverages within this region were normalized by the average depth of the whole genome and were referred to as relative coverage. To establish a baseline TSS coverage profile, we further collected and sequenced plasma DNA samples from 9 healthy individuals. The accumulated deviation of the relative coverage from the baseline in each TSS region, termed the deviation score (DS), was calculated for each COVID-19 patient. Figure 3B shows the distributions of coverage deviations from the baseline in the alive (n = 16) and deceased (n = 17) groups across the TSS region (chr19: 16,861,273 – 16,865,274) of the SIN3B gene. The coverage deviations from baseline in the deceased group were generally larger than those in the alive group across different sites of this TSS region, which suggested that the expression level of the SIN3B gene in alive patients with COVID-19 resembled that of healthy individuals, whereas the expression pattern was greatly altered in deceased patients. To further explore all the other TSS regions with differential cfDNA coverages between the alive and deceased groups, we performed a volcano plot analysis using the deviation scores of the two groups (cutoff: P-value < 0.001, | log2(fold change) | > 0.5) (Fig. 3C). In total, we obtained 1,194 inferred differential TSSs, from which we deduced 963 genes (1,194 TSSs corresponding to 963 unique genes) that were differentially expressed between the alive and deceased groups. Multi-dimensional scaling was performed on the 1,194 differential TSSs for these 33 samples. As shown in Fig. S2A, the alive group tended to cluster together, and the deceased group formed a distinct cluster.
Fig. 3 Inferred differentially expressed genes between COVID-19 patients with different outcomes were revealed by profiling cfDNA coverage in promoter regions. A The schematic illustration of deviation score for the quantification of relative transcriptional start site (TSS) coverage profile. The blue line represents the baseline, which is the mean of the relative coverage within ± 2 kb of TSS of nine healthy subjects. The red line represents the TSS coverage profile of a patient with COVID-19. The yellow-shaded areas represent the deviation between the TSS coverage profile of baseline and COVID-19 patients. Deviation score was calculated by accumulating the deviations within the shadow area. B Distribution of relative coverage deviations between COVID-19 patients and baseline across the TSS ± 2 kb region of the SIN3B gene. The blue and red lines indicate the average of deviation distributions for the alive and deceased groups, respectively. The shaded area flanking the average distributions represents the 95% confidence interval of the deviation values in each group. C The volcano plot for identifying TSSs with differential deviation scores (i.e., P-value < 0.001, | log2(fold change) | > 0.5) between alive and deceased COVID-19 groups. These TSSs are defined as DEG-TSS. D Box plots of the accumulated deviation scores (ADS) of 1,194 DEG-TSSs. E Enrichment analysis on pathway and biological process for 963 genes with DEG-TSSs. The top 20 pathways ranked by P-value are displayed
Additionally, we accumulated the deviation scores of these 1,194 TSSs in the alive and deceased groups, respectively. As shown in Fig. 3D, the accumulated deviation score (ADS) of the deceased group (median: 4.052 × 10^6; range: 9.961 × 10^5 – 7.071 × 10^6) was significantly higher than that of the alive group (median: 1.255 × 10^6; range: 6.981 × 10^5 – 9.899 × 10^6) (P-value = 0.0016, Mann–Whitney U-test). We further performed pathway and functional analysis using Metascape [29] on these deduced differential genes to identify the altered pathways related to the prognoses of critical COVID-19 patients. The top 20 pathways are shown in Fig. 3E.
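The deviation score and its accumulation over the selected TSSs can be sketched as follows. The exact form of the deviation is not spelled out in the text, so this sketch assumes that the absolute difference between the patient's relative coverage and the baseline is summed over all positions in the ± 2 kb TSS region (matching the shaded area in Fig. 3A). All function names and the toy inputs are hypothetical.

```python
from typing import Dict, Iterable, List, Sequence


def relative_coverage(coverage: Sequence[float], genome_mean_depth: float) -> List[float]:
    """Normalize per-position coverage by the genome-wide average depth."""
    return [depth / genome_mean_depth for depth in coverage]


def baseline_profile(control_profiles: Sequence[Sequence[float]]) -> List[float]:
    """Per-position mean of the relative coverage across the healthy controls."""
    n_controls = len(control_profiles)
    return [sum(column) / n_controls for column in zip(*control_profiles)]


def deviation_score(profile: Sequence[float], baseline: Sequence[float]) -> float:
    """Assumed form: sum of absolute per-position deviations from the baseline."""
    return sum(abs(p - b) for p, b in zip(profile, baseline))


def accumulated_deviation_score(ds_by_tss: Dict[str, float],
                                selected_tss: Iterable[str]) -> float:
    """Sum of one patient's deviation scores over the selected (differential) TSSs."""
    return sum(ds_by_tss[tss] for tss in selected_tss)
```

Under this assumed definition, a patient whose promoter coverage closely tracks the healthy baseline receives a score near zero, and larger scores indicate a stronger departure from the baseline profile.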
CfDNA characteristics discriminate COVID-19 patients with different outcomes
As significant differences in MSS, ADS, and MDS were observed between the alive and deceased groups, we asked whether combining these characteristics would yield better prognostic classification than any single feature. To this end, we investigated the classification performance of the MDS-ADS, MDS-MSS, MSS-ADS, and MSS-MDS-ADS combinations based on ROC analysis. As shown in Fig. 4A, the combination of MSS and MDS achieved an AUC of 0.83, equivalent to that of MSS alone (Fig. S2B-D). To further assess the discrimination potential of these indicators, we determined the threshold for each feature at a specificity above 75% (i.e., MSS > 0.29; ADS > 0.65 × 10^8; MDS < 0.9455) (Fig. 4B-E). A subject was classified as a deceased patient only when all of the combined features reached their thresholds. As summarized in Fig. 4F, we found that the combination of MSS with either MDS or ADS could achieve performance as good as that of the combination without MSS (i.e., the combination of ADS with MDS). By using the MSS combinations, 15 out of 17 deceased patients were accurately predicted, achieving a sensitivity of 88%. Meanwhile, 13 out of 16 alive patients were accurately predicted as negative subjects, achieving a specificity of 81%. In addition, the positive predictive value (PPV) of these combinations reached 83.3% (Fig. 4F). These results further indicated that the fragmentation patterns of cell-free nuclear DNA (cfnDNA) and cfmtDNA in the plasma of COVID-19 patients were highly correlated with disease status and could be used to predict clinical prognoses.
Fig. 4 Discrimination of COVID-19 patients with different outcomes by using two- and three-dimensional features in cfDNA. A ROC plot for different combinations of features on the classification of alive and deceased patients. Scatter plots of two- and three-dimensional cfDNA characteristics for all alive or deceased patients with COVID-19, which are (B) accumulated deviation scores (ADS) and motif diversity score (MDS), (C) mitochondrial DNA size score (MSS) and MDS, (D) MSS and ADS, and (E) MSS, ADS and MDS. F Summary table of the performance for discrimination of COVID-19 patients with alive and deceased outcomes based on different combinations of cfDNA features
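The thresholded combination rule and the reported performance figures can be checked with a short sketch. The thresholds are those quoted in this section; the rule that every selected feature must pass its threshold follows the text, whereas the function names and the dictionary-based interface are assumptions made for illustration. The false-positive count of 3 is derived from the 16 alive patients minus the 13 correctly predicted as negative.

```python
from typing import Dict, Sequence

# Thresholds quoted in the text; ADS and MSS flag high values, MDS flags low values.
RULES = {
    "MSS": lambda value: value > 0.29,
    "ADS": lambda value: value > 0.65e8,
    "MDS": lambda value: value < 0.9455,
}


def predicts_deceased(features: Dict[str, float],
                      use: Sequence[str] = ("MSS", "MDS")) -> bool:
    """Classify as deceased only if every selected feature passes its threshold."""
    return all(RULES[name](features[name]) for name in use)


def performance(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Sensitivity, specificity and positive predictive value from the counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


# Counts reported for the MSS combinations: 15 of 17 deceased and 13 of 16 alive
# patients were correctly classified, leaving 2 false negatives and 3 false positives.
metrics = performance(tp=15, fn=2, tn=13, fp=3)
```

With these counts the sensitivity is 15/17 ≈ 0.882, the specificity is 13/16 ≈ 0.813 and the PPV is 15/18 ≈ 0.833, in agreement with the 88%, 81% and 83.3% reported above.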
Discussion
In this study, we investigated multiple properties of plasma cfDNA in COVID-19 patients with critical symptoms. We uncovered that the size profile of plasma mtDNA, ends motif spectrum and promoter coverage profile of plasma nuclear DNA were significantly changed in patients that were deceased at the end of the follow-up compared with those who were alive. We also demonstrated the clinical potential of these cfDNA features in the determination of critical COVID-19 patients with poor prognoses.
Cell-free mtDNA was known to be released during cellular clearance or repair processes, particularly autography, apoptosis, necrosis, and NETosis [30,31,32]. The previous study, based on the quantitative PCR method, has shown that mtDNA levels of deceased patients with COVID-19 were significantly higher than that in the survival group [15]. However, in our study, the deceased group of COVID-19 patients tend to have a significantly lower relative mtDNA abundance than those alive patients. One possibility for this discrepancy is that deceased patients experienced more intense massive cell death and a consequent overall increase in nuclear cfDNA, which may have diluted the relative abundance of mtDNA in the sequenced DNA. Additionally, it has been revealed that plasma mtDNA could exist in either linear or circular form [33]. The quantification of plasma mtDNA in previous COVID-19 studies was based on qPCR [15], which directly measured the total mtDNA, including both linear and circular mtDNA. Whereas the relative mtDNA abundance in our study was assessed in sequencing data and contained only linear mtDNA information. Moreover, previous study has indicated that cfDNA released from hematopoietic cells is significantly increased in COVID-19 patients compared to healthy subjects and those with influenza, suggesting a higher degree of hematopoietic cell death during COVID-19 infection [13]. Interestingly, Ma et al. [33] reported that the mtDNA derived from hematopoietic cells primarily consisted of circular mtDNA, which cannot be sequenced in our protocol but can be quantified by qPCR. This protocol difference and the existence of different forms of mtDNA may potentially result in the inconsistencies observed in the abundance of mtDNA between our study and others.
Notably, we observed increased size of cell-free mtDNA in deceased samples compared with alive samples, although the size profile of nuclear DNA fragments in two groups was similar (Fig. S1C). The alive samples with a peak size of around 80 bp coincide with previous studies on healthy individuals [33]. In contrast, the deceased group showed a higher abundance of longer mtDNA fragments, with a discernible peak around 165 bp. Previously, Li et al. uncovered that the mtDNA in plasma extracellular vesicles (EVs) carried a larger median fragment size than cfmtDNA in plasma (159 bp vs. 109 bp) [34]. Therefore, the mtDNA in deceased patients is more likely to be released from the EVs in plasma, while the mtDNA in alive patients is directly released from the mitochondria. Two additional studies also found that EV nuclear DNA exhibited larger fragment sizes than plasma cfDNA, suggesting that DNA released via EVs may be more protected and subject to different degradation mechanisms.
Ends motif of plasma DNA is an emerging cfDNA fragmentomic feature that reflects the nuclease digestion signatures on plasma DNA. Jiang et al. revealed that plasma DNA released from liver, hematopoietic cells, placenta and tumor carried distinct end-motif patterns related to its tissue-of-origin [20]. This variation in end motifs could be attributed to the influence of nucleases in tissue, [20] with DNASE1L3 playing a key role in cfDNA digestion. Notably, cfDNA digested by DNASE1L3 shows a preferential occurrence of C-ends [27]. In our study, we explored the end-motif profile of plasma DNA in COVID-19, which extends our understanding of cfDNA fragmentation characteristics in patients with infectious diseases. Declined MDS, enriched “CC” stared motifs and distinct end-motif distribution in deceased patients suggest that the tissue injury and the nuclease activity in the plasma of patients with the poor outcome may be altered or enhanced. Andargie et al. have shown that, compared with the critical COVID-19 patients that were survived, those deceased patients exhibited more intensive damage to lung, liver and heart tissues [13]. Therefore, it is expected that in the plasma of deceased patients, there will be more cfDNA released from these damaged organs. Hence, we analyzed the expression levels obtained from the Genotype-Tissue Expression (GTEx) portal for DNASE1L3, a nuclease that preferentially created “CC” started motifs in plasma, among these damaged tissues and whole blood cells which represent the predominant source of cfDNA in plasma of healthy individuals. 
We observed that the log2(TPM) of DNASE1L3 were significant higher in kidney medulla (median: 29.09, range: 11.32 – 58.56), kidney cortex (median: 14.63, range: 4.47 – 82.36), liver (median: 21.16, range: 0.83 – 126.5), lung (median: 6.94, range: 0.61 – 120.9), heart atrial appendage (median: 5.39, range: 0.028 – 44.62) and heart left ventricle (median: 0.55, range: 0.018 – 18.02) than it in the whole blood cells (median: 0.32, range: 0.0085 – 21.3) with Mann–Whitney U-test (Adjusted P-values are 0.0034, < 0.0001, < 0.0001, < 0.0001, < 0.0001 and < 0.0001, respectively) (Fig. S3). Hence, we speculate that the increased frequencies of “CC” started motifs in the plasma of deceased patients were due to the increased cfDNA released from the aggravatedly damaged organs. In other word, the altered motif profiles in plasma of COVID-19 patients allowed us to trace the tissue damage in patients with poor prognoses. Previously, Korabecna et al. found that cfDNA plays a crucial role in immune system regulation [35]. Its presence boosts genes related to immune homeostasis, while its clearance triggers an innate immune response. The varied mtDNA sizes and cfDNA end motifs suggest diverse tissue origins and cfDNA release. These distinct turnover patterns of cfDNA in COVID-19 patients with different outcomes may involve immune system regulations that impact their final clinical results.
By profiling the cfDNA coverage around transcription start sites (TSSs) in surviving and deceased patients, we inferred a number of dysregulated genes in deceased patients. Most of the pathways associated with these genes have been reported to be involved in COVID-19. For example, several COVID-19 studies have identified a notable presence of ‘Herpes simplex virus 1 (HSV-1) infection’ among critically ill patients [36,37,38]. Moreover, Vigón et al. observed a twofold increase in anti-HSV-1 IgG levels among patients with severe COVID-19 compared to those with mild cases [39]. Our findings, in conjunction with those of previous studies, indicate that HSV-1 coinfection may be a significant contributing factor to the unfavorable prognosis observed in patients with SARS-CoV-2 infection. However, we did not conduct HSV-1 antigen testing when we collected the samples from COVID-19 patients, and it would be interesting to incorporate this consideration in future studies. In addition, ‘nucleus organization’, ‘ribosome assembly’, ‘DNA metabolic process’, ‘chromatin organization’, ‘mitochondrion organization’ and ‘RNA biosynthetic process’ are cell cycle-related pathways. It has been revealed that SARS-CoV-2 infection could lead to cell cycle arrest [40]. We conjecture that surviving patients may have more vigorous cell division and metabolism, with newly formed cells replacing dead cells to maintain normal physiological function. In contrast, in deceased patients, the rate of cell division cannot keep up with the rate of cell death, and tissues are filled with inflammatory infiltrates or necrotic cells, which could lead to organ dysfunction or failure and, eventually, death. For the ‘metabolism of RNA’ pathway, Wang et al. reported that RNA metabolism was down-regulated in A549 cells infected with SARS-CoV-2 [41]. ‘Interleukin-12 signaling’ also plays an important role in the immune response during COVID-19 [42, 43].
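As a rough illustration of how promoter coverage can be summarized for expression inference, the sketch below computes the mean cfDNA depth in a window around a TSS relative to the mean depth of its flanks. This is a simplified sketch rather than the method used in this study: the window (-150 to +50 bp), the flank size (1 kb on each side), the function name and the input format are illustrative assumptions, and strand orientation is ignored. In the approach of Ulz et al. [28], lower coverage at the promoter is taken to indicate nucleosome depletion and hence active transcription.

```python
def relative_tss_coverage(depth, tss, window=(-150, 50), flank=1000):
    """Mean depth near the TSS divided by mean depth in the flanks.

    depth  : list of per-base read depths along one chromosome segment
             (hypothetical input format).
    tss    : 0-based index of the transcription start site within `depth`.
    window : offsets (inclusive) of the core window relative to the TSS;
             an illustrative choice.
    flank  : distance on each side of the TSS that bounds the flanks.

    Strand is ignored here; for minus-strand genes the window offsets
    would need to be mirrored.
    """
    start, end = tss + window[0], tss + window[1]
    left, right = tss - flank, tss + flank
    if left < 0 or right >= len(depth):
        raise ValueError("depth array does not cover the requested region")
    if not (left < start <= end < right):
        raise ValueError("core window must lie strictly inside the flanks")
    core = depth[start:end + 1]
    flanks = depth[left:start] + depth[end + 1:right + 1]
    flank_mean = sum(flanks) / len(flanks)
    if flank_mean == 0:
        raise ValueError("no coverage in flanking regions")
    return (sum(core) / len(core)) / flank_mean


# Synthetic depth profile for illustration only: uniform depth of 10 with a
# dip to 5 across the core window of a TSS located at index 1000.
toy_depth = [10] * 2001
for pos in range(850, 1051):
    toy_depth[pos] = 5
print(relative_tss_coverage(toy_depth, tss=1000))
```

Comparing such relative coverage values per gene between outcome groups is one way to nominate candidate dysregulated genes, which can then be passed to pathway enrichment analysis.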
A major limitation of our study is the small sample size. Because critically ill individuals are rare, accounting for only 5% of all COVID-19 patients [44], we included only 16 surviving and 17 deceased patients in our analysis. Owing to this small sample size, some known risk factors, including age and BMI, were not discussed. A larger cohort collected from multiple centers is needed to further validate and extend our findings. Second, our blood samples were collected during the hospitalization of the patients. Further research on the associations between cfDNA characteristics at the time of admission and clinical outcomes would be of great value for enhancing the early prediction of COVID-19 prognosis. Finally, since the COVID-19 patients in this study were recruited in 2020, it is worth exploring the potential impact of different SARS-CoV-2 variants on cfDNA characteristics in future studies.
Conclusions
We dissected the fragmentomic properties of cfDNA in COVID-19 patients with different prognoses and identified the altered pathways in patients with poor outcomes during follow-up. These informative cfDNA features may serve as prospective biomarkers for early prognosis prediction and disease status monitoring in COVID-19 patients, particularly in those who were critically ill. These results not only identified the clinical value of cfDNA as a reliable non-invasive marker for prognosis prediction but also provided important insights into the mechanisms underlying the generation and release of cfDNA in critically ill COVID-19 patients with poor outcomes.
Availability of data and materials
The raw sequence data reported in this paper are available in the database of Genome Sequence Archive (GSA) in National Genomics Data Center (NGDC), China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences (https://ngdc.cncb.ac.cn/gsa-human/) under accession number HRA008603. Any additional information or data required to reanalyze the results reported in this paper is available from the Lead Contact upon request (Xin Jin, jinxin@genomics.cn).
References
WHO COVID-19 dashboard [https://covid19.who.int/] Accessed on 29 September 2024.
von Stillfried S, Bülow RD, Röhrig R, Boor P, Böcker J, Schmidt J, et al. First report from the German COVID-19 autopsy registry. Lancet Reg Heal - Eur. 2022;15:100330. https://doi.org/10.1016/j.lanepe.2022.100330.
Bian XW, Yao XH, Ping YF, Yu S, Shi Y, Luo T, et al. Autopsy of COVID-19 patients in China. Nat Sci Rev. 2020;7:1414–8.
Chen N, Zhou M, Dong X, Qu J, Gong F, Han Y, et al. Epidemiological and clinical characteristics of 99 cases of 2019 novel Coronavirus pneumonia in Wuhan, China: a descriptive study. Lancet. 2020;395:507. https://doi.org/10.1016/S0140-6736(20)30211-7.
Huang C, Wang Y, Li X, Ren L, Zhao J, Hu Y, et al. Clinical features of patients infected with 2019 novel Coronavirus in Wuhan China. Lancet. 2020;395:497. https://doi.org/10.1016/S0140-6736(20)30183-5.
Macedo A, Gonçalves N, Febra C. COVID-19 fatality rates in hospitalized patients: systematic review and meta-analysis. Ann Epidemiol. 2021;57:14.
Long H, Nie L, Xiang X, Li H, Zhang X, Fu X, et al. D-Dimer and prothrombin time are the significant indicators of severe COVID-19 and poor prognosis. Biomed Res Int. 2020;2020:6159720. https://doi.org/10.1155/2020/6159720.
Li C, Ye J, Chen Q, Hu W, Wang L, Fan Y, et al. Elevated Lactate Dehydrogenase level as an independent risk factor for the severity and mortality of COVID-19. Aging (Albany NY). 2020;12:15670. https://doi.org/10.18632/AGING.103770.
Fagyas M, Fejes Z, Sütő R, Nagy Z, Székely B, Pócsi M, et al. Circulating ACE2 activity predicts mortality and disease severity in hospitalized COVID-19 patients. Int J Infect Dis. 2022. https://doi.org/10.1016/j.ijid.2021.11.028.
Liang W, Yao J, Chen A, Lv Q, Zanin M, Liu J, et al. Early triage of critically ill COVID-19 patients using deep learning. Nat Commun. 2020;11:3543. https://doi.org/10.1038/s41467-020-17280-8.
Wynants L, Van Calster B, Collins GS, Riley RD, Heinze G, Schuit E, et al. Prediction models for diagnosis and prognosis of covid-19: Systematic review and critical appraisal. BMJ. 2020;369:m1328. https://doi.org/10.1136/bmj.m1328.
Chen R, Liang W, Jiang M, Guan W, Zhan C, Wang T, et al. Risk factors of fatal outcome in hospitalized subjects with Coronavirus disease 2019 from a nationwide analysis in China. Chest. 2020;158:97. https://doi.org/10.1016/j.chest.2020.04.010.
Andargie TE, Tsuji N, Seifuddin F, Jang MK, Yuen PST, Kong H, et al. Cell-free DNA maps COVID-19 tissue injury and risk of death and can cause tissue injury. JCI Insight. 2021;6:e147610. https://doi.org/10.1172/jci.insight.147610.
Cheng AP, Cheng MP, Gu W, Sesing Lenz J, Hsu E, Schurr E, et al. Cell-free DNA tissues of origin by methylation profiling reveals significant cell, tissue, and organ-specific injury related to COVID-19 severity. Med. 2021;2:411-422.e5.
Scozzi D, Cano M, Ma L, Zhou D, Zhu JH, O’Halloran JA, et al. Circulating mitochondrial DNA is an early indicator of severe illness and mortality from COVID-19. JCI Insight. 2021;6:1–16.
Chen X, Wu T, Li L, Lin Y, Ma Z, Xu J, et al. Transcriptional start site coverage analysis in plasma cell-free DNA reveals disease severity and tissue specificity of COVID-19 patients. Front Genet. 2021;12:663098. https://doi.org/10.3389/fgene.2021.663098.
Bai Y, Zheng F, Zhang T, Luo Q, Luo Y, Zhou R, et al. Integrating plasma cell-free DNA with clinical laboratory results enhances the prediction of critically ill patients with COVID-19 at hospital admission. Clin Transl Med. 2022;12:1–6.
Chen S, Zhou Y, Chen Y, Gu J. Fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34:i884.
Li H. Minimap2: Pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34:3094. https://doi.org/10.1093/bioinformatics/bty191.
Jiang P, Sun K, Peng W, Cheng SH, Ni M, Yeung PC, et al. Plasma DNA end-motif profiling as a fragmentomic marker in cancer, pregnancy, and transplantation. Cancer Discov. 2020;10:664–73.
Wei P-F. Diagnosis and treatment protocol for novel coronavirus pneumonia (Trial version 7). Chin Med J. 2020;133:1087–95.
Lo YMD, Han DSC, Jiang P, Chiu RWK. Epigenetics, fragmentomics, and topology of cell-free DNA in liquid biopsies. Science. 2021;372:eaaw3616.
Sanchez C, Roch B, Mazard T, Blache P, Al Amir Dache Z, Pastor B, et al. Circulating nuclear DNA structural features, origins, and complete size profile revealed by fragmentomics. JCI Insight. 2021;6:e144561. https://doi.org/10.1172/jci.insight.144561.
Jin X, Wang Y, Xu J, Li Y, Cheng F, Luo Y, et al. Plasma cell-free DNA promise disease monitoring and tissue injury assessment of COVID-19. medRxiv. 2021;298:823.
Sun K, Jiang P, Wong AIC, Cheng YKY, Cheng SH, Zhang H, et al. Size-tagged preferred ends in maternal plasma DNA shed light on the production mechanism and show utility in noninvasive prenatal testing. Proc Natl Acad Sci U S A. 2018;115:E5106. https://doi.org/10.1073/pnas.1804134115.
Jiang P, Sun K, Tong YK, Cheng SH, Cheng THTT, Heung MMSS, et al. Preferred end coordinates and somatic variants as signatures of circulating tumor DNA associated with hepatocellular carcinoma. Proc Natl Acad Sci U S A. 2018;115:E10925–33.
Han DSC, Ni M, Chan RWY, Chan VWH, Lui KO, Chiu RWK, et al. The biology of cell-free DNA fragmentation and the roles of DNASE1, DNASE1L3, and DFFB. Am J Hum Genet. 2020;106:202. https://doi.org/10.1016/j.ajhg.2020.01.008.
Ulz P, Thallinger GG, Auer M, Graf R, Kashofer K, Jahn SW, et al. Inferring expressed genes by whole-genome sequencing of plasma DNA. Nat Genet. 2016;48:1273–8.
Zhou Y, Zhou B, Pache L, Chang M, Khodabakhshi AH, Tanaseichuk O, et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat Commun. 2019;10:1523.
Thierry AR, Roch B. Neutrophil extracellular traps and by-products play a key role in COVID-19: pathogenesis, risk factors, and therapy. J Clin Med. 2020;9:2942. https://doi.org/10.3390/jcm9092942.
Barnes BJ, Adrover JM, Baxter-Stoltzfus A, Borczuk A, Cools-Lartigue J, Crawford JM, et al. Targeting potential drivers of COVID-19: Neutrophil extracellular traps. J Exp Med. 2020;217:e20200652.
Zuo Y, Yalavarthi S, Shi H, Gockman K, Zuo M, Madison JA, et al. Neutrophil extracellular traps in COVID-19. JCI Insight. 2020;5:e138999. https://doi.org/10.1172/jci.insight.138999.
Ma MJL, Zhang H, Jiang P, Sin STK, Lam WKJ, Cheng SH, et al. Topologic analysis of plasma mitochondrial DNA reveals the coexistence of both linear and circular molecules. Clin Chem. 2019;65:1161–70.
Li Y, Guo X, Guo S, Wang Y, Chen L, Liu Y, et al. Next generation sequencing-based analysis of mitochondrial DNA characteristics in plasma extracellular vesicles of patients with hepatocellular carcinoma. Oncol Lett. 2020;20:2820. https://doi.org/10.3892/ol.2020.11831.
Korabecna M, Zinkova A, Brynychova I, Chylikova B, Prikryl P, Sedova L, et al. Cell-free DNA in plasma as an essential immune system regulator. Sci Rep. 2020;10:17478. https://doi.org/10.1038/s41598-020-74288-2.
Franceschini E, Cozzi-Lepri A, Santoro A, Bacca E, Lancellotti G, Menozzi M, et al. Herpes simplex virus re-activation in patients with SARS-CoV-2 pneumonia: A prospective, observational study. Microorganisms. 2021;9:1896. https://doi.org/10.3390/microorganisms9091896.
Meyer A, Buetti N, Houhou-Fidouh N, Patrier J, Abdel-Nabey M, Jaquet P, et al. HSV-1 reactivation is associated with an increased risk of mortality and pneumonia in critically ill COVID-19 patients. Crit Care. 2021;25:417. https://doi.org/10.1186/s13054-021-03843-8.
Shanshal M, Ahmed HS. COVID-19 and herpes simplex virus infection: a cross-sectional study. Cureus. 2021;13:e18022. https://doi.org/10.7759/cureus.18022.
Vigón L, García-Pérez J, Rodríguez-Mora S, Torres M, Mateos E, Castillo de la Osa M, et al. Impaired antibody-dependent cellular cytotoxicity in a Spanish cohort of patients with COVID-19 admitted to the ICU. Front Immunol. 2021;12:742631. https://doi.org/10.3389/fimmu.2021.742631.
Bouhaddou M, Memon D, Meyer B, White KM, Rezelj VV, Correa Marrero M, et al. The Global phosphorylation landscape of SARS-CoV-2 infection. Cell. 2020;182:685. https://doi.org/10.1016/j.cell.2020.06.034.
Wang JY, Zhang W, Roehrl MW, Roehrl VB, Roehrl MH. An autoantigen profile of human A549 lung cells reveals viral and host etiologic molecular attributes of autoimmunity in COVID-19. J Autoimmun. 2021;120:102644. https://doi.org/10.1016/j.jaut.2021.102644.
Young BE, Ong SWX, Ng LFP, Anderson DE, Chia WN, Chia PY, et al. Viral dynamics and immune correlates of coronavirus disease 2019 (COVID-19) severity. Clin Infect Dis. 2021;73:e2932. https://doi.org/10.1093/cid/ciaa1280.
Liu Y, Chen D, Hou J, Li H, Cao D, Guo M, et al. An inter-correlated cytokine network identified at the center of cytokine storm predicted COVID-19 prognosis. Cytokine. 2021;138:155365. https://doi.org/10.1016/j.cyto.2020.155365.
Wu Z, McGoogan JM. Characteristics of and important lessons from the coronavirus disease 2019 (COVID-19) outbreak in China: summary of a report of 72 314 cases from the chinese center for disease control and prevention. JAMA. 2020;323:1239–42.
Acknowledgements
We would like to thank the subjects who donated their blood. We also thank the Core Facilities of BGI-Shenzhen and China National GeneBank (CNGB) for their technical support.
Funding
This study was supported by the National Natural Science Foundation of China (32171441 and 32000398), National Key Research and Development Program of China (2023YFC2605400), the Innovative Major Emergency Project Funding against the COVID-19 in Hubei Province (2020FCA041), and the Innovative Major Emergency Project Funding against the COVID-19, HUST (2020kfyXGYJ039).
Author information
Authors and Affiliations
Contributions
H.Z., F.C. and X.J. designed the research. F.Z., R.X., Y.C. and Y.J. collected the blood samples. Y. Luo, Y. Lin, R.O., Y.W., W.Z. and Y.T. performed DNA extraction and library construction. L.L., H.Z. and Y.B. analyzed the sequencing data. J.X. and R.Q. analyzed clinical data. H.Z., L.L., Y. Luo, F.Z., Y.Z. and R.X. wrote the paper. F.C. and X.J. revised the paper. All authors read and approved the final manuscript.
Corresponding authors
Ethics declarations
Ethics approval and consent to participate
Plasma samples were collected by the Union Hospital of Tongji Medical College of Huazhong University of Science and Technology. This study was approved by the Medical Ethics Committee of Union Hospital of Tongji Medical College of Huazhong University of Science and Technology, and the Institutional Review Board of BGI. Written informed consent was obtained from all participants.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zhang, H., Li, L., Luo, Y. et al. Fragmentomics of plasma mitochondrial and nuclear DNA inform prognosis in COVID-19 patients with critical symptoms. BMC Med Genomics 17, 243 (2024). https://doi.org/10.1186/s12920-024-02022-2
DOI: https://doi.org/10.1186/s12920-024-02022-2