Carcinogenesis, Teratogenesis & Mutagenesis, 2023, Vol. 35, Issue (5): 374-381. doi: 10.3969/j.issn.1004-616x.2023.05.008


Bioinformatics analysis of key gene expression in esophageal cancer and its clinical significance

ZHAO Liran1,2, BU Liang1   

  1. Department of Thoracic Surgery, Xiang'an Hospital of Xiamen University, Xiamen 361100;
    2. School of Medicine, Xiamen University, Xiamen 361100, Fujian, China
  • Received: 2023-06-07 Revised: 2023-09-19 Published: 2023-10-13

Abstract: OBJECTIVE: To investigate the expression and clinical significance of aberrantly expressed genes in esophageal cancer (ESCA) by mining the Gene Expression Omnibus (GEO) database with bioinformatics methods. METHODS: The ESCA microarray datasets GSE38129 and GSE20347 were downloaded from the GEO database using the GEOquery package in R. After batch effects were removed from the merged dataset with the sva package, differentially expressed genes (DEGs) were screened using the limma package, and Gene Ontology (GO) functional enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were performed on the DEGs using the clusterProfiler package. Protein-protein interaction (PPI) network analysis of the DEGs was performed on the STRING website, and core modules and core genes (hub genes) were extracted using the MCODE and CytoHubba plugins. The hub genes were entered into the University of Alabama at Birmingham Cancer Database (UALCAN) to analyze the relationship between their expression levels and esophageal cancer stage, methylation level, TP53 mutation, and other factors. Finally, the core genes were verified using the microarray dataset GSE70409. RESULTS: A total of 390 DEGs were screened in the normalized merged dataset: 166 up-regulated and 224 down-regulated. GO analysis showed that they were mainly involved in biological processes such as mitotic cell cycle phase transition, extracellular matrix production, and epidermal development. KEGG enrichment analysis showed that the DEGs were associated with signaling pathways such as cell cycle, extracellular matrix-receptor interaction, and amoebiasis. The PPI network was imported into Cytoscape, and a total of 20 key genes were screened.
The key genes were entered into the UALCAN database, and three genes (CDK1, TOP2A, AURKA) were identified whose mRNA expression levels were significantly higher in esophageal cancer than in normal tissues (P<0.05). Expression of these three genes increased with clinical stage, with TP53 mutation, and with methylation of the gene's promoter, and the differences were all statistically significant (P<0.05). Expression of these three genes was significantly higher in male patients than in female patients (P<0.05), and significantly higher in patients aged 41-60 than in other age groups (P<0.05). Kaplan-Meier survival curve analysis in the Gene Expression Profiling Interactive Analysis (GEPIA) database showed that overall survival was shorter in ESCA patients with high CDK1 expression than in those with low expression (P=0.036), and shorter in patients with high AURKA expression than in those with low expression (P=0.033). Thus, high expression of CDK1 and AURKA was associated with poorer prognosis in esophageal cancer. CONCLUSION: Using a combination of bioinformatics techniques, three core genes were identified whose expression was higher in esophageal cancer than in normal tissues. Expression of CDK1, TOP2A, and AURKA was positively correlated with tumor stage, the methylation level of the gene promoters, and TP53 mutation status. Among them, CDK1 and AURKA have the potential to serve as new molecular markers for the clinical diagnosis and prognosis of esophageal cancer.
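
To make the survival comparison more concrete, the following is a minimal sketch of the Kaplan-Meier (product-limit) estimator behind curves of the kind reported from GEPIA. It is written in Python for illustration only; the study itself used R packages and the GEPIA web tool. The function name `kaplan_meier` and all follow-up times and event flags are hypothetical toy values, not data from the study, and the sketch does not reproduce GEPIA's analysis or the reported P values.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up duration for each patient
    events : 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival probability) at each observed death time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)          # everyone leaving the risk set at time t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t
        at_risk -= exits[t]
    return curve


# Hypothetical toy data (NOT from the study): months of follow-up and event flags
high_times, high_events = [2, 3, 3, 5, 8], [1, 1, 0, 1, 1]
low_times, low_events = [4, 6, 9, 10, 12], [1, 0, 1, 0, 0]

high_curve = kaplan_meier(high_times, high_events)
low_curve = kaplan_meier(low_times, low_events)

print("high expression:", high_curve)
print("low expression: ", low_curve)
```

In this toy example the high-expression curve drops faster than the low-expression curve, which is the qualitative pattern the abstract describes for CDK1 and AURKA. Deciding whether such a difference is statistically significant requires a separate test on the two groups, which is not shown here.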

Key words: esophageal cancer, bioinformatics, GEO, differentially expressed genes
