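I came across a data-mining paper titled "Computational analysis for identification of early diagnostic biomarkers and prognostic biomarkers of liver cancer based on GEO and TCGA databases and studies on pathways and biological functions affecting the survival time of liver cancer". It contains a lot of errors, so let's pick them apart one at a time.

The first error, which we covered earlier, is that the paper gets the TCGA abbreviation for liver cancer wrong: the correct one is LIHC (Liver Hepatocellular Carcinoma). For details, see the earlier post "The TCGA database has no such cancer".

The design of this paper is easy to follow. It runs a separate differential expression analysis on each of two datasets: GSE25097 datasets were firstly obtained and compared with TCGA LICA datasets and an analysis of the overlapping differentially expressed genes (DEGs) was conducted.

Then, for each of the two analyses, the up-regulated and down-regulated genes are pooled into a single list of differentially expressed genes, and the two lists are intersected: 790 DEGs and 2162 DEGs were obtained respectively from the GSE25097 and TCGA 102 Common DEGs were identified b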