Identification of coding and long noncoding RNAs differentially expressed in tumors and preferentially expressed in healthy tissues
The Cancer Genome Atlas (TCGA) and Genotype-Tissue Expression (GTEx) datasets allow unprecedented gene expression analyses. Here, using these datasets, we performed pan-cancer and pan-tissue identification of coding and long noncoding RNA (lncRNA) transcripts differentially expressed in tumors and preferentially expressed in healthy tissues and/or tumors. Pan-cancer comparison of mRNAs and lncRNAs showed that lncRNAs were deregulated in a more tumor-specific manner. Given that lncRNAs are more tissue-specific than mRNAs, we identified the healthy tissues that preferentially express lncRNAs upregulated in tumors and found that testis, brain, the digestive tract, and blood/spleen were the most prevalent. Specific tumors also upregulated lncRNAs preferentially expressed in other tissues, generating a unique signature for each tumor type. Most tumors studied downregulated lncRNAs preferentially expressed in their tissue of origin, probably as a result of dedifferentiation. However, the same lncRNAs could be upregulated in other tumors, resulting in "bimorphic" transcripts. In hepatocellular carcinoma (HCC), the upregulated genes identified were expressed at higher levels in patients with worse prognosis. Some lncRNAs upregulated in HCC and preferentially expressed in healthy testis or brain were predicted to function as oncogenes and were significantly associated with higher tumor burden and poor prognosis, suggesting their relevance in hepatocarcinogenesis and/or tumor evolution. Taken together, these results indicate that therapies targeting oncogenic lncRNAs should take into account the healthy tissues in which those lncRNAs are preferentially expressed, both to predict and reduce unwanted secondary effects and to increase potency.

Significance: This comprehensive analysis of coding and noncoding genes expressed in different tumors and normal tissues should be taken into account to predict side effects of potential coding and noncoding gene-targeting therapies.
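The notion of a "bimorphic" transcript, one upregulated in some tumors and downregulated in others, can be illustrated with a minimal sketch. It is not the method used in this study. The function name, the toy transcript names, the log2 fold-change values, and the cutoff of 1.0 are all assumptions made for illustration, and a real analysis would also require a statistical test for differential expression.

```python
from typing import Dict

def classify_transcripts(
    log2fc: Dict[str, Dict[str, float]],
    threshold: float = 1.0,  # hypothetical cutoff, not taken from the study
) -> Dict[str, str]:
    """Label each transcript by its direction of change across tumor types.

    log2fc maps transcript -> {tumor type -> log2 fold change (tumor vs. healthy)}.
    """
    labels = {}
    for transcript, per_tumor in log2fc.items():
        up = any(fc >= threshold for fc in per_tumor.values())
        down = any(fc <= -threshold for fc in per_tumor.values())
        if up and down:
            labels[transcript] = "bimorphic"  # up in one tumor, down in another
        elif up:
            labels[transcript] = "upregulated"
        elif down:
            labels[transcript] = "downregulated"
        else:
            labels[transcript] = "unchanged"
    return labels

# Toy, made-up values keyed by TCGA-style tumor codes; transcript names are hypothetical.
toy_log2fc = {
    "lncA": {"LIHC": 2.3, "KIRC": -1.8},
    "lncB": {"LIHC": 1.5, "KIRC": 0.2},
    "lncC": {"LIHC": -2.0, "KIRC": -1.1},
    "lncD": {"LIHC": 0.1, "KIRC": -0.3},
}

if __name__ == "__main__":
    for name, label in classify_transcripts(toy_log2fc).items():
        print(name, label)
```

In this toy input, only the transcript that crosses the cutoff in both directions is labeled bimorphic. The per-tumor signatures described above would come from applying the same kind of comparison to every tumor type.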