Abstract:
Global hypomethylation of L1 has been found in cancer cells. Moreover, in some cancers the presence of L1 within a gene is significantly associated with down-regulation of the host gene. Nonetheless, not all genes that contain L1 are down-regulated. To identify L1 characteristics that mediate gene expression in cancers, we performed a chi-square test and logistic regression on each variable individually, and used decision trees and classification association rule mining for multivariate analysis. The statistical tests identified L1 characteristics, most notably the number of L1 elements, that were individually associated with gene expression at a significance level of α = 0.05. For data mining, the decision tree was too large to be useful. Association rule mining, however, generated interesting rules, and each cancer dataset yielded its own characteristic rules. First, the rules derived from the bladder and liver cancer datasets support the hypothesis that L1 transcription may control down-regulation. Both sets of rules suggest a mechanism that promotes L1 transcription, but through different L1 characteristics: a number of L1 > 2 and a conserved SRY Site1, respectively. Second, the rules derived from prostate cancer reflect L1 retrotranspositional activity (conserved ORF1 and/or ORF2), which includes L1 transcription, RNA stability and processing, translation, DNA restriction, reverse transcription, and insertion. Finally, conserved TF-nkx-2.5 may control down-regulation in head and neck cancer. In addition, the rules derived from the dataset emulating lung cancer by 5-AZA show that both sense and antisense L1 can probably control gene expression, through either direction of L1 transcription.
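
As an illustration of the two kinds of analysis named above, the following minimal sketch computes a Pearson chi-square statistic for a single binary L1 characteristic and the support and confidence of one class association rule. It is not the authors' pipeline: the counts, the feature name `l1_count_gt2`, the label `down_regulated`, and the helper functions are all hypothetical and chosen only to make the arithmetic visible.

```python
from typing import Dict, List, Tuple

# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CHI2_CRITICAL_DF1 = 3.841


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / denom


def rule_metrics(
    records: List[Dict[str, bool]],
    antecedent: Dict[str, bool],
    consequent: Tuple[str, bool],
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule `antecedent -> consequent`."""
    covered = [r for r in records if all(r.get(k) == v for k, v in antecedent.items())]
    if not records or not covered:
        raise ValueError("rule antecedent matches no records")
    hits = [r for r in covered if r.get(consequent[0]) == consequent[1]]
    return len(hits) / len(records), len(hits) / len(covered)


# HYPOTHETICAL gene counts (not taken from the study):
#                      down-regulated   not down-regulated
# number of L1 > 2           30                 10
# number of L1 <= 2          20                 40
records = (
    [{"l1_count_gt2": True, "down_regulated": True}] * 30
    + [{"l1_count_gt2": True, "down_regulated": False}] * 10
    + [{"l1_count_gt2": False, "down_regulated": True}] * 20
    + [{"l1_count_gt2": False, "down_regulated": False}] * 40
)

chi2 = chi_square_2x2(30, 10, 20, 40)
support, confidence = rule_metrics(
    records, {"l1_count_gt2": True}, ("down_regulated", True)
)

print(f"chi-square = {chi2:.2f}, significant at 0.05: {chi2 > CHI2_CRITICAL_DF1}")
print(f"support = {support:.2f}, confidence = {confidence:.2f}")
```

With these made-up counts the statistic exceeds the 1-degree-of-freedom critical value, and the rule "number of L1 > 2 implies down-regulated" covers 30% of genes with 75% confidence. A real analysis would involve many characteristics per gene and a rule miner that enumerates candidate antecedents rather than testing a single hand-picked one.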