Abstract:
DNA microarray technology makes it possible to monitor and measure the expression levels of a large number of genes in tissue samples simultaneously. In microarray datasets, the number of samples is much smaller than the number of genes. Classifying such data leads to the well-known problems of the “curse of dimensionality” and overfitting. For a successful disease diagnosis, it is necessary to select a small number of discriminating genes that are relevant for classification. Gene selection in microarray data analysis not only increases classification accuracy but also decreases processing time in the clinical setting. It is therefore important to determine a minimal subset of genes in order to develop a successful disease diagnostic system. In this thesis, two approaches are proposed for selecting highly discriminating genes in cancer classification, based on a hybrid of nature-inspired optimization algorithms and different classifiers.
In the first proposed approach, the Black Hole Algorithm (BHA) is used, for the first time, to solve a feature selection (FS) problem. A new binary version of BHA, called BBHA, is obtained by applying the hyperbolic tangent function and is used to solve FS problems in text, image, and biomedical data. Two classifiers (RF and NB) serve as the evaluators of the proposed algorithm. Experimental results show that the BBHA wrapper-based feature selection method outperforms BPSO, GA, SA, and CFS on all evaluation criteria. BBHA also performs significantly better than BPSO and GA in terms of CPU time, the number of parameters needed to configure the model, and the number of selected features. In addition, BBHA performs competitively with, or better than, other methods in the literature.
In the second proposed approach, we improve the performance of Binary Particle Swarm Optimization (BPSO) and help it avoid being trapped in a local optimum by applying BBHA as its local optimizer. Experimental results and statistical analysis on four clinical datasets demonstrate that the proposed method yields very small subsets of informative genes while achieving significantly better classification performance than other approaches such as firefly, ant colony, bat search, genetic algorithm, harmony search, Fast Correlation-Based Filter (FCBF), and Correlation-based Feature Subset Selection (CFS). The results also confirm that using BBHA as the local optimizer significantly improves on the performance of BPSO alone.
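The role of the hyperbolic tangent function in producing a binary search can be illustrated with a minimal Python sketch. It assumes that a candidate solution (a "star") is a list of 0/1 flags marking selected genes, that the continuous BHA step pulls each star toward the best solution found so far (the "black hole"), and that |tanh| of that step is read as the probability of setting a bit to 1. The names `tanh_transfer` and `move_star` are hypothetical, and the exact update rule used in the thesis may differ.

```python
import math
import random


def tanh_transfer(step: float) -> float:
    """Map a real-valued step to a value in [0, 1) using |tanh|."""
    return abs(math.tanh(step))


def move_star(star, black_hole, rng):
    """Return a new binary star pulled toward the black hole (best solution).

    Assumption: the bit-wise rule below is an illustrative binarization,
    not necessarily the rule defined in the thesis.
    """
    new_star = []
    for bit, best_bit in zip(star, black_hole):
        # Continuous BHA-style move: x + rand * (x_BH - x)
        step = bit + rng.random() * (best_bit - bit)
        # Binarize: set the bit to 1 with probability |tanh(step)|
        new_star.append(1 if tanh_transfer(step) > rng.random() else 0)
    return new_star


rng = random.Random(0)
candidate = move_star([1, 0, 1, 0], [1, 1, 0, 0], rng)
print(candidate)  # a list of four 0/1 flags
```

In a wrapper setting, each resulting bit vector would be scored by training a classifier on the selected genes only; that evaluation step is omitted here.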
Several studies have examined miRNA expression datasets in prostate cancer recurrence, but their results vary from study to study. Integrating the individual studies increases statistical power and allows more reliable conclusions and new biological insights to be drawn. In this thesis, we conducted a meta-analysis of six available miRNA expression datasets for prostate cancer recurrence after radical prostatectomy and identified a potentially significant list of differentially expressed microRNA genes. We performed gene ontology enrichment, KEGG analysis, and common pathway analysis to identify the molecular pathways in which the identified microRNA genes participate and to reveal new directions for drug treatment of recurrent prostate cancer.
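The gain in statistical power from integrating studies can be illustrated with a small worked example. The abstract does not state which meta-analysis method was used, so the sketch below substitutes Fisher's method for combining independent p-values purely as an illustration; the function name `fisher_combined_p` and the input values are hypothetical and are not results from the thesis.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis. For even degrees
    of freedom the survival function has the closed form
    exp(-X/2) * sum_{j<k} (X/2)^j / j!.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    tail_sum = sum(half_stat ** j / math.factorial(j) for j in range(k))
    return math.exp(-half_stat) * tail_sum


# Hypothetical example: six studies, each only borderline on its own.
per_study = [0.06] * 6
print(fisher_combined_p(per_study))  # far smaller than any single p-value
```

No individual study in this made-up example reaches the conventional 0.05 threshold, yet the combined evidence does, which is the sense in which pooling studies increases power.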
MiR-145, an important tumor suppressor microRNA, has been shown to be downregulated in many cancer types and plays crucial roles in tumor initiation, progression, metastasis, invasion, recurrence, and chemo-radioresistance. In this thesis, through a meta-analysis of eight GEO datasets, we investigated potential common target genes of miR-145 to help elucidate the molecular pathways of tumor pathogenesis associated with those genes.