PhD Thesis

Student : Devrim Önder

Advisor : Assoc. Prof. Dr. Bilge Karaçalı

Quantitative Analysis of Tissue Composition in Digitized Histology Slides using Automated Texture Classification in Health and Disease Conditions

Recent developments in histology have increased the importance of storing and processing tissue slides digitally. The development of fully automated image analysis systems that scan large collections of digitized histology slides and segment normal and abnormal tissue profiles has also become an active research topic.

Cancer, which can readily be identified by abnormal tissue profiles, is one of the most widely studied subjects in this area. In general, heterogeneous cancerous regions can be identified within otherwise homogeneously distributed tissue profiles.

In the thesis, automated methods for segmenting heterogeneous tissue profiles in histology slides will be studied. To segment heterogeneous texture profiles, multi-dimensional feature vectors will be calculated using several texture descriptors described in the literature. The feature vector space corresponding to many tissue images will serve as the input to the subsequent texture segmentation algorithms.
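As a minimal sketch of how a texture feature vector might be computed for one image patch, the example below uses a gray-level co-occurrence matrix (GLCM). The abstract does not name the texture descriptors to be used, so the choice of GLCM, the three statistics, and the names `glcm` and `texture_features` are illustrative assumptions.

```python
from collections import Counter

def glcm(image, dx=1, dy=0):
    """Normalized co-occurrence probabilities of gray-level pairs at offset (dx, dy)."""
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

def texture_features(image):
    """Feature vector [contrast, energy, homogeneity] from one horizontal GLCM."""
    p = glcm(image)
    contrast = sum(v * (i - j) ** 2 for (i, j), v in p.items())
    energy = sum(v * v for v in p.values())
    homogeneity = sum(v / (1 + abs(i - j)) for (i, j), v in p.items())
    return [contrast, energy, homogeneity]

# Toy 3x3 patches with hypothetical gray levels (assumption, not thesis data).
flat_patch = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
striped_patch = [[0, 3, 0], [3, 0, 3], [0, 3, 0]]

flat_vec = texture_features(flat_patch)        # [0.0, 1.0, 1.0]
striped_vec = texture_features(striped_patch)  # [9.0, 0.5, 0.25]
```

A uniform patch gives zero contrast and maximal energy, while the alternating patch gives high contrast, so even this small vector separates the two textures. In practice several descriptors would be concatenated into one longer vector per patch.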

To classify normal and abnormal tissue regions, the Quasi-supervised Statistical Learning (QSL) algorithm will be used. The advantage of this algorithm is that it does not require manual segmentation of normal and abnormal regions in the learning phase. The only information required is whether normal and abnormal profiles are present in each histology image. Normal and abnormal texture regions in the images will be segmented using this presence information.
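The idea of learning from image-level presence information alone can be illustrated with a rough Monte Carlo sketch. It assumes two groups of feature vectors: a control group taken from images known to contain only normal tissue, and a mixed group taken from images known to contain some abnormal tissue. Each mixed sample is scored by how often its nearest neighbor in a random reference set comes from the mixed group. This is a simplified illustration of the principle, not the QSL algorithm as published; the function names, the reference-set size, the number of trials, and the toy data are all assumptions.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mixed_set_scores(control, mixed, ref_size=3, trials=500, seed=0):
    """For each mixed sample, the fraction of random reference sets in which
    its nearest neighbor belongs to the mixed group rather than the control group."""
    rng = random.Random(seed)
    scores = []
    for idx, query in enumerate(mixed):
        others = mixed[:idx] + mixed[idx + 1:]  # leave the query itself out
        votes = 0
        for _ in range(trials):
            # Random reference set: ref_size samples per group, tagged 0 or 1.
            ref = [(c, 0) for c in rng.sample(control, ref_size)]
            ref += [(m, 1) for m in rng.sample(others, ref_size)]
            nearest = min(ref, key=lambda item: sq_dist(item[0], query))
            votes += nearest[1]
        scores.append(votes / trials)
    return scores

# Hypothetical 2-D texture features (assumption, not thesis data).
control = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2), (0.15, 0.15), (0.25, 0.05)]
mixed = [(0.1, 0.1), (0.2, 0.2), (0.05, 0.25),   # normal-looking samples
         (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]     # abnormal-looking samples

scores = mixed_set_scores(control, mixed)
flagged = [m for m, s in zip(mixed, scores) if s > 0.5]  # 0.5 cut-off is an assumption
```

No sample is labeled individually here. Samples that also resemble the control group receive low scores, while samples that occur only in the mixed group receive high scores and are flagged as candidate abnormal regions.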

Furthermore, abnormal tissue regions will be subdivided according to their texture profiles using multi-class QSL. The number of abnormal classes will be determined later in consultation with pathologists.

A feature selection algorithm will be applied to select the texture descriptors that are most successful in segmenting the normal/abnormal and inter-abnormal texture regions, in order to achieve an efficient texture segmentation framework.
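The abstract does not specify which feature selection algorithm will be used. As one possible illustration, the sketch below ranks features by a Fisher-style score (squared difference of class means divided by the sum of class variances). The score, the function names, and the toy values are assumptions, and the sketch presumes that provisional normal/abnormal assignments are already available, for example from an earlier segmentation pass.

```python
from statistics import mean, pvariance

def fisher_score(values_a, values_b):
    """Squared gap between class means divided by the summed class variances."""
    gap = (mean(values_a) - mean(values_b)) ** 2
    spread = pvariance(values_a) + pvariance(values_b)
    if spread == 0:
        return float("inf") if gap > 0 else 0.0
    return gap / spread

def rank_features(normal, abnormal):
    """Return feature indices sorted from most to least discriminative, plus scores."""
    n_features = len(normal[0])
    scores = [
        fisher_score([v[k] for v in normal], [v[k] for v in abnormal])
        for k in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda k: scores[k], reverse=True)
    return order, scores

# Hypothetical feature vectors with three texture features each (assumption).
normal = [[1.0, 5.0, 0.2], [1.2, 5.1, 0.9], [0.8, 4.9, 0.5]]
abnormal = [[3.0, 5.0, 0.4], [3.2, 5.2, 0.8], [2.8, 4.8, 0.6]]

order, scores = rank_features(normal, abnormal)
selected = order[:1]  # keep only the top-ranked feature in this toy case
```

In this toy data the first feature separates the two groups cleanly, the second has nearly identical means in both groups, and the third overlaps heavily, so the ranking is `[0, 2, 1]`.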

Student : Tunca Doğan

Advisor : Assoc. Prof. Dr. Bilge Karaçalı

Vector Space Methods in the Computational Analysis of Gene and Protein Sequence Data

The aim of the study is to classify gene and protein sequences with respect to their sequence properties, and thereby to infer the functions of uncharacterized human genes and proteins, by developing a method that represents these sequences in high-dimensional vector spaces in which statistical learning algorithms can be applied. The “isometric feature mapping” (ISOMAP) algorithm will be used to embed gene and protein sequences in high-dimensional vector spaces, and various classification methods, such as nearest-neighbor, maximum likelihood, and support vector machines, will be applied to these vector representations. With this procedure, the study will attempt to obtain clusters that are meaningful with respect to functional similarity.
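The sketch below illustrates the first two stages of ISOMAP on toy sequences: building a k-nearest-neighbor graph from pairwise distances and computing shortest-path (geodesic) distances over that graph. The final ISOMAP stage, a classical multidimensional scaling embedding of the geodesic distances, is omitted for brevity, and nearest-neighbor labeling is applied directly to the geodesic distances instead. The use of edit distance as the sequence dissimilarity, the toy sequences, and the family labels are assumptions, not details from the abstract.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def geodesic_distances(seqs, k=2):
    """Shortest-path distances over a symmetric k-nearest-neighbor graph."""
    n = len(seqs)
    d = [[edit_distance(a, b) for b in seqs] for a in seqs]
    inf = float("inf")
    g = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    for i in range(n):
        neighbours = sorted((j for j in range(n) if j != i), key=lambda j: d[i][j])[:k]
        for j in neighbours:
            g[i][j] = g[j][i] = d[i][j]
    for m in range(n):  # Floyd-Warshall relaxation
        for i in range(n):
            for j in range(n):
                if g[i][m] + g[m][j] < g[i][j]:
                    g[i][j] = g[i][m] + g[m][j]
    return g

def nearest_neighbour_label(g, labels, query):
    """Label of the closest labeled sequence, measured by geodesic distance."""
    candidates = [j for j, lab in enumerate(labels) if lab is not None and j != query]
    return labels[min(candidates, key=lambda j: g[query][j])]

# Hypothetical sequences; None marks a sequence of unknown function.
seqs = ["AAAA", "AAAT", "AATT", "GGGG", "GGGC", "GGCC"]
labels = ["family_A", "family_A", None, "family_G", "family_G", None]

g = geodesic_distances(seqs, k=2)
predicted = {i: nearest_neighbour_label(g, labels, i)
             for i, lab in enumerate(labels) if lab is None}
```

Sequences in different connected components of the neighborhood graph end up at infinite geodesic distance, so each unlabeled sequence takes its label from within its own component. A full pipeline would embed the geodesic distances into coordinates before applying classifiers such as support vector machines.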