Institutional Repository
Technical University of Crete

Experimental comparison of machine learning approaches to medical domains: a case study of genotype influence on oral cancer development

Zervakis Michalis, Alessandro Passaro, V. Stalbovskaya, Flavio Baronti, Anna Maria Rossi, M. Blazantonakis, D. De Rossi, Valentina Maggini, M. Marcucci, Antonina Starita, R. Gonçalvez, Alessio Micheli


URI: http://purl.tuc.gr/dl/dias/013C067A-BA28-40AC-9073-7FBC97707FA7
Year: 2005
Type of Item: Conference Full Paper
Bibliographic Citation: F. Baronti, F. Colla, V. Maggini, A. Micheli, A. Passaro, A. M. Rossi, A. Starita, V. Bevilacqua, S. Cambò, L. Cariello, G. Mastronardi, E.M. Biganzoli, P. Boracchi, F.A. Cardillo, A. Starita, D. Caramella, A. Cilotti, F. Odoguardi, S.T. ChandraShekar, G.L. Varanasi, D. D’Alimonte, D. Lowe, I.T. Nabney, M. Sivaraksa, F. Ferreira, P. Maló, E. Ifeachor, R. Gonçalvez, S.R.I. Gabran, E.F. El-Saadany, M.M.A. Salama, C. Iacconi, A. Cilotti, C. Marini, M. Moretti, D. Mazzotta, F. Odoguardi, F.A. Cardillo, A. S

Summary

Research in medical domains faces new challenges as the available information grows in quantity and quality. In this context, Machine Learning methodologies can provide suitable tools for data analysis that cope with recurring problems in medical research, such as the integration of clinical and genetic data. In this study we provide an experimental comparison of a heterogeneous subset of Machine Learning methods. For this purpose we chose a dataset representative of medical analysis, concerning Head and Neck Squamous Cell Carcinoma (HNSCC). HNSCC is a kind of oral cancer associated with smoking and alcohol drinking habits; however, the individual risk could be modified by genetic polymorphisms of enzymes involved in the metabolism of tobacco carcinogens and in DNA repair mechanisms. To study this relationship, the dataset comprises demographic and lifestyle data (age, gender, smoking and alcohol) and genetic data (the individual genotype of 11 polymorphic genes) for 124 HNSCC patients and 231 healthy controls. We analyze the strengths and weaknesses of the different algorithms when applied to medical datasets such as this one, with particular attention to the issue of missing values.
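
As a rough illustration of the figures quoted in the summary, the following Python sketch computes the accuracy that a trivial majority-class predictor would reach on 124 patients and 231 controls, which is a useful reference point when comparing classifiers on an unbalanced case-control set. It also shows one common way to keep missing genotypes as an explicit category. The function names, the genotype labels and the "missing" encoding are assumptions made for this example; the summary does not state how the study itself handled missing values.

```python
# Class counts as stated in the summary; everything else is illustrative.
N_CASES = 124      # HNSCC patients
N_CONTROLS = 231   # healthy controls


def majority_baseline(n_cases: int, n_controls: int) -> float:
    """Accuracy of a classifier that always predicts the larger class."""
    total = n_cases + n_controls
    return max(n_cases, n_controls) / total


def encode_genotype(value):
    """Return the genotype as a category, keeping 'missing' as its own level.

    Hypothetical encoding: not taken from the study.
    """
    return "missing" if value is None or value == "" else str(value)


if __name__ == "__main__":
    baseline = majority_baseline(N_CASES, N_CONTROLS)
    print(f"subjects: {N_CASES + N_CONTROLS}")
    print(f"majority-class baseline accuracy: {baseline:.3f}")
    # Made-up genotype labels, used only to exercise the encoder.
    print([encode_genotype(g) for g in ["wt/wt", None, "wt/mut", ""]])
```

Any method in such a comparison would need to exceed this baseline to show that it extracts signal from the lifestyle and genetic variables rather than from the class imbalance alone.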
