The Ministry of Health and Welfare recently released the ten leading causes of death in Taiwan for 2016. Malignant neoplasms (cancer) ranked first, followed by heart disease and pneumonia. Cervical cancer is also the fourth leading cause of cancer death among women worldwide, after breast, colorectal, and lung cancer (Department of Statistics, Ministry of Health and Welfare, 2017).
As the health care system has developed, people pay increasing attention to their own health and hope that information technology can help build medical knowledge and, in turn, yield clinical guidance for a range of diseases. Medical diagnosis used to rely mostly on physicians' accumulated experience, but the factors behind disease are now more varied. This study therefore applies current information technology to predict whether a patient has cervical cancer and to identify associations among cervical cancer screening tests.
We trained and tested on data from the UCI Machine Learning Repository and built cervical cancer prediction models with three data mining techniques: neural networks, decision trees, and k-nearest neighbors. We also applied association rule mining to examine the relationships among four screening tests, with the aim of helping physicians identify the tests that are actually effective and reduce unnecessary use of medical resources.
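To make the association analysis among screening tests concrete, the following is a minimal sketch of how support, confidence, and lift can be computed for single-antecedent rules. It is not the implementation used in this study: the records are toy data invented for illustration, and the test names (`test_a` to `test_d`), the helper functions, and the thresholds are all hypothetical.

```python
from itertools import permutations

# Toy records, invented for illustration only (NOT the UCI data).
# Each record is the set of hypothetical screening tests that came back positive.
records = [
    {"test_a", "test_b", "test_d"},
    {"test_a", "test_b"},
    {"test_b", "test_c", "test_d"},
    {"test_a", "test_b", "test_d"},
    set(),
    {"test_c"},
    {"test_a", "test_b", "test_c", "test_d"},
    set(),
]


def support(itemset, records):
    """Fraction of records that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= r for r in records) / len(records)


def pairwise_rules(records, min_support=0.25, min_confidence=0.7):
    """Return rules X -> Y (one test on each side) that pass both thresholds.

    The thresholds are arbitrary illustrative values, not those of the study.
    """
    items = sorted(set().union(*records))
    rules = []
    for x, y in permutations(items, 2):
        s_xy = support({x, y}, records)
        s_x = support({x}, records)
        if s_x == 0 or s_xy < min_support:
            continue
        confidence = s_xy / s_x               # estimate of P(Y positive | X positive)
        if confidence >= min_confidence:
            lift = confidence / support({y}, records)  # > 1 means positive association
            rules.append((x, y, s_xy, confidence, lift))
    return rules


rules = pairwise_rules(records)
for x, y, s, c, l in rules:
    print(f"{x} -> {y}: support={s:.2f}, confidence={c:.2f}, lift={l:.2f}")
```

A rule with high confidence and lift above 1 suggests that a positive result on one test tends to accompany a positive result on the other, which is the kind of relationship that could indicate overlapping tests. A full analysis would typically use an algorithm such as Apriori to cover larger itemsets as well.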