Privacy focused classification of prostate cancer using federated learning

Salma, Syeda Umme; Sakib, MD. Sadman; Alvee, Mohammed Moinul Morshed; Yasaar, Nahiyan

Date

2022-01

Publisher

Brac University

Abstract

The prostate is a small gland located in the lower abdomen of men. Prostate cancer occurs when a tumor, an abnormal and malignant growth of cells, forms in the prostate. It is typically a slow-growing cancer that often goes undetected until it has reached an advanced stage. The majority of men with prostate cancer are unaware that they have it, and many die of other causes before they are ever diagnosed. However, prostate cancer becomes dangerous when it grows rapidly or spreads beyond the prostate. Early detection and personalized care significantly increase the survival rate. Deep learning can play a significant role here: work in medical imaging has shown that computer-aided diagnosis helps radiologists make more precise diagnoses while also reducing diagnostic time and cost. However, prostate cancer data is difficult to collect and can be used only in a restricted manner, because patients are often unwilling to share it and hospitals keep their patients’ records confidential. Our research aimed to address these challenges, which led us to develop a system that classifies prostate cancer while keeping the data confidential by using a decentralized method called federated learning, unlike current approaches. In this research, we classified prostate cancer using a simple CNN, Xception, and VGG19 under both the traditional (centralized) and federated learning approaches for comparative analysis. VGG19 outperformed the other two models in both approaches, with a centralized classification accuracy of 95.51% and a decentralized classification accuracy of 83.76%. Most importantly, in our system an instance of the server-side model is distributed to different clients, so that each client can independently train its copy on its local dataset in its own environment. The updated weights of the trained models are then returned to the server, where the weights from all participating clients are aggregated to update the server-side model without the server ever accessing the confidential medical data, ensuring privacy-focused classification.
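
The training loop described in the abstract (the server sends out its model, each client trains on local data, and the server aggregates the returned weights) can be illustrated with a minimal sketch. This is not the thesis implementation. It assumes sample-weighted federated averaging as the aggregation rule, which the abstract does not specify, and it substitutes a tiny logistic-regression model and synthetic data for the CNN models and medical images. All function names here are hypothetical.

```python
import math
import random

def predict(weights, x):
    """Logistic-regression probability; the last weight is the bias."""
    z = sum(w * xi for w, xi in zip(weights, x)) + weights[-1]
    return 1.0 / (1.0 + math.exp(-z))

def local_train(global_weights, data, epochs=5, lr=0.1):
    """Client side: start from the server's weights, train on local data only."""
    w = list(global_weights)  # copy, so the server's model is not mutated
    for _ in range(epochs):
        for x, y in data:
            err = predict(w, x) - y
            for i, xi in enumerate(x):
                w[i] -= lr * err * xi
            w[-1] -= lr * err
    # Only the weights and the sample count leave the client, never the data.
    return w, len(data)

def aggregate(client_results):
    """Server side: average client weights, weighted by local sample count
    (assumed aggregation rule; the abstract does not name one)."""
    total = sum(n for _, n in client_results)
    dim = len(client_results[0][0])
    return [sum(w[i] * n for w, n in client_results) / total for i in range(dim)]

def make_client_data(rng, n):
    """Hypothetical stand-in for one client's private dataset."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        data.append((x, 1 if x[0] + x[1] > 0 else 0))
    return data

def accuracy(weights, data):
    """Fraction of samples classified correctly at a 0.5 threshold."""
    correct = sum((predict(weights, x) >= 0.5) == (y == 1) for x, y in data)
    return correct / len(data)

def run_federated(rounds=20, seed=0):
    """Run several communication rounds across three simulated clients."""
    rng = random.Random(seed)
    clients = [make_client_data(rng, n) for n in (40, 60, 100)]
    global_weights = [0.0, 0.0, 0.0]
    for _ in range(rounds):
        results = [local_train(global_weights, data) for data in clients]
        global_weights = aggregate(results)
    return global_weights, clients

if __name__ == "__main__":
    weights, clients = run_federated()
    held_out = make_client_data(random.Random(123), 200)
    print("held-out accuracy:", accuracy(weights, held_out))
```

In this sketch the server only ever sees weight vectors and sample counts. The per-client datasets stay inside `local_train`, which is the property the abstract relies on for privacy.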

Keywords

Federated learning; Prostate cancer; Secure deep learning; Privacy; Distributed learning; Medical imaging

LC Subject Headings

Machine learning; Federated database systems; Data protection; Diagnostic imaging

Description

This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2022.

Cataloged from PDF version of thesis.

Includes bibliographical references (pages 28-30).

Department

Department of Computer Science and Engineering, Brac University

Type

Thesis

Collections

Thesis & Report, BSc (Computer Science and Engineering)