Investigating Cross-Institutional Recognition of Cancer Registration Items: A Case Study on Catastrophic Forgetting

You-Chen Zhang, Chen-Kai Wang, Ming-Ju Tsai, Hong-Jie Dai

Abstract

A cancer registry is a critical database for cancer research, which require diverse domain knowledge and manual extraction of vital information from patient records for surveillance. In order to building a real-time and high-quality cancer registry database, a named entity recognition (NER) model based on bidirectional long short-term memory (BiLSTM)-conditional random fields (CRFs) to automatically extract 14 cancer registry items from unstructured pathology reports was developed for five hospitals. Because not all hospitals have sufficient training data, so that we apply transfer learning to develop our models for different hospitals. However, catastrophic forgetting leads to poor performance of the transferred model on the source hospital. To address this issue, we study the effectiveness of applying the elastic weight consolidation (EWC) method for the extraction of cancer registry items from the unstructured pathology reports of colorectal cancer to mitigate the occurrence of catastrophic forgetting. In our results, we observe that effective parameter settings can reduce the impact of catastrophic forgetting.

comments powered by Disqus