Kenneth Ancheta: Deep-learning based morphological segmentation of canine diffuse large B-cell lymphoma
For tissue biopsies, current confirmatory diagnostic approaches for enlarged lymph nodes rely on expert histopathological assessment, which is time-consuming and requires specialist expertise. Therefore, there is an urgent need to develop tools to support and improve veterinary diagnostic workflows. Advances in molecular and computational approaches have opened new avenues for morphological analysis. This study explores the use of convolutional neural networks (CNNs) to differentiate canine diffuse large B-cell lymphoma (cDLBCL) from non-neoplastic lymph nodes, specifically reactive lymphoid hyperplasia (RLH). Whole slide images (WSIs) of haematoxylin-eosin stained lymph node slides were digitised at 20× magnification and pre-processed using a modified Aachen protocol. Extracted images were split at the patient level into training (60%), validation (30%), and testing (10%) datasets. Here, we introduce HawksheadNet, a novel lightweight CNN architecture for cancer image classification, and highlight the critical role of stain normalisation in CNN training. Once fine-tuned, HawksheadNet demonstrated strong generalisation performance in differentiating cDLBCL from RLH, achieving an area under the receiver operating characteristic curve (AUROC) of up to 0.9691 on StainNet-normalised images, outperforming pre-trained CNNs such as EfficientNet (up to 0.9492), Inception (up to 0.9311), and MobileNet (up to 0.9498). Additionally, WSI segmentation was achieved by overlaying the tile-wise predictions onto the original slide, providing a visual representation of the diagnosis that closely aligned with pathologist interpretation. Overall, this study highlights the potential of CNNs in cancer image analysis, offering promising advancements for clinical pathology workflows, patient care, and prognostication.
1 Introduction
Canine diffuse large B-cell lymphoma (cDLBCL) is the most common subtype of lymphoma in dogs, typically arising in a multicentric form, and is characterised by aggressive biological behaviour (1–3). The current gold standard treatment for cDLBCL is multi-agent maximum tolerated dose chemotherapy; the CHOP protocol (cyclophosphamide, doxorubicin, vincristine, and prednisolone) currently confers 1-, 2-, and 3-year survival rates of 20%, 13%, and 8%, respectively. Accurate diagnosis is important to inform effective therapeutic intervention, which directly impacts prognostic outcomes. While fine-needle aspirate cytological screening and biopsy analysis are commonly used for investigating enlarged lymph nodes, lymphoma diagnosis can be a challenging task for pathologists due to the similarity in appearance of neoplastic and normal lymphocytes and the complex classification of canine lymphomas (2). Therefore, there remains an urgent and unmet need to develop improved, cost-effective tools to diagnose and prognosticate cDLBCL patients.
A convolutional neural network (CNN) is a category of deep learning (DL) architecture that effectively detects important features without human supervision. One effective application of CNN models is in computer vision tasks [reviewed in (4)], including the analysis of histological images for human (5–9) and veterinary sciences (10, 11). Different applications of CNN models have been developed for veterinary settings, as reviewed in (12, 13). In human DLBCL (hDLBCL), Li et al. (14) developed a CNN platform that achieved near-perfect diagnostic accuracy across three different hospitals, outperforming experienced pathologists. Ferrandez et al. (15) developed a model that infers 2-year time-to-progression of hDLBCL patients from positron emission tomography images, compared with international guidelines. Recently, Lee et al. (16) showcased a model that predicts hDLBCL prognosis in patients treated with immunochemotherapy (rituximab + CHOP). In the veterinary field, CNN models have been developed to differentiate types of canine lymphoma (10) and to infer diet- versus steroid-based treatment response in dogs affected with protein-losing enteropathy (11). Although niche studies exist, there remains a need to develop better computer vision models for morphological analysis in underrepresented fields like veterinary science, which lags behind human pathology in adoption of such tools.
Training reliable and robust CNN models for morphological analysis can be computationally demanding, requiring effective image pre-processing and careful hyperparameter fine-tuning. Many pre-trained CNNs previously used to develop morphological models involve complex architectures and extensive fine-tuning (5–9), and often require high-performance computing (HPC) systems for training. Moreover, whole slide images (WSIs) are commonly used to train models for histological slide analysis. Although WSIs provide an abundant source of data for CNN training, they can be challenging to pre-process due to their large size and inconsistent staining. Image processing workflows, such as the Aachen protocol, offer a potential standardisation method for image pre-processing in DL, as demonstrated in various studies (9, 17–20). However, such workflows may not be optimal for all types of tissue, particularly regarding the stain normalisation step (21). There is a need to develop a more dynamic workflow that can be applied to different data types for image pre-processing, along with a lightweight CNN capable of generating reliable models for morphological analysis.
The primary purpose of this study is to determine whether CNN models can be trained to differentiate between cDLBCL and non-neoplastic canine lymph nodes. Additionally, this paper introduces a new lightweight CNN architecture, HawksheadNet, for training computer vision models. Finally, this paper highlights alternative methods for pre-processing lymph node slide images for DL applications.
2 Materials and methods
2.1 Patient cohort
The study cohort consisted of lymph node histopathology samples from 127 cases definitively diagnosed with either cDLBCL or reactive lymphoid hyperplasia (RLH), collected between 1st July 2021 and 31st December 2022. Haematoxylin-eosin (HE)-stained slides were obtained from IDEXX Laboratories, United Kingdom. Of these cases, 59 were cDLBCL and 68 were RLH. Aside from malignancies, one of the key clinical differential diagnoses of canine lymphadenopathy is RLH (10). RLH is characterised by a non-neoplastic polyclonal lymphocyte proliferation that often resolves after antigen clearance (22). In contrast to RLH, cDLBCL consists of an uncontrolled monoclonal neoplastic expansion. Therefore, RLH was used as the contrasting control to cDLBCL to represent non-neoplastic lymphocyte proliferation. cDLBCL or RLH diagnosis was performed by board-certified anatomic pathologists (IDEXX) based on the combination of cellular morphology, immunohistochemistry (IHC), and/or PCR for antigen receptor rearrangements (PARR). For cDLBCL samples without follow-up IHC or PARR, the morphology of the neoplasm was additionally corroborated as large cell lymphoma consistent with presumptive cDLBCL by a board-certified pathologist (JW). Among the cases used in model development and testing, 12 patients with cDLBCL had results for CD79a and CD3 IHC, and one had CD3 plus CD20, while among the RLH cases, one had PARR testing to exclude neoplasia where morphology alone was not definitive. This study primarily focused on patients with enlarged peripheral lymph nodes; therefore, cases with only visceral lymph node involvement were excluded from downstream analysis.
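For illustration, partitioning this cohort at the patient level into training (60%), validation (30%), and testing (10%) sets could be sketched as below. This is a minimal sketch under stated assumptions, not the study's actual procedure: the patient identifiers, the function name `split_patients`, the fixed random seed, and the choice to stratify by diagnosis are all hypothetical. Only the class counts (59 cDLBCL, 68 RLH) and the split fractions come from the text.

```python
import random
from collections import Counter


def split_patients(patient_labels, fractions=(0.6, 0.3, 0.1), seed=0):
    """Assign every patient ID to exactly one of 'train', 'val' or 'test'.

    patient_labels maps a patient ID to its diagnosis label.
    Patients are shuffled and split separately within each label
    (stratification is an assumption of this sketch).
    """
    rng = random.Random(seed)

    # Group patient IDs by diagnosis.
    by_label = {}
    for pid, label in patient_labels.items():
        by_label.setdefault(label, []).append(pid)

    assignment = {}
    for pids in by_label.values():
        pids = sorted(pids)  # fixed starting order for reproducibility
        rng.shuffle(pids)
        n_train = round(fractions[0] * len(pids))
        n_val = round(fractions[1] * len(pids))
        for i, pid in enumerate(pids):
            if i < n_train:
                assignment[pid] = "train"
            elif i < n_train + n_val:
                assignment[pid] = "val"
            else:
                assignment[pid] = "test"  # remainder goes to testing
    return assignment


# Hypothetical patient IDs; only the class counts come from the cohort description.
patients = {f"D{i:03d}": "cDLBCL" for i in range(59)}
patients.update({f"R{i:03d}": "RLH" for i in range(68)})

split = split_patients(patients)
counts = Counter((patients[pid], subset) for pid, subset in split.items())
for key in sorted(counts):
    print(key, counts[key])
```

Because the assignment is keyed on the patient rather than on individual images, every image extracted from a given patient's slides would inherit that patient's subset, so no patient contributes to more than one of the three datasets. The per-subset patient counts produced here follow from this sketch's rounding rule and need not match those of the study.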
