As the possibilities offered by artificial intelligence, in particular machine learning, for improving the speed and accuracy of medical (and veterinary medical) imaging continue to grow, so does the requirement for robust research into the benefits (and possible pitfalls) of this exciting new technology. Given the ‘always-on’ nature of machine learning solutions, coupled with their seemingly human-beating accuracy, it’s hard to imagine a future for any medical imaging that doesn’t incorporate some kind of AI to a greater or lesser extent.
The aim of this cross-sectional (observational) study was “to determine useful texture parameters for discrimination of four specific lung patterns and develop a predictive model that distinguishes the lung patterns based on the selected parameters.” Radiologists’ classification (opinion) was considered the gold standard, and the accuracy and AUC of the machine learning algorithm were compared to those of the authors.
Twelve hundred regions of interest (ROIs) were selected from five hundred and twelve ventrodorsal (VD) and lateromedial (LM) radiographs from two hundred and fifty-two dogs and sixty-five cats (one hundred and sixty males and one hundred and fifty-five females in total). Forty-four texture parameters were generated using eight methods of texture analysis; seventy percent of these data (two hundred and sixty-one samples) were used for training, whilst thirty percent (one hundred and thirty-nine samples) were used as the test data.
Six regions from the VD views, and two from the LM views, were evaluated by three of the four study authors (all radiologists) and the images assigned to one of four categories: normal (P1), alveolar pattern (P2), bronchial pattern (P3), and unstructured interstitial pattern (P4). Any lobes displaying mixed, vascular, or nodular interstitial patterns were excluded from the study to avoid algorithmic misdiagnosis relating to small ROI size (nodular interstitial pattern), difficulties associated with differentiating vascular patterns (arterial and venous, as well as vascular branching), or an inability to discern the predominating pattern (mixed). Any lobes for which the radiologists disagreed were also excluded.
ROIs from the included lobes were selected by one of the authors, with a minimum size of 30 x 30 pixels (mean total 1,914.7 pixels, SD 1,129.0 pixels), with no more than three ROIs coming from any one lobe. ROIs were selected to avoid rib tissue whilst maximising the size of the ROI within the intercostal space. All four lung patterns were represented with the same frequency (four hundred ROIs). Radiography data (all acquired using the same system: REGIUS Model 190; Konica Minolta, Japan) were compressed from 12-bit to 8-bit using an unreported algorithm. Exposure settings were 50-70 kVp, 300 mA, and 0.02 seconds. WIZPACS (v 1.027; Medien, Korea) was used for image interpretation.
Accuracy of the model in correctly classifying the lung patterns was 99.1% for the training data, and AUC was >0.98. Results for the test data were 91.9% and >0.92, respectively.
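For readers less familiar with the two metrics reported, the following is a minimal sketch of how accuracy and a one-versus-rest AUC can be calculated. The labels and scores are invented toy values, not the study's data, and the function names are hypothetical. The paper's own software and exact AUC method are not described in this summary.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of ROIs whose predicted pattern matches the reference pattern."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def one_vs_rest_auc(true_labels, scores, positive):
    """AUC for one pattern against all others, computed by pairwise ranking.

    This equals the probability that a randomly chosen ROI of the positive
    pattern receives a higher score than a randomly chosen ROI of any other
    pattern (ties count as half).
    """
    pos = [s for t, s in zip(true_labels, scores) if t == positive]
    neg = [s for t, s in zip(true_labels, scores) if t != positive]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: reference patterns from the radiologists,
# the model's predicted pattern, and the model's score for pattern P2.
true_labels = ["P1", "P2", "P2", "P3", "P4", "P1", "P2", "P4"]
predicted_labels = ["P1", "P2", "P1", "P3", "P4", "P1", "P2", "P3"]
p2_scores = [0.10, 0.90, 0.40, 0.20, 0.05, 0.45, 0.80, 0.15]

acc = accuracy(true_labels, predicted_labels)
auc_p2 = one_vs_rest_auc(true_labels, p2_scores, "P2")

print(f"accuracy = {acc:.3f}")
print(f"AUC (P2 vs rest) = {auc_p2:.3f}")
```

Accuracy depends on a single decision per ROI, whereas AUC summarises how well the scores rank one pattern above the others across all possible thresholds, which is why the two figures are reported separately.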
There were numerous limitations to this study, which were covered by the authors. The main drawback to this study in terms of the application of its findings to wider clinical practice was that all images were acquired on one acquisition device. There is no evidence to date to suggest that the results would be applicable for images acquired on another system, so more work would need to be done to establish the machine learning model’s usefulness in the ‘real world’. Compression of the greyscale data from 12-bit to 8-bit is likely to have had an effect on the accuracy of the model, but it’s not possible to say in which direction. Moreover, no details are provided for the compression algorithm used to effect this compression, and the choice of algorithm may also have significant effects on the model.
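To illustrate why the unreported compression step matters, the sketch below applies two plausible ways of reducing 12-bit greyscale values to 8-bit. Neither is claimed to be the method used in the study; both functions and the pixel values are hypothetical and chosen only to show that different approaches can yield different texture information from the same ROI.

```python
def shift_to_8bit(pixels):
    """Drop the four least significant bits, mapping 0..4095 onto 0..255."""
    return [p >> 4 for p in pixels]


def window_to_8bit(pixels):
    """Stretch the observed min..max range of this ROI onto 0..255."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # A completely flat ROI has no contrast to stretch.
        return [0 for _ in pixels]
    return [round((p - lo) * 255 / (hi - lo)) for p in pixels]


# Hypothetical 12-bit grey levels from a low-contrast ROI.
roi_12bit = [1000, 1010, 1015, 1120, 1200]

shifted = shift_to_8bit(roi_12bit)
windowed = window_to_8bit(roi_12bit)

print("bit shift:", shifted)
print("windowed: ", windowed)
```

In this toy ROI the bit shift merges two neighbouring grey levels that the windowed version keeps distinct, and the overall spread of values differs greatly between the two outputs. Texture parameters derived from grey-level differences would therefore not be the same for the two versions.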
The exclusion of any lobes for which there was not unanimous agreement amongst the radiologists (whose opinion was used as the ‘gold standard’ for this study) may have resulted in an over-representation of more obvious lung patterns; at present it is not known whether the model would perform as well on the more ambiguous cases, and indeed the lack of a clear ‘gold standard’ in those cases would make future research more complex. Finally, no attempt was made to assess the reproducibility of the radiologists’ assessment, in either an intra-operator or inter-operator context. The fact that the ROIs were selected by just one radiologist may also be relevant, and ROIs selected by someone else may very well have produced different results.
The authors were conservative in the summary of their findings, stating that, “a number of texture parameters showed significant differences between the patterns. The developed artificial neural networks demonstrated high performance in discriminating the patterns. Texture analysis and machine learning algorithms have potential for application in the evaluation of medical images.”
Given the limitations of this study, which in effect served as a pilot study for potential future research in this area, this is a reasonable summation. Clearly, a lot more work needs to be done to validate this technique in a far broader range of cases, using a larger study sample drawn from a larger study population and working with images acquired from a range of different acquisition devices. Investigation of the effects of image compression prior to texture pattern analysis would also increase the potential usefulness of this research, and the benefits that may come from such machine learning algorithms in future.
1. Yoon, Y., Hwang, T., Choi, H., Lee, H. (2019). Classification of radiographic lung pattern based on texture analysis and machine learning. J Vet Sci 20(4):e44.