Application of machine learning algorithms for accurate determination of bilirubin level on in vitro engineered tissue … – Nature.com

Colour space channel sensitivity analysis

An absorbance spectral scan was performed on freshly prepared bilirubin samples using a microplate reader and compared against reported spectral profiles to ensure the absence of biliverdin. The solutions absorbed strongly in the blue region (Fig. 1a), with an absorbance peak at 450 nm (ref. 29). We then confirmed the spectral behaviour of our prepared bilirubin samples by verifying the linearity of the optical density (OD) with bilirubin concentration at three wavelengths: red (R, 650 nm), green (G, 532 nm) and blue (B, 456 nm). These three wavelengths also correspond to the RGB colour filters of the mobile phone digital camera sensor (ref. 30). As shown in Fig. 1b, the blue wavelength showed a strong linear correlation (R²=0.996) and the highest sensitivity (slope=0.198), indicating that it carries the strongest bilirubin signal and is the most sensitive to changes in bilirubin concentration. The green wavelength also displayed a strong correlation (R²=0.983), but a much lower sensitivity (slope=0.017) than the blue wavelength. The red wavelength had the lowest correlation (R²=0.735) and sensitivity (slope=0.002) of the three, suggesting that the red channel responds only weakly to changes in bilirubin concentration.
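
The correlation (R²) and sensitivity (slope) figures quoted throughout this section come from straight-line fits of a signal against bilirubin concentration. The following is a minimal sketch of such a fit using ordinary least squares; the function name `linear_fit` and the sample readings are hypothetical and are not taken from the study.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r2), where r2 is the coefficient of
    determination of the fitted line.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return slope, intercept, r2


# Hypothetical readings (NOT data from the study): concentration vs blue OD.
concentration = [0.0, 5.0, 10.0, 15.0, 20.0]
od_blue = [0.05, 1.02, 2.05, 3.01, 4.04]

slope, intercept, r2 = linear_fit(concentration, od_blue)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R2={r2:.3f}")
```

In this framing, a steeper slope means a larger change in signal per unit of bilirubin (higher sensitivity), while R² measures how closely the points follow the line.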

Fig. 1. Spectral characterization of bilirubin solutions with different concentrations. (a) Spectral characterization of bilirubin solution on different days. The red curve represents freshly made bilirubin. A leftward peak shift was observed for stored bilirubin, due to conversion of bilirubin to biliverdin. (b) Scatter plot of the absorbance of bilirubin solution at the different colour wavelengths. Statistical differences (P<0.01) in both correlation and sensitivity were observed among all three wavelengths.

In vitro evaluation and characterization of the ML-based bilirubin measurement were conducted using an elastomeric tissue phantom mimicking neonatal skin with varying concentrations of bilirubin. Parametric studies isolated the effect of each biological and external factor through tight control of the in vitro fabrication parameters and environmental conditions. First, polydimethylsiloxane-titanium dioxide (PDMS-TiO2) tissue phantom samples of varying thickness were evaluated (Fig. 2a). Relatively strong correlations between blue channel pixel value and bilirubin concentration were observed for the 1 mm (R²=0.722, slope=2.775), 2 mm (R²=0.921, slope=2.630) and 3 mm (R²=0.972, slope=2.033) samples. However, a statistically significant difference (P<0.05) in sensitivity was observed between the 1 mm and 3 mm samples, as well as between the 2 mm and 3 mm samples, demonstrating that images of the 3 mm samples were less responsive to changes in bilirubin concentration.

Fig. 2. Parametric study of important biological and external features. (a) Scatter plot of pixel value against bilirubin concentration in PDMS-TiO2 tissue phantom samples with different thicknesses. A statistically significant difference (P<0.05 in 1 mm, P<0.001 in 2 mm and P<0.001 in 3 mm) was observed in the sensitivity slope but not in correlation among the different thicknesses. (b) Scatter plot of pixel value against bilirubin concentration in samples with different light scattering ratios (PDMS to TiO2 ratio). A significant statistical difference was observed in both correlation and sensitivity between the 0.01 and 0.02 PDMS-TiO2 ratios (P<0.005), as well as between the 0.015 and 0.02 PDMS-TiO2 ratios (P<0.005). (c) Scatter plot of pixel value against bilirubin concentration in samples with different WB. A significant statistical difference was observed in both correlation and sensitivity among 2000 K (P<0.05), 5000 K (P<0.005) and 8000 K (P<0.01). (d) Scatter plot of pixel value against bilirubin concentration in samples with different ISOs. The ISO200 and ISO500 (P<0.05), ISO200 and ISO700 (P<0.05), and ISO500 and ISO700 (P<0.05) datasets are statistically different from each other in correlation. At the 0.05 significance level, a significant statistical difference was also observed in sensitivity between the ISO200 and ISO500 datasets (P<0.05), as well as between the ISO200 and ISO700 datasets (P<0.05). (e) Scatter plot of pixel value against bilirubin concentration in samples with different illumination tones. A significant statistical difference was observed in correlation among all channels (P<0.05). (f) Scatter plot of pixel value against bilirubin concentration in sample images with different light intensities. The different light intensities showed a statistically significant difference in both sensitivity and correlation (P<0.05).

Next, images of PDMS-TiO2 tissue phantom samples with different extents of light scattering (varying ratios of PDMS to TiO2 light scattering agent) were used to represent skin containing varying amounts of scattering agents such as collagen and adipose tissue (ref. 31). As shown in Fig. 2b, strong correlations between blue channel pixel values and bilirubin concentrations were observed in samples with a ratio of 0.01 (R²=0.921, slope=2.630) and in samples with a ratio of 0.015 (R²=0.936, slope=2.359). However, no statistically significant linear regression relationship (R²=0.482, slope=1.065, P>0.05) was observed between blue channel pixel values and bilirubin concentrations in PDMS-TiO2 tissue phantom images with a TiO2 ratio of 0.02.

Two hardware parameters that are commonly adjusted for image quality control were evaluated. First, PDMS-TiO2 tissue phantom images were captured under different white balance (WB) settings (Fig. 2c). When the WB was much lower (2000 K) than the actual colour temperature, weaker correlation and sensitivity (R²=0.774, slope=0.425) were observed, suggesting a strong confounding effect on the bilirubin concentration prediction. As the WB approached 5000 K, which is close to the actual colour temperature of the image collection environment, a much stronger correlation and sensitivity (R²=0.963, slope=1.940) were observed. At a higher WB (8000 K), the sensitivity did not change significantly (slope=2.097, P>0.05), but the correlation weakened (R²=0.848). Second, the camera sensor light sensitivity (ISO) was varied to establish its impact on the bilirubin estimation. Although sensitivities remained moderately high, the low and high ISO values gave a weaker correlation (ISO100, R²=0.898, slope=1.944; ISO1000, R²=0.868, slope=2.022) than the medium ISO (ISO500, R²=0.921, slope=2.630). As expected, the pixel intensity values increased with increasing ISO (Fig. 2d). By extrapolation, the pixel values of dilute bilirubin samples are expected to saturate under brighter or darker lighting conditions. An appropriate ISO setting is therefore needed to control image brightness and prevent signal saturation at the lower and higher bilirubin concentrations.

The effect of ambient lighting conditions (light intensity and illumination tone) was also investigated (Fig. 2e). Statistically significant differences in correlation and sensitivity were found between the low, moderate and high light intensity conditions. Images with low light intensity gave a weak correlation (R²=0.717) but high sensitivity (slope=3.678). The correlation increased as the light intensity rose from moderate (R²=0.934) to high (R²=0.942), while the sensitivity slope decreased from 2.531 to 1.793. These results suggest that a high but appropriate light intensity helps the camera pixels capture the colour information from the PDMS-TiO2 tissue phantom images.

Lastly, different illumination tones were supplied and tested using a diffused light source (Fig. 2f). A relatively strong linear correlation was observed in all three illumination tones: white light (R²=0.894, slope=2.106), off-white light (R²=0.860, slope=1.953) and yellow light (R²=0.878, slope=2.000). All three conditions demonstrated satisfactory (~0.9) correlation and sensitivity in response to changes in bilirubin concentration in the PDMS-TiO2 tissue phantom images. However, compared with images taken under white light, the pixel values were generally lower under off-white light and much lower under yellow light. This is expected, as the spectral characterization of the illumination tones (see Supplementary Fig. 1) showed the strongest blue light scattering signal (~450 nm) in white light, followed by off-white light and yellow light.

The last factor tested (capture distance from the subject) did not demonstrate any statistical significance in correlation and sensitivity (P>0.05), suggesting that distance does not affect the image information of the PDMS-TiO2 tissue phantom images (see Supplementary Fig. 2).

We have shown that the bilirubin level prediction from images is strongly dependent on the colour representation. This necessitates correction of the WB to achieve more faithful colour rendering. We applied different colour constancy algorithms (Gray World (GW), MSGP, MaxRGB, and CH) and compared the WB-corrected images against the corresponding ground truth images captured with a light temperature of 5600 K. The relative performances of the various WB correction methods are shown in Table 1. The WB correction methods differed considerably, and the angular error results showed that the GW method performed the best at 3000 K (mean angular error: 0.44) for the PDMS-TiO2 tissue phantom images. We also conducted additional tests to assess the GW accuracy in correcting raw images captured at different colour temperatures (Table 2). The results indicated that the GW method consistently produced a low angular error (<0.5) for all groups, demonstrating the stability and effectiveness of the GW method in correcting the WB variations of tissue phantom images.
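
As an illustration of the two ideas in this paragraph, the sketch below applies a basic Gray World correction (which assumes the average colour of a scene is achromatic) and scores it with an angular error in degrees. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the pixel values and the choice to measure the angle between mean RGB vectors are all hypothetical.

```python
import math


def gray_world(pixels):
    """Scale each channel so all three channel means equal their common average."""
    n = len(pixels)
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    gray = sum(means) / 3.0
    gains = [gray / m for m in means]  # assumes no channel mean is zero
    # Clip to the 8-bit range after applying the per-channel gain.
    return [tuple(min(255.0, p[c] * gains[c]) for c in range(3)) for p in pixels]


def mean_rgb(pixels):
    """Mean (R, G, B) vector of a list of pixels."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def angular_error_deg(rgb_a, rgb_b):
    """Angle in degrees between two RGB vectors (0 means identical chromaticity)."""
    dot = sum(a * b for a, b in zip(rgb_a, rgb_b))
    norm = math.sqrt(sum(a * a for a in rgb_a)) * math.sqrt(sum(b * b for b in rgb_b))
    cos_angle = max(-1.0, min(1.0, dot / norm))  # guard against rounding drift
    return math.degrees(math.acos(cos_angle))


# Hypothetical patch with a warm colour cast (NOT data from the study).
warm = [(200.0, 150.0, 100.0), (100.0, 75.0, 50.0)]
corrected = gray_world(warm)
reference = (1.0, 1.0, 1.0)  # assumed neutral reference direction

print("before:", angular_error_deg(mean_rgb(warm), reference))
print("after: ", angular_error_deg(mean_rgb(corrected), reference))
```

Because the angle ignores vector length, this metric compares colour balance independently of overall brightness, which is why it is a common choice for evaluating colour constancy methods.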

Similar to the WB correction, we evaluated several colour spaces for their correlation with the bilirubin concentration using tissue phantom images taken in a controlled image collection environment. As shown in Fig. 3a, and consistent with the spectrophotometric characterization presented in Fig. 1a, a strong linear correlation and high sensitivity (R²=0.958, slope=1.995) were observed between the blue channel pixel values and bilirubin concentrations in PDMS-TiO2 tissue phantom images. The green channel pixel value showed a lower correlation and sensitivity (R²=0.871, slope=0.542) than the blue channel. The pixel value acquired from the red channel did not demonstrate a statistically significant linear relationship (R²=0.014, slope=0.034, P<0.05) with the bilirubin solutions, suggesting an insensitive response to changes in bilirubin concentration.

Fig. 3. Linear regression and evaluation of important feature channels from colour spaces: RGB, CMYK, L*a*b*, HSV, YCbCr and LUV. (a) Scatter plot of pixel value against bilirubin concentration of PDMS-TiO2 tissue phantom samples in the RGB channels. A significant statistical difference in correlation was observed between the B and R channels (P<0.05), as well as between the B and G channels (P<0.005). (b) Scatter plot of pixel value against bilirubin concentration of samples in the CMY(K) channels. A significant statistical difference in both correlation and sensitivity was observed among all channels (P<0.05). (c) Scatter plot of pixel value against bilirubin concentration of samples in the CIELAB channels. A statistically significant difference (P<0.05) was observed in correlation among all channels. Sensitivity also differed significantly between the L channel and the a* channel, as well as between the L channel and the b* channel (P<0.05). (d) Scatter plot of pixel value against bilirubin concentration of samples in the HSV channels. A significant statistical difference in both sensitivity and correlation was observed among all channels (P<0.05). (e) Scatter plot of pixel value against bilirubin concentration of samples in the YCbCr channels. A significant statistical difference (P<0.05) was observed in both correlation and sensitivity among all channels. (f) Scatter plot of pixel value against bilirubin concentration of samples in the LUV channels. A significant statistical difference (P<0.05) was observed in both correlation and sensitivity among all channels.

Mapping of the RGB values to the CMY(K) colour space was also evaluated (Fig. 3b). Among all channels, the pixel values from the Y channel showed the strongest linear correlation (R²=0.986) and the highest sensitivity (slope=1.812), whereas the C channel showed a weaker correlation (R²=0.782) and a less sensitive response (slope=1.086) to changes in bilirubin concentration. The M channel showed the weakest correlation (R²=0.122) with bilirubin level.
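
For readers unfamiliar with the mapping, the sketch below shows the common device-independent RGB to CMYK conversion. The text does not state which conversion was used, so this formula and the function name `rgb_to_cmyk` are assumptions chosen for illustration.

```python
def rgb_to_cmyk(r, g, b):
    """Convert 8-bit RGB to CMYK fractions in [0, 1] using the naive formula.

    K is the complement of the brightest channel; C, M and Y are the
    complements of R, G and B after removing that black component.
    """
    r_n, g_n, b_n = r / 255.0, g / 255.0, b / 255.0
    k = 1.0 - max(r_n, g_n, b_n)
    if k == 1.0:
        # Pure black: C, M and Y are undefined, so report them as zero.
        return (0.0, 0.0, 0.0, 1.0)
    c = (1.0 - r_n - k) / (1.0 - k)
    m = (1.0 - g_n - k) / (1.0 - k)
    y = (1.0 - b_n - k) / (1.0 - k)
    return (c, m, y, k)


# A saturated yellow pixel maps entirely onto the Y channel.
print(rgb_to_cmyk(255, 255, 0))
# A paler yellow gives a smaller Y value.
print(rgb_to_cmyk(255, 255, 128))
```

Under this formula the Y channel is driven by how little blue a pixel contains, which is consistent with the blue channel and the Y channel both tracking bilirubin strongly.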

In the CIELAB (L*a*b*) colour space (Fig. 3c), values acquired from the b* channel displayed a strong linear correlation (R²=0.971) and relatively high sensitivity (slope=0.902) to changes in bilirubin concentration. The value in the a* channel had a comparably weaker correlation (R²=0.749) and a lower sensitivity (slope=0.203). Notably, the pixel value obtained from the L channel did not show a statistically significant linear relationship (R²=0.3019, slope=0.08, P>0.05) with varying bilirubin concentrations.

Meanwhile, Fig. 3d shows the linear relationship between bilirubin level and each channel of the Hue-Saturation-Value (HSV) colour space. The S channel showed a strong linear correlation with bilirubin concentration (R²=0.917). Its sensitivity (slope=0.0061) was also high relative to the other channels in the HSV colour space, indicating a strong response to changes in bilirubin concentration.

The YCbCr colour space showed a broadly linear relationship between its three channels and the bilirubin concentration, albeit with an evidently lower sensitivity (Fig. 3e). A very strong linear correlation and moderate sensitivity were observed in the Cb channel (R²=0.970, slope=0.892). The other channels showed weaker linear correlation and sensitivity (Y channel: R²=0.769, slope=0.270; Cr channel: R²=0.381, slope=0.275), so the Cb channel displayed superior sensitivity to changes in bilirubin concentration.
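
The sketch below converts an 8-bit RGB pixel to YCbCr using the full-range BT.601 coefficients from the JPEG/JFIF convention. Whether the study used this variant is an assumption; other YCbCr definitions use different ranges and offsets, and the function name is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range YCbCr (BT.601 coefficients, JFIF style)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (y, cb, cr)


# Hypothetical pixels: a neutral gray and a yellowish tone.
gray = rgb_to_ycbcr(128, 128, 128)
yellowish = rgb_to_ycbcr(200, 190, 90)

print("gray:     ", gray)
print("yellowish:", yellowish)
```

Cb is the blue-difference chroma component, so under this convention a yellower pixel (less blue relative to red and green) has a lower Cb value than a neutral one.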

The final colour space mapping evaluated was the CIELUV (L*u*v*) colour space (Fig. 3f). The L (slope=0.080) and u* (slope=0.240) channels had either low or statistically insignificant sensitivity (P>0.05) to changes in bilirubin value, whereas the v* channel had both a strong linear correlation (R²=0.969) and a relatively higher sensitivity (slope=1.308) to changes in bilirubin concentration.

Five machine learning models, namely decision tree (DT), K-nearest neighbour (KNN), random forest (RF), support vector machine (SVM), and LightGBM, were evaluated for their accuracy in performing binary classification of jaundice based on the PDMS-TiO2 tissue phantom image data (Fig. 4a). The mean accuracy was 0.672 for DT, 0.737 for KNN, 0.774 for RF, 0.827 for LightGBM and 0.848 for SVM. Statistically significant differences (P<0.05) in accuracy were also observed among the models. The pairwise comparison showed that SVM performed the best in the bilirubin binary classification task, followed by LightGBM, RF and KNN, with DT performing the worst. The corresponding receiver-operating-characteristic (ROC) curves and the respective area-under-curve (AUC) scores further validated this observation (Fig. 4b). The AUC scores obtained for each model were as follows: 0.74 (DT), 0.82 (KNN), 0.86 (RF), 0.91 (LightGBM), and 0.93 (SVM). The results were consistent with the performance comparison based on the cross-validated accuracy, suggesting that the SVM model has the best predictive capability among the models tested.
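
The two metrics compared here can be computed directly from predictions. The sketch below is a minimal, dependency-free illustration: accuracy from hard labels, and AUC as the probability that a randomly chosen positive sample is scored above a randomly chosen negative one. The function names, labels and scores are hypothetical and do not come from the study.

```python
def accuracy(labels, predictions):
    """Fraction of samples whose predicted class matches the true class."""
    correct = sum(1 for t, p in zip(labels, predictions) if t == p)
    return correct / len(labels)


def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    Counts the fraction of (positive, negative) pairs in which the positive
    sample has the higher score; ties count as half.
    """
    pos = [s for t, s in zip(labels, scores) if t == 1]
    neg = [s for t, s in zip(labels, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = abnormal bilirubin) and classifier scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
predictions = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", accuracy(labels, predictions))
print("AUC:     ", roc_auc(labels, scores))
```

Accuracy depends on the decision threshold (0.5 here, an arbitrary choice), whereas AUC summarises ranking quality across all thresholds, which is why the two are reported together.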

Fig. 4. Model performances. (a) Accuracy of the DT, KNN, RF, SVM and LightGBM models in the classification task. A significant statistical difference in accuracy was observed among models (P<0.05). All models demonstrated a significant statistical difference in pairwise comparison except for the comparison between the SVM and LightGBM models (P>0.05). (b) ROC performance of the five models. The inset graph shows the AUC performance, which represents the capability of the model to distinguish between tissue phantom images with normal bilirubin levels and those with abnormal bilirubin concentrations. The AUC demonstrated a significant statistical difference among all models (P<0.05). (c) R² value of the models in the regression task when trained with different numbers of features. A significant statistical difference (P<0.05) was observed in R² between 6 and 17 features among all models. The five models are statistically different from each other in pairwise comparison, except for RF, SVM and LightGBM. (d) MSE value of the models in the regression task when trained with different numbers of features. A significant statistical difference (P<0.05) was observed in MSE between 6 and 17 features among all models. Asterisk (*) indicates P<0.05.

In addition, as shown in Fig. 4c, d, we tested the models on the regression task, evaluating the performance of each regression model (DT, RF, KNN, SVR, and LightGBM) in predicting the exact bilirubin concentration in the PDMS-TiO2 tissue phantoms. When the models were trained with limited features, a relatively large mean square error (MSE) was observed in all models (DT: 16.157; KNN: 19.749; RF: 11.499; SVM: 17.651 and LightGBM: 13.830), together with a low R² score (DT: 0.555; KNN: 0.465; RF: 0.684; SVM: 0.524 and LightGBM: 0.624). However, the R² score improved significantly (P<0.05) for all five models when additional features were included (DT: 0.734; KNN: 0.742; RF: 0.812; SVM: 0.806 and LightGBM: 0.808). The corresponding MSE was also greatly reduced across all models (DT: 9.635; KNN: 9.366; RF: 6.816; SVM: 7.054 and LightGBM: 6.946). Similar to the classification task, the LightGBM, SVM and RF models performed the best in predicting bilirubin concentrations. Because the MSE is the mean squared deviation from the true bilirubin level, its low value indicates only a minor disparity from the true bilirubin levels. The corresponding deviation (the square root of the MSE) ranged from 2.61 mg/dl to 3.10 mg/dl across all evaluated models, illustrating the predictive capacity of machine learning models in determining precise bilirubin levels.
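
The 2.61 to 3.10 mg/dl range quoted in this paragraph matches the square roots of the lowest and highest MSE values reported with additional features (RF: 6.816; DT: 9.635). The sketch below checks that arithmetic and shows how MSE and R² are computed; the helper names and the small example vectors are hypothetical.

```python
import math


def mse(y_true, y_pred):
    """Mean of the squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Square roots of the MSE values reported in the text (units: mg/dl).
rmse_rf = math.sqrt(6.816)   # lowest reported MSE (RF)
rmse_dt = math.sqrt(9.635)   # highest reported MSE (DT)
print(f"RF: {rmse_rf:.2f} mg/dl, DT: {rmse_dt:.2f} mg/dl")

# Hypothetical true and predicted concentrations, for illustration only.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 4.0]
print("MSE:", mse(y_true, y_pred), "R2:", r2_score(y_true, y_pred))
```

Taking the square root puts the error back into the same units as the bilirubin measurement, which makes it easier to interpret than the MSE itself.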
