Functional and structural reorganization in brain tumors: a machine learning approach using desynchronized functional … – Nature.com

Acquisition and usage of MRIs

A detailed explanation of the participants as well as the acquisition of the data is already available63; nonetheless, for the sake of transparency, we briefly present some crucial aspects. Subjects were asked to undergo MR scans both in pre- and post-surgery sessions. Out of the 36 subjects that agreed to take part in the pre-surgery session (11 healthy [58.610.6 years], 14 meningioma [60.412.3 years] and 11 glioma [47.511.3 years]), 28 were scanned after surgery (10 healthy [59.610.3 years], 12 meningioma [57.911.0 years] and 7 glioma [50.711.7 years]). The post-surgery scan session took place during the first medical consultation at the hospital after the surgical intervention (mean: 7.9 months, range: 5.210.7 months). There were no differences in the time intervals between the groups (meningioma [24312 days], glioma [22315 days], p=0.328, two-tailed U-test). As a result, 19 pre- and post-surgery pairs of structural connectomes were usable as training and testing data. All brain tumors were classified as grade I, II, and III according to the World Health Organization. All ethical regulations relevant to human research participants were followed63.

Each MR session consisted of a T1-MPRAGE anatomical scan (160 slices, TR=1750 ms, TE=4.18ms, field of view=256mm, flip angle=9o, voxel size 111mm3, acquisition time of 4:05min) followed by a multi-shell HARDI acquisition (60 slices, TR=8700ms, TE=110ms, field of view=240mm, voxel size 2.52.52.5mm3, acquisition time of 15:14min, 101102 directions b=0, 700, 1200, 2800s/mm2) together with two reversed phase-encoding b=0s/mm2 blips to correct susceptibility-induced distortions64. Resting-state functional echo-planar imaging data were obtained (42 slices, TR=2100ms, TE=27ms, field of view=192mm, flip angle=90o, voxel size 333mm3, acquisition time of 6:24min). The TR was accidentally changed to 2400ms after 4 control subjects, 5 meningioma patients and 2 glioma patients were scanned changing the times of acquisition to 7:19min. For all the subsequent Fourier analyses, this TR mismatch is solved by adding zero padding and truncating the shorter time series to ensure that equivalent spectrums were sampled by the Python methods (for further details see Supplementary Material).

Additionally, segmented lesions including the edema, non-enhancing, enhancing, and necrotic areas were available. Tumor masks were obtained with a combination of manual delineation, disconnectome63, and the Unified Segmentation with Lesion toolbox4. To identify the tumor core of gliomas, two clinicians with more than thirty and ten years of experience performed and independently validated the segmentations using 3D slicer. Data only allowed for the identification of the tumor cores; hence we subtracted the resulting cores from the whole lesion to obtain a non-necrotic region for each of the patients diagnosed with a glioma-like tumor.

High-resolution anatomical T1 weighted images were skull-stripped65, corrected for bias field inhomogeneities66, registered to MNI space67, and segmented into 5 tissue-type images68. Diffusion-weighted images suffer from many artifacts all of which were appropriately corrected. Images were also skull-stripped65, corrected for susceptibility-induced distortions64, denoised69, freed from Gibbs ringing artifacts70 and corrected for eddy-currents and motion artifacts71. The preprocessed images were then co-registered to its corresponding anatomical template (already in MNI space)67, resampled to a 1.5mm3 voxel size and eventually corrected for bias field inhomogeneities66. After motion correction as well as registration to the MNI template, the B-matrix was appropriately rotated72.

Functional data was preprocessed with fMRIprep73 and the eXtensible Connectivity Pipeline (XCP-D)74 which are two BIDS-compatible apps that perform all recommended processing steps to correct for distortion artifacts in functional data. Regression of global signal has been shown to improve denoising in BOLD series without excessive loss of community structure75. In total, 36 nuisance regressors were selected from the nuisance confound matrices of fMRIPrep output which included six motion parameters, global signal, the mean white matter, the mean CSF signal with their temporal derivatives, and the quadratic expansion of six motion parameters, tissues signals and their temporal derivatives76. Volumes with framewise displacement higher than 0.3mm were regressed out. Although smoothed time series were available, our analysis did not consider them. All specific steps were common to all subjects, both control and brain tumor patients. All images (T1s, T1 segmentations, diffusion, lesion masks and functional) were eventually co-registered to MNI space for comparison.

BOLD signals belonging to the DMN were identified with the Gordon functional Parcellation77. More precisely, each one of the 41 regions classified as Default by the parcellation image was used as a binary mask to extract the time series from the functional image. For each subject (patient and control), the pair-wise Pearson correlation coefficient between time series was computed to obtain a functional connectivity matrix. The spatial overlap between DMNs and tumor masks was computed by summing all the voxels in the lesion mask belonging to one of these 41 regions. To normalize this score, we divided the resulting number by the number of voxels belonging to each one of the 41 regions labeled as Default. Note that, with this definition, an overlap of 1 would mean the presence of a tumor the size of the entire DMN.

$${{{{{rm{Overlap}}}}}}=frac{{{{{{rm{|}}}}}}{Tumor}cap {DMN}{{{{{rm{|}}}}}}}{left|{DMN}right|}$$

(1)

Moreover, the spatial distance between the center of mass tumor and the DMN was computed by averaging the Euclidean distances to the center of mass of each one of the DMN nodes.

The DMN of the patients was compared to the mean of the healthy networks with two different metrics to assess (1) differences node-wise and (2) the Richness of the networks. Node similarity was assessed by computing the mean Pearson correlation between the same nodes in two different networks. For that, each row in the adjacency matrices was treated as a vector and compared with the same row of all matrices from the healthy subjects. After iterating through all nodes in the DMN, the mean and standard errors were computed for comparison. Furthermore, to assess the complexity of a given network, we computed the absolute difference between the distribution of correlations building the network and a uniform distribution30. We refer to this score as Richness:

$$Theta =1-frac{m}{2(m-1)}mathop{sum }limits_{mu =1}^{m}left|{P}_{mu }left({r}_{{ij}}right)-frac{1}{m}right|$$

(2)

where (m=15) is the number of bins of the histogram estimating the distribution of correlations in the network ({P}_{mu }({r}_{{ij}})). Zamora-Lpez and ccolleagues showed the robustness of the quantity in Eq. (2) with regard to the value of the parameter (m). However, sensible choices range from 10 to 20 to ensure a sufficiently rich approximation of ({P}_{mu }({r}_{{ij}})). The changes in richness across patients were obtained by computing the difference relative to the richness of the mean DMN obtained from control subjects: (Delta Theta ={Theta }_{{Patient}}-{Theta }_{{Healthy}}).

A similar procedure was followed to study BOLD signals inside the lesioned tissue. For each patient, the binary mask containing the edema was used to extract the time series from the patient, as well as from all control subjects. Consequently, BOLD signals in lesioned regions of the brain were comparable to 11 healthy signals from the same region. No network was computable in this case, making the use of Eq. (2) pointless.

To compare time series between subjects, we computed the Real Fast Fourier Transform of the BOLD series. This allowed us to compare the power spectrum of two or more signals regardless of, for example, the dephasing between them. Let ({A}_{omega }) be the amplitude of the component with frequency . Then, the total power of the signal can easily be obtained by summing the squared amplitudes of all the components:

$${P}_{T}=mathop{sum }limits_{forall omega }{left|{A}_{omega }right|}^{2}$$

(3)

With the Fourier decomposition, we could also characterize the power distribution of the signals as a function of the frequency. Analogous to Eq. (3), we summed the squared amplitudes corresponding to frequencies inside a bin of amplitude (Delta omega).

$${P}_{{omega }_{c}}=frac{100}{{P}_{T}}cdot mathop{sum }limits_{forall omega in [{omega }_{c}-Delta omega ,{omega }_{c}]}{left|{A}_{omega }right|}^{2}$$

(4)

Since each signal had a different ({P}_{T}), to compare between subjects and/or regions, we divided the result by the total power ({P}_{T}) and multiplied by 100 to make it a percentage. Arbitrarily, we chose the parameter (Delta omega) for each subject so that each bin included 10% of the total power. The qualitative results did not depend on the exact choice of the bin width.

Similarly, we computed the cumulative power distribution (C{P}_{omega }) by summing all the squared amplitude coefficients up to a certain threshold. For consistency, we measured the (C{P}_{omega }) as a percentage score and chose the thresholds to be multiples of exact percentages i.e., ({omega }^{{prime} }propto 10 % ,20 % ,ldots)).

$${{CP}}_{{omega }_{c}}=frac{100}{{P}_{T}}cdot mathop{sum }limits_{forall omega in [0,{omega }_{c}]}{left|{A}_{omega }right|}^{2}$$

(5)

Both the power distribution ({P}_{omega }) and cumulative power distribution (C{P}_{omega }) can be used to compare dynamics between time series, but they have the inconvenience of not being scalar numbers. Furthermore, computing any distance-like metric (i.e., KL divergence) between these distributions across subjects would not yield any information of whether BOLD signals had slower dynamics (more power located in low frequencies) or the opposite (i.e., DMN in healthy and patient).

To overcome this, we designed a DAS between time series based on the difference between two cumulative power distributions. It is worth noting that in the limit (Delta omega to 0), the summations in Eqs. (2), (3), and (4) become integrals simplifying the following mathematical expressions. The DAS between two BOLD signals (i,j) was computed as the area between the two cumulative power distributions:

$${DAS}left(i,jright)=int domega left(C{P}_{omega }^{i}-C{P}_{omega }^{j}right)=-{DAS}(, j,i)$$

(6)

Finding a positive ({DAS}(i,j)) would mean that time series i had slower dynamics than time series j since more power is accumulated in lower frequencies with respect to the total. Throughout this manuscript, DASs were defined as the difference in power distribution between patients and the healthy cohort. For a simplified and, hopefully comprehensive, example, we kindly refer the reader to Fig. S1. To characterize a specific DMN, all these measures were computed for each region separately and then averaged [meanSEM]. As opposed to the Richness, the DAS was computable both for DMNs and tumors since it only required two temporal series rather than a complete distribution. To compute absolute values of this score, the DAS for each region (or tumor) was made strictly positive. Only then average between regions and subjects was performed. Notably, these two operations are not interchangeable.

For the score defined in Eq. (6) to make sense, the Real Fast Fourier Transform of the time series needed to be computed using the same frequency intervals, which, in short, implied that the time duration of the signals needed to be equal. For functional images with different TRs, this was solved by adding zero-padding to the shortest signal to match the same time duration (Fig. S14). Further permutation analyses on a reduced subset of subjects with identical TRs confirmed the tendencies reported in the text (Fig. S15).

To ensure a detailed subject-specific network, we used a state-of-the-art pipeline to obtain brain graphs while at the same time not neglecting tracts inside lesioned regions of the brain (i.e., brain tumors). We combined two reconstruction methods, yielding two different tractograms and three connectivity matrices. Roughly, the first tractogram aims at reconstructing white matter fibers using non-contaminated diffusion signal, while the second one carefully assesses the presence of meaningful diffusion signal in perilesional and lesioned tissue. Later, for each tractogram, a personalized connectivity matrix can be obtained and combined to yield a unique abstraction of the brain in surgical contexts. A schematic workflow of the pipeline is in Fig.3a, and a detailed account of the parameters is in Table2.

The first branch of the method consisted of a well-validated set of steps to reconstruct the network without considering lesioned regions of the brain. To ensure this was the case, we used a binary brain mask that did not include the segmented lesion (i.e., we subtracted the lesion from the brain binary mask). This step was added for consistency with the logic of not tracking within the lesion. Nonetheless, the steps were repeated without this mask and the results were found to be almost identical (Fig. S6). This was expected as multi-shell methods highly disregard cerebrospinal fluid contamination inside the lesion15. The lesion mask was added to the 5 tissue-type image to be considered as pathological tissue78. Within this mask, for each b-value shell and tissue type (white matter, gray matter, and cerebrospinal fluid) a response function was estimated79; and the fiber orientation distribution functions (FODs) were built and intensity normalized using a multi-shell multi-tissue (MSMT) constrained spherical deconvolution approach80. Within the same binary mask excluding potentially damaged tissue, anatomically constrained whole-brain probabilistic tractography was performed using dynamic seeding, backtracking, and the iFOD2 algorithm68,81. The total number of streamlines was set to 8 million minus the number of streamlines intersecting the lesion (see below). We used spherical-deconvolution informed filtering to assign a weight to each generated streamline and assess their contribution to the initial diffusion image82. Finally, a healthy structural connectivity matrix was constructed by counting the number of weighted streamlines between each pair of regions of interest as delineated by the third version of the Automated Anatomical Label atlas83.

Next, to consider fiber bundles that originate and traverse lesioned tissue, a recent method for reconstruction was used only in the segmented lesion17. The coined Single-Shell-3-Tissue Constrained Spherical Deconvolution (SS3T) algorithm uses only one diffusion shell and the unweighted b=0 volumes. We used the shell with the highest gradient strength (i.e., b=2800s/mm2) as it offered the best white-matter contrast15,80. These FODs were reconstructed, and intensity normalized only inside the lesion mask using the same underlying response function as estimated earlier in the healthy tissue. We merged the reconstructed FODs with the previously obtained with the multi-shell algorithm (Fig.3a CENTER). It is important to note that both images were in NIFTI format, co-registered, and non-overlapping, therefore making this step straightforward. Anatomical constraints were no longer suited since white- and gray-matter are compromised inside the lesion and in the perilesional tissue. Even more, regardless of the FOD reconstruction procedure, the anatomical constraints caused fibers to stop around the edema since those surrounding voxels were (nearly-)always segmented as gray matter (see Fig. S6). We used dynamic seeding only within the masked lesion and whole-brain probabilistic tractography with backtracking to estimate white-matter fibers within the whole-brain mask68,81. The number of streamlines was set as the average number of streamlines intersecting the lesion in the healthy cohort. We superimposed the lesion on the tractograms of each control subject and tallied the overlapping streamlines78. This was important given that each lesion was in a different location and the natural density of streamlines in that specific location differed. This subject-specific streamline count controlled that the tract densities were comparable to the healthy cases (Fig.3be; see also Figs. S10S13). Spherical-deconvolution informed filtering82 was applied to ensure that each streamline adequately contributed to the lesioned diffusion signal (i.e., filtering was applied inside the lesion mask). Then, a lesion structural connectivity matrix was constructed similarly to the previous case.

$${N}_{{streamlines}, {in}, {lesion}}=frac{1}{{N}_{{control}}}mathop{sum }limits_{i=1}^{{N}_{{control}}}mathop{sum }limits_{{streamline}=1}^{{streamlines}}1{{{{{rm{cdot }}}}}}left{begin{array}{c}1;{{{{{rm{if}}}}}},{streamline}in {{{{{rm{Lesion}}}}}}hfill\ 0;{{{{{rm{otherwise}}}}}}hfillend{array}right.$$

(7)

Finally, we merged the two available connectivity matrices to reconstruct a lesioned structural brain network. To do so, we employed a greedy approach where we took the maximum connectivity strength for each pair of regions:

$${omega }_{{ij}}=max left({omega }_{{ij}}^{{{{{{rm{healthy}}}}}}},{omega }_{{ij}}^{{{{{{rm{lesion}}}}}}}right)$$

(8)

Thus, for each pre-operative scan, a total of 3 different connectivity matrices were available: the healthy connections, the (potentially) lesioned connections, and the full lesioned structural network. The networks from the control subjects and post-operative scans from patients were reconstructed using a usual multi-shell multi-tissue pipeline without the binary lesion-free mask but with the same parameters (see Table2). Note that the 3rd version of the Automated Anatomically Labeled Atlas has 4 empty regions out of 170 to maximize compatibility with the previous versions.

As suggested by previous works, guiding learning with healthy cohorts should be useful to inform predictions43,44,45. Brain graphs are notoriously heterogeneous when considering age-related differences. To take this into account, we selected subjects with significant age overlap between healthy subjects and patients in both tumor types. However, we did not consider sex segregation, since structural differences are rather unclear84; even more, the sample size for each subgroup would be severely reduced. We built a prior probability distribution of healthy links to guide the predictions using a thresholded average of the set of connections present in this healthy cohort (see Supplementary Material). This thresholded average allowed us to control for the inclusion (or exclusion) of spurious connections, while minimizing the false positive rate of connections85.

For each reconstructed network, a total of 13695 normalized edges needed to be reconstructed, thus making the problem ill-posed. Nonetheless, as argued in the introduction, we hypothesized that a fully connected network adequately guided with anatomical information could capture essential properties (see Supplementary Material). We evaluated the model using Leave One Out Cross Validation, therefore, training and testing a total of 19 models or 19 folds.

The high number of reconstructed fibers yielded high values for the connectivity between ROIs (~103). To prevent numerical overflow as well as to enhance differences in lower connections, all weights w were normalized by computing (log left(1+omega right)) before feeding them into the artificial deep neural network.

The model consisted of a 1 hidden layer deep neural network which was trained minimizing the Mean Squared Error (MSE) between the output and the ground truth determined from the MRIs (see Supplementary Material). The weights were optimized using stochastic gradient descent with a learning rate of 0.01 and 100 epochs to avoid overfitting. Evaluation metrics included the Mean Absolute Error (MAE), Pearson Correlation Coefficient (PCC) and the Cosine Similarity (CS) between the flattened predicted and ground truth graphs. The topology of the generated networks was evaluated by computing the Kullback-Leiber and Jensen-Shannon divergences between the weight probability distributions of the generated and real graphs.

Leave One Out cross-validation was done using 18 connectomes to train each one of the 19 models. For each model, the training data was randomly split into train (80%) and validation (20%) sets to prevent overfitting. Validation steps were run every 20 training epochs. For each fold, the testing of each model was done in the left-out connectome (TableS1).

Statistical tests and p-value computations were done with Scipys stats module and in-house permutation scripts. Unless stated otherwise, we used one-tailed hypotheses only when addressing the significance of strictly positive magnitudes combined with non-parametric methods. Non-negative magnitudes cannot be tested for negative results and do not need to satisfy normality.

The Leave One Out cross-validation approach yielded a pool of 19 subjects that were independently tested. For each metric, we computed the z-score by subtracting the mean and dividing by the standard deviation of the sample. Despite verifying that all of them were normally distributed, we ran parametric and non-parametric statistical tests due to the small sample size. Topological metrics were computed using the Networkx Python library86. Since the brain graphs were weighted, we computed a weight probability distribution instead of the more common degree distribution (see Supplementary Material). To compare the weight probability distributions of two graphs, we computed the Kullback-Leiber as well as the Jensen-Shannon divergences. The Jensen-Shannon divergence has the advantage of being both symmetric and normalized between 0 and 1 therefore interpretable as a distance between two distributions (i.e., predicted vs ground truth).

Further information on research design is available in theNature Portfolio Reporting Summary linked to this article.

Continued here:
Functional and structural reorganization in brain tumors: a machine learning approach using desynchronized functional ... - Nature.com

Related Posts

Comments are closed.