Category Archives: Machine Learning
Functional and structural reorganization in brain tumors: a machine learning approach using desynchronized functional … – Nature.com
Acquisition and usage of MRIs
A detailed explanation of the participants as well as the acquisition of the data is already available63; nonetheless, for the sake of transparency, we briefly present some crucial aspects. Subjects were asked to undergo MR scans both in pre- and post-surgery sessions. Out of the 36 subjects that agreed to take part in the pre-surgery session (11 healthy [58.6 ± 10.6 years], 14 meningioma [60.4 ± 12.3 years] and 11 glioma [47.5 ± 11.3 years]), 28 were scanned after surgery (10 healthy [59.6 ± 10.3 years], 12 meningioma [57.9 ± 11.0 years] and 7 glioma [50.7 ± 11.7 years]). The post-surgery scan session took place during the first medical consultation at the hospital after the surgical intervention (mean: 7.9 months, range: 5.2–10.7 months). There were no differences in the time intervals between the groups (meningioma [243 ± 12 days], glioma [223 ± 15 days], p = 0.328, two-tailed U-test). As a result, 19 pre- and post-surgery pairs of structural connectomes were usable as training and testing data. All brain tumors were classified as grade I, II, and III according to the World Health Organization. All ethical regulations relevant to human research participants were followed63.
Each MR session consisted of a T1-MPRAGE anatomical scan (160 slices, TR = 1750 ms, TE = 4.18 ms, field of view = 256 mm, flip angle = 9°, voxel size 1 × 1 × 1 mm³, acquisition time of 4:05 min) followed by a multi-shell HARDI acquisition (60 slices, TR = 8700 ms, TE = 110 ms, field of view = 240 mm, voxel size 2.5 × 2.5 × 2.5 mm³, acquisition time of 15:14 min, 101–102 directions at b = 0, 700, 1200, 2800 s/mm²) together with two reversed phase-encoding b = 0 s/mm² blips to correct susceptibility-induced distortions64. Resting-state functional echo-planar imaging data were obtained (42 slices, TR = 2100 ms, TE = 27 ms, field of view = 192 mm, flip angle = 90°, voxel size 3 × 3 × 3 mm³, acquisition time of 6:24 min). The TR was accidentally changed to 2400 ms after 4 control subjects, 5 meningioma patients and 2 glioma patients were scanned, changing the acquisition time to 7:19 min. For all the subsequent Fourier analyses, this TR mismatch was resolved by adding zero padding and truncating the shorter time series to ensure that equivalent spectra were sampled by the Python methods (for further details see Supplementary Material).
Additionally, segmented lesions including the edema, non-enhancing, enhancing, and necrotic areas were available. Tumor masks were obtained with a combination of manual delineation, disconnectome63, and the Unified Segmentation with Lesion toolbox4. To identify the tumor core of gliomas, two clinicians with more than thirty and ten years of experience, respectively, performed and independently validated the segmentations using 3D Slicer. Data only allowed for the identification of the tumor cores; hence we subtracted the resulting cores from the whole lesion to obtain a non-necrotic region for each of the patients diagnosed with a glioma-like tumor.
High-resolution anatomical T1-weighted images were skull-stripped65, corrected for bias field inhomogeneities66, registered to MNI space67, and segmented into 5 tissue-type images68. Diffusion-weighted images suffer from many artifacts, all of which were appropriately corrected. Images were also skull-stripped65, corrected for susceptibility-induced distortions64, denoised69, freed from Gibbs ringing artifacts70 and corrected for eddy-current and motion artifacts71. The preprocessed images were then co-registered to their corresponding anatomical template (already in MNI space)67, resampled to a 1.5 mm³ voxel size and eventually corrected for bias field inhomogeneities66. After motion correction as well as registration to the MNI template, the B-matrix was appropriately rotated72.
Functional data were preprocessed with fMRIPrep73 and the eXtensible Connectivity Pipeline (XCP-D)74, two BIDS-compatible apps that perform all recommended processing steps to correct for distortion artifacts in functional data. Regression of global signal has been shown to improve denoising in BOLD series without excessive loss of community structure75. In total, 36 nuisance regressors were selected from the nuisance confound matrices of the fMRIPrep output, which included six motion parameters, global signal, the mean white matter and mean CSF signals with their temporal derivatives, and the quadratic expansion of the six motion parameters, tissue signals and their temporal derivatives76. Volumes with framewise displacement higher than 0.3 mm were regressed out. Although smoothed time series were available, our analysis did not consider them. All specific steps were common to all subjects, both controls and brain tumor patients. All images (T1s, T1 segmentations, diffusion, lesion masks and functional) were eventually co-registered to MNI space for comparison.
BOLD signals belonging to the DMN were identified with the Gordon functional Parcellation77. More precisely, each one of the 41 regions classified as Default by the parcellation image was used as a binary mask to extract the time series from the functional image. For each subject (patient and control), the pair-wise Pearson correlation coefficient between time series was computed to obtain a functional connectivity matrix. The spatial overlap between DMNs and tumor masks was computed by summing all the voxels in the lesion mask belonging to one of these 41 regions. To normalize this score, we divided the resulting number by the number of voxels belonging to each one of the 41 regions labeled as Default. Note that, with this definition, an overlap of 1 would mean the presence of a tumor the size of the entire DMN.
$$\mathrm{Overlap}=\frac{\left|Tumor\cap DMN\right|}{\left|DMN\right|}$$
(1)
Moreover, the spatial distance between the tumor's center of mass and the DMN was computed by averaging the Euclidean distances to the center of mass of each one of the DMN nodes.
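As a minimal illustration of Eq. (1) and of the distance measure just described, the NumPy/SciPy sketch below assumes the tumor mask and the DMN parcellation are already co-registered 3D arrays; the function and variable names are illustrative and not part of the original pipeline.

```python
import numpy as np
from scipy import ndimage

def dmn_overlap_and_distance(tumor_mask, parcellation, default_ids):
    """Overlap score (Eq. 1) and mean distance from the tumor to the DMN nodes.

    tumor_mask  : 3D binary array (1 inside the lesion).
    parcellation: 3D integer array with one label per region.
    default_ids : the 41 labels classified as Default in the parcellation.
    """
    dmn_mask = np.isin(parcellation, list(default_ids))
    overlap = np.logical_and(tumor_mask > 0, dmn_mask).sum() / dmn_mask.sum()

    # Mean Euclidean distance between the tumor's center of mass and the
    # center of mass of each DMN node (expressed here in voxel units).
    tumor_com = np.array(ndimage.center_of_mass(tumor_mask))
    node_coms = [np.array(ndimage.center_of_mass(parcellation == i)) for i in default_ids]
    mean_dist = float(np.mean([np.linalg.norm(tumor_com - c) for c in node_coms]))
    return overlap, mean_dist
```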
The DMN of the patients was compared to the mean of the healthy networks with two different metrics to assess (1) differences node-wise and (2) the Richness of the networks. Node similarity was assessed by computing the mean Pearson correlation between the same nodes in two different networks. For that, each row in the adjacency matrices was treated as a vector and compared with the same row of all matrices from the healthy subjects. After iterating through all nodes in the DMN, the mean and standard errors were computed for comparison. Furthermore, to assess the complexity of a given network, we computed the absolute difference between the distribution of correlations building the network and a uniform distribution30. We refer to this score as Richness:
$$\Theta =1-\frac{m}{2(m-1)}\sum_{\mu =1}^{m}\left|P_{\mu }\left(r_{ij}\right)-\frac{1}{m}\right|$$
(2)
where m = 15 is the number of bins of the histogram estimating the distribution of correlations in the network, P_μ(r_ij). Zamora-López and colleagues showed the robustness of the quantity in Eq. (2) with regard to the value of the parameter m. However, sensible choices range from 10 to 20 to ensure a sufficiently rich approximation of P_μ(r_ij). The changes in richness across patients were obtained by computing the difference relative to the richness of the mean DMN obtained from control subjects: ΔΘ = Θ_Patient − Θ_Healthy.
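A minimal sketch of Eq. (2) is given below, assuming the DMN functional connectivity matrix has already been computed from the regional time series; the histogram range of [−1, 1] for the correlations is an assumption of this sketch.

```python
import numpy as np

def richness(fc_matrix, m=15):
    """Richness (Eq. 2): how far the correlation distribution is from uniform."""
    # Off-diagonal correlations r_ij of the network
    r = fc_matrix[np.triu_indices_from(fc_matrix, k=1)]
    # Histogram estimate of P_mu(r_ij) over m bins spanning [-1, 1]
    p, _ = np.histogram(r, bins=m, range=(-1.0, 1.0))
    p = p / p.sum()
    return 1.0 - m / (2.0 * (m - 1.0)) * np.abs(p - 1.0 / m).sum()

# Change in richness relative to the mean healthy DMN:
# delta_theta = richness(fc_patient) - richness(fc_healthy_mean)
```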
A similar procedure was followed to study BOLD signals inside the lesioned tissue. For each patient, the binary mask containing the edema was used to extract the time series from the patient, as well as from all control subjects. Consequently, BOLD signals in lesioned regions of the brain were comparable to 11 healthy signals from the same region. No network was computable in this case, making the use of Eq. (2) pointless.
To compare time series between subjects, we computed the Real Fast Fourier Transform of the BOLD series. This allowed us to compare the power spectrum of two or more signals regardless of, for example, the dephasing between them. Let A_ω be the amplitude of the component with frequency ω. Then, the total power of the signal can easily be obtained by summing the squared amplitudes of all the components:
$$P_{T}=\sum_{\forall \omega }\left|A_{\omega }\right|^{2}$$
(3)
With the Fourier decomposition, we could also characterize the power distribution of the signals as a function of the frequency. Analogous to Eq. (3), we summed the squared amplitudes corresponding to frequencies inside a bin of width Δω.
$$P_{\omega_{c}}=\frac{100}{P_{T}}\cdot \sum_{\forall \omega \in [\omega_{c}-\Delta \omega ,\,\omega_{c}]}\left|A_{\omega }\right|^{2}$$
(4)
Since each signal had a different P_T, to compare between subjects and/or regions, we divided the result by the total power P_T and multiplied by 100 to make it a percentage. Arbitrarily, we chose the parameter Δω for each subject so that each bin included 10% of the total power. The qualitative results did not depend on the exact choice of the bin width.
Similarly, we computed the cumulative power distribution CP_ω by summing all the squared amplitude coefficients up to a certain threshold. For consistency, we measured CP_ω as a percentage score and chose the thresholds to be multiples of exact percentages (i.e., ω′ ∝ 10%, 20%, …).
$$CP_{\omega_{c}}=\frac{100}{P_{T}}\cdot \sum_{\forall \omega \in [0,\,\omega_{c}]}\left|A_{\omega }\right|^{2}$$
(5)
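The following sketch shows how the total power of Eq. (3) and the cumulative power distribution of Eq. (5) can be obtained with NumPy's real FFT. Preprocessing is assumed to have been done already, and the optional n_fft argument is only meant to illustrate the zero-padding used to reconcile the two TRs; it is not the authors' exact implementation.

```python
import numpy as np

def cumulative_power(bold, n_fft=None):
    """Percentage cumulative power CP_w (Eq. 5) of one BOLD time series.

    Setting n_fft larger than len(bold) zero-pads the signal so that series
    acquired with different TRs are evaluated on comparable frequency grids.
    """
    amps = np.fft.rfft(bold, n=n_fft)        # real FFT coefficients A_w
    power = np.abs(amps) ** 2                # |A_w|^2 per frequency component
    total = power.sum()                      # total power P_T, Eq. (3)
    return 100.0 * np.cumsum(power) / total  # CP_w as a percentage
```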
Both the power distribution P_ω and the cumulative power distribution CP_ω can be used to compare dynamics between time series, but they have the inconvenience of not being scalar numbers. Furthermore, computing any distance-like metric (i.e., KL divergence) between these distributions across subjects would not yield any information on whether BOLD signals had slower dynamics (more power located in low frequencies) or the opposite (i.e., DMN in healthy subjects and patients).
To overcome this, we designed a DAS between time series based on the difference between two cumulative power distributions. It is worth noting that in the limit Δω → 0, the summations in Eqs. (3), (4), and (5) become integrals, simplifying the following mathematical expressions. The DAS between two BOLD signals (i, j) was computed as the area between the two cumulative power distributions:
$$DAS\left(i,j\right)=\int d\omega \left(CP_{\omega }^{i}-CP_{\omega }^{j}\right)=-DAS\left(j,i\right)$$
(6)
Finding a positive DAS(i, j) would mean that time series i had slower dynamics than time series j, since more power is accumulated in lower frequencies with respect to the total. Throughout this manuscript, DASs were defined as the difference in power distribution between patients and the healthy cohort. For a simplified and, hopefully, comprehensive example, we kindly refer the reader to Fig. S1. To characterize a specific DMN, all these measures were computed for each region separately and then averaged [mean ± SEM]. As opposed to the Richness, the DAS was computable both for DMNs and tumors since it only required two temporal series rather than a complete distribution. To compute absolute values of this score, the DAS for each region (or tumor) was made strictly positive. Only then was the average across regions and subjects performed. Notably, these two operations are not interchangeable.
For the score defined in Eq. (6) to make sense, the Real Fast Fourier Transform of the time series needed to be computed using the same frequency intervals, which, in short, implied that the time duration of the signals needed to be equal. For functional images with different TRs, this was solved by adding zero-padding to the shortest signal to match the same time duration (Fig. S14). Further permutation analyses on a reduced subset of subjects with identical TRs confirmed the tendencies reported in the text (Fig. S15).
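As a compact, self-contained sketch of Eq. (6), the function below approximates the integral by a discrete sum over the shared frequency bins (the constant bin width is omitted, which preserves the sign and relative magnitude); the n_fft argument stands in for the zero-padding described above and is an illustrative choice, not the authors' implementation.

```python
import numpy as np

def das(bold_i, bold_j, n_fft=None):
    """DAS (Eq. 6): signed area between two cumulative power distributions.

    Positive values indicate that series i concentrates relatively more
    power at low frequencies (slower dynamics) than series j.
    """
    def cum_power(x):
        p = np.abs(np.fft.rfft(x, n=n_fft)) ** 2  # |A_w|^2
        return 100.0 * np.cumsum(p) / p.sum()     # CP_w as a percentage (Eq. 5)

    # Discrete approximation of the integral over the common frequency grid
    return float(np.sum(cum_power(bold_i) - cum_power(bold_j)))
```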
To ensure a detailed subject-specific network, we used a state-of-the-art pipeline to obtain brain graphs while at the same time not neglecting tracts inside lesioned regions of the brain (i.e., brain tumors). We combined two reconstruction methods, yielding two different tractograms and three connectivity matrices. Roughly, the first tractogram aims at reconstructing white matter fibers using non-contaminated diffusion signal, while the second one carefully assesses the presence of meaningful diffusion signal in perilesional and lesioned tissue. Later, for each tractogram, a personalized connectivity matrix can be obtained and combined to yield a unique abstraction of the brain in surgical contexts. A schematic workflow of the pipeline is in Fig. 3a, and a detailed account of the parameters is in Table 2.
The first branch of the method consisted of a well-validated set of steps to reconstruct the network without considering lesioned regions of the brain. To ensure this was the case, we used a binary brain mask that did not include the segmented lesion (i.e., we subtracted the lesion from the brain binary mask). This step was added for consistency with the logic of not tracking within the lesion. Nonetheless, the steps were repeated without this mask and the results were found to be almost identical (Fig. S6). This was expected, as multi-shell methods largely disregard cerebrospinal fluid contamination inside the lesion15. The lesion mask was added to the 5 tissue-type image to be considered as pathological tissue78. Within this mask, for each b-value shell and tissue type (white matter, gray matter, and cerebrospinal fluid) a response function was estimated79, and the fiber orientation distribution functions (FODs) were built and intensity normalized using a multi-shell multi-tissue (MSMT) constrained spherical deconvolution approach80. Within the same binary mask excluding potentially damaged tissue, anatomically constrained whole-brain probabilistic tractography was performed using dynamic seeding, backtracking, and the iFOD2 algorithm68,81. The total number of streamlines was set to 8 million minus the number of streamlines intersecting the lesion (see below). We used spherical-deconvolution informed filtering to assign a weight to each generated streamline and assess their contribution to the initial diffusion image82. Finally, a healthy structural connectivity matrix was constructed by counting the number of weighted streamlines between each pair of regions of interest as delineated by the third version of the Automated Anatomical Labelling atlas83.
Next, to consider fiber bundles that originate in and traverse lesioned tissue, a recent reconstruction method was used only in the segmented lesion17. The Single-Shell 3-Tissue Constrained Spherical Deconvolution (SS3T) algorithm uses only one diffusion shell and the unweighted b = 0 volumes. We used the shell with the highest gradient strength (i.e., b = 2800 s/mm²) as it offered the best white-matter contrast15,80. These FODs were reconstructed and intensity normalized only inside the lesion mask using the same underlying response function as estimated earlier in the healthy tissue. We merged the reconstructed FODs with those previously obtained with the multi-shell algorithm (Fig. 3a, center). It is important to note that both images were in NIFTI format, co-registered, and non-overlapping, therefore making this step straightforward. Anatomical constraints were no longer suited since white and gray matter are compromised inside the lesion and in the perilesional tissue. Moreover, regardless of the FOD reconstruction procedure, the anatomical constraints caused fibers to stop around the edema since the surrounding voxels were (nearly) always segmented as gray matter (see Fig. S6). We used dynamic seeding only within the masked lesion and whole-brain probabilistic tractography with backtracking to estimate white-matter fibers within the whole-brain mask68,81. The number of streamlines was set as the average number of streamlines intersecting the lesion in the healthy cohort. We superimposed the lesion on the tractograms of each control subject and tallied the overlapping streamlines78. This was important given that each lesion was in a different location and the natural density of streamlines in that specific location differed. This subject-specific streamline count ensured that the tract densities were comparable to the healthy cases (Fig. 3b–e; see also Figs. S10–S13). Spherical-deconvolution informed filtering82 was applied to ensure that each streamline adequately contributed to the lesioned diffusion signal (i.e., filtering was applied inside the lesion mask). Then, a lesion structural connectivity matrix was constructed similarly to the previous case.
$$N_{streamlines\ in\ lesion}=\frac{1}{N_{control}}\sum_{i=1}^{N_{control}}\sum_{streamline=1}^{streamlines}\begin{cases}1 & \mathrm{if}\ streamline\in Lesion\\ 0 & \mathrm{otherwise}\end{cases}$$
(7)
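A possible NumPy implementation of the count in Eq. (7) is sketched below, assuming each control tractogram is available as a list of streamlines expressed in voxel coordinates already aligned with the lesion mask; the data layout and function name are illustrative.

```python
import numpy as np

def target_streamline_count(control_tractograms, lesion_mask):
    """Average number of streamlines intersecting the lesion (Eq. 7).

    control_tractograms : list (one entry per control subject) of lists of
        streamlines, each an (N, 3) array of voxel coordinates assumed to
        lie inside the lesion-mask volume.
    lesion_mask : 3D binary array.
    """
    counts = []
    for tractogram in control_tractograms:
        n_in_lesion = 0
        for streamline in tractogram:
            idx = np.round(streamline).astype(int)
            if lesion_mask[idx[:, 0], idx[:, 1], idx[:, 2]].any():
                n_in_lesion += 1
        counts.append(n_in_lesion)
    return int(np.mean(counts))
```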
Finally, we merged the two available connectivity matrices to reconstruct a lesioned structural brain network. To do so, we employed a greedy approach where we took the maximum connectivity strength for each pair of regions:
$$\omega_{ij}=\max \left(\omega_{ij}^{\mathrm{healthy}},\,\omega_{ij}^{\mathrm{lesion}}\right)$$
(8)
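Eq. (8) amounts to an element-wise maximum over the two connectivity matrices, as in the short sketch below (matrix names are illustrative).

```python
import numpy as np

def merge_connectomes(w_healthy, w_lesion):
    """Greedy merge (Eq. 8): keep the stronger connection for each pair of regions."""
    return np.maximum(w_healthy, w_lesion)
```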
Thus, for each pre-operative scan, a total of 3 different connectivity matrices were available: the healthy connections, the (potentially) lesioned connections, and the full lesioned structural network. The networks from the control subjects and the post-operative scans from patients were reconstructed using a standard multi-shell multi-tissue pipeline without the binary lesion-free mask but with the same parameters (see Table 2). Note that the third version of the Automated Anatomical Labelling atlas has 4 empty regions out of 170 to maximize compatibility with the previous versions.
As suggested by previous works, guiding learning with healthy cohorts should be useful to inform predictions43,44,45. Brain graphs are notoriously heterogeneous when considering age-related differences. To take this into account, we selected subjects with significant age overlap between healthy subjects and patients for both tumor types. However, we did not consider sex segregation, since structural differences are rather unclear84; moreover, the sample size for each subgroup would have been severely reduced. We built a prior probability distribution of healthy links to guide the predictions using a thresholded average of the set of connections present in this healthy cohort (see Supplementary Material). This thresholded average allowed us to control for the inclusion (or exclusion) of spurious connections, while minimizing the false positive rate of connections85.
For each reconstructed network, a total of 13,695 normalized edges needed to be reconstructed, thus making the problem ill-posed. Nonetheless, as argued in the introduction, we hypothesized that a fully connected network adequately guided with anatomical information could capture essential properties (see Supplementary Material). We evaluated the model using leave-one-out cross-validation, therefore training and testing a total of 19 models, or 19 folds.
The high number of reconstructed fibers yielded high values for the connectivity between ROIs (~10³). To prevent numerical overflow as well as to enhance differences in weaker connections, all weights w were normalized by computing log(1 + w) before feeding them into the artificial deep neural network.
The model consisted of a neural network with a single hidden layer, trained by minimizing the Mean Squared Error (MSE) between the output and the ground truth determined from the MRIs (see Supplementary Material). The weights were optimized using stochastic gradient descent with a learning rate of 0.01 and 100 epochs to avoid overfitting. Evaluation metrics included the Mean Absolute Error (MAE), Pearson Correlation Coefficient (PCC) and the Cosine Similarity (CS) between the flattened predicted and ground truth graphs. The topology of the generated networks was evaluated by computing the Kullback-Leibler and Jensen-Shannon divergences between the weight probability distributions of the generated and real graphs.
Leave-one-out cross-validation was done using 18 connectomes to train each of the 19 models. For each model, the training data were randomly split into train (80%) and validation (20%) sets to prevent overfitting. Validation steps were run every 20 training epochs. For each fold, the testing of each model was done on the left-out connectome (Table S1).
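The sketch below illustrates the training setup described above (one hidden layer, MSE loss, SGD with a learning rate of 0.01, 100 epochs, leave-one-out folds) in PyTorch. The hidden width, the ReLU non-linearity and the omission of the 80/20 validation split are assumptions of this sketch, not the authors' exact configuration.

```python
import numpy as np
import torch
import torch.nn as nn

def loo_train(X, Y, hidden=1024, epochs=100, lr=0.01, seed=0):
    """Leave-one-out training of a one-hidden-layer network.

    X, Y : (n_subjects, n_edges) arrays of log(1 + w)-normalized pre- and
           post-surgery edge weights (13,695 edges per network).
    Returns the predictions for each held-out subject.
    """
    torch.manual_seed(seed)
    preds = []
    for test in range(len(X)):
        train = [i for i in range(len(X)) if i != test]
        x_tr = torch.tensor(X[train], dtype=torch.float32)
        y_tr = torch.tensor(Y[train], dtype=torch.float32)

        model = nn.Sequential(
            nn.Linear(X.shape[1], hidden),   # single hidden layer (width assumed)
            nn.ReLU(),
            nn.Linear(hidden, Y.shape[1]),
        )
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        loss_fn = nn.MSELoss()

        for _ in range(epochs):
            opt.zero_grad()
            loss = loss_fn(model(x_tr), y_tr)
            loss.backward()
            opt.step()

        with torch.no_grad():
            x_te = torch.tensor(X[[test]], dtype=torch.float32)
            preds.append(model(x_te).numpy().squeeze())
    return np.array(preds)
```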
Statistical tests and p-value computations were done with SciPy's stats module and in-house permutation scripts. Unless stated otherwise, we used one-tailed hypotheses only when addressing the significance of strictly positive magnitudes, combined with non-parametric methods. Non-negative magnitudes cannot be tested for negative results and do not need to satisfy normality.
The leave-one-out cross-validation approach yielded a pool of 19 subjects that were independently tested. For each metric, we computed the z-score by subtracting the mean and dividing by the standard deviation of the sample. Despite verifying that all of them were normally distributed, we ran both parametric and non-parametric statistical tests due to the small sample size. Topological metrics were computed using the NetworkX Python library86. Since the brain graphs were weighted, we computed a weight probability distribution instead of the more common degree distribution (see Supplementary Material). To compare the weight probability distributions of two graphs, we computed the Kullback-Leibler as well as the Jensen-Shannon divergences. The Jensen-Shannon divergence has the advantage of being both symmetric and normalized between 0 and 1, and is therefore interpretable as a distance between two distributions (i.e., predicted vs ground truth).
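A possible way to compute these divergences is sketched below, estimating each graph's weight probability distribution with a common histogram before comparing them; the number of bins and the small smoothing constant are assumptions of this sketch.

```python
import numpy as np
from scipy.stats import entropy

def weight_divergences(w_pred, w_true, bins=50, eps=1e-12):
    """KL and Jensen-Shannon divergences between two weight distributions."""
    iu = np.triu_indices_from(w_true, k=1)          # upper-triangular edge weights
    lo = min(w_pred[iu].min(), w_true[iu].min())
    hi = max(w_pred[iu].max(), w_true[iu].max())
    p, _ = np.histogram(w_pred[iu], bins=bins, range=(lo, hi))
    q, _ = np.histogram(w_true[iu], bins=bins, range=(lo, hi))
    p = (p + eps) / (p + eps).sum()                 # smoothed probabilities
    q = (q + eps) / (q + eps).sum()

    kl = entropy(p, q)                              # KL(predicted || ground truth)
    m = 0.5 * (p + q)
    js = 0.5 * entropy(p, m, base=2) + 0.5 * entropy(q, m, base=2)  # bounded in [0, 1]
    return kl, js
```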
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Continued here:
Functional and structural reorganization in brain tumors: a machine learning approach using desynchronized functional ... - Nature.com
Accelerating industrialization of Machine Learning at BMW Group using the Machine Learning Operations (MLOps … – AWS Blog
The BMW Group and Amazon Web Services (AWS) announced a strategic collaboration in 2020. The goal of that collaboration is to help further accelerate the BMW Group's pace of innovation by placing data and analytics at the center of its decision-making.
The BMW Group's Cloud Data Hub (CDH) manages company-wide data and data solutions on AWS. The CDH provides BMW Analysts and Data Scientists with access to data that helps drive business value through Data Analytics and Machine Learning (ML). As part of BMW's larger strategy to leverage the availability of data within CDH and to help accelerate the industrialization of Machine Learning, the BMW Group worked closely with AWS Professional Services to develop their Machine Learning Operations (MLOps) solution.
The BMW Group's MLOps solution includes (1) a reference architecture, (2) reusable Infrastructure as Code (IaC) modules that use Amazon SageMaker and Analytics services, (3) ML workflows using AWS Step Functions, and (4) a deployable MLOps template that covers the ML lifecycle from data ingestion to inference.
The MLOps solution supported the BMW Group in accelerating their industrialization of AI/ML use cases, resulting in significant value generation within the first two years after the solution's release. The long-term goal of BMW's MLOps solution team is to help accelerate industrialization of over 80% of the AI/ML use cases at the BMW Group, helping to enable continuous innovation and improvement in AI/ML at the BMW Group.
Starting in 2022, the MLOps solution has been rolled out to AI/ML use cases at the BMW Group. It has seen widespread adoption and recognition as the BMW internal master solution for MLOps.
In this blog, we talk about the BMW Group's MLOps solution, its reference architecture, high-level technical details, and the benefits to the AI/ML use case teams that develop and productionize ML models using the MLOps solution.
The MLOps solution has been developed to address the requirements of AI/ML use cases at the BMW Group. This includes integration with the BMW data lake, such as CDH, as well as ML workflow orchestration, data and model lineage, and governance requirements such as compliance, networking, and data protection.
AWS Professional Services and the MLOps solution team from the BMW Group collaborated closely with various AI/ML use cases to discover successful patterns and practices. This enabled the AWS and BMW Group MLOps solution teams to gain a comprehensive understanding of the technology stack, as well as the complexities involved in productionizing AI/ML use cases.
To meet the BMW Group's AI/ML use case requirements, the team worked backwards and developed the MLOps solution architecture shown in Figure 1 below.
Figure 1: MLOps Solution Architecture
In the section below, we explain the details of each component of the MLOps solution as represented in the MLOps solution architecture.
The MLOps template is a composition of IaC modules and ML workflows built using AWS managed services with a serverless-first strategy, designed to allow the BMW Group to benefit from the scalability, reduced maintenance costs, and agility of ML on AWS. The template is deployed to the AWS account of each AI/ML use case to create an end-to-end, deployable ML and infrastructure pipeline. This is designed to act as a starting point for building AI/ML use cases at the BMW Group.
The MLOps template offers functional capabilities for the BMW Group's Data Scientists and ML Engineers ranging from data import and exploration to training and deployment of ML models for inference. It supports the operations of AI/ML use cases at the BMW Group by offering version control as well as infrastructure and ML monitoring capabilities.
The MLOps solution is designed to offer functional and infrastructure capabilities for use cases as independent building blocks. These capabilities can be adopted by AI/ML use cases as a whole, or teams can choose selected blocks, to help the BMW Group meet their business goals.
Below is an overview of the MLOps Template building blocks offered by the BMW Group's MLOps solution:
Figure 2: MLOps Template building blocks
The MLOps solution provides Data Scientists and ML Engineers at the BMW Group with example notebooks to help ease their learning curve with AWS services. These example notebooks include:
The MLOps solution's training pipeline, developed using the AWS Step Functions Data Science Python SDK, consists of the required steps to train ML models, including data loading, feature engineering, model training, evaluation, and model monitoring.
Use case teams at the BMW Group have the flexibility to modify or expand the MLOps solution's training pipeline as required for their specific projects. Common customizations thus far have included parallel model training, simultaneous experiments, pre-production approval workflows, and monitoring and alert notifications via Amazon SNS integration.
The details of the MLOps solution's training pipeline steps are shown in Figure 3 below:
Figure 3: Training Pipeline
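As a rough illustration of how such a pipeline can be defined with the AWS Step Functions Data Science Python SDK, the sketch below chains a SageMaker training step and a model step into a workflow. The image URI, IAM roles, S3 paths, job and step names are placeholders, and the actual BMW pipeline contains additional steps (data loading, feature engineering, evaluation, monitoring) that are not shown here.

```python
import sagemaker
from stepfunctions.steps import TrainingStep, ModelStep, Chain
from stepfunctions.workflow import Workflow

# Placeholder roles, image and S3 locations -- use-case specific in practice.
SAGEMAKER_ROLE = "arn:aws:iam::111122223333:role/SageMakerExecutionRole"
WORKFLOW_ROLE = "arn:aws:iam::111122223333:role/StepFunctionsWorkflowRole"

estimator = sagemaker.estimator.Estimator(
    image_uri="<training-image-uri>",
    role=SAGEMAKER_ROLE,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://<bucket>/models/",
)

train_step = TrainingStep(
    "Train model",
    estimator=estimator,
    data={"train": sagemaker.inputs.TrainingInput("s3://<bucket>/features/train/")},
    job_name="example-training-job",  # placeholder job name
)
model_step = ModelStep("Save model", model=train_step.get_expected_model())

workflow = Workflow(
    name="example-mlops-training-pipeline",
    definition=Chain([train_step, model_step]),
    role=WORKFLOW_ROLE,
)
workflow.create()   # registers the state machine
workflow.execute()  # starts one training run
```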
The MLOps solution employs AWS CodePipeline to facilitate continuous integration and deployment workflows. The AWS CodePipeline sourcing steps allow users at the BMW Group to select their preferred source control, such as AWS CodeCommit or GitHub Enterprise.
AI/ML use case teams at the BMW Group can use AWS CodePipeline to help deploy the ML training pipeline, thereby bootstrapping the required AWS infrastructure for orchestrating the ML training pipeline, from reading data from the BMW Group data lake (e.g., CDH) to model training, evaluation, and ML model registration.
When the model training pipeline completes by registering the ML model in the Amazon SageMaker Model Registry, the MLOps solution uses Amazon EventBridge notifications to trigger AWS CodePipeline to deploy the inference module.
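A minimal sketch of such a trigger is shown below using boto3: an EventBridge rule that reacts to SageMaker Model Registry approval events and targets a CodePipeline pipeline. The rule name, ARNs and IAM role are placeholders, and the event pattern shown follows the standard SageMaker model-package event format rather than BMW's exact configuration.

```python
import json
import boto3

events = boto3.client("events")

# Placeholders: the pipeline and role ARNs are use-case specific.
PIPELINE_ARN = "arn:aws:codepipeline:eu-central-1:111122223333:inference-deploy"
EVENTS_ROLE = "arn:aws:iam::111122223333:role/EventBridgeInvokePipeline"

# Fire whenever a model package is approved in the model registry.
events.put_rule(
    Name="model-approved-trigger",
    EventPattern=json.dumps({
        "source": ["aws.sagemaker"],
        "detail-type": ["SageMaker Model Package State Change"],
        "detail": {"ModelApprovalStatus": ["Approved"]},
    }),
    State="ENABLED",
)
events.put_targets(
    Rule="model-approved-trigger",
    Targets=[{"Id": "deploy-inference", "Arn": PIPELINE_ARN, "RoleArn": EVENTS_ROLE}],
)
```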
Around 80% of AI/ML use cases at the BMW Group served by the MLOps solution require high-performance and high-throughput methods for transforming raw data and generating inferences from them. To meet these use case needs, the MLOps solution offers a batch inference pipeline with the required steps for users at the BMW Group to load and pre-process the raw data, generate predictions, monitor the predicted results for quality, and provide explainability.
Along with the batch inference pipeline, the AI/ML use case teams at the BMW Group are provided with the required modules to help set up real-time inference in case they require low latency predictions and API integration with external use case applications.
The details of the MLOps solution's batch inference pipeline steps are shown in Figure 4 below:
Figure 4: Inference Pipeline
The MLOps solution allows AI/ML use cases of the BMW Group to bring their own application stack in addition to the set of modules offered as part of the MLOps solution. This helps AI/ML use cases at the BMW Group make the necessary customizations for their business and technical needs.
The MLOps solution helped the AI/ML use cases of the BMW Group to build and deploy production-grade models, thereby reducing overall time to market by approximately 75%. The MLOps solution also offers a broad range of benefits to the BMW Group, including:
Learn more about BMW's Cloud Data Hub (CDH) in this blog post, explore AWS offerings at the AWS for Automotive page, or contact your AWS team today.
See the original post:
Accelerating industrialization of Machine Learning at BMW Group using the Machine Learning Operations (MLOps ... - AWS Blog
Getting Machine Learning Projects from Idea to Execution – The Machine Learning Times
Humanity's latest, greatest invention is stalling right out of the gate. Machine learning projects have the potential to help us navigate our most significant risks, including wildfires, climate change, pandemics, and child abuse. ML can boost sales, cut costs, prevent fraud, streamline manufacturing, and strengthen health care.
But ML initiatives routinely fail to deliver returns or fail to deploy entirely. They stall before deploying, and at great cost. One of the major issues is that companies tend to focus more on the technology than on how it should be deployed. This is like being more excited about the development of a rocket than its launch.
In this article, I offer an antidote: a six-step practice for ushering machine learning projects from conception to deployment that I call bizML. This framework is an effort to establish an updated, industry-standard playbook for running successful ML projects that is pertinent and compelling to both business professionals and data professionals.
ML's problem is in its popularity. For all the hoopla about the core technology, the gritty details of how its deployment improves business operations are often glossed over. In this way, ML is now too hot for its own good. After decades of consulting and running ML conferences, the lesson has sunk in.
Today's hype about ML is overzealous because it feeds a common misconception: the ML fallacy. It goes like this: Since ML algorithms can successfully generate models that hold up for new, unseen situations (which is both amazing and true), their models are intrinsically valuable (which is not necessarily true). The value of ML comes only when it creates organizational change, that is, when an ML-generated model is deployed to actively improve operations. Until a model is used to actively reshape how your organization works, it's use-less, literally. A model doesn't solve any business problems on its own, and it ain't gonna deploy itself. ML can be the disruptive technology it's cracked up to be, but only if you disrupt with it.
Unfortunately, businesses often fail to bridge the business/tech culture gap, a disconnect between data scientists and business stakeholders that precludes deployment and leads to models collecting dust. On the one hand, data scientists, who perform the model development step, fixate solely on data science and generally prefer not to be bothered with mundane managerial activities. Often, they take the deployment of their model for granted and jump past a rigorous business process that would engage stakeholders to collaboratively plan for deployment.
On the other hand, many business professionals, especially those already inclined to forgo the particulars as too technical, have been seduced into seeing this stunning technology as a panacea that solves problems on its own. They defer to data scientists for any project specifics. But when they're ultimately faced with the operational change that a deployed model would incur, it's a tough sell. Taken off guard, the stakeholder hesitates before altering operations that are key to the company's profitability.
With no one taking proactive ownership, the hose and the faucet fail to connect. Far too often, the data scientist delivers a viable model, but the operational team isn't ready for the pass and they drop the ball. There are wonderful exceptions and glowing successes, but the generally poor track record we witness today forewarns of broad disillusionment with ML, even a dreaded AI winter.
The remedy is to rigorously plan for deployment from the inception of each ML project. Laying the groundwork for the operational change that deployment would bring to fruition takes more preaching, socializing, cross-disciplinary collaboration, and change-management panache than many, including myself, initially realized.
To accomplish this, a knowledgeable team must collaboratively follow an end-to-end practice that begins by backward planning for deployment. As I mentioned above, I call this practice bizML, and it consists of the following six steps.
1. Define the business value proposition: how ML will affect operations in order to improve them (i.e., operationalization or implementation).
Example: UPS predicts which destination addresses will receive a package delivery in order to plan a more efficient delivery process.
2. Define what the ML model will predict for each individual case. Each detail matters from a business perspective.
Example: For each destination, how many packages across how many stops will be required tomorrow? For example, a group of three office buildings with 24 business suites at 123 Main St. will require two stops with three packages each by 8:30 a.m.
3. Determine the salient benchmarks to track during both model training and model deployment and determine what performance level must be achieved for the project to be considered a success.
Examples: Miles driven, gallons of fuel consumed, tons of carbon emitted, and stops-per-mile (the more densely a route is packed with deliveries, the more value is generated from each mile of driving).
4. Define what the training data must look like and get it into that form.
Example: Assemble a large number of positive and negative examples from which to learn both destinations that did receive deliveries on certain days and others that did not.
5. Generate a predictive model from the data. The model is the thing that's learned.
Examples: decision trees, logistic regression, neural networks, and ensemble models.
6. Use the model to render predictive scores (probabilities), thereby applying what's been learned to new cases, and then act on those scores to improve business operations.
Example: By accounting for predicted packages along with known packages, UPS improved its system that assigns packages to delivery trucks at shipping centers. This improvement annually saves an estimated 18.5 million miles, $35 million, 800,000 gallons of fuel, and 18,500 metric tons of emissions.
These six steps define a business practice that charts a shrewd path to ML deployment. Anyone who wishes to participate in ML projects must be familiar with them, no matter whether they're in a business or technical role.
After culminating with step 6, deployment, you have finished starting something new. BizML only begins an ongoing journey, a new phase of running improved operations and of keeping things working. Once launched, a model requires upkeep: monitoring it, maintaining it, and periodically refreshing it.
Following these six steps in this order is almost a logical inevitability. To understand why, let's start with the end. The final two culminating steps, steps 5 and 6, are the two main steps of ML: model training and deployment. BizML ushers the project through to their completion.
The step just before those two, Step 4: Prepare the data, is a known requirement that always precedes model training. You must provide ML software with data in the right form in order for it to work. That step has always been an integral part of modeling projects, ever since linear regression was first applied by businesses in the 1960s.
Before the technical magic, you must perform business magic. That's where the first three steps come in. They establish a greatly needed preproduction phase of pitching, socializing, and collaborating in order to jointly agree on how ML will be deployed and how its performance will be evaluated. Importantly, these first steps go much further than only agreeing on a project's business objective. They ask business professionals to dive into the mechanics that define exactly how predictions will alter operations, and they ask data scientists to reach beyond their usual sphere and work closely with business-side personnel. This cross-disciplinary team is uniquely equipped to navigate to a deployment plan that is both technically feasible and operationally viable.
Following all six of the steps of the bizML practice is uncommon, but hardly unheard of. Many ML projects succeed wildly, even if they're in the minority. While a well-known, established framework has been a long time coming, the ideas at the heart of the bizML framework are not new to many experienced data scientists.
And yet the folks who need it the most, business leaders and other business stakeholders, are least likely to be familiar with it. In fact, the business world in general has yet to become aware of even the need for a specialized business practice in the first place. This is understandable, since the common narrative leads them astray. AI is often oversold as an impenetrable yet exciting cure-all. Meanwhile, many data scientists far prefer to crunch numbers than to take pains to elucidate.
First things first: Business professionals need some edification. Before those in charge can participate in the bizML practice and, ultimately, green-light model deployment with confidence, they must gain a concrete understanding of how an ML project works from end to end: What will the model predict? Precisely how will those predictions affect operations? Which metric meaningfully tracks how well it predicts? And what kind of data is needed? This isn't the rocket science part, but it's still a modest book's worth.
Considering the innumerable dollars and resources pumped into ML, how much more potential value could we capture by adopting a universal procedure that facilitates the collaboration and planning needed to reach deployment? Let's find out.
This article is adapted from the book, The AI Playbook: Mastering the Rare Art of Machine Learning Deployment, with permission from the publisher, MIT Press. It is a product of the author's work while he held a one-year position as the Bodily Bicentennial Professor in Analytics at the UVA Darden School of Business.
View original post here:
Getting Machine Learning Projects from Idea to Execution - The Machine Learning Times
Artificial Intelligence and Machine Learning in Clinical Data Management: Opportunities and Ethical Considerations – PharmiWeb.com
Originally posted here:
Artificial Intelligence and Machine Learning in Clinical Data Management: Opportunities and Ethical Considerations - PharmiWeb.com
Can language models read the genome? This one decoded mRNA to make better vaccines. – EurekAlert
Image: Machine learning expert Mengdi Wang, in partnership with biotech startup RVAC Medicines, has developed a language model that used its powers of semantic representation to design a more effective mRNA vaccine for COVID-19. Credit: Photo by Sameer A. Khan/Fotobuddy
The same class of artificial intelligence that made headlines coding software and passing the bar exam has learned to read a different kind of text: the genetic code.
That code contains instructions for all of life's functions and follows rules not unlike those that govern human languages. Each sequence in a genome adheres to an intricate grammar and syntax, the structures that give rise to meaning. Just as changing a few words can radically alter the impact of a sentence, small variations in a biological sequence can make a huge difference in the forms that sequence encodes.
Now Princeton University researchers led by machine learning expert Mengdi Wang are using language models to home in on partial genome sequences and optimize those sequences to study biology and improve medicine. And those efforts are already underway.
In a paper published April 5 in the journal Nature Machine Intelligence, the authors detail a language model that used its powers of semantic representation to design a more effective mRNA vaccine, such as those used to protect against COVID-19.
Found in Translation
Scientists have a simple way to summarize the flow of genetic information. They call it the central dogma of biology. Information moves from DNA to RNA to proteins. Proteins create the structures and functions of living cells.
Messenger RNA, or mRNA, converts the information into proteins in that final step, called translation. But mRNA is interesting. Only part of it holds the code for the protein. The rest is not translated but controls vital aspects of the translation process.
Governing the efficiency of protein production is a key mechanism by which mRNA vaccines work. The researchers focused their language model there, on the untranslated region, to see how they could optimize efficiency and improve vaccines.
After training the model on a small variety of species, the researchers generated hundreds of new optimized sequences and validated those results through lab experiments. The best sequences outperformed several leading benchmarks for vaccine development, including a 33% increase in the overall efficiency of protein production.
Increasing protein production efficiency by even a small amount provides a major boost for emerging therapeutics, according to the researchers. Beyond COVID-19, mRNA vaccines promise to protect against many infectious diseases and cancers.
Wang, a professor of electrical and computer engineering and the principal investigator in this study, said the model's success also pointed to a more fundamental possibility. Trained on mRNA from a handful of species, it was able to decode nucleotide sequences and reveal something new about gene regulation. Scientists believe gene regulation, one of life's most basic functions, holds the key to unlocking the origins of disease and disorder. Language models like this one could provide a new way to probe.
Wang's collaborators include researchers from the biotech firm RVAC Medicines as well as the Stanford University School of Medicine.
The Language of Disease
The new model differs in degree, not kind, from the large language models that power today's AI chatbots. Instead of being trained on billions of pages of text from the internet, their model was trained on a few hundred thousand sequences. The model also was trained to incorporate additional knowledge about the production of proteins, including structural and energy-related information.
The research team used the trained model to create a library of 211 new sequences. Each was optimized for a desired function, primarily an increase in the efficiency of translation. Those proteins, like the spike protein targeted by COVID-19 vaccines, drive the immune response to infectious disease.
Previous studies have created language models to decode various biological sequences, including proteins and DNA, but this was the first language model to focus on the untranslated region of mRNA. In addition to a boost in overall efficiency, it was also able to predict how well a sequence would perform at a variety of related tasks.
Wang said the real challenge in creating this language model was in understanding the full context of the available data. Training a model requires not only the raw data with all its features but also the downstream consequences of those features. If a program is designed to filter spam from email, each email it trains on would be labeled spam or not spam. Along the way, the model develops semantic representations that allow it to determine what sequences of words indicate a spam label. Therein lies the meaning.
Wang said looking at one narrow dataset and developing a model around it was not enough to be useful for life scientists. She needed to do something new. Because this model was working at the leading edge of biological understanding, the data she found was all over the place.
"Part of my dataset comes from a study where there are measures for efficiency," Wang said. "Another part of my dataset comes from another study [that] measured expression levels. We also collected unannotated data from multiple resources." Organizing those parts into one coherent and robust whole, a multifaceted dataset that she could use to train a sophisticated language model, was a massive challenge.
"Training a model is not only about putting together all those sequences, but also putting together sequences with the labels that have been collected so far. This had never been done before."
The paper, "A 5' UTR Language Model for Decoding Untranslated Regions of mRNA and Function Predictions," was published in Nature Machine Intelligence. Additional authors include Dan Yu, Yupeng Li, Yue Shen and Jason Zhang, from RVAC Medicines; Le Cong from Stanford; and Yanyi Chu and Kaixuan Huang from Princeton.
Journal: Nature Machine Intelligence
Method of Research: Experimental study
Subject of Research: Cells
Article Title: A 5' UTR Language Model for Decoding Untranslated Regions of mRNA and Function Predictions
Article Publication Date: 5-Apr-2024
Read more:
Can language models read the genome? This one decoded mRNA to make better vaccines. - EurekAlert
18 of the best large language models in 2024 – TechTarget
Large language models are the dynamite behind the generative AI boom of 2023. However, they've been around for a while.
LLMs are black box AI systems that use deep learning on extremely large datasets to understand and generate new text. Modern LLMs began taking shape in 2014 when the attention mechanism -- a machine learning technique designed to mimic human cognitive attention -- was introduced in a research paper titled "Neural Machine Translation by Jointly Learning to Align and Translate." In 2017, that attention mechanism was honed with the introduction of the transformer model in another paper, "Attention Is All You Need."
Some of the most well-known language models today are based on the transformer model, including the generative pre-trained transformer series of LLMs and bidirectional encoder representations from transformers (BERT).
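For readers unfamiliar with the mechanism, the following minimal NumPy sketch shows scaled dot-product attention, the core operation introduced in "Attention Is All You Need"; it is purely illustrative and omits the multi-head projections and masking used in full transformer models.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                               # weighted sum of the value vectors
```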
ChatGPT, which runs on a set of language models from OpenAI, attracted more than 100 million users just two months after its release in 2022. Since then, many competing models have been released. Some belong to big companies such as Google and Microsoft; others are open source.
Constant developments in the field can be difficult to keep track of. Here are some of the most influential models, both past and present. Included in the list are models that paved the way for today's leaders as well as those that could have a significant effect in the future.
Below are some of the most relevant large language models today. They do natural language processing and influence the architecture of future models.
BERT is a family of LLMs that Google introduced in 2018. BERT is a transformer-based model that can convert sequences of data to other sequences of data. BERT's architecture is a stack of transformer encoders and features 342 million parameters. BERT was pre-trained on a large corpus of data and then fine-tuned to perform specific tasks such as natural language inference and sentence text similarity. It was used to improve query understanding in the 2019 iteration of Google search.
The Claude LLM focuses on constitutional AI, which shapes AI outputs guided by a set of principles that help make the AI assistant it powers helpful, harmless and accurate. Claude was created by the company Anthropic. The latest iteration of the Claude LLM is Claude 3.0.
Cohere is an enterprise AI platform that provides several LLMs including Command, Rerank and Embed. These LLMs can be custom-trained and fine-tuned to a specific company's use case. The company that created the Cohere LLM was founded by one of the authors of "Attention Is All You Need." One of Cohere's strengths is that it is not tied to one single cloud -- unlike OpenAI, which is bound to Microsoft Azure.
Ernie is Baidu's large language model, which powers the Ernie 4.0 chatbot. The bot was released in August 2023 and has garnered more than 45 million users. Ernie is rumored to have 10 trillion parameters. The bot works best in Mandarin but is capable in other languages.
Falcon 40B is a transformer-based, causal decoder-only model developed by the Technology Innovation Institute. It is open source and was trained on English data. The model is available in two smaller variants as well: Falcon 1B and Falcon 7B (1 billion and 7 billion parameters). Amazon has made Falcon 40B available on Amazon SageMaker. It is also available for free on GitHub.
Gemini is Google's family of LLMs that power the company's chatbot of the same name. The model replaced Palm in powering the chatbot, which was rebranded from Bard to Gemini upon the model switch. Gemini models are multimodal, meaning they can handle images, audio and video as well as text. Gemini is also integrated in many Google applications and products. It comes in three sizes -- Ultra, Pro and Nano. Ultra is the largest and most capable model, Pro is the mid-tier model and Nano is the smallest model, designed for efficiency with on-device tasks. Gemini outperforms GPT-4 on most evaluated benchmarks.
Gemma is a family of open-source language models from Google that were trained on the same resources as Gemini. Gemma comes in two sizes -- a 2 billion parameter model and a 7 billion parameter model. Gemma models can be run locally on a personal computer, and surpass similarly sized Llama 2 models on several evaluated benchmarks.
GPT-3 is OpenAI's large language model with more than 175 billion parameters, released in 2020. GPT-3 uses a decoder-only transformer architecture. In September 2022, Microsoft announced it had exclusive use of GPT-3's underlying model. GPT-3 is 10 times larger than its predecessor. GPT-3's training data includes Common Crawl, WebText2, Books1, Books2 and Wikipedia.
GPT-3 is the last of the GPT series of models in which OpenAI made the parameter counts publicly available. The GPT series was first introduced in 2018 with OpenAI's paper "Improving Language Understanding by Generative Pre-Training."
GPT-3.5 is an upgraded version of GPT-3 with fewer parameters. GPT-3.5 was fine-tuned using reinforcement learning from human feedback. GPT-3.5 is the version of GPT that powers ChatGPT. There are several models, with GPT-3.5 turbo being the most capable, according to OpenAI. GPT-3.5's training data extends to September 2021.
It was also integrated into the Bing search engine but has since been replaced with GPT-4.
GPT-4 is the largest model in OpenAI's GPT series, released in 2023. Like the others, it's a transformer-based model. Unlike the others, its parameter count has not been released to the public, though there are rumors that the model has more than 170 trillion. OpenAI describes GPT-4 as a multimodal model, meaning it can process and generate both language and images as opposed to being limited to only language. GPT-4 also introduced a system message, which lets users specify tone of voice and task.
GPT-4 demonstrated human-level performance in multiple academic exams. At the model's release, some speculated that GPT-4 came close to artificial general intelligence (AGI), which means it is as smart or smarter than a human. GPT-4 powers Microsoft Bing search, is available in ChatGPT Plus and will eventually be integrated into Microsoft Office products.
LaMDA (Language Model for Dialogue Applications) is a family of LLMs developed by Google Brain, announced in 2021. LaMDA used a decoder-only transformer language model and was pre-trained on a large corpus of text. In 2022, LaMDA gained widespread attention when then-Google engineer Blake Lemoine went public with claims that the program was sentient. It was built on the Seq2Seq architecture.
Large Language Model Meta AI (Llama) is Meta's LLM, released in 2023. The largest version is 65 billion parameters in size. Llama was originally released to approved researchers and developers but is now open source. Llama comes in smaller sizes that require less computing power to use, test and experiment with.
Llama uses a transformer architecture and was trained on a variety of public data sources, including webpages from CommonCrawl, GitHub, Wikipedia and Project Gutenberg. Llama was effectively leaked and spawned many descendants, including Vicuna and Orca.
Mistral is a 7 billion parameter language model that outperforms Llama's language model of a similar size on all evaluated benchmarks. Mistral also has a fine-tuned model that is specialized to follow instructions. Its smaller size enables self-hosting and competent performance for business purposes. It was released under the Apache 2.0 license.
Orca was developed by Microsoft and has 13 billion parameters, meaning it's small enough to run on a laptop. It aims to improve on advancements made by other open source models by imitating the reasoning procedures achieved by LLMs. Orca achieves the same performance as GPT-4 with significantly fewer parameters and is on par with GPT-3.5 for many tasks. Orca is built on top of the 13 billion parameter version of LLaMA.
The Pathways Language Model (PaLM) is a 540 billion parameter transformer-based model from Google powering its AI chatbot Bard. It was trained across multiple TPU v4 Pods -- Google's custom hardware for machine learning. PaLM specializes in reasoning tasks such as coding, math, classification and question answering. PaLM also excels at decomposing complex tasks into simpler subtasks.
PaLM gets its name from a Google research initiative to build Pathways, ultimately creating a single model that serves as a foundation for multiple use cases. There are several fine-tuned versions of PaLM, including Med-PaLM 2 for life sciences and medical information as well as Sec-PaLM for cybersecurity deployments to speed up threat analysis.
Phi-1 is a transformer-based language model from Microsoft. At just 1.3 billion parameters, Phi-1 was trained for four days on a collection of textbook-quality data. Phi-1 is an example of a trend toward smaller models trained on better quality data and synthetic data.
"We'll probably see a lot more creative scaling down work: prioritizing data quality and diversity over quantity, a lot more synthetic data generation, and small but highly capable expert models," wrote Andrej Karpathy, former director of AI at Tesla and OpenAI employee, in a tweet.
Phi-1 specializes in Python coding and has fewer general capabilities because of its smaller size.
StableLM is a series of open source language models developed by Stability AI, the company behind image generator Stable Diffusion. There are 3 billion and 7 billion parameter models available and 15 billion, 30 billion, 65 billion and 175 billion parameter models in progress at time of writing. StableLM aims to be transparent, accessible and supportive.
Vicuna is another influential open source LLM derived from Llama. It was developed by LMSYS and was fine-tuned using data from sharegpt.com. It is smaller and less capable than GPT-4 according to several benchmarks, but does well for a model of its size. Vicuna has only 33 billion parameters, whereas GPT-4 is rumored to have trillions.
Although LLMs are a recent phenomenon, their precursors go back decades. Learn how recent precursor Seq2Seq and distant precursor ELIZA set the stage for modern LLMs.
Seq2Seq is a deep learning approach used for machine translation, image captioning and natural language processing. It was developed by Google and underlies some of their modern LLMs, including LaMDA. Seq2Seq also underlies AlexaTM 20B, Amazon's large language model. It uses a mix of encoders and decoders.
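To make the encoder-decoder idea concrete, here is a minimal, illustrative Seq2Seq sketch in PyTorch; the vocabulary sizes, dimensions and toy tensors are arbitrary assumptions rather than details of any production model.

```python
# Toy Seq2Seq: an encoder compresses the source sequence into a hidden state,
# and a decoder generates the target sequence conditioned on that state.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.rnn = nn.GRU(hidden_size, hidden_size, batch_first=True)

    def forward(self, src):
        # src: (batch, src_len) token ids -> final hidden state summarizes the input
        _, hidden = self.rnn(self.embed(src))
        return hidden

class Decoder(nn.Module):
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.rnn = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, tgt, hidden):
        # tgt: (batch, tgt_len); the decoder is conditioned on the encoder's hidden state
        output, hidden = self.rnn(self.embed(tgt), hidden)
        return self.out(output), hidden

# Toy usage: two source sequences of length 5, target sequences of length 7
encoder = Encoder(vocab_size=1000, hidden_size=64)
decoder = Decoder(vocab_size=1000, hidden_size=64)
src = torch.randint(0, 1000, (2, 5))
tgt = torch.randint(0, 1000, (2, 7))
logits, _ = decoder(tgt, encoder(src))   # logits: (2, 7, 1000)
```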
Eliza was an early natural language processing program created in 1966. It is one of the earliest examples of a language model. Eliza simulated conversation using pattern matching and substitution. Running a certain script, Eliza could parody the interaction between a patient and therapist by applying weights to certain keywords and responding to the user accordingly. The creator of Eliza, Joseph Weizenbaum, wrote a book on the limits of computation and artificial intelligence.
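For a flavor of how such keyword matching and substitution works, here is a toy ELIZA-style sketch in Python; the rules are invented for illustration and are not Weizenbaum's original DOCTOR script.

```python
# Toy ELIZA-style responder: regular-expression keyword patterns are matched
# and substituted back into canned response templates.
import re

RULES = [
    (re.compile(r"\bI need (.+)", re.I), "Why do you need {0}?"),
    (re.compile(r"\bI am (.+)", re.I), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father|family)\b", re.I), "Tell me more about your {0}."),
]

def respond(utterance: str) -> str:
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(*match.groups())
    return "Please go on."  # default when no keyword matches

print(respond("I am feeling anxious about work"))
# -> "How long have you been feeling anxious about work?"
```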
View original post here:
18 of the best large language models in 2024 - TechTarget
When Students Get Lost in the Algorithm: The Problems with Nevada’s AI School Funding Experiment – New America
Nevada is not going to be the last state to integrate big data and predictive analytics into its school funding formula, so we need to think critically about how best to deploy these tools. There is value in efforts to use these resources to better target high-need students, and education funding models can certainly take full advantage of the growing data resources available. However, if school funding, and by extension the opportunities available to students, is directly linked to the outputs of a machine learning model, then that model must be designed with transparency, equity, and accountability in mind from the start. The methodology must be in the public domain, so that it can be evaluated for how fairly and how well it actually supports high-need students, which should include students from low-income backgrounds as well as those with other needs and challenges. Nevada's new policy falls short on all these fronts, leaving too many students out of the equation.
See more here:
When Students Get Lost in the Algorithm: The Problems with Nevada's AI School Funding Experiment - New America
Leash Biosciences Announces $9.3 Million Seed Financing to Pioneer AI-Driven Medicinal Chemistry – GlobeNewswire
Financing will support development of a foundational machine learning model of medicinal chemistry that can accurately predict small molecule drug candidates for any protein
Releases unprecedented dataset publicly to address critical challenges in drug discovery with machine learning
SALT LAKE CITY, April 05, 2024 (GLOBE NEWSWIRE) -- Leash Biosciences, an artificial intelligence and machine learning (AI/ML)-native biotechnology company unleashing machine learning to solve medicinal chemistry, today announced the completion of a $9.3 million seed financing round to advance its mission of revolutionizing medicinal chemistry through modern computational methods and massive biological data collection. The oversubscribed round was led by Springtide Ventures with participation from MetaPlanet, Top Harvest Capital, Mitsui Global Investment, MFV Partners, Recursion CEO and co-founder Chris Gibson, and Recursion co-founder Blake Borgeson.
Leash aims to develop a foundational and generalizable machine learning model of medicinal chemistry that can accurately predict small molecule drug candidates for any protein in silico, and more broadly, interactions between any protein and any chemical. To achieve this, Leash is producing bespoke, expansive datasets of protein targets binding to chemicals. To date, the Company has physically generated over 17 billion high-quality protein-chemical interaction measurements. In its new Salt Lake City headquarters, Leash plans to screen 500+ protein targets against many millions of machine learning-designed, proprietary chemicals by 2025.
"ML improvements in chess, Go, image recognition, language translation, text generation, and protein folding all were driven by the collection and curation of massive datasets. We believe a similar strategy will revolutionize how we approach medicinal chemistry," said Ian Quigley, CEO of Leash Biosciences. "We are thrilled to have the support of this group of top-tier investors who share our vision for transforming drug discovery through an ML-first approach."
To advance its machine learning engine, Leash will use the funding to scale its data collection and computational capabilities. The Company's ML engine will also support advancing multiple internal therapeutics programs toward in vivo studies.
"Leash's platform stands apart with its combined excellence in machine learning, experimental biology, and medicinal chemistry," said Claire Smith, Lead Investor at Springtide Ventures. "We are excited to back this exceptional team as they leverage cutting-edge tech to tackle the toughest drug discovery challenges."
Alexey Morgunov of MetaPlanet added, "Leash sits at the forefront of innovating the next paradigm of AI-driven, scalable, and rapid drug design. We are honored to partner with them as thought leaders in this space."
The Leash team comprises TechBio veterans with expertise spanning AI/ML, biology, and chemistry. Five of the company's six employees are former Recursion employees with experience building and scaling transformational drug discovery platforms. The team also brings experience from Eikon Therapeutics, Myriad Genetics, insitro Biosciences, LinkedIn, Stripe, and other leading technology and biotechnology players.
In parallel, Leash announced the launch of its inaugural machine learning Kaggle competition, the Big Encoded Library for Chemical Assessment (BELKA). Leveraging a dataset of unprecedented scale, BELKA sets out to address one of the most critical challenges in drug discovery: predicting the likelihood of chemical materials binding to pharmaceutically-relevant targets. The competition will be hosted on the Kaggle platform, the world's largest data science community.
"By providing participants with access to such a comprehensive dataset, we are empowering the global scientific community to develop innovative solutions that could revolutionize the way we identify potential drug candidates, said Ian Quigley, Leash Bio CEO.
About the Kaggle Competition: Predict New Medicines with BELKA (Big Encoded Library for Chemical Assessment)
BELKA aims to contribute to groundbreaking advancements in predictive modeling for pharmaceutical research by harnessing the capabilities of artificial intelligence and machine learning. Participants will be tasked with analyzing a vast dataset comprised of 133 million physically-measured activities for each of three key protein targets.
Leash rigorously produced a dataset that exceeds all existing small molecule binding datasets combined. With 133 million molecules screened against each protein and evaluated with deep sequencing coverage and many replicates, participants will have access to an unparalleled wealth of data in scale and depth. Importantly, this competition dataset is larger than the world's largest existing drug-target dataset (PubChem), providing a unique opportunity for groundbreaking insights and discoveries. It represents a small fraction of Leash's screening data.
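As a rough, hypothetical sketch of the kind of task BELKA poses (predicting whether a small molecule binds a target), the snippet below uses RDKit fingerprints and a random forest on a tiny in-memory table. The column names ("molecule_smiles", "binds"), the toy data and the pipeline are illustrative assumptions, not the competition's actual schema or a competitive solution.

```python
# Toy molecule-binding classifier: Morgan fingerprints + random forest.
import numpy as np
import pandas as pd
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem
from sklearn.ensemble import RandomForestClassifier

toy = pd.DataFrame({
    "molecule_smiles": ["CCO", "c1ccccc1", "CC(=O)O", "CCN"],  # placeholder molecules
    "binds": [0, 1, 0, 1],                                     # placeholder labels
})

def featurize(smiles: str) -> np.ndarray:
    """Encode a molecule as a 2048-bit Morgan (ECFP4-like) fingerprint."""
    mol = Chem.MolFromSmiles(smiles)
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)
    arr = np.zeros((2048,))
    DataStructs.ConvertToNumpyArray(fp, arr)
    return arr

X = np.stack([featurize(s) for s in toy["molecule_smiles"]])
y = toy["binds"].to_numpy()

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict_proba(X)[:, 1])   # predicted binding probabilities for the toy molecules
```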
Committed to transparency and collaboration in scientific research, Leash plans to publicly release the full dataset of all conditions and replicates aggregated for the contest dataset, some 3.6 billion physically-measured interactions, at the conclusion of the competition. This resulting collection, expected to be released in the summer of 2024, will be approximately 10 times larger than the largest publicly available dataset to date and 1,000 times larger than higher-quality, curated public datasets, providing researchers worldwide with an invaluable resource for future drug discovery efforts.
The BELKA competition is open for registration and concludes on July 8, 2024. For more information, including participation criteria and registration, visit the competition page on Kaggle.
About Leash: Leash Biosciences is a biotechnology company unleashing machine learning to solve medicinal chemistry, with headquarters in Salt Lake City, UT. Powered by a team of experts, Leash aims to expand the boundaries of what's possible in drug discovery. Through the combination of leading-edge machine learning and large-scale chemical and biological datasets, Leash aims to rapidly design novel small molecule therapeutics. For more information, visit https://www.leash.bio and follow on LinkedIn.
Leash Contact Info: Becca Levin, PhD, Head of Business Development & Strategy, becca@leash.bio
Follow this link:
Leash Biosciences Announces $9.3 Million Seed Financing to Pioneer AI-Driven Medicinal Chemistry - GlobeNewswire
Machine learning-based survival prediction nomogram for postoperative parotid mucoepidermoid carcinoma | Scientific … – Nature.com
Screening and characteristics of the patients
This study examined 882 patients with stage I–IVA P-MEC, who met the inclusion–exclusion criteria, from the SEER database between 2004 and 2015. Figure 1 illustrates the patient selection process, while Table 1 summarizes patients' demographic and clinicopathological characteristics. The lymph node ratio (LNR) cut-off was determined using X-tile analysis, with a resultant cut-off of 1.15%. The median (95% CI) follow-up time was 99 (92–105) months, and the median (IQR) age at diagnosis was 52 (37–66) years. A majority of the patients were white (661, 74.9%), with most tumors being grade II (396, 44.9%), stage I (353, 40%), T1-stage (381, 43.2%), N0-stage (685, 77.7%), and LNR of 0 (686, 77.8%) according to the AJCC 6th stage. All variables, except for chemotherapy (94.2% vs 5.8%), had proportions exceeding 10%. The study encompassed 12 variables, including age, gender, grade, stage, tumor (T) stage, node (N) stage, radiation, chemotherapy, laterality, marriage, and LNR. Nine factors (age, gender, grade, stage, T stage, N stage, radiation, chemotherapy, and LNR) were selected based on univariate Cox regression. Multivariate Cox regression revealed that four factors (age, grade, T stage, and chemotherapy) were independent risk factors, each with P-values less than 0.05. In the multivariate analysis, individuals aged 60–70 years (HR=5.936, 95% CI=3.016–11.681, P<0.001), those over 70 years old (HR=11.962, 95% CI=6.303–22.703, P<0.001), Grade III (HR=2.324, 95% CI=1.235–4.375, P=0.009), Grade IV (HR=3.148, 95% CI=1.710–5.795, P<0.001), T2 (HR=3.162, 95% CI=1.059–9.440, P=0.039), T3 (HR=4.300, 95% CI=1.501–12.316, P=0.007), T4 (HR=4.414, 95% CI=1.439–13.535, P=0.009), and chemotherapy (HR=1.721, 95% CI=1.096–2.703, P=0.018) emerged as independent risk factors for overall survival (OS). Nevertheless, radiation (HR=0.750, 95% CI=0.525–1.072, P=0.114), LNR (HR=0.868, 95% CI=0.114–6.602, P=0.891), and the other variables demonstrated no prognostic value (Table 2).
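As a hedged illustration of the multivariate Cox step described above, the sketch below uses the lifelines package with its bundled Rossi recidivism dataset as a stand-in; the SEER P-MEC variables themselves are not reproduced here.

```python
# Schematic multivariate Cox proportional hazards fit with lifelines.
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi

df = load_rossi()                      # columns: week (time), arrest (event), plus covariates
cph = CoxPHFitter()
cph.fit(df, duration_col="week", event_col="arrest")
cph.print_summary()                    # hazard ratios exp(coef) with 95% confidence intervals

# Covariates with P < 0.05 here would be reported as independent risk factors,
# mirroring the univariate-then-multivariate screening described above.
```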
Figure 2A displays the relationship between the LASSO coefficients and the regularization parameter, lambda (λ), and demonstrates the variable selection process and the effect of λ on the coefficients. The lambda.min value, which represents the λ value corresponding to the minimum likelihood deviance or the highest C-index, was utilized for selecting tuning parameters in LASSO regression. Another vertical line was lambda.1se, which corresponds to the most regularized model within one standard error of the minimum (Fig. 2B). The λ.min (λ = 0.0050724) was chosen for the best predictive performance. A ten-fold cross-validation was employed. Ten variables were chosen through the LASSO regression algorithm, including age, gender, grade, T stage, N stage, radiation, chemotherapy, laterality, marriage, and LNR. Employing the adjusted R-squared maximum of the BSR, we selected eight variables: age, grade, stage, T stage, N stage, radiation, chemotherapy, and marriage (Fig. 3). In the RF model and XGBoost, we independently extracted the top 10 variables, excluding laterality, radiation (RF), and LNR (XGBoost) (Fig. 4). We assessed the key performance of machine learning and traditional statistics using AUC and AIC. Multivariate Cox stepwise backward regression reconfirmation identified LASSO, BSR, and XGBoost as the best of the five screening methods based on both AUC (AUC=88.4) and AIC (AIC=2118.9) criteria (Table 3).
Predictor Screening: the least absolute shrinkage and selection operator (LASSO) regression and fivefold cross-validation.
Predictor Screening: A SHAP plot and a feature importance plot are visualizations used to interpret XGBoost model results.
Predictor Screening: (A) Random Forest importance plot; (B) Best Subset Regression (BSR), which selects the best subset of predictor variables to accurately model a response variable.
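For readers who want to see the λ-selection step in code, the following is a simplified analogue using ordinary LASSO with 10-fold cross-validation on synthetic data; the paper itself fits a Cox LASSO to survival outcomes, but the mechanics of choosing lambda.min are analogous.

```python
# Simplified analogue of lambda.min selection: LASSO with 10-fold cross-validation.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

X, y = make_regression(n_samples=200, n_features=12, n_informative=5,
                       noise=10.0, random_state=0)

lasso = LassoCV(cv=10, random_state=0).fit(X, y)

print("lambda.min (sklearn alpha_):", lasso.alpha_)
# Features whose coefficients shrink exactly to zero are dropped from the model
selected = np.flatnonzero(lasso.coef_)
print("selected feature indices:", selected)
```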
Consequently, we constructed a nomogram with seven variables from the three algorithms (LASSO, BSR, and XGBoost), including age, grade, tumor stage, node stage, chemotherapy, radiation, and marriage. We developed an OS-nomogram capable of predicting a patient's 3-, 5-, and 10-year OS rates using these variables (Fig. 5). By converting clinical, pathological, and therapeutic factors into points, the nomogram accurately predicted OS. The total risk point score, calculated by summing all points, significantly correlated with 3-, 5-, and 10-year OS. We utilized a 5-year ROC curve to determine the optimum risk score cut-off point. Kaplan–Meier curves revealed that low-risk group patients (risk score < 80.29) had better survival prognosis compared to high-risk group patients (risk score ≥ 80.29, log-rank test, P<0.001) (Fig. S1).
A survival nomogram for predicting overall survival (OS) for patients with P-MEC. (1) When using the nomogram, the seven predictors are quantified as points based on patient-specific factors, and the sum of these points corresponds to the total points below, which in turn correspond to the 3-, 5-, and 10-year OS; (2) The optimal cut-off total point was 80.29 (the median of patients' points), which divided the patients into a high-risk group and a low-risk group.
We evaluated the predictive ability of our nomogram by constructing time-dependent receiver operating characteristic (ROC) curves at 3, 5, and 10 years. The ROC curves demonstrated excellent discriminative capacity of our model, with areas under the curves (AUCs) of 86.9 (95% CI=83.3–90.6), 88.4 (95% CI=83.5–91.4), and 87.7 (95% CI=84.1–91.3) (Fig. 6). This indicates that our model has high accuracy in predicting overall survival in parotid MEC patients.
(A–C) The calibration curves. The calibration curves of the nomogram predicting (A) 3-year, (B) 5-year, and (C) 10-year OS. (D–F) Time-dependent ROC curves. (D) ROC curves for 3-year, (E) 5-year, and (F) 10-year overall survival rates. (G–I) Decision curve analysis (DCA) plots. (G) DCA plot for 3-year, (H) 5-year, and (I) 10-year overall survival rates.
We also performed 1000 bootstrap resampling analyses on the dataset and generated calibration plots for the prediction model. The calibration plots showed that the curves closely aligned with the 45-degree line, indicating a well-calibrated model in practical use (Fig. 6). Furthermore, the 1000 bootstrap resamplings indicated good concordance between actual and predicted values in both the training and validation datasets, as evidenced by the C-index (3-year, 0.8499, 0.775–0.914; 5-year, 0.8557, 0.793–0.911; 10-year, 0.8375, 0.772–0.897) and AUC (3-year, 0.8670, 95% CI=0.787–0.935; 5-year, 0.8879, 95% CI=0.82–0.945; 10-year, 0.8767, 95% CI=0.792–0.947) (Fig. 7). These results further support the reliability and accuracy of our prediction model.
This figure presents a bootstrap analysis of a dataset, displaying the 3-year and 5-year AUC and C-index values. The analysis was performed using 1000 bootstrap replicates. The figure demonstrates the accuracy and predictive power of the model for the specified time intervals.
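The bootstrap validation idea described above can be sketched as follows; the risk scores and survival data below are simulated placeholders, not the SEER cohort, and the C-index is computed with the lifelines utility.

```python
# Bootstrap confidence interval for the concordance index (C-index) of a risk score.
import numpy as np
from lifelines.utils import concordance_index

rng = np.random.default_rng(0)
n = 300
risk_score = rng.normal(size=n)                       # e.g., nomogram total points
time = rng.exponential(scale=np.exp(-risk_score))     # higher risk -> shorter survival
event = rng.binomial(1, 0.7, size=n)                  # 1 = event observed, 0 = censored

boot_cindex = []
for _ in range(1000):
    idx = rng.integers(0, n, size=n)                  # resample the cohort with replacement
    # Negate the risk score: concordance_index expects higher scores for longer survival
    boot_cindex.append(concordance_index(time[idx], -risk_score[idx], event[idx]))

lo, hi = np.percentile(boot_cindex, [2.5, 97.5])
print(f"C-index: {np.mean(boot_cindex):.3f} (95% CI {lo:.3f}-{hi:.3f})")
```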
To determine the clinical utility of our prediction model, we utilized the decision curve analysis (DCA) plot. The DCA plot illustrates the net benefit of the prediction model across a spectrum of threshold probabilities. Our model demonstrates clinical utility, as its net benefit curve lies above both comparison curves across the range of threshold probabilities (Fig. 6). This suggests that our prediction model is more effective than TNM stage or grade and can aid in making clinical decisions for P-MEC patients.
In summary, our nomogram exhibited excellent predictive ability and calibration, as well as clinical utility, indicating its potential usefulness in clinical practice.
Read the original post:
Machine learning-based survival prediction nomogram for postoperative parotid mucoepidermoid carcinoma | Scientific ... - Nature.com
AI Training Could be a Lifeline to Tackling the Skills Gap – Technology Magazine
AI may be developing at a fast pace, but global workforces are currently not skilled enough to handle such rapid changes.
The technology holds great potential to transform the global business landscape, boosting productivity and improving workplace efficiency. However, whilst companies are keen to invest, they do not have the skills required to handle AI, so it is not utilised to its full potential.
Research conducted throughout 2023 found that many employees feared being replaced by AI, contributing to workplace anxieties. However, HSBC found at the start of 2024 that most businesses are considering how AI could advance employee skillsets, with 83% surveyed planning to re-train their workforces to better utilise the rapidly developing technology.
As AI continues to play myriad roles moving forward, one thing is absolutely certain - it will impact nearly every job.
Big tech companies are already starting to enact transformational changes in favour of boosting AI. SAP, for instance, recently announced plans to focus on upskilling workers and driving growth in AI business areas.
In a rapidly changing digital landscape that sees businesses continue to be impacted by threat actors exploiting business infrastructure, the need for employee upskilling is crucial. As a result of AI technology developing at such a fast pace, there are ever-increasing talent shortages within the technology sector that need addressing.
IBM's Global AI Adoption Index found that a lack of relevant skills was the top barrier to AI adoption among UK enterprises. Jon Lester, VP of HR Technology, Data and AI at IBM, explains that this is why upskilling for AI is arguably the most critical development area for the workforce moving forward.
"Employees who are seen as domain experts will still be highly sought-after as they are the ones who will help to develop and train AI," he says. "We have seen some of our support desk employees who have the highest customer satisfaction when answering phone calls or responding to emails, reskill to become conversational specialists who design chatbot interactions that provide a great end-user experience.
"Those same people are now learning to become prompt engineers who are training large language models to generate responses to questions. AI is measurably moving domain experts to higher value work."
In recent months, the business landscape has changed beyond measure, with AI-related skills no longer considered merely desirable but necessary across the majority of job sectors. AI has ultimately opened up new opportunities and challenges for development teams, as businesses must retrain their employees to satisfy AI demands.
Research conducted by ServiceNow found that the majority of office workers already use generative AI (Gen AI) for tasks like drafting content (69%), transcribing meeting notes (66%) and reviewing documents (65%). However, ServiceNow also found that almost half of workers still don't understand how AI can best support them in their role, suggesting that employers are still not making the most of the technology.
"In the future I predict that we will see all jobs be categorised into two buckets: sunrise jobs and sunset jobs," the company's Area VP of Solution Consulting, Simon Morris, says. "Sunset jobs are at risk of being significantly disrupted by AI and may eventually be replaced. Sunrise jobs will also change as they are enhanced and supported by AI.
"In the future your job will involve training, supervising and correcting the algorithm rather than completing the task yourself."
As Akhil Seth, Head of Open Talent at UST, explains, AI is no longer simply a piece of innovative technology. "In the maturity curve, it is now an implementation technology," he says. "Companies should be looking to acquire a workforce that knows how to implement AI-based solutions.
"Having a workforce that is well acquainted with deep machine learning infrastructure skills as well as the critical thinking required to apply the technology to business initiatives will be critical in maintaining a competitive edge."
In line with such rapid AI advancements, business leaders are now seeking to develop new strategies that adopt an AI-as-copilot approach. Put simply, this means having AI work alongside human employees.
IBM's Jon Lester identifies three key emerging areas within AI that are already driving new skillsets, citing code generation, customer engagement and the concept of a hybrid workforce.
"These emerging areas for AI are driving speed of task completion, improving decision-making for managers and leaders, and enabling significant productivity gains for employees," he says. "What educators and professionals need to think about is how to ready the organisation for this seismic change."
Simon Morris, meanwhile, envisages a future where it will be normal for AI to suggest ways to develop employees or even assign workers to projects, tying employee learning closely to workforce planning, all while ensuring they feel more fulfilled in their jobs. This will in turn not only result in talent retention, but also more satisfying customer experiences, creating a positive ripple effect on a business's bottom line.
What is becoming clearer is that the human element to AI will remain integral to business developments, with companies working to address ethical challenges. With this in mind, ensuring that workforces can fully harness AI will be necessary, particularly when it comes to sectors like cybersecurity.
Increasing numbers of business leaders around the world have called for AI risk training in a new age of digital threats, in addition to ensuring that employees can handle the technology when things go wrong.
So, how can companies best upskill their workforces in the age of AI?
Jon Lester says that understanding the difference between traditional and generative AI will be key for businesses, so that employees can develop skills from a clear benchmark.
"The pace of change that AI is creating means that the AI skills you learn today may have a half-life of less than three years, and so a mindset of continuous development will be required to keep skills up to date," he says.
Ultimately, the onus is on the employer to ensure that their workforces are up-to-date with existing AI strategies and are adequately trained to propel the company forward to enact real technological business transformation.
"This can be in the form of regular training courses and workshops with external organisations that can provide industry-leading expertise. It is also important that employees are given the opportunity to gain practical experiences and implement real-world AI projects," Morris comments.
As Seth concludes, organisations need to establish ongoing training programmes for employees. "Workshops and online courses allow workers to acquire new skills and update existing ones. Another priority should be around customised training, ensuring that the courses are tailored to the company's specific needs, given that the applications of AI can vary widely from industry to industry.
"Companies need to get the workforce comfortable with playing with AI tools and nurture a sense of curiosity around them. Companies that enable workforces to play with large language models (LLMs) will undoubtedly stumble upon market-defining use cases for the technology."
Read this article:
AI Training Could be a Lifeline to Tackling the Skills Gap - Technology Magazine