Page 369«..1020..368369370371..380390..»

Pulmonary vascular reactivity in growth restricted fetuses using computational modelling and machine learning … – Nature.com

Egaa-Ugrinovic, G., Sanz-Cortes, M., Figueras, F., Bargall, N. & Gratacs, E. Differences in cortical development assessed by fetal MRI in late-onset intrauterine growth restriction. Am. J. Obstet. Gynecol. 209(126), e1-8 (2013).

Google Scholar

Rodrguez-Lpez, M. et al. Descriptive analysis of different phenotypes of cardiac remodeling in fetal growth restriction. Ultrasound Obstet. Gynecol. Off. J. Int. Soc. Ultrasound Obstet. Gynecol. 50, 207214 (2017).

Crispi, F. et al. Fetal growth restriction results in remodeled and less efficient hearts in children. Circulation 121, 24272436 (2010).

Article PubMed Google Scholar

Eixarch, E. et al. Neurodevelopmental outcome in 2-year-old infants who were small-for-gestational age term fetuses with cerebral blood flow redistribution. Ultrasound Obstet. Gynecol. 32, 894899 (2008).

Article CAS PubMed Google Scholar

Maritz, G. S., Cock, M. L., Louey, S., Suzuki, K. & Harding, R. Fetal growth restriction has long-term effects on postnatal lung structure in sheep. Pediatr. Res. 55, 287295 (2004).

Article PubMed Google Scholar

Simpson, S. J. et al. Altered lung structure and function in mid-childhood survivors of very preterm birth. Thorax 72, 702711 (2017).

Article PubMed Google Scholar

Ronkainen, E., Dunder, T., Kaukola, T., Marttila, R. & Hallman, M. Intrauterine growth restriction predicts lower lung function at school age in children born very preterm. Arch. Dis. Child. Fetal Neonatal Ed. 101, F412-417 (2016).

Article PubMed Google Scholar

Tandra, M. et al. Small for gestational age is associated with reduced lung function in middle age: A prospective study from first to fifth decade of life. Respirology 28, 159165 (2023).

Article PubMed Google Scholar

Groenenberg, I. A., Wladimiroff, J. W. & Hop, W. C. Fetal cardiac and peripheral arterial flow velocity waveforms in intrauterine growth retardation. Circulation 80, 17111717 (1989).

Article CAS PubMed Google Scholar

Groenenberg, I. A. L., Stijnen, T. & Wladimiroff, J. W. Blood flow velocity waveforms in the fetal cardiac outflow tract as a measure of fetal well-being in intrauterine growth retardation. Pediatr. Res. 27, 379382 (1990).

Article CAS PubMed Google Scholar

Rasanen, J. et al. Reactivity of the human fetal pulmonary circulation to maternal hyperoxygenation increases during the second half of pregnancy: A randomized study. Circulation 97, 257262 (1998).

Article CAS PubMed Google Scholar

Done, E. et al. Maternal hyperoxygenation test in fetuses undergoing FETO for severe isolated congenital diaphragmatic hernia. Ultrasound Obstet. Gynecol. Off. J. Int. Soc. Ultrasound Obstet. Gynecol. 37, 264271 (2011).

Cikes, M. et al. Machine learning-based phenogrouping in heart failure to identify responders to cardiac resynchronization therapy: Machine learning-based approach to patient selection for CRT. Eur. J. Heart Fail. 21, 7485 (2019).

Article PubMed Google Scholar

Garcia-Canadilla, P., Sanchez-Martinez, S., Crispi, F. & Bijnens, B. Machine learning in fetal cardiology: What to expect. Fetal Diagn. Ther. 47, 363372 (2020).

Article PubMed Google Scholar

Garcia-Canadilla, P. et al. Machine-learningbased exploration to identify remodeling patterns associated with death or heart-transplant in pediatric-dilated cardiomyopathy. J. Heart Lung Transplant. 41, 516526 (2022).

Article PubMed Google Scholar

Garcia-Canadilla, P. et al. A computational model of the fetal circulation to quantify blood redistribution in intrauterine growth restriction. PLoS Comput. Biol. 10, e1003667 (2014).

Article PubMed PubMed Central Google Scholar

Garcia-Canadilla, P. et al. Understanding the aortic isthmus doppler profile and its changes with gestational age using a lumped model of the fetal circulation. Fetal Diagn. Ther. 41, 4150 (2017).

Article PubMed Google Scholar

Figueras, F. & Gratacs, E. Update on the diagnosis and classification of fetal growth restriction and proposal of a stage-based management protocol. Fetal Diagn. Ther. 36, 8698 (2014).

Article PubMed Google Scholar

Hadlock, F. P., Harrist, R. B., Sharman, R. S., Deter, R. L. & Park, S. K. Estimation of fetal weight with the use of head, body, and femur measurementsA prospective study. Am. J. Obstet. Gynecol. 151, 333337 (1985).

Article CAS PubMed Google Scholar

Figueras, F. et al. Customized birthweight standards for a Spanish population. Eur. J. Obstet. Gynecol. Reprod. Biol. 136, 2024 (2008).

Article CAS PubMed Google Scholar

Salomon, L. J. et al. ISUOG Practice Guidelines: Ultrasound assessment of fetal biometry and growth. Ultrasound Obstet. Gynecol. 53, 715723 (2019).

Article CAS PubMed Google Scholar

Bhide, A. et al. isuog Practice Guidelines (updated): use of Doppler velocimetry in obstetrics. Ultrasound Obstet. Gynecol. 58, 331339 (2021).

Article CAS PubMed Google Scholar

Peralta, C. F. A., Cavoretto, P., Csapo, B., Vandecruys, H. & Nicolaides, K. H. Assessment of lung area in normal fetuses at 1232 weeks: Lung area and LHR reference intervals. Ultrasound Obstet. Gynecol. 26, 718724 (2005).

Article CAS PubMed Google Scholar

Palacio, M. et al. Prediction of neonatal respiratory morbidity by quantitative ultrasound lung texture analysis: A multicenter study. Am. J. Obstet. Gynecol. 217(196), e1-196.e14 (2017).

Google Scholar

Azpurua, H. et al. Acceleration/ejection time ratio in the fetal pulmonary artery predicts fetal lung maturity. Am. J. Obstet. Gynecol. 203(40), e1-40.e8 (2010).

Google Scholar

Moreno-Alvarez, O. et al. Association between intrapulmonary arterial Doppler parameters and degree of lung growth as measured by lung-to-head ratio in fetuses with congenital diaphragmatic hernia. Ultrasound Obstet. Gynecol. 31, 164170 (2008).

Article CAS PubMed Google Scholar

Laudy, J. A., De Ridder, M. A. & Wladimiroff, J. W. Doppler velocimetry in branch pulmonary arteries of normal human fetuses during the second half of gestation. Pediatr. Res. 41, 897901 (1997).

Article CAS PubMed Google Scholar

Turk, E. A. et al. Spatiotemporal alignment of in utero BOLD-MRI series: Spatiotemporal alignment of MRI series. J. Magn. Reson. Imaging 46, 403412 (2017).

Article PubMed PubMed Central Google Scholar

Lin, Y.-Y., Liu, T.-L. & Fuh, C.-S. Multiple kernel learning for dimensionality reduction. IEEE Trans. Pattern Anal. Mach. Intell. 33, 11471160 (2011).

Article PubMed Google Scholar

Sanchez-Martinez, S. et al. Characterization of myocardial motion patterns by unsupervised multiple kernel learning. Med. Image Anal. 35, 7082 (2017).

Article PubMed Google Scholar

Garcia-Canadilla, P. et al. Patient-specific estimates of vascular and placental properties in growth-restricted fetuses based on a model of the fetal circulation. Placenta 36, 981989 (2015).

Article PubMed Google Scholar

Pennati, G., Bellotti, M. & Fumero, R. Mathematical modelling of the human foetal cardiovascular system based on Doppler ultrasound data. Med. Eng. Phys. 19, 327335 (1997).

Article CAS PubMed Google Scholar

Cynober, E. et al. Fetal pulmonary artery doppler waveform: A preliminary report. Fetal Diagn. Ther. 12, 226231 (1997).

Article CAS PubMed Google Scholar

Khatib, N. et al. The effect of maternal hyperoxygenation on fetal circulatory system in normal growth and IUGR fetuses. What we can learn from this impact. J. Matern. Fetal. Neonatal. Med. 31, 914918 (2018).

Guan, Y. et al. The role of doppler waveforms in the fetal main pulmonary artery in the prediction of neonatal respiratory distress syndrome: Doppler Waveforms Predict Neonatal RDS. J. Clin. Ultrasound 43, 375383 (2015).

Article PubMed Google Scholar

Rizzo, G. et al. Blood flow velocity waveforms from peripheral pulmonary arteries in normally grown and growth-retarded fetuses: Doppler and fetal pulmonary circulation. Ultrasound Obstet. Gynecol. 8, 8792 (1996).

Article CAS PubMed Google Scholar

Cruz-Martinez, R. et al. Contribution of intrapulmonary artery Doppler to improve prediction of survival in fetuses with congenital diaphragmatic hernia treated with fetal endoscopic tracheal occlusion. Ultrasound Obstet. Gynecol. 35, 572577 (2010).

Article CAS PubMed Google Scholar

Basurto, D. et al. Intrapulmonary artery Doppler to predict mortality and morbidity in fetuses with mild or moderate left-sided congenital diaphragmatic hernia. Ultrasound Obstet. Gynecol. 58, 590596 (2021).

Article CAS PubMed Google Scholar

Bravo-Valenzuela, N. J. M. et al. Dynamics of pulmonary venous flow in fetuses with intrauterine growth restriction: Pulmonary venous flow in IUGR fetuses. Prenat. Diagn. 35, 249253 (2015).

Article PubMed Google Scholar

DeKoninck, P. et al. Sonographic evaluation of vascular pulmonary reactivity following oxygen administration in fetuses with normal lung development: Fetal pulmonary reactivity to oxygen in fetuses with normal lung development. Prenat. Diagn. 32, 13001304 (2012).

Article CAS PubMed Google Scholar

Broth, R. E. et al. Prenatal prediction of lethal pulmonary hypoplasia: The hyperoxygenation test for pulmonary artery reactivity. Am. J. Obstet. Gynecol. 187, 940945 (2002).

Article PubMed Google Scholar

Sylvester, J. T., Shimoda, L. A., Aaronson, P. I. & Ward, J. P. T. hypoxic pulmonary vasoconstriction. Physiol. Rev. 92, 367520 (2012).

Article CAS PubMed Google Scholar

Torrance, H. L., Mulder, E. J. H., Brouwers, H. A. A., van Bel, F. & Visser, G. H. A. Respiratory outcome in preterm small for gestational age fetuses with or without abnormal umbilical artery Doppler and/or maternal hypertension. J. Matern. Fetal Neonatal Med. 20, 613621 (2007).

Article PubMed Google Scholar

Lio, A. et al. Fetal Doppler velocimetry and bronchopulmonary dysplasia risk among growth-restricted preterm infants: An observational study. BMJ Open 7, e015232 (2017).

Article PubMed PubMed Central Google Scholar

Baumann, S., Godtfredsen, N. S., Lange, P. & Pisinger, C. The impact of birth weight on the level of lung function and lung function decline in the general adult population. The Inter99 study. Respir. Med. 109, 12931299 (2015).

den Dekker, H. T. et al. Early growth characteristics and the risk of reduced lung function and asthma: A meta-analysis of 25,000 children. J. Allergy Clin. Immunol. 137, 10261035 (2016).

Article Google Scholar

Lange, P. et al. Lung-function trajectories leading to chronic obstructive pulmonary disease. N. Engl. J. Med. 373, 111122 (2015).

Article CAS PubMed Google Scholar

Breyer-Kohansal, R. et al. Factors associated with low lung function in different age bins in the general population. Am. J. Respir. Crit. Care Med. 202, 292296 (2020).

Article PubMed Google Scholar

Vellv, K. et al. Lung function in young adults born small for gestational age at term. Respirology 14361. https://doi.org/10.1111/resp.14361 (2022).

Crispi, F. et al. Exercise capacity in young adults born small for gestational age. JAMA Cardiol. https://doi.org/10.1001/jamacardio.2021.2537 (2021).

Article PubMed PubMed Central Google Scholar

Olvera, N. et al. Circulating biomarkers in young individuals with low peak FEV 1. Am. J. Respir. Crit. Care Med. 207, 354358 (2023).

Article PubMed Google Scholar

More here:
Pulmonary vascular reactivity in growth restricted fetuses using computational modelling and machine learning ... - Nature.com

Read More..

Inuvo Puts the Power of its AI in the Hands of Agencies and Brands – AiThority

Rolls out new AI-as-a-Service Solution

Inuvo, Inc., a leading provider of advertising technology, powered by artificial intelligence (AI) that serves brands and agencies, announced the self-serve availability of IntentKey models within demand-side platforms (DSPs) for advertisers.

This new Artificial Intelligence as a Service gives brand and agency clients direct access to IntentKeys powerful AI-driven audience selection and targeting recommendations. The frictionless access via Deal ID is customized to the marketing strategies associated with a clients products, services, or brands.

Recommended AI News: Box Expands its Collaboration with Microsoft with New Azure OpenAI Service Integration

Our self-serve solution puts control into the hands of our clients, allowing them to activate campaigns tailored to their brands using their programmatic campaign system of choice, said Rich Howe, Inuvo CEO. With a dedicated insights dashboard offering real-time visibility into the targeting rationale behind why audiences are interested, they can feel confident knowing IntentKey is maximizing performance.

Inuvo offers two service plans for clients accessing IntentKeys AI to inform their media buys. The first plan leverages the companys Managed Services team. The second plan, the new self-serve solution, empowers advertisers to independently activate IntentKeys proprietary models optimized for their unique brands and KPIs. After providing campaign details and goals, clients gain access to a Deal ID for PMP (private marketplace) activation across their preferred buying platforms.

Recommended AI News: Horizon3.ai Unveils Pentesting Services for Compliance Ahead of PCI DSS v4.0 Rollout

Unlike look-a-like models built on third-party cookies or stale offline data, IntentKeys self-serve solution uses real-time signals associated with online content to reach audiences as their interests change. This allows media to be dynamically placed across the open web. The IntentKey Insights Dashboard provides unmatched transparency revealing insights contributing to model optimization refreshed every five minutes.

With unparalleled transparency, independence, and customization, you simply cant compete with IntentKeys self-serve integration in todays privacy-centric landscape, added Howe. At a minimum, clients should immediately be setting up an IOS Deal ID so they can reach the Apple users their DSP is not currently capable of targeting.

If marketers can describe their audiences contextually, IntentKey can find and target them. Weve defined what the next generation of advertising technology looks like, concluded Howe.

Recommended AI News: OurCrowd AI Fund to Collaborate with NVIDIA Inception

[To share your insights with us as part of editorial or sponsored content, please write to sghosh@martechseries.com]

Original post:
Inuvo Puts the Power of its AI in the Hands of Agencies and Brands - AiThority

Read More..

Unlocking Innovation – AWS Blog

Amazon Bedrock is the best place to build and scale generative AI applications with large language models (LLM) and other foundation models (FMs). It enables customers to leverage a variety of high-performing FMs, such as the Claude family of models by Anthropic, to build custom generative AI applications.Looking back to 2021, when Anthropic first started building on AWS, no one could have envisioned how transformative the Claude family of models would be. We have been making state-of-the-art generative AI models accessible and usable for businesses of all sizes through Amazon Bedrock. In just a few short months since Amazon Bedrock became generally available on September 28, 2023, more than 10K customers have been using it to deliver, and many of them are using Claude. Customers such as ADP, Broadridge, Cloudera, Dana-Farber Cancer Institute, Genesys, Genomics England, GoDaddy, Intuit, M1 Finance, Perplexity AI, Proto Hologram, Rocket Companies and more are using Anthropics Claude models on Amazon Bedrock to drive innovation in generative AI and to build transformative customer experiences. And today, we are announcing an exciting milestone with the next generation of Claude coming to Amazon Bedrock: Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku.

Anthropic is unveiling its next generation of Claude with three advanced models optimized for different use cases. Haiku is the fastest and most cost-effective model on the market. It is a fast compact model for near-instant responsiveness. For the vast majority of workloads, Sonnet is 2x faster than Claude 2 and Claude 2.1 with higher levels of intelligence. It excels at intelligent tasks demanding rapid responses, like knowledge retrieval or sales automation. And it strikes the ideal balance between intelligence and speed qualities especially critical for enterprise use cases. Opus is the most advanced, capable, state-of-the-art FM with deep reasoning, advanced math, and coding abilities, with top-level performance on highly complex tasks. It can navigate open-ended prompts, and novel scenarios with remarkable fluency, including task automation, hypothesis generation, and analysis of charts, graphs, and forecasts. And Sonnet is first available on Amazon Bedrock today. Current evaluations from Anthropic suggest that the Claude 3 model family outperformscomparable models in math word problem solving (MATH) and multilingual math (MGSM) benchmarks, critical benchmarks used today for LLMs.

Specifically, Opus outperforms its peers on most of the common evaluation benchmarks for AI systems, including undergraduate level expert knowledge (MMLU), graduate level expert reasoning (GPQA), basic mathematics (GSM8K), and more. It exhibits high levels of comprehension and fluency on complex tasks, leading the frontier of general intelligence.

Through Amazon Bedrock, customers will get easy access to build with Anthropics newest models. This includes not only natural language models but also their expanded range of multimodal AI models capable of advanced reasoning across text, images, charts, and more. Our collaboration has already helped customers accelerate generative AI adoption and delivered business value to them. Here are a few ways customers have been using Anthropics Claude models on Amazon Bedrock:

We are developing a generative AI solution on AWS to help customers plan epic trips and create life-changing experiences with personalized travel itineraries. By building with Claude on Amazon Bedrock, we reduced itinerary generation costs by nearly 80% percent when we quickly created a scalable, secure AI platform that can organize our book content in minutes to deliver cohesive, highly accurate travel recommendations. Now we can repackage and personalize our content in various ways on our digital platforms, based on customer preference, all while highlighting trusted local voicesjust like Lonely Planet has done for 50 years.

Chris Whyde, Senior VP of Engineering and Data Science, Lonely Planet

We are working with AWS and Anthropic to host our custom, fine-tuned Anthropic Claude model on Amazon Bedrock to support our strategy of rapidly delivering generative AI solutions at scale and with cutting-edge encryption, data privacy, and safe AI technology embedded in everything we do. Our new Lexis+ AI platform technology features conversational search, insightful summarization, and intelligent legal drafting capabilities, which enable lawyers to increase their efficiency, effectiveness, and productivity.

Jeff Reihl, Executive VP and CTO, LexisNexis Legal & Professional

At Broadridge, we have been working to automate the understanding of regulatory reporting requirements to create greater transparency and increase efficiency for our customers operating in domestic and global financial markets. With use of Claude on Amazon Bedrock, were thrilled to get even higher accuracy in our experiments with processing and summarizing capabilities. With Amazon Bedrock, we have choice in our use of LLMs, and we value the performance and integration capabilities it offers.

Saumin Patel, VP Engineering generative AI, Broadridge

The Claude 3 model family caters to various needs, allowing customers to choose the model best suited for their specific use case, which is key to developing a successful prototype and later production systems that can deliver real impactwhether for a new product, feature or process that boosts the bottom line. Keeping customer needs top of mind, Anthropic and AWS are delivering where it matters most to organizations of all sizes:

And AWS and Anthropic are continuously reaffirming our commitment to advancing generative AI in a responsible manner. By constantly improving model capabilities committing to frameworks like Constitutional AI or the White House voluntary commitments on AI, we can accelerate the safe, ethical development, and deployment of this transformative technology.

Looking ahead, customers will build entirely new categories of generative AI-powered applications and experiences with the latest generation of models. Weve only begun to tap generative AIs potential to automate complex processes, augment human expertise, and reshape digital experiences. We expect to see unprecedented levels of innovation as customers choose Anthropics models augmented with multimodal skills leveraging all the tools they need to build and scale generative AI applications on Amazon Bedrock. Imagine sophisticated conversational assistants providing fast and highly-contextual responses, picture personalized recommendation engines that seamlessly blend in relevant images, diagrams and associated knowledge to intuitively guide decisions. Envision scientific research turbocharged by generative AI able to read experiments, synthesize hypotheses, and even propose novel areas for exploration. There are so many possibilities that will be realized by taking full advantage of all generative AI has to offer through Amazon Bedrock. Our collaboration ensures enterprises and innovators worldwide will have the tools to reach the next frontier of generative AI-powered innovation responsibly, and for the benefit of all.

Its still early days for generative AI, but strong collaboration and a focus on innovation are ushering in a new era of generative AI on AWS. We cant wait to see what customers build next.

Check out the following resources to learn more about this announcement:

Swami Sivasubramanian is Vice President of Data and Machine Learning at AWS. In this role, Swami oversees all AWS Database, Analytics, and AI & Machine Learning services. His teams mission is to help organizations put their data to work with a complete, end-to-end data solution to store, access, analyze, and visualize, and predict.

Read more:
Unlocking Innovation - AWS Blog

Read More..

Northrop Grumman Partners to Advance Deep Sensing for the US Army | Northrop Grumman – Northrop Grumman Newsroom

The TITAN ground system solution will provide multi-domain integrated data directly to the front lines

AZUSA, Calif. March 7, 2024 Northrop Grumman Corporation (NYSE: NOC) is partnering with Palantir USG, Inc. on the newly awarded Tactical Intelligence Targeting Access Node (TITAN) ground system for the U.S. Army. The program supports one of the Armys key modernization imperatives by using artificial intelligence (AI) and machine learning (ML) to enhance the automation of target recognition and geolocation and integrate data from multiple sensors to reduce sensor-to-shooter timelines.

Northrop Grumman will partner to:

The TITAN ground system will enable faster decision making on the frontlines by providing actionable intelligence to reduce sensor-to-shooter timelines and maximize effectiveness of long-range fires. (Photo Credit: Palantir)

Expert:

Aaron Dann, vice president, strategic force programs, Northrop Grumman: Northrop Grummans extensive experience in large-scale system integration will help enable mission success and provide information superiority for our warfighters in complex operating environments.Our work on TITAN continues our long history of supporting our nations need for actionable intelligence when and where it matters most.

Details:

TITAN is a ground system that has access to space, high altitude, aerial, and terrestrial sensors to provide actionable targeting information for enhanced mission command. TITAN will enable the Army to fuse, correlate, and integrate intelligence data from a rapidly expanding series of sensors providing operational forces a full picture of their surroundings. This robust capability allows real-time decision making that will substantially increase the accuracy, precision, and effects of long-range precision fires.

Northrop Grumman is a leading global aerospace and defense technology company. Our pioneering solutions equip our customers with the capabilities they need to connect and protect the world, and push the boundaries of human exploration across the universe. Driven by a shared purpose to solve our customers toughest problems, our employees define possible every day.

The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the U.S. Government.

Continued here:
Northrop Grumman Partners to Advance Deep Sensing for the US Army | Northrop Grumman - Northrop Grumman Newsroom

Read More..

Multimodal artificial intelligence-based pathogenomics improves survival prediction in oral squamous cell carcinoma … – Nature.com

Sung, H. et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 Countries. CA Cancer J. Clin. 71, 209249 (2021).

Article Google Scholar

Chen, S.-H., Hsiao, S.-Y., Chang, K.-Y. & Chang, J.-Y. New insights into oral squamous cell carcinoma: From clinical aspects to molecular tumorigenesis. Int J. Mol. Sci. 22, 2252 (2021).

Article CAS PubMed Central Google Scholar

Adrien, J., Bertolus, C., Gambotti, L., Mallet, A. & Baujat, B. Why are head and neck squamous cell carcinoma diagnosed so late? Influence of health care disparities and socio-economic factors. Oral Oncol. 50, 9097 (2014).

Article CAS Google Scholar

Gonzlez-Moles, M. ., Aguilar-Ruiz, M. & Ramos-Garca, P. Challenges in the early diagnosis of oral cancer, evidence gaps and strategies for improvement: A scoping review of systematic reviews. Cancers 14, 4967 (2022).

Article PubMed Central Google Scholar

Russo, D. et al. Development and validation of prognostic models for oral squamous cell carcinoma: A systematic review and appraisal of the literature. Cancers 13, 5755 (2021).

Article PubMed Central Google Scholar

Carreras-Torras, C. & Gay-Escoda, C. Techniques for early diagnosis of oral squamous cell carcinoma: Systematic review. Med. Oral. Patol. Oral. Cir. Bucal. 20, e305-315 (2015).

Article PubMed Central Google Scholar

Alabi, R. O. et al. Machine learning in oral squamous cell carcinoma: Current status, clinical concerns and prospects for future-A systematic review. Artif. Intell. Med. 115, 102060 (2021).

Article Google Scholar

Qiu L, Khormali A, & Liu K. Deep Biological Pathway Informed Pathology-Genomic Multimodal Survival Prediction. (2023) [cited 2023 Apr 3]; https://arxiv.org/abs/2301.02383

Vale-Silva, L. A. & Rohr, K. Long-term cancer survival prediction using multimodal deep learning. Sci. Rep. 11, 13505 (2021).

Article CAS PubMed Central Google Scholar

Carrillo-Perez, F. et al. Machine-learning-based late fusion on multi-omics and multi-scale data for non-small-cell lung cancer diagnosis. JPM 12, 601 (2022).

Article PubMed Central Google Scholar

Lipkova, J. et al. Artificial intelligence for multimodal data integration in oncology. Cancer Cell. 40, 10951110 (2022).

Article CAS PubMed Central Google Scholar

Steyaert, S. et al. Multimodal deep learning to predict prognosis in adult and pediatric brain tumors. Commun. Med. 3, 44 (2023).

Article PubMed Central Google Scholar

Saravi, B. et al. Artificial intelligence-driven prediction modeling and decision making in spine surgery using hybrid machine learning models. J. Personal. Med. 12, 509 (2022).

Article Google Scholar

Zuley, M.L., Jarosz, R., Kirk, S., Lee, Y., Colen, R., & Garcia, K., et al. The Cancer Genome Atlas Head-Neck Squamous Cell Carcinoma Collection (TCGA-HNSC), The Cancer Imaging Archive, 2016 (Accessed 3 Apr 2023); https://wiki.cancerimagingarchive.net/x/VYG0

Li, X. et al. Multi-omics analysis reveals prognostic and therapeutic value of cuproptosis-related lncRNAs in oral squamous cell carcinoma. Front. Genet. 13, 984911 (2022).

Article CAS PubMed Central Google Scholar

Zou, C. et al. Identification of immune-related risk signatures for the prognostic prediction in oral squamous cell carcinoma. J. Immunol. Res. 2021, 6203759 (2021).

Article PubMed Central Google Scholar

Macenko, M., Niethammer, M., Marron, J.S., Borland, D., Woosley, J.T., Xiaojun, G., et al. A method for normalizing histology slides for quantitative analysis. In 2009 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2009 (IEEE, accessed 4 Apr 2023]. P. 11071110. http://ieeexplore.ieee.org/document/5193250/

Vahadane, A. et al. Structure-preserving color normalization and sparse stain separation for histological images. IEEE Trans. Med. Imaging 35, 19621971 (2016).

Article Google Scholar

Salvi, M., Acharya, U. R., Molinari, F. & Meiburger, K. M. The impact of pre- and post-image processing techniques on deep learning frameworks: A comprehensive review for digital pathology image analysis. Comput. Biol. Med. 128, 104129 (2021).

Article Google Scholar

Carpenter, A. E. et al. Cell Profiler: Image analysis software for identifying and quantifying cell phenotypes. Genome Biol. 7, R100 (2006).

Article PubMed Central Google Scholar

Hughey, J. J. & Butte, A. J. Robust meta-analysis of gene expression using the elastic net. Nucleic Acids Res. 43, e79 (2015).

Article PubMed Central Google Scholar

Tschodu, D. et al. Re-evaluation of publicly available gene-expression databases using machine-learning yields a maximum prognostic power in breast cancer. Sci. Rep. 13, 16402 (2023).

Article CAS PubMed Central Google Scholar

Kanehisa, M. & Goto, S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 2730 (2000).

Article CAS PubMed Central Google Scholar

Kanehisa, M. Toward understanding the origin and evolution of cellular organisms. Protein Sci. 28, 19471951 (2019).

Article CAS PubMed Central Google Scholar

Kanehisa, M., Furumichi, M., Sato, Y., Kawashima, M. & Ishiguro-Watanabe, M. KEGG for taxonomy-based analysis of pathways and genomes. Nucleic Acids Res. 51, D587D592 (2023).

Article CAS Google Scholar

Ye, H. et al. Metabolism-related bioinformatics analysis reveals that HPRT1 facilitates the progression of oral squamous cell carcinoma in vitro. J. Oncol. 2022, 116 (2022).

Google Scholar

Ferreira, A.-K. et al. Survival and prognostic factors in patients with oral squamous cell carcinoma. Med. Oral. Patol. Oral. Cir. Bucal. 26, e387e392 (2021).

Article Google Scholar

Asio, J., Kamulegeya, A. & Banura, C. Survival and associated factors among patients with oral squamous cell carcinoma (OSCC) in Mulago hospital, Kampala, Uganda. Cancers Head Neck. 3, 9 (2018).

Article PubMed Central Google Scholar

Girod, A., Mosseri, V., Jouffroy, T., Point, D. & Rodriguez, J. Women and squamous cell carcinomas of the oral cavity and oropharynx: Is there something new?. J. Oral Maxillof. Surg. 67, 19141920 (2009).

Article Google Scholar

Wong, K., Rostomily, R. & Wong, S. Prognostic gene discovery in glioblastoma patients using deep learning. Cancers 11, 53 (2019).

Article CAS PubMed Central Google Scholar

Hsich, E., Gorodeski, E. Z., Blackstone, E. H., Ishwaran, H. & Lauer, M. S. Identifying important risk factors for survival in patient with systolic heart failure using random survival forests. Circ. Cardiovasc. Qual. Outcomes 4, 3945 (2011).

Article Google Scholar

Ishwaran, H., Kogalur, U. B., Gorodeski, E. Z., Minn, A. J. & Lauer, M. S. High-dimensional variable selection for survival data. J. Am. Stat. Assoc. 105, 20517 (2010).

Article MathSciNet CAS Google Scholar

Ishwaran, H., Kogalur, U. B., Chen, X. & Minn, A. J. Random survival forests for high-dimensional data. Stat. Anal. Data Min. ASA Data Sci. J. 2011(4), 11532 (2011).

Article MathSciNet Google Scholar

Katzman, J. L. et al. Deepsurv: personalized treatment recommender system using a cox proportional hazards deep neural network. BMC Med. Res. Methol. 18, 187202 (2018).

Google Scholar

Sargent, D. J. Comparison of artificial neural networks with other statistical approaches. Cancer 91, 16361642 (2001).

Article CAS Google Scholar

Xiang, A., Lapuerta, P., Ryutov, A., Buckley, J. & Azen, S. Comparison of the performance of neural network methods and Cox regression for censored survival data. Comput. Stat. Data Anal. 34, 24357 (2000).

Article Google Scholar

Nie, Z., Zhao, P., Shang, Y. & Sun, B. Nomograms to predict the prognosis in locally advanced oral squamous cell carcinoma after curative resection. BMC Cancer 21, 372 (2021).

Article PubMed Central Google Scholar

Nojavanasghari, B., Gopinath, D., Koushik, J., Baltruaitis, T., & Morency, L. P. Deep multimodal fusion for persuasiveness prediction. In Proceedings of the 18th ACM International Conference on Multimodal Interaction. 284288 (2016).

Kampman, O., Barezi, E. J., Bertero, D., & Fung, P. Investigating audio, video, and text fusion methods for end-to-end automatic personality prediction. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics vol. 2.606611 (2018).

Wang, Z., Li, R., Wang, M. & Li, A. Gpdbn: Deep bilinear network integrating both genomic data and pathological images for breast cancer prognosis prediction. Bioinformatics 27, 29632970 (2021).

Article Google Scholar

Subramanian, V., Syeda-Mahmood, T., & Do, M. N. Multimodal fusion using sparse cca for breast cancer survival prediction. In Proceedings of IEEE 18th International Symposium on Biomedical Imaging (ISBI).14291432 (2021).

Mai, S., Hu, H., & Xing, S. Modality to modality translation: An adversarial representation learning and graph fusion network for multimodal fusion. In Proceedings of the AAAI Conference on Artificial Intelligence 164172 (2020).

Mobadersany, P. et al. Predicting cancer outcomes from histology and genomics. Predicting cancer outcomes from histology and genomics using convolutional networks. Proc. Natl. Acad. Sci. 115, 29702979 (2018).

Article Google Scholar

Wang, C. et al. A cancer survival prediction method based on graph convolutional network. IEEE Trans. Nanobiosci. 19, 117126 (2020).

Article Google Scholar

Zadeh, A., Chen, M., Poria, S., Cambria, E., & Morency, L. P Tensor fusion network for multimodal sentiment analysis. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 11031114 (2017).

Chen, R. J. et al. Pathomic fusion: An integrated framework for fusing histopathology and genomic features for cancer diagnosis and prognosis. IEEE Trans. Med. Imaging 41, 757770 (2022).

Article PubMed Central Google Scholar

Kim, J. H., On, K. W., Lim, W., Kim, J., Ha, J. W., & Zhang, B. T. Hadamard product for low-rank bilinear pooling. In Proceedings of International Conference on Learning Representations, 114 (2017)

Liu, Z., Shen, Y., Lakshminarasimhan, V. B., Liang, P. P., Zadeh, A., & Morency, L. P. Efficient low-rank multimodal fusion with modality-specific factors. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. 22472256 (2021)

Li, R., Wu, X., Li, A. & Wang, M. Hfbsurv: Hierarchical multimodal fusion with factorized bilinear models for cancer survival prediction. Bioinformatics 38, 25872594 (2022).

Article CAS PubMed Central Google Scholar

Read more here:
Multimodal artificial intelligence-based pathogenomics improves survival prediction in oral squamous cell carcinoma ... - Nature.com

Read More..

Inside AI: Talking to the Data – Inside Unmanned Systems

At opposite ends of Illinois, two leaders in emerging construction technologies are collaborating to demonstrate how AI, ML (machine learning) and CV (computer vision) can validate constructions digital-data mandate: Save time, save money, save lives.

For the fourteenth consecutive year, U.S. construction valuation has grown and is now at $1.98 trillion, noted Mani Golparvar, professor of civil engineering, computer science and technology entrepreneurship at the Grainger College of Engineering/University of Illinois. But, hes reported, 53% of construction projects are behind schedule, 66% are over budget and nearly all carry cost overruns.

Golparvar estimated that $280 billion in potential added value exists through improved coordination. As an industry, weve significantly improved the way we plan projects. But the way we plan the job, execute, monitor our execution and use whatever we monitor to update a planthese four problems all contribute to why projects are behind schedule and over budget. If youre looking to these issues, 80% are preventable: Project team members do not have visibility early enough to be able to come up with a remedy.

To that end, Golparvar is founder and chief strategy officer of Reconstruct, a Visual Command Center that uploads schedules, 2D and 3D models, and reality capture so users can track progress as well as coordination and communications problems. Reconstruct says its benefits include a 30% reduction in time reporting field progress, and 25%-40% in improved schedule management and proactive risk mitigation. Eighty-five companies and several universities now partner with it.

Burcin Kaplanoglu mirrored Golparvars remarks. Speaking from the Oracle Industry Lab he heads just outside Chicago, he noted that U.S. infrastructure is graded as a C- by the American Society of Civil Engineers. At the same time, he said, we [would] need to hire 500,000 people a year, on top of regular hiring, to meet demand.

Consequently, he added, we need to automate processes, gain efficiency and improve safety with technology. Talking to your data is going to have tremendous benefits for the construction engineer. Getting that response really quickly is going to change how we interact with technology.

Golparvar and Kaplanoglu recently joined forces around a Reality Mapping Experiment (see Constructing the Future sidebar, pg. 46) at the Oracle lab to explore what combination of tools, processes and teams can best conduct reality capture and mapping on-site, reducing time and cost and ensuring safety.

Thats of great value to an industry facing an ever-growing tech stack, as Jennifer Suerth, senior vice president of Pepper Construction, which built the lab, said in an Oracle TV video. Autonomy allows us to continue to get the work done but be efficient with the resources we have.

Golparvar and Kaplanoglu offered quick definitions of ML, CV and AI.

Machine learning allows computers to learn from data to improve and streamline processes. Machine learning is continuing to improve the construction industry, Kaplanoglu said. We need to figure out how to optimize and how to forecast better.

Computer vision is a form of AI, Golparvar said. It involves the process of analyzing pictures and videos and generating actionable insights from them, doing image processing and understanding the geometry of the scene. Recognizing objects by automatically analyzing them can track progress, detect and recognize anomalies, and coordinate and optimize operations.

To solve that coordination issue requires everyone to be on the same page in terms of what was there versus what should be there, Golparvar continued. In construction, weve really taken advantage of design information that we call BIM, building information models, 3D representation of design. What if you tie them against the schedule? At any snapshot in time, you can click on any two points, you can measure the length, area and volume. The picture shows you what was promised to be done; the picture in the background shows you what has been achieved on the job. And the delta between the twowhat needs to be coordinated in terms of quality, safety and progresscan be color-coded in red and green: what is on the schedule, what is behind.

Generative AI is the third leg of this digital triad. AI involves machines mimicking human cognitive functions, but theres more to that than large language models. GenAI is going to help us do project management and planning, Kaplanoglu said. AI can inform the entire construction cycle, from estimating preliminary costs, to tracking quality and productivity, to the more tactical tasks of ordering and payment.

Kaplanoglu ventured predictions for AI, ML and CV in construction:

Unified, intelligent clouds will unite software, platforms and infrastructure within end-to-end solutions, easing management and accelerating change.

Oracle is a major player in the world of project management and design, and in 2018 Kaplanoglu co-founded the Oracle Industry Lab to advance the companys innovation in construction and engineering. He explained the rationale. When youre running day-to-day operations, its very hard to stop what youre doing and try these new technologies. So, by creating a testbed where we bring the technology and they bring their problems, we are creating a neutral space to try and learn from it.

To realize this, writing a data strategy around machine learning was followed by building an ecosystem because our products do scheduling, safety, risk management, cost. We had almost 2,000 engagements in the first two years, he said. The pandemic only spurred a concentration on autonomy. All of a sudden, we did remote site monitoring, cameras and drone flights, route inspections. We could work with architects and designers remotely; they didnt have to come to the site. Everything was video.

The original sandbox gave way to a much larger, industry-benchmarking construction space for investigatingrealistic job site situations. It opened in April 2022on time, on budget. With analogs now operating in the United Kingdom and Australia, it has passed 13,000 engagements. I think construction really got used to digital imagery, 3D point clouds, using drones, capturing data, Kaplanoglu said.

Construction has some relatively unique characteristics, from a constantly changing environment to wind and weather exposure. Its like building a prototype each time, Kaplanoglu added. Data can bridge that. For example, Komatsu, the massive construction and mining equipment manufacturer, is using Nvidias Jetson edge AI platform to impart intelligence to trucks, excavators and the like. AI and machine learning are critical to providing real value to the construction space, a Nvidia spokesman toldInside Unmanned Systems.

In terms of project management, Kaplanoglu said, theres tons of opportunities when we want to use machine learning, computer vision and GenAI.

The need to test construction innovation, validation and measurement in real-world situations led to a collaboration between Oracles. Smart Construction Platform ecosystem and founding partner tenant Reconstructs visualization tools. The goal was to create guidelines and tool utilizations best suited for particular use cases, improving documentation of and guidelines for progress, quality and assessment of as-built conditions through a structures lifecycle.

The Report explains costs, time spent on different tools, and what kind of results you should be expecting from each, Golparvar explained about the 2023 document.

Pepper Construction and Clayco offered project expertise on the construction side, with leading drone company Skydio and 3D measurement company FARO Technologies providing technical expertise.

Eight different ways of capturing and mapping data were scanned across resolutions, speed, cost and deliverables. These included manual and autonomous drones, 360 and smartphone cameras, and various stationary and mobile LiDAR techniques. Capture data was processed into Oracles platform and then into Reconstruct, which could process them for a consistent viewing experience. Post-processing, photogrammetry and 4D simulations could be viewed for actionable and exportable results.

Construction is said to be the number one civil market for drones, which can take pictures quickly in hard-to-access areas while their sensors allow for obstacle avoidance. For the experiment, a fully autonomous Skydio drone used its Indoor Capture vision-based autonomous software over several iterations to, as Solutions Engineering Manager Colin Romberger put it on Oracle TV, get total coverageto have the best data to put into photogrammetry software for Reconstruction.

Kaplanoglu discussed gains from using autonomous drones. The time savings dropped significantly, because youre already preprogrammed. Human drone flights also improved with repetition. Theres still some autonomy, like it avoids obstacles and you can do flight plans.

But the biggest drop happens when its fully autonomous.

You always want to know about what youre trying to solve, what kind of tools you have and what kind of resources you can leverage to solve that problem, Golparvar said.

He and Kaplanoglu enumerated how AI and associated technologies can self-direct and break down planning and actions.

Site selection already involves machine learning, Kaplanoglu said, because you have certain parameters to make sure you comply with, like height restrictions, zoning. You can use machine learning to pick the optimal location.

Design can use machine learning to investigate bigger datasets and find helpful patterns. Golparvar: You want to capture the architects intent and transform it into a document that can be used as a base of design. GenAI can provide and reconcile design alternatives, incorporating everything from routing mechanical systems to complying with local codes. The way it works, Kaplanoglu added, you sketch things and then people try to visualize it and build models. Learning from them can offer significant time savings. Im talking about doing this in an hour versus doing it in months.

Construction has its own mini cycle. You define your scope, you hire engineers and architects, and now you need a contractor to build it Kaplanoglu said. You send an RFP for a contractor to build it.

RFPs, however, can be limited and schedulers may cut and paste. Kaplanoglu offered a solution: We can upload an RFP document to Oracle Cloud infrastructure. It reads the document and gives you a summary and then it asks you questions. Its going to show you an early-stage prototype, and it builds a schedule for you. Its going to be a good template for you to build your own schedule.

Now something that would have taken you three days is going to take you maybe two to three hours.

Operations can use computer vision. How many times does someone show up at their location and they dont even know what the specifications are or what theyre supposed to do, Kaplanoglu said. You can take a picture or video and then computer vision can tell you, Its this manufacturer, this model, or vision can process the video or image and say, Oh, theres rust in this corner.

Predictive maintenance can significantly reduce costs and downtime. Safety also can be empowered by AI. Many of the AI tools that we have are completely focusing on offering better awareness for our workers, Golparvar said.

Skydio has participated in the Oracle labs work, and Kaplanoglu offered an example of its drones adding value. We have a great relationship with Skydio. Instead of taking images that humans need to tag, locate and input, that data goes to our product, and then our Vision Services recognizes the rust and registers it to our work order system. But it doesnt just register; it tells you the lat[itude] and long[itude], the location, the severity of the damage, it helps you create a ticket.

Basically, we automate the whole thing.

Imprecise capture, incorrect inputting and lack of updating can undermine results. Still, the industry has significantly improved its use of cloud, Kaplanoglu said. The data is more accessible, you can put more line, compute is easier. There are a lot of benefits to it, and our industry really took that to heart.

Golparvar advocates a step-by-step process. We do a maturity assessment on the readiness of a company for adopting and adapting AI-driven products, and what we can do to make sure that these products fit into the existing workflows. Because people are resistant to change.

We also need to demonstrate the return on investment, at the project level and the individual levelquality, speed, time and money saved. So, the best strategy is to kind of introduce one next step to that project team so we can take them from where they are to that future where many of their steps will be fully automated, and make sure they see how AI is verifiable, to have that element of trust.

Kaplanoglu is bullish in terms of technology adoption. Machine learning, computer vision and GenAI are all going to have a big impact in the next three to five years.

Continued here:
Inside AI: Talking to the Data - Inside Unmanned Systems

Read More..

Statisticians and physicists team up to bring a machine learning approach to mining of nuclear data – EurekAlert

Article Highlight | 8-Mar-2024

Bayesian statistical methods help improve the predictability of complex computational models in experimentally unknown research

DOE/US Department of Energy

image:

Schematic illustration of the density of the Dirichlet distribution for true model mixing.

Credit: Image courtesy of V. Kejzlar

Physicists use theoretical models to study physical quantities, such as the mass ofnuclei, where they do not have experimental data. However, using a single imperfect theoretical model can lead to misleading results. To improve the quality of extrapolated predictions, scientists can instead use several different models and mix their results. In this way, scientists make the most of the collective wisdom of multiple models and obtain the best prediction from the most current experimental information.

To improve the predictability of complex computational models, a team of nuclear physicists and statisticians proposed a novel statistical method. This method uses a statistical process called Bayes' theorem to revise the probability of a hypothesis as new data are obtained. The resultingmachine learningframework uses the so-called Dirichlet distribution. This statistical process combines the results of several imperfect models. The researchers demonstrated the ability of the proposed mixing techniques to minedataon nuclear masses.

This research demonstrated that global and local mixtures of models have excellent performance in both the accuracy of their predictions and their uncertainty quantification. These mixtures appear to be preferable to classical Bayesian model averaging, the conventional approach. Additionally, the researchers analysis indicates that improving model predictions through straightforward mixing leads to more robust extrapolations than does mixing of corrected models.

This material is based on work supported by the Department of Energy Office of Science, Office of Nuclear Physics.

Disclaimer: AAAS and EurekAlert! are not responsible for the accuracy of news releases posted to EurekAlert! by contributing institutions or for the use of any information through the EurekAlert system.

View original post here:
Statisticians and physicists team up to bring a machine learning approach to mining of nuclear data - EurekAlert

Read More..

Building trust in deep learning-based immune response predictors with interpretable explanations | Communications … – Nature.com

MHC-Bench

The MHC-Bench dataset consists of 2,232,937 peptides of length 9 and 115 MHC alleles. All the MHC alleles in this dataset are human MHC molecules (i.e. Human Leukocyte Antigens or HLA). Out of the 115 MHC alleles, about half are HLA-B, a third are HLA-A and remaining HLA-C. The MHC-Bench dataset contains 3,464,013 peptide-MHC pairs previously unseen by predictors during training. It is worth noting that the peptides by themselves may have been seen in the training data paired with a different HLA allele. The peptide overlap between training data for investigated predictors and MHC-Bench is shown in Table1. A description of the construction and composition of the dataset is presented in Methods section.

We evaluated the performance of four MHC class I predictors on the MHC-Bench dataset MHCflurry, MHCfovea, NetMHCpan and TransPHLA. The choice of the four predictors was guided by their popularity and performance reported in the literature1,2,4,47. MHCflurry2.047 is an ensemble predictor of 10 neural networks for predicting presentation of a peptide. It supports 14,993 MHC class I alleles. MHCflurry provides three scores, namely Binding Affinity (BA), Processing score and Presentation Score (PS). PS is produced by a logistic regression model combining the binding affinity and processing scores. Processing score captures the antigen probability which combined with binding affinity substantially improves the performance of the predictor47.

NetMHCpan4.11, an ensemble of 50 neural network models, produces Elution Ligand (EL) and Binding Affinity (BA) score and we refer to these modes as NetMHCpan-EL and NetMHCpan-BA respectively. It utilizes NNAlign_MA48, an artificial neural network that predict both BA and EL scores. For both modes, peptides with rank 2% or less are considered as binders and 0.5% or less as strong binders. It supports 11,000 MHC class I alleles.

MHCfovea2, is an ensemble of multiple CNN models that takes MHC allele sequence and peptide sequence as input to predict binding probability. In MHCfovea, ScoreCAM2,49 is applied to identify important positions and corresponding residues in input peptides and MHC allele sequence. It provides the motifs for the first and last 4 positions of the peptide for each allele along with the motif for MHC sequence.

TransPHLA4, is a transformer architecture that predicts binding probability for an input peptide and MHC allele sequence. Using the attention scores, important residues for each position of a 9mer can be obtain to generate a peptide motif for a given allele.

We use Area Under ROC (AUROC) and Area Under Precision-Recall Curve (AUPRC) as benchmark or performance metrics. Since peptide binding is MHC allele specific, we calculated the scores for each allele separately. The benchmark metrics for each allele are reported in Supplementary Data1, 2. The average benchmark metrics for the MHC class I predictors are reported in Table2. We find that all four predictors are comparable in their average performances across all alleles, as seen in Fig.2c. While the differences between the scores among predictors is minimal, we see that for most alleles, performance of MHCflurry-PS is marginally higher. Figure2a displays number of times (or number of alleles) the predictor was top performer based on AUROC and AUPRC score. We provide additional benchmarking analysis of the predictors in theSupplementary material (See Supplemental Fig.1 and Supplementary Note2) which indicates that the performance of the predictors are comparable across various metrics. The predictors achieve a higher performance on the AUROC metric, 0.950.98, as opposed to AUPRC where it is in the range of 0.750.86. For most alleles, there are fewer binding peptide-MHC pairs in comparison to non-binding peptide-MHC pairs. This is evident in the distribution plot in Fig.2b where for most of the alleles, only 110% of the peptides are binding. The %binding peptide-MHC pairs per allele is provided in Supplementary Data3). In imbalanced scenarios with few positive labels, the AUROC metric can be overly optimistic50. In contrast, AUPRC is relatively unaffected by this imbalance in the dataset owing to its focus on true positives (as true negatives are not into consideration). These two metrics were selected because they are used for benchmarking in the original paper describing the predictors. Recent work by Carrington et al.51 provides a useful discussion on use of AUROC and AUPRC metrics while also introducing a new metric called DeepROC.

a The number of alleles for which predictors exhibited the highest performances based on AUROC and AUPRC scores. b The percentage of binders in the MHC-Bench dataset per allele. Each dot represents a measurement for one allele. c Distribution of AUROC and AUPRC scores for the MHC class I predictors.

Explanations can be classified as either Global or Local. Global explanations for a predictor are distribution of input feature values across all inputs in the dataset. This offers a consolidated perspective on how the model utilizes input features for a specific output label. In contrast, local (or instance-based) explanations, focus on a single input - output instance. Local explanations typically show attribution of input features used for the prediction outcome. In the context of MHC class I predictors, binding motifs are examples of global explanations while a vector of attribution values for individual peptide positions forms a local explanation. It is worth noting that our work focuses on post-hoc explanations, i.e., explanations for existing models that have been previously trained. Post-hoc explanations are widely applicable as they can be used over models whose internal structure is not known or is too complex to interpret. Existing MHC class I predictors, like MHCfovea, focus on global explanations by generating binding motifs for MHC alleles. There is limited work on local instance-based explanations for this problem. In the next two sections, we motivate the need for local explanations for MHC class I predictors and discuss the additional information it can provide over global explanations.

An explanation for a 9mer peptide in our instance-based approach is represented as a length-9 vector of attribution values, generated through LIME or SHAP. Each positions attribution value can be positive or negative, with a positive (or negative) value indicating that the residue at that position contributes positively (or negatively) to the prediction. Our MHCXAI framework facilitates the generation of explanations for any MHC class I predictor by simply substituting the predictor module while keeping the LIME and SHAP modules unchanged. Using the MHCXAI framework, we successfully generated LIME and SHAP explanation vectors for input peptides from the MHC-Bench dataset across all examined predictors.

In this study, our main emphasis is on providing explanations for input peptides rather than allele sequences, given that not all predictors can process allele input in the form of an amino acid sequence. For instance, NetMHCpan and MHCflurry accept allele inputs as HLA-A02:01 and HLA-A0201, respectively. However, it is important to highlight that our framework has the capability to generate explanations for allele sequences in addition to peptides. This feature can be particularly beneficial for predictors like TransPHLA, which accepts the allele as an input sequence. An illustration of an allele explanation is presented in Supplementary Fig.S2 of Supplementary Note3.

Figure3a illustrates LIME and SHAP explanations generated for all examined predictors for a specific peptide-MHC allele pair (LLVEVLREIHLA-A*02:01). To visualize the explanations, heatmaps are constructed using the attribution values, with lighter colors indicating positive contributions and darker colors indicating little or negative contributions to the binding class. The peptide LLVEVLREI is a binding peptide for the HLA-A*02:01 allele, correctly classified by all MHC class I predictors. However, it is worth noting that the explanations from SHAP and LIME exhibit slight differences. For example, in Fig.3a, both LIME and SHAP attribute high importance to peptide position P2, but SHAP also recognizes peptide position P9 as significant. P9 is typically considered important for binding based on existing literature52,53.

a Examples of SHAP and LIME explanations for all investigated MHC class I predictors for the LLVEVLREI--HLA-A*02:01 pair. To visualize the explanations, the attribution values of positions are used to create heatmaps. For each explanation, a lighter color indicates a positive contribution, while a darker color indicates a smaller or negative contribution to the positive class. SHAP explanations for all the predictors highlight peptide positions P2 and P9 as the most important for binding, while LIME explanations highlight only peptide position P2 as the most important. b LIME and SHAP explanations for NetMHCpan-EL and -BA for the peptide KVAQKQFQL binding to HLA-A*02:01. NetMHCpan-EL classifies KVAQKQFQL correctly, but NetMHCpan-BA does not. SHAP captures these differences in performance and produces different explanations for the two predictor modes. c LIME and SHAP explanations for MHCflurry-PS and -BA for the peptide RVMAPRALL binding to HLA-C*06:02. MHCflurry-PS classifies RVMAPRALL correctly, but MHCflurry-BA does not. Similar to the example in b, SHAP captures these differences in performance and produces different explanations for the two predictor modes. In both NetMHCpan and MHCflurry examples, LIME explanations are unable to indicate positions leading to the difference in prediction outcome.

In Fig.3b, the peptide KVAQKQFQL binds to the MHC allele HLA-A*02:01. However, within NetMHCpan, NetMHCpan-EL correctly predicts it as a binder, while NetMHCpan-BA classifies it as a non-binder. SHAP and LIME explanations were generated for both modes, revealing that SHAP can identify the features (or peptide positions) responsible for the predictions made by NetMHCpan-EL and -BA. For instance, at peptide position P1, the amino acid K positively contributes to the prediction outcome in NetMHCpan-EL, whereas it negatively contributes in NetMHCpan-BA. LIME, on the other hand, produces similar attribution values for both predictors and is unable to highlight the cause for the difference in prediction between the two NetMHCpan modes.

In Fig.3c, the peptide RVMAPRALL is a binder to the MHC allele HLA-C*06:02, classified as a binder by MHCflurry-PS but not by MHCflurry-BA. SHAP and LIME were employed to explore the difference in predictions. SHAP identified that for MHCflurry-PS, peptide positions P1 and P9 play an important role. For MHCflurry-BA, while peptide position P9 is crucial, P1 is not deemed important (refer to the heatmaps in Fig.3c). This distinction between the two explanations helps identify and understand the reasons for different predictions.

These examples demonstrate that XAI techniques can generate explanations for MHC class I predictors. However, explanations produced by distinct XAI techniques for the same predictor may not align. Consequently, we evaluate the validity and quality of LIME and SHAP explanations against XAI metrics.

The average time required to generate an explanation for a single instance is influenced more by the choice of the predictor than the XAI technique itself. For example, generating an explanation (either LIME or SHAP) for MHCfovea takes twice as long as generating a corresponding explanation for MHCflurry.

As stated earlier, global explanations for MHC class I predictors manifest as binding motifs. Determining the binding preference for an MHC allele involves examining the most frequently occurring amino acids at anchor positions, specifically positions 2 and 9 in a 9mer peptide, which are the primary sites responsible for binding to an MHC molecule52,53. This can be extended to other peptide positions, forming a binding motif for an MHC allele.

Biological binding motifs for MHC alleles are generated using experimentally validated strong binders54. With recent MHC class I predictors, binding motifs are derived from peptides predicted as strong binders for a particular allele. These peptides serve as the basis for generating position-specific scoring matrices (PSSM), which are then visually represented as binding motifs. This approach is applied to generate binding motifs for MHCflurry, NetMHCpan, and MHCfovea. By comparing these predictor-generated motifs against the biological motif, it becomes possible to assess whether the predictor has effectively learned the correct binding patterns for a given allele. In this study, we utilize binding motifs from the MHC Motif Atlas database55.

Global explanations may overlook deviations observed in specific inputs. Consider a binding peptide that diverges from the typical biological binding motif pattern. In cases where a predictor correctly classifies this peptide as a binder, it becomes valuable to examine the specific features used by the predictor for this classification. Understanding the input features employed by the predictor for a particular peptide requires a local explanation rather than a binding motif. Specialized patterns like these are difficult to infer with a binding motif.

We illustrate the necessity for local explanations to capture specialized patterns that deviate from biological binding motifs with specific examples. In Fig.4a, the motif for HLA-A*02:01 suggests a preference for amino acids L, I, and, M at anchor position P2, binding to the super hydrophobic B pocket of the MHC molecule. However, despite the unfavorable nature of water-soluble Glutamine (Q) for such a pocket, solved peptide-bound HLA molecule structures indicate that many peptides with Q do bind strongly to HLA-A*02:0156. An example is the peptide FQFICNLLL (see Fig.4a), correctly classified as a binder by the MHCflurry-PS predictor. We generated a local explanation for this peptide using SHAP (see Fig.4a). The highest attribution values were assigned to peptide positions P1, P2, and P9. While the high importance of peptide positions P2 and P9 aligns with their roles as anchor positions, the elevated attribution value for peptide position P1 is rationalized by its crucial role in stabilizing the bound structure, as observed in refs. 53,56,57. It is worth noting that the amino acid Q in position P2 does not appear in the biological binding motif (global explanation) prominently which fails to capture the specialized pattern in this instance.

For a and b, biological binding motifs for HLA-A*02:01 and HLA-A*24:03 are obtained from the MHC Motif Atlas55. For each MHC allele, there are four heatmaps, which are SHAP explanations generated for true positive, true negative, false positive, and false negative peptides predicted using MHCflurry-PS. The peptides in both a and b defy the reasoning for binding based on biological motifs. However, the SHAP explanations are able to highlight the cause behind unexpected outcomes. A lighter color in the explanation heatmap indicates a positive contribution, while a darker color indicates a smaller or negative contribution to the positive class.

In Fig.4a, the true negative instance HQKKNEISF also contains the amino acid Q in position P2, similar to the true positive instance discussed. However, in this case the local explanation shows low attribution values for all other peptide positions. This indicates lack of strong binding signal from those positions, explaining the negative classification.

Figure4b is another example of explanations generated for true positive, true negative, false positive and false negative predictions made by MHCflurry-PS for HLA-A*24:03. In this example, peptide conforming to binding motif is correctly classified as non-binder whereas peptide not conforming to binding motif is correctly classified as binder.

In summary, instance-based explanations are particularly useful in explaining scenarios where binder peptides do not conform to motifs, misclassifications, and understanding a peptide specific pattern used for prediction. Additionally, we show that global explanations can be created using instance-based SHAP and LIME explanations in Supplementary Fig.S3 (in Supplementary Note4) which can be useful for quick comparison across predictors.

To achieve trust in a predictor, the generated explanations for the predictions must be reliable. Assessing the reliability of explanations involves comparing them to ground truth about the input-output relationship, as suggested by prior work such as58,59,60. The ground truth in our case are the residues in the peptide that genuinely contribute to the binding. The contribution of residues (hotspots) can be estimated experimentally using Alanine-scanning mutagenesis61, a resource intensive45 technique. Computationally, this can be achieved using BAlaS45,46 which calculates the difference between the free-energy of binding of original bound complex and mutated bound complex where just one residue of ligand peptide is replaced with alanine. This difference in free-energy of binding is indicated as G and G4.184kJ/mol is considered hot or important residue for binding46. G4.184kJ/mol indicates alanine enhances binding relative to the original residue46. Any value between denotes neutral substitution46. As it is difficult to obtain ground truth for all peptides, we use this G as an independent way of highlighting important residues in the peptide.

First, we compile all the available PDB structures featuring bound peptide-MHC allele complexes as documented in the MHC Motif Atlas55. Subsequently, we refine the list to encompass bound peptides with a length of 9 and narrow down the selection to structures that were consistently classified as binders by all examined MHC class I predictors. The resulting list comprises 250 PDB structures, encompassing 40 distinct MHC alleles (as listed in Supplementary Data5).

We compared the G for these 250 peptide-MHC pairs with the LIME and SHAP explanations generated for all the predictors. The LIME and SHAP values can be positive or negative, similar to G, which indicates residue contribution to the prediction. For each of these 250 peptide-MHC pair we calculated Pearson correlation coefficients between LIME/SHAP explanations and G, for each of the investigated predictors.

In Fig.5a, consider the instance of the peptide ITDQVPFSV bound to HLA-A*02:01, which is correctly classified. BAlaS identifies peptide positions P1, P2, P7, and P9 as hot residues (with G4.184kJ/mol), which are highlighted in red within the peptide-MHC complex. The SHAP explanations, feature red arrows pointing to the positions identified as important by BAlaS. Generally, the models consistently prioritize these positions when making predictions. However, despite having a high G, peptide position P7 is not deemed important by most of the predictors. This suggests that the information from the other three residues is sufficient for the predictors to infer the classification outcome. The distribution of correlation coefficients between SHAP-G and LIME-G (depicted in Fig.5b) indicates a positive correlation between the explanations and the important positions identified by BAlaS. Overall, it is observed that SHAP explanations exhibit a closer correlation compared to LIME explanations.

It is done by comparing the attribution values to the difference in free-energy of binding between wild-type protein-protein interaction and mutated protein-protein interaction, known as G. a For the ITDQVPFSV--HLA-A*02:01 complex, BAlaS highlights that peptide positions P1, P2, P7, and P9 are crucial for binding. Replacing the residues at these positions with alanine leads to an increase in G, indicating instability. This peptide is correctly classified by all the investigated MHC class I predictors, and SHAP explanations are generated for each of them. The explanations mostly match the ground truth, as P1, P2, P7, and P9 (indicated by red arrows) are rightly highlighted as the factors influencing the prediction. b G was calculated for each peptide position in 250 PDB structures containing peptide-MHC allele bound complexes that were correctly classified by all the investigated predictors. The SHAP and LIME explanations correlated positively for most complexes, indicating that the explanations mostly align with the ground truth and can be trusted. The correlation coefficient values are reported in Supplementary Data4. A lighter color in the explanation heatmap indicates a positive contribution, while a darker color indicates a smaller or negative contribution to the positive class.

The observed variance in the distribution of correlation coefficients is not surprising, given that BAlaS G serves as only an approximation of the actual positions involved in binding, and the approach is subject to certain limitations. Notably, the accuracy of the G calculation is influenced by the resolution of the PDB structure (refer to Supplementary Fig.S4 and Supplementary Note5). To address this, we selectively choose PDB structures with the highest resolution when multiple structures are available. Additionally, since G is computed by substituting a residue with alanine, it is challenging to ascertain the contribution of alanine, if present (refer to Supplementary Fig.S5).

Consistency refers to similarity in the explanations produced for two similarly performing predictors on a given input. We assess consistency of an XAI technique by comparing explanations for a given peptide between two similarly performing MHC class I predictors (Fig.1c).

To select two predictors with comparable performance, we choose the top two predictors from our results in Section 2 (see Fig.2a), namely MHCflurry-PS and MHCfovea. Additionally, the AUROC scores for these two predictors exhibit a high correlation, as indicated in Fig.6c, demonstrating substantial similarity in their performance.

a For an input peptide, explanations were generated for MHCflurry-PS and MHCfovea using SHAP and LIME. Pearson correlation coefficients were calculated between these explanations, and the process was repeated for 200 input peptides for each of the alleles presented in the plot. The distribution of Pearson correlation coefficients is closer to one, indicating high similarity between the two explanations for the two predictors on the same input. The correlation coefficient values are reported in Supplementary Data6. b In addition to the Pearson correlation coefficient, Euclidean distances were calculated between two explanations for two predictors on the same input. For Euclidean distance, values closer to zero indicate high similarity and high consistency. c Correlation heatmap for AUROC scores between investigated MHC class I predictors. MHCflurry-PS and MHCfovea are highly correlated in their performances.

We selected 9 alleles (3 each from HLA-A, B and C) and for each allele, we randomly selected 200 peptides from our MHC-Bench dataset to generate local explanations, independently using each of SHAP and LIME, for MHCfovea and MHCflurry-PS. For both LIME and SHAP, to compare the similarity between the explanations from the two predictors, we computed Pearson correlation and Euclidean distance.

In Fig.6a, the distribution of Pearson correlations between explanations generated individually for MHCflurry-PS and MHCfovea using LIME and SHAP is presented for all nine alleles. Overall, the majority of the SHAP and LIME explanations exhibited high correlation. In Fig.6b, the distribution of Euclidean distances between the explanations of the two predictors is presented. Explanations that are similar will have a Euclidean distance closer to zero. It is noteworthy that the Euclidean distance distribution for LIME has a narrow range and tends to be closer to zero compared to SHAP. This observation suggests that LIME produces more consistent explanations compared to SHAP.

We also created a baseline distance between explanations for the two predictors using the following procedure. First, we generated 100 random explanations for each original MHCfovea explanation by randomly permuting the attribution values. Next, we calculated the distance between each of these 100 random explanations and the original MHCflurry-PS explanation. The baseline distance was then computed by averaging these 100 distances. This process was repeated for all 200 peptides chosen per allele. Consequently, we obtained 200 Euclidean distances between the original MHCflurry-PS and MHCfovea explanations, along with their corresponding baseline distances. We compared these two distributions for each allele. We confirmed that the two distributions for both SHAP and LIME were statistically different using Kruskal-Wallis test at 5% significance level. The p-value, H-statistics and effect size are reported in Supplementary Tables1 and 2 for SHAP and LIME respectively. It is also worth noting that the Euclidean distance for both LIME and SHAP explanations were smaller than the corresponding average baseline Euclidean distance for nearly all the input peptides (99% input peptides).

We confirmed that the two distributions for both SHAP and LIME were statistically different using Kruskal-Wallis test at 5% significance level. The p-value, H-statistics, and effect size are reported in Supplementary Tables1 and 2 for SHAP and LIME, respectively. Additionally, it is noteworthy that the Euclidean distances for both LIME and SHAP explanations were smaller than the corresponding average baseline Euclidean distances for nearly all input peptides (99% of input peptides).

Stability of an explanation technique refers to the extent to which explanations for similar inputs (with same output) over a given predictor are close. We use the MHCflurry-PS predictor to asses stability of the LIME and SHAP techniques, independently. To identify input peptides that are similar, we perform clustering over a subset of peptide sequences for HLA-A*02:01. Using GibbsCluster-2.062,63, we cluster the peptides into 110 clusters. The number of clusters that yields highest average KullbackLeibler Distance (KLD) is considered to be the optimum number of clusters. We found choosing 10 clusters has the highest KLD with cluster size ranging between 7001000 peptides. The plot showing KLD distribution and cluster motifs generated from GibbCluster is provided in Supplementary Fig.S6. Peptides within a cluster are considered similar.

From each of these clusters, we sampled 100 peptides that are binders and generated explanations for these peptides. We calculated the Euclidean distance between all pairs of peptides within each cluster, and this is referred to as the intracluster distance distribution. As a comparison, we also computed the distance between explanations for peptides from different clusters, referred to as Intercluster distance. We show results for the top six most unrelated cluster pairs (c2, c5), (c3, c5), (c3, c8), (c5, c6), (c5, c9), (c5, c10), based on the similarity of their position-specific scoring matrix, in Fig.7.

a Euclidean distance distribution for the top six cluster pairs using LIME. For each pair, there are three distributions - IntraclusterL, Intercluster, and IntraclusterR. For any two pairs (e.g., c2, c5), the intracluster explanation distance distribution for the left cluster (c2) and right cluster (c5) are IntraclusterL and IntraclusterR, while Intercluster is the distribution of explanation distances between the two clusters. b Euclidean distance distribution for the top six cluster pairs using SHAP. For each pair, there are three distributions - IntraclusterL, Intercluster, and IntraclusterR. For any two pairs (e.g., c2, c5), the intracluster explanation distance distribution for the left cluster (c2) and right cluster (c5) are IntraclusterL and IntraclusterR, while Intercluster is the distribution of explanation distances between the two clusters.

For each cluster pair in Fig.7, we have three distributions: IntraclusterL, Intercluster, and IntraclusterR. Consider the pair (c2, c5), where IntraclusterL represents the intracluster Euclidean distance distribution for cluster c2 (or the left cluster), Intercluster is the intercluster Euclidean distance distribution between c2 and c5, and IntraclusterR is the intracluster Euclidean distance distribution for cluster c5 (or the right cluster). The notation of left-right for Intracluster is arbitrary. It is worth noting that intracluster distances are lower than the intercluster distances, indicating that LIME and SHAP explanations for peptides within the same cluster are more similar, suggesting stability of explanations. We confirmed that the differences between intracluster and intercluster distance distributions in Fig.7 are statistically significant (KruskalWallis test). The p-value, H-statistics, and effect size are reported in Supplementary Table3.

Read more:
Building trust in deep learning-based immune response predictors with interpretable explanations | Communications ... - Nature.com

Read More..

Development of machine learning-based predictors for early diagnosis of hepatocellular carcinoma | Scientific Reports – Nature.com

Derivation of HCC predictors

The whole procedure of analysis was designed as follows in Fig.1. In present study, we used two feature selection methods and eight classification algorithms mentioned above to build sixteen predictors for HCC diagnosis by using gene expression profiles of 988 HCC and 332 CwoHCC accessed from the GEO database. First, on the basis of gene expression profiles of 988 HCC and 332 CwoHCC, 25,341,086 and 20,559,429 stable gene pairs were acquired, respectively. Among 25,341,086 and 20,559,429 gene pairs, there were 5765 stable reversal gene pairs between HCC tissues and CwoHCC tissues. Then, filtering gene pairs using 2902 secreted genes, we obtained 242 gene pairs, where gene i and gene j were secreted gene. Next, based on novel profiles with 242 features (gene pairs) (see Methods section), we captured the optimal feature (see Fig.2). Table 1 showed the comparison of classification performance of various predictors obtained based on accuracy, F1-Score fitness function and AUC value. The results presented in Table 1 illustrated that nine predictors, including mRMR+KNN, mRMR+SVM, mRMR+LR, mRMR+XGBoost, mRMR+LMT, MRMD+KNN, MRMD+SVM, MRMD+LR and MRMD+LMT, showed excellent results for all performance metrices, and reached accuracy of 1, F1-score of 1 and AUC of 1, respectively. Among these nine predictors, the predictor of mRMR+KNN and mRMR+SVM had the least number of 11 gene pairs (see Table 2).

The workflow of analyses.

A plot to show the IFS curve. Through adding features (gene pairs) ranked by mRMR and MRMD feature selection method one by one, the optimal feature was obtained when the highest accuracy was achieved.

Subsequently, we used independent datasets (including testing set, GEO sets, ICGC set and TCGA set) to validate the performance of various algorithms. In Table 3, for the 3057 HCC samples and 84 CwoHCC samples, MRMD+SVM predictor with 28 gene pairs (see Table S3) gained the highest accuracy and F1-score than other predictors in independent datasets, the accuracy, F1-score, and AUC were 0.9834, 0.9915, 0.9278 (95% CI is 0.89150.9642), respectively. However, the results also indicated that mRMR+SVM predictor with 11 gene pairs gained the highest AUC than other predictors in independent datasets, the AUC was 0.9384 (95% CI 0.92550.9514).

Since mRMR+SVM predictor and mRMR+KNN predictor with the least number of 11 gene pairs showed great results for all performance metrices in independent data, and MRMD+SVM predictor gained the highest accuracy and F1-score in independent datasets among 16 predictors, thus we focused on these three predictors in the next analysis. The detailed validation results of these three predictors in biopsy and surgery samples were shown in Table 4. For biopsy samples, both mRMR+SVM predictor and mRMR+KNN predictor yielded sensitivity of 1, specificity of 1 by using testing set (29 HCC samples and 48 CwoHCC samples), while MRMD+SVM predictor yielded sensitivity of 1, specificity of 0.8542. In GEO biopsy sets, mRMR+SVM predictor correctly classified 96.18% of the 131 HCC samples (GSE121248, GSE47197), mRMR+KNN predictor correctly classified 66.41% of the 131 HCC samples as well as all (100%) of the 131 HCC samples were correctly classified by MRMD+SVM predictor. For surgery samples, in the testing set (220 HCC samples and 36 CwoHCC samples), the sensitivity and specificity of two predictors (mRMR+SVM predictor and mRMR+KNN predictor) were 1. While, the sensitivity and specificity of MRMD+SVM predictor was 1 and 0.8889. This result demonstrated that mRMR+SVM predictor, mRMR+KNN predictor and MRMD+SVM predictor could discriminate HCC from CwoHCC correctly when using biopsy samples.

For surgery samples, in GEO surgery sets, 84.1% of the 2063 HCC samples were correctly classified by mRMR+SVM predictor, 70.04% of the 2063 HCC samples were correctly classified by mRMR+KNN predictor and 98.01% of the 2063 HCC samples were correctly classified by MRMD+SVM predictor. Moreover, among 2063 HCC samples, based on mRMR+SVM predictor, 79.76% of the 657 formalin-fixed paraffin-embedded (FFPE) HCC samples (GSE109211, GSE62743, GSE46444, GSE10141, GSE164760, GSE19977) were correctly recognized as HCC; while 58.14% of the 657 FFPE HCC samples was correctly classified by mRMR+KNN predictor and 99.85% of the 657 FFPEHCC samples was correctly classified by MRMD+SVM predictor. This result demonstrated that mRMR+SVM and mRMR+KNN predictor were available to the FFPE samples with RNA degradation. For the RNA-seq expression data obtained from TCGA and ICGC, the 11 gene pairs based on mRMR+SVM predictor could correctly identify 99.19% of the 371 HCC and the 98.77% of the 243 HCC samples, respectively.

While the 11 gene pairs based mRMR+KNN predictor could correctly identify 98.11% of the 371 HCC RNA-seq and the 97.94% of the 243 HCC RNA-seq samples. And MRMD+SVM predictor with 28 gene pairs could correctly identify all 371 HCC RNA-seq and all 243 HCC RNA-seq samples. This result demonstrated that mRMR+SVM predictor, mRMR+KNN predictor and MRMD+SVM predictor had a cross-platform ability. In summary, these three predictors had a cross-platform ability and could discriminate HCC from CwoHCC when using surgery samples, including FFPE samples with RNA degradation.

Furthermore, in Table S4, 82.86% of the 741 normal tissues in patients with HCC samples (NwHCC) samples and 82.04% of the 334 cirrhosis tissues in patients with HCC samples (CwHCC) samples were correctly classified by mRMR+SVM predictor, 67.48% of the 741 NwHCC samples and 57.49% of the 334 CwHCC samples were correctly classified by mRMR+KNN predictor, and 99.87% of the 741 NwHCC samples and 97.01% of the 334 CwHCC samples were correctly classified by MRMD+SVM predictor. This result showed that these three predictors could identify HCC adjacent tissues (CwHCC and NwHCC) from CwoHCC when using biopsy and surgery samples.

In conclusion, for biopsy and surgery samples, these three predictors could identify HCC and its adjacent tissues (CwHCC and NwHCC) from CwoHCC even when sample location is not accurate and samples are FFPE samples with RNA degradation. Additionally, these three predictors had a cross-platform ability. Importantly, the performance of HCC diagnostic signature based on MRMD+SVM is superior to mRMR+KNN predictor and mRMR+SVM predictor in some independent datasets.

To further verify the performance of mRMR+SVM, mRMR+KNN and MRMD+SVM predictor developed in current study, we compared with the existing predictors. Two published studies about finding REOs-based signature for early HCC diagnosis have been completed by Ao et al. and our previous work. In 2018, combining rank difference with majority voting rule, Ao et al. presented a signature by applying 491 HCC samples and 149 CwoHCC samples. This signature, including 19 gene pairs, was chosen from 72 reversal gene pairs. And it yiled the accuracy of 0.9969. In 2020, we identified an early diagnostic signature of HCC from 857 reversal gene pairs on the basis of mRMR and SVM. Using 1091 HCC samples and 242 CwoHCC samples, 11 gene pairs were derived and denoted as the signature, which achieved 1 of accuracy. Due to the difference of training data, a comparison of current results in this paper with existing results in previous studies is an unfair comparison. Therefore, we utilized the same evaluation criteria. To further assessed effectiveness of presented predictors, experimental results in independent datasets were used to perform comparison objectively.

In Table 2, for training set, both mRMR+SVM predictor with 11 gene pairs and mRMR+KNN predictor with 11 gene pairs achieved accuracy of 1, F1-score of 1, as well as the number of gene pairs is the least. Also, MRMD+SVM predictor with 28 gene pairs achieved accuracy of 1, F1-score of 1. As shown in Table 3, for a total of 3057 HCC samples and 84 CwoHCC samples, mRMR+SVM predictor was the best predictor, which yielded AUC of 0.9384, and its accuracy and F1-score were 0.8914 and 0.9351, respectively. In Table 4 and Table S4, for biopsy samples, based on the mRMR+SVM predictor, 96.18% of the 131 HCC samples from 2 datasets (GSE121248, GSE47197) could be correctly identified as HCC. Moreover, 75.26% of the 97 NwHCC samples from 2 datasets (GSE121248 and GSE64041) and all 80 CwHCC samples in GSE54236 were classified as HCC. While, based on MRMD+SVM predictor, all of 131 HCC samples could be correctly identified as HCC, all 97 NwHCC samples and all 80 CwHCC samples were classified as HCC. For surgery samples, 1800 HCC samples from 24 datasets were used to perform evaluation and 657 of them were FFPE HCC samples from 6 datasets. Thus, mRMR+SVM predictor could correctly discriminate 1800 HCC samples and 657 FFPE HCC samples with the sensitivity of 0.8428 and 0.7976, respectively. Also, MRMD+SVM predictor could correctly discriminate 1800 HCC samples and 657 FFPE HCC samples with the sensitivity of 0.9872 and 0.9985, respectively. This result demonstrated that mRMR+SVM predictor and MRMD+SVM predictor had the potential to classify FFPE samples with partial RNA degradation. Moreover, based on mRMR+SVM predictor, 614 out of 741 NwHCC samples from 9 datasets and 229 out of 334 CwHCC samples from 6 datasets were predicted as HCC. While based on MRMD+SVM predictor, all 741 NwHCC samples and all 334 CwHCC samples were predicted as HCC. For RNA-seq data, based on mRMR+SVM predictor, 368 out of 371 HCC samples from TCGA and 11 out of 50 NwHCC tissues were correctly identified as HCC. While based on MRMD+SVM predictor, all 371 HCC samples and all 50 NwHCC tissues were correctly identified as HCC. In addition, 240 out of 243 HCC samples from TCGA were also correctly identified as HCC. While based on MRMD+SVM predictor, all 243 HCC samples were also correctly identified as HCC.

Results in Table S4 displayed the identification of both HCC and its adjacent non-cancer (NwHCC and CwHCC) from CwoHCC by biopsy and surgery samples. For 131 HCC biopsy samples, the sensitivity of proposed mRMR+SVM predictor with 11 gene pairs (18 secreted genes) and MRMD+SVM predictor with 28 gene pairs was 0.7526 and 1, which were higher than Aos method (0.6031). The identification ability of proposed mRMR+SVM predictor was also better than Aos method in 80 CwHCC samples. Additionally, among these methods, mRMR+SVM predictor and MRMD+SVM predictor displayed the better classification in 657 HCC FFPE samples, 1800 HCC surgery samples (657 HCC FFPE samples were included) and all 1931 HCC samples (1800 HCC surgery samples and 131 HCC biopsy samples were contained). For 657 HCC FFPE samples, the accuracy of Aos method, our previous method (11 gene pairs, 2020), proposed mRMR+SVM predictor and MRMD+SVM predictor in this study was 0.172, 0.3973, 0.7976, 0.9985, respectively. For 1800 HCC samples, the accuracy of Aos method, our previous method, proposed mRMR+SVM predictor and MRMD+SVM predictor was 0.6639, 0.7656, 0.8428, 0.9872, respectively. For 1931 HCC samples, the accuracy of Aos method was 0.6572, the accuracy of our previous method was 0.7815, while the accuracy of the proposed mRMR+SVM predictor and MRMD+SVM predictor could increase to 0.8503 and 0.97, respectively. Above result suggested that mRMR+SVM predictor and MRMD+SVM predictor displayed the better performance when comparing with Aos method and our previous method.

In conclusion, methods developed in this paper produced higher accuracy and had superior prediction and diagnosis abilities compared to other published methods, especially for FFPE samples. Therefore, the mRMR+SVM predictor and MRMD+SVM predictor were deemed superior and more suitable predictors for facilitating early HCC diagnosis in clinical practice.

Read this article:
Development of machine learning-based predictors for early diagnosis of hepatocellular carcinoma | Scientific Reports - Nature.com

Read More..

Advancing Chemistry with AI: New Model for Simulating Diverse Organic Reactions – Lab Manager Magazine

Key Takeaways:

Researchers from Carnegie Mellon University and Los Alamos National Laboratory have used machine learning to create a model that can simulate reactive processes in a diverse set of organic materials and conditions.

Subscribe to our free Lab Manager Monitor newsletter.

"It's a tool that can be used to investigate more reactions in this field," said Shuhao Zhang, a graduate student in Carnegie Mellon University'sDepartment of Chemistry. "We can offer a full simulation of the reaction mechanisms."

Zhang is the first author on the paper that explains the creation and results of this new machine learning model, "Exploring the Frontiers of Chemistry with a General Reactive Machine Learning Potential," which was published in Nature Chemistry on March 7.

Though researchers have simulated reactions before, previous methods had multiple problems. Reactive force field models are relatively common, but they usually require training for specific reaction types. Traditional models that use quantum mechanics, where chemical reactions are simulated based on underlying physics, can be applied to any materials and molecules, but these models require supercomputers to be used.

This new general machine learning interatomic potential (ANI-1xnr) can perform simulations for arbitrary materials containing the elements carbon, hydrogen, nitrogen, and oxygen and requires significantly less computing power and time than traditional quantum mechanics models. According to Olexandr Isayev, associate professor of chemistry at Carnegie Mellon and head of the lab where the model was developed, this breakthrough is due to developments in machine learning.

"Machine learning is emerging as a powerful approach to construct various forms of transferable atomistic potentials utilizing regression algorithms. The overall goal of this project is to develop a machine learning method capable of predicting reaction energetics and rates for chemical processes with high accuracy, but with a very low computational cost," Isayev said. "We have shown that those machine learning models can be trained at high levels of quantum mechanics theory and can successfully predict energies and forces with quantum mechanics accuracy and an increase in speed of as much as 6-7 orders of magnitude. This is a new paradigm in reactive simulations."

Researchers tested ANI-1xnr on different chemical problems, including comparing biofuel additives and tracking methane combustion. They even recreated the Miller experiment, a famous chemical experiment meant to demonstrate how life originated on Earth. Using this experiment, they found that the ANI-1xnr model produced accurate results in condensed-phase systems.

Zhang said that the model could potentially be used for other areas in chemistry with further training.

"We found out it can be potentially used to simulate biochemical processes like enzymatic reactions," Zhang said. "We didn't design it to be used in such a way, but after modification it may be used for that purpose.

In the future, the team plans to refine ANI-1xnr and allow it to work with more elements and in more chemical areas, and they will try to increase the scale of the reactions it can process. This could allow it to be used in multiple fields where designing new chemical reactions could be relevant, such as drug discovery.

- This press release was originally published on the Carnegie Mellon University website

Originally posted here:
Advancing Chemistry with AI: New Model for Simulating Diverse Organic Reactions - Lab Manager Magazine

Read More..