
The protesters aren’t having enough sex, and other beautiful political theories we were gifted this week. – Slate

This is Totally Normal Quote of the Day, a feature highlighting a statement from the news that exemplifies just how extremely normal everything has become.

"Resentful childless harpies unconsciously longing for domination." Jordan Peterson, in a tweet characterizing 18-to-22-year-old college students participating in pro-Palestine protests

While pundits lashing out at the student protesters participating in recent actions started out making allegations of antisemitism among the activists, a new, more personal flavor of scorn is now emerging. Some have accused the students of acting only out of a desire for attention or thrills, not actual conviction. Others have mocked students for being foolish, childish, and self-indulgent, as when a bunch of adults ridiculed the Columbia student occupying Hamilton Hall who asked for the delivery of food and water and described it as "humanitarian aid."

But an even troll-ier line of reasoning emerged this week. According to some on the right, the real problem with college students today is a matter of confusion over gender and sex. These protesters are undersexed and ugly. And feminism is probably to blame.

Studies have shown that Gen Z is having less sex than previous generations had in their early adulthood, a fact that has inspired a certain set of concerned pundits, who in previous decades might have gotten stressed out about teen pregnancy, to fret instead about the effete men and cold women of the younger generation. When the well-known New York University marketing professor Scott Galloway, a person whose politics you might describe as center-left, went on Real Time with Bill Maher on April 26, he pulled from this source of elders' angst in making his critiques.

"I think part of the problem is young people aren't having enough sex, so they go on the hunt for fake threats," he said on the show.

Former CNN host Don Lemon, who appeared on the show with Galloway, didn't disagree. "It would definitely take the edge off," he quipped.

A day later, speaking on CNN, Galloway tried out the argument again: "I think that protesting is kind of the new, if you will, sex," he said. "Young people aren't having as much sex."

"And also for the species to survive, you get a dopa[mine] hit from gathering together and fighting off a perceived enemy," he explained, adding some pseudo-evolutionary logic to the idea. "And I think they're erring on the latter, if you will. I think they're on the hunt for what I call a fake mortal enemy."

The next permutation of this emerging idea came from Megyn Kelly, who on her podcast on Tuesday mulled an important question about the students protesting the mass murder of civilians in Gaza.

"Why are they so unattractive?" she asked. "Why are all the protesters so homely? I don't think they're unconnected, I'm not going to lie. I think attractive, smart people are not drawn to this nonsense. They're living their lives being successful. It's the unattractive and/or dumb people who feel the need to do this to feel like they matter."

Galloway crawled and Kelly walked so that Kelly's former colleague Greg Gutfeld could fly. On The Five on Fox News on Thursday, Gutfeld pointed out that the protests appeared to be populated largely by young women.

"They look miserable, disheveled," he said. "Meanwhile when you see those counterprotesters, those frat boys: healthy, good-looking guys. They've got it together. These women are a mess."

Here, Gutfeld pulled on another string familiar to conservatives (campuses are too full of women, and run by too many other women, and all this feminization is probably causing student mental illness) and came close to articulating the argument in its final form: Liberal women are unappealing and unhappy, and they're taking it out on everyone else.

"There's something going on in our culture where leftism has led women down a path where the only purpose they perceive is outrage," he explained. "They've devoted their aimless energy into causes that serve only to undermine their future, their happiness."

Gutfeld means something specific here: a rejection of traditional gender roles. He makes that clear in his next point: "We've derided motherhood to a point where their only baby is abortion," he said. "That's the thing they want to protect."

So women are unhappy because they are aimless; they are aimless because they are too liberated; they are therefore unleashing violence on college campuses. (A subtext, and sometimes overtly stated point, in these discussions is that feminist women are also unhappy because they are always ugly, and men, therefore, don't date them.)

It's this last point that the right-wing celebrity psychologist and author Jordan Peterson distilled to its ultimate form Wednesday.

Responding to a tweet from the conservative commentator Richard Hanania arguing that masked college girls are "just really into Hamas," Peterson theorized that these unappealing, aimless, liberated women had found a solution to fix all the problems that emerged from their feminist worldview: bowing at the feet of Hamas, finally getting the subjugation they don't even know they want.

"Resentful childless harpies unconsciously longing for domination," he wrote of these teenagers. "Why else worship at the altar of Hamas? Why else would it be so overwhelmingly female?"

Why else, indeed?

The rest is here:
The protesters aren't having enough sex, and other beautiful political theories we were gifted this week. - Slate

Read More..

An Honest Conversation About Hollywood | Adrian Grenier – The Daily Wire

The Jordan B. Peterson PodcastMay 2, 2024

Dr. Jordan B. Peterson sits down with actor and environmental advocate Adrian Grenier. They discuss his early life on the streets of New York City, his eventual rise to stardom, the reality and decline of Hollywood, why he walked away after reaching the top, and how he grounds himself now through natural practices and the cultivation of his fully sustainable environment-first community.

Adrian Grenier is an American actor, musician, and environmental advocate. He is best known for his portrayal of Vincent Chase in the television series Entourage (2004–2011). He has appeared in films such as Drive Me Crazy (1999), The Devil Wears Prada (2006), Trash Fire (2016), and Marauders (2016). In 2021, he acted in the Netflix series Clickbait. In 2010, Grenier and his business partner Peter Glatzer started SHFT, a brand that promotes sustainability through multimedia. The website won two Webby Awards in 2011 for best website in its category. In 2020, he moved to Austin, Texas, settling on the Kintsugi Ranch, which he and his family have been turning into a fully sustainable, renewable community.

- Links -

2024 tour details can be found here https://jordanbpeterson.com/events

Peterson Academy https://petersonacademy.com/

For Adrian Grenier:

On X https://twitter.com/adriangrenier?lang=en

On Instagram https://www.instagram.com/adriangrenier/?hl=en

Earth Speed on YouTube https://www.youtube.com/@EarthSpeedbyAdrianGrenier

Continue reading here:
An Honest Conversation About Hollywood | Adrian Grenier - The Daily Wire

Read More..

Hong Kong big data utilised for building predictive AI and more AI briefs – Healthcare IT News

CU Medicine develops severe hypoglycemia predictive AI

Researchers from the Faculty of Medicine at the Chinese University of Hong Kong (CU Medicine) have utilised anonymised big data from the Hospital Authority Data Collaboration Laboratory to develop a new machine learning model that can predict the risk of severe hypoglycemia among older diabetic adults.

They analysed about 1.5 million records of more than 360,000 senior individuals with diabetes from 2013 to 2018. Based on the XGBoost machine learning algorithm, the risk prediction model uses 258 predictors, including demographics, admissions, and diagnoses, to predict severe hypoglycemia events requiring hospitalisation in the next 12 months.

Besides prolonged hospitalisation, severe hypoglycemia is also associated with an increased risk of falls, cardiovascular disease, dementia, and all-cause mortality, CU Medicine noted.

Achieving an 85% positive predictive value in a study, the model can be potentially integrated into EHR decision support systems for pre-emptive interventions, such as correcting the timing and dosage of insulin injections or changing to diabetes medications with lower hypoglycemic potential.

Indian military looks to develop diagnosis support AI

India's Armed Forces Medical Services has partnered with the Indian Institute of Technology in Kanpur, Uttar Pradesh for the joint research and development of technology solutions addressing the health problems of soldiers deployed in difficult terrains.

Under their memorandum of understanding, IIT Kanpur will also help the Armed Forces Medical College's Armed Forces Centre for Computational Medicine in creating diagnostic AI models.

Alodokter joins Indonesia's digital health mission

Digital health company Alodokter is cooperating with the Indonesian government in expanding access to health services across the country through telemedicine.

It signed a memorandum of understanding with the Ministry of Health to collaborate in such areas as raising healthcare workers' capacity by providing professional credit units; health communications and education; conducting health development surveys; and providing telemedicine services.

Mahidol University to trial Japanese endoscopic AI

Mahidol University in Thailand is collaborating with Japanese startup AI Medical Service (AIM) to assess the applicability of the latter's endoscopic AI in the Thai setting.

This comes as AIM looks to expand its market presence globally after receiving regulatory approvals in Japan and Singapore over the past four months.

Indonesian university to test Korean medical AI for lung disease, stroke diagnosis

The Universitas Gadjah Mada Academic Hospital in Indonesia will also conduct a clinical trial of three diagnosis support AI solutions from the South Korean medical AI company Deepnoid.

Under their memorandum of understanding, the hospital will be testing Deepnoid's diagnosis aid software for multiple lung diseases, lung nodules, and brain aneurysms for 18 months. This comes as the hospital, which saw a two-fold rise in X-ray, MRI, and CT readings over the past three years, is bracing for growing demand for imaging while only having 22 readers to date.

The results of this clinical trial will inform Deepnoid's application for a regulatory licence in Indonesia, the company shared.

Read more:
Hong Kong big data utilised for building predictive AI and more AI briefs - Healthcare IT News

Read More..

Deep learning for high-resolution seismic imaging | Scientific Reports – Nature.com

Review of seismic imaging

The goal of seismic imaging is to infer subsurface structures based on observed seismic data. This can be achieved by solving inverse problems. Reverse Time Migration (RTM) is an imaging technique based on the wave equation [25], which utilizes the cross-correlation of the underground forward and backward wavefields, demonstrating excellent adaptability, especially in areas with complex structures and high velocity variations. The formula for the cross-correlation imaging condition is expressed as:

$$I(x,z)=\int_{0}^{T} u_{\mathrm{f}}(x,z,t)\,u_{\mathrm{b}}(x,z,t)\,dt$$

(1)

Here, $I(x,z)$ represents the RTM result, $u_{\mathrm{f}}(x,z,t)$ denotes the forward wavefield, and $u_{\mathrm{b}}(x,z,t)$ is the backward wavefield.
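To make the discretized imaging condition concrete, here is a minimal NumPy sketch (our illustration, not the paper's code) that evaluates Eq. (1) as a sum over time samples; the wavefield arrays are assumed precomputed with shape (nx, nz, nt).

```python
import numpy as np

def rtm_image(u_f: np.ndarray, u_b: np.ndarray, dt: float) -> np.ndarray:
    """I(x, z) = sum over t of u_f(x, z, t) * u_b(x, z, t) * dt."""
    assert u_f.shape == u_b.shape  # both (nx, nz, nt)
    # Zero-lag cross-correlation: elementwise product summed over time
    return np.sum(u_f * u_b, axis=-1) * dt
```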

However, RTM suffers from low-frequency noise and inaccurate amplitudes, limiting its application in seismic imaging. To address the shortcomings of RTM, Least Squares Reverse Time Migration (LSRTM) associates the migration imaging result with seismic data [26], constructing the least squares objective function:

$$E(\mathbf{m})=\frac{1}{2}\left\|\mathbf{L}\mathbf{m}-\mathbf{d}_{\mathrm{obs}}\right\|^{2}$$

(2)

Here, $\mathbf{d}_{\mathrm{obs}}$ represents the observed data, $\mathbf{L}$ is the forward operator, and $\mathbf{m}$ is the subsurface structural parameter.

LSRTM involves key steps such as forward simulation, backpropagation, gradient computation, and optimization algorithms. Through iterative optimization to minimize the error between observed and simulated data, LSRTM enhances the quality of seismic imaging.

In this study, we introduce a hybrid architecture (Fig. 1) that integrates Transformer and CNN to address seismic imaging tasks. Within the Transformer framework, the need for a one-dimensional sequence as input necessitates an initial transformation of the input image. The Image Patching phase involves partitioning the input image into a series of equally sized image patches, each with a size of $P^{2}$. This transforms the original $H\times W$ image into an $N\times P\times P$ sequence, where $N$ represents the sequence length, encompassing $\frac{H\times W}{P^{2}}$ image patches. Consequently, the input image is reshaped into a one-dimensional sequence, with each image patch corresponding to a vector. The adoption of a smaller patch size enables enhanced capture of intricate details within the image, thus elevating the model's accuracy, albeit at the expense of increased computational overhead [27]. To balance model efficacy against computational efficiency, we set $P=16$. In the Input Embedding stage, a linear transformation is applied to each segmented image patch, mapping it to a continuous vector representation. As the Transformer model abstains from utilizing recurrent or convolutional layers for sequence processing, positional encoding is incorporated into the input embedding vector to discern the positional information of each image patch.

Network architecture diagram.

$$\mathbf{Z}_{0}=[\mathbf{X}_{p}^{1}\mathbf{E};\,\mathbf{X}_{p}^{2}\mathbf{E};\,\dots;\,\mathbf{X}_{p}^{N}\mathbf{E}]+\mathbf{E}_{\mathrm{pos}}$$

(3)
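As an illustration of this patching-and-embedding stage, the following PyTorch sketch (our reconstruction, with assumed image size and embedding width rather than the paper's exact configuration) produces the $\mathbf{Z}_{0}$ of Eq. (3) using the standard strided-convolution patchify trick:

```python
import torch
import torch.nn as nn

class PatchEmbedding(nn.Module):
    def __init__(self, img_size=256, patch_size=16, in_chans=1, dim=768):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2  # N = HW / P^2
        # A strided conv both cuts P x P patches and applies the linear map E
        self.proj = nn.Conv2d(in_chans, dim, kernel_size=patch_size, stride=patch_size)
        # Learned positional embedding E_pos, one vector per patch
        self.pos_embed = nn.Parameter(torch.zeros(1, self.num_patches, dim))

    def forward(self, x):                  # x: (B, C, H, W)
        z = self.proj(x)                   # (B, dim, H/P, W/P)
        z = z.flatten(2).transpose(1, 2)   # (B, N, dim)
        return z + self.pos_embed          # Z_0 of Eq. (3)
```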

The proposed model employs a Transformer Encoder comprising $L=12$ layers to process the image sequence, with each encoder layer composed of Multi-Head Self-Attention (MSA) and Multi-Layer Perceptron (MLP) blocks.

$$\mathbf{Z}_{l}^{\prime}=\mathrm{MSA}(\mathrm{LN}(\mathbf{Z}_{l-1}))+\mathbf{Z}_{l-1},\quad l=1\dots L$$

(4)

$$\mathbf{Z}_{l}=\mathrm{MLP}(\mathrm{LN}(\mathbf{Z}_{l}^{\prime}))+\mathbf{Z}_{l}^{\prime},\quad l=1\dots L$$

(5)

Here, $\mathrm{LN}(\cdot)$ denotes layer normalization, $l$ is the identifier for intermediate blocks, and $L$ is the number of Transformer layers.

These stacked Transformer layers facilitate capturing the complexity of the data from a multiscale perspective. To prevent the loss of primary features by solely relying on the last layer output, we employ a multi-level feature extraction strategy. In addition to the final layer (12th layer), features are extracted from the 6th and 9th layers, representing deep, intermediate, and shallow features, providing a rich and multiscale feature space. These three layers of features are adjusted to different resolutions of feature maps and fused through ASFF, resulting in adaptive aggregation at each scale.

ASFF constitutes an attention-based spatial feature integration strategy devised to amalgamate feature maps originating from diverse spatial resolutions within deep neural networks [28]. Its principal objective is to augment the model's perceptual acuity concerning targets across varying scales. ASFF dynamically weights and fuses features from distinct spatial resolutions by learning task-specific attention weights.

We represent features at resolution level $\ell$ (where $\ell\in\{1,2,3\}$) as $x^{\ell}$. For level $\ell$, we resize features from other levels $n$ ($n\ne\ell$) to the same shape as $x^{\ell}$. Let $x_{ij}^{n\to\ell}$ denote the feature vector at position $(i,j)$ on the feature map, adjusted from level $n$ to level $\ell$. We perform the following fusion of corresponding level $\ell$ features:

$$y_{ij}^{\ell}=\alpha_{ij}^{\ell}\cdot x_{ij}^{1\to\ell}+\beta_{ij}^{\ell}\cdot x_{ij}^{2\to\ell}+\gamma_{ij}^{\ell}\cdot x_{ij}^{3\to\ell}$$

(6)

Here, $y_{ij}^{\ell}$ signifies the vector at position $(i,j)$ in the output feature map $y^{\ell}$ across channels. The spatial importance weights $\alpha_{ij}^{\ell}$, $\beta_{ij}^{\ell}$, and $\gamma_{ij}^{\ell}$ for features from the three different levels to level $\ell$ are adaptively learned by the network. To ensure the effectiveness of the weights, the constraints $\alpha_{ij}^{\ell}+\beta_{ij}^{\ell}+\gamma_{ij}^{\ell}=1$ and $\alpha_{ij}^{\ell},\beta_{ij}^{\ell},\gamma_{ij}^{\ell}\in[0,1]$ are enforced. These constraints ensure the validity and range of the weights. The weights are computed using softmax functions with control parameters as follows:

$$\alpha_{ij}^{\ell}=\frac{e^{\lambda_{\alpha_{ij}}^{\ell}}}{e^{\lambda_{\alpha_{ij}}^{\ell}}+e^{\lambda_{\beta_{ij}}^{\ell}}+e^{\lambda_{\gamma_{ij}}^{\ell}}}$$

(7)

$$\beta_{ij}^{\ell}=\frac{e^{\lambda_{\beta_{ij}}^{\ell}}}{e^{\lambda_{\alpha_{ij}}^{\ell}}+e^{\lambda_{\beta_{ij}}^{\ell}}+e^{\lambda_{\gamma_{ij}}^{\ell}}}$$

(8)

$$\gamma_{ij}^{\ell}=\frac{e^{\lambda_{\gamma_{ij}}^{\ell}}}{e^{\lambda_{\alpha_{ij}}^{\ell}}+e^{\lambda_{\beta_{ij}}^{\ell}}+e^{\lambda_{\gamma_{ij}}^{\ell}}}$$

(9)

The control parameters $\lambda_{\alpha_{ij}}^{\ell}$, $\lambda_{\beta_{ij}}^{\ell}$, and $\lambda_{\gamma_{ij}}^{\ell}$ are computed through $1\times 1$ convolution layers from $x_{ij}^{1\to\ell}$, $x_{ij}^{2\to\ell}$, and $x_{ij}^{3\to\ell}$, respectively. These parameters are learned through standard backpropagation during network training.
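The following PyTorch sketch (our illustration, with assumed channel counts) shows how Eqs. (6)–(9) translate into code: $1\times 1$ convolutions produce the control parameters, a softmax normalizes them per pixel so they sum to 1, and the resized level features are blended.

```python
import torch
import torch.nn as nn

class ASFF(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # One 1x1 conv per source level produces its control parameter lambda
        self.lambda_a = nn.Conv2d(channels, 1, kernel_size=1)
        self.lambda_b = nn.Conv2d(channels, 1, kernel_size=1)
        self.lambda_c = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x1, x2, x3):  # each: (B, C, H, W), already at level l
        lam = torch.cat(
            [self.lambda_a(x1), self.lambda_b(x2), self.lambda_c(x3)], dim=1
        )
        w = torch.softmax(lam, dim=1)  # (B, 3, H, W); weights sum to 1 per pixel
        alpha, beta, gamma = w[:, 0:1], w[:, 1:2], w[:, 2:3]
        return alpha * x1 + beta * x2 + gamma * x3  # Eq. (6)
```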

Overall, this approach furnishes the model with a rich and multiscale feature space, thereby contributing to its performance in complex seismic imaging tasks.

Visit link:
Deep learning for high-resolution seismic imaging | Scientific Reports - Nature.com

Read More..

Science has an AI problem. This group says they can fix it. – University of California San Diego

One of the main takeaways is transparency. The checklist calls on researchers to provide detailed descriptions of each machine learning model, including the code, the data used to train and test the model, the hardware specifications used to produce the results, the experimental design, the project's goals and any limitations of the study's findings. The standards are flexible enough to accommodate a wide range of nuance, including private datasets and complex hardware configurations, according to the authors.

While the increased rigor of these new standards might slow the publication of any given study, the authors believe wide adoption of these standards would increase the overall rate of discovery and innovation, potentially by a significant amount.

"What we ultimately care about is the pace of scientific progress," said sociologist Emily Cantrell, one of the lead authors, who is pursuing her Ph.D. at Princeton. "By making sure the papers that get published are of high quality and that they're a solid base for future papers to build on, that potentially then speeds up the pace of scientific progress. Focusing on scientific progress itself and not just getting papers out the door is really where our emphasis should be."

Kapoor concurred. "The errors hurt. At the collective level, it's just a major time sink," he said. That time costs money. And that money, once wasted, could have catastrophic downstream effects, limiting the kinds of science that attract funding and investment, tanking ventures that are inadvertently built on faulty science, and discouraging countless numbers of young researchers.

In working toward a consensus about what should be included in the guidelines, the authors said they aimed to strike a balance: simple enough to be widely adopted, comprehensive enough to catch as many common mistakes as possible.

They say researchers could adopt the standards to improve their own work; peer reviewers could use the checklist to assess papers; and journals could adopt the standards as a requirement for publication.

"The scientific literature, especially in applied machine learning research, is full of avoidable errors," Narayanan said. "And we want to help people. We want to keep honest people honest."

The paper, "Consensus-based recommendations for machine-learning-based science," published on May 1 in Science Advances, included the following authors: Sayash Kapoor, Princeton University; Emily Cantrell, Princeton University; Kenny Peng, Cornell University; Thanh Hien (Hien) Pham, Princeton University; Christopher A. Bail, Duke University; Odd Erik Gundersen, Norwegian University of Science and Technology; Jake M. Hofman, Microsoft Research; Jessica Hullman, Northwestern University; Michael A. Lones, Heriot-Watt University; Momin M. Malik, Center for Digital Health, Mayo Clinic; Priyanka Nanayakkara, Northwestern University; Russell A. Poldrack, Stanford University; Inioluwa Deborah Raji, University of California-Berkeley; Michael Roberts, University of Cambridge; Matthew J. Salganik, Princeton University; Marta Serra-Garcia, University of California-San Diego; Brandon M. Stewart, Princeton University; Gilles Vandewiele, Ghent University; and Arvind Narayanan, Princeton University.

Adapted from a Princeton University release

Learn more about research and education at UC San Diego in: Artificial Intelligence

Follow this link:
Science has an AI problem. This group says they can fix it. - University of California San Diego

Read More..

Research on a machine learning-based adaptive and efficient screening model for psychological symptoms of … – Nature.com

Data collection

The research group collected 17-dimensional basic trait data (Supplementary Information 1 and Supplementary Information 2) of 25,480 samples of community correction prisoners in Zhejiang Province, China, and the corresponding Symptom Checklist-90 (SCL-90) and Health Survey Short Form (SF-12) data. These data were collected through the standardized community correction digital management platform of the Zhejiang Provincial Department of Justice, covering the period from January 2020 to December 2020. The 17-dimensional characteristics mainly include age, sex, treatment level (general control, strict control), whether adult, education level, domicile (urban or rural), whether there are infectious diseases, whether the individual belongs to the following three categories (unemployed individuals, those without relatives to rely on, individuals without a place to live), whether there is a criminal record, crime type, supervision time, whether there is recidivism, whether there is anti-government tendency, whether there are five kinds of involvement (terrorism, cults, drugs, gangs, and gun trafficking), whether there are four histories (drug use history, escape history, suicide history, police assault history), correction status (in correction, released from correction), and occupation before arrest. The traditional SCL-90 scale yielded 9 psychological measurement indicators: somatization, obsessive-compulsive symptoms, interpersonal sensitivity, depression, anxiety, hostility, terror, paranoia, and psychosis. Because the basic information registered in some judicial offices was incomplete, samples with missing values in the basic information were removed and matched, resulting in a total of 25,214 sample data.

Due to patient privacy and compliance issues, it is difficult to collect large amounts of medical data, especially data for specific groups. The research group invested considerable manpower, material, and financial resources in the construction of this dataset (Supplementary Information 3).

The research design has been approved by the Ethics Research Committee of the Zhejiang Community Correction Management Bureau. This study was carried out in accordance with the Declaration of Helsinki, and all procedures were carried out in accordance with relevant guidelines and regulations. The Committee waived the requirement of informed consent for this study because the researchers only access the database for analysis purposes, and all personnel, including patient data, are desensitized, and there is no conflict of interest among personnel of each unit.

The preprocessing of the tabulated data described in the paper includes missing value imputation, outlier detection and removal, and data standardization, as follows:

Missing values refer to situations where the values of certain features or variables in a table are missing or not recorded. In machine learning modeling, handling missing values is crucial [36]. Choosing appropriate filling methods can improve the predictive performance of the model, making the data more complete and reliable [37]. In this study, there were some missing values in the raw data we used, and most of them were filled in by manually tracing the raw materials. For the small number of remaining missing values in quantitative data such as age, we used mean interpolation, as the mean represents the central tendency of the data and helps maintain its distribution. For qualitative data such as crime types, we used the median, which reduces the impact of extreme values while maintaining the order and level of the data [38].

Outliers refer to data points that differ significantly from other data points or deviate from the normal range. Outliers may adversely affect data analysis and modeling, so they need to be eliminated or handled. To ensure the accuracy and reliability of the data, we carried out outlier detection and elimination using the Rajda (3σ) criterion. Taking the given confidence probability of 99.7% as the standard, any data row whose value deviates from the column mean by more than 3 times the standard deviation is deleted; that is, when the residual error $v_b$ of the measured value $x_b$ is greater than 3 times the standard deviation $\sigma$, the outlier is eliminated:

$$\left| v_b \right| = \left| x_b - \bar{x} \right| > 3\sigma.$$

Data standardization transforms data of different scales and ranges onto a unified standard scale, eliminating the influence of dimensions and making different features comparable. In the data preprocessing stage, we normalized the numerical features from minimum to maximum: by linearly mapping the values of each feature to the range of 0 to 1, we eliminated the differences between feature scales and made them comparable.
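A compact sketch of this preprocessing pipeline might look as follows, assuming a pandas DataFrame; the column groupings are placeholders rather than the study's actual variable names, and qualitative columns are assumed to be integer-encoded.

```python
import pandas as pd

def preprocess(df: pd.DataFrame, numeric_cols, ordinal_cols) -> pd.DataFrame:
    # Mean imputation for quantitative features such as age
    for col in numeric_cols:
        df[col] = df[col].fillna(df[col].mean())
    # Median imputation for integer-encoded qualitative features such as crime type
    for col in ordinal_cols:
        df[col] = df[col].fillna(df[col].median())
    # Rajda (3-sigma) criterion: drop rows where |x - mean| > 3 * std
    for col in numeric_cols:
        mu, sigma = df[col].mean(), df[col].std()
        df = df[(df[col] - mu).abs() <= 3 * sigma]
    # Min-max normalization to [0, 1]
    for col in numeric_cols:
        lo, hi = df[col].min(), df[col].max()
        df[col] = (df[col] - lo) / (hi - lo)
    return df
```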

Based on the Symptom Checklist-90 (SCL-90), this study constructed an adaptive scale (between question groups) simplification screening evaluation model based on a multi-label classification algorithm, and used the Health Survey Short Form (SF-12), a primary screening tool commonly used by community correction management institutions, as a simplified baseline method for comparative analysis.

We used the multi-label classification model for scale (between question groups) simplification to analyze the risk degree of individuals in nine categories of psychological measurement indicators, and simplified the scale structure based on the risk distribution. The goal of scale simplification is to simplify the questions, make the scale more readable and easy to understand, and help readers get core information and insight more quickly. During the process of scale simplification, it is necessary to make trade-offs and decisions according to the data and the needs of the audience to ensure that enough information is retained while maintaining simplicity and clarity.

The basic principle of the multi-label classification algorithm (as shown in Fig. 1 and Table 1) is to recognize the association between features and labels by learning historical data, so as to predict new labels. It can integrate the results of multiple labels, find the associations between them, and resolve the conflicts that may exist in the multi-label classification problem, thereby effectively improving the accuracy of classification. It can also help us quickly identify features, thus reducing classification time.

Binary relevance (first-order; the y labels are independent of each other). This is a problem transformation method whose core idea is to decompose the multi-label classification problem into independent binary problems. BR is simple and easy to understand, and when there is no dependency between the y values, the model performs well.

Classifier chains (high-order; the y labels are interdependent). The principle is similar to the BR transformation method, except that the first classifier is trained only on the input data, and each subsequent classifier is trained on the input space plus the outputs of all previous classifiers in the chain. A number of binary classifiers can thus be combined into a single multi-label model that explores the correlation between multiple targets.

RAkEL (random k-labelsets; high-order, the y labels are interdependent). It divides the original large label set into a certain number of small label sets, then uses RF (random forest) to train the corresponding classifiers, and finally integrates the prediction results. RAkEL is a high-order strategy algorithm that can mine the correlations of multiple labels according to the size of the label subset.

Multi-label classification algorithm.

For the latter two algorithms, if there is a clear dependency between labels, the generalization ability of the final model is better than that of a model constructed with binary relevance. The difficulty lies in finding a suitable label dependency structure.

The core principle of oversampling is to add samples to the categories with fewer samples in order to achieve class balance. SMOTE (Synthetic Minority Over-sampling Technique) is the representative oversampling algorithm. In the modeling process, SMOTE is used to solve the problem of class imbalance: it increases the number of minority samples by synthesizing new ones, thereby balancing the unbalanced dataset.

Because the total number of samples collected is sufficient, the training data adopts 5-fold cross-validation to prevent the model from overfitting and to increase its robustness. The extracted feature data is randomly divided into five parts, four of which are used for training, with one part retained as test data. The above process is repeated five times, using different test data each time. The results of these five runs are then summarized, and the average value is taken as the estimate of the algorithm's performance. Five-fold cross-validation is a popular choice at present.
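For illustration, a scikit-learn setup for binary relevance and classifier chains with 5-fold cross-validation might look like the following sketch. The feature matrix X and label matrix Y are random placeholders standing in for the 17-dimensional traits and 9 binary SCL-90 risk labels; RAkEL would require a dedicated package such as scikit-multilearn, and SMOTE (from imbalanced-learn) would be applied inside each training fold, so both are omitted here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.multioutput import ClassifierChain, MultiOutputClassifier

# Placeholder data: 17 features, 9 binary labels per sample
rng = np.random.default_rng(0)
X = rng.random((500, 17))
Y = rng.integers(0, 2, size=(500, 9))

base = RandomForestClassifier(n_estimators=200, random_state=0)

# Binary relevance: one independent classifier per label
binary_relevance = MultiOutputClassifier(base)
# Classifier chain: each classifier also sees the previous labels' predictions
chain = ClassifierChain(base, random_state=0)

for name, model in [("binary relevance", binary_relevance),
                    ("classifier chain", chain)]:
    scores = cross_val_score(model, X, Y, cv=5)  # 5-fold cross-validation
    print(name, scores.mean())
```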

In this paper, SF-12 was used as a comparison tool. SF-12 is a commonly used health questionnaire for assessing an individual's health status and quality of life. It is a simplified version of the SF-36 questionnaire that retains the core concepts and dimensions of SF-36 while reducing the number of questions and improving the efficiency of administration. The simplicity and efficiency of the SF-12 questionnaire make it a common tool in large-scale epidemiological research and clinical practice. It can be used to evaluate the health status of different groups and the effect of health interventions, and to compare health differences between groups.

If all SCL-90 subscales of the actual sample are diagnosed as risk-free, the sample is defined as a negative sample. If any subscale test indicates risk, the sample is defined as a positive sample. Similarly, if all the sub-labels predicted by the multi-label model are 0, the sample is negative; if any sub-label is positive, the sample is positive:

If all of the actual 9 labels are negative, the mental state is healthy and the sample is marked as negative.

If any of the actual 9 labels is positive, the mental state is unhealthy and the sample is marked as positive.

Similarly, if all of the predicted 9 labels are negative, the mental state is healthy and the sample is marked as negative.

If any of the predicted 9 labels is positive, the mental state is unhealthy and the sample is marked as positive.

According to the actual mental state and the predicted value, the confusion matrix (as shown in Table 2) is drawn, which is composed of the following four important definitions: true positive (TP), false positive (FP), false negative (FN) and true negative (TN).

The overall effect of the model is evaluated by the following indicators: accuracy, sensitivity, precision, and F1. The relevant measurement standards are as follows:

$$\mathrm{Accuracy} = (\mathrm{TP} + \mathrm{TN})/(\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}),$$

$$\mathrm{Sensitivity} = \mathrm{TP}/(\mathrm{TP} + \mathrm{FN}),$$

$$\mathrm{Precision} = \mathrm{TP}/(\mathrm{TP} + \mathrm{FP}),$$

$$\mathrm{F1} = 2 \times \mathrm{Sensitivity} \times \mathrm{Precision}/(\mathrm{Precision} + \mathrm{Sensitivity}).$$

In the multi-label classification problem, evaluation indicators such as accuracy_score, Hamming loss, and 0-1 loss can be based on the prediction results of a single label or on the overall prediction results.

Accuracy_score is the fraction (default) or count of correct predictions. In multi-label classification, the function returns the subset accuracy: if the entire set of predicted labels for a sample matches the true label combination, the subset accuracy is 1; otherwise, it is 0.

Hamming loss: Hamming loss measures the prediction accuracy of the model for each label, that is, the ratio of the number of incorrectly predicted labels to the total number of labels. It is computed per label and returns a value between 0 and 1; the smaller the value, the more accurate the prediction.

0-1 loss is a common classification loss function used to measure the prediction error of a classification model. It takes the value 1 when the prediction is wrong and 0 when the prediction is correct, hence the name 0-1 loss.
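As a sketch, these indicators can be computed as follows with scikit-learn; y_true and y_pred are placeholder (n_samples, 9) binary arrays of actual and predicted SCL-90 risk labels.

```python
import numpy as np
from sklearn.metrics import accuracy_score, hamming_loss, zero_one_loss

# Placeholder label arrays; replace with real model output
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(100, 9))
y_pred = rng.integers(0, 2, size=(100, 9))

# Collapse the 9 sub-labels to one positive/negative flag per sample
pos_true = y_true.any(axis=1).astype(int)
pos_pred = y_pred.any(axis=1).astype(int)

tp = int(((pos_true == 1) & (pos_pred == 1)).sum())
tn = int(((pos_true == 0) & (pos_pred == 0)).sum())
fp = int(((pos_true == 0) & (pos_pred == 1)).sum())
fn = int(((pos_true == 1) & (pos_pred == 0)).sum())

accuracy = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)
precision = tp / (tp + fp)
f1 = 2 * sensitivity * precision / (precision + sensitivity)

# Multi-label indicators computed over all 9 labels at once
print(accuracy_score(y_true, y_pred))  # subset accuracy (exact match)
print(hamming_loss(y_true, y_pred))    # fraction of wrongly predicted labels
print(zero_one_loss(y_true, y_pred))   # 1 - subset accuracy
```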

Simplification rate refers to the proportion of the simplified scale relative to the original scale, and can be used to evaluate the degree of simplification of the scale. Scale simplification means simplifying the structure of the original scale by reducing the number of items, deleting redundant or unnecessary items, or merging multiple items. The simplification rate of the scale is calculated as: simplification rate = (number of simplified items / original number of items) × 100%. In other words, the simplification rate based on the multi-label model is calculated as: simplification rate = (number of sub-labels predicted to be negative) / (total number of samples).


Continued here:
Research on a machine learning-based adaptive and efficient screening model for psychological symptoms of ... - Nature.com

Read More..

Revolutionize Customer Satisfaction with tailored reward models for your business on Amazon SageMaker | Amazon … – AWS Blog

As more powerful large language models (LLMs) are used to perform a variety of tasks with greater accuracy, the number of applications and services that are being built with generative artificial intelligence (AI) is also growing. With great power comes responsibility, and organizations want to make sure that these LLMs produce responses that align with their organizational values and provide the same unique experience they always intended for their end-customers.

Evaluating AI-generated responses presents challenges. This post discusses techniques to align them with company values and build a custom reward model using Amazon SageMaker. By doing so, you can provide customized customer experiences that uniquely reflect your organization's brand identity and ethos.

Out-of-the-box LLMs provide high accuracy, but often lack customization for an organization's specific needs and end-users. Human feedback varies in subjectivity across organizations and customer segments. Collecting diverse, subjective human feedback to refine LLMs is time-consuming and unscalable.

This post showcases a reward modeling technique to efficiently customize LLMs for an organization by programmatically defining reward functions that capture preferences for model behavior. We demonstrate an approach to deliver LLM results tailored to an organization without intensive, continual human judgment. The techniques aim to overcome customization and scalability challenges by encoding an organization's subjective quality standards into a reward model that guides the LLM to generate preferable outputs.

Not all human feedback is the same. We can categorize human feedback into two types: objective and subjective.

Any human being who is asked to judge the color of the following boxes would confirm that the left one is a white box and the right one is a black box. This is objective, and there are no changes to it whatsoever.

Determining whether an AI model's output is great is inherently subjective. Consider the following color spectrum. If asked to describe the colors on the ends, people would provide varied, subjective responses based on their perceptions. One person's white may be another's gray.

This subjectivity poses a challenge for improving AI through human feedback. Unlike objective right/wrong feedback, subjective preferences are nuanced and personalized. The same output could elicit praise from one person and criticism from another. The key is acknowledging and accounting for the fundamental subjectivity of human preferences in AI training. Rather than seeking elusive objective truths, we must provide models exposure to the colorful diversity of human subjective judgment.

Unlike traditional model tasks such as classification, which can be neatly benchmarked on test datasets, assessing the quality of a sprawling conversational agent is highly subjective. One human's riveting prose is another's aimless drivel. So how should we refine these expansive language models when humans intrinsically disagree on the hallmarks of a good response?

The key is gathering feedback from a diverse crowd. With enough subjective viewpoints, patterns emerge on engaging discourse, logical coherence, and harmless content. Models can then be tuned based on broader human preferences. There is a general perception that reward models are often associated only with Reinforcement Learning from Human Feedback (RLHF). Reward modeling, in fact, goes beyond RLHF, and can be a powerful tool for aligning AI-generated responses with an organizations specific values and brand identity.

You can choose an LLM and have it generate numerous responses to diverse prompts, and then your human labelers will rank those responses. It's important to have diversity in human labelers. Clear labeling guidelines are critical: without explicit criteria, judgments can become arbitrary. Useful dimensions include coherence, relevance, creativity, factual correctness, logical consistency, and more. Human labelers put these responses into categories and label them favorite to least favorite, as shown in the following example. This example showcases how different humans perceive these possible responses from the LLM in terms of their most favorite (labeled as 1 in this case) and least favorite (labeled as 3 in this case). Each column is labeled 1, 2, or 3 by each human to signify their most preferred and least preferred response from the LLM.

By compiling these subjective ratings, patterns emerge on what resonates across readers. The aggregated human feedback essentially trains a separate reward model on writing qualities that appeal to people. This technique of distilling crowd perspectives into an AI reward function is called reward modeling. It provides a method to improve LLM output quality based on diverse subjective viewpoints.

In this post, we detail how to train a reward model based on organization-specific human labeling feedback collected for various prompts tested on the base FM. The following diagram illustrates the solution architecture.

For more details, see the accompanying notebook.

To successfully train a reward model, you need the following:

Complete the following steps to launch SageMaker Studio:

Let's see how to create a reward model locally in a SageMaker Studio notebook environment by using a pre-existing model from the Hugging Face model hub.

When doing reward modeling, getting feedback data from humans can be expensive. This is because reward modeling needs feedback from other human workers instead of only using data collected during regular system use. How well your reward model behaves depends on the quality and amount of feedback from humans.

We recommend using AWS-managed offerings such as Amazon SageMaker Ground Truth. It offers the most comprehensive set of human-in-the-loop capabilities, allowing you to harness the power of human feedback across the machine learning (ML) lifecycle to improve the accuracy and relevancy of models. You can complete a variety of human-in-the-loop tasks with SageMaker Ground Truth, from data generation and annotation to model review, customization, and evaluation, either through a self-service or AWS-managed offering.

For this post, we use the IMDB dataset to train a reward model that provides a higher score for text that humans have labeled as positive, and a lower score for negative text.

We prepare the dataset with the following code:
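(The post's original snippet is not reproduced in this excerpt; the following is a minimal sketch of what the preparation might look like, using the Hugging Face datasets and transformers libraries and pairing positive IMDB reviews as "chosen" with negative reviews as "rejected." The record field names are our assumptions.)

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")

# Positive reviews serve as "chosen" text, negative reviews as "rejected"
imdb = load_dataset("imdb", split="train")
positives = [r["text"] for r in imdb if r["label"] == 1]
negatives = [r["text"] for r in imdb if r["label"] == 0]

def build_record(chosen, rejected, max_len=512):
    chosen_enc = tokenizer(chosen, truncation=True, max_length=max_len)
    rejected_enc = tokenizer(rejected, truncation=True, max_length=max_len)
    return {
        "input_ids_chosen": chosen_enc["input_ids"],
        "attention_mask_chosen": chosen_enc["attention_mask"],
        "input_ids_rejected": rejected_enc["input_ids"],
        "attention_mask_rejected": rejected_enc["attention_mask"],
    }

pairs = [build_record(c, r) for c, r in zip(positives[:1000], negatives[:1000])]
```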

The following example shows a sample record from the prepared dataset, which includes references to rejected and chosen responses. We have also embedded the input ID and attention mask for the chosen and rejected responses.

In this case, we use the OPT-1.3b (Open Pre-trained Transformer Language Model) model in Amazon SageMaker JumpStart from Hugging Face. If you want to do all of the training locally on your notebook instead of distributed training, you need to use an instance with enough accelerator memory. We run the following training on a notebook running on ml.g4dn.xlarge instance type:

In the following code snippet, we create a custom trainer that calculates how well a model is performing on a task:

It compares the model's results for two sets of input data: one set that was chosen and another set that was rejected. The trainer then uses these results to figure out how good the model is at distinguishing between the chosen and rejected data. This helps the trainer adjust the model to improve its performance on the task. The CustomTrainer class is used to create a specialized trainer that calculates the loss function for a specific task involving chosen and rejected input sequences. This custom trainer extends the functionality of the standard Trainer class provided by the transformers library, allowing for a tailored approach to handling model outputs and loss computation based on the specific requirements of the task. See the following code:
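(Again, the post's exact snippet is not shown in this excerpt; the following is a hedged sketch of such a custom trainer, implementing the standard pairwise ranking loss -log σ(score_chosen - score_rejected). The field names follow the dataset sketch above and are assumptions.)

```python
import torch
from transformers import AutoModelForSequenceClassification, Trainer

# OPT-1.3b with a single-logit head that acts as the scalar reward score
reward_model = AutoModelForSequenceClassification.from_pretrained(
    "facebook/opt-1.3b", num_labels=1
)

class CustomTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False):
        # Score the preferred (chosen) sequences
        rewards_chosen = model(
            input_ids=inputs["input_ids_chosen"],
            attention_mask=inputs["attention_mask_chosen"],
        ).logits
        # Score the dispreferred (rejected) sequences
        rewards_rejected = model(
            input_ids=inputs["input_ids_rejected"],
            attention_mask=inputs["attention_mask_rejected"],
        ).logits
        # Pairwise ranking loss: push chosen scores above rejected scores
        loss = -torch.nn.functional.logsigmoid(
            rewards_chosen - rewards_rejected
        ).mean()
        if return_outputs:
            return loss, {"rewards_chosen": rewards_chosen,
                          "rewards_rejected": rewards_rejected}
        return loss
```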

The TrainingArguments in the provided code snippet are used to configure various aspects of the training process for an ML model. Let's break down the purpose of each parameter, and how they can influence the training outcome:
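(As the original snippet is not shown in this excerpt, here is a hypothetical configuration of the kind described; the actual parameter values in the notebook may differ.)

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./reward_model",    # where checkpoints and logs are written
    num_train_epochs=1,             # passes over the preference pairs
    per_device_train_batch_size=4,  # limited by accelerator memory
    gradient_accumulation_steps=4,  # effective batch size = 4 x 4
    learning_rate=1e-5,             # small LR for fine-tuning a pretrained LM
    fp16=True,                      # mixed precision to fit ml.g4dn.xlarge
    logging_steps=50,
    save_strategy="epoch",
)
```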

By configuring these parameters in the TrainingArguments, you can influence various aspects of the training process, such as model performance, convergence speed, memory usage, and overall training outcome based on your specific requirements and constraints.

When you run this code, it trains the reward model based on the numerical representation of subjective feedback you gathered from the human labelers. A trained reward model will give a higher score to LLM responses that humans are more likely to prefer.

You can now feed the response from your LLM to this reward model, and the numerical score produced as output tells you how well the response from the LLM aligns with the subjective organizational preferences that were embedded in the reward model. The following diagram illustrates this process. You can use this number as the threshold for deciding whether or not the response from the LLM can be shared with the end-user.

For example, let's say we created a reward model to avoid toxic, harmful, or inappropriate content. If a chatbot powered by an LLM produces a response, the reward model can then score the chatbot's responses. Responses with scores above a pre-determined threshold are deemed acceptable to share with users. Scores below the threshold mean the content should be blocked. This lets us automatically filter chatbot content that doesn't meet the standards we want to enforce. To explore more, see the accompanying notebook.
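A minimal sketch of such a gate, assuming the reward_model and tokenizer trained above and an illustrative threshold value of 0.0:

```python
import torch

reward_model.eval()

def is_acceptable(response: str, threshold: float = 0.0) -> bool:
    """Gate an LLM response on its reward score (threshold is illustrative)."""
    enc = tokenizer(response, return_tensors="pt", truncation=True)
    with torch.no_grad():
        score = reward_model(**enc).logits.item()
    return score >= threshold

# Responses scoring below the threshold would be blocked
print(is_acceptable("Thanks for reaching out! Here is how to reset your password..."))
```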

To avoid incurring future charges, delete all the resources that you created. Delete the deployed SageMaker models, if any, and stop the SageMaker Studio notebook you launched for this exercise.

In this post, we showed how to train a reward model that predicts a human preference score from an LLM's response. This is done by generating several outputs for each prompt with the LLM, then asking human annotators to rank or score the responses to each prompt. The reward model is then trained to predict the human preference score from the LLM's response. After the reward model is trained, you can use it to evaluate the LLM's responses against your subjective organizational standards.

As an organization evolves, the reward functions must evolve alongside changing organizational values and user expectations. What defines a great AI output is subjective and ever-shifting. Organizations need flexible ML pipelines that continually retrain reward models with updated rewards reflecting the latest priorities and needs. This space is continuously evolving: direct preference-based policy optimization, tool-augmented reward modeling, and example-based control are other popular alternative techniques to align AI systems with human values and goals.

We invite you to take the next step in customizing your AI solutions by engaging with the diverse and subjective perspectives of human feedback. Embrace the power of reward modeling to ensure your AI systems resonate with your brand identity and deliver the exceptional experiences your customers deserve. Start refining your AI models today with Amazon SageMaker and join the vanguard of businesses setting new standards in personalized customer interactions. If you have any questions or feedback, please leave them in the comments section.

Dinesh Kumar Subramani is a Senior Solutions Architect based in Edinburgh, Scotland. He specializes in artificial intelligence and machine learning, and is a member of the technical field community within Amazon. Dinesh works closely with UK Central Government customers to solve their problems using AWS services. Outside of work, Dinesh enjoys spending quality time with his family, playing chess, and exploring a diverse range of music.

Read more from the original source:
Revolutionize Customer Satisfaction with tailored reward models for your business on Amazon SageMaker | Amazon ... - AWS Blog

Read More..

Conversational AI vs Generative AI: Which is Best for CX? – CX Today

Conversational AI vs. Generative AI: Which solution will turbocharge your contact center's performance and help you achieve your CX goals? Worldwide, the evolution of artificial intelligence has unlocked new waves of productivity for business leaders and teams.

While the impact of advanced AI algorithms can be felt everywhere, it's particularly prominent in the contact center. In the last year alone, we've lost count of the number of contact center, CRM, and CX software vendors introducing new AI capabilities for customer service teams.

Though ChatGPT, Microsoft Copilot, and even solutions like NICE's Enlighten AI suite are driving focus to the rise of generative AI, it's not the only intelligent tech making waves. Conversational AI is also emerging as a critical part of contact center success.

The question is, which of these two solutions do you need, and do you need to choose between one or the other? Here's your guide to conversational AI and generative AI in the contact center.

Conversational AI is a type of artificial intelligence that allows computer programs (bots) to simulate human conversations. It combines various AI techniques to ensure people can interact with computer systems just like talking to another human being.

Examples of conversational AI are everywhere. Smart assistants like Alexa and Siri use conversational AI to interact with users. Many of the chatbots installed on company websites leverage the same technology.

So, how does it all work?

While the nature of each conversational AI solution can vary depending on your chosen vendor, most tools feature the same central components:

After processing input, conversational AI tools can generate responses based on their data. Some more advanced solutions can even enhance their responses by using additional forms of analysis, such as sentiment analysis.

Conversational AI has become the backbone of many advances in the customer experience and contact center landscapes. It forms part of the tech behind conversational intelligence tools, such as those offered by CallMiner, Calabrio, and Talkdesk.

It's also a common component in the chatbots and virtual assistants customers interact with through text and speech, for self-service interactions.

The most common examples of conversational AI in customer service include:

Older chatbots were primarily rule-based solutions that used scripts to answer customer questions. Advanced chatbots, powered by conversational AI, use natural language processing to recognize speech, imitate human interaction, and respond to more complex inputs.

They can also operate across multiple channels, accompanying your contact center IVR system, chat apps, social media service strategies, and more. Plus, they can learn from interactions over time, becoming more effective and advanced.

Modern IVR systems also leverage conversational AI. Instead of giving customers a list of limited options to choose from, they can listen to what customers say, recognize their intent, and route them to the best agent or department.

With NLP, IVR systems can provide more accurate responses and even draw insights from company databases and CRMs to personalize interactions. They can also be configured to route conversations based on various factors, such as customer sentiment or agent skill level.

As mentioned above, conversational AI tools are a common component of conversational intelligence. Because they can process language and analyze interactions, they can offer companies insight into customer sentiment, track customer service trends, and highlight growth opportunities.

Some solutions can also automatically transcribe and translate calls, which can be ideal for enhancing compliance, as well as training initiatives.

When analyzing conversational AI vs. generative AI, it's worth noting that both solutions have strengths and limitations. Conversational AI, for instance, can empower teams to deliver fantastic service across multiple channels 24/7. It can also help personalize interactions.

By analyzing previous discussions and real-time sentiment or intent, conversational AI can help ensure every customer gets a bespoke experience with your contact center.

Beyond that, conversational AI can:

However, conversational AI also has limitations. Although conversational AI tools are more advanced than traditional chatbots, they can still struggle with complex linguistic nuances and requests. They don't always understand customer accents or things like humor or sarcasm.

Plus, since they're reliant on collecting and processing customer data, there's always a risk to the privacy and security of your contact center. Business leaders need to ensure they have the right security strategies in place to protect sensitive data.

Generative AI is a form of artificial intelligence that can generate new, original content, such as text and images, based on basic prompts. It uses deep learning and neural networks to produce highly creative answers to queries and requests.

Like conversational AI, generative AI is becoming a more common component of the contact center. CCaaS vendors offer companies access to generative AI-powered bots that can provide real-time coaching and assistance to agents or enhance the customer service experience.

Most of these solutions build on the foundations of conversational AI, enhancing bot performance with access to large language models (LLMs).

Alongside leveraging NLP technologies, most generative AI solutions rely on:

Since generative AI tools share many of the same features as conversational AI solutions, they can also address many of the same use cases. We're already seeing an increase in companies using generative AI to create intuitive chatbots and virtual assistants.

However, there are also additional opportunities for generative AI in the contact center, such as:

Generative AI excels at producing original content. It can help contact centers create knowledge bases, drawing on existing data in their ecosystem to design comprehensive guides. Generative AI bots can then surface this information to contact center agents in real-time and offer recommendations to guide them through a conversation.

They can even help organizations create more comprehensive training resources and onboarding tools for new contact center agents, boosting team performance.

Like conversational AI, generative AI tools can have a huge impact on customer service. They can understand the input shared by customers in real time and use their knowledge and data to help agents deliver more personalized, intuitive experiences.

Generative AI solutions can automatically create responses to questions on behalf of an agent and recognize keywords spoken in a conversation to surface relevant information. It can even draw insights from multiple different environments to help answer more complex queries.

One major use case for generative AI in the contact center is the ability to automate repetitive tasks, improving workplace efficiency. Generative AI bots can transcribe and translate conversations like their conversational alternatives and even summarize discussions.

They can pinpoint key action items and discussion trends, automatically classify and triage customer service tickets, and improve the routing process.

Like conversational AI, generative AI has both its pros and cons to consider. It can significantly enhance team productivity and creativity and guide agents through the process of delivering exceptional customer service. It can also help improve team efficiency by automating repetitive tasks like call summarization.

Plus, generative AI solutions can:

However, there are risks to generative AI, too. Like most forms of AI, generative AI relies on access to large volumes of data, which needs to be protected for compliance purposes. It can cause issues with data governance, particularly when teams have limited transparency into how an LLM works.

Plus, since generative AI creates unique original content, it's subject to AI hallucinations, which means not all of the answers it gives will be correct.

Conversational AI and generative AI have a lot of overlapping capabilities and features. They both make it easier for human beings to interact intuitively with machines, and they can both understand natural input. However, there are some major differences:

So, conversational AI vs generative AI: which do you actually need?

Though conversational AI and generative AI have different strengths, they can both work in tandem to improve customer experience. Tools like Microsoft Copilot for Sales are considered generative AI models, but they actually use conversational AI, too.

There are various ways contact centers can connect generative AI and conversational AI. For instance, conversational AI bots can generate better answers to customer questions by calling on the insights of back-end generative models.

Smart conversational assistants can analyze inbound ticket information and assign issues to specialized generative models to help with customer service. Conversational bots can even draw insights from FAQs and knowledge bases created by generative AI during discussions.
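As a rough sketch of this hand-off pattern (every name here is hypothetical rather than taken from any specific vendor), a conversational bot might answer routine intents from scripted flows and escalate everything else to a back-end generative model:

    # Hypothetical hand-off between a conversational front end and a generative back end.
    SCRIPTED_ANSWERS = {
        "opening_hours": "We're open 9am-5pm, Monday to Friday.",
        "reset_password": "Use the 'Forgot password' link on the sign-in page.",
    }

    def classify_intent(message: str) -> str:
        """Toy keyword-based intent classifier; a real bot would use an NLU model."""
        text = message.lower()
        if "hours" in text or "open" in text:
            return "opening_hours"
        if "password" in text:
            return "reset_password"
        return "unknown"

    def answer(message: str, llm_client) -> str:
        intent = classify_intent(message)
        if intent in SCRIPTED_ANSWERS:
            # Routine, high-volume question: the conversational flow handles it.
            return SCRIPTED_ANSWERS[intent]
        # Complex or unrecognized query: escalate to the generative model.
        # llm_client.generate() is a stand-in for whatever LLM API you use.
        return llm_client.generate(f"Answer this customer question: {message}")

The design point is the routing, not the components: cheap, predictable scripted flows absorb the bulk of traffic, and the generative model is only invoked where its flexibility earns its cost.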

Ultimately, weaving conversational and generative AI together amplifies the strengths of both solutions. While conversational AI bots can handle high-volume routine interactions in contact centers, solutions powered with generative algorithms can address more complex queries and offer additional support to agents.

Chances are that as both of these technologies continue to mature, we'll see CCaaS and contact center leaders introduce more tools that let users design systems drawing on the best of both models, such as Five9's generative AI studio.

Link:
Conversational AI vs Generative AI: Which is Best for CX? - CX Today

Read More..

AWS Inferentia and AWS Trainium deliver lowest cost to deploy Llama 3 models in Amazon SageMaker JumpStart … – AWS Blog

Today, we're excited to announce the availability of Meta Llama 3 inference on AWS Trainium and AWS Inferentia based instances in Amazon SageMaker JumpStart. The Meta Llama 3 models are a collection of pre-trained and fine-tuned generative text models. Amazon Elastic Compute Cloud (Amazon EC2) Trn1 and Inf2 instances, powered by AWS Trainium and AWS Inferentia2, provide the most cost-effective way to deploy Llama 3 models on AWS. They offer up to 50% lower cost to deploy than comparable Amazon EC2 instances. They not only reduce the time and expense involved in training and deploying large language models (LLMs), but also provide developers with easier access to high-performance accelerators to meet the scalability and efficiency needs of real-time applications, such as chatbots and AI assistants.

In this post, we demonstrate how easy it is to deploy Llama 3 on AWS Trainium and AWS Inferentia based instances in SageMaker JumpStart.

SageMaker JumpStart provides access to publicly available and proprietary foundation models (FMs). Foundation models are onboarded and maintained from third-party and proprietary providers. As such, they are released under different licenses as designated by the model source. Be sure to review the license for any FM that you use. You are responsible for reviewing and complying with applicable license terms and making sure they are acceptable for your use case before downloading or using the content.

You can access the Meta Llama 3 FMs through SageMaker JumpStart on the Amazon SageMaker Studio console and the SageMaker Python SDK. In this section, we go over how to discover the models in SageMaker Studio.

SageMaker Studio is an integrated development environment (IDE) that provides a single web-based visual interface where you can access purpose-built tools to perform all machine learning (ML) development steps, from preparing data to building, training, and deploying your ML models. For more details on how to get started and set up SageMaker Studio, refer to Get Started with SageMaker Studio.

On the SageMaker Studio console, you can access SageMaker JumpStart by choosing JumpStart in the navigation pane. If you're using SageMaker Studio Classic, refer to Open and use JumpStart in Studio Classic to navigate to the SageMaker JumpStart models.

From the SageMaker JumpStart landing page, you can search for Meta in the search box.

Choose the Meta model card to list all the models from Meta on SageMaker JumpStart.

You can also find relevant model variants by searching for "neuron". If you don't see Meta Llama 3 models, update your SageMaker Studio version by shutting down and restarting SageMaker Studio.

You can choose the model card to view details about the model, such as the license, data used to train, and how to use it. You can also find two buttons, Deploy and Preview notebooks, which help you deploy the model.

When you choose Deploy, the page shown in the following screenshot appears. The top section of the page shows the end-user license agreement (EULA) and acceptable use policy for you to acknowledge.

After you acknowledge the policies, provide your endpoint settings and choose Deploy to deploy the endpoint of the model.

Alternatively, you can deploy through the example notebook by choosing Open Notebook. The example notebook provides end-to-end guidance on how to deploy the model for inference and clean up resources.

In SageMaker JumpStart, we have pre-compiled the Meta Llama 3 model for a variety of configurations to avoid runtime compilation during deployment and fine-tuning. The Neuron Compiler FAQ has more details about the compilation process.

There are two ways to deploy Meta Llama 3 on AWS Inferentia and Trainium based instances using the SageMaker JumpStart SDK. You can deploy the model with two lines of code for simplicity, or take more control of the deployment configurations. The following code snippet shows the simpler mode of deployment:
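As a minimal sketch of that two-line path (the model ID below is an assumption about JumpStart's naming for the Neuron variant; search for "neuron" in JumpStart to confirm the exact ID):

    from sagemaker.jumpstart.model import JumpStartModel

    # Model ID is an assumed Neuron variant identifier; verify it in SageMaker JumpStart.
    model = JumpStartModel(model_id="meta-textgenerationneuron-llama-3-8b")
    predictor = model.deploy(accept_eula=True)  # confirms you accept Meta's EULA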

To perform inference on these models, you need to specify the argument accept_eula as True as part of the model.deploy() call. This means you have read and accepted the EULA of the model. The EULA can be found in the model card description or from https://ai.meta.com/resources/models-and-libraries/llama-downloads/.

The default instance type for Meta Llama-3-8B is ml.inf2.24xlarge. The other supported model IDs for deployment include variants such as Meta-Llama-3-8B-Instruct, Meta-Llama-3-70B, and Meta-Llama-3-70B-Instruct.

SageMaker JumpStart has pre-selected configurations that can help get you started, which are listed in the following table. For more information about optimizing these configurations further, refer to advanced deployment configurations.

[Table of pre-selected configurations omitted; it covers serving options such as OPTION_N_POSITIONS.]

The following code shows how you can customize deployment configurations such as sequence length, tensor parallel degree, and maximum rolling batch size:
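As an illustrative sketch (the model ID, option values, and instance type below are assumptions rather than values prescribed by this post), the customization might look like this:

    from sagemaker.jumpstart.model import JumpStartModel

    # All values below are illustrative; tune them for your model size and workload.
    model = JumpStartModel(
        model_id="meta-textgenerationneuron-llama-3-8b",  # assumed Neuron variant ID
        env={
            "OPTION_N_POSITIONS": "4096",           # maximum sequence length
            "OPTION_TENSOR_PARALLEL_DEGREE": "12",  # NeuronCores sharding each model copy
            "OPTION_MAX_ROLLING_BATCH_SIZE": "4",   # maximum rolling batch size
        },
        instance_type="ml.inf2.24xlarge",
    )
    predictor = model.deploy(accept_eula=True)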

Now that you have deployed the Meta Llama 3 neuron model, you can run inference from it by invoking the endpoint:
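For example, with the predictor created above (the prompt text and generation parameters are illustrative):

    payload = {
        "inputs": "I believe the meaning of life is",
        "parameters": {"max_new_tokens": 64, "top_p": 0.9, "temperature": 0.6},
    }
    response = predictor.predict(payload)  # invokes the SageMaker endpoint
    print(response)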

For more information on the parameters in the payload, refer to Detailed parameters.

Refer to Fine-tune and deploy Llama 2 models cost-effectively in Amazon SageMaker JumpStart with AWS Inferentia and AWS Trainium for details on how to pass the parameters to control text generation.

After you have finished with the deployed model and don't want to use the existing resources anymore, you can delete them using the following code:
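A typical cleanup with the SageMaker Python SDK, using the predictor created earlier, looks like this:

    # Delete the model and endpoint to stop incurring charges.
    predictor.delete_model()
    predictor.delete_endpoint()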

The deployment of Meta Llama 3 models on AWS Inferentia and AWS Trainium using SageMaker JumpStart demonstrates the lowest cost for deploying large-scale generative AI models like Llama 3 on AWS. These models, including variants like Meta-Llama-3-8B, Meta-Llama-3-8B-Instruct, Meta-Llama-3-70B, and Meta-Llama-3-70B-Instruct, use AWS Neuron for inference on AWS Trainium and Inferentia. AWS Trainium and Inferentia offer up to 50% lower cost to deploy than comparable EC2 instances.

In this post, we demonstrated how to deploy Meta Llama 3 models on AWS Trainium and AWS Inferentia using SageMaker JumpStart. The ability to deploy these models through the SageMaker JumpStart console and Python SDK offers flexibility and ease of use. We are excited to see how you use these models to build interesting generative AI applications.

To start using SageMaker JumpStart, refer to Getting started with Amazon SageMaker JumpStart. For more examples of deploying models on AWS Trainium and AWS Inferentia, see the GitHub repo. For more information on deploying Meta Llama 3 models on GPU-based instances, see Meta Llama 3 models are now available in Amazon SageMaker JumpStart.

Xin Huang is a Senior Applied Scientist. Rachna Chadha is a Principal Solutions Architect, AI/ML. Qing Lan is a Senior SDE, ML System. Pinak Panigrahi is a Senior Solutions Architect, Annapurna ML. Christopher Whitten is a Software Development Engineer. Kamran Khan is Head of BD/GTM, Annapurna ML. Ashish Khetan is a Senior Applied Scientist. Pradeep Cruz is a Senior SDM.

See more here:
AWS Inferentia and AWS Trainium deliver lowest cost to deploy Llama 3 models in Amazon SageMaker JumpStart ... - AWS Blog

Read More..

An engineer made history as Georgia Tech’s first Black graduate; 59 years later, he passes the torch to his granddaughter – Yahoo! Voices

Nearly 60 years after Atlanta native and engineer Ronald Yancey overcame barriers to become Georgia Institute of Technology's first Black graduate, he presented his granddaughter with her diploma as she followed in her family's footsteps.

Deanna Yancey, one of several relatives to have attended the public research university also known as Georgia Tech, graduated with a master's degree in electrical and computer engineering at Friday's spring commencement ceremony.

As she walked across the stage at the university's McCamish Pavilion, she greeted her grandfather with a smile and a hug, and he handed her the hard-earned diploma, an Instagram clip from Georgia Tech shows.

The elder Yancey's June 1965 achievement was recognized on campus with a sculpture of him dedicated in 2019, according to Georgia Tech.

The university says it was the first in the Deep South to integrate peacefully and without a court order. Georgia Tech admitted its first Black students in 1961.

Deanna Yancey, who earned an undergraduate engineering degree from Penn State University in 2020, says she didn't initially tell her family she was applying for an online master's program at her grandfather's alma mater, according to a news release from Georgia Tech.

"When I got in, I got to read the acceptance email to my grandfather," Deanna Yancey said in the release. "He was so happy. He almost started jumping; he was so excited."

She acknowledged her grandfather as a trailblazer at Georgia Tech.

"It's a different world to be known for something especially as powerful as a movement as he was able to start," the new graduate said in a video clip played at Friday's ceremony.

Ronald Yancey was rejected twice by Georgia Tech in the 1960s, and he and his family were told he did not fit the Tech model for success, according to a 2015 news release from the university.

In the meantime, he attended Morehouse, a historically Black college/university. Morehouse did not have an engineering program, though, so in the spring of 1961, Yancey again applied to Tech, the release stated.

He was accepted on the condition that he retake the SAT and pass a summer class, according to Georgia Tech.

"Once on campus, (Ronald) Yancey was cautioned against using public transportation or attending any athletic events for his own safety," the news release said. "He endured isolation; no one would sit near him in the classroom. He never had a lab partner. He did all of his papers and exams in ink so he could not be accused of cheating or have his work tampered with."

Ronald Yancey also had to complete graduation requirements not asked of other seniors, who were exempt from taking final exams. He, however, spent his last three weeks at Georgia Tech taking 18 exams across five classes, according to the university.

"To ensure that he made the grade, he requested and was given an additional six-hour exam for extra credit. He also had to write a 30-page paper on transistor theory," the release stated.

Ronald Yancey defied the odds and earned his electrical engineering degree from Georgia Tech 59 years before his granddaughter would achieve a similar feat.

"We are extremely proud that Deanna took the initiative to select her field, to quietly and quickly apply, arrange her curriculum and follow through with the completion of her matriculation," the elder Yancey said in the news release. "Deanna's graduate degree is truly an impressive achievement."


See more here:

An engineer made history as Georgia Tech's first Black graduate; 59 years later, he passes the torch to his granddaughter - Yahoo! Voices

Read More..