
Early Detection of Arthritis Now Possible Thanks to Artificial Intelligence – SciTechDaily

A new study finds that utilizing artificial intelligence could allow scientists to detect arthritis earlier.

Researchers have been able to teach artificial intelligence neural networks to distinguish between two different kinds of arthritis and healthy joints. The neural network was able to detect 82% of the healthy joints and 75% of cases of rheumatoid arthritis. When combined with the expertise of a doctor, it could lead to much more accurate diagnoses. Researchers are planning to investigate this approach further in another project.

This breakthrough by a team of doctors and computer scientists has been published in the journal Frontiers in Medicine.

There are many different varieties of arthritis, and determining which type of inflammatory illness is affecting a patient's joints may be difficult. Computer scientists and physicians from Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Universitätsklinikum Erlangen have now taught artificial neural networks to distinguish between rheumatoid arthritis, psoriatic arthritis, and healthy joints in an interdisciplinary research effort.

Within the scope of the BMBF-funded project "Molecular characterization of arthritis remission" (MASCARA), a team led by Prof. Andreas Maier and Lukas Folle from the Chair of Computer Science 5 (Pattern Recognition) and PD Dr. Arnd Kleyer and Prof. Dr. Georg Schett from the Department of Medicine 3 at Universitätsklinikum Erlangen was tasked with investigating the following questions: Can artificial intelligence (AI) recognize different forms of arthritis based on joint shape patterns? Is this strategy useful for making more precise diagnoses of undifferentiated arthritis? Is there any part of the joint that should be inspected more carefully during a diagnosis?

Currently, a lack of biomarkers makes correct categorization of the relevant form of arthritis challenging. X-ray images used to support diagnosis are also not completely trustworthy, since their two-dimensionality is insufficiently precise and leaves room for interpretation. This is in addition to the challenge of positioning the joint under examination for X-ray imaging.

To find the answers to its questions, the research team focused its investigations on the metacarpophalangeal joints of the fingers, regions in the body that are very often affected early on in patients with autoimmune diseases such as rheumatoid arthritis or psoriatic arthritis. A network of artificial neurons was trained using finger scans from high-resolution peripheral quantitative computed tomography (HR-pQCT), with the aim of differentiating between healthy joints and those of patients with rheumatoid or psoriatic arthritis.

HR-pQCT was selected as it is currently the best quantitative method of producing three-dimensional images of human bones in the highest resolution. In the case of arthritis, changes in the structure of bones can be very accurately detected, which makes precise classification possible.

A total of 932 new HR-pQCT scans from 611 patients were then used to check whether the artificial network could actually apply what it had learned: Could it provide a correct assessment of the previously classified finger joints?

The results showed that AI detected 82% of the healthy joints, 75% of the cases of rheumatoid arthritis, and 68% of the cases of psoriatic arthritis, which is a very high hit probability without any further information. When combined with the expertise of a rheumatologist, it could lead to much more accurate diagnoses. In addition, when presented with cases of undifferentiated arthritis, the network was able to classify them correctly.

"We are very satisfied with the results of the study as they show that artificial intelligence can help us to classify arthritis more easily, which could lead to quicker and more targeted treatment for patients. However, we are aware of the fact that there are other categories that need to be fed into the network. We are also planning to transfer the AI method to other imaging methods such as ultrasound or MRI, which are more readily available," explains Lukas Folle.

Whereas the research team was able to use high-resolution computed tomography, this type of imaging is only rarely available to physicians under normal circumstances because of restraints in terms of space and costs. However, these new findings are still useful, as the neural network detected certain areas of the joints that provide the most information about a specific type of arthritis, known as intra-articular hotspots. "In the future, this could mean that physicians could use these areas as another piece in the diagnostic puzzle to confirm suspected cases," explains Dr. Kleyer. "This would save time and effort during the diagnosis and is in fact already possible using ultrasound, for example." Kleyer and Maier are planning to investigate this approach further in another project with their research groups.

Reference: "Deep Learning-Based Classification of Inflammatory Arthritis by Identification of Joint Shape Patterns - How Neural Networks Can Tell Us Where to Deep Dive Clinically" by Lukas Folle, David Simon, Koray Tascilar, Gerhard Krönke, Anna-Maria Liphardt, Andreas Maier, Georg Schett and Arnd Kleyer, 10 March 2022, Frontiers in Medicine. DOI: 10.3389/fmed.2022.850552


Artificial Intelligence Tech Solutions Inc (OTCMKTS: AITX) Investors Looking for a Big Week Ahead as Robotics AI Innovator Secures New Deals &…

Artificial Intelligence Tech Solutions Inc (OTCMKTS: AITX) recently made a significant reversal off $0.0092 after months of downtrend. The stock was one of the biggest penny stock runners of 2021, skyrocketing from triple zeroes to highs near $0.30 per share. Now that the stock is based at a fraction of its former value, investors are starting to take notice. AITX is a really exciting company: through its wholly owned subsidiary, Robotic Assistance Devices, Inc. (RAD), it is making moves in the $100 billion-plus global security services market. RAD's current goal is to disrupt and capture a significant portion of both the human security guard market (over $30 billion) and the physical security (video surveillance, access control, visitor management, etc.) market (over $20 billion) through its innovative RAD solution ecosystem. AITX is an SEC filer and recently applied to uplist its shares to the fully reporting OTCQB market. AITX sales are on the rise, with the Company generating over $100,000 a month in revenue during 2022 and growing quickly.

AITX has been very busy in recent weeks. Besides the uplisting to OTCQB, AITX signed a new authorized dealer and expects to receive an order for at least 8 ROSA security robots from the dealer's largest client within days. With the addition of the new authorized dealer, RAD's dealer network has expanded to well over 40, covering the US, Canada, the United Kingdom, and the European Union. The Company received an order from Civitas PSG, one of the largest security companies in Romania, for an AVA (Autonomous Verified Access) access control device and one ROSA (Responsive Observation Security Agent) robotic surveillance unit. This will be RAD's first deployment of AVA in the European market. AITX also signed U.S. Secure Ventures (USSV) as a new authorized dealer and has received an order for a ROSA security robot from this new dealer. USSV is a commercial security services provider with offices in Dallas, TX, growing from a regional leader into a national authority in commercial and integrated security. Robotic Assistance Devices, Inc. (RAD) will host an event focused on public safety technology in New York City on Thursday, June 30, at a location in lower Manhattan, with the time to be determined.

Artificial Intelligence Tech Solutions Inc (OTCMKTS: AITX) is a high-tech start-up that delivers robotics and artificial intelligence-based solutions that empower organizations to gain new insight, solve complex security challenges, and fuel new business ideas at reduced costs. RAD developed its advanced security robot technology from the ground up, including circuit board design and base code development. This allows RAD to have complete control over all design elements, performance, quality, and the user's experience of all security robots, whether SCOT, ROSA, Wally, Wally HSO, AVA, ROAMEO, or RAD Light My Way. AITX achieved SOC 2 Type I certification in 2021 and is currently undergoing an audit required for Type II accreditation. Additional related certifications for GDPR, CE, and ISO 27001 are either underway or under consideration.

Artificial Intelligence Tech Solutions' mission is to apply Artificial Intelligence (AI) technology to solve enterprise problems categorized as expensive, repetitive, difficult to staff, and outside of the core competencies of the client organization. RAD's first industry focus is the more than $100 billion global security services market. RAD's current goal is to disrupt and capture a significant portion of both the human security guard market (over $30 billion) and the physical security (video surveillance, access control, visitor management, etc.) market (over $20 billion) through its innovative RAD solution ecosystem.

Robotic Assistance Devices, LLC was incorporated in the State of Nevada on July 26, 2016, as an LLC and was founded by current President Steve Reinharz. Mr. Reinharz has 25+ years in various leadership and ownership roles in the security industry and was part of a successful exit to a global multinational security company in 2004. Mr. Reinharz started his first security integration company in 1996, which he grew to 30+ employees before closing that company in 2003. RAD's first industry focus is the more than $100 billion global security services market, and its current goal is to disrupt and capture a significant portion of both the human security guard market (over $30 billion) and the physical security (video surveillance, access control, visitor management, etc.) market (over $20 billion) through its innovative RAD solution ecosystem.

ROSA is a compact, self-contained, portable security and communication solution that can be deployed in about 15 minutes. Like other RAD solutions, it only requires power, as it includes all necessary communications hardware. ROSA's AI-driven security analytics include human and vehicle detection, license plate recognition, responsive digital signage and audio messaging, and complete integration with RAD's software suite notification and autonomous response library. Two-way communication is optimized for cellular, including live video from ROSA's dual high-resolution, full-color, always-on cameras. RAD has published two case studies detailing how ROSA has helped eliminate instances of theft, trespassing, and loitering at car rental locations and construction sites across the country. (Image: ROSA installed at 7-Elevens in Pittsburg)

AVA is a compact and stanchion-mountable unit that provides an edge-to-edge 180° field of vision with advanced access control over gates and other controlled points of entry. AVA takes full advantage of the RAD Software Suite, providing an ideal solution for logistics and distribution centers, storage yards, parking structures and lots, corporate campuses, and anywhere that increased visibility is needed at a fraction of the cost. At ISC West in late March, AVA was named a winner of the 2022 SIA New Products and Solutions Awards in the category of Access Control Software, Hardware, Devices and Peripherals.

AITX recently reported its wholly owned subsidiary Robotic Assistance Devices, Inc. (RAD) will include gun detection in its upcoming release of version 7 of its analytics software suite. RAD further announced that in partnership with Centralized Vision, active monitoring of gun detection alerts will be offered at no cost for all RAD deployments, subject to terms and conditions to be announced later.

RAD's gun detection identifies the presence of sidearms and long guns. For clients who opt in, as soon as a gun is identified by RAD's AI-driven analytics, the system may perform a variety of actions, including activating a local autonomous alert and notifying remote monitoring, onsite security staff, and the appropriate authorities, ideally before any shots are fired. The alert could be in the form of an audible and visual response on the RAD device. This immediate response will give building security (#PROPTECH) and law enforcement precious minutes to respond to the situation, mitigating loss of life, injuries, and property losses. Full details, terms, and conditions will be released publicly in July. Gun detection will be available on all RAD devices and is backward compatible with RAD devices already deployed. Clients will be invited to opt in beginning in mid-June. RAD's gun detection analytic is just one of the many elements that will be prioritized and managed by the company's upcoming incident management system. The platform allows RAD dealers to avoid expensive and high-maintenance alarm management solutions and is part of RAD's efforts to rewrite the entire security industry's software library.

On June 3, AITX announced its wholly owned subsidiary Robotic Assistance Devices, Inc. (RAD) has signed USA Security as a new authorized dealer and has received an order for 2 ROSA security robots from this new dealer. USA Security designs fully integrated commercial security systems that utilize cutting-edge technology. USA Security is headquartered in Eden Prairie, Minnesota and supports a variety of industries across the United States. Although not named due to non-disclosure agreements, the Company confirmed that the 2 ROSA security devices will be deployed at a large retail center located in downtown Minneapolis, Minnesota. The dual ROSAs are expected to audibly greet shoppers with welcoming messages and visuals while performing routine surveillance of the propertys entrances from the parking garages.

Chris Daniels, Director of Sales and Marketing at USA Security, said: "RAD solutions are what the security industry needs right now. We expect to save this client close to $300,000 over the next three years with just two ROSAs. Our clients want to save money while keeping their properties and guests safe and secure. We're able to do that with RAD."

Currently trading at a $76 million market valuation, AITX has over $4 million in the treasury and $9 million in assets against just over $20 million in liabilities. Much of RAD's existing convertible debt was acquired in support of the RAD/SMP robotics program. This convertible debt has largely been converted to long-term debt and warrants and will not cause short-term dilution. AITX is a really exciting story developing in small caps: its subsidiary, Robotic Assistance Devices, Inc. (RAD), is making moves in the $100 billion-plus global security services market and finding great success under the able leadership of Steve Reinharz. AITX was one of the biggest runners of 2021, skyrocketing to highs near $0.30 per share. At current levels AITX has a massive gap to fill, ready liquidity, and a large group of investors who are already buying at current levels. Since the reversal just under a penny, AITX has been under heavy accumulation and looks ready for action. We will be updating on AITX when more details emerge, so make sure you are subscribed to Microcapdaily.

Disclosure: we hold no position in AITX either long or short and we have not been compensated for this article.


Val Kilmer's Return: A.I. Created 40 Models to Revive His Voice Ahead of Top Gun: Maverick – Variety

SPOILER ALERT: Do not read unless you have watched Top Gun: Maverick, in theaters now.

Top Gun fans knew ahead of time that Val Kilmer would be reprising his role of Tom "Iceman" Kazansky in the sequel, but the specifics of the actor's return were a question mark considering Kilmer lost the ability to speak after undergoing throat cancer treatment in 2014. The script for Paramount Pictures' Top Gun: Maverick pulls from Kilmer's real life, with Iceman also having cancer and communicating through typing. Kilmer gets to say one brief line of dialogue. In real life, Kilmer's speaking voice has been revived courtesy of artificial intelligence.

Kilmer announced in August 2021 that he had partnered with Sonantic to create an A.I.-powered speaking voice for himself. The actor supplied the company with hours of archival footage featuring his speaking voice, which was then fed through the company's algorithms and turned into a model. According to Fortune, this process was used again for the actor's Top Gun: Maverick appearance, although a studio source tells Variety no A.I. was used in the making of the movie.

"In the end, we generated more than 40 different voice models and selected the best, highest-quality, most expressive one," John Flynn, CTO and cofounder of Sonantic, said in a statement to Forbes about reviving Kilmer's voice, unrelated to the movie. "Those new algorithms are now embedded into our voice engine, so future clients can automatically take advantage of them as well."

"I'm grateful to the entire team at Sonantic who masterfully restored my voice in a way I've never imagined possible," Kilmer originally said in a statement about the A.I. "As human beings, the ability to communicate is the core of our existence, and the side effects from throat cancer have made it difficult for others to understand me. The chance to narrate my story, in a voice that feels authentic and familiar, is an incredibly special gift."

As Fortune reports: "After cleaning up old audio recordings of Kilmer, [Sonantic] used a voice engine to teach the voice model how to speak like Kilmer. The engine had around 10 times less data than it would have been given in a typical project, Sonantic said, and it wasn't enough. The company then decided to come up with new algorithms that could produce a higher-quality voice model using the available data."


15 top data science certifications | CIO

Data scientist is one of the hottest jobs in IT. Companies are increasingly eager to hire data professionals who can make sense of the wide array of data the business collects. If you are looking to get into this lucrative field, or want to stand out against the competition, certification can be key.

Data science certifications give you an opportunity not only to develop skills that are hard to find in your desired industry but also to validate your data science know-how so that recruiters and hiring managers know what they're getting if they hire you.

Whether you're looking to earn a certification from an accredited university, gain experience as a new grad, hone vendor-specific skills, or demonstrate your knowledge of data analytics, the following certifications (presented in alphabetical order) will work for you.

The Certified Analytics Professional (CAP) is a vendor-neutral certification that validates your ability to transform complex data into valuable insights and actions, which is exactly what businesses are looking for in a data scientist: someone who understands data, can draw logical conclusions and express to key stakeholders why those data points are significant. You'll need to apply and meet certain criteria before you can take the CAP or the associate-level aCAP exams. To qualify for the CAP certification exam, you'll need three years of related experience if you have a master's degree in a related field, five years of related experience if you hold a bachelor's in a related field, or seven years of experience if you have any degree unrelated to analytics. To qualify for the aCAP exam, you will need a master's degree and less than three years of related experience in data or analytics.

Cost: CAP exam: $495 for INFORMS members, $695 for non-members; aCAP exam: $200 for INFORMS members, $300 for non-members

Location: In person at designated test centers

Duration: Self-paced

Expiration: Valid for three years

Cloudera has discontinued its Cloudera Certified Professional (CCP) and Cloudera Certified Associate (CCA) certifications in favor of the new Cloudera Data Platform (CDP) Generalist certification, which verifies proficiency with the platform. The new exam tests general knowledge of the platform and applies to multiple roles, including administrator, developer, data analyst, data engineer, data scientist, and system architect. The exam consists of 60 questions and the candidate has 90 minutes to complete it.

Cost: $300

Location: Online

Duration: 90 minutes

Expiration: Valid for two years

The Data Science Council of America (DASCA) Senior Data Scientist (SDS) certification program is designed for professionals with five or more years of experience in research and analytics. It's recommended that students have knowledge of databases, spreadsheets, statistical analytics, SPSS/SAS, R, quantitative methods, and the fundamentals of object-oriented programming and RDBMS. The program includes five tracks that will appeal to a range of candidates; each track has differing requirements in terms of degree level, work experience, and prerequisites to apply. You'll need at least a bachelor's degree and more than five years of experience in data science to be eligible for some tracks, while other tracks require a master's degree or past certifications.

Cost: $775

Location: Online

Duration: Self-paced

Expiration: Valid for five years

The Data Science Council of America (DASCA) offers the Principal Data Scientist (PDS) certification, which includes several tracks for data science professionals with 10 or more years of experience in big data. The exam covers everything from fundamental to advanced data science concepts, such as big data best practices, business strategies for data, building cross-organizational support, machine learning, natural language processing, stochastic modeling, and more. The exam is designed for seasoned, high-achieving data science thought and practice leaders.

Cost: $850 (track 1); $1,050 (track 2); $750 (track 3); $1,250 (track 4)

Location: Online

Duration: Self-paced

Expiration: Credentials do not expire

The IBM Data Science Professional Certificate consists of nine courses on data science, open source tools, data science methodology, Python, Databases and SQL, data analysis, data visualization, machine learning, and a final applied data science capstone. The certification coursework takes place online through Coursera with a flexible schedule and takes an average of three months to complete, but you are free to take more or less time. The course includes hands-on projects to help you build a portfolio to showcase your data science talents to potential employers.

Cost: Free

Location: Online

Duration: Self-paced

Expiration: Credentials do not expire

Microsoft's Azure AI Fundamentals certification validates your knowledge of machine learning and artificial intelligence concepts and how they relate to Microsoft Azure services. It's a fundamentals exam, so you don't need extensive experience to pass. It's a good place to start if you are new to AI or AI on Azure and want to demonstrate your skills and knowledge to employers.

Cost: $99

Location: Online

Duration: Self-paced

Expiration: Credentials do not expire

The Azure Data Scientist Associate certification from Microsoft focuses on your ability to use machine learning to implement and run machine learning workloads on Azure. Candidates for the exam are tested on ML, AI solutions, NLP, computer vision, and predictive analytics. You will need to be skilled in deploying and managing resources, managing identities and governance, implementing and managing storage, and configuring and managing virtual networks.

Cost: $165

Location: Online

Duration: Self-paced

Expiration: Credentials do not expire

The Open Group Professional Certification Program for the Data Scientist Professional (Open CDS) is an experience-based certification without any traditional training courses or exams. You'll start at level one as a Certified Data Scientist; then you can move to level two, where you'll become a Master Certified Data Scientist; and finally, you can pass the third level to become a Distinguished Certified Data Scientist. Certification requires a three-step process that includes applying for the certification, completing the experience application form, and attending a board review.

Cost: Contact for pricing

Location: On-site

Duration: Varies by level

Expiration: Credentials do not expire

The AI and Machine Learning Professional certification from SAS demonstrates your ability to use open source tools to gain insight from data using AI and analytics skills. The certification consists of several exams that cover topics such as machine learning, natural language processing, computer vision, and model forecasting and optimization. You'll need to pass the SAS Certified Specialist exams in Machine Learning, Forecasting and Optimization, and Natural Language Processing and Computer Vision to earn the AI and Machine Learning Professional designation.

Cost: $180 per exam

Location: Online

Duration: Self-paced

Expiration: Credentials do not expire

The SAS Certified Advanced Analytics Professional Using SAS 9 credential validates your ability to analyze big data with a variety of statistical analysis and predictive modeling techniques. You'll need experience in machine learning and predictive modeling techniques, including their use with big, distributed, and in-memory data sets. You should also have experience with pattern detection, experimentation in business optimization techniques, and time-series forecasting. The certification requires passing three exams: Predictive Modeling Using SAS Enterprise Miner 7, 13, or 14; SAS Advanced Predictive Modeling; and SAS Text Analytics, Time Series, Experimentation, and Optimization.

Cost: $250 for the Predictive Modeling Using SAS Enterprise Miner exam; $180 each for the other two required exams

Location: Online

Duration: Self-paced

Expiration: Credentials do not expire

The SAS Certified Data Scientist certification is a combination of the other two data certifications offered through SAS. It covers programming skills; managing and improving data; transforming, accessing, and manipulating data; and how to work with popular data visualization tools. Once you earn both the Big Data Professional and Advanced Analytics Professional certifications, you can qualify to earn your SAS Certified Data Scientist designation. You'll need to complete all 18 courses and pass the five exams between the two separate certifications.

Cost: $180 per exam

Location: Online

Duration: Self-paced

Expiration: Credentials do not expire

The TensorFlow Developer Certificate program is a foundational certificate for students, developers, and data scientists who want to demonstrate practical machine learning skills through the building and training of models using TensorFlow. The exam tests your knowledge of and ability to integrate machine learning into various tools and applications. To pass the exam you will need to be experienced with the foundational principles of ML and deep learning, building ML models, image recognition algorithms, deep neural networks, and natural language processing.

Cost: $100 per exam

Location: Online

Duration: Self-paced

Expiration: Credentials do not expire


How to Become a Better Data Science Team – Built In

A lot of my articles, as well as much of the writing on data science in general, focus on the work of individual data scientists. In this article, though, I want to focus on something different: the data science team. But first, let's define what such a team usually consists of. Although this configuration isn't set in stone, here is an example of a data science team: a few data scientists, a data engineer, a business/data analyst and a data science manager.

The specific composition of the team is less important than how the team works together, however. With that being said, let's look at the tools and methods you can use to improve collaboration among your data science team, whether you are a data scientist, a manager or possibly a technical recruiter.

More From Matt Przybyla: How to Use SQL in Python

This first tool is a combination of planning and grooming. These terms can be a little muddled, though, so let's define them first.

Grooming falls under the umbrella of organization, but what sets this process apart from planning (in certain companies) is that it serves as the first review of whatever is in your backlog. This queue may be composed of several Jira tickets or other general tasks that your team has come up with over time but has not yet prioritized into an active process.

You can think of planning as more specific on a sprint level. Even if you don't use Jira, you can still plan weekly, bi-weekly, or on whatever cadence you prefer, and log it with more check-ins. Typically, in these check-ins, you'll discuss upcoming projects. More importantly, though, you'll address the digestible tasks of a particular project for that given week or time period.

Here are a few takeaways and benefits that can come from collaborating on planning and grooming:

Once again, these terms might switch meanings or be interchangeable depending on your company's processes or whether you're working in an agile environment. What's important, however, is improving your team's overall organizational ability.

Stakeholder updates are not often discussed in the process of becoming a data scientist since the training is usually more focused on learning algorithms, coding, and the respective, underlying concepts of each of those.

Stakeholders are the people (or the single person) who assign your tasks or projects or who will digest your final project and its impact on the business. That being said, stakeholders do not all have the same role; they may be data science managers, product managers, sales engineers or in some other position, depending on the company.

You can always update stakeholders through Jira tickets, Slack messages, Google Slide decks and many other methods. The point is not the platform you use; it's the way in which you share your information and updates.

Here are some ways that you and your team can effectively update stakeholders:

Also look at breakdowns of specific groups of data: You may organize it geographically, by type and so on.

There are many ways to explain and update your data science projects, but the most important thing is how you articulate them.

Finally, retrospectives are crucial to your data science team. This meeting is usually a thorough discussion of a few situations that your team has faced and can be held bi-weekly or monthly.

Planning and grooming take place before the project or task, and stakeholder updates occur during and often at the end of the task. The retrospective, however, encompasses everything that happened in the project's entire timeframe.

You will typically look at a few things in this retrospective discussion:

All these questions and areas will generally cover everything important that has happened over the given timeframe. You will gain a better sense of what's important to the team, the company and yourself.

Start Your New Career in Data Science: Careers in Data Science: How to Get Started

While improving your own work will improve the team, you can focus on other, more team-centric items to make your data science team even better overall. To summarize, here are three ways that your data science team as a whole can improve:


Roadblocks to getting real-time AI right – VentureBeat

Analysts estimate that by 2025, 30% of generated data will be real-time data. That is 52 zettabytes (ZB) of real-time data per year, roughly the amount of total data produced in 2020. Data volumes have grown so rapidly that 52 ZB is three times the amount of total data produced in 2015. With this exponential growth, it's clear that conquering real-time data is the future of data science.

Over the last decade, technologies have been developed by the likes of Materialize, Deephaven, Kafka and Redpanda to work with these streams of real-time data. They can transform, transmit and persist data streams on-the-fly and provide the basic building blocks needed to construct applications for the new real-time reality. But to really make such enormous volumes of data useful, artificial intelligence (AI) must be employed.

Enterprises need insightful technology that can create knowledge and understanding with minimal human intervention to keep up with the tidal wave of real-time data. Putting this idea of applying AI algorithms to real-time data into practice is still in its infancy, though. Specialized hedge funds and big-name AI players like Google and Facebook make use of real-time AI, but few others have waded into these waters.

To make real-time AI ubiquitous, supporting software must be developed. This software needs to provide four things: a seamless way to work with both static and dynamic data, data cleaning that takes little human labor, an easy path from research to production, and systems that a small team can understand and modify as the world changes.

Developers and data scientists want to spend their time thinking about important AI problems, not worrying about time-consuming data plumbing. A data scientist should not care if data is a static table from Pandas or a dynamic table from Kafka. Both are tables and should be treated the same way. Unfortunately, most current generation systems treat static and dynamic data differently. The data is obtained in different ways, queried in different ways, and used in different ways. This makes transitions from research to production expensive and labor-intensive.

To really get value out of real-time AI, developers and data scientists need to be able to seamlessly transition between using static data and dynamic data within the same software environment. This requires common APIs and a framework that can process both static and real-time data in a UX-consistent way.

The sexiest work for AI engineers and data scientists is creating new models. Unfortunately, the bulk of an AI engineer's or data scientist's time is devoted to being a data janitor. Datasets are inevitably dirty and must be cleaned and massaged into the right form. This is thankless and time-consuming work. With an exponentially growing flood of real-time data, this whole process must take less human labor and must work on both static and streaming data.

In practice, easy data cleaning is accomplished by having a concise, powerful, and expressive way to perform common data cleaning operations that works on both static and dynamic data. This includes removing bad data, filling missing values, joining multiple data sources, and transforming data formats.
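As a rough, static-data illustration of what those operations look like in code (a sketch only, using Pandas with hypothetical column names; the article's point is that the same logic should also run unchanged on streaming tables), the steps might be:

```python
import pandas as pd

# Hypothetical static inputs; in a real-time system these would be streaming tables.
events = pd.DataFrame({
    "user_id": [1, 2, None, 4],
    "ts": ["2022-06-01", "2022-06-02", "2022-06-03", None],
    "value": [10.0, None, 3.5, 7.2],
})
users = pd.DataFrame({"user_id": [1, 2, 4], "region": ["US", "EU", "US"]})

cleaned = (
    events
    .dropna(subset=["user_id"])                        # remove bad rows
    .astype({"user_id": int})                          # fix the join key's type after dropping NaNs
    .assign(value=lambda d: d["value"].fillna(0.0))    # fill missing values
    .merge(users, on="user_id", how="left")            # join multiple data sources
    .assign(ts=lambda d: pd.to_datetime(d["ts"]))      # transform data formats
)
```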

Currently, there are a few technologies that allow users to implement data cleaning and manipulation logic just once and use it for both static and real-time data. Materialize and ksqlDB both allow SQL queries of Kafka streams. These options are good choices for use cases with relatively simple logic or for SQL developers. Deephaven has a table-oriented query language that supports Kafka, Parquet, CSV, and other common data formats. This kind of query language is suited for more complex and more mathematical logic, or for Python developers.

Many, possibly even most, new AI models never make it from research to production. This holdup is because research and production are typically implemented using very different software environments. Research environments are geared towards working with large static datasets, model calibration, and model validation. On the other hand, production environments make predictions on new events as they come in. To increase the fraction of AI models that impact the world, the steps for moving from research to production must be extremely easy.

Consider an ideal scenario: First, static and real-time data would be accessed and manipulated through the same API. This provides a consistent platform to build applications using static and/or real-time data. Second, data cleaning and manipulation logic would be implemented once for use in both static research and dynamic production cases. Duplicating this logic is expensive and increases the odds that research and production differ in unexpected and consequential ways. Third, AI models would be easy to serialize and deserialize. This allows production models to be switched out simply by changing a file path or URL. Finally, the system would make it easy to monitor in real time how well production AI models are performing in the wild.

Change is inevitable, especially when working with dynamic data. In data systems, these changes can be in input data sources, requirements, team members and more. No matter how carefully a project is planned, it will be forced to adapt over time. Often these adaptations never happen. Accumulated technical debt and knowledge lost through staffing changes kill these efforts.

To handle a changing world, real-time AI infrastructure must make all phases of a project (from training to validation to production) understandable and modifiable by a very small team. And not just the original team it was built for: it should be understandable and modifiable by new individuals who inherit existing production applications.

As the tidal wave of real-time data strikes, we will see significant innovations in real-time AI. Real-time AI will move beyond the Googles and Facebooks of the world and into the toolkit of all AI engineers. We will get better answers, faster, and with less work. Engineers and data scientists will be able to spend more of their time focusing on interesting and important real-time solutions. Businesses will get higher-quality, timely answers from fewer employees, reducing the challenges of hiring AI talent.

When we have software tools that facilitate these four requirements, we will finally be able to get real-time AI right.

Chip Kent is the chief data scientist at Deephaven Data Labs.



Wolters Kluwer shows how data is rewriting the future of audit – Business Wire

NEW YORK--(BUSINESS WIRE)--Colleen Knuff, Vice President of Product Management for Audit at Wolters Kluwer Tax & Accounting North America, will be leading an Education Lab session at the AICPA Engage Conference in Las Vegas, Nevada. In her session, "Increase Your Value as a Client Advisor," Colleen will explain why the AICPA emphasizes Audit Data Analytics (ADA) in reporting requirements, and how ADA tools create new possibilities for auditors to add value as advisors to their clients.

Additionally, on Tuesday, June 7, Knuff will be presenting a session, "Rewriting the Book on Audit," during which attendees will learn how audit work is trending toward data-driven audits, or cloud-enabled data integration. She will explain the technologies behind Wolters Kluwer's evolving data-driven audit and show how data and automation can be incorporated into the audit process to reduce risk, add value, and complete audits more quickly.

"We provide diagnostics and coaching in our audit approach to guide auditors through the work they need to perform. Doing so creates a much more optimized and intelligent workflow that yields the profitable audit and satisfied client that every firm is looking for," says Knuff. "Currently, you provide the audit opinion on financial statements, but wouldn't clients be so much happier if you also came back to them with data visualizations to say, 'Here's how you compare to your peers in your industry,' or 'here's an analysis we've done over several years, along with where we think your business is going.'"

ADA tools like Wolters Kluwer's TeamMate Analytics provide auditors with robust analytics without needing highly specialized technical training or data science expertise, making it possible to analyze even large amounts of data efficiently and effectively. Colleen will present key concepts during her presentation on Monday during the Education Lab session.

"The biggest complaint I get from our auditors is that sample sizes are too big. But the reality is if that is what your risk assessment is producing, then you have to approach it in a different way. So there's a lot of savings with TeamMate Analytics because it digs into those transactions that fall outside the curve or the spectrum of normal," says Christopher O'Neal, CPA and Partner at Roedl Management Inc.

Wolters Kluwer's suite of cloud-based audit solutions is constructed with this same emphasis on data and cloud-enabled automation. With CCH Axcess Engagement (currently in an early adopter phase) and CCH Axcess Knowledge Coach (Wolters Kluwer's proprietary risk-based audit methodology solution) serving as the foundation, additional tools like TeamMate Analytics round out the suite for fully integrated data and data analytics throughout the audit process.

"Wolters Kluwer is innovating and investing in ideas like data-driven audit, which will help automate time-consuming and highly manual tasks that are performed in almost every engagement. We are incorporating data analytics and harnessing those analytics to perform preliminary analysis on our clients, spot patterns and identify trends. That's part of our Data-Driven Approach that we're very, very excited about," says Knuff.

Visit booth 927 during the 2022 AICPA Engage Conference to learn more about the award-winning portfolio of tax, accounting, and audit solutions from Wolters Kluwer.

About Wolters Kluwer

Wolters Kluwer (WKL) is a global leader in professional information, software solutions, and services for the healthcare; tax and accounting; governance, risk and compliance; and legal and regulatory sectors. We help our customers make critical decisions every day by providing expert solutions that combine deep domain knowledge with advanced technology and services.

Wolters Kluwer reported 2021 annual revenues of €4.8 billion. The group serves customers in over 180 countries, maintains operations in over 40 countries, and employs approximately 19,800 people worldwide. The company is headquartered in Alphen aan den Rijn, the Netherlands.

Wolters Kluwer shares are listed on Euronext Amsterdam (WKL) and are included in the AEX and Euronext 100 indices. Wolters Kluwer has a sponsored Level 1 American Depositary Receipt (ADR) program. The ADRs are traded on the over-the-counter market in the U.S. (WTKWY).

For more information, visit http://www.wolterskluwer.com, follow us on Twitter, Facebook, LinkedIn, and YouTube.


Modern immigrants’ children have climbed the economic ladder as fast as the Ellis Island generation – Princeton University

Long before Leah Boustan was a professor of economics at Princeton, she was a Princeton undergraduate putting the final touches on her senior thesis.

Working alongside her advisor, longtime professor Henry Hank Farber, Boustan published a 100-page research project that compared outcomes for students who dropped out of high school in the early 1960s with those who dropped out decades later.

"I cant remember the exact moment I decided to become an economic historian, but I remember telling Hank I was really interested in comparisons of cohorts over time," Boustan said. "That interest is the basis for a lot of my work even today."

Twenty-two years after graduating from Princeton, Boustan has published a book that uses troves of data and the latest innovations in data science to examine an issue Boustan considers "one of the most fraught issues in U.S. politics" both today and in the past: immigration.

Leah Boustan as a Princeton undergraduate with her father, Harlan Platt

Photo courtesy of Leah Boustan

Written with her longtime collaborator Ran Abramitzky of Stanford University, Streets of Gold: America's Untold Story of Immigrant Success introduces the public to more than a decade of her rigorous, empirical research on the personal and society-wide impacts of immigration.

Weaving together personal family stories, including their own, with insights from the data, Boustan and Abramitzky tell an uplifting story about the promise of immigration. One finding Boustan found particularly surprising is how well children of immigrants have done economically, both today and in the past.

"The fact that children of immigrants who came from poor families in the 1980s moved up the economic ladder at the same pace as children of the Ellis Island generation: that floored me," said Boustan.

One hundred years ago, Italy, a major sending country of immigrants to the U.S., had about half the GDP per capita of the United States. Once in America, however, the sons of Italian immigrants rose up. Those who grew up in the 25th percentile of income distribution in the late 1800s earned enough as adults to be near the 60th percentile.

Today, children of immigrants from Nicaragua, which has about one-tenth the GDP per capita of the United States, see similar rates of economic mobility.

"Theres no reason that has to be true but it turned out to be," Boustan said. "It's something really remarkable we're able to see because of the data."

That data and the methodologies Boustan and Abramitzky developed to make use of it deserve almost as much attention as the findings.

In addition to working with and linking modern data like IRS tax records and birth certificate files, a partnership with the genealogy website Ancestry.com made it possible for Boustan and Abramitzky to automate searches and follow millions of families over more than 100 years of Census data. From there, they worked with audio recordings of historical interviews and congressional speeches, using machine-learning tools to analyze these texts and glean big-data insights.

The rigor of the research is one reason it's so groundbreaking, and a tradition Boustan can trace to her days as a Princeton undergraduate.

As a high school debate student interested in public policy, Boustan applied early decision to Princeton with the aim of declaring a concentration at the School of Public and International Affairs (SPIA). But her time learning from Professor Farber changed her mind.

"I took economics classes in order to major in SPIA, she said. One of those classes was ECO 313: Applied Econometrics with Hank Farber, where we used real data sets to answer questions. I fell in love with that class."

Boustan told Farber she wanted to spend a summer working in Washington, but he persuaded her to stay in Princeton instead to learn more about data analysis and to see how building expertise in a discipline like economics could help her produce the kind of policy-relevant work that legislators really need.

"So thats what I did, and I never really looked back," she said.

Boustan declared Economics as her concentration at Princeton and started spending her free time in the computer lab of the Industrial Relations (IR) Section, a group widely known for training and supporting some of the most famous labor economists and empiricists in the field, including 2021 Nobel Laureates and Princeton alumni David Card and Joshua Angrist.

From there, she went on to earn a Ph.D. from Harvard. After several years teaching at the University of California Los Angeles, Boustan returned to Princeton in 2017.

As Boustan hits the road to talk about her new book, she's able to marvel at how things have come full circle. When Farber signed on as Boustan's undergraduate senior thesis advisor, he was the director of the IR Section. Last year, Boustan herself was given the title, an honor she doesn't take for granted.

"The IR Section is a true intellectual community," she said. "The faculty sit right beside the graduate students, almost like in a lab, and work closely together. And the research coming out of the Section is always connected to the real world, from minimum wage to unemployment to the immigration work that I have been doing."

For Farber, who says Boustan was one of the best undergraduate students he had the pleasure of teaching in his 30 years at Princeton, having Boustan as a colleague has been a source of some pride.

Farber also noted how Boustan, no longer the student, has herself excelled in the role of advisor. "Leah has really played a key role in guiding the IR Section not only on the research side, but on the teaching side as well," he said.

In addition to committing her time as an advisor to dozens of undergraduate and graduate students, Boustan recently took on the task of teaching Princeton's Principles of Microeconomics course, a popular class for undergraduates across a wide range of majors.

"I taught that class for many years myself," said Farber. "It was wonderful to see Leah make it her own and take it in a whole new direction. The reaction from students this year was very positive."

Boustan says her research and her role as an economic historian give her hope for the future of immigration policy.

"Sometimes we feel so stuck. We feel polarized. Congress can't pass legislation. On immigration, we've been at a stalemate for 50 years. But you look at history, and you see we've had wild change. It reminds me I'm living in one small moment in history. I think economic history helps us recognize the possibility of scope for change."

Specifically, she hopes much-needed policy change will come for the Dreamers, undocumented immigrants who came to the United States as children. "Our research shows the most optimistic vision of what the children of immigrants can achieve," she said.

Because it can take 30 to 40 years to follow children into the labor market, her research on modern immigration focuses on children born in the 1980s. These children lived in households that benefited from immigration amnesty programs during the Reagan administration. Boustan worries that studies of more recent immigrant arrivals, many of whom are undocumented without any path to citizenship, could produce less optimistic findings.

"Im worried about the next generation and what I'll find when we write Streets of Gold 2.0," she said. "Theres lots of promise for children of immigrants if they and their parents have some pathway into the formal labor market. I think its urgent to pass DACA as legislation and really return to the idea of comprehensive immigration reform."

Readers interested in learning more can read about five immigration myths dispelled in "Streets of Gold." This Thursday, June 9, at 8:30 p.m. ET, Professor Boustan will answer questions about her research in a Twitter Spaces event with Joey Politano, author of the Apricitas Economics blog. Join the event.


City of Bloomington Partners with Google.org to Improve Access to Government Services – City of Bloomington

Bloomington, Ind. - Mayor John Hamilton announced today that the City of Bloomington will receive pro bono support from a team of Google.org Fellows to deploy CiviForm, a tool to simplify and centralize online applications for government assistance programs. CiviForm is an open-source tool developed originally by the City of Seattle with support from Google.org to make applying for government programs easier and faster.

A team of 12 Google employees will work full-time with the City of Bloomington for six months as part of a Google.org Fellowship, providing pro bono technical expertise to nonprofits and civic entities. The City of Bloomington ITS Department (Information & Technology Services) will lead this initiative in partnership with Parks & Recreation.

The CiviForm pilot program will focus on improving the application processes for public benefits programs like the Parks Foundation Youth Scholarship program and the ITS Surplus Computer Request process. After the Fellowship ends, City staff can continue using CiviForm further to improve online access to other City services.

"This partnership can help our residents access and apply for City programs. It can help City departments review applicants in an equitable and consistent manner. That's a win-win," said Mayor Hamilton, "and good local democracy in action."

"Google and the City of Bloomington share a commitment to creating opportunity for everyone," said Rob Biederman, Director of External Affairs for Google. "By bringing together the best of Google's tech expertise with the City's knowledge of the community's needs, we hope to simplify the benefits application process for Bloomington residents."

During a project coordinating site visit last week, at the direction of the City, Google researchers conducted user interviews with residents to gain a better understanding of customer needs and experiences related to program access and online applications. This input will help the City improve its online experience for customers.

PROJECT SUMMARY

The City of Bloomington views CiviForm (initially built through a Google.org Fellowship with the City of Seattle) as a means of supporting the City's goal of providing sustainable, resilient, and equitable economic opportunity for all City residents by enabling residents to apply for City services.

Many Bloomington residents have limited awareness of City programs and must navigate complicated enrollment steps to apply -- some of which are still offline. Google.org Fellows will collaborate across City Departments to deploy CiviForm to enable low-income residents to enter their information once to apply to many programs securely and efficiently.

The City of Bloomington's goals using CiviForm:

ABOUT GOOGLE.ORG FELLOWSHIP PROGRAM


How to Find Residuals in Regression Analysis – Built In

Regression models, both single and multivariate, are the backbone of many types of machine learning. Using the structure you specify, these tools create equations that match the modeled data set as closely as possible. Regression algorithms create the optimal equation by minimizing the error between the results predicted by the model and the provided data.

That said, no regression model will ever be perfect (and if your model does appear to be nearly perfect, I recommend you check for overfitting). There will always be a difference between the values predicted by a regression model and the actual data. Those differences will change dramatically as you change the structure of the model, which is where residuals come into play.

The residual for a specific data point is the difference between the value predicted by the regression and the observed value for that data point. Calculating the residual provides a valuable clue into how well your model fits the data set. A poorly fit regression model will yield residuals for some data points that are very large, which indicates the model is not capturing a trend in the data set. A well-fit regression model will yield small residuals for all data points.

Let's talk about how to calculate residuals.

In order to calculate residuals we first need a data set for the example. We can create a fairly trivial data set using Python's Pandas, NumPy and scikit-learn packages. You can use the following code to create a data set that's essentially y = x with some noise added to each point.
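The code block referenced here did not survive the page extraction, so what follows is a minimal sketch of what it plausibly looked like. The column names (Dependent for x, Independent for y) come from the article itself; the exact noise term and the fixed seed are assumptions.

```python
import numpy as np
import pandas as pd

np.random.seed(42)  # assumed; only here to make the example reproducible

# 10-point data set where Independent (y) is roughly equal to Dependent (x)
df = pd.DataFrame(index=range(10))
df['Dependent'] = df.index                                                   # x values: 0 through 9
df['Independent'] = df['Dependent'] * np.random.uniform(0.9, 1.1, len(df))  # y = x with a little noise
```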

That code performs the following steps: it builds a short index to give the data set its rows, uses the index values directly as the Dependent (x) series, and adds a small amount of random noise to each value to produce the Independent (y) series, so that y is roughly equal to x.

We can now use that data frame as our sample data set.

Want More Data Science Tutorials? We Got You: How to Use Float in Python (With Sample Code!)

The Dependent variable is our x data series, and the Independent variable is our y. Now we need a model that predicts y as a function of x. We can do that using scikit-learn's linear regression model with the following code.
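The regression snippet is also missing from the extracted text; a sketch under the same assumptions might be:

```python
from sklearn.linear_model import LinearRegression

# Fit a first-order model that predicts Independent (y) from Dependent (x)
model = LinearRegression()
model.fit(df[['Dependent']], df['Independent'])

# Store the model's prediction for each point alongside the observed data
df['Calculated'] = model.predict(df[['Dependent']])
```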

That code works as follows: it creates a LinearRegression object, fits it using the Dependent column as the input and the Independent column as the target, and stores the model's predictions for each point in a new Calculated column.

If the model perfectly matches the data set, then the values in the Calculated column will match the values in the Independent column. We can plot the data to see if it does or not.

nope.

We could have seen that coming because we used a first-order linear regression model to match a data set with known noise in it. In other words, we know that this model would have perfectly fit y = x, but the variation we added in each data point made every y a bit different from the corresponding x. Instead of perfection, we see gaps between the Regression line and the Data points. Those gaps are called the residuals. See the following plot, which highlights the residual for the point at x = 4.

To calculate the residuals, we need to find the difference between the calculated value for the independent variable and the observed value for the independent variable. In other words, we need to calculate the difference between the Calculated and Independent columns in our data frame. We can do so with the following code:
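The one-liner itself is missing from the extracted text; following the article's definition (Calculated minus Independent), it could be as simple as the following, with the column name Residual being an assumption:

```python
# Residual = value predicted by the model minus the observed value
df['Residual'] = df['Calculated'] - df['Independent']
```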

We can now plot the residuals to see how they vary across the data set. Here's an example of a plotted output:
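The plot image itself did not carry over; a matplotlib sketch that would produce a comparable residuals plot (axis labels taken from the surrounding text) is:

```python
import matplotlib.pyplot as plt

plt.scatter(df['Dependent'], df['Residual'])
plt.axhline(0, color='grey', linewidth=0.8)  # reference line at zero error
plt.xlabel('X Data')
plt.ylabel('Residual')
plt.show()
```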

Notice how some of the residuals are greater than zero and others are less than zero. This will always be the case! Since linear regression reduces the total error between the data and the model to zero, the result must contain some errors less than zero to balance out the errors that are greater than zero.

You can also see how some of the errors are larger than others. Several of the residuals are in the +0.25 to 0.5 range, while others have an absolute value in the range of 0.75 to 1. These are the signs that you look for to ensure a model is well fit to the data. If there's a dramatic difference, such as a single point or a clustered group of points with a much larger residual, you know that your model has an issue. For example, if the residual at x = 4 was -5, that would be a clear sign of an issue. Note that a residual that large would probably indicate an outlier in the data set, and you should consider removing the point using interquartile range (IQR) methods.

More on IQR? Coming Right Up: How to Find Outliers With IQR Using Python

To highlight the argument that residuals can demonstrate a poor model fit, let's consider a second data set. To create the new data set I made two changes. The changed lines of code are as follows:
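Those two lines are also missing from the extracted text; based on the description that follows, they plausibly looked something like this (the noise term is again an assumption):

```python
df = pd.DataFrame(index=range(100))   # change 1: 100 points instead of 10
# ... the other data set construction lines stay the same ...
df['Independent'] = df['Dependent'] ** 2 * np.random.uniform(0.9, 1.1, len(df))  # change 2: parabolic data
```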

The first change increased the length of the data frame index to 100. This created a data set with 100 points, instead of the prior 10. The second change made the Independent variable a function of the Dependent variable squared, creating a parabolic data set. Performing the same linear regression as before (not a single letter of code changed) and plotting the data presents the following:

Since this is just an example meant to demonstrate the point, we can already tell that the regression doesn't fit the data well. There's an obvious curve to the data, but the regression is a single straight line. The regression under-predicts at the low and high ends, and over-predicts in the middle. We also know it's going to be a poor fit because it's a first-order linear regression on a parabolic data set.

That said, this visualization effectively demonstrates how examining the residuals can show a model with a poor fit. Consider the following plot, which I generated using the exact same code as the prior residuals plot.

Can you see the trend in the residuals? The residuals are very negative when the X Data is both low and high, indicating that the model is under-predicting the data at those points. The residuals are also positive when the X Data is around the midpoint, indicating the model is over-predicting the data in that range. Clearly the model is the wrong shape and, since the residuals curve only shows one inflection point, we can reasonably guess that we need to increase the order of the model by one (to two).

If we repeat the process using a second-order regression, we obtain the following residuals plot.
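The article does not show how the second-order fit was produced. One common way to do it with scikit-learn, shown here as a sketch rather than the author's exact code, is to expand the x data with polynomial features and reuse the same linear model:

```python
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Expand x into [x, x^2] so a linear model can fit a parabola
poly = PolynomialFeatures(degree=2, include_bias=False)
x_poly = poly.fit_transform(df[['Dependent']])

model = LinearRegression()
model.fit(x_poly, df['Independent'])

df['Calculated'] = model.predict(x_poly)
df['Residual'] = df['Calculated'] - df['Independent']
```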

The only discernible pattern here is that the residuals increase as the X Data increases. Since the Independent data includes noise, which is a function of the X Data, we expect that to happen. What we don't see are very large residuals, or indications of a different shape to the data set. This means we now have a model that fits the data set well.

And with that, you're all set to start evaluating the performance of your machine learning models by calculating and plotting residuals!
