Heard on the Street 3/20/2023 – insideBIGDATA

Welcome to insideBIGDATA's "Heard on the Street" round-up column! In this regular feature, we highlight thought-leadership commentaries from members of the big data ecosystem. Each edition covers the trends of the day with compelling perspectives that can provide important insights to give you a competitive advantage in the marketplace. We invite submissions with a focus on our favored technology topic areas: big data, data science, machine learning, AI and deep learning. Enjoy!

Let's pump the brakes on these wilder claims about AI. Commentary by Beerud Sheth, CEO at Gupshup

Since the end of last year, two topics have hogged the limelight. One is, of course, ChatGPT and generative AI technologies in general; the other is the disruption these technologies are going to create in jobs, education, creative fields and so on. With ChatGPT, we've already seen significant improvements in efficiency and capability, making the dividends amply clear. But the thing with technology is that it always has a dual impact: there are winners, and there are some losers as well. As a society, we need to make sure that we support and enable the people who lose out to adjust and adapt to this new world, while benefiting from the overall value created. We'll have to be careful to balance the common, greater good against individual wins and losses. A good way to think about AI is to look at it as augmenting human capability, because no matter how powerful AI gets, humans plus AI will be even more powerful than either humans alone or AI on its own. It's quite possible that we'll see the rise of new jobs where people work with AI as a partner. Think of someone who wants to make an animated movie and has just the perfect idea, but can't sketch, animate or program. Now, with ChatGPT, that person can prototype quickly. There's no question that lots of jobs will become augmented, and therefore there is a lot of potential for new jobs that didn't exist before. The future of AI is undoubtedly bright, and it will continue to transform various industries and aspects of our lives. We can expect AI to become more sophisticated and able to perform a wider range of tasks as it becomes more capable of learning and adapting to new environments. However, AI has its natural limits. For example, while AI can be used to diagnose medical conditions, it cannot replace the empathy and judgment of a human doctor. Similarly, while AI can analyze large data sets and make predictions, it may be limited by biases inherent in the data it's trained on.
Ultimately, the future of AI is likely to involve a balance between automation and human work.

Digital transformation and its effects on the hybrid workplace. Commentary by Krishna Nacha, SVP and Head of North America and Latin America at Iron Mountain

The hybrid model is here to stay, notwithstanding the fence sitters and the last few corporate holdouts. As digital transformation continues to accelerate, mobile employees working from flexible locations (including offices, hot-desking spaces, homes, local cafes, hotels, etc.) create an inherent information leakage and data security risk, which in most cases is unintentional. Organizations must not merely tighten, but completely redesign their policies and procedures around secure access to information across all access points and devices. In response to the demands of the hybrid workplace, the use of cloud-based services is naturally increasing, but it can be difficult to keep track of the tentacles of information spread across a multi-cloud environment. Knowledge of data residency, including what information you should hold, where it is stored, which regulatory framework(s) govern it, and who within your organization has access to it, is critical to data security and privacy. In most parts of the world, businesses operate under local data regulations that dictate how the data of a nation's citizens or residents must be collected, cleaned, processed and stored within its borders. The primary reason enterprises choose to store data in different locations is often related to regulations, data policies and taxes. However, companies are allowed to transfer data after complying with local data protection and privacy laws. In this scenario, businesses must notify users and get their consent before obtaining and using their information. AI can be used to aggregate, analyze and clearly present data and to extract the most relevant information, so that businesses can make the right decisions around data sovereignty. AI can also be applied to content search and redaction, whereby personally identifiable information in documents is hidden from unauthorized access, and to better integrate systems and overcome silos.
From an information lifecycle viewpoint, AI can be used to identify and delete unnecessary data, as well as to support compliance and governance.

School shootings won't stop. How can AI help? Commentary by Tim Sulzer, CTO and co-founder of ZeroEyes

In 2022, the U.S. saw more than 300 shooting incidents on school grounds. Clearly, existing security approaches are not enough to fully protect students and faculty against gun-related threats. Specially trained AI has the potential to detect brandished guns the moment they appear on a security camera, which is often several minutes before the shooter begins firing. I'm not suggesting that AI has the capacity to differentiate between a lethal weapon brandished with ill intent and something that just looks like one (e.g., a toy gun, wallet or phone), but trained firearms experts certainly have this ability. The combination of AI-based gun detection that is trained on an extensive dataset and an expert eye can result in school staff and first responders receiving alerts with critical situational information in mere seconds. These extra minutes of advance notice could give schools time to lock down and law enforcement time to arrive and make an arrest before the first shot is fired, or lead to quicker response times to treat injuries in the aftermath.

Preparing for AI regulation. Commentary by Triveni Gandhi, Responsible AI Lead and Jacob Beswick, Director of AI Governance Solutions for Dataiku

We know that regulation of the use of AI is forthcoming and that complying with it will become important to organizations. While there are no AI regulations yet, there are standards trickling out of international standards organizations, NIST, and some governments, such as AI Verify in Singapore. The question that remains to be answered by organizations is: Should you start self-regulating in alignment with the intentions set out by these frameworks, even if obligations don't yet exist? We would argue that ChatGPT provides a good opportunity for this question to be asked and answered. We would argue further that the answer is: Yes, self-regulate. Fundamentally, this should look like testing, validating and monitoring towards reliability, accountability, fairness and transparency. So where does self-regulation begin? First, you need to understand the model you're using (risks, strengths, weaknesses). Second, you need to understand your use cases (risks, audiences, objectives). Being transparent about the use of these tools is important, and making sure all output is consistent and factual will be a differentiator for the most successful companies looking to use this tool. Being transparent doesn't just mean saying that you're using it; it means building out your organization's internal capabilities to speak to the model it's leveraging, the use case it's deploying, and what it's done to ensure the model is being used safely and in alignment with the objectives set out. Without doing this, no matter how promising the use case, the organization is exposing itself to risks wherever there is a departure from what's been promised to end users. That risk could range from the financial to the embarrassing. ChatGPT is not the only AI in use; it's only the most popular for now. Regulation and oversight of other types of AI systems are still incredibly relevant, and we shouldn't lose sight of that.

On Chief Data Officer & Data Governance. Commentary by James Beecham, Founder and CEO of ALTR

Seeing Data Governance at the top of this list aligns with a number of leading indicators for CDO attention and spend we have seen at ALTR. With reduced budgets and head counts, we are hearing from the industry that base-level governance topics will take priority in 2023. Things like improving data pipelines for speed of data delivery, data security, streamlined data access, and data quality will take precedence over initiatives like lineage or data cataloging. I think a number of data catalog projects have stalled or remain in jeopardy because catalog workloads tend to boil the ocean. Look for small projects within data governance being completed quickly by tightly aligned teams. Key to this will be data governance tool sets that interoperate without requiring large professional services spends to realize base-level data governance practices such as security and pipeline improvement.

Machine Learning and Adaptive Observability Take off at Edge. Commentary by Maitreya Natu, Chief Data Scientist, Digitate

More organizations are running machine learning algorithms on edge devices, which allows for faster analysis and eliminates the need to transfer large amounts of data. Instead of the traditional model of large servers that analyze large volumes of data, ML on the edge opens up creative avenues to perform analysis at the source itself. Adaptive observability presents a very promising use case for ML at the edge. Adaptive observability is an intelligent approach to collecting just the right monitoring data at the right time. The basic idea is to collect high-fidelity data when the system is unhealthy, and low-fidelity data otherwise. The analytics engine at the edge can profile normal behavior, assess system health, detect anomalies, predict future behavior, and thus recommend the right monitoring approach. Adaptive observability is thus able to collect just the right amount of data at the source, and also detect any abnormal behavior at the origin itself.
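
To make the idea concrete, here is a minimal Python sketch of adaptive sampling. It is not Digitate's implementation: the class name, the z-score rule, and the interval and threshold values are all assumptions chosen for illustration. The sampler keeps a rolling profile of recent readings and recommends a short collection interval only when a reading looks anomalous.

```python
from collections import deque
from statistics import mean, stdev


class AdaptiveSampler:
    """Recommends a monitoring interval from recent metric behavior (illustrative)."""

    def __init__(self, window=30, z_threshold=3.0,
                 slow_interval_s=60, fast_interval_s=5):
        # Rolling profile of "normal" readings; all defaults are assumed values.
        self.history = deque(maxlen=window)
        self.z_threshold = z_threshold
        self.slow_interval_s = slow_interval_s
        self.fast_interval_s = fast_interval_s

    def is_anomalous(self, value):
        # Too little history to judge: treat the reading as normal.
        if len(self.history) < 5:
            return False
        mu = mean(self.history)
        sigma = stdev(self.history)
        if sigma == 0:
            return value != mu
        return abs(value - mu) / sigma > self.z_threshold

    def observe(self, value):
        """Return the recommended seconds until the next collection."""
        anomalous = self.is_anomalous(value)
        if not anomalous:
            # Only healthy readings update the profile of normal behavior.
            self.history.append(value)
        return self.fast_interval_s if anomalous else self.slow_interval_s


if __name__ == "__main__":
    sampler = AdaptiveSampler()
    for cpu_percent in [50, 51, 49, 50, 52, 50, 51, 95, 50]:
        print(cpu_percent, sampler.observe(cpu_percent))
```

A production system would likely use a richer health model than a single z-score, but the control loop has the same shape: profile, assess, then adjust fidelity.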

Why data pipelines without lineage are useless. Commentary by Tomas Kratky, CEO and founder of MANTA

Everyone talks about data lineage nowadays, but most people consider it for regulatory and compliance reasons. It serves as documentation for auditors asking difficult questions about data origins or the data processing used to prepare key performance indicators. But at its heart, data lineage represents dependencies in our systems and is a key enabler for efficient change management, making every organization truly agile. Imagine a security architect, changing the way sensitive customer data should be captured, using data lineage to instantly assess the impact of those changes on the rest of the environment. Or an analyst using data lineage to review existing sources for an executive report to decide the best way to calculate new metrics requested by the management team. Lack of lineage only leads to reduced enterprise agility, making your pipelines harder to change and thus useless. Contrary to popular opinion, you don't have to force anyone to learn what metadata is for them to benefit from data lineage. It should be activated and integrated into the workflows and workspaces people normally use to do their job. For example, information about the origin of data for a key metric should be available directly in the reporting tool. Or a notification sent to an architect about a data table and associated flow that can be decommissioned because no people or processes are using it. Even a warning shared with a developer as they try to build a critical business metric using a data source with questionable data quality in their data pipeline. That, and much more, is the future of (active) data lineage.
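
The impact assessment described above amounts to a graph traversal. The following Python sketch is an illustration only, not MANTA's product or API: the asset names and the `LINEAGE` dictionary are hypothetical, and real lineage is usually harvested automatically rather than written by hand.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the assets listed in its value.
LINEAGE = {
    "crm.customers": ["staging.customers_clean"],
    "staging.customers_clean": ["warehouse.dim_customer"],
    "warehouse.dim_customer": ["report.churn_kpi", "report.revenue_by_segment"],
    "erp.orders": ["warehouse.fact_orders"],
    "warehouse.fact_orders": ["report.revenue_by_segment"],
}


def downstream_impact(lineage, changed_asset):
    """Return every asset reachable from changed_asset, in breadth-first order."""
    seen = {changed_asset}          # guards against cycles and repeat visits
    order = []
    queue = deque([changed_asset])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


if __name__ == "__main__":
    # Which assets should be reviewed if customer capture changes?
    for asset in downstream_impact(LINEAGE, "crm.customers"):
        print(asset)
```

Running the same traversal over reversed edges would answer the analyst's question instead: which sources feed a given report.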

Automation Liberation: The new Self-Service approach to Cloud Migration. Commentary by Next Pathway CEO Chetan Mathur

With the recent boom of AI apps, tools and chatbots, there is growing interest among cloud hyperscale providers, such as Microsoft and Google, in incorporating these tools into their cloud platforms for advanced AI and ML applications. The jump in the stock value of Microsoft following the announcement of its investment in ChatGPT is testament to the fact that the market likes these innovative solutions. For these AI tools to be meaningful, massive amounts of data need to be pushed to the cloud. Turning on AI will be predicated on moving legacy applications (code, data platforms, and data) to the cloud. The movement of legacy applications has historically been very challenging. In our research we have seen that companies are attracted to the business benefits of cloud computing, but are hesitant to move large, legacy applications. Most legacy applications have been developed over years, and there tends to be application sprawl. This makes it difficult for companies to know with a high degree of confidence which applications should be migrated to the cloud. The lack of visibility into data dependencies and data lineage further complicates matters. Moreover, those companies that have migrated workloads, either on their own or with a Global Service Integrator, have commented that the migrations took too long and were costly. To answer this challenge, what's needed is a self-service code translation product that provides a high degree of coverage and performance. Customers can select the workloads they want to translate, select their preferred cloud target, and the solution will perform the translation automatically.

Quantum computing's biggest key in 2023. Commentary by Classiq CEO, Nir Minerbi

It's well known that quantum computing offers the potential to break many of the cryptographic protocols that currently protect our most sensitive information. As a result, there is a pressing need to develop new cryptographic protocols that are resistant to quantum attacks. Furthermore, quantum computing can be used for a variety of military purposes, from developing new materials and advanced AI to modeling complex systems. Further investment in quantum computing is therefore critical for the United States to maintain its competitive edge in the arms race.

Is low code the key to faster go-lives and revenue realization? Commentary by Deepak Anupalli, co-founder & CTO at WaveMaker

Low code enables professional developers to build products in a visually declarative manner, by abstracting commonly used functionalities as components. These components can be dragged and dropped onto an application canvas and then configured and customized per application. Such a methodology allows developers to tackle complexity with ease and build products almost 3X faster than traditional approaches. Components that represent an entire functionality bundled with UI, logic, and data can be stored in internal repositories that can be customized for any use case across the enterprise application landscape. This allows SMEs and business users to provide their expertise while building applications, and in turn, democratizes software development leading to better ideation and innovation.

Automation in the era of big data. Commentary by Tony Lee, CTO, Hyperscience

We live in a world characterized by big data: sweeping sets of structured and unstructured information that organizations can use to drive improvements and accelerate business outcomes. But despite our best efforts, humans cannot keep up with the sheer amount of information that needs to be processed. That's why I expect to see organizations prioritize automation solutions that can support teams in navigating data and immediately increase productivity and ROI in areas such as document processing. Those advancing their organizations with machine learning (ML) and advanced automation technologies are finding the most success in designing workflows that can process large amounts of data with a human-in-the-loop approach and continuous training. This strategy provides much-needed guardrails if the technology is ever stuck or needs additional supervision, while still allowing both parties to work efficiently alongside each other. If we can better educate leaders on how to unlock the full value of automation and ML, and its guiding hand in achieving operational efficiency, we'll see a lot of progress over the next few years.
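
A human-in-the-loop workflow of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Hyperscience's product: the `Extraction` type, the 0.9 confidence threshold, and the sample fields are all hypothetical. Confident extractions pass straight through, uncertain ones go to a reviewer, and the reviewer's corrections are kept so they could later feed continuous training.

```python
from dataclasses import dataclass


@dataclass
class Extraction:
    field: str
    value: str
    confidence: float  # model's self-reported score in [0, 1] (assumed)


def route(extractions, threshold=0.9):
    """Split extractions into auto-accepted items and items needing human review."""
    auto, review = [], []
    for item in extractions:
        (auto if item.confidence >= threshold else review).append(item)
    return auto, review


def apply_corrections(review, corrections):
    """Return reviewed items with human-supplied values and full confidence."""
    return [
        Extraction(item.field, corrections.get(item.field, item.value), 1.0)
        for item in review
    ]


if __name__ == "__main__":
    items = [
        Extraction("invoice_number", "INV-1001", 0.98),
        Extraction("total", "1,2O0.00", 0.62),  # letter O misread as a digit
    ]
    auto, review = route(items)
    corrected = apply_corrections(review, {"total": "1,200.00"})
    print(auto)
    print(corrected)
```

The threshold is the guardrail: lowering it sends less work to people, raising it sends more.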

AI vs. Humans: Has Insurance Become Too On-Demand? Commentary by Justin Kozak, Executive VP at Founder Shield

I believe insurance could qualify as being too on-demand, as we have seen some of insurtech's efficiencies cause complications around coverage and scalability. Insurtech solutions that entail automated underwriting are really the focus here. While they are great for more vanilla risks or smaller companies, the ease of securing coverage this way has bled into more complicated risk profiles (like the fintech or crypto industries) that require a professional touch. The problem is two-fold. For one, tech-dependent brokers are not providing proper guidance to clients on what to apply for and, more importantly, how the coverage should be structured or their company classified. This can lead to complications in the event of a claim, as misclassification or poorly structured programs can lead to claim denials or gray areas in coverage applicability. Past that, there are automated underwriting shops as well, which can create scalability issues as more significant risks are underwritten in line with lower-risk businesses. Once a true underwriter gets eyes on what slipped through the system, they quickly non-renew or increase rates exponentially to match the true risk. Ultimately, I believe on-demand or automated insurance solutions have a place in our industry. Still, there is a need to recognize a proverbial line whereby clients, brokers, and underwriters acknowledge the need for professional attention and expertise.

How Digital Twins Improve Revenue Cycle Management. Commentary by Jim Dumond, Sr. Product Manager, VisiQuate

Digital twins are virtual representations or models of machines or systems that exist in the physical world. For example, because testing a wind turbine in the real world is time-consuming and costly, a team of scientists will instead build a digital twin that is a faithful representation of a turbine. Then, the team will test the model by presenting it with different environmental factors and breakdowns. Essential to the development of digital twins is a technique known as process mining, which involves a deep dive to study a process from end to end and then determine how the process deviates from expectations or produces unexpected outcomes. In the revenue cycle world, process mining requires analyzing all the data that is captured by a health system, such as data from order systems, referral management systems, billing platforms, electronic medical records, payer data, and EDI data. Then, this disparate data must be collected into one location, enabling revenue cycle leaders to understand how the data fits together to convey the current state of an account. After performing these tasks, the health system has developed a digital twin of its revenue cycle processes. Once generated, digital twins can be used to simulate the effects that process changes may create for factors like accounts receivable, payer relationships, and staffing. Importantly, developing a digital twin must always begin with a robust data strategy to create a virtual representation of a revenue cycle encounter, such as a claim or invoice, to fully account for an entire process.
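
The process-mining step can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not VisiQuate's method: the activity names, the `EXPECTED_PATH`, and the event tuples are hypothetical. Events from different systems are merged into one ordered trace per account, and each trace is compared with the expected path to find deviations.

```python
from collections import Counter

# Hypothetical expected sequence of activities for a single account.
EXPECTED_PATH = ["registered", "coded", "claim_submitted", "payment_posted"]


def build_traces(events):
    """Group (account_id, timestamp, activity) events into ordered traces."""
    traces = {}
    for account_id, _, activity in sorted(events, key=lambda e: (e[0], e[1])):
        traces.setdefault(account_id, []).append(activity)
    return traces


def variant_counts(traces):
    """Count how many accounts follow each distinct path (process variant)."""
    return Counter(tuple(trace) for trace in traces.values())


def deviating_accounts(traces, expected=EXPECTED_PATH):
    """Return the accounts whose trace differs from the expected path."""
    return sorted(acct for acct, trace in traces.items() if trace != expected)


if __name__ == "__main__":
    # Events as they might arrive from separate systems, in no particular order.
    events = [
        ("A1", 1, "registered"), ("A1", 2, "coded"),
        ("A1", 3, "claim_submitted"), ("A1", 4, "payment_posted"),
        ("A2", 3, "claim_submitted"), ("A2", 1, "registered"),
        ("A2", 2, "coded"), ("A2", 4, "claim_denied"),
        ("A2", 5, "claim_submitted"), ("A2", 6, "payment_posted"),
    ]
    traces = build_traces(events)
    print(variant_counts(traces))
    print(deviating_accounts(traces))
```

Variant counts show which paths are common, and the deviating accounts are the natural starting point for simulating a process change.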

MSFT Lays Off AI Ethics Team. Commentary by Noam Harel, CMO & GM North America at ClearML

Microsoft's decision to fire its AI ethics team is puzzling given the company's recent investments in this area. Ethical considerations are critical to developing responsible AI. AI systems have the potential to cause harm, from biased decision-making to violations of privacy and security. The ethics team provides a crucial oversight function, ensuring that AI technologies align with human values and rights. Without oversight, the development and deployment of AI may prioritize profits or efficiency over ethics, leading to unintended and/or harmful consequences.

