The world of big data is only getting bigger: Organizations of all stripes are producing more data, in various forms, year after year. The ever-increasing volume and variety of data is driving companies to invest more in big data tools and technologies as they look to use all that data to improve operations, better understand customers, deliver products faster and gain other business benefits through analytics applications.
Enterprise data leaders have a multitude of choices on big data technologies, with numerous commercial products available to help organizations implement a full range of data-driven analytics initiatives -- from real-time reporting to machine learning applications.
In addition, there are many open source big data tools, some of which are also offered in commercial versions or as part of big data platforms and managed services. Here are 18 popular open source tools and technologies for managing and analyzing big data, listed in alphabetical order with a summary of their key features and capabilities. The tools were chosen based on information from sources such as Capterra, G2 and Gartner as well as vendor websites.
Airflow is a workflow management platform for scheduling and running complex data pipelines in big data systems. It enables data engineers and other users to ensure that each task in a workflow is executed in the designated order and has access to the required system resources. Airflow is also promoted as easy to use: Workflows are created in the Python programming language, and it can be used for building machine learning models, transferring data and various other purposes.
The platform originated at Airbnb in late 2014 and was officially announced as an open source technology in mid-2015; it joined the Apache Software Foundation's incubator program the following year and became an Apache top-level project in 2019. Airflow also includes the following key features:
Databricks Inc., a software vendor founded by the creators of the Spark processing engine, developed Delta Lake and then open sourced the Spark-based technology in 2019 through the Linux Foundation. The company describes Delta Lake as "an open format storage layer that delivers reliability, security and performance on your data lake for both streaming and batch operations."
Delta Lake doesn't replace data lakes; rather, it's designed to sit on top of them and create a single home for structured, semistructured and unstructured data, eliminating data silos that can stymie big data applications. Furthermore, using Delta Lake can help prevent data corruption, enable faster queries, increase data freshness and support compliance efforts, according to Databricks. The technology also comes with the following features:
The Apache Drill website describes it as "a low latency distributed query engine for large-scale datasets, including structured and semi-structured/nested data." Drill can scale across thousands of cluster nodes and is capable of querying petabytes of data by using SQL and standard connectivity APIs.
Designed for exploring sets of big data, Drill layers on top of multiple data sources, enabling users to query a wide range of data in different formats, from Hadoop sequence files and server logs to NoSQL databases and cloud object storage. It can also do the following:
Druid is a real-time analytics database that delivers low latency for queries, high concurrency, multi-tenant capabilities and instant visibility into streaming data. Multiple end users can query the data stored in Druid at the same time with no impact on performance, according to its proponents.
Written in Java and created in 2011, Druid became an Apache technology in 2018. It's generally considered a high-performance alternative to traditional data warehouses that's best suited to event-driven data. Like a data warehouse, it uses column-oriented storage and can load files in batch mode. But it also incorporates features from search systems and time series databases, including the following:
Another Apache open source technology, Flink is a stream processing framework for distributed, high-performing and always-available applications. It supports stateful computations over both bounded and unbounded data streams and can be used for batch, graph and iterative processing.
One of the main benefits touted by Flink's proponents is its speed: It can process millions of events in real time for low latency and high throughput. Flink, which is designed to run in all common cluster environments, also includes the following features:
A distributed framework for storing data and running applications on clusters of commodity hardware, Hadoop was developed as a pioneering big data technology to help handle the growing volumes of structured, unstructured and semistructured data. First released in 2006, it was almost synonymous with big data early on; it has since been partially eclipsed by other technologies but is still widely used.
Hadoop has four primary components:
Initially, Hadoop was limited to running MapReduce batch applications. The addition of YARN in 2013 opened it up to other processing engines and use cases, but the framework is still closely associated with MapReduce. The broader Apache Hadoop ecosystem also includes various big data tools and additional frameworks for processing, managing and analyzing big data.
Hive is SQL-based data warehouse infrastructure software for reading, writing and managing large data sets in distributed storage environments. It was created by Facebook but then open sourced to Apache, which continues to develop and maintain the technology.
Hive runs on top of Hadoop and is used to process structured data; more specifically, it's used for data summarization and analysis, as well as for querying large amounts of data. Although it can't be used for online transaction processing, real-time updates, and queries or jobs that require low-latency data retrieval, Hive is described by its developers as scalable, fast and flexible.
Other key features include the following:
HPCC Systems is a big data processing platform developed by LexisNexis before being open sourced in 2011. True to its full name -- High-Performance Computing Cluster Systems -- the technology is, at its core, a cluster of computers built from commodity hardware to process, manage and deliver big data.
A production-ready data lake platform that enables rapid development and data exploration, HPCC Systems includes three main components:
Hudi (pronounced hoodie) is short for Hadoop Upserts Deletes and Incrementals. Another open source technology maintained by Apache, it's used to manage the ingestion and storage of large analytics data sets on Hadoop-compatible file systems, including HDFS and cloud object storage services.
First developed by Uber, Hudi is designed to provide efficient and low-latency data ingestion and data preparation capabilities. Moreover, it includes a data management framework that organizations can use to do the following:
Iceberg is an open table format used to manage data in data lakes, which it does partly by tracking individual data files in tables rather than by tracking directories. Created by Netflix for use with the company's petabyte-sized tables, Iceberg is now an Apache project. According to the project's website, Iceberg typically "is used in production where a single table can contain tens of petabytes of data."
Designed to improve on the standard layouts that exist within tools such as Hive, Presto, Spark and Trino, the Iceberg table format has functions similar to SQL tables in relational databases. However, it also accommodates multiple engines operating on the same data set. Other notable features include the following:
Kafka is a distributed event streaming platform that, according to Apache, is used by more than 80% of Fortune 100 companies and thousands of other organizations for high-performance data pipelines, streaming analytics, data integration and mission-critical applications. In simpler terms, Kafka is a framework for storing, reading and analyzing streaming data.
The technology decouples data streams and systems, holding the data streams so they can then be used elsewhere. It runs in a distributed environment and uses a high-performance TCP network protocol to communicate with systems and applications. Kafka was created by LinkedIn before being passed on to Apache in 2011.
The following are some of the key components in Kafka:
Kylin is a distributed data warehouse and analytics platform for big data. It provides an online analytical processing (OLAP) engine designed to support extremely large data sets. Because Kylin is built on top of other Apache technologies -- including Hadoop, Hive, Parquet and Spark -- it can easily scale to handle those large data loads, according to its backers.
It's also fast, delivering query responses measured in milliseconds. In addition, Kylin provides an ANSI SQL interface for multidimensional analysis of big data and integrates with Tableau, Microsoft Power BI and other BI tools. Kylin was initially developed by eBay, which contributed it as an open source technology in 2014; it became a top-level project within Apache the following year. Other features it provides include the following:
Pinot is a real-time distributed OLAP data store built to support low-latency querying by analytics users. Its design enables horizontal scaling to deliver that low latency even with large data sets and high throughput. To provide the promised performance, Pinot stores data in a columnar format and uses various indexing techniques to filter, aggregate and group data. In addition, configuration changes can be done dynamically without affecting query performance or data availability.
According to Apache, Pinot can handle trillions of records overall while ingesting millions of data events and processing thousands of queries per second. The system has a fault-tolerant architecture with no single point of failure and assumes all stored data is immutable, although it also works with mutable data. Started in 2013 as an internal project at LinkedIn, Pinot was open sourced in 2015 and became an Apache top-level project in 2021.
The following features are also part of Pinot:
Formerly known as PrestoDB, this open source SQL query engine can simultaneously handle both fast queries and large data volumes in distributed data sets. Presto is optimized for low-latency interactive querying and it scales to support analytics applications across multiple petabytes of data in data warehouses and other repositories.
Development of Presto began at Facebook in 2012. When its creators left the company in 2018, the technology split into two branches: PrestoDB, which was still led by Facebook, and PrestoSQL, which the original developers launched. That continued until December 2020, when PrestoSQL was renamed Trino and PrestoDB reverted to the Presto name. The Presto open source project is now overseen by the Presto Foundation, which was set up as part of the Linux Foundation in 2019.
Presto also includes the following features:
Samza is a distributed stream processing system that was built by LinkedIn and is now an open source project managed by Apache. According to the project website, Samza enables users to build stateful applications that can do real-time processing of data from Kafka, HDFS and other sources.
The system can run on top of Hadoop YARN or Kubernetes and also offers a standalone deployment option. The Samza site says it can handle "several terabytes" of state data, with low latency and high throughput for fast data analysis. Via a unified API, it can also use the same code written for data streaming jobs to run batch applications. Other features include the following:
Apache Spark is an in-memory data processing and analytics engine that can run on clusters managed by Hadoop YARN, Mesos and Kubernetes or in a standalone mode. It enables large-scale data transformations and analysis and can be used for both batch and streaming applications, as well as machine learning and graph processing use cases. That's all supported by the following set of built-in modules and libraries:
Data can be accessed from various sources, including HDFS, relational and NoSQL databases, and flat-file data sets. Spark also supports various file formats and offers a diverse set of APIs for developers.
But its biggest calling card is speed: Spark's developers claim it can perform up to 100 times faster than traditional counterpart MapReduce on batch jobs when processing in memory. As a result, Spark has become the top choice for many batch applications in big data environments, while also functioning as a general-purpose engine. First developed at the University of California, Berkeley, and now maintained by Apache, it can also process on disk when data sets are too large to fit into the available memory.
Another Apache open source technology, Storm is a distributed real-time computation system that's designed to reliably process unbounded streams of data. According to the project website, it can be used for applications that include real-time analytics, online machine learning and continuous computation, as well as extract, transform and load jobs.
Storm clusters are akin to Hadoop ones, but applications continue to run on an ongoing basis unless they're stopped. The system is fault-tolerant and guarantees that data will be processed. In addition, the Apache Storm site says it can be used with any programming language, message queueing system and database. Storm also includes the following elements:
As mentioned above, Trino is one of the two branches of the Presto query engine. Known as PrestoSQL until it was rebranded in December 2020, Trino "runs at ludicrous speed," in the words of the Trino Software Foundation. That group, which oversees Trino's development, was originally formed in 2019 as the Presto Software Foundation; its name was also changed as part of the rebranding.
Trino enables users to query data regardless of where it's stored, with support for natively running queries in Hadoop and other data repositories. Like Presto, Trino also is designed for the following:
NoSQL databases are another major type of big data technology. They break with conventional SQL-based relational database design by supporting flexible schemas, which makes them well suited for handling huge volumes of all types of data -- particularly unstructured and semistructured data that isn't a good fit for the strict schemas used in relational systems.
NoSQL software emerged in the late 2000s to help address the increasing amounts of diverse data that organizations were generating, collecting and looking to analyze as part of big data initiatives. Since then, NoSQL databases have been widely adopted and are now used in enterprises across industries. Many are open source technologies that are also offered in commercial versions by vendors, while some are proprietary products controlled by a single vendor.
In addition, NoSQL databases themselves come in various types that support different big data applications. These are the four major NoSQL categories, with examples of the available technologies in each one:
Multimodel databases have also been created with support for different NoSQL approaches, as well as SQL in some cases; MarkLogic Server and Microsoft's Azure Cosmos DB are examples. Many other NoSQL vendors have added multimodel support to their databases. For example, Couchbase Server now supports key-value pairs, and Redis offers document and graph database modules.
Mary K. Pratt is an award-winning freelance journalist with a focus on covering enterprise IT and cybersecurity management.
Editor's note: This unranked list is based on web research of big data tools that cover a diverse range of capabilities and features. Our research included data from respected research firms, including Gartner and Capterra, as well as vendor websites.
Visit link:
18 Top Big Data Tools and Technologies to Know About in 2024 - TechTarget
- Global Data Science Platform Market Report 2020 Industry Trends, Share and Size, Complete Data Analysis across the Region and Globe, Opportunities and... [Last Updated On: November 11th, 2020] [Originally Added On: November 11th, 2020]
- Data Science and Machine-Learning Platforms Market Size, Drivers, Potential Growth Opportunities, Competitive Landscape, Trends And Forecast To 2027 -... [Last Updated On: November 11th, 2020] [Originally Added On: November 11th, 2020]
- Industrial Access Control Market 2020-28 use of data science in agriculture to maximize yields and efficiency with top key players - TechnoWeekly [Last Updated On: November 11th, 2020] [Originally Added On: November 11th, 2020]
- IPG Unveils New-And-Improved Copy For Data: It's Not Your Father's 'Targeting' 11/11/2020 - MediaPost Communications [Last Updated On: November 11th, 2020] [Originally Added On: November 11th, 2020]
- Risks and benefits of an AI revolution in medicine - Harvard Gazette [Last Updated On: November 11th, 2020] [Originally Added On: November 11th, 2020]
- UTSA to break ground on $90 million School of Data Science and National Security Collaboration Center - Construction Review [Last Updated On: November 11th, 2020] [Originally Added On: November 11th, 2020]
- Addressing the skills shortage in data science and analytics - IT-Online [Last Updated On: November 11th, 2020] [Originally Added On: November 11th, 2020]
- Data Science Platform Market Research Growth by Manufacturers, Regions, Type and Application, Forecast Analysis to 2026 - Eurowire [Last Updated On: November 11th, 2020] [Originally Added On: November 11th, 2020]
- 2020 AI and Data Science in Retail Industry Ongoing Market Situation with Manufacturing Opportunities: Amazon Web Services, Baidu Inc., BloomReach... [Last Updated On: November 11th, 2020] [Originally Added On: November 11th, 2020]
- Endowed Chair of Data Science job with Baylor University | 299439 - The Chronicle of Higher Education [Last Updated On: November 11th, 2020] [Originally Added On: November 11th, 2020]
- Data scientists gather 'chaos into something organized' - University of Miami [Last Updated On: November 11th, 2020] [Originally Added On: November 11th, 2020]
- AI Update: Provisions in the National Defense Authorization Act Signal the Importance of AI to American Competitiveness - Lexology [Last Updated On: January 12th, 2021] [Originally Added On: January 12th, 2021]
- Healthcare Innovations: Predictions for 2021 Based on the Viewpoints of Analytics Thought Leaders and Industry Experts | Quantzig - Business Wire [Last Updated On: January 12th, 2021] [Originally Added On: January 12th, 2021]
- Poor data flows hampered governments Covid-19 response, says the Science and Technology Committee - ComputerWeekly.com [Last Updated On: January 12th, 2021] [Originally Added On: January 12th, 2021]
- Ilia Dub and Jasper Yip join Oliver Wyman's Asia partnership - Consultancy.asia [Last Updated On: January 12th, 2021] [Originally Added On: January 12th, 2021]
- Save 98% off the Complete Excel, VBA, and Data Science Certification Training Bundle - Neowin [Last Updated On: January 12th, 2021] [Originally Added On: January 12th, 2021]
- Data Science for Social Good Programme helps Ofsted and World Bank - India Education Diary [Last Updated On: January 12th, 2021] [Originally Added On: January 12th, 2021]
- Associate Professor of Fisheries Oceanography named a Cooperative Institute for the North Atlantic Region (CINAR) Fellow - UMass Dartmouth [Last Updated On: January 12th, 2021] [Originally Added On: January 12th, 2021]
- Rapid Insight To Host Free Webinar, Building on Data: From Raw Piles to Data Science - PR Web [Last Updated On: January 12th, 2021] [Originally Added On: January 12th, 2021]
- This Is the Best Place to Buy Groceries, New Data Finds | Eat This Not That - Eat This, Not That [Last Updated On: January 12th, 2021] [Originally Added On: January 12th, 2021]
- Which Technology Jobs Will Require AI and Machine Learning Skills? - Dice Insights [Last Updated On: January 12th, 2021] [Originally Added On: January 12th, 2021]
- Companies hiring data scientists in NYC and how much they pay - Business Insider [Last Updated On: January 12th, 2021] [Originally Added On: January 12th, 2021]
- Calling all rock stars: hire the right data scientist talent for your business - IDG Connect [Last Updated On: January 12th, 2021] [Originally Added On: January 12th, 2021]
- How Professors Can Use AI to Improve Their Teaching In Real Time - EdSurge [Last Updated On: January 12th, 2021] [Originally Added On: January 12th, 2021]
- BCG GAMMA, in Collaboration with Scikit-Learn, Launches FACET, Its New Open-Source Library for Human-Explainable Artificial Intelligence - PRNewswire [Last Updated On: January 12th, 2021] [Originally Added On: January 12th, 2021]
- Data Science Platform Market Insights, Industry Outlook, Growing Trends and Demands 2020 to 2025 The Courier - The Courier [Last Updated On: January 31st, 2021] [Originally Added On: January 31st, 2021]
- UBIX and ORS GROUP announce partnership to democratize advanced analytics and AI for small and midmarket organizations - PR Web [Last Updated On: January 31st, 2021] [Originally Added On: January 31st, 2021]
- Praxis Business School is launching its Post Graduate Program in Data Engineering in association with Knowledge Partners - Genpact and LatentView... [Last Updated On: January 31st, 2021] [Originally Added On: January 31st, 2021]
- What's So Trendy about Knowledge Management Solutions Market That Everyone Went Crazy over It? | Bloomfire, CSC (American Productivity & Quality... [Last Updated On: January 31st, 2021] [Originally Added On: January 31st, 2021]
- Want to work in data? Here are 6 skills you'll need Just now - Siliconrepublic.com [Last Updated On: January 31st, 2021] [Originally Added On: January 31st, 2021]
- Data, AI and babies - BusinessLine [Last Updated On: January 31st, 2021] [Originally Added On: January 31st, 2021]
- Here's how much Amazon pays its Boston-based employees - Business Insider [Last Updated On: January 31st, 2021] [Originally Added On: January 31st, 2021]
- Datavant and Kythera Increase the Value Of Healthcare Data Through Expanded Data Science Platform Partnership - GlobeNewswire [Last Updated On: January 31st, 2021] [Originally Added On: January 31st, 2021]
- O'Reilly Analysis Unveils Python's Growing Demand as Searches for Data Science, Cloud, and ITOps Topics Accelerate - Business Wire [Last Updated On: January 31st, 2021] [Originally Added On: January 31st, 2021]
- Book Review: Hands-On Exploratory Data Analysis with Python - insideBIGDATA [Last Updated On: January 31st, 2021] [Originally Added On: January 31st, 2021]
- The 12 Best R Courses and Online Training to Consider for 2021 - Solutions Review [Last Updated On: January 31st, 2021] [Originally Added On: January 31st, 2021]
- Software AG's TrendMiner 2021.R1 Release Puts Data Science in the Hands of Operational Experts - Yahoo Finance [Last Updated On: January 31st, 2021] [Originally Added On: January 31st, 2021]
- The chief data scientist: Who they are and what they do - Siliconrepublic.com [Last Updated On: January 31st, 2021] [Originally Added On: January 31st, 2021]
- Berkeley's data science leader dedicated to advancing diversity in computing - UC Berkeley [Last Updated On: January 31st, 2021] [Originally Added On: January 31st, 2021]
- Awful Earnings Aside, the Dip in Alteryx Stock Is Worth Buying - InvestorPlace [Last Updated On: February 12th, 2021] [Originally Added On: February 12th, 2021]
- Why Artificial Intelligence May Not Offer The Business Value You Think - CMSWire [Last Updated On: February 12th, 2021] [Originally Added On: February 12th, 2021]
- Getting Prices Right in 2021 - Progressive Grocer [Last Updated On: February 12th, 2021] [Originally Added On: February 12th, 2021]
- Labelbox raises $40 million for its data labeling and annotation tools - VentureBeat [Last Updated On: February 12th, 2021] [Originally Added On: February 12th, 2021]
- How researchers are using data science to map wage theft - SmartCompany.com.au [Last Updated On: February 12th, 2021] [Originally Added On: February 12th, 2021]
- Ready to start coding? What you need to know about Python - TechRepublic [Last Updated On: February 12th, 2021] [Originally Added On: February 12th, 2021]
- Women changing the face of science in the Middle East and North Africa - The Jerusalem Post [Last Updated On: February 12th, 2021] [Originally Added On: February 12th, 2021]
- Mapping wage theft with data science - The Mandarin [Last Updated On: February 12th, 2021] [Originally Added On: February 12th, 2021]
- Data Science Platform Market 2021 Analysis Report with Highest CAGR and Major Players like || Dataiku, Bridgei2i Analytics, Feature Labs and More KSU... [Last Updated On: February 12th, 2021] [Originally Added On: February 12th, 2021]
- Data Science Impacting the Pharmaceutical Industry, 2020 Report: Focus on Clinical Trials - Data Science-driven Patient Selection & FDA... [Last Updated On: February 12th, 2021] [Originally Added On: February 12th, 2021]
- App Annie Sets New Bar for Mobile Analytics with Data Science Innovations - PRNewswire [Last Updated On: February 12th, 2021] [Originally Added On: February 12th, 2021]
- Data Science and Analytics Market 2021 to Showing Impressive Growth by 2028 | Industry Trends, Share, Size, Top Key Players Analysis and Forecast... [Last Updated On: February 12th, 2021] [Originally Added On: February 12th, 2021]
- How Can We Fix the Data Science Talent Shortage? Machine Learning Times - The Predictive Analytics Times [Last Updated On: February 14th, 2021] [Originally Added On: February 14th, 2021]
- Opinion: How to secure the best tech talent | Human Capital - Business Chief [Last Updated On: February 14th, 2021] [Originally Added On: February 14th, 2021]
- Following the COVID science: what the data say about the vaccine, social gatherings and travel - Chicago Sun-Times [Last Updated On: February 14th, 2021] [Originally Added On: February 14th, 2021]
- Automated Data Science and Machine Learning Platforms Market Technological Growth and Precise Outlook 2021- Microsoft, MathWorks, SAS, Databricks,... [Last Updated On: February 14th, 2021] [Originally Added On: February 14th, 2021]
- 9 investors discuss hurdles, opportunities and the impact of cloud vendors in enterprise data lakes - TechCrunch [Last Updated On: February 14th, 2021] [Originally Added On: February 14th, 2021]
- Rapid Insight to Present at Data Science Salon's Healthcare, Finance, and Technology Virtual Event - PR Web [Last Updated On: February 14th, 2021] [Originally Added On: February 14th, 2021]
- Aunalytics Acquires Naveego to Expand Capabilities of its End-to-End Cloud-Native Data Platform to Enable True Digital Transformation for Customers -... [Last Updated On: February 22nd, 2021] [Originally Added On: February 22nd, 2021]
- Tech Careers: In-demand Courses to watch out for a Lucrative Future - Big Easy Magazine [Last Updated On: February 22nd, 2021] [Originally Added On: February 22nd, 2021]
- Willis Towers Watson enhances its human capital data science capabilities globally with the addition of the Jobable team - GlobeNewswire [Last Updated On: February 22nd, 2021] [Originally Added On: February 22nd, 2021]
- Global Data Science Platform Market 2021 Industry Insights, Drivers, Top Trends, Global Analysis And Forecast to 2027 KSU | The Sentinel Newspaper -... [Last Updated On: February 22nd, 2021] [Originally Added On: February 22nd, 2021]
- A Comprehensive Guide to Scikit-Learn - Built In [Last Updated On: February 22nd, 2021] [Originally Added On: February 22nd, 2021]
- Industry VoicesBuilding ethical algorithms to confront biases: Lessons from Aotearoa New Zealand - FierceHealthcare [Last Updated On: February 22nd, 2021] [Originally Added On: February 22nd, 2021]
- How Intel Employees Volunteered Their Data Science Expertise To Help Costa Rica Save Lives During the Pandemic - CSRwire.com [Last Updated On: February 22nd, 2021] [Originally Added On: February 22nd, 2021]
- Learn About Innovations in Data Science and Analytic Automation on an Upcoming Episode of the Advancements Series - Yahoo Finance [Last Updated On: February 22nd, 2021] [Originally Added On: February 22nd, 2021]
- Symposium aimed at leveraging the power of data science for promoting diversity - Penn State News [Last Updated On: February 22nd, 2021] [Originally Added On: February 22nd, 2021]
- Rochester to advance research in biological imaging through new grant - University of Rochester [Last Updated On: February 22nd, 2021] [Originally Added On: February 22nd, 2021]
- SoftBank Joins Initiative to Train Diverse Talent in Data Science and AI - Entrepreneur [Last Updated On: February 22nd, 2021] [Originally Added On: February 22nd, 2021]
- Participating in SoftBank/ Correlation One Initiative - Miami - City of Miami [Last Updated On: February 22nd, 2021] [Originally Added On: February 22nd, 2021]
- Increasing Access to Care with the Help of Big Data | Research Blog - Duke Today [Last Updated On: February 22nd, 2021] [Originally Added On: February 22nd, 2021]
- Heres how Data Science & Business Analytics expertise can put you on the career expressway - Times of India [Last Updated On: March 14th, 2021] [Originally Added On: March 14th, 2021]
- Yelp data shows almost half a million new businesses opened during the pandemic - CNBC [Last Updated On: March 14th, 2021] [Originally Added On: March 14th, 2021]
- Postdoctoral Position in Transient and Multi-messenger Astronomy Data Science in Greenbelt, MD for University of MD Baltimore County/CRESST II -... [Last Updated On: March 14th, 2021] [Originally Added On: March 14th, 2021]
- DefinedCrowd CEO Daniela Braga on the future of AI, training data, and women in tech - GeekWire [Last Updated On: March 14th, 2021] [Originally Added On: March 14th, 2021]
- Gartner: AI and data science to drive investment decisions rather than "gut feel" by mid-decade - TechRepublic [Last Updated On: March 14th, 2021] [Originally Added On: March 14th, 2021]
- Jupyter has revolutionized data science, and it started with a chance meeting between two students - TechRepublic [Last Updated On: March 14th, 2021] [Originally Added On: March 14th, 2021]
- Working at the intersection of data science and public policy | Penn Today - Penn Today [Last Updated On: March 14th, 2021] [Originally Added On: March 14th, 2021]
- The Future of AI: Careers in Machine Learning - Southern New Hampshire University [Last Updated On: April 4th, 2021] [Originally Added On: April 4th, 2021]
- SMU meets the opportunities of the data-driven world with cutting-edge research and data science programs - The Dallas Morning News [Last Updated On: April 4th, 2021] [Originally Added On: April 4th, 2021]
- Data, Science, and Journalism in the Age of COVID - Pulitzer Center on Crisis Reporting [Last Updated On: April 4th, 2021] [Originally Added On: April 4th, 2021]