Year-end review: 10 hot big data startups in 2023

Big data startups are continuously developing leading technologies to help enterprises access, collect, manage, move, transform, analyze, understand, measure, govern, maintain and protect data. Let’s take a look at the ten big data startups that attracted the most attention in 2023.

Big data, big ambitions

Data has become a valuable asset for many businesses and organizations. They are analyzing data to gain insights about markets, customers and their own operations. They are leveraging data to drive digital transformation initiatives and support new data-intensive services. Large amounts of data are also an important part of artificial intelligence and machine learning initiatives.

But organizing, managing and analyzing data is a major challenge today. According to market research firm IDC, the total amount of data created, captured, copied and consumed is growing at more than 20% annually and is expected to reach approximately 291 ZB by 2027.
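To put the IDC figure in perspective, the compounding can be worked backwards. The sketch below is only illustrative arithmetic: it assumes a constant growth rate of exactly 20% per year (IDC says "more than 20%", so the real earlier volumes would be lower than these), and the function name `implied_volume` is made up for this example.

```python
def implied_volume(target_zb: float, target_year: int, year: int, annual_growth: float) -> float:
    """Return the volume in `year` that compounds to `target_zb` by `target_year`."""
    years_remaining = target_year - year
    return target_zb / (1.0 + annual_growth) ** years_remaining


# Assumption: a flat 20% annual growth rate ending at 291 ZB in 2027.
for year in range(2023, 2028):
    print(year, round(implied_volume(291.0, 2027, year, 0.20), 1), "ZB")
```

Under that assumption, the volume roughly doubles over the four years to 2027, since 1.2 to the fourth power is about 2.07.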

That’s why big data startups are constantly developing leading technologies to help enterprises access, collect, manage, move, transform, analyze, understand, measure, govern, maintain and protect data.

Let’s take a look at the 10 big data startups that attracted the most attention in 2023.

Airbyte

Co-Founder and CEO: Michel Tricot

Moving data from operational applications and databases to data warehouses, data lakes, and other analytics systems is one of the most challenging steps in data analytics.

There are many commercial data movement and integration tools on the market, but Airbyte is attracting attention with its open source data movement/data integration engine and connectors for setting up and running data movement operations.

In September, the company said that in just three months, Airbyte's user community had built more than 1,500 data connectors using Airbyte's no-code connector builder, which it launched in June. In October, the company announced the launch of its Vector Database Connector, which is critical for connecting data sources to AI applications.

Airbyte was founded in 2020 and is headquartered in San Francisco, USA. It received US$150 million in Series B financing in December 2021.

Astronomer

CEO: Andy Byron

Astronomer developed the Astro unified data orchestration platform to centralize visibility and control of data flows and simplify data pipeline deployment. The system helps enterprises and organizations scale data integration, data analysis, and AI and machine learning tasks to meet the data needs of critical financial services, retail and e-commerce applications.

Astro is based on the open source Apache Airflow workflow management technology (originally developed by Airbnb) for data engineering pipelines.

On December 6, Astronomer launched the latest version of Astro with simplified connection management capabilities, new system upgrade utilities and new system deployment capabilities to reduce operating costs.

Founded in 2018 and based in Cincinnati and San Francisco, Astronomer raised $213 million in a Series C round of funding in March 2022. Astronomer laid off employees in early 2023, but according to a September report, Astronomer's revenue increased 206% year-over-year in the first half of this year.

Hex

Co-founder, CEO: Barry McCardel

The big data industry has numerous companies that have developed sophisticated technologies for managing, integrating, transforming, analyzing, and visualizing data, but sharing and publishing the results of analytical tasks remains a challenge.

Hex Technologies develops the Hex Platform, a modern data workspace system for collaborative analytics and data science tasks. The platform includes AI-powered tools, collaborative data notebooks, tools for developing apps with data visualization, and data integration technology, all of which make it possible to connect and analyze data and share work using interactive data apps and stories.

Headquartered in San Francisco, USA, Hex was founded in 2019 by McCardel, Chief Technology Officer Caitlin Colgrove, and Chief Architect Glen Takahashi, who previously worked together at Palantir. Hex raised $52 million in Series B funding in March 2022.

In October this year, Hex launched Hex 3.0, which includes new AI capabilities, a new calculation engine, a new metadata engine, and an App Builder tool for turning insights into interactive experiences. Earlier this year, Hex launched the Hex Magic tool, which brings the power of large language models directly into the Hex workspace.

Momento

Co-founder, CEO: Khawaja Shams

Momento came out of stealth mode in November 2022 with the Momento Serverless Cache product, which can optimize and accelerate any database running on AWS or Google Cloud.

Caching speeds up database response by serving frequently used data more quickly. But Momento's founders believe that existing caching technology is not designed for modern cloud stacks. Momento says its highly available caching technology can perform millions of operations per second and runs as a backend-as-a-service platform, meaning users don’t need to manage the infrastructure.
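The general idea behind caching can be shown with a minimal in-memory sketch. This is not Momento's API or implementation: the `SimpleCache` class, the `slow_database_lookup` function and the 60-second time-to-live are all hypothetical, chosen only to illustrate a cache-aside style lookup in which repeated reads skip the slow database call.

```python
import time
from typing import Callable, Dict, Tuple


class SimpleCache:
    """In-memory cache that loads missing keys on demand and expires them after a TTL."""

    def __init__(self, load: Callable[[str], str], ttl_seconds: float = 60.0) -> None:
        self._load = load                      # function called on a cache miss
        self._ttl = ttl_seconds                # how long an entry stays valid
        self._entries: Dict[str, Tuple[float, str]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1                     # fresh entry: serve from memory
            return entry[1]
        self.misses += 1                       # missing or expired: go to the source
        value = self._load(key)
        self._entries[key] = (now + self._ttl, value)
        return value


def slow_database_lookup(key: str) -> str:
    # Hypothetical stand-in for a database query; the sleep simulates latency.
    time.sleep(0.01)
    return f"row-for-{key}"


cache = SimpleCache(slow_database_lookup, ttl_seconds=60.0)
first = cache.get("user:42")   # miss: calls the slow lookup
second = cache.get("user:42")  # hit: served from memory
print(first, second, cache.hits, cache.misses)
```

A managed caching service plays the same role as `SimpleCache` here, but runs as shared infrastructure outside the application process.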

Headquartered in Seattle, USA, Momento was co-founded by CEO Khawaja Shams and CTO Daniela Miao, who previously worked at AWS and were the engineering leads behind AWS DynamoDB, Amazon’s proprietary NoSQL database service.

MotherDuck

Co-founder, CEO: Jordan Tigani

On June 22, MotherDuck launched the first version of the MotherDuck serverless cloud analytics platform, which combines cloud and embedded database technology to easily analyze data no matter where it resides.

The platform is based on the open source DuckDB embedded database. By combining the speed of an in-process database with the scalability of the cloud, this cloud system can easily analyze data of any size.

MotherDuck believes that most of the advances in data analytics in recent years have been geared toward large enterprises and organizations with petabytes of data or more, while ignoring small and medium-sized companies with smaller amounts of data.

Headquartered in Seattle, USA, MotherDuck was co-founded in 2022 by Jordan Tigani, a founding engineer of Google BigQuery who now serves as the company's CEO. In September this year, MotherDuck received US$52.5 million in Series B financing, bringing the total financing to US$100 million.

Onehouse

Founder and CEO: Vinoth Chandar

Onehouse bills itself as "the new cornerstone of data," offering a cloud-native, fully managed data lakehouse service.

The company's service is based on Apache Hudi, an open source transactional data lake project that brings database and data warehouse capabilities to data lakes, with the goal of acting as a data integration layer between different data repositories.

Onehouse was founded in 2021 and is headquartered in Menlo Park, California, USA. It came out of stealth mode in early 2022.

In February this year, Onehouse received $25 million in Series A financing. Onehouse also launched new Onetable technology, allowing users to leverage Hudi-based data lakehouses while taking advantage of native performance acceleration capabilities in Databricks and Snowflake.

Starburst

Co-founder, CEO: Justin Borgman

Starburst, a developer of data lake analytics platforms, was founded in 2017 and is one of the more mature startups in the big data field. It continues to build momentum with its core MPP SQL query engine (built on Trino open source technology), which enables querying of large data sets spread across multiple data sources.

Starburst's product portfolio includes the Starburst Enterprise platform and the Starburst Galaxy fully managed cloud service. In September, Starburst expanded both products with new cloud migration capabilities, including native connectivity in Starburst Galaxy. In November, it followed with new capabilities for building interactive applications on the Starburst data lake, including near-real-time streaming ingestion for analytics and automated data governance.

Headquartered in Boston, USA, Starburst received US$250 million in Series D financing in February 2022, bringing the total financing to US$414 million, and its valuation at the time reached US$3.35 billion.

Telmai

Co-founder, CEO: Mona Rakibe

Data observability is one of the most active areas in big data, with a number of startups emerging over the past five years offering technology to monitor data flows to improve data quality and reliability.

Founded in 2020 and headquartered in San Francisco, USA, Telmai is one of the newer startups. Telmai's AI-driven data observability platform helps data teams automatically monitor data pipeline processes using a range of data quality metrics and KPIs, and proactively detect and investigate data anomalies in real time.

Telmai released a new version of its software in September this year, which contains many new features designed to simplify and accelerate the adoption of data observability, including "time travel" retrospective analysis of historical data, private cloud options across three major public clouds, and end-to-end observability for heterogeneous data pipelines.

Telmai received $5.5 million in seed funding in June this year.

Tessell

Co-founder, CEO: Bala Kuchibhotla

Tessell takes a different approach than traditional cloud databases. Tessell's cloud-native managed database-as-a-service does not use its own underlying proprietary database engine, but supports Oracle, Microsoft SQL Server, Postgres and MySQL databases.

Tessell says that, thanks to the unique design of its data infrastructure and management platform, which runs on Azure or AWS, it can run transaction-heavy database workloads with higher performance and at lower cost.

Headquartered in San Ramon, California, Tessell was founded in 2021 by CEO Bala Kuchibhotla and VP/Head of Engineering Kamal Khanuja, both of whom previously worked at Nutanix and Oracle. Tessell received $34 million in Series A funding from Lightspeed Venture Partners in November 2022.

Vendia

Co-founder, CEO: Tim Wagner

Vendia has developed a data collaboration platform based on blockchain technology to help organizations overcome "data sprawl" by enabling real-time data sharing and workflow automation across companies, clouds, systems and business networks.

Vendia (the company name comes from a "Venn diagram" showing overlapping data sets) was founded in 2020 and is headquartered in San Francisco, USA. Vendia received US$30 million in Series B financing in May 2022, bringing total financing to US$50 million. 

Origin blog.csdn.net/leyang0910/article/details/135118643