Introduction to Presto for Big Data

1. What is Presto

Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes, from gigabytes to petabytes.

Presto was designed and written from the ground up for interactive analytics. It aims to approach the speed of commercial data warehouses while scaling to the size of organizations like Facebook.

 

 

2. What can Presto do

Presto queries data where it lives, including Hive, Cassandra, relational databases and proprietary data stores. A single Presto query can combine data from multiple sources, allowing analysis across the entire organization.

Presto targets analysts who expect response times ranging from under a second to a few minutes. It removes the forced choice between a fast but expensive commercial solution and a slow "free" solution that consumes a lot of hardware.
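As a minimal sketch of what combining sources in one query can look like: Presto addresses tables as catalog.schema.table, so a single statement can join tables from different catalogs. The catalog, schema, table and column names below (hive.web.page_views, mysql.crm.customers, customer_id, country) are hypothetical and only serve as illustration. The Python helper merely builds the SQL text; it does not connect to a cluster.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a fully qualified table name in the form catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def build_federated_query() -> str:
    """Build one SQL statement that joins tables from two different catalogs."""
    # Hypothetical tables: page views stored in Hive, customers stored in MySQL.
    views = qualified("hive", "web", "page_views")
    customers = qualified("mysql", "crm", "customers")
    return (
        "SELECT c.country, count(*) AS views\n"
        f"FROM {views} v\n"
        f"JOIN {customers} c ON v.customer_id = c.id\n"
        "GROUP BY c.country\n"
        "ORDER BY views DESC"
    )


if __name__ == "__main__":
    print(build_federated_query())
```

The resulting text could be pasted into the Presto CLI, provided catalogs with those names are configured on the cluster.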

 

3. The operating principle of Presto

Presto is a distributed system that runs on multiple servers. A full installation includes one coordinator and multiple workers. Queries are submitted to the coordinator from a client such as the Presto command line interface (CLI). The coordinator parses and analyzes the query, plans its execution, and then distributes the processing to the workers.
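The coordinator and worker roles described above can be illustrated with a toy model. This is not Presto's scheduler or its API. It is a sketch that assumes a simple count-per-key query, with threads standing in for worker servers. The function names plan_splits, worker_count_by_key and run_query are invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def plan_splits(rows, num_splits):
    """Coordinator step: divide the input rows into roughly equal splits."""
    return [rows[i::num_splits] for i in range(num_splits)]


def worker_count_by_key(split):
    """Worker step: compute a partial count per key for one split."""
    partial = {}
    for key in split:
        partial[key] = partial.get(key, 0) + 1
    return partial


def run_query(rows, num_workers=3):
    """Coordinator: plan the splits, hand them to workers, merge the partials."""
    splits = plan_splits(rows, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(worker_count_by_key, splits))
    merged = {}
    for partial in partials:
        for key, count in partial.items():
            merged[key] = merged.get(key, 0) + count
    return merged


if __name__ == "__main__":
    rows = ["US", "DE", "US", "FR", "DE", "US"]
    print(run_query(rows))
```

The point of the sketch is the division of labor: the coordinator only plans and merges, while the per-split work runs in parallel on the workers.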



 

4. Who is using Presto

Facebook uses Presto for interactive queries against several internal data stores, including its 300PB data warehouse. More than 1,000 Facebook employees use Presto every day, running more than 30,000 queries that together scan more than 1PB of data per day.

Leading internet companies including Airbnb and Dropbox use Presto.

 

Presto is amazing. Principal engineer Andy Kramolisch got it into production in just a few days. It is an order of magnitude faster than Hive in most of our use cases. It reads data directly from HDFS, so unlike Redshift, it doesn't require a lot of ETL before use. It just works.

                                       -- Christopher Gutierrez, Online Analytics Manager, Airbnb

We are very excited about Presto. We plan to use it to quickly gain insight into the different ways users use Dropbox, as well as to diagnose problems they encounter. In our tests so far, applied to some of our most important ad hoc use cases, it has been stable and very fast.

                                       -- Fred Wulff, Software Engineer, Dropbox
