[Big Data] Introduction to Sqoop

Introduction to Sqoop

Sqoop (pronounced "scoop") is short for SQL-to-Hadoop. It is an open-source tool used mainly to transfer data between Hadoop and relational databases, which improves interoperability between the two.

With Sqoop, you can easily import data from MySQL, Oracle, PostgreSQL, and other relational databases into Hadoop (for example, into HDFS, HBase, or Hive), or export data from Hadoop back to a relational database. This makes data migration between traditional relational databases and Hadoop very convenient.
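As a rough illustration of the import and export directions described above, here is a minimal Python sketch that only builds the command lines for `sqoop import` and `sqoop export`. It does not run them. The flags used (`--connect`, `--username`, `--password-file`, `--table`, `--target-dir`, `--export-dir`, `--num-mappers`) are standard Sqoop 1 options, but the JDBC URL, user name, table names, and paths are hypothetical placeholders, and a real job would likely need more options.

```python
import shlex


def sqoop_import_cmd(jdbc_url, username, password_file, table, target_dir, num_mappers=4):
    """Build the argument list for a `sqoop import` run (relational DB -> HDFS)."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,              # JDBC URL of the source database
        "--username", username,
        "--password-file", password_file,   # file holding the password
        "--table", table,                   # source table to read
        "--target-dir", target_dir,         # HDFS directory to write into
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]


def sqoop_export_cmd(jdbc_url, username, password_file, table, export_dir):
    """Build the argument list for a `sqoop export` run (HDFS -> relational DB)."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,
        "--table", table,             # destination table, assumed to exist already
        "--export-dir", export_dir,   # HDFS directory holding the data to export
    ]


if __name__ == "__main__":
    # All values below are made-up placeholders for illustration.
    url = "jdbc:mysql://dbhost:3306/shop"
    print(shlex.join(sqoop_import_cmd(url, "etl_user", "/user/etl/.pw", "orders", "/data/orders")))
    print(shlex.join(sqoop_export_cmd(url, "etl_user", "/user/etl/.pw", "order_stats", "/data/order_stats")))
```

Building the command as a list keeps each argument separate, so it could be handed to `subprocess.run` on a machine where Sqoop is installed without any shell quoting problems.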

Like other ETL tools, Sqoop uses a metadata model to determine data types, which ensures type-safe processing when data is transferred from the data source into Hadoop.

Sqoop is designed for bulk transfer of large data sets. It can split a data set into slices and create Hadoop tasks that process each slice in parallel.
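To make the splitting idea concrete, here is a simplified Python sketch of range-based splitting on a numeric key: given the minimum and maximum of a column, it divides the range into contiguous slices, one per parallel task. This illustrates the general technique only and is not Sqoop's actual splitter code. The function names, the table `orders`, and the column `id` are hypothetical.

```python
def split_ranges(lo, hi, num_splits):
    """Divide the inclusive integer range [lo, hi] into contiguous sub-ranges.

    Returns a list of (start, end) pairs, both inclusive. Never returns
    more sub-ranges than there are distinct values in the range.
    """
    if hi < lo:
        raise ValueError("hi must be >= lo")
    if num_splits < 1:
        raise ValueError("num_splits must be >= 1")

    total = hi - lo + 1
    num_splits = min(num_splits, total)
    base, extra = divmod(total, num_splits)

    ranges = []
    start = lo
    for i in range(num_splits):
        # The first `extra` slices take one additional value each.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def split_queries(table, column, lo, hi, num_splits):
    """Render one bounded SELECT per slice; each could be run by a separate task."""
    return [
        f"SELECT * FROM {table} WHERE {column} >= {a} AND {column} <= {b}"
        for a, b in split_ranges(lo, hi, num_splits)
    ]


if __name__ == "__main__":
    # Hypothetical table and key column, with ids ranging from 1 to 100.
    for query in split_queries("orders", "id", 1, 100, 4):
        print(query)
```

Even splitting like this works well when key values are spread uniformly. If the values are skewed, some slices hold far more rows than others, which is why the choice of split column matters in practice.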

Partly adapted from the book "Spark Programming Fundamentals".

Origin blog.csdn.net/debimeng/article/details/102653383