Use docker to install Cassandra and perform instance operations

Pull the Cassandra image in docker

Enter the following command in the terminal (Docker must already be installed):

docker pull cassandra

Create docker network

Docker has the following network types:

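bridge: mostly used for communication between standalone containers
host: uses the host machine's network directly, including the host's ports
overlay: container communication across hosts when there are multiple Docker hosts
macvlan: each container gets its own virtual MAC address
none: networking is disabled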

Default network: By default, Docker will establish a bridge, a host and a none network:

docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
cdc35147c3ae        bridge              bridge              local
0418807a829c        host                host                local
a4b53107ae28        none                null                local

Create command

docker network create [OPTIONS] NETWORK

Note: Both docker client and daemon API must be at least 1.21 to use this command. Use the docker version command to check the client and daemon API versions.

Options

Name, shorthand    Description
--attachable       Enable manual container attachment
--aux-address      Auxiliary IPv4 or IPv6 addresses used by the network driver
--config-from      The network from which to copy the configuration
--config-only      Create a configuration-only network
--driver, -d       Driver to manage the network
--gateway          IPv4 or IPv6 gateway for the master subnet

If no --driver option is given, the bridge driver is used by default.

Create a Cassandra instance

docker run --name some-cassandra --network some-network -d cassandra:tag

Replace some-cassandra with the name you want to give the container, some-network with the network you just created, and tag with the Cassandra version you want; use latest for the most recent image.

For example:

docker run --name ctest --network test -d cassandra:latest

Connect to cqlsh

docker run -it --network some-network --rm cassandra cqlsh some-cassandra
docker run -it --network test --rm cassandra cqlsh ctest
Connected to Test Cluster at ctest:9042.
[cqlsh 5.0.1 | Cassandra 3.11.4 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
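
The steps above (create a network, start a Cassandra container, connect with cqlsh) can also be scripted. The following is a minimal sketch in Python that only assembles the command lines already shown in this article. The function names are illustrative assumptions, not part of Docker or Cassandra, and the sketch prints the commands instead of running them.

```python
import shlex


def network_create_cmd(network, driver="bridge"):
    """Build the `docker network create` command for a user-defined network."""
    return ["docker", "network", "create", "--driver", driver, network]


def cassandra_run_cmd(container, network, tag="latest"):
    """Build the `docker run` command that starts a detached Cassandra container."""
    return ["docker", "run", "--name", container, "--network", network,
            "-d", f"cassandra:{tag}"]


def cqlsh_cmd(container, network):
    """Build the command for a throwaway container that opens cqlsh against `container`."""
    return ["docker", "run", "-it", "--network", network, "--rm",
            "cassandra", "cqlsh", container]


# Names taken from the example in this article: network "test", container "ctest".
for cmd in (network_create_cmd("test"),
            cassandra_run_cmd("ctest", "test"),
            cqlsh_cmd("ctest", "test")):
    print(shlex.join(cmd))
```

To actually execute a command, each list could be passed to `subprocess.run(cmd, check=True)` on a machine with Docker installed. Cassandra needs some time to start, so the cqlsh step may fail if it is run immediately after the container is created.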

CQL language

Cassandra Query Language (CQL) is a declarative language that lets users query Cassandra with a syntax similar to SQL. CQL was introduced in Cassandra 0.8 and is now the preferred way to retrieve data from Cassandra. Before CQL, Thrift, an RPC-based API, was the preferred way to retrieve data. One of the main benefits of CQL is its similarity to SQL, which makes Cassandra easier to learn. CQL is best thought of as a simple API over Cassandra's internal storage structures.

CQL basics

Let us first go over some basic CQL constructs and then work through a practical example.

keyspace

A keyspace is similar to a database in an RDBMS. It is a container for application data. Like a database, a keyspace must have a name and a set of associated attributes. The two important attributes that must be set when defining a keyspace are the replication factor and the replication strategy.

column family/table

A column family / table is similar to a table in an RDBMS. A keyspace consists of many column families / tables.

Primary key

The primary key lets the user uniquely identify an "internal row" of data. It consists of two parts: the row / partition key and the clustering key. The row / partition key determines the node on which the data is stored, and the clustering key determines the sort order of the data within a particular row.
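
To make the two roles concrete, here is a toy model in Python. It is a sketch under stated assumptions, not how Cassandra is implemented: MD5 stands in for Cassandra's real partitioner, each node owns a single token derived from its name, and `ToyCluster` is a hypothetical class invented for this illustration.

```python
import bisect
import hashlib


def token(text):
    """Map a string to an integer token (MD5 is only a stand-in hash here)."""
    return int.from_bytes(hashlib.md5(text.encode("utf-8")).digest()[:8], "big")


class ToyCluster:
    def __init__(self, node_names):
        # Each node owns one position on the ring, derived from its name.
        self.ring = sorted((token(name), name) for name in node_names)
        self.tokens = [t for t, _ in self.ring]
        # node name -> partition key -> {clustering key: row}
        self.storage = {name: {} for name in node_names}

    def node_for(self, partition_key):
        """The first node whose token is >= the key's token owns it (wrapping around)."""
        i = bisect.bisect_left(self.tokens, token(partition_key)) % len(self.ring)
        return self.ring[i][1]

    def insert(self, partition_key, clustering_key, row):
        node = self.node_for(partition_key)
        self.storage[node].setdefault(partition_key, {})[clustering_key] = row

    def read_partition(self, partition_key):
        """Return the rows of one partition, ordered by clustering key."""
        rows = self.storage[self.node_for(partition_key)].get(partition_key, {})
        return [(key, rows[key]) for key in sorted(rows)]


cluster = ToyCluster(["node-a", "node-b", "node-c"])
cluster.insert("zoo-1", "Small Capuchin monkey", {"nickname": "very cute"})
cluster.insert("zoo-1", "Capuchin monkey", {"nickname": "cute"})
print(cluster.node_for("zoo-1"))
print([key for key, _ in cluster.read_partition("zoo-1")])
```

Both rows share the partition key, so they land on the same node, and reading the partition returns them sorted by clustering key regardless of insertion order.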

The first thing to note is that CQL severely limits the predicates that can be applied to a query. This is deliberate: it prevents badly performing queries and forces users to think carefully about their data model. The following constructs are common in SQL but not available in CQL:

  • No arbitrary WHERE clause
    - in CQL, the predicate can only contain columns specified in the primary key.
  • No JOIN construct
    - data cannot be joined across column families. Joining data between two column families is discouraged, so CQL has no JOIN.
  • No GROUP BY on arbitrary columns
    - identical data cannot be grouped freely. Cassandra 3.10 and later accept GROUP BY only on primary key columns.
  • No arbitrary ORDER BY clause
    - ORDER BY can only be applied to clustering columns.
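
The restrictions on WHERE and ORDER BY can be expressed as a small checker. This is a deliberately simplified sketch: `check_query` is a hypothetical helper, and it ignores secondary indexes, ALLOW FILTERING, and the difference between equality and range predicates that real CQL validation takes into account.

```python
def check_query(partition_key, clustering, where=(), order_by=()):
    """Return a list of problems; an empty list means the simplified rules pass."""
    problems = []
    primary = list(partition_key) + list(clustering)

    # WHERE may only mention primary key columns.
    for col in where:
        if col not in primary:
            problems.append(f"WHERE on non-primary-key column: {col}")

    # If there is a WHERE clause, every partition key column must be restricted.
    missing = [col for col in partition_key if col not in where]
    if where and missing:
        problems.append(f"partition key not fully restricted: {missing}")

    # Clustering columns may only be restricted as a prefix, without gaps.
    used = [col in where for col in clustering]
    if any(later and not earlier for earlier, later in zip(used, used[1:])):
        problems.append("clustering columns must be restricted in order")

    # ORDER BY may only mention clustering columns.
    for col in order_by:
        if col not in clustering:
            problems.append(f"ORDER BY on non-clustering column: {col}")
    return problems


# Column names match the Monkey table used later in this article.
pk, ck = ["identifier"], ["species"]
print(check_query(pk, ck, where=["identifier", "species"], order_by=["species"]))
print(check_query(pk, ck, where=["nickname"]))
print(check_query(pk, ck, where=["identifier"], order_by=["population"]))
```

The first call passes, the second is rejected because nickname is not part of the primary key, and the third is rejected because population is not a clustering column.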

The best way to learn CQL is to write CQL queries. CQL is a very simple way to interact with Cassandra, but it is easy to misuse if you do not understand how it works internally. Understanding the underlying structure is the key to mastering CQL.

CQL example

cqlsh> CREATE KEYSPACE animalkeyspace
WITH REPLICATION = { 'class' : 'SimpleStrategy' ,
 'replication_factor' : 1 };

Pay special attention to the WITH REPLICATION part of the command. It states that animalkeyspace uses the simple replication strategy and that only one copy is kept of all data inserted into animalkeyspace. This is fine for a demonstration, but it is not a practical choice for any kind of test or production environment.
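
As a rough illustration of what the replication factor means under SimpleStrategy, the sketch below picks replicas by walking the ring clockwise from the node that owns the data. The function name and the node names are assumptions made for this example, and capping the factor at the number of nodes is a simplification of this sketch.

```python
def simple_strategy_replicas(ring, owner, replication_factor):
    """Pick replica nodes by walking the ring clockwise from the owning node.

    `ring` lists the node names in token order; `owner` is the node that owns
    the partition's token.
    """
    count = min(replication_factor, len(ring))  # simplification: cap at ring size
    start = ring.index(owner)
    return [ring[(start + i) % len(ring)] for i in range(count)]


ring = ["node-a", "node-b", "node-c"]
print(simple_strategy_replicas(ring, "node-c", 1))  # a single copy, as in animalkeyspace
print(simple_strategy_replicas(ring, "node-c", 2))  # wraps around to the start of the ring
```

With a replication factor of 1, losing the owning node makes the data unavailable, which is why this setting is only suitable for demonstrations.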
Next, we create a column family. To do so, first switch to animalkeyspace with the USE command. USE connects the client to a specific keyspace, i.e. all further CQL commands are executed in the context of the selected keyspace. Run the following command at the cqlsh prompt to connect the current client to animalkeyspace.

cqlsh> use animalkeyspace;
cqlsh:animalkeyspace> 

Note that the cqlsh prompt changes from "cqlsh>" to "cqlsh:animalkeyspace>", a visual reminder of the keyspace you are currently connected to.

Now let's create a column family / table to store monkey-related data. Tables are defined with the CREATE TABLE command. Pay special attention to the primary key, which consists of two parts: the partition / row key and the clustering key. The first column of the primary key is the partition key; the remaining columns form the clustering key. A composite partition key (a partition key made of several columns) is defined by wrapping those columns in an extra set of parentheses before the clustering columns.

The row key distributes data across the cluster, and the clustering key determines the order in which data is stored within a row. So when designing tables, think of the row key as a tool for spreading data evenly across the cluster and of the clustering key as a way to fix the order of data within a row. Your query patterns should strongly influence the choice of clustering key, because it sorts the data stored in the row. Note that the clustering key is optional.
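
The parenthesis rule can be illustrated with a small parser for the body of a PRIMARY KEY clause. This is a sketch for plain, unquoted column names only; `split_primary_key` is a hypothetical helper and not part of any Cassandra driver.

```python
def split_primary_key(clause):
    """Split the body of PRIMARY KEY (...) into (partition columns, clustering columns)."""
    body = clause.strip()
    # Drop the outer parentheses of PRIMARY KEY ( ... ).
    if body.startswith("(") and body.endswith(")"):
        body = body[1:-1].strip()
    if body.startswith("("):
        # Extra parentheses: everything inside them is the partition key.
        close = body.index(")")
        partition = [col.strip() for col in body[1:close].split(",")]
        rest = body[close + 1:]
    else:
        # No extra parentheses: only the first column is the partition key.
        first, _, rest = body.partition(",")
        partition = [first.strip()]
    clustering = [col.strip() for col in rest.split(",") if col.strip()]
    return partition, clustering


print(split_primary_key("((identifier), species)"))
print(split_primary_key("(identifier, species)"))
print(split_primary_key("((country, city), species, nickname)"))
```

The first two calls describe the same key layout: one partition column followed by one clustering column. The third shows a composite partition key made of two illustrative columns.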

Let us create the monkey table by executing the following command in the cqlsh prompt.

cqlsh:animalkeyspace> CREATE TABLE Monkey (
identifier uuid,   species text,  nickname text,  population int,   PRIMARY KEY ((identifier), species));

In the table above, we chose identifier as the partition key and species as the clustering key.

Let's use the following insert statement to insert a row in the above column family:

cqlsh:animalkeyspace> INSERT INTO monkey (identifier, species, nickname, population)
       VALUES ( 5132b130-ae79-11e4-ab27-0800200c9a66,
        'Capuchin monkey', 'cute', 100000);

Now, let's check what happened after creating and inserting a row in the Monkey table.

cqlsh:animalkeyspace> Select * from monkey;

 identifier                           | species         | nickname | population
--------------------------------------+-----------------+----------+------------
 5132b130-ae79-11e4-ab27-0800200c9a66 | Capuchin monkey |     cute |     100000

(1 rows)

If you convert the data inserted into the Monkey table to JSON, you will get the following result:

[
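  {
    "key": "5132b130ae7911e4ab270800200c9a66", // row / partition key
    "columns": [
      [
        "Capuchin monkey:",                    // clustering key. Note that the clustering key has no associated data; the key and the data are the same.
        "",
        1423586894518000                       // timestamp recorded when this internal column was created
      ],
      [
        "Capuchin monkey:nickname",           // header of the nickname internal column. Note that every additional internal column is prefixed with the clustering key.
        "cute",                               // actual data
        1423586894518000
      ],
      [
        "Capuchin monkey:population",        // header of the population internal column
        "100000",                            // actual data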
        1423586894518000
      ]
    ]
  }
]

The data inserted into the Monkey table can be pictured as a map of sorted maps.
[Figure: the Monkey row shown as a partition key mapped to a sorted map of internal columns]
Note that the partition key 5132b130ae7911e4ab270800200c9a66 is the row key and the key of the outer map. "Capuchin monkey:" is our clustering key and the first entry in the inner sorted map. This first entry carries no data, because its key and its data are the same. Subsequent entries build their keys by prefixing the column name with the clustering key: "Capuchin monkey:nickname" is the clustering key plus the nickname column header. The data section contains the actual data of the column.
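
The mapping from CQL rows to this internal layout can be reproduced with a short Python sketch. It is only an illustration of the naming scheme described above: `to_internal_map` is a hypothetical helper, the timestamp is passed in by hand, and plain string sorting stands in for Cassandra's own column comparator.

```python
def to_internal_map(rows, partition_col, clustering_col, timestamp):
    """Turn CQL-style rows into {partition key: sorted [name, value, timestamp] columns}."""
    outer = {}
    for row in rows:
        key = str(row[partition_col]).replace("-", "")  # uuid without dashes
        cluster = row[clustering_col]
        columns = outer.setdefault(key, {})
        # Marker entry for the clustering key itself: a name but no data.
        columns[f"{cluster}:"] = ("", timestamp)
        for name, value in row.items():
            if name not in (partition_col, clustering_col):
                # Every other column is prefixed with the clustering key.
                columns[f"{cluster}:{name}"] = (str(value), timestamp)
    return {key: [[name, *columns[name]] for name in sorted(columns)]
            for key, columns in outer.items()}


rows = [{"identifier": "5132b130-ae79-11e4-ab27-0800200c9a66",
         "species": "Capuchin monkey", "nickname": "cute", "population": 100000}]
for key, columns in to_internal_map(rows, "identifier", "species", 1423586894518000).items():
    print(key)
    for column in columns:
        print("  ", column)
```

For the single Capuchin monkey row this yields the same three internal columns as the JSON listing: the clustering key marker, then the nickname and population columns prefixed with the clustering key.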

[Figure: the link between the logical CQL row, the resulting SSTable, and the sorted map]
Now let's insert two more CQL rows. The first row has the same partition key but a different clustering key. The second row has a new partition key and a new clustering key.

cqlsh:animalkeyspace>  INSERT INTO monkey (identifier, species, nickname, population) 
VALUES ( 5132b130-ae79-11e4-ab27-0800200c9a66, 'Small Capuchin monkey', 'very cute', 100);
 INSERT INTO monkey (identifier, species, nickname, population) 
 VALUES ( 7132b130-ae79-11e4-ab27-0800200c9a66, 'Rhesus Monkey', 'Handsome', 100000); 

Query the table:

cqlsh:animalkeyspace> Select * from monkey;

 identifier                           | species               | nickname  | population
--------------------------------------+-----------------------+-----------+------------
 5132b130-ae79-11e4-ab27-0800200c9a66 |       Capuchin monkey |      cute |     100000
 5132b130-ae79-11e4-ab27-0800200c9a66 | Small Capuchin monkey | very cute |        100
 7132b130-ae79-11e4-ab27-0800200c9a66 |         Rhesus Monkey |  Handsome |     100000

(3 rows)

Converted to JSON, the data now looks like this:

[
  {
    "key": "5132b130ae7911e4ab270800200c9a66",
    "columns": [
      [
        "Capuchin monkey:",
        "",
        1424557973603000
      ],
      [
        "Capuchin monkey:nickname",
        "cute",
        1424557973603000
      ],
      [
        "Capuchin monkey:population",
        "100000",
        1424557973603000
      ],
      [
        "Small Capuchin monkey:",
        "",
        1424558013115000
      ],
      [
        "Small Capuchin monkey:nickname",
        "very cute",
        1424558013115000
      ],
      [
        "Small Capuchin monkey:population",
        "100",
        1424558013115000
      ]
    ]
  },
  {
    "key": "7132b130ae7911e4ab270800200c9a66",
    "columns": [
      [
        "Rhesus Monkey:",
        "",
        1424558014339000
      ],
      [
        "Rhesus Monkey:nickname",
        "Handsome",
        1424558014339000
      ],
      [
        "Rhesus Monkey:population",
        "100000",
        1424558014339000
      ]
    ]
  }
]

The data in the Monkey table can now be pictured as the following map.
[Figure: two partition keys, each mapped to its own sorted map of internal columns]
The CQL example is taken from another source.


Origin blog.csdn.net/Maestro_T/article/details/97538329