Using Cassandra: a weather station example

Use case:

Cassandra is well suited to storing time series data. This article uses a weather station as the example: each station needs to store one temperature reading per minute.

 

 

First, Scheme 1: one row per device

     The idea behind this scheme is to create one row for each data source, here a weather station. A temperature reading is collected every minute; the timestamp of each reading becomes the column name, and the temperature is the column value. In CQL terms, weatherstation_id is the partition key and event_time is the clustering column.

( 1 ) The CREATE TABLE statement is as follows:

CREATE TABLE temperature ( 

    weatherstation_id text, 

    event_time timestamp, 

    temperature text, 

    PRIMARY KEY (weatherstation_id,event_time) );

 

( 2 ) Insert the following data:

INSERT INTO temperature(weatherstation_id,event_time,temperature) VALUES ('1234ABCD','2019-08-03 07:01:00','72F'); 

INSERT INTO temperature(weatherstation_id,event_time,temperature) VALUES ('1234ABCD','2019-08-03 07:02:00','73F'); 

INSERT INTO temperature(weatherstation_id,event_time,temperature) VALUES ('1234ABCD','2019-08-03 07:03:00','73F'); 

INSERT INTO temperature(weatherstation_id,event_time,temperature) VALUES ('1234ABCD','2019-08-03 07:04:00','74F');

 

( 3 ) To query all the data for a weather station:

SELECT event_time,temperature FROM temperature WHERE weatherstation_id='1234ABCD';

 

( 4 ) To query data for a time range (here, everything after a given time):

SELECT temperature FROM temperature WHERE weatherstation_id='1234ABCD' AND event_time > '2019-08-03 07:01:00';
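The layout of Scheme 1 can be pictured with a small in-memory model: one entry per weather station, holding readings that are read back in event_time order. This is only a conceptual sketch in Python, not how Cassandra stores data internally, and the names partitions, insert and select_after are hypothetical helpers introduced for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Conceptual model only: partition key -> {event_time: temperature}.
# This is an illustration, not Cassandra's actual storage engine.
partitions = defaultdict(dict)

def insert(station_id, event_time, temperature):
    """Write one reading; writing the same event_time again overwrites it."""
    partitions[station_id][event_time] = temperature

def select_after(station_id, lower_bound):
    """Return (event_time, temperature) pairs with event_time > lower_bound,
    sorted by event_time, mirroring the range query in step ( 4 )."""
    row = partitions.get(station_id, {})
    return [(t, v) for t, v in sorted(row.items()) if t > lower_bound]

# Same four readings as the INSERT statements in step ( 2 ).
for minute, temp in [(1, "72F"), (2, "73F"), (3, "73F"), (4, "74F")]:
    insert("1234ABCD", datetime(2019, 8, 3, 7, minute), temp)

result = select_after("1234ABCD", datetime(2019, 8, 3, 7, 1))
```

Here result holds the three readings taken after 07:01, in time order, which is the same set of rows the CQL range query is expected to return.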

 

 

 

 

 

Second, Scheme 2: one row per device per day

   Sometimes storing all of a device's data in a single row is impractical, for example because it no longer fits (this should be rare). In that case we can split the row by adding another component to the row key. For example, each device's data can be limited to one row per day, which keeps the size of each row under control.

( 1 ) Create a table

CREATE TABLE temperature_by_day ( 

    weatherstation_id text, 

    date text, 

    event_time timestamp, 

    temperature text, 

 PRIMARY KEY ((weatherstation_id,date),event_time) ); 

 

( 2 ) Insert data

INSERT INTO temperature_by_day(weatherstation_id,date,event_time,temperature) VALUES ('1234ABCD','2019-08-03','2019-08-03 07:01:00','72F'); 

INSERT INTO temperature_by_day(weatherstation_id,date,event_time,temperature) VALUES ('1234ABCD','2019-08-03','2019-08-03 07:02:00','73F'); 

INSERT INTO temperature_by_day(weatherstation_id,date,event_time,temperature) VALUES ('1234ABCD','2019-08-04','2019-08-04 07:01:00','73F'); 

INSERT INTO temperature_by_day(weatherstation_id,date,event_time,temperature) VALUES ('1234ABCD','2019-08-04','2019-08-04 07:02:00','74F');

 

( 3 ) Query one day of data for a device

   SELECT * FROM temperature_by_day WHERE weatherstation_id='1234ABCD' AND date='2019-08-03';
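With the table above, the application has to derive the date value from each reading's timestamp when writing, and a query that spans several days has to address each day's row separately. The following Python sketch shows one way to compute those values; partition_key and day_buckets are hypothetical helper names, and the "YYYY-MM-DD" format is assumed from the sample inserts.

```python
from datetime import datetime, timedelta

def partition_key(station_id, event_time):
    """Build the (weatherstation_id, date) pair used as the composite row key."""
    return (station_id, event_time.strftime("%Y-%m-%d"))

def day_buckets(start, end):
    """List every date string from start's day to end's day, inclusive."""
    days = (end.date() - start.date()).days
    return [(start.date() + timedelta(days=i)).isoformat() for i in range(days + 1)]

# Key for the first sample reading.
key = partition_key("1234ABCD", datetime(2019, 8, 3, 7, 1))

# Days that a query covering all four sample readings would need to visit.
buckets = day_buckets(datetime(2019, 8, 3, 7, 0), datetime(2019, 8, 4, 7, 2))
```

Each entry in buckets corresponds to one query of the form shown in step ( 3 ), with the date value substituted.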

 

 

 

 

 

Third, Scheme 3: time-limited data that is deleted automatically when it expires

   Another typical pattern is storing data on a rolling basis. Imagine, for example, a dashboard that displays the latest 10 temperature readings: older data is useless and can be ignored. With other databases we often need a background job that regularly cleans up historical data. With Cassandra, we can use a feature called expiring columns: once the specified time to live (TTL) has elapsed, the column disappears automatically.

( 1 ) Create a table

CREATE TABLE latest_temperatures ( 

    weatherstation_id text, 

    event_time timestamp, 

    temperature text, 

    PRIMARY KEY (weatherstation_id,event_time), 

) WITH CLUSTERING ORDER BY (event_time DESC);

 

( 2 ) Insert data

INSERT INTO latest_temperatures(weatherstation_id,event_time,temperature) VALUES ('1234ABCD','2019-08-03 07:03:00','72F') USING TTL 20; 

INSERT INTO latest_temperatures(weatherstation_id,event_time,temperature) VALUES ('1234ABCD','2019-08-03 07:02:00','73F') USING TTL 20; 

INSERT INTO latest_temperatures(weatherstation_id,event_time,temperature) VALUES ('1234ABCD','2019-08-03 07:01:00','73F') USING TTL 20; 

INSERT INTO latest_temperatures(weatherstation_id,event_time,temperature) VALUES ('1234ABCD','2019-08-03 07:04:00','74F') USING TTL 20;

 

( 3 ) Observe

    After inserting the data, query the table repeatedly. Because each row was inserted with a TTL of 20 seconds, you will see the rows disappear one by one until none are left.
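The expiry behaviour can be sketched in Python as follows. The key point is that the TTL counts from the moment a row is written, not from its event_time. The function visible_rows and the 5-second spacing between writes are assumptions made for illustration; only the TTL of 20 seconds comes from the inserts above.

```python
from datetime import datetime, timedelta

TTL_SECONDS = 20  # same value as USING TTL 20 in the inserts above

def visible_rows(rows, now, ttl_seconds=TTL_SECONDS):
    """Return the rows whose TTL has not yet elapsed at time `now`.

    Each row is a tuple (written_at, event_time, temperature).
    """
    ttl = timedelta(seconds=ttl_seconds)
    return [r for r in rows if now < r[0] + ttl]

# Assumption: the four inserts are executed 5 seconds apart.
base = datetime(2019, 8, 3, 8, 0, 0)
rows = [
    (base + timedelta(seconds=0), "2019-08-03 07:03:00", "72F"),
    (base + timedelta(seconds=5), "2019-08-03 07:02:00", "73F"),
    (base + timedelta(seconds=10), "2019-08-03 07:01:00", "73F"),
    (base + timedelta(seconds=15), "2019-08-03 07:04:00", "74F"),
]

after_22s = visible_rows(rows, base + timedelta(seconds=22))
after_40s = visible_rows(rows, base + timedelta(seconds=40))
```

Under these assumptions, the first row has already expired 22 seconds after the first write, and all rows are gone after 40 seconds.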

 

 

 

 

Summary:

    Time series is one of the data models where Cassandra is most competitive.

 

Excerpt from the original article:

 Cassandra can store up to 2 billion columns per row
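 To put that limit in perspective for the weather station example, the short Python calculation below estimates how long one station would take to reach it under Scheme 1. It assumes each reading counts as exactly one column and ignores size on disk and read performance, so it is a rough sketch rather than a sizing guide.

```python
MAX_COLUMNS_PER_ROW = 2_000_000_000  # limit quoted in this article
READINGS_PER_DAY = 24 * 60           # one reading per minute

# Assumption: every reading occupies exactly one column in the row.
readings_per_year = READINGS_PER_DAY * 365
years_to_fill = MAX_COLUMNS_PER_ROW / readings_per_year

print(f"{READINGS_PER_DAY} readings per day")
print(f"{readings_per_year} readings per year")
print(f"about {years_to_fill:.0f} years to reach the column limit")
```

 At one reading per minute the column limit is thousands of years away, which is consistent with the remark in Scheme 2 that running out of room in a single row should be rare.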

 

References:

  https://academy.datastax.com/resources/getting-started-time-series-data-modeling  

  http://www.rubyscale.com/post/143067470585/basic-time-series-with-cassandra

  http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra  

 

Origin www.cnblogs.com/Soy-technology/p/11310005.html