.NET Easily Handles Hundreds of Millions of Rows: Working with ClickHouse Data

I don't like spending time running large benchmarks to compare the performance of different solutions, and I find long-winded comparative reviews tiring to read. A difference of a few hundred milliseconds rarely matters to me; what matters is that a tool works well in practice. I like to keep things simple, and I write my blog posts the same way.
There are relatively few .NET libraries for working with ClickHouse, and most of them are wrappers around ClickHouse.ADO. Below I describe how to use ClickHouse.ADO itself, and then how to use a library I wrapped around it.

Foreword

ClickHouse is designed for analyzing large volumes of data. In my scenario, every ten seconds I pull the bus track data for a fixed time window and analyze it for certain conditions. The machine is an ordinary development configuration, the track data totals about 300 million rows, the time window processed falls within a single day, and about 23,000 rows are extracted. I offer this for reference.
Time taken (the screenshot from the original post is not preserved here)

Specific operations

First, simple queries, single inserts, and bulk inserts (ClickHouse does not recommend updating or deleting data, so no examples of those are given here)

using System;
using System.Data;
using ClickHouse.Ado;

public class Demo
{
        private ClickHouseConnection GetConnection(string cstr= "Compress=True;CheckCompressedHash=False;Compressor=lz4;Host=ch-test.flippingbook.com;Port=9000;Database=default;User=andreya;Password=123")
        {
            var settings = new ClickHouseConnectionSettings(cstr);
            var cnn = new ClickHouseConnection(settings);
            cnn.Open();
            return cnn;
        }
        /* Query */
        public void Select()
        {
            using (var cnn = GetConnection())
            {
                var reader = cnn.CreateCommand("SELECT * FROM test").ExecuteReader();
                // ... (rest omitted)
            }
        }
        /* Insert a single row */
        public void Insert()
        {
            using (var cnn = GetConnection())
            {
                var cmd = cnn.CreateCommand("INSERT INTO test (date,x, arr)values ('2017-01-01',1,['a','b','c'])");
                cmd.ExecuteNonQuery();
            }
        }
        /* Bulk insert */
        public void InsertBulk()
        {
            using (var cnn = GetConnection())
            {
                var cmd = cnn.CreateCommand("INSERT INTO test (date,x, values.name,values.value)values @bulk;");
                cmd.Parameters.Add(new ClickHouseParameter
                {
                    DbType = DbType.Object,
                    ParameterName = "bulk",
                    Value = new[]
                    {
                        new object[] {DateTime.Now, 1, new[] {"[email protected]", "awdasdas"}, new[] {"dsdsds", "dsfdsds"}},
                        new object[] {DateTime.Now.AddHours(-1), 2, new string[0], new string[0]},
                    }
                });
                cmd.ExecuteNonQuery();
            }
        }
}

Second, converting the data by hand after reading it with the raw API is too much trouble, and paging also has to be implemented yourself, so I wrote a helper class that makes ClickHouse easier to work with (it is linked from the original post).
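Paging over a ClickHouse table is commonly expressed with a LIMIT ... OFFSET clause. The sketch below only illustrates how a helper might build such a clause; the `Paging` class and its `AppendPage` method are hypothetical names and are not part of ClickHouse.ADO or of the helper class mentioned above.

```csharp
using System;

// Hypothetical helper: the names are illustrative, not part of ClickHouse.ADO.
public static class Paging
{
    // Appends a LIMIT/OFFSET clause for a 1-based page index.
    public static string AppendPage(string sql, int pageIndex, int pageSize)
    {
        if (pageIndex < 1) throw new ArgumentOutOfRangeException(nameof(pageIndex));
        if (pageSize < 1) throw new ArgumentOutOfRangeException(nameof(pageSize));

        // Use long arithmetic so large page numbers do not overflow.
        long offset = (long)(pageIndex - 1) * pageSize;

        // Drop trailing whitespace and a trailing semicolon before appending.
        string baseSql = sql.TrimEnd().TrimEnd(';');
        return baseSql + " LIMIT " + pageSize + " OFFSET " + offset;
    }
}
```

For example, `Paging.AppendPage("SELECT * FROM test ORDER BY x", 3, 100)` would request rows 201 to 300. The query should carry an ORDER BY so that pages are stable between calls.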

Helper class
Using it is also simple:

public HistoryModel GetHistories(string busid, string begindt, string enddt)
{
    using (var helper = new ClickHouseHelper())
    {
        try
        {
            HistoryModel historyModel = new HistoryModel();
            historyModel.Histories = helper.ExecuteList<HistoriesModel>($"select mile,speed,lon,lat,direct,termtime from its.gps_MergeTree where termtime >='{begindt}' and termtime<='{enddt}' and busid={busid} order by termtime");
            historyModel.Inouts = helper.ExecuteList<InoutModel>($"SELECT * FROM its.inout_t WHERE Adtime>='{begindt}' and Adtime<='{enddt}' and Busid={busid} order by Recvtime");
            // Times read from ClickHouse have a time zone problem by default,
            // so they are converted to the local time zone manually here.
            historyModel.Histories.ForEach(u => u.termtime = DateTime.Parse(u.termtime).ToLocalTime().ToString("yyyy-MM-dd HH:mm:ss"));
            historyModel.Inouts.ForEach(u => u.Recvtime = u.Recvtime.ToLocalTime());
            return historyModel;
        }
        catch (Exception e)
        {
            // The using block disposes the helper, so no explicit Dispose call is needed.
            Console.WriteLine(e);
            throw;
        }
    }
}
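The queries above place `busid`, `begindt`, and `enddt` directly into the SQL text, so a malformed value would end up in the query. One possible precaution is to parse the inputs first and interpolate only the normalized values. The sketch below assumes that `busid` is a non-negative integer and that the times use the `yyyy-MM-dd HH:mm:ss` format. `QueryArgs` and `BuildFilter` are hypothetical names, not part of the helper class.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: validates inputs before they are placed into SQL text.
public static class QueryArgs
{
    private const string TimeFormat = "yyyy-MM-dd HH:mm:ss";

    // Returns a filter built only from parsed and re-formatted values.
    // Throws if any input is not in the expected shape.
    public static string BuildFilter(string busid, string begindt, string enddt)
    {
        // NumberStyles.None accepts digits only (no sign, spaces, or separators).
        int id = int.Parse(busid, NumberStyles.None, CultureInfo.InvariantCulture);
        DateTime begin = DateTime.ParseExact(begindt, TimeFormat, CultureInfo.InvariantCulture);
        DateTime end = DateTime.ParseExact(enddt, TimeFormat, CultureInfo.InvariantCulture);

        string b = begin.ToString(TimeFormat, CultureInfo.InvariantCulture);
        string e = end.ToString(TimeFormat, CultureInfo.InvariantCulture);
        string i = id.ToString(CultureInfo.InvariantCulture);
        return "termtime >= '" + b + "' AND termtime <= '" + e + "' AND busid = " + i;
    }
}
```

The returned string could then be appended after `where` in the track query, in place of the raw interpolated arguments.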

Third, notes on a few minor problems

  1. Time zone issues
    Times read from ClickHouse come out eight hours off. At first I suspected that the time zone had not been set up correctly on the server, but it was in fact correct, so the only option is to convert the values to the local time zone manually with ToLocalTime.
  2. Bulk inserting data
    If the data passed to a bulk insert is a List, the element class needs a GetEnumerator method that yields its column values, like this:
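public class Demo
{
     public string obu { get; set; }
     public int busid { get; set; }
     public string buscode { get; set; }
     // IEnumerator lives in System.Collections.
     // Yield the values in the same order as the columns in the INSERT statement.
     public IEnumerator GetEnumerator()
     {
         yield return obu;
         yield return busid;
         yield return buscode;
         // ... (remaining properties omitted)
     }
}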
  3. Type unification
    For details, see my separate article on this topic (linked from the original post).
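For the time zone issue in item 1, note that `ToLocalTime` depends on the time zone configured on the machine running the code. The sketch below shows one alternative that converts with an explicit offset instead. It assumes that the values read from ClickHouse represent UTC and that the target zone is a fixed UTC+8. The `TimeFix` class is a hypothetical name.

```csharp
using System;

// Hypothetical helper: converts a UTC timestamp to a fixed UTC+8 zone.
public static class TimeFix
{
    // A fixed-offset zone (no daylight saving), created in code so the result
    // does not depend on the time zone settings of the host machine.
    private static readonly TimeZoneInfo Utc8 =
        TimeZoneInfo.CreateCustomTimeZone("UTC+8", TimeSpan.FromHours(8), "UTC+8", "UTC+8");

    public static DateTime ToUtc8(DateTime value)
    {
        // Assumption: the incoming value is UTC, whatever its Kind flag says.
        DateTime utc = DateTime.SpecifyKind(value, DateTimeKind.Utc);
        return TimeZoneInfo.ConvertTimeFromUtc(utc, Utc8);
    }
}
```

With this sketch, `TimeFix.ToUtc8(new DateTime(2019, 7, 1, 0, 0, 0))` yields 2019-07-01 08:00:00 on any machine, whereas `ToLocalTime` would give that result only where the local zone is UTC+8.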


Origin www.cnblogs.com/ShaoJianan/p/11163091.html