Server Development 28: using rapidjson to parse JSON exchanged between the HTTP server and the operations-center server (rapidjson interfaces and pitfalls)

1. Background knowledge

1) DOM and SAX

1. What is DOM style API?

The Document Object Model (DOM) is an in-memory JSON representation for querying and modifying JSON.

2. What is SAX style API?

SAX is an event-driven API for parsing and generating JSON.

3. Do I use DOM or SAX?

DOM is easy to query and modify. SAX is very fast and memory-efficient, but generally more difficult to use.

4. What is in situ parsing?

In situ parsing decodes JSON strings directly into the input buffer that holds the JSON text. This optimization reduces memory consumption and improves performance, but the input buffer is modified in the process.

5. When will a parsing error occur?

The parser reports an error when the input JSON contains invalid syntax, when a value cannot be represented (for example, a number that is too large), or when the parser's handler aborts the parsing process.

6. What error message is there?

Error information is stored in ParseResult, which contains the error code and offset value (the number of characters from the start of JSON to the error). Error codes can be translated into human-readable error messages.

7. Why not just use double to represent JSON number?

Some applications require the use of 64-bit unsigned/signed integers. These integers cannot be converted losslessly to double. Therefore, the parser will check whether a JSON number can be converted to various integer types and double.
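
The precision loss can be checked with plain standard C++, independent of rapidjson. This is a small sketch; survives_double_round_trip is a hypothetical name chosen for the illustration.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when `v` converts to double and back without changing.
bool survives_double_round_trip(int64_t v) {
    const double d = static_cast<double>(v);
    // 2^63 does not fit in int64_t, so converting it back would be undefined.
    if (d >= 9223372036854775808.0)
        return false;
    return static_cast<int64_t>(d) == v;
}
```

A double has a 53-bit significand, so 2^53 still round-trips while 2^53 + 1 does not. This is why a 64-bit ID stored as a double can silently change.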

8. How to clear and minimize the capacity of document or value?

Call one of the SetXXX() methods. These call the destructor and rebuild an empty Object or Array:

Document d;
...
d.SetObject();  // clear and minimize

An equivalent approach is the C++ swap-with-temporary idiom:

Value(kObjectType).Swap(d);

Alternatively, use this slightly longer code to accomplish the same thing:

d.Swap(Value(kObjectType).Move()); 

2) parse json

Note that a rapidjson::Document can hold any JSON type: object, array, number, string, boolean or null. Object-related methods such as HasMember may be called only when the document is an object.

#include <rapidjson/document.h>
#include <rapidjson/error/en.h>
#include <rapidjson/stringbuffer.h>
#include <rapidjson/writer.h>
#include <stdio.h>

int main(int argc, char* argv[])
{
    if (argc < 2)
    {
        printf("usage: %s <json>\n", argv[0]);
        return 1;
    }
    rapidjson::Document doc;
    doc.Parse(argv[1]);
    if (doc.HasParseError())
    {
        printf("parse error: %s\n", rapidjson::GetParseError_En(doc.GetParseError()));
        return 1;
    }
    printf("parse ok\n");
    // Note: doc can be any of object, array, number, string, boolean or null
    if (doc.IsInt())
        printf("%d\n", doc.GetInt());
    if (!doc.IsObject())
        printf("not object\n");
    // HasMember may be called only when doc is an object
    else if (doc.HasMember("x"))
        printf("has x\n");
    else
        printf("without x\n");
    return 0;
}

2. Use the interface document

1) Serialization and deserialization

(1) Convert string to json

const char* json = "{\"a\":\"1\",\"b\":2}";
Document d;
d.Parse(json);
if(d.HasParseError()){
    printf("ret=%d\n", d.GetParseError());
}

Define a Document object and parse the text with its Parse function.

  • Remarks
    The Parse function used here takes a const char*, not a std::string, so pass str.c_str(). Newer rapidjson releases can also accept a std::string when RAPIDJSON_HAS_STDSTRING is defined.

(2) Convert json to string

StringBuffer buffer;
Writer<StringBuffer> writer(buffer);
d.Accept(writer);
printf("%zu\n", buffer.GetSize());
printf("%s\n", buffer.GetString());

Because rapidjson decouples the document from the output, serialization takes a few more steps here: a StringBuffer, a Writer, and a call to Accept().

2) Data type

In line with the JSON standard (RFC 7159), rapidjson supports strings, integers, floating-point numbers, true, false, arrays, objects and null.

  • Interface explanation
    Use the IsXXX() methods to check whether the current value has a certain type.
  • Using the interface
    Each value can also be read and written through the GetXXX() and SetXXX() methods.
Value  a(1);
a = 2;
a.SetInt(3);
a.GetInt();
a.SetInt64(4);
a.SetBool(true);
a.SetNull();

3) string

  • Remarks
    For performance, RapidJSON stores references or moves values by default.
  • For example
    When setting a string value, a copy must be requested explicitly if you want one instead of a reference.
Value author;
char buffer[] = "hello world";
// A string literal: only the pointer is stored
author.SetString("hello world");
// The following line does not compile
// author.SetString(buffer);

// Force storing the pointer
author.SetString(StringRef(buffer));
// Copy the string: this call really copies
author.SetString(buffer, static_cast<SizeType>(strlen(buffer)), document.GetAllocator());
  • Instructions for use
    ① A string literal is valid for the whole lifetime of the program, so the pointer is stored directly.
    ② A plain character pointer or buffer is rejected at compile time, because its lifetime cannot be guaranteed. If you know the string outlives the value, wrap it in StringRef() to avoid the copy.
    ③ In all other cases, request a copy explicitly by passing an allocator.

4) array

  • Operations
    Arrays behave much like std::vector. The only difference is that you must pass the allocator yourself, which makes the array syntax rather verbose.

  • code example

Value a(kArrayType);
Document::AllocatorType& allocator = document.GetAllocator();
a.Clear();
a.Reserve(10, allocator);
a.PushBack(1, allocator);
a.PopBack();
a.Erase(a.Begin());
  • Supplement
    RapidJSON also supports a fluent interface (method chaining), similar to the style common in Java.
a.PushBack(1, allocator).PushBack(2, allocator);
  • Three ways to traverse an array
    ① Index method
const Value& a = document["a"];
assert(a.IsArray());
for (SizeType i = 0; i < a.Size(); i++) {
    printf("a[%u] = %d\n", i, a[i].GetInt());
}

② Iterator method

for (Value::ConstValueIterator itr = a.Begin(); itr != a.End(); ++itr)
    printf("%d ", itr->GetInt());

③ C++11 method

for (auto& v : a.GetArray())
    printf("%d ", v.GetInt());

5) object

  • Introduction
    Objects are similar to std::map.
  • Pitfall
    Member lookup in a RapidJSON object is O(n) in the number of members, so repeated lookups on large objects can become expensive.

(1) Find the object

  • the code
Value contact(kObjectType);
Document::AllocatorType& allocator = document.GetAllocator();
contact.AddMember("name", "tiankonguse", allocator);
contact.AddMember("sex", "male", allocator);
contact.AddMember("wx", "", allocator);
contact["wx"] = "tiankonguse-code";
contact.RemoveMember("name");
// d is a previously parsed Document
auto itr = d.FindMember("hello");
if (itr == d.MemberEnd()) {
    printf("not found\n");
}

(2) COPY object

Copying takes slightly more work: the key and the value must be constructed explicitly first.

  • the code
// Explicit copy, implicit move
Value key("name", allocator);
Value val("tiankonguse", allocator);
contact.AddMember(key, val, allocator);
// Copy and move inside the argument list
contact.AddMember(
    Value("sex", allocator).Move(),
    Value("male", allocator).Move(),
    allocator);

(3) Two methods of traversing objects

① Iterator method
In the STL, an iterator's key is called first and its value is called second.
In RapidJSON, the key is called name and the value is called value.

for (auto itr = d.MemberBegin(); itr != d.MemberEnd(); ++itr) {
    printf("name=%s, valueType=%d\n", itr->name.GetString(), itr->value.GetType());
}

② C++11 method

for (auto& m : d.GetObject())
    printf("name=%s valueType=%d\n", m.name.GetString(), m.value.GetType());

3. Example of use

1) parse a string

  • result
count=2
name=zhangsan
name=wangwu
  • the code
void x1()
{
    rapidjson::Document document; // define a Document object
    std::string str = "{\"count\":2,\"names\":[\"zhangsan\",\"wangwu\"]}";
    document.Parse(str.c_str()); // parse; Parse() does not throw on failure
    if (document.HasParseError()) // use HasParseError() to check whether parsing succeeded
    {
        // GetParseError() returns the error code,
        // an enum value of type rapidjson::ParseErrorCode.
        // rapidjson::GetParseError_En() returns the English description of an error code.
        // GetErrorOffset() returns the position at which the error occurred.
        printf("parse error: (%d:%zu)%s\n", document.GetParseError(), document.GetErrorOffset(), rapidjson::GetParseError_En(document.GetParseError()));
    }
    else
    {
        // check whether the members exist
        if (!document.HasMember("count") || !document.HasMember("names"))
        {
            printf("invalid format: %s\n", str.c_str());
        }
        else
        {
            // If count did not exist, this access would fail; debug builds abort directly
            rapidjson::Value& count_json = document["count"];
            // If count is not an integer, the call below also fails; debug builds abort directly
            // GetInt() returns int
            // GetUint() returns unsigned int
            // GetInt64() returns int64_t
            // GetUint64() returns uint64_t
            // GetDouble() returns double
            // GetString() returns const char*
            // GetBool() returns bool
            int count = count_json.GetInt();
            printf("count=%d\n", count);

            // GetType() returns one of: kNullType,kFalseType,kTrueType,kObjectType,kArrayType,kStringType,kNumberType
            // IsArray() checks for an array, e.g. { "a": [1, 2, 3, 4] }
            // IsString() checks for a string value
            // IsDouble() checks for a double value, e.g. { "pi": 3.1416 }
            // IsInt() checks for an int value
            // IsUint() checks for an unsigned int value
            // IsInt64() checks for an int64_t value
            // IsUint64() checks for a uint64_t value
            // IsBool() checks for a bool value
            // IsFalse() checks whether the value is false, e.g. { "t": true, "f": false }
            // IsTrue() checks whether the value is true
            // IsNull() checks whether the value is null, e.g. { "n": null }
            // For more details see:
            // https://miloyip.gitbooks.io/rapidjson/content/zh-cn/doc/tutorial.zh-cn.html

            const rapidjson::Value& names_json = document["names"];
            for (rapidjson::SizeType i=0; i<names_json.Size(); ++i)
            {
                std::string name = names_json[i].GetString();
                printf("name=%s\n", name.c_str());
            }
        }
    }
}

2) Construct a json and convert it into a string

  • operation result
{"count":2,"names":[{"name":"zhangsan"},{"name":"wangwu"}]}
  • the code
void x2()
{
    rapidjson::StringBuffer buffer;
    rapidjson::Writer<rapidjson::StringBuffer> writer(buffer);
    writer.StartObject();

    // count
    writer.Key("count");
    writer.Int(2);
    // write a 4-byte signed integer: Int(int32_t x)
    // write a 4-byte unsigned integer: Uint(uint32_t x)
    // write an 8-byte signed integer: Int64(int64_t x)
    // write an 8-byte unsigned integer: Uint64(uint64_t x)
    // write a double: Double(double x)
    // write a bool: Bool(bool x)

    // names
    writer.Key("names");
    writer.StartArray();
    writer.StartObject();
    writer.Key("name");
    writer.String("zhangsan");
    writer.EndObject();

    writer.StartObject();
    writer.Key("name");
    writer.String("wangwu");
    writer.EndObject();
    writer.EndArray();

    writer.EndObject();

    // print the result as a string
    printf("%s\n", buffer.GetString());
}

3) Modify an existing json string

  • operation result
{"name":"wangwu","age":22}
  • the code
void x3()
{
    rapidjson::Document document;
    std::string str = "{\"name\":\"zhangsan\",\"age\":20}";
    document.Parse(str.c_str());

    rapidjson::Value& name_json = document["name"];
    rapidjson::Value& age_json = document["age"];
    std::string new_name = "wangwu";
    int new_age = 22;

    // The third argument, document.GetAllocator(), requests a deep copy: rapidjson allocates memory and copies new_name.c_str().
    // Without the third argument the copy is shallow: rapidjson allocates nothing and points directly at new_name.c_str(),
    // which saves the copy but requires new_name to outlive the value.
    // Official documentation:
    // http://rapidjson.org/zh-cn/md_doc_tutorial_8zh-cn.html#CreateString
    name_json.SetString(new_name.c_str(), new_name.size(), document.GetAllocator());
    age_json.SetInt(new_age);

    // convert to a string and print
    rapidjson::StringBuffer buffer;
    rapidjson::Writer<rapidjson::StringBuffer> writer(buffer);
    document.Accept(writer);
    printf("%s\n", buffer.GetString());
}

4) Read the array

  • operation result
zhangsan wangwu
  • the code
void x4()
{
    rapidjson::Document document;
    std::string str = "{\"count\":2,\"names\":[{\"name\":\"zhangsan\"},{\"name\":\"wangwu\"}]}";
    document.Parse(str.c_str());
    if (document.HasParseError())
    {
        printf("parse error: %d\n", document.GetParseError());
    }
    else
    {
        rapidjson::Value& names_json = document["names"];
        for (rapidjson::SizeType i=0; i<names_json.Size(); ++i)
        {
            if (names_json[i].HasMember("name"))
            {
                rapidjson::Value& name_json = names_json[i]["name"];
                printf("%s ", name_json.GetString());
            }
        }
        printf("\n");
    }
}

5) Construct a json with Writer, then modify it, and finally convert it into a string

  • operation result
{"count":2}
{"count":8}
  • the code
void x5()
{
    rapidjson::StringBuffer buffer1;
    rapidjson::Writer<rapidjson::StringBuffer> writer1(buffer1);

    writer1.StartObject();
    writer1.Key("count");
    writer1.Int(2);
    writer1.EndObject();

    printf("%s\n", buffer1.GetString());

    // convert to a Document object
    rapidjson::Document document;
    document.Parse(buffer1.GetString());
    // modify
    rapidjson::Value& count_json = document["count"];
    count_json.SetInt(8);

    // convert to a string
    rapidjson::StringBuffer buffer2;
    rapidjson::Writer<rapidjson::StringBuffer> writer2(buffer2);
    document.Accept(writer2);
    printf("%s\n", buffer2.GetString());
}

6) Construct a json with Document, then modify it, and finally convert it into a string

  • operation result
{"count":3,"names":[{"id":1,"name":"zhangsan"}]}
{"count":9,"names":[{"id":1,"name":"zhangsan"}]}
  • the code
void x6()
{
    rapidjson::Document document;
    std::string str = "{}"; // required, and must not be "", otherwise Parse fails
    document.Parse(str.c_str());

    // Add the member count
    // The first argument of AddMember may be a string literal such as "str",
    // but not a "const char*" or a "std::string".
    // To pass a "const char*", wrap it with StringRefType: StringRefType(str.c_str())
    document.AddMember("count", 3, document.GetAllocator());

    // Add an array member
    rapidjson::Value array(rapidjson::kArrayType);
    rapidjson::Value object(rapidjson::kObjectType); // an element of the array
    object.AddMember("id", 1, document.GetAllocator());
    object.AddMember("name", "zhangsan", document.GetAllocator());

    // To add unnamed elements to the array, define the Value with the corresponding type, e.g.:
    //rapidjson::Value value(rapidjson::kStringType);
    //rapidjson::Value value(rapidjson::kNumberType);
    //rapidjson::Value value(rapidjson::kFalseType);
    //rapidjson::Value value(rapidjson::kTrueType);
    //array.PushBack(value, document.GetAllocator());
    // The result then looks like: 'array':[1,2,3,4,5]

    // Note that the following does not compile:
    //std::string str1 = "hello";
    //object.AddMember("name", str1.c_str(), document.GetAllocator());
    //const char* str2 = "hello";
    //object.AddMember("name", str2, document.GetAllocator());

    // The following works:
    //object.AddMember("name", "hello", document.GetAllocator());
    //const char str3[] = "hello";
    //object.AddMember("name", str3, document.GetAllocator());

    //std::string str4 = "#####";
    //rapidjson::Value v(str4.c_str(), document.GetAllocator());
    //object.AddMember("x", v, document.GetAllocator());
    // The two lines above can also be written as one:
    //object.AddMember("x", rapidjson::Value(str4.c_str(), document.GetAllocator()).Move(), document.GetAllocator());

    // add to the array
    array.PushBack(object, document.GetAllocator());

    // add to the document
    document.AddMember("names", array, document.GetAllocator());

    // convert to a string and print
    rapidjson::StringBuffer buffer1;
    rapidjson::Writer<rapidjson::StringBuffer> writer1(buffer1);
    document.Accept(writer1);
    printf("%s\n", buffer1.GetString());

    // modify the value
    rapidjson::Value& count_json = document["count"];
    count_json.SetInt(9);

    // print again
    rapidjson::StringBuffer buffer2;
    rapidjson::Writer<rapidjson::StringBuffer> writer2(buffer2);
    document.Accept(writer2);
    printf("%s\n", buffer2.GetString());
}

7) Construct a json with Document, and finally convert it into a string (non-ASCII characters escaped as ASCII in the output)

  • operation result
x7=>
{"title":"\u8D2B\u56F0\u5B64\u513F\u52A9\u517B"}
  • the code
void x7()
{
    std::string root = "{}";
    rapidjson::Document document;
    document.Parse(root.c_str());

    std::string title = "\u8D2B\u56F0\u5B64\u513F\u52A9\u517B";
    document.AddMember("title", rapidjson::Value(title.c_str(), document.GetAllocator()).Move(), document.GetAllocator());

    rapidjson::StringBuffer buffer;
    rapidjson::Writer<rapidjson::StringBuffer, rapidjson::Document::EncodingType, rapidjson::ASCII<> > writer(buffer);

    // If the line above is replaced with the plain form:
    // rapidjson::Writer<rapidjson::StringBuffer> writer(buffer);
    // the output becomes:
    // x7=>
    // {"title":"贫困孤儿助养"}

    document.Accept(writer);
    printf("x7=>\n%s\n", buffer.GetString());
}

8) Construct empty objects and arrays

  • operation result
{"age":{},"times":{},"names":[],"urls":[],"books":[]}
{"age":6,"times":{},"names":[],"urls":[],"books":[]}
  • the code
void x8()
{
    rapidjson::Document document;
    document.Parse("{}"); // document.SetObject() works here as well

    // Two ways to construct an empty object
    document.AddMember("age", rapidjson::Value(rapidjson::kObjectType).Move(), document.GetAllocator());
    document.AddMember("times", rapidjson::Value().SetObject(), document.GetAllocator());

    // Two ways to construct an empty array
    document.AddMember("names", rapidjson::Value(rapidjson::kArrayType).Move(), document.GetAllocator());
    document.AddMember("urls", rapidjson::Value(rapidjson::kArrayType).Move(), document.GetAllocator());
    document.AddMember("books", rapidjson::Value().SetArray(), document.GetAllocator());

    rapidjson::StringBuffer buffer1;
    rapidjson::Writer<rapidjson::StringBuffer> writer1(buffer1);
    document.Accept(writer1);
    printf("%s\n", buffer1.GetString());

    rapidjson::Value& age = document["age"];
    age.SetInt(6);

    rapidjson::StringBuffer buffer2;
    rapidjson::Writer<rapidjson::StringBuffer> writer2(buffer2);
    document.Accept(writer2);
    printf("%s\n", buffer2.GetString());
}

9) Delete array elements

  • operation result
{ "names": [ {"name":"zhangsan","age":100}, {"name":"wangwu","age":90}, {"name":"xiaozhang","age":20} ]}
{"names":[{"name":"zhangsan","age":100},{"name":"wangwu","age":90}]}
  • the code
void x9()
{
    std::string str = "{ \"names\": [ {\"name\":\"zhangsan\",\"age\":100}, {\"name\":\"wangwu\",\"age\":90}, {\"name\":\"xiaozhang\",\"age\":20} ]}";
    rapidjson::Document document;
    document.Parse(str.c_str());

    rapidjson::Value& names_json = document["names"];
    for (rapidjson::Value::ValueIterator iter=names_json.Begin(); iter!=names_json.End();)
    {
        std::string name = (*iter)["name"].GetString();
        // remove xiaozhang; Erase() returns the iterator following the removed element
        if (name == "xiaozhang")
            iter = names_json.Erase(iter);
        else
            ++iter;
    }
    rapidjson::StringBuffer buffer;
    rapidjson::Writer<rapidjson::StringBuffer> writer(buffer);
    document.Accept(writer);

    printf("%s\n", str.c_str());
    printf("%s\n", buffer.GetString());
}

10) Do not escape Chinese

  • output result
{"title":"贫困孤儿助养"}
{"title":"\u8D2B\u56F0\u5B64\u513F\u52A9\u517B"}
  • the code
//g++ -g -o b b.cpp -I/usr/local/thirdparty/rapidjson/include
#include <rapidjson/document.h>
#include <rapidjson/stringbuffer.h>
#include <rapidjson/writer.h>
#include <string>
#include <stdio.h>
#include <stdlib.h>

int main()
{
    std::string str = "{\"title\":\"\u8d2b\u56f0\u5b64\u513f\u52a9\u517b\"}";
    rapidjson::Document document;
    document.Parse(str.c_str());
    if (document.HasParseError())
    {
        printf("parse %s failed\n", str.c_str());
        exit(1);
    }

    // default writer: non-ASCII characters are written unescaped
    rapidjson::StringBuffer buffer1;
    rapidjson::Writer<rapidjson::StringBuffer> writer1(buffer1);
    document.Accept(writer1);
    printf("%s\n", buffer1.GetString());

    // ASCII target encoding: non-ASCII characters are written as \uXXXX
    rapidjson::StringBuffer buffer2;
    rapidjson::Writer<rapidjson::StringBuffer, rapidjson::Document::EncodingType, rapidjson::ASCII<> > writer2(buffer2);
    document.Accept(writer2);
    printf("%s\n", buffer2.GetString());
    return 0;
}

11) Schema usage example

  • Background
    A JSON Schema is used to validate JSON data, and is itself written in JSON.
  • the code
rapidjson::Document schema_document;
schema_document.Parse(schema.c_str());
if (!schema_document.HasParseError())
{
    rapidjson::Document document;
    document.Parse(str.c_str());
    if (!document.HasParseError())
    {
        SchemaDocument schema(schema_document);
        SchemaValidator validator(schema);
        if (!document.Accept(validator))
        {
            // validation failed: print the error details
            StringBuffer sb;
            validator.GetInvalidSchemaPointer().StringifyUriFragment(sb);
            printf("Invalid schema: %s\n", sb.GetString());
            printf("Invalid keyword: %s\n", validator.GetInvalidSchemaKeyword());
            sb.Clear();

            validator.GetInvalidDocumentPointer().StringifyUriFragment(sb);
            printf("Invalid document: %s\n", sb.GetString());
        }
    }
}
  • example json
{
    "id": 1,
    "name": "A green door",
    "price": 12.50,
    "tags": ["home", "green"]
}
  • The schema corresponding to the above json
{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "title": "Product",
    "description": "A product from Acme's catalog",
    "type": "object",
    "properties": {
        "id": {
            "description": "The unique identifier for a product",
            "type": "integer"
        },
        "name": {
            "description": "Name of the product",
            "type": "string"
        },
        "price": {
            "type": "number",
            "minimum": 0,
            "exclusiveMinimum": true
        },
        "tags": {
            "type": "array",
            "items": {
                "type": "string"
            },
            "minItems": 1,
            "uniqueItems": true
        }
    },
    "required": ["id", "name", "price"]
}
  • Supplement
    "title" and "description" are descriptive and can be omitted. "$schema" is also optional; the schema here is based on JSON Schema Draft v4.

12) Complete example of schema

  • the code
#include <rapidjson/document.h>
#include <rapidjson/schema.h>
#include <rapidjson/stringbuffer.h>
#include <string>
#include <stdio.h>

int main()
{
    std::string str = "{\"aaa\":111,\"aaa\":222}"; // "{\"aaa\":111,\"a\":222}"
#if 0
    std::string schema_str = "{\"type\":\"object\",\"properties\":{\"aaa\":{\"type\":\"integer\"},\"bbb\":{\"type\":\"string\"}},\"required\":[\"aaa\",\"bbb\"]}";
#else
    std::string schema_str = "{\"type\":\"object\",\"properties\":{\"aaa\":{\"type\":\"integer\"},\"bbb\":{\"type\":\"integer\"}},\"required\":[\"aaa\",\"bbb\"]}";
#endif

    printf("%s\n", str.c_str());
    printf("%s\n", schema_str.c_str());

    rapidjson::Document doc;
    rapidjson::Document schema_doc;
    schema_doc.Parse(schema_str.c_str());
    doc.Parse(str.c_str());

    rapidjson::SchemaDocument schema(schema_doc);
    rapidjson::SchemaValidator validator(schema);
    if (doc.Accept(validator))
    {
        printf("data ok\n");
    }
    else
    {
        rapidjson::StringBuffer sb;
        validator.GetInvalidSchemaPointer().StringifyUriFragment(sb);

        printf("Invalid schema: %s\n", sb.GetString());
        printf("Invalid keyword: %s\n", validator.GetInvalidSchemaKeyword());

        sb.Clear();
        validator.GetInvalidDocumentPointer().StringifyUriFragment(sb);
        printf("Invalid document: %s\n", sb.GetString());
    }
    return 0;
}

4. Examples of auxiliary functions and traversal array operations

1) FindMember integer value

int age;
const rapidjson::Value::ConstMemberIterator iter =
    doc.FindMember("age");
if (iter!=doc.MemberEnd() && iter->value.IsInt())
    age = iter->value.GetInt();

2) FindMember string value

std::string name;
const rapidjson::Value::ConstMemberIterator iter =
    doc.FindMember("name");
if (iter!=doc.MemberEnd() && iter->value.IsString())
    name.assign(iter->value.GetString(), iter->value.GetStringLength());

3) Traverse array 1: string array

// Example array:
// {"k":["k1","k2","k3"]}
rapidjson::Document doc;
doc.Parse(str.c_str());

const rapidjson::Value& k = doc["k"];
// traverse the array
for (rapidjson::Value::ConstValueIterator v_iter=k.Begin();
    v_iter!=k.End(); ++v_iter)
{
    // k1
    // k2
    // k3
    printf("%s\n", (*v_iter).GetString());
}

4) Traverse array 2: first-level object array

// Example array:
// {"h":[{"k1":"f1"},{"k2":"f2"}]}
rapidjson::Document doc;
doc.Parse(str.c_str());

const rapidjson::Value& h = doc["h"];
// traverse the array
for (rapidjson::Value::ConstValueIterator v_iter=h.Begin();
    v_iter!=h.End(); ++v_iter)
{
    const rapidjson::Value& field = *v_iter;
    for (rapidjson::Value::ConstMemberIterator m_iter=field.MemberBegin();
        m_iter!=field.MemberEnd(); ++m_iter) // key/field pairs
    {
        // k1 => f1
        // k2 => f2
        const char* key = m_iter->name.GetString();
        const char* value = m_iter->value.GetString();
        printf("%s => %s\n", key, value);
        break; // only the first member of each object is printed
    }
}

5) Traverse array 3: two-level object array

// Example array:
// {"h":[{"k1":["f1","f2"]},{"k2":["f1","f2"]}]}
rapidjson::Document doc;
doc.Parse(str.c_str());
const rapidjson::Value& h = doc["h"];
// traverse the first-level array
for (rapidjson::Value::ConstValueIterator v1_iter=h.Begin();
    v1_iter!=h.End(); ++v1_iter)
{
    const rapidjson::Value& k = *v1_iter; // k1,k2
    // traverse the members
    for (rapidjson::Value::ConstMemberIterator m_iter=k.MemberBegin();
        m_iter!=k.MemberEnd(); ++m_iter)
    {
        const char* node_name = m_iter->name.GetString();
        printf("hk: %s\n", node_name);

        const rapidjson::Value& node = m_iter->value;
        // traverse the second-level array
        for (rapidjson::Value::ConstValueIterator v2_iter=node.Begin();
            v2_iter!=node.End(); ++v2_iter)
        {
            const char* field = (*v2_iter).GetString();
            printf("field: %s\n", field); // f1,f2
        }
    }
}

6) Auxiliary function 1: Return any type as a string

// Returns an empty string if the member does not exist or is an array.
// Requires <stdio.h> for snprintf and <string.h> for strcpy.
std::string rapidjson_string_value(rapidjson::Value& value, const std::string& name)
{
    if (!value.HasMember(name.c_str()))
        return std::string("");
    const rapidjson::Value& child = value[name.c_str()];
    if (child.IsString())
        return child.GetString();
    char str[100];
    if (child.IsInt())
    {
        snprintf(str, sizeof(str), "%d", child.GetInt());
    }
    else if (child.IsInt64())
    {
        // To use PRId64, #include <inttypes.h> and
        // define the macro __STDC_FORMAT_MACROS when compiling.
        // The space before PRId64 is required since C++11.
        snprintf(str, sizeof(str), "%" PRId64, child.GetInt64());
    }
    else if (child.IsUint())
    {
        snprintf(str, sizeof(str), "%u", child.GetUint());
    }
    else if (child.IsUint64())
    {
        snprintf(str, sizeof(str), "%" PRIu64, child.GetUint64());
    }
    else if (child.IsDouble())
    {
        snprintf(str, sizeof(str), "%.2lf", child.GetDouble());
    }
    else if (child.IsBool())
    {
        if (child.IsTrue())
            strcpy(str, "true");
        else
            strcpy(str, "false");
    }
    else
    {
        str[0] = '\0';
    }
    return str;
}

7) Auxiliary function 2: take int32_t value

When it is an int32_t value or the string is actually an int32_t value, it returns the corresponding int32_t value, otherwise it returns 0.

// Returns 0 when the member is missing or is neither an int32_t nor a string
int32_t rapidjson_int32_value(rapidjson::Value& value, const std::string& name)
{
    if (!value.HasMember(name.c_str()))
        return 0;
    const rapidjson::Value& child = value[name.c_str()];
    if (child.IsInt())
    {
        return child.GetInt();
    }
    else if (child.IsString())
    {
        return atoi(child.GetString());
    }
    return 0;
}

8) Auxiliary function 3: take int64_t value

int64_t rapidjson_int64_value(rapidjson::Value& value, const std::string& name)
{
    if (!value.HasMember(name.c_str()))
        return 0;
    const rapidjson::Value& child = value[name.c_str()];
    if (child.IsInt64())
    {
        return child.GetInt64();
    }
    else if (child.IsString())
    {
        return (int64_t)atoll(child.GetString());
    }
    return 0;
}

9) Auxiliary function 4: get uint32_t value

uint32_t rapidjson_uint32_value(rapidjson::Value& value, const std::string& name)
{
    if (!value.HasMember(name.c_str()))
        return 0;
    const rapidjson::Value& child = value[name.c_str()];
    if (child.IsUint())
    {
        return child.GetUint();
    }
    else if (child.IsString())
    {
        return (uint32_t)atoll(child.GetString());
    }
    return 0;
}

10) Auxiliary function 5: take uint64_t value

uint64_t rapidjson_uint64_value(rapidjson::Value& value, const std::string& name)
{
    if (!value.HasMember(name.c_str()))
        return 0;
    const rapidjson::Value& child = value[name.c_str()];
    if (child.IsUint64())
    {
        return child.GetUint64();
    }
    else if (child.IsString())
    {
        // strtoull covers the full uint64_t range; atoll stops at INT64_MAX
        return (uint64_t)strtoull(child.GetString(), NULL, 10);
    }
    return 0;
}
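
The string branches of the helpers above cannot tell "0" from a string that is not a number at all, because the ato* family has no way to report failure. Where that matters, a stricter conversion is possible with standard C++ only. The sketch below is one way to do it; parse_uint64_strict is a hypothetical name, not part of rapidjson.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: parses a decimal string into *out.
// Returns false for empty input, a leading sign or space, trailing garbage,
// or a value that does not fit in 64 bits.
bool parse_uint64_strict(const char* s, uint64_t* out) {
    if (s == nullptr || *s < '0' || *s > '9')
        return false;  // rejects "", "-1", "+1" and " 1"
    errno = 0;
    char* end = nullptr;
    const unsigned long long v = std::strtoull(s, &end, 10);
    if (errno == ERANGE || *end != '\0')
        return false;  // overflow, or characters left after the digits
    *out = static_cast<uint64_t>(v);
    return true;
}
```

A caller can then decide whether a failed conversion should fall back to 0, as the helpers above do, or be reported as an error.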

11) Auxiliary function 6: object to string

std::string& to_string(const rapidjson::Value& value, std::string* str)
{
    rapidjson::StringBuffer buffer;
    rapidjson::Writer<rapidjson::StringBuffer> writer(buffer);
    value.Accept(writer);
    str->assign(buffer.GetString(), buffer.GetSize());
    return *str;
}

std::string to_string(const rapidjson::Value& value)
{
    std::string str;
    to_string(value, &str);
    // Returning the local by name lets the compiler elide the copy or move it
    // implicitly, so an explicit std::move is unnecessary.
    return str;
}

12) Auxiliary function 7: convert string to object

bool to_rapidjson(const std::string& str, rapidjson::Document* doc)
{
    doc->Parse(str.c_str());
    return !doc->HasParseError();
}

void to_rapidjson(const std::string& str, rapidjson::Document& doc)
{
    doc.Parse(str.c_str());
    if (doc.HasParseError())
        doc.Parse("{}");
}

5. Bugs and pits related

1) Fix a rapidjson compilation failure caused by the macro Uint

  • Imported header files
#include "rapidjson/document.h"
#include "rapidjson/stringbuffer.h"
#include "rapidjson/prettywriter.h"
  • Error message (compilation then fails with the following errors; only the key ones are shown)
bc_out/public/protobuf-json/output/include/rapidjson/reader.h: At global scope:
bc_out/public/protobuf-json/output/include/rapidjson/reader.h:84: error: expected unqualified-id before "unsigned"
bc_out/public/protobuf-json/output/include/rapidjson/reader.h:84: error: expected `)' before "unsigned"
bc_out/public/protobuf-json/output/include/rapidjson/reader.h: In member function `void rapidjson::GenericReader<SourceEncoding, TargetEncoding, Allocator>::ParseNumber(InputStream&, Handler&)':
bc_out/public/protobuf-json/output/include/rapidjson/reader.h:632: error: expected unqualified-id before '(' token
bc_out/public/protobuf-json/output/include/rapidjson/reader.h:632: error: expected primary-expression before "unsigned"
In file included from baidu/xfire/xcore2/plugins/material_search/src/material_search_proc_query_db.cpp:15:
bc_out/public/protobuf-json/output/include/rapidjson/document.h: In member function `const rapidjson::GenericValue<Encoding, Allocator>& rapidjson::GenericValue<Encoding, Allocator>::Accept(Handler&) const':
bc_out/public/protobuf-json/output/include/rapidjson/document.h:549: error: expected unqualified-id before '(' token
bc_out/public/protobuf-json/output/include/rapidjson/document.h:549: error: expected primary-expression before "unsigned"
bc_out/public/protobuf-json/output/include/rapidjson/document.h: At global scope:
bc_out/public/protobuf-json/output/include/rapidjson/document.h:791: error: expected unqualified-id before "unsigned"
bc_out/public/protobuf-json/output/include/rapidjson/document.h:791: error: expected `)' before "unsigned"
In file included from bc_out/public/protobuf-json/output/include/rapidjson/prettywriter.h:4,
                 from baidu/xfire/xcore2/plugins/material_search/src/material_search_proc_query_db.cpp:17:
bc_out/public/protobuf-json/output/include/rapidjson/writer.h:45: error: expected unqualified-id before "unsigned"
bc_out/public/protobuf-json/output/include/rapidjson/writer.h:45: error: expected `)' before "unsigned"
bc_out/public/protobuf-json/output/include/rapidjson/writer.h:45: error: abstract declarator `rapidjson::Writer<OutputStream, SourceEncoding, TargetEncoding, Allocator>&' used as declaration
bc_out/public/protobuf-json/output/include/rapidjson/writer.h:45: error: expected `;' before "unsigned"
bc_out/public/protobuf-json/output/include/rapidjson/writer.h:46: error: expected `;' before "Writer"
In file included from baidu/xfire/xcore2/plugins/material_search/src/material_search_proc_query_db.cpp:17:
bc_out/public/protobuf-json/output/include/rapidjson/prettywriter.h:46: error: expected unqualified-id before "unsigned"
bc_out/public/protobuf-json/output/include/rapidjson/prettywriter.h:46: error: expected `)' before "unsigned"
bc_out/public/protobuf-json/output/include/rapidjson/prettywriter.h:46: error: abstract declarator `rapidjson::PrettyWriter<OutputStream, SourceEncoding, TargetEncoding, Allocator>&' used as declaration
bc_out/public/protobuf-json/output/include/rapidjson/prettywriter.h:46: error: expected `;' before "unsigned"
bc_out/public/protobuf-json/output/include/rapidjson/prettywriter.h:47: error: expected `;' before "PrettyWriter"

Taken literally, the errors say that reader.h is wrong, but the rapidjson library itself is obviously not at fault.

  • Analyze
    Line 84 of reader.h, where the first error is reported, reads:
    void Uint(unsigned i);

This declares a function named Uint. My first guess, from experience, was that the same function was defined somewhere else in the same namespace and caused a conflict, but a search turned up nothing. Searching online, I found that someone had raised a similar question on GitHub: a macro may be conflicting with Uint. So I made the following change:

#ifdef Uint
#undef Uint
#endif
#include "rapidjson/document.h"
#include "rapidjson/stringbuffer.h"
#include "rapidjson/prettywriter.h"
  • Find where the macro is defined
    Compilation now passes, but since we forcibly undefined a macro, we should find out where Uint is defined in the files we depend on:
find . -name "*.h" | xargs grep "#define Uint"

./bc_out/lib2-64/ullib/output/include/ul_def.h:#define Uint(inp)    (unsigned int)(inp)
./lib2-64/ullib/include/ul_def.h:#define Uint(inp)  (unsigned int)(inp)

This locates the macro definition. To avoid breaking other code that relies on it, the macro is defined again after the rapidjson headers. The final version of the includes is:

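#ifdef Uint
#undef Uint
#define RESTORE_UINT_MACRO  // hypothetical flag: remembers that Uint was defined
#endif
#include "rapidjson/document.h"
#include "rapidjson/stringbuffer.h"
#include "rapidjson/prettywriter.h"
#ifdef RESTORE_UINT_MACRO
#define Uint(inp)   (unsigned int)(inp)
#undef RESTORE_UINT_MACRO
#endif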

At this point, compilation passes and the program runs correctly.
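
An alternative that avoids repeating the macro body is the push_macro/pop_macro pragma pair, a compiler extension supported by GCC, Clang and MSVC (older compilers may lack it). The sketch below is self-contained, so the macro definition and the FakeHandler struct are stand-ins for ul_def.h and the rapidjson headers; the helper names are hypothetical.

```cpp
// Stand-in for the project header that defines the conflicting macro
// (ul_def.h in the article).
#define Uint(inp) (unsigned int)(inp)

// Save the current definition of Uint, then remove it for the next section.
#pragma push_macro("Uint")
#undef Uint

// Stand-in for the rapidjson headers: a declaration that uses Uint as an
// ordinary member-function name, as reader.h does.
struct FakeHandler {
    unsigned last;
    FakeHandler() : last(0) {}
    void Uint(unsigned i) { last = i; }
};

// Defined while the macro is absent, so the member call is not macro-expanded.
inline unsigned call_handler(unsigned v) {
    FakeHandler h;
    h.Uint(v);
    return h.last;
}

// Restore the saved definition for the code that follows.
#pragma pop_macro("Uint")

// The macro works again here: Uint(x) expands to (unsigned int)(x).
inline unsigned macro_cast(double x) { return Uint(x); }
```

In a real project, the three rapidjson includes would take the place of FakeHandler between the two pragmas.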

RapidJSON is indeed much faster than Boost: for 80,000 rows of data, Boost took nearly 30 s, while RapidJSON took less than 2 s.

  • Other notes
    Macros are expanded by the preprocessor without regard to scope or namespace, so they can conflict with functions and variables of the same name. For a function-like macro, another workaround is to enclose the function name in parentheses, for example:
(std::min)(x, y);

Here min is no longer followed directly by "(", so a function-like macro named min is not expanded.
Of course, this only helps in your own code; it cannot be applied to a third-party library without modifying it.
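
The parenthesized-name trick can be checked with a small sketch. The min macro defined here is a stand-in for one leaked by some other header (windows.h without NOMINMAX is a well-known case), and the function name smaller is hypothetical.

```cpp
#include <algorithm>

// Stand-in for a third-party header that defines a function-like macro named min.
#define min(a, b) (((a) < (b)) ? (a) : (b))

inline int smaller(int x, int y) {
    // "min" is followed by ")" rather than "(", so the macro is not expanded
    // and the real std::min template is called.
    return (std::min)(x, y);
}

// Keep the stand-in macro from leaking into any code that follows.
#undef min
```

Writing std::min(x, y) without the extra parentheses inside smaller would be rewritten by the macro and fail to compile.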

2) How to insert one JSON document as a node into another?

(How to insert a document node into another document?)
For example, there are the following two documents (DOM):

Document person;
person.Parse("{\"person\":{\"name\":{\"first\":\"Adam\",\"last\":\"Thomas\"}}}");
 
Document address;
address.Parse("{\"address\":{\"city\":\"Moscow\",\"street\":\"Quiet\"}}");

Suppose we want to insert the entire address into person as a child node:

{
  "person": {
    "name": { "first": "Adam", "last": "Thomas" },
    "address": { "city": "Moscow", "street": "Quiet" }
  }
}

When inserting nodes, you need to pay attention to the lifetimes of the documents and values, and use the allocator correctly for memory allocation and management. A simple and effective approach is to change the definition of the address variable above so that it is constructed with person's allocator, and then add it under the root:

Document address(&person.GetAllocator());
...
person["person"].AddMember("address", address["address"], person.GetAllocator());

Of course, if you don't want to get the value of the address by explicitly writing out the key, you can use an iterator:

auto addressRoot = address.MemberBegin();
person["person"].AddMember(addressRoot->name, addressRoot->value, person.GetAllocator());

Alternatively, deep-copy the address value into person's allocator; in this case address can keep its own allocator:

Value addressValue(address["address"], person.GetAllocator());
person["person"].AddMember("address", addressValue, person.GetAllocator());

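3) Improper use of interfaces will cause memory leaks

  1. Reuse of Document objects
    Reusing one Document object for repeated parsing can behave like a memory leak: memory taken from the document's allocator by earlier parses is not returned, so usage keeps growing, as in the following snippet: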
#include <rapidjson/document.h>
#include <string>

int main() {
  rapidjson::Document doc;
  for (int i=0; i<1000000; ++i) {
      std::string a = "{\"b\":1" + std::to_string(i) + "}";
      doc.Parse(a.c_str());  // each parse allocates from the same allocator
  }
  return 0;
}

Reference: https://github.com/Tencent/rapidjson/issues/1333. The solution is to swap the document with a fresh temporary once the parsed result is no longer needed:

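#include <rapidjson/document.h>
#include <string>

int main() {
  rapidjson::Document doc;
  for (int i=0; i<1000000; ++i) {
      std::string a = "{\"b\":1" + std::to_string(i) + "}";
      doc.Parse(a.c_str());
      // Swap with a fresh Document: tmpdoc takes over the old allocator and
      // releases its memory when it goes out of scope at the end of the iteration.
      rapidjson::Document tmpdoc;
      doc.Swap(tmpdoc);
  }
  return 0;
}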
  2. Using a Value pointer to hold a Document

If a Document is held and deleted through a Value pointer, memory leaks. The reason is that the destructor of Value is not virtual, so the Document destructor never runs, and the compiler gives no warning even with -Wall.
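
The safe pattern is to let the owning pointer carry the derived type. The sketch below uses stand-in types (ValueLike and DocumentLike are hypothetical, chosen to mirror the Value/Document relationship) so that it compiles without rapidjson; deleting through the base pointer is undefined behavior and is deliberately not shown.

```cpp
#include <memory>

// Counts "memory pools" that are currently alive.
static int g_live_pools = 0;

// Stand-in for rapidjson::Value: the destructor is not virtual.
struct ValueLike {
    ~ValueLike() {}
};

// Stand-in for rapidjson::Document: owns a resource on top of the base.
struct DocumentLike : ValueLike {
    DocumentLike()  { ++g_live_pools; }   // "allocates" its memory pool
    ~DocumentLike() { --g_live_pools; }   // "frees" its memory pool
};

// Safe: the owning pointer has the derived type, so ~DocumentLike() runs.
inline int pools_after_owning_as_document() {
    {
        std::unique_ptr<DocumentLike> doc(new DocumentLike);
        ValueLike* view = doc.get();   // a non-owning base pointer is fine
        (void)view;
    }
    return g_live_pools;
}
```

With rapidjson itself, the same rule would mean storing std::unique_ptr of Document (or a Document by value) and passing Value references or non-owning pointers to code that only reads the data.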

4) Shallow copy to add string elements

  • code example
#include "rapidjson/document.h"
#include "rapidjson/prettywriter.h"
#include "rapidjson/stringbuffer.h"
#include <iostream>

using namespace std;
using namespace rapidjson;

int main() {
    Document doc;
    doc.SetObject();
    Document::AllocatorType &allocator = doc.GetAllocator();

    string value1 = "value1";
    doc.AddMember("key1", StringRef(value1.c_str(), value1.size()), allocator);
    string v = doc["key1"].GetString();
    cout << v.c_str() << endl; // prints value1

    value1 = "abc";
    v = doc["key1"].GetString();
    cout << v.c_str() << endl; // prints abc

    system("pause");
    return 0;
}

The output shows that AddMember with StringRef only stores a pointer to the string, that is, a shallow copy. So with code like the following, the output is undefined:

void addMem(Document& doc) {
    string value = "123456";
    doc.AddMember("key", StringRef(value.c_str(), value.size()), doc.GetAllocator());
}

int main() {
    Document doc;
    doc.SetObject();
    addMem(doc);

    string v = doc["key"].GetString();
    cout << v.c_str() << endl; // the local string has been destroyed, so this prints garbage

    system("pause");
    return 0;
}
  • The cause
    AddMember with StringRef is a shallow copy throughout. So when adding a string value to doc, do not pass a local variable whose lifetime is shorter than the document's. The same applies to the key.

  • Correct approach

void addMem(Document& doc) {
    string value = "123456";
    Value v(kStringType);
    // Important: the allocator must be passed as an argument; otherwise this is still a shallow copy.
    v.SetString(value.c_str(), value.size(), doc.GetAllocator());
    doc.AddMember("key", v, doc.GetAllocator());
}

5) Other pitfalls

1. Assignment is a move operation.

rapidjson::Value a(123);
rapidjson::Value b(456);
b = a; // a becomes Null and b becomes the number 123; this is done for performance

2. The constant array only stores pointers.
3. An object (map) is implemented as an array of members, so lookup by key is slow.
4. When using operator[] on an object, the key must exist; otherwise the program core dumps.
5. AddMember() and PushBack() also use move semantics. To deep-copy a Value instead:

Document d;
Document::AllocatorType& a = d.GetAllocator();

Value v1("foo");
// Value v2(v1); // not allowed
Value v2(v1, a); // makes a clone; v1 is unchanged

v2.CopyFrom(d, a); // copies the whole document into v2; d is unchanged

6. JSON performance evaluation document

https://github.com/miloyip/nativejson-benchmark

7. rapidjson design and implementation

Document Portal

8. cjson, rapidjson, yyjson large integer accuracy comparison

Document Portal

Origin blog.csdn.net/weixin_43679037/article/details/128347621