Solution for processing Decimal128 type data in Amazon DocumentDB

A simple math problem

Before we begin, let's work through a simple math problem: 0.1 × 0.2 = ? Many readers will smile: 0.02, an answer any child could give. Let's put the same problem to a computer and see what it says.

Please welcome the first contestant, Java:

Java Code:
class Main {
 public static void main(String[] args) {
   System.out.println(0.1 * 0.2);
 }
}

The computer gave the answer:

0.020000000000000004

How about that? Are your palms starting to sweat? Please welcome the second contestant, Node.js:

Node.js Code:
> 0.1
0.1

> 0.2
0.2
> 0.1 * 0.2
0.020000000000000004

Still 0.020000000000000004. Is multiplication alone to blame? Let's try subtraction as well.

The last contestant, Go, is invited to enter:

Go Code:
package main

import "fmt"

func main() {
	a := 1024.1
	b := a * 100
	fmt.Println(b) // 102409.99999999999

	c := 2.6
	fmt.Println(a - c) // 1021.4999999999999
}

Is this a problem with Java or Go? No: other mainstream languages such as Python and Ruby give the same results, which may leave you feeling that your worldview has been turned upside down.

Don't doubt yourself; the limitation lies with the computer. Why can't CPUs as powerful as those from Intel and AMD, or Graviton, give the correct answer to such a simple math problem?

Here is the real reason: binary floating-point types are poorly suited to representing decimal fractions. Take the number 0.1. In binary it is the repeating fraction 0.000110011001100110011..., which has to be cut off and rounded to fit a fixed number of bits, for example to 0.0001100110011001101. As a result, many values lose precision and many results deviate slightly from the exact answer.
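To see the rounding described above directly, the following minimal sketch prints the exact value that a Java double stores for 0.1. It relies on the fact that the `BigDecimal(double)` constructor converts the binary value exactly, without rounding. The class and method names (`ExactDoubleValue`, `exactValue`) are made up for this illustration.

```java
import java.math.BigDecimal;

class ExactDoubleValue {
    // Returns the exact decimal expansion of the binary double that stores x.
    // BigDecimal(double) performs an exact conversion, so nothing is rounded here.
    static BigDecimal exactValue(double x) {
        return new BigDecimal(x);
    }

    public static void main(String[] args) {
        // The double closest to 0.1 is slightly larger than 0.1.
        System.out.println(exactValue(0.1));
        // prints 0.1000000000000000055511151231257827021181583404541015625

        // Comparing against the exact decimal 0.1 confirms the small excess.
        System.out.println(exactValue(0.1).compareTo(new BigDecimal("0.1")) > 0);
        // prints true
    }
}
```

The tiny excess in the stored value is what surfaces as the trailing 4 in 0.020000000000000004.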

In everyday life this rarely matters. The temperature and humidity in a weather forecast are only a guide to how it feels outside: 35.79999992 degrees Celsius will not feel cooler than 36 degrees Celsius or hotter than 35.5 degrees Celsius. When you shop at the supermarket, the cashier will not ask you to pay 12.133333 yuan, 0.133333 yuan more than 12 yuan. In high-precision calculations, however, a loss of numerical precision can seriously distort the final result or even reverse it. So how can we calculate while preserving numerical precision?
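As a small illustration of how these errors build up over repeated operations, the sketch below adds 0.1 ten times, once with a binary double and once with Java's `BigDecimal`. The class and method names (`AccumulatedError`, `sumAsDouble`, `sumAsDecimal`) are hypothetical and exist only for this example.

```java
import java.math.BigDecimal;

class AccumulatedError {
    // Adds 0.1 to a running total `times` times using binary doubles.
    static double sumAsDouble(int times) {
        double sum = 0.0;
        for (int i = 0; i < times; i++) {
            sum += 0.1;
        }
        return sum;
    }

    // Performs the same additions with exact decimal arithmetic.
    static BigDecimal sumAsDecimal(int times) {
        BigDecimal step = new BigDecimal("0.1");
        BigDecimal sum = BigDecimal.ZERO;
        for (int i = 0; i < times; i++) {
            sum = sum.add(step);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Ten rounded additions no longer land exactly on 1.0.
        System.out.println(sumAsDouble(10) == 1.0); // prints false

        // The decimal total is numerically equal to 1.
        System.out.println(sumAsDecimal(10).compareTo(BigDecimal.ONE) == 0); // prints true
    }
}
```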

Decimal data format

Unlike the common approximate data types such as Float and Double, Decimal stores the exact original value. Decimal is designed specifically for base-10 arithmetic and makes up for the shortcomings of representing decimal fractions in binary. The schematic diagram below illustrates the principle.

image.png
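The idea can also be sketched in code. Java's `BigDecimal` is used here only as a stand-in: it is not the Decimal128 encoding itself, but it follows the same principle of storing an integer together with a power-of-ten scale, so decimal fractions are held exactly. The class and method names (`DecimalRepresentation`, `multiplyExactly`) are illustrative.

```java
import java.math.BigDecimal;

class DecimalRepresentation {
    // Multiplies two decimal strings using exact base-10 arithmetic.
    static BigDecimal multiplyExactly(String a, String b) {
        return new BigDecimal(a).multiply(new BigDecimal(b));
    }

    public static void main(String[] args) {
        // 12.3456 is stored as the integer 123456 with a scale of 4,
        // meaning 123456 x 10^-4. No binary fraction is involved.
        BigDecimal value = new BigDecimal("12.3456");
        System.out.println(value.unscaledValue()); // prints 123456
        System.out.println(value.scale());         // prints 4

        // The opening math problem now gives the answer a child would give.
        System.out.println(multiplyExactly("0.1", "0.2")); // prints 0.02
    }
}
```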

Decimal in MongoDB

As a widely used document database, MongoDB faces the same numerical precision issues. The Decimal128 type was introduced so that high-precision values, including extremely small ones, can be stored and read back exactly.

Amazon Cloud Technology offers Amazon DocumentDB, a cloud-native document database compatible with MongoDB. Built on an architecture that separates compute from storage, it provides cloud-native database features such as rapid cluster expansion, automatic streaming backup, scaling of the compute layer, and automatic expansion of the storage layer, which simplify database operations and improve efficiency. However, as of July 2022, DocumentDB does not support data in Decimal128 format. How can we solve this problem?

Seeing through to the essence: everything is a "String"

Numbers and decimal fractions can also be written as characters, so a Decimal value can be viewed as an extension of the character format. What is the essential difference between Decimal128(14.999999) and Decimal('14.999999')? That question is left for readers to think about. Below, we walk through a solution to the compatibility problem between DocumentDB and Decimal128. Let's get started!

This solution shows how to convert Decimal128 data to String during a short downtime window, which handles the format conversion for existing data, and then performs an offline migration from MongoDB to DocumentDB with Amazon Database Migration Service (DMS).

Code:
##Mongo shell statements, run against MongoDB
##Switch to the poc database
use poc;

##Create the origin collection and insert two test documents whose value field is Decimal128
db.origin.insertMany( [
  { "_id": 1, "item": "Byte", "value": Decimal128("1.333333") },
  { "_id": 2, "item": "Bit", "value": Decimal128("2.666666") }
] )

##The result shows that the insert succeeded
{ acknowledged: true, insertedIds: { '0': 1, '1': 2 } }

##Verify that the data exists
db.origin.find();
##The result confirms that the data was created
[
  { _id: 1, item: 'Byte', value: Decimal128("1.333333") },
  { _id: 2, item: 'Bit', value: Decimal128("2.666666") }
]

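##Start the conversion: convert the Decimal128 value field to a String, save it as a new field named newvalue, and write the aggregation output to a new collection named newtable in the poc database

db.getSiblingDB("poc").origin.aggregate( [
  {
    $addFields: {
      newvalue: { $toString: "$value" }
    }
  },
  { $out: "newtable" }
] )

##Confirm that the output succeeded: in addition to the original collection origin, there is now a new collection newtable
show tables;

##Result
newtable
origin

##Look at the data in the new collection newtable after the conversion
db.newtable.find();

##The result shows that, in addition to the three fields _id, item, and value from the original collection origin, there is a new field newvalue whose value equals the original Decimal128 value field and is a String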

[
  {
    _id: 1,
    item: 'Byte',
    value: Decimal128("1.333333"),
    newvalue: '1.333333'
  },
  {
    _id: 2,
    item: 'Bit',
    value: Decimal128("2.666666"),
    newvalue: '2.666666'
  }
]

##After comparison the data is correct, so we remove the original Decimal128 value field
db.newtable.updateMany(
  { "_id": { $gt: 0 } },
  { $unset: { value: "" } }
)

##Then rename the String field newvalue to value
db.newtable.updateMany(
  { "_id": { $gt: 0 } },
  { $rename: { "newvalue": "value" } }
)

##Both documents were modified; the following block is console output and does not need to be run
{
  acknowledged: true,
  insertedId: null,
  matchedCount: 2,
  modifiedCount: 2,
  upsertedCount: 0
}

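##Confirm the data
db.newtable.find();

##The value field in the returned data is now a String; the following block is console output and does not need to be run
[
  { _id: 1, item: 'Byte', value: '1.333333' },
  { _id: 2, item: 'Bit', value: '2.666666' }
]

##Log in to Amazon DocumentDB with the native Mongo shell client
##Bash statement: replace YOUR_DOCUMENTDB_ENDPOINT with the DocumentDB endpoint address of your environment and YOUR_USER_NAME with your DocumentDB user
##This example uses a custom DocumentDB parameter group with TLS disabled; in production, we recommend keeping TLS enabled
mongosh --host YOUR_DOCUMENTDB_ENDPOINT -u YOUR_USER_NAME -p

##Enter the user password to log in
Enter password: *************

##DocumentDB statements
##Switch to the poc database
use poc;

##List the collections
show tables;

##The result is empty: no collections exist in our database yet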

Migration from MongoDB to DocumentDB

In addition to MongoDB's native mongodump/mongorestore tools, we can migrate data with Amazon Database Migration Service (DMS), using MongoDB as the data source and Amazon DocumentDB as the data target. This example uses the latter.

1. Find the DMS service through the console, and click to enter the DMS console

image.png

2. Click [Subnet Group] on the left menu bar, and then click [Create Subnet Group] in the upper right corner

image.png

3. Create a custom subnet group. If MongoDB and DocumentDB are connected over a private network built on a dedicated line or VPN, you can create the custom subnet group in a private subnet as shown in the figure; otherwise, create the custom subnet group in a public subnet.

image.png

4. Click [Create Subnet Group] to complete the subnet creation

5. Create a replication instance

image.png

image.png

6. If MongoDB and DocumentDB are connected over a private network built on a dedicated line or VPN, you can deselect the [Public Access] option as shown in the figure; otherwise, select the [Public Access] option.

image.png

7. Create an endpoint

image.png

7.1 Create a source endpoint with MongoDB as the engine

image.png

7.2 Replace the content in the red box according to your actual situation

image.png

7.3 Create a Target Endpoint Targeting Amazon DocumentDB

image.png

image.png

7.4 Use Secrets Manager to manage DocumentDB account information (optional)

For details, see the dedicated blog post listed as reference 2 at the end of this article.

8. Create a migration task

image.png

8.1 Create a migration task using the replication instance, source endpoint, and target endpoint we created earlier

image.png

8.2 In the table mappings section, create a selection rule that selects the newtable collection in the poc database, and then click to create the task.

image.png

8.3 Wait for the migration task to finish loading and for the progress to reach 100%

image.png

At this point, the existing data has been migrated to DocumentDB through this solution combined with DMS, with the Decimal128 values converted to strings. Let's verify the result.

##Log in to DocumentDB
##DocumentDB statements
mongosh --host YOUR_DOCUMENTDB_ENDPOINT -u YOUR_USER_NAME -p

##Enter the user password to log in
Enter password: *************

##Switch to the poc database
use poc;

##List the collections
show tables;

##DMS has synchronized the collection to DocumentDB
newtable

##Verify the data
db.newtable.find();

##The result matches our expectations
[
  { _id: 1, item: 'Byte', value: '1.333333' },
  { _id: 2, item: 'Bit', value: '2.666666' }
]

Convert Decimal128 to Java BigDecimal

With the solution above, we converted Decimal128 to String and stored it in the database, which preserves precision. However, values stored as strings cannot take part in calculations directly. How do we solve this problem?
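A short sketch makes the limitation concrete: treated as text, the stored values concatenate instead of adding and compare character by character instead of by magnitude. The class and method names (`StringVersusNumber`, `greaterAsText`, `greaterAsNumber`) are made up for this illustration, and the sample values are the ones used in the test collection plus two invented ones for the comparison.

```java
import java.math.BigDecimal;

class StringVersusNumber {
    // Compares two values as text, character by character.
    static boolean greaterAsText(String a, String b) {
        return a.compareTo(b) > 0;
    }

    // Compares the same two values by their numeric magnitude.
    static boolean greaterAsNumber(String a, String b) {
        return new BigDecimal(a).compareTo(new BigDecimal(b)) > 0;
    }

    public static void main(String[] args) {
        // "Adding" two strings only joins their characters.
        System.out.println("1.333333" + "2.666666"); // prints 1.3333332.666666

        // As text, "10.5" sorts before "9.5" because '1' comes before '9'.
        System.out.println(greaterAsText("10.5", "9.5"));   // prints false
        System.out.println(greaterAsNumber("10.5", "9.5")); // prints true
    }
}
```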

In Java, a Decimal128 value cannot be used in calculations directly; it first has to be converted to BigDecimal before any processing or arithmetic. Since a decimal value can be written out as a String without losing information, can a String be handled in the same way?

The answer is yes: Java's public class BigDecimal meets this need. The following Java sample code shows how to use this class to convert in both directions, for reference.

##Java Code
##Convert String to Java BigDecimal
// Import the BigDecimal public class
import java.math.BigDecimal;

// Define the public class String2BD
public class String2BD {
    public static void main(String[] args) {
        String inputstring = "12.3456";
        BigDecimal bd = new BigDecimal(inputstring);
        System.out.println(bd);
    }
}

This converts the input string "12.3456" to the number 12.3456. Use it to read string data from the database and convert it to Java's BigDecimal.
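When the strings come from a database, some may be missing or malformed, and `new BigDecimal(String)` throws `NumberFormatException` for text that is not a valid decimal. The sketch below shows one possible way to guard the conversion; `SafeDecimalParser` and `parse` are hypothetical names, not part of the original solution.

```java
import java.math.BigDecimal;
import java.util.Optional;

class SafeDecimalParser {
    // Hypothetical helper: returns an empty Optional instead of throwing
    // when the stored text is null or is not a valid decimal number.
    static Optional<BigDecimal> parse(String text) {
        if (text == null) {
            return Optional.empty();
        }
        try {
            // trim() removes surrounding whitespace, which BigDecimal rejects.
            return Optional.of(new BigDecimal(text.trim()));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("12.3456"));      // prints Optional[12.3456]
        System.out.println(parse("not a number")); // prints Optional.empty
    }
}
```

Returning an `Optional` leaves it to the caller to decide whether a bad value should be skipped, logged, or treated as an error.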

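##Java Code
##Convert Java BigDecimal to String
// Import the BigDecimal public class
import java.math.BigDecimal;

// Define the public class BD2String
public class BD2String {
    public static void main(String[] args) {
        // Construct from a string: new BigDecimal(65.4321) would capture the binary double's rounding error
        BigDecimal inputbd = new BigDecimal("65.4321");
        String outputstring = inputbd.toString();
        System.out.println(outputstring);
    }
}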

This converts the BigDecimal value 65.4321 to the string "65.4321", which can then be saved back to the database in string format.
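Putting the two directions together, a round trip might look like the sketch below: read two stored strings, add them as BigDecimal values, and turn the sum back into a string for storage. `StringRoundTrip` and `addStored` are illustrative names, and the inputs are the two sample values from the test collection. The sketch uses `toPlainString()` rather than `toString()` as a design choice, because `toString()` can switch to scientific notation for some values (for example, `new BigDecimal("1E+3").toString()` returns "1E+3").

```java
import java.math.BigDecimal;

class StringRoundTrip {
    // Adds two decimal values held as strings and returns the sum as a
    // plain decimal string, ready to be written back to the database.
    static String addStored(String a, String b) {
        BigDecimal sum = new BigDecimal(a).add(new BigDecimal(b));
        return sum.toPlainString();
    }

    public static void main(String[] args) {
        // Values as they would be read from the value field of newtable.
        System.out.println(addStored("1.333333", "2.666666")); // prints 3.999999
    }
}
```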

Summary

This solution uses String in place of Decimal128 to migrate existing data. For new data, Java's BigDecimal public class provides efficient two-way conversion between String and BigDecimal, which addresses the need for Decimal128-style values in DocumentDB. New DocumentDB features are released continuously, so stay tuned.

Reference links:

1. Quickly understand Decimal

What is a Decimal? Definition, Properties, Types, Examples, Facts

2. Use Secrets Manager to manage DMS endpoints

Manage your AWS DMS endpoint credentials with Amazon Secrets Manager | Amazon Database Blog

3. Java public class BigDecimal from Oracle

BigDecimal (Java Platform SE 8)

The author of this article

image.png

Fu Xiaoming

Amazon cloud solution architect, responsible for consulting and architectural design of cloud computing solutions, and also committed to the research and promotion of database and edge computing. Before joining Amazon Cloud Technology, he was responsible for the design of the Internet brokerage architecture in the IT department of the financial industry, and has extensive experience in distributed, high concurrency, and middleware.

Article source: https://dev.amazoncloud.cn/column/article/6309d3e2d4155422a4610a4d?sc_medium=regulartraffic&sc_campaign=crossplatform&sc_channel=CSDN 

Origin blog.csdn.net/u012365585/article/details/132031433