Those invisible characters walking in the sunlight

This post assumes you already know the Unicode character set. If you do not, you can read up on it first, or wait for a later article that introduces Unicode.

 

Background

 

Today we mainly talk about those invisible characters walking in the sunlight. In computer science and telecommunications, an invisible character is known as a control character or non-printing character: a code point in a character set that does not represent a written symbol and that, in general, is not visible when text is rendered.

In the front-end world, the MDN documentation covers this topic, for example in the Escape Notation section of the String page.

We can try these escape characters, such as '\f' and '\b': create a string variable from them in the console and nothing visible appears, yet the length of the string is not 0.
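As a quick check of this behavior, here is a minimal sketch that can be pasted into a browser console or run with Node. The variable names are only illustrative.

```javascript
// Form feed and backspace are control characters: they carry no glyph
const invisible = '\f\b'
console.log(invisible.length) // 2
console.log(invisible === '') // false

// The zero-width characters used later in this post behave the same way
const zeroWidth = '\u200B\u200C'
console.log(zeroWidth.length) // 2
```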

We can find the relevant description in the ECMAScript standard:

A string literal is zero or more characters enclosed in single or double quotes. Each character may be represented by an escape sequence. All characters may appear literally in a string literal except for the closing quote character, backslash, carriage return, line separator, paragraph separator, and line feed. Any character may appear in the form of an escape sequence.

In other words, every character except "closing quote character, backslash, carriage return, line separator, paragraph separator, and line feed" may appear literally in a string.

 

Tools

 

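Overall, the most useful property of invisible characters is precisely that they are invisible, so we can use them to produce text that carries hidden information and display it somewhere. How do we build a complete set of tools for this? Let's break the task down:

  1. Convert the specific information into invisible characters, i.e. invisible encryption
  2. Combine the invisible characters with the information to be displayed (the plaintext) to produce the final display text, i.e. the combination of plaintext and ciphertext
  3. Parse the display text produced in the previous step to recover the original, visible information, i.e. invisible decryption

With these steps defined, we can start development.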

 

Invisible encryption

const zeroWidthSpace = '\u200B'
const zeroWidthJoiner = '\u200D'
const zeroWidthNonJoiner = '\u200C'
const zeroWidthNonBreakSpace = '\uFEFF'

// Convert a signature into a string made only of zero-width characters.
function createEncryptionText(text) {
  if (!text || typeof text !== 'string') {
    throw new Error('invalid param, param must be string')
  }

  const binaryText = textToBinary(text)
  return binaryText
    .split('')
    .map(b => {
      const num = parseInt(b, 10)
      // bit 1 -> zero width space
      if (num === 1) {
        return zeroWidthSpace
      }

      // bit 0 -> zero width non-joiner
      if (num === 0) {
        return zeroWidthNonJoiner
      }

      // anything else (the space between characters) -> zero width joiner
      return zeroWidthJoiner
    })
    .join(zeroWidthNonBreakSpace)
}

// Binary representation of a single UTF-16 code unit.
function charToBinary(char) {
  return char.charCodeAt(0).toString(2)
}

// 'ab' -> '01100001 01100010'
function textToBinary(text) {
  return text
    .split('')
    .map(item => padStar(charToBinary(item)))
    .join(' ')
}

// Left-pad text with chars until it is at least length characters long.
function padStar(text, length = 8, chars = '0') {
  if (typeof text !== 'string') {
    throw new Error('invalid params. text must be string')
  }

  // padStart avoids the commas that Array + string concatenation would insert
  return text.padStart(length, chars)
}

console.log(createEncryptionText('wfsovereign')) // looks like an empty string
console.log(createEncryptionText('wfsovereign').length) // 195

 

 

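Here we first prepare a few invisible Unicode characters to substitute for the text to be encrypted (called the signature from here on), then define the substitution rule: convert the signature to binary and replace it bit by bit. The encrypted output above looks like an empty string, yet its length is 195, which shows that we have successfully turned the signature into invisible text.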
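The figure 195 can be checked by hand. The sketch below assumes every character of the signature fits in 8 bits, which holds for ASCII text such as 'wfsovereign'; the variable names are only illustrative.

```javascript
const signature = 'wfsovereign'

// 11 characters, each padded to 8 bits
const bits = signature.length * 8 // 88
// textToBinary puts one space between neighbouring characters
const separators = signature.length - 1 // 10
// every bit and every separator becomes one zero-width symbol
const symbols = bits + separators // 98
// join() places one zeroWidthNonBreakSpace between neighbouring symbols
const joiners = symbols - 1 // 97

const expectedLength = symbols + joiners
console.log(expectedLength) // 195
```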

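We will not go into points 2 and 3 in detail here. I extracted the whole encryption and decryption logic, together with the extraction of the invisible code points, into a ZeroWidthCharacterEncryptionManager class and put the code on my GitHub; interested readers can take a look there. Two things are worth mentioning. First, invisible decryption must reverse the encryption rules one to one, otherwise the original signature cannot be recovered. Second, to extract the hidden part from a piece of text I use a regular expression, and how it is written depends on which invisible code points were chosen. For the ones chosen above (zeroWidthSpace, zeroWidthNonJoiner, zeroWidthJoiner, and zeroWidthNonBreakSpace), the corresponding expression is /[\u200B-\u200D\uFEFF]+/; the range runs to \u200D so that the zeroWidthJoiner used as the separator between characters is matched as well.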
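To make steps 2 and 3 more concrete, here is a minimal sketch. It is not the author's ZeroWidthCharacterEncryptionManager class: the function names encodeSignature, embedSignature, and extractSignature are hypothetical, and placing the signature after the first visible character is an arbitrary assumption. The character class includes \u200D because the encoding uses the zero width joiner as the separator between characters.

```javascript
const ZERO_WIDTH_SPACE = '\u200B' // encodes bit 1
const ZERO_WIDTH_NON_JOINER = '\u200C' // encodes bit 0
const ZERO_WIDTH_JOINER = '\u200D' // separates encoded characters
const ZERO_WIDTH_NO_BREAK_SPACE = '\uFEFF' // sits between all symbols

// Step 1, same rule as createEncryptionText above, written compactly.
function encodeSignature(signature) {
  return signature
    .split('')
    .map(ch => ch.charCodeAt(0).toString(2).padStart(8, '0'))
    .join(' ')
    .split('')
    .map(b => {
      if (b === '1') return ZERO_WIDTH_SPACE
      if (b === '0') return ZERO_WIDTH_NON_JOINER
      return ZERO_WIDTH_JOINER
    })
    .join(ZERO_WIDTH_NO_BREAK_SPACE)
}

// Step 2 (hypothetical): hide the signature after the first visible character.
function embedSignature(visibleText, signature) {
  return visibleText.slice(0, 1) + encodeSignature(signature) + visibleText.slice(1)
}

// Step 3 (hypothetical): find the first invisible run and reverse the mapping.
function extractSignature(displayText) {
  const match = displayText.match(/[\u200B-\u200D\uFEFF]+/)
  if (!match) {
    return null
  }

  return match[0]
    .split(ZERO_WIDTH_NO_BREAK_SPACE)
    .map(c => {
      if (c === ZERO_WIDTH_SPACE) return '1'
      if (c === ZERO_WIDTH_NON_JOINER) return '0'
      return ' '
    })
    .join('')
    .split(' ')
    .map(bits => String.fromCharCode(parseInt(bits, 2)))
    .join('')
}

const displayText = embedSignature('Hello, world', 'wfsovereign')
console.log(displayText.length) // 207: 12 visible + 195 invisible
console.log(extractSignature(displayText)) // wfsovereign
console.log(extractSignature('Hello, world')) // null
```

This sketch assumes the visible text contains none of these four code points itself; if it did, the extracted run could no longer be decoded reliably.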

 

Applications

 

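With the tool class above, one application is to add an invisible signature or watermark to a piece of text. If the text we produce is later circulated by others, the invisible signature lets us check whether it originated from us. It could also help with copyright protection, much like the way some websites automatically append the source when you copy their content.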
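Before decoding anything, a first check on a piece of circulated text is simply whether it carries hidden characters at all. A minimal sketch, assuming the same four code points as above; the function name findHiddenRuns is hypothetical.

```javascript
// Matches every run of the zero-width code points used in this post
const HIDDEN_RUN = /[\u200B-\u200D\uFEFF]+/g

// Returns the length of each invisible run found in the text
function findHiddenRuns(text) {
  return (text.match(HIDDEN_RUN) || []).map(run => run.length)
}

console.log(findHiddenRuns('plain text').length) // 0
console.log(findHiddenRuns('wa\u200B\uFEFF\u200Cter')) // one run of length 3
```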

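Are there other uses beyond this? That is up to your own imagination :)

ps: Summarize promptly and reflect with a calm mind; like a youth riding the wind, keep forging ahead.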


References:

  1. Control character
  2. String (MDN)
  3. Be careful what you copy: Invisibly inserting usernames into text with Zero-Width Characters
  4. Zero-width non-joiner
  5. Zero-width space

Origin www.cnblogs.com/wfsovereign/p/12054768.html