#R# #gsub()# #Learning Regular Expressions 1#

gsub(pattern, replacement, x, ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE)

grep(pattern, x, ignore.case = FALSE, perl = FALSE, value = FALSE, fixed = FALSE, useBytes = FALSE, invert = FALSE)

grepl(pattern, x, ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE)

sub(pattern, replacement, x, ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE)

regexpr(pattern, text, ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE)

gregexpr(pattern, text, ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE)

regexec(pattern, text, ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE)
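
The last three signatures are not described further in this post, so here is a minimal sketch of what they return, assuming the base R helper regmatches() is used to pull out the matched text. The string txt is a made-up example, not something from the original.

```r
# Hypothetical input: letters followed by runs of digits.
txt <- "a1b22c333"

# regexpr(): position and length of the first match only.
first <- regexpr("[0-9]+", txt)
first_text <- regmatches(txt, first)          # "1"

# gregexpr(): positions of every match (a list, one entry per input string).
all_m <- gregexpr("[0-9]+", txt)
all_text <- regmatches(txt, all_m)[[1]]       # "1" "22" "333"

# regexec(): first match plus its parenthesised sub-expressions.
parts <- regexec("([a-z])([0-9]+)", txt)
parts_text <- regmatches(txt, parts)[[1]]     # "a1" "a" "1"

print(first_text)
print(all_text)
print(parts_text)
```

These functions only locate matches; nothing is replaced, which is the main difference from sub() and gsub().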

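Overview of the commonly used functions:

gsub(): replaces every match of pattern in x and returns the modified text; if nothing matches, x is returned unchanged

sub(): replaces only the first match

Note: sub() replaces the first match within each string. x may be a vector, in which case every element is processed separately, for example: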

a <- c(12,123,234)
x <- c(a,a,a)
sub("2","r",a)
sub("2","r",x)
gsub("2","r",a)
gsub("2","r",x)

Run the code to see how the results compare.
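
One thing worth noticing: every element of a contains exactly one "2", so sub() and gsub() return the same result for it. The sketch below repeats that case and then adds a hypothetical vector b (not in the original) whose elements contain several "2"s, where the two functions differ.

```r
# Numeric input is coerced to character before matching.
a <- c(12, 123, 234)
sub_a  <- sub("2", "r", a)     # "1r" "1r3" "r34"
gsub_a <- gsub("2", "r", a)    # identical: only one "2" per element

# Hypothetical vector with repeated matches inside each element.
b <- c("1222", "2020", "345")
first_only <- sub("2", "r", b)    # "1r22" "r020" "345"
all_hits   <- gsub("2", "r", b)   # "1rrr" "r0r0" "345"

print(first_only)
print(all_hits)
```

The element "345" has no match, so both functions return it unchanged.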

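grep(): searches for pattern in x. By default it returns the indices of the matching elements; with value = TRUE it returns the matching elements themselves

grepl(): searches for pattern in x and returns a logical vector, TRUE for each element that matches and FALSE otherwise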
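
A short sketch of the two search functions side by side. The vector fruits is an invented example; the arguments value, invert and ignore.case come from the signatures at the top of this post.

```r
# Hypothetical data for illustration.
fruits <- c("apple", "banana", "cherry", "Avocado")

idx    <- grep("an", fruits)                  # indices of matches: 2
vals   <- grep("an", fruits, value = TRUE)    # matching elements: "banana"
others <- grep("an", fruits, value = TRUE, invert = TRUE)  # non-matching elements

flags    <- grepl("^a", fruits)                      # TRUE FALSE FALSE FALSE
flags_ic <- grepl("^a", fruits, ignore.case = TRUE)  # TRUE FALSE FALSE TRUE

print(idx)
print(vals)
print(others)
print(flags)
print(flags_ic)
```

Because grepl() always returns one logical value per element, it is convenient for subsetting, e.g. fruits[grepl("^a", fruits)].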

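pattern: what to replace, given as a regular expression (a pattern used to search for and replace text that follows a given rule)

replacement: what to replace it with

x: where to perform the replacement (the text to search in)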


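ignore.case: FALSE makes matching case-sensitive; TRUE ignores case

perl: whether to use Perl-compatible regular expressions (regexps) (TRUE/FALSE)

fixed: if TRUE, pattern is a string to be matched as is, not a regular expression. Overrides all conflicting arguments

useBytes: defaults to FALSE; if TRUE, matching is done byte by byte rather than character by character.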
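
The effect of fixed, ignore.case and perl is easiest to see on a pattern that contains a regex metacharacter. This is a minimal sketch with a made-up string s; the lookbehind pattern is just one example of syntax that needs perl = TRUE.

```r
# Hypothetical input containing literal dots.
s <- "a.b.c"

dot_regex   <- gsub(".", "-", s)                # "." matches any character: "-----"
dot_fixed   <- gsub(".", "-", s, fixed = TRUE)  # pattern taken literally: "a-b-c"
dot_escaped <- gsub("\\.", "-", s)              # escaped dot, same result: "a-b-c"

# ignore.case = TRUE lets "a" match both "a" and "A".
case_off <- gsub("a", "x", c("a.b.c", "A+B"), ignore.case = TRUE)  # "x.b.c" "x+B"

# Lookbehind is Perl syntax, so it needs perl = TRUE:
# replace only a dot that directly follows "a".
look <- gsub("(?<=a)\\.", "_", s, perl = TRUE)  # "a_b.c"

print(dot_regex)
print(dot_fixed)
print(dot_escaped)
print(case_off)
print(look)
```

fixed = TRUE is a handy shortcut when the text to replace contains characters such as . + ( or [ and no pattern matching is needed.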

Difference between gsub() and sub()

sub() replaces only the first match; gsub() replaces all matches

Escape sequences
Syntax	Description
\\d	Digit, 0,1,2 ... 9
\\D	Not Digit
\\s	Whitespace
\\S	Not whitespace
\\w	Word character (letter, digit or underscore)
\\W	Not a word character
\\t	Tab
\\n	New line
^	Beginning of the string
$	End of the string
\	Escape special characters, e.g. \\ is "\", \+ is "+"
|	Alternation match, e.g. (e|d)n matches "en" and "dn"
.	Any character, except \n or line terminator
[ab]	a or b
[^ab]	Any character except a and b
[0-9]	All Digit
[A-Z]	All uppercase A to Z letters
[a-z]	All lowercase a to z letters
[A-Za-z]	All uppercase and lowercase letters A to Z ([A-z] would also match the punctuation that sits between Z and a in ASCII)
i+	i at least one time
i*	i zero or more times
i?	i zero or 1 time
i{n}	i occurs n times in sequence
i{n1,n2}	i occurs n1 - n2 times in sequence
i{n1,n2}?	Non-greedy match: as few repetitions as possible between n1 and n2
i{n,}	i occurs >= n times
[:alnum:]	Alphanumeric characters: [:alpha:] and [:digit:]
[:alpha:]	Alphabetic characters: [:lower:] and [:upper:]
[:blank:]	Blank characters: e.g. space, tab
[:cntrl:]	Control characters
[:digit:]	Digits: 0 1 2 3 4 5 6 7 8 9
[:graph:]	Graphical characters: [:alnum:] and [:punct:]
[:lower:]	Lower-case letters in the current locale
[:print:]	Printable characters: [:alnum:], [:punct:] and space
[:punct:]	Punctuation character: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
[:space:]	Space characters: tab, newline, vertical tab, form feed, carriage return, space
[:upper:]	Upper-case letters in the current locale
[:xdigit:]	Hexadecimal digits: 0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f
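
A few rows of the table in action, as a rough sketch with invented inputs. Two details are easy to miss: backslashes are doubled inside R strings, and a POSIX class such as [:punct:] only works inside a bracket expression, so it is written [[:punct:]].

```r
# Extract all runs of digits (hypothetical sentence).
x <- "Order 66: 3 apples, 12 pears!"
numbers <- regmatches(x, gregexpr("[0-9]+", x))[[1]]   # "66" "3" "12"

# POSIX class inside brackets: strip punctuation.
no_punct <- gsub("[[:punct:]]", "", "a,b.c!")          # "abc"

# \\s+ collapses any run of whitespace (spaces, tabs) into one space.
squeezed <- gsub("\\s+", " ", "a   b \t c")            # "a b c"

# Greedy vs non-greedy quantifier.
greedy <- sub("<.+>", "", "<b>bold</b>")               # "" (match runs to the last ">")
lazy   <- sub("<.+?>", "", "<b>bold</b>")              # "bold</b>" (stops at the first ">")

# Alternation and counted repetition.
alt   <- grepl("(e|d)n", c("en", "dn", "an"))              # TRUE TRUE FALSE
count <- grepl("^i{2,3}$", c("i", "ii", "iii", "iiii"))    # FALSE TRUE TRUE FALSE

print(numbers)
print(no_punct)
print(squeezed)
print(greedy)
print(lazy)
```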

See also http://www.endmemo.com/program/R/gsub.php

https://www.cnblogs.com/wheng/p/6262737.html
