Rlang(3) Factor - String - Function and Package

1. Factor
Download the sample data from:
http://dapengde.com/wp-content/uploads/2013/03/dapengde_DummyR_PMBeijing.csv
http://dapengde.com/wp-content/uploads/2013/03/dapengde_DummyR_PMZhengzhou.csv

Load both files:
bj <- read.csv(file="/opt/data/dapengde_DummyR_PMBeijing.csv")
zz <- read.csv(file="/opt/data/dapengde_DummyR_PMZhengzhou.csv")

Add a city column to each data frame:
bj$city <- "Beijing"
zz$city <- "Zhengzhou"

Combine the two data frames row by row:
data <- rbind(bj, zz)
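
A minimal sketch of the same steps on made-up data, for trying this without the CSV files. The names bj_demo, zz_demo and demo and all values are hypothetical.

```r
# Hypothetical miniature data standing in for the two CSV files
bj_demo <- data.frame(date = 1:3, pm = c(5, 27, 75))
zz_demo <- data.frame(date = 1:2, pm = c(40, 60))

# Tag each data frame with its city before stacking
bj_demo$city <- "Beijing"
zz_demo$city <- "Zhengzhou"

# rbind() stacks rows; both data frames must have the same column names
demo <- rbind(bj_demo, zz_demo)
nrow(demo)   # 5
names(demo)  # "date" "pm" "city"
```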

The combined data looks like this (the first column is the row index):
index date pm city
1        1      5    Beijing
...

Unlike summary(), str() shows the structure of the data: the type of each column and its first few values
> str(data)
'data.frame':    37 obs. of  3 variables:
$ date: int  1 2 3 4 5 6 7 8 9 10 ...
$ pm  : int  5 27 75 22 130 228 220 205 63 35 ...
$ city: chr  "Beijing" "Beijing" "Beijing" "Beijing" ...

Convert the city column to a factor:
data$city <- factor(data$city)

Check the structure again; city is now a factor with 2 levels:
str(data)

Count the levels:
nlevels(data$city)

List all the levels
levels(data$city)
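
A factor stores integer codes plus a table of labels. A small sketch on a made-up vector (the variable city is hypothetical) shows what sits behind levels():

```r
city <- factor(c("Beijing", "Zhengzhou", "Beijing"))

levels(city)      # "Beijing" "Zhengzhou" (sorted alphabetically by default)
as.integer(city)  # 1 2 1, the integer codes behind the labels
table(city)       # counts per level: Beijing 2, Zhengzhou 1
```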

Draw a box plot (plot() produces one when x is a factor):
plot(x = data$city, y = data$pm)

Print the mean pm value for each city:
> for( i in levels(data$city)) {
+   print(i)
+   print(mean(data$pm[data$city == i]))
+ }
[1] "Beijing"
[1] 125
[1] "Zhengzhou"
[1] 66.8

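The same logic with less code. (The loop above prints 125 where tapply shows 124.7; this session likely runs with a reduced digits option, so a single value is rounded to three significant digits.)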
> tapply(data$pm, data$city, mean)
  Beijing Zhengzhou
    124.7      66.8
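
Two other base R ways to get group means, sketched on made-up numbers (demo and its values are hypothetical). split() plus sapply() returns a named vector like tapply(), while aggregate() returns a data frame:

```r
demo <- data.frame(
  pm   = c(10, 20, 30, 40),
  city = factor(c("Beijing", "Beijing", "Zhengzhou", "Zhengzhou"))
)

# split() cuts pm into one vector per city; sapply() applies mean to each
by_split <- sapply(split(demo$pm, demo$city), mean)
by_split  # Beijing 15, Zhengzhou 35

# aggregate() with a formula returns a data frame with columns city and pm
by_agg <- aggregate(pm ~ city, data = demo, FUN = mean)
by_agg
```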

2. Working with Strings
> mydata <- "/opt/data/dapengde_DummyR_PM25.csv"
> mydata
[1] "/opt/data/dapengde_DummyR_PM25.csv"
> class(mydata)
[1] "character"

Concatenate two strings into one:
> paste("hua","luo")
[1] "hua luo"
> paste("hua","luo",sep="_")
[1] "hua_luo"
> paste(c("carl","hua"), "luo")
[1] "carl luo" "hua luo"

paste() also works on vectors; shorter vectors are recycled to the length of the longest:
> paste(c("carl","kiko"), c("luo","kang"))
[1] "carl luo"  "kiko kang"
> paste(c("carl","kiko"), c("luo"))
[1] "carl luo" "kiko luo"
> paste(c("carl","kiko"), c("luo","kang","xie"))
[1] "carl luo"  "kiko kang" "carl xie"

Use cat() to print newlines (\n) and tabs (\t):
> cat("\n","new line", "\n", "new line", "\n", "tab the word","\t","be tabbed")

new line
new line
tab the word      be tabbed

Count the characters in a string:
> x = "luohua"
> nchar(x)
[1] 6

Split the string
> strsplit("luohua","u")
[[1]]
[1] "l"  "oh" "a"

Extract part of a string with substr() or substring()
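
The notes give no example for these two, so here is a minimal sketch. substr() takes a start and a stop position; substring() defaults the end to the end of the string and accepts vectors of positions.

```r
x <- "luohua"

first <- substr(x, 1, 3)
first  # "luo"

rest <- substring(x, 4)
rest   # "hua"

# Vectors of start and end positions give several pieces at once
windows <- substring(x, 1:3, 3:5)
windows  # "luo" "uoh" "ohu"

# substr() can also be used on the left side to overwrite characters
y <- x
substr(y, 1, 1) <- "L"
y  # "Luohua"
```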

Search for a keyword (grep() returns the indices of the matching elements):
> grep("java", c("javaworld", "scala based on java", "python is great"))
[1] 1 2
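
When the matching strings or a TRUE/FALSE mask are more useful than indices, grep(value = TRUE) and grepl() cover that. A short sketch on the same vector (the name langs is made up):

```r
langs <- c("javaworld", "scala based on java", "python is great")

mask <- grepl("java", langs)
mask  # TRUE TRUE FALSE

hits <- grep("java", langs, value = TRUE)
hits  # "javaworld" "scala based on java"

# The logical mask makes it easy to select the non-matching elements
others <- langs[!mask]
others  # "python is great"
```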

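Search and replace: gsub() replaces every match in each string, sub() only the first. Each string below contains "java" once, so both give the same result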
> gsub("java", "scala", c("we build our system using java", "java is the main language we are using"))
[1] "we build our system using scala"         "scala is the main language we are using"

> sub("java", "scala", c("we build our system using java", "java is the main language we are using"))
[1] "we build our system using scala"         "scala is the main language we are using"

Translate characters one by one (here every j becomes s and every a becomes c):
> chartr("ja", "sc", c("we build our system using java", "java is the main language we are using"))
[1] "we build our system using scvc"         "scvc is the mcin lcngucge we cre using"

Convert to upper case and lower case:
> toupper("carl")
[1] "CARL"
> tolower("CARL")
[1] "carl"

3. Function
Inspect the built-in function sd: call it, open its help page with ?sd, and print its source by typing its name
> x <- 1:5
> sd(x)
[1] 1.58
> ?sd

> sd
function (x, na.rm = FALSE)
sqrt(var(if (is.vector(x)) x else as.double(x), na.rm = na.rm))
<bytecode: 0x10ca5abf8>
<environment: namespace:stats>

assign() does the same job as <-, but takes the variable name as a string:
> assign("x", 1:5)
> x
[1] 1 2 3 4 5
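
assign() is mostly useful when the variable name is only known at run time. A small sketch, with made-up names score_1 to score_3:

```r
# Build the names as strings and create one variable per name
for (i in 1:3) {
  assign(paste0("score_", i), i * 10)
}

score_2         # 20
get("score_3")  # 30, get() looks a variable up by its name
```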

Define our own functions:
> newscore <- function(x) {
+   sqrt(x) * 10
+ }
> newscore(x=40)
[1] 63.2

> newscore2 <- function(x,n) {
+   sqrt(x) * 10 + n
+ }
> newscore2(36,10)
[1] 70

> newscore2
function(x,n) {
  sqrt(x) * 10 + n
}

Give a parameter a default value:
> newscore <- function(x=36) {
+   sqrt(x) * 10
+ }
> newscore()
[1] 60

The value of the last evaluated expression is the return value of the function.
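
A sketch of both styles side by side: an explicit return() for an early exit, and the last expression as the implicit return value. The function grade and its negative-input rule are made up for illustration.

```r
grade <- function(x = 36) {
  if (x < 0) {
    return(NA_real_)  # explicit early exit for invalid input
  }
  sqrt(x) * 10        # last expression: the implicit return value
}

grade()        # 60, uses the default x = 36
grade(x = 49)  # 70
grade(-1)      # NA
```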

4. Packages
How many packages are available? About seven thousand at the time of writing:
> length(unique(rownames(available.packages())))
[1] 7086

Download and install a package directly from the R console:
> install.packages("maptools")
also installing the dependency ‘sp’

trying URL 'https://cran.rstudio.com/bin/macosx/mavericks/contrib/3.2/sp_1.1-1.tgz'
Content type 'application/x-gzip' length 1508408 bytes (1.4 MB)
==================================================
downloaded 1.4 MB

trying URL 'https://cran.rstudio.com/bin/macosx/mavericks/contrib/3.2/maptools_0.8-36.tgz'
Content type 'application/x-gzip' length 1728083 bytes (1.6 MB)
==================================================
downloaded 1.6 MB

A package must be loaded before it can be used; the original article does not mention this in its first example.
> require(maptools)
> position <- c(116.39, 39.91)
> mydate <- "2015-09-02"
> sunriset(matrix(position, nrow = 1), as.POSIXct(mydate, tz = "Asia/Shanghai"),
+          direction = c("sunrise"), POSIXct.out = TRUE)$time
[1] "2015-09-02 05:42:30 CST"
> sunriset(matrix(position, nrow = 1), as.POSIXct(mydate, tz = "Asia/Shanghai"),
+          direction = c("sunset"), POSIXct.out = TRUE)$time
[1] "2015-09-02 18:45:28 CST"

An animation package:
install.packages("animation")
require(animation)
demo("fireworks")
citation("animation")

References:
http://dapengde.com/archives/14845

http://dapengde.com/archives/14858

http://dapengde.com/archives/14862

http://dapengde.com/archives/14905

http://dapengde.com/archives/14850  exercise

Reposted from sillycat.iteye.com/blog/2240407