R language mean, median and mode

Statistical analysis in R relies on many built-in functions, most of which are part of the base R package. These functions take R vectors as input, along with optional arguments, and return a result.

The functions we discuss in this chapter are mean, median and mode.

Mean

The mean is obtained by summing the values in the data set and dividing by the number of values.

The mean() function calculates the mean in R.

Syntax

The basic syntax for calculating mean in R is −

mean(x, trim = 0, na.rm = FALSE, ...)

Following is the description of the parameters used −

  • x is the input vector.

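  • trim is the fraction (0 to 0.5) of observations to drop from each end of the sorted vector before the mean is computed.

  • na.rm is a logical value that indicates whether missing (NA) values are removed from the input vector before the calculation.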

Example

# Create a vector. 
x <- c(12,7,3,4.2,18,2,54,-21,8,-5)

# Find Mean.
result.mean <- mean(x)
print(result.mean)

When we execute the above code, it produces the following result −

[1] 8.22
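As a quick check of the definition, the same value can be computed by hand with sum() and length(). This is only a sketch for illustration; the name manual.mean is introduced here and is not part of the original example.

```r
# Same vector as in the example above.
x <- c(12, 7, 3, 4.2, 18, 2, 54, -21, 8, -5)

# Mean by definition: sum of the values divided by the number of values.
manual.mean <- sum(x) / length(x)
print(manual.mean)

# Compare with the built-in function, allowing for floating-point rounding.
print(isTRUE(all.equal(manual.mean, mean(x))))
```

Both approaches should agree; mean() is still preferable in practice because it also handles the trim and na.rm arguments.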

Applying the trim option

When the trim argument is provided, the values in the vector are sorted and the requested fraction of observations is dropped from each end before the mean is calculated.

When trim = 0.3 and the vector has 10 values, 3 values from each end are excluded from the calculation.

In this case, the sorted vector is (-21, -5, 2, 3, 4.2, 7, 8, 12, 18, 54). The values removed before calculating the mean are (-21, -5, 2) from the left and (12, 18, 54) from the right.

# Create a vector.
x <- c(12,7,3,4.2,18,2,54,-21,8,-5)

# Find the trimmed mean.
result.mean <- mean(x, trim = 0.3)
print(result.mean)

When we execute the above code, it produces the following result −

[1] 5.55
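The trimming step can be reproduced by hand to see where 5.55 comes from. The sketch below assumes that the number of values dropped from each end is floor(n * trim); the names k, sorted.x and kept are introduced only for this illustration.

```r
x <- c(12, 7, 3, 4.2, 18, 2, 54, -21, 8, -5)
trim <- 0.3

n <- length(x)
k <- floor(n * trim)          # number of values dropped from each end (3 here)

sorted.x <- sort(x)
kept <- sorted.x[(k + 1):(n - k)]   # the values that remain: 3, 4.2, 7, 8
print(kept)

# The mean of the remaining values should match mean(x, trim = 0.3).
print(mean(kept))
print(isTRUE(all.equal(mean(kept), mean(x, trim = trim))))
```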

Applying the na.rm option

mean() returns NA if the input vector contains missing values.

To exclude missing values from the calculation, use na.rm = TRUE.

# Create a vector. 
x <- c(12,7,3,4.2,18,2,54,-21,8,-5,NA)

# Find mean.
result.mean <- mean(x)
print(result.mean)

# Find mean dropping NA values.
result.mean <- mean(x, na.rm = TRUE)
print(result.mean)

When we execute the above code, it produces the following result −

[1] NA
[1] 8.22
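To make the effect of na.rm = TRUE concrete, the missing values can also be filtered out explicitly with is.na(). This is a minimal sketch; complete.x is a name introduced here for illustration.

```r
x <- c(12, 7, 3, 4.2, 18, 2, 54, -21, 8, -5, NA)

# Count the missing values in the vector.
print(sum(is.na(x)))

# Keep only the non-missing values, then take the mean.
complete.x <- x[!is.na(x)]
print(mean(complete.x))

# This should agree with letting mean() drop the NA values itself.
print(isTRUE(all.equal(mean(complete.x), mean(x, na.rm = TRUE))))
```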

Median

The middle value of a sorted data series is called the median. When the series has an even number of values, the median is the average of the two middle values. Use the median() function in R to calculate it.

Syntax

The basic syntax to calculate the median in R language is −

median(x, na.rm = FALSE)

Following is the description of the parameters used −

  • x is the input vector.

  • na.rm is a logical value that indicates whether missing (NA) values are removed from the input vector before the calculation.

Example

# Create the vector.
x <- c(12,7,3,4.2,18,2,54,-21,8,-5)

# Find the median.
median.result <- median(x)
print(median.result)

When we execute the above code, it produces the following result −

[1] 5.6
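The result 5.6 can be traced by sorting the vector and picking the middle value or values by hand. The following sketch is for illustration only; manual.median and sorted.x are names introduced here. It also shows how na.rm behaves with median().

```r
x <- c(12, 7, 3, 4.2, 18, 2, 54, -21, 8, -5)

sorted.x <- sort(x)
n <- length(sorted.x)

if (n %% 2 == 0) {
  # Even length: average the two middle values (4.2 and 7 here).
  manual.median <- mean(sorted.x[c(n / 2, n / 2 + 1)])
} else {
  # Odd length: take the single middle value.
  manual.median <- sorted.x[(n + 1) / 2]
}
print(manual.median)
print(isTRUE(all.equal(manual.median, median(x))))

# With a missing value, median() returns NA unless na.rm = TRUE is set.
x.na <- c(x, NA)
print(median(x.na))
print(median(x.na, na.rm = TRUE))
```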

Mode

The mode is the value that occurs most often in a data set. Unlike the mean and median, the mode can be found for both numeric and character data.

R does not have a standard built-in function to compute the statistical mode (the base mode() function reports an object's storage mode instead). We therefore write a user-defined function that takes a vector as input and returns the mode as output.

Example

# Create the function.
getmode <- function(v) {
   uniqv <- unique(v)
   uniqv[which.max(tabulate(match(v, uniqv)))]
}

# Create the vector with numbers.
v <- c(2,1,2,3,1,2,3,4,1,5,5,3,2,3)

# Calculate the mode using the user function.
result <- getmode(v)
print(result)

# Create the vector with characters.
charv <- c("o","it","the","it","it")

# Calculate the mode using the user function.
result <- getmode(charv)
print(result)

When we execute the above code, it produces the following result −

[1] 2
[1] "it"
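It can help to look at the intermediate steps inside getmode(). In the numeric vector above, 2 and 3 each occur four times, and which.max() returns only the first maximum, so getmode() reports 2. The sketch below prints the intermediate values and defines a hypothetical variant, getmodes(), that returns every value tied for the highest count; it is not part of the original example.

```r
v <- c(2, 1, 2, 3, 1, 2, 3, 4, 1, 5, 5, 3, 2, 3)

# Step 1: the distinct values, in order of first appearance.
uniqv <- unique(v)
print(uniqv)     # [1] 2 1 3 4 5

# Step 2: how often each distinct value occurs.
counts <- tabulate(match(v, uniqv))
print(counts)    # [1] 4 3 4 1 2

# Hypothetical variant: return all values that share the highest count.
getmodes <- function(v) {
  uniqv <- unique(v)
  counts <- tabulate(match(v, uniqv))
  uniqv[counts == max(counts)]
}

print(getmodes(v))                                # [1] 2 3
print(getmodes(c("o", "it", "the", "it", "it")))  # [1] "it"
```

Whether a single mode or all tied modes is wanted depends on the analysis; the original getmode() is adequate when ties do not matter.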
