R dplyr mutate_if with multiple conditions

Dani :

I realise this question has been asked previously, but I can't seem to get the code to work.

Here is my data:

structure(list(ph503_3 = c(-1, -1, -1, 0, -1, -1), gripstrength = c(33, 
40, 26, 30, 49, 31), IPAQmetminutes = c(5196, 198, 1674, 642, 
11724, 1155), tugtimesec = c(8, 7, 7, 17, 9, 8), MHcesd = c(1, 
0, 1, 12, 0, 9), id = c("292221", "334262", "075822", "40642", 
"274222", "245801"), age = c(58, 68, 54, 64, 52, 58), COGmmse = c(30, 
27, 29, 27, 30, 29), DISconverse1 = c("None", "None", "None", 
"None", "None", "None"), MDantidepressant = c("No", "No", "No", 
"Yes", "No", "No"), MDantipark = c(0, 0, 0, 0, 0, 0), MDpolypharmacy = c(0, 
0, 0, 1, 0, 0), W2socialclass = c("Skilled", "Semi-skilled", 
"Managerial & Technical", "Non-Manual", "Skilled", "Managerial & Technical"
), bh201 = c("Would never doze", "Would never doze", "Slight chance of dozing", 
"Would never doze", "High chance of dozing", "Would never doze"
), fl001_01 = c("NOT Walking 100 metres (100 yards)", "NOT Walking 100 metres (100 yards)", 
"NOT Walking 100 metres (100 yards)", "Walking 100 metres (100 yards)", 
"NOT Walking 100 metres (100 yards)", "NOT Walking 100 metres (100 yards)"
), fl001_02 = c("NOT Running or jogging about 1.5 kilometres (1 mile)", 
"Running or jogging about 1.5 kilometres (1 mile)", "NOT Running or jogging about 1.5 kilometres (1 mile)", 
"Running or jogging about 1.5 kilometres (1 mile)", "Running or jogging about 1.5 kilometres (1 mile)", 
"NOT Running or jogging about 1.5 kilometres (1 mile)"), fl001_04 = c("NOT Getting up from a chair after sitting for long periods", 
"NOT Getting up from a chair after sitting for long periods", 
"NOT Getting up from a chair after sitting for long periods", 
"Getting up from a chair after sitting for long periods", "NOT Getting up from a chair after sitting for long periods", 
"NOT Getting up from a chair after sitting for long periods"), 
    fl001_05 = c("NOT Climbing several flights of stairs without resting", 
    "NOT Climbing several flights of stairs without resting", 
    "NOT Climbing several flights of stairs without resting", 
    "Climbing several flights of stairs without resting", "NOT Climbing several flights of stairs without resting", 
    "NOT Climbing several flights of stairs without resting"), 
    fl001_06 = c("NOT Climbing one flight of stairs without resting", 
    "NOT Climbing one flight of stairs without resting", "NOT Climbing one flight of stairs without resting", 
    "Climbing one flight of stairs without resting", "NOT Climbing one flight of stairs without resting", 
    "NOT Climbing one flight of stairs without resting"), fl001_07 = c("NOT Stooping, kneeling, or crouching", 
    "NOT Stooping, kneeling, or crouching", "NOT Stooping, kneeling, or crouching", 
    "Stooping, kneeling, or crouching", "NOT Stooping, kneeling, or crouching", 
    "NOT Stooping, kneeling, or crouching")), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))

My code:

mydata <- mydata %>% 
  mutate_if(class(.)=="character" & str_detect(colnames(.), "^fl\\d|^ph\\d"), ~if_else(grepl("NOT ", .), 0, 1)) 

The code runs, but nothing happens and I get the following message when I knit the markdown:

Warning message:
In class(.) == "character" & str_detect(colnames(.), "^fl\\d|^ph\\d") :
  longer object length is not a multiple of shorter object length

r2evans :

The predicate needs to return one logical value per column, for all columns at once, but class(.) == "character" makes me believe you expect it to be checked one column at a time. The . is replaced internally with the whole frame, not with individual columns/vectors. The second half of your conditional is correct, but the first is not. You can see what . actually holds by dropping into browser():

myfunc <- function(...) { browser(); TRUE; }
mydata %>% mutate_if(myfunc(.), ~ 1)
# Browse[2]>
list(...)
# [[1]]
# # A tibble: 6 x 20
#   ph503_3 gripstrength IPAQmetminutes tugtimesec MHcesd id      age COGmmse DISconverse1
#     <dbl>        <dbl>          <dbl>      <dbl>  <dbl> <chr> <dbl>   <dbl> <chr>       
# 1      -1           33           5196          8      1 2922~    58      30 None        
# 2      -1           40            198          7      0 3342~    68      27 None        
# 3      -1           26           1674          7      1 0758~    54      29 None        
# 4       0           30            642         17     12 40642    64      27 None        
# 5      -1           49          11724          9      0 2742~    52      30 None        
# 6      -1           31           1155          8      9 2458~    58      29 None        
# # ... with 11 more variables: MDantidepressant <chr>, MDantipark <dbl>, MDpolypharmacy <dbl>,
# #   W2socialclass <chr>, bh201 <chr>, fl001_01 <chr>, fl001_02 <chr>, fl001_04 <chr>,
# #   fl001_05 <chr>, fl001_06 <chr>, fl001_07 <chr>

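In that context, class(whole_data_frame) == "character" does not make sense (by itself): it compares the frame's three-element class vector (tbl_df, tbl, data.frame) with "character" and yields three FALSE values. Recycling those against the 20 values from str_detect() triggers the warning and selects no columns, which is why nothing happens.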
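
To make the recycling concrete, here is a minimal base-R sketch. The names frame_class and col_names are made up for illustration, and col_names is only a four-column subset of the real frame; the class vector is the one from the dput() output in the question.

```r
# What class(.) returns when . is the whole tibble (taken from the dput output)
frame_class <- c("tbl_df", "tbl", "data.frame")

# Hypothetical four-column subset of the real column names
col_names <- c("ph503_3", "id", "fl001_01", "fl001_02")

# First half of the original condition: one value per class name, not per column
class_test <- frame_class == "character"          # length 3, all FALSE

# Second half: one value per column name
name_test <- grepl("^fl\\d|^ph\\d", col_names)    # length 4

# Lengths 3 and 4 do not line up, so R recycles and warns
# ("longer object length is not a multiple of shorter object length").
# suppressWarnings() is used here only to keep the sketch quiet.
combined <- suppressWarnings(class_test & name_test)

print(class_test)
print(name_test)
print(combined)   # every element FALSE, so no column would be selected
```

Because every element of class_test is FALSE, the combined predicate is FALSE for every column regardless of the name match.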

You can look for character columns using sapply(., is.character) (or one of purrr's equivalents):

mydata %>% 
  mutate_if(sapply(., is.character) &
              stringr::str_detect(colnames(.), "^fl\\d|^ph\\d"),
            ~ +(!grepl("NOT ", .))) %>%
  str(.)
# Classes 'tbl_df', 'tbl' and 'data.frame': 6 obs. of  20 variables:
#  $ ph503_3         : num  -1 -1 -1 0 -1 -1
#  $ gripstrength    : num  33 40 26 30 49 31
#  $ IPAQmetminutes  : num  5196 198 1674 642 11724 ...
#  $ tugtimesec      : num  8 7 7 17 9 8
#  $ MHcesd          : num  1 0 1 12 0 9
#  $ id              : chr  "292221" "334262" "075822" "40642" ...
#  $ age             : num  58 68 54 64 52 58
#  $ COGmmse         : num  30 27 29 27 30 29
#  $ DISconverse1    : chr  "None" "None" "None" "None" ...
#  $ MDantidepressant: chr  "No" "No" "No" "Yes" ...
#  $ MDantipark      : num  0 0 0 0 0 0
#  $ MDpolypharmacy  : num  0 0 0 1 0 0
#  $ W2socialclass   : chr  "Skilled" "Semi-skilled" "Managerial & Technical" "Non-Manual" ...
#  $ bh201           : chr  "Would never doze" "Would never doze" "Slight chance of dozing" "Would never doze" ...
#  $ fl001_01        : int  0 0 0 1 0 0
#  $ fl001_02        : int  0 1 0 1 1 0
#  $ fl001_04        : int  0 0 0 1 0 0
#  $ fl001_05        : int  0 0 0 1 0 0
#  $ fl001_06        : int  0 0 0 1 0 0
#  $ fl001_07        : int  0 0 0 1 0 0
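
If you are on dplyr 1.0.0 or later (an assumption, since the question does not say which version is in use), the same selection can also be sketched with across(), which lets you combine a type test and a name test directly. The toy tibble below is a cut-down, hypothetical stand-in for your data.

```r
library(dplyr)

# Cut-down stand-in for the real data: one numeric ph* column,
# one character id column, one character fl* column
toy <- tibble(
  ph503_3  = c(-1, 0),
  id       = c("292221", "40642"),
  fl001_01 = c("NOT Walking 100 metres (100 yards)",
               "Walking 100 metres (100 yards)")
)

# where(is.character) tests each column's type,
# matches() tests each column's name; & keeps columns passing both
recoded <- toy %>%
  mutate(across(where(is.character) & matches("^fl\\d|^ph\\d"),
                ~ +(!grepl("NOT ", .x))))

str(recoded)
```

Here only fl001_01 is recoded: id is character but fails the name pattern, and ph503_3 matches the pattern but is numeric.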

(I shortened your if_else(grepl("NOT ", .), 0, 1) to just +(!grepl("NOT ", .)), partly for code-golf and partly because I think ifelse/if_else is more than this needs. It's not wrong, and if your future needs are more complex than just 0/1, then if_else is certainly good. The +(...) trick is a quick way to convert logical to integer; try +TRUE.)
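
A small base-R sketch of that conversion, using two of the fl001_01 labels from the question (the variable names are made up for illustration):

```r
# Two labels taken from the fl001_01 column
flags <- c("NOT Walking 100 metres (100 yards)",
           "Walking 100 metres (100 yards)")

is_not <- grepl("NOT ", flags)       # logical: does the label contain "NOT "?
as_int <- +(!is_not)                 # unary plus turns the negated logical into integer

plus_true  <- +TRUE                  # the one-liner suggested above
via_ifelse <- ifelse(is_not, 0, 1)   # same values, but stored as double

print(as_int)
print(class(as_int))
print(class(via_ifelse))
```

The values agree either way; the only difference is the storage type (integer from +(...), double from ifelse with 0 and 1), which is why str() reports int for the recoded columns.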
