Word Cloud - R

Earlier posts covered how to create a word cloud with Python and with JavaScript; today it is R's turn. In fact, the word cloud extension modules for SPSS and SAS are both built on the R implementation.

Create Word Cloud via R

1) Prepare the text.

Once again we use the Word Cloud History.txt file saved last time, so that at the end we can compare the word clouds produced by the different methods. (Well, mostly I am just lazy and want to keep using it...)

2) Install and load the R packages.

# Install
install.packages("tm")  # for text mining
install.packages("wordcloud") # word-cloud generator 
install.packages("RColorBrewer") # color palettes
# Load
library("tm")
library("wordcloud")
library("RColorBrewer")

3) Read the text data. Once the data has been read, we can use inspect() to check that the text was loaded successfully.

# Read the text file (opens a file chooser dialog)
text <- readLines(file.choose())
# Load the data as a corpus
docs <- Corpus(VectorSource(text))
# Inspect the first 10 documents
# inspect(docs[1:10])

4) Clean the data. We use the tm_map() function to convert the text to lower case and to remove numbers, common stop words, punctuation, and extra whitespace.

# Convert the text to lower case
docs <- tm_map(docs, content_transformer(tolower))
# Remove numbers
docs <- tm_map(docs, removeNumbers)
# Remove common English stop words
docs <- tm_map(docs, removeWords, stopwords("english"))
# Remove punctuation
docs <- tm_map(docs, removePunctuation)
# Eliminate extra white spaces
docs <- tm_map(docs, stripWhitespace)
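
To make the effect of these steps concrete, here is a minimal sketch that runs the same cleaning pipeline on two made-up sentences instead of the text file. The sample strings and the names sample_text, sample_docs and cleaned are hypothetical, and VCorpus() is used in place of Corpus() only so that the cleaned text is easy to pull back out with sapply().

```r
library(tm)

# Hypothetical sample text standing in for the article's text file
sample_text <- c("The Word Cloud of 2019 is HERE!",
                 "Clouds, clouds and   more clouds.")

# Build a small corpus and apply the same cleaning steps as above
sample_docs <- VCorpus(VectorSource(sample_text))
sample_docs <- tm_map(sample_docs, content_transformer(tolower))
sample_docs <- tm_map(sample_docs, removeNumbers)
sample_docs <- tm_map(sample_docs, removeWords, stopwords("english"))
sample_docs <- tm_map(sample_docs, removePunctuation)
sample_docs <- tm_map(sample_docs, stripWhitespace)

# Extract the cleaned text of each document as a character vector
cleaned <- sapply(sample_docs, as.character)
print(cleaned)
```

After running it, the printed strings should contain only lower-case words, with the digits, punctuation and stop words such as "the" and "and" gone.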

5) Build a term-document matrix that stores the words and their frequencies. This uses the TermDocumentMatrix() function from the tm text mining package. After the conversion we can use head() to view the resulting data.

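# Build the term-document matrix from the cleaned corpus
dtm <- TermDocumentMatrix(docs)
# Convert it into a plain matrix
m <- as.matrix(dtm)
# Get the frequency of every word, sorted from most to least frequent
v <- sort(rowSums(m), decreasing = TRUE)
d <- data.frame(word = names(v), freq = v)
# Look at the first rows
# head(d, 10)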
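
As a small illustration of what this step produces, the sketch below builds a term-document matrix from two made-up documents and turns it into the same word/frequency data frame. The sample strings and the names toy_text, toy_docs, toy_dtm and toy_d are hypothetical; the tm defaults are left untouched.

```r
library(tm)

# Hypothetical toy documents: "cloud" appears 3 times, "word" 2 times, "text" once
toy_text <- c("cloud word cloud", "word cloud text")
toy_docs <- VCorpus(VectorSource(toy_text))

# Rows are terms, columns are documents, cells are counts
toy_dtm <- TermDocumentMatrix(toy_docs)
toy_m <- as.matrix(toy_dtm)

# Sum the counts across documents and sort from most to least frequent
toy_v <- sort(rowSums(toy_m), decreasing = TRUE)
toy_d <- data.frame(word = names(toy_v), freq = toy_v)
print(toy_d)
```

Each row of the matrix is one term and each column is one document, so rowSums() gives the total count of every word across the whole text, which is exactly what wordcloud() needs.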

6) Generate the word cloud.

wordcloud(words = d$word, freq = d$freq, scale=c(5,0.5), min.freq = 1,
          max.words=200, random.order=FALSE, rot.per=0.35, 
          colors=brewer.pal(8, "Accent"))

Figure: word cloud generated with R

Notes

To see what each parameter of the wordcloud() function means, or to change the colors of the plot, run help(wordcloud) or help(RColorBrewer) to open the corresponding help page.
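
For example, a different palette can be tried by changing only the colors argument. The sketch below is one possible variation, not the original post's code: the word frequencies, the output file name and the choice of the "Dark2" palette are all assumptions made for illustration, and the plot is written to a PNG file so that it can be compared with other results later.

```r
library(wordcloud)
library(RColorBrewer)

# Hypothetical word frequencies standing in for the data frame built in step 5
demo_d <- data.frame(word = c("cloud", "word", "text", "data", "mining"),
                     freq = c(12, 9, 5, 4, 2))

# To browse every available palette interactively, run:
# display.brewer.all()

# Hypothetical output path in the session's temporary directory
out_file <- file.path(tempdir(), "wordcloud_dark2.png")

png(out_file, width = 800, height = 800)
set.seed(1234)  # fix the random layout so the plot is reproducible
wordcloud(words = demo_d$word, freq = demo_d$freq, scale = c(4, 0.8),
          min.freq = 1, random.order = FALSE, rot.per = 0.2,
          colors = brewer.pal(8, "Dark2"))
dev.off()
```

Calling set.seed() before wordcloud() keeps the word placement the same between runs, which makes it easier to judge the effect of a single parameter change.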

Sample Code

download here

Origin www.cnblogs.com/yukiwu/p/10969250.html