R代码代写、R留学生国外作业实验代写代做

R代码代写、R留学生国外作业实验代写代做
For this activity you will create an R program and upload that program to wolfware. Be sure that your R file
adheres to the R file submission guidelines (available on wolfware).
To Do – Write an R script that will successfully do what follows. Note that you should have comments
throughout your program explaining what you are doing and also you are asked to place some answers in a
comment.
In this project, you are going to work with census data sets. There are two files: EDU01.xls and PST01.xls that
you will (eventually read in). There is also a file (Mastdata.xls) that describes the variables in the data sets
(along with variables in many other data sets).
With these census data files there are often multiple columns corresponding to one variable. The columns that
we want to work with are the columns that end with a D.
The instructions for this project are deliberately meant to be less prescriptive so that you must think about the
best way to answer the questions. We’ll start with reading in and manipulating the EDU01.xls data set and then
automate somethings to read in the PST01.xls data set.
Use tidyverse functions where possible. Chaining is optional.
Place this code at the top of your file and run the code:
#Load packages needed
library(tidyverse)
library(readxl)
#Create functions for use later on. Running this code makes these functions available to us.
#Given a census column name such as LLL######D, extract the last two numbers (which represent a year usually)
grabYear <- function(string){
m <- regexpr(pattern = "(\\d+)", string)
match <-regmatches(string, m)
year <- gsub(pattern = "\\d\\d\\d\\d", replacement = "", x = match)
return(as.numeric(year))
}
is.County <- function(countyString){
if(length(grep(pattern = ", \\w\\w", countyString))>0){
return(1)
} else {
return(0)
}
}
#vectorize it instead and use as a logical for indexing
is.CountyVec <- Vectorize(FUN = is.County, USE.NAMES = FALSE)
Our goal with the EDU01.xls data set will be to read in the Public School Enrollment variables from 1986-1987
to 2008-2009. Check out in the Mastdata.xls file which variables this corresponds to in EDU01.xls (note the
multiple sheets!).
Read the public school enrollment variables in and do the following:
? Keep only the first two columns (Area_name and STCOU) and those with a name that ends with D
? Remove rows corresponding to Area_name equal to “District of Columbia” (there are two versions of
this row, we want to keep the capital letter one only!)
? Put the data into “long form” rather than wide form
? Put all of the data into one data frame
? Create a “numYear” variable that is a numeric variable corresponding to the year. Recall from above
that the last two digits prior to D represent the year. The grabYear() function you read in above can be
used to extract that
? Use a for loop and the is.County() function to create an indexing vector that is 1 if a row corresponds to
a county level observation and a 0 if the row corresponds to any other observation (State, US, or DC
basically)
? Use the Vectorized version, is.CountyVec(), to create the same indexing vector
? Create two data sets – one data set that contains only county level data and one data set that contains
only non-county data
? Create a function that takes in a data frame that contains the non-county data, adds a column to the data
set called “Division” corresponding to the state’s classification of division here, and returns the data
frame. If row corresponds to a non-state (i.e. UNITED STATES), return ERROR for the division. This
will not be a ton of fun but can be made easier with the use of %in%.
? Your two final data sets should look like the following:
With your edSummaryData set (or whatever you called it), do the following:
? Create summary statistics for the overall mean for each division. Do this twice, once using tapply() and
once using the tidyverse method
? Use microbenchmark to compare the speeds (see Vectorized function video code for examples of this)
Lastly, create a function – the start of it is given here:
formatData <- function(path, sheet = 1, gatherCols = TRUE, colValue = "value", division = TRUE){
require(tidyverse)
}
? This function should read in a particular sheet of a census type data set (we’ll use it on the PST01.xls
data set shortly)
? The function should select the columns as above and filter out the “District of Columbia” rows
? The data set should be filtered to only include non-county data
? Optionally, the user should be able to return a “long form” data set with the value column equal to
whatever string the user specifies in the colValue argument
? If the long form data is being returned the user should be able to choose whether or not to append a
division column as well (if they don’t choose gatherCols = TRUE then it shouldn’t matter what the
division value or colValue things are set to)
? Use the function to read in the 2nd sheet of the PST01.xls data set. Read it in with gatherCols = TRUE
and give a value for colValue. Then also read it in with gatherCols = FALSE.
Below is the output you should get if you run the function with the above two settings:
formatData("PST01.xls", sheet = 2, colValue = "Population", division = TRUE, gather = TRUE)
formatData("PST01.xls", sheet = 2, gather = FALSE)
http://www.6daixie.com/contents/18/1339.htm

我们的方向领域:window编程 数值算法 AI人工智能 金融统计 计量分析 大数据 网络编程 WEB编程 通讯编程 游戏编程多媒体linux 外挂编程 程序API图像处理 嵌入式/单片机 数据库编程 控制台 进程与线程 网络安全  汇编语言 硬件编程 软件设计 工程标准规等。其中代写代做编程语言或工具包括但不限于以下范围:

C/C++/C#代写

Java代写

IT代写

Python代写

辅导编程作业

Matlab代写

Haskell代写

Processing代写

Linux环境搭建

Rust代写

Data Structure Assginment 数据结构代写

MIPS代写

Machine Learning 作业 代写

Oracle/SQL/PostgreSQL/Pig 数据库代写/代做/辅导

Web开发、网站开发、网站作业

ASP.NET网站开发

Finance Insurace Statistics统计、回归、迭代

Prolog代写

Computer Computational method代做

因为专业,所以值得信赖。如有需要,请加QQ:99515681 或邮箱:[email protected]

微信:codinghelp

猜你喜欢

转载自www.cnblogs.com/rrrrrrrrr/p/8974846.html