Statistics Permutation Test Approach Worksheet
This assignment requires RStudio. All questions are inside the Rmd file; please answer them inside stater.rmd. Here is a preview:
---
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(tidyverse) # turn on the tidyverse package
```
## Problem
The file `county.csv` contains data for a sample of 200 counties in the United States, including variables for demographic, financial, education, and other characteristics. There are 25 variables as follows:
- `growth`: County-level population growth from 2000 to 2010.
- `FIPS`: FIPS code.
- `pop2010`: 2010 county population.
- `pop2000`: 2000 county population.
- `age_under_5`: Percent of population under 5 (2010).
- `age_under_18`: Percent of population under 18 (2010).
- `age_over_65`: Percent of population over 65 (2010).
- `female`: Percent of population that is female (2010).
- `black`: Percent of population that is black (2010).
- `hispanic`: Percent of population that is Hispanic (2010).
- `white_not_hispanic`: Percent of population that is white and not Hispanic (2010).
- `no_move_in_one_plus_year`: Percent of population that has not moved in at least one year (2006-2010).
- `foreign_born`: Percent of population that is foreign-born (2006-2010).
- `foreign_spoken_at_home`: Percent of population that speaks a foreign language at home (2006-2010).
- `hs_grad`: Percent of population that is a high school graduate (2006-2010).
- `bachelors`: Percent of population that earned a bachelor’s degree (2006-2010).
- `mean_work_travel`: Mean travel time to work (2006-2010).
- `home_ownership`: Home ownership rate (2006-2010).
- `housing_multi_unit`: Housing units in multi-unit structures (2006-2010).
- `median_val_owner_occupied`: Median value of owner-occupied housing units (2006-2010).
- `persons_per_household`: Persons per household (2006-2010).
- `per_capita_income`: Per capita money income in past 12 months (2010 dollars, 2006-2010).
- `poverty`: Percent below poverty level (2006-2010).
- `sales_per_capita`: Retail sales per capita, 2007.
- `density`: Persons per square mile (2010).
**Please note that correct identification of the problem and structure of the relevant data are critical to all the following questions.**
```{r, readindata, echo=FALSE}
counties <- read.csv("counties.csv")
```
-----
### part a
Run a permutation test to see if the true mean US county population size has increased from 2000 to 2010. You must use the difference in sample means as your test statistic. Provide all the usual elements of the test. Use $\alpha = 0.05$. **ALSO:** provide a histogram of the generated null distribution with the observed test statistic value indicated.
```{r, message=FALSE}
# Calculate observed test statistic value
# Prepping for 10000 permutations
# Execute the permutations under H0, saving the test statistic each time
# Calculate permutation p-value
# Visualization of the null distribution
```
- **Null hypothesis $H_{0}:$**
- **Alternative hypothesis $H_{a}:$**
- **Observed test statistic:**
- **Permutation p-value:**
- **Decision/conclusion:**
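
As a hedged illustration of the steps outlined in the chunk comments above, the following is a minimal sketch of a paired permutation (sign-flip) test. It assumes that `pop2000` and `pop2010` are paired measurements on the same counties, so that the difference in sample means equals the mean of the paired differences. The data are simulated stand-ins, not the worksheet's file, and the names `d`, `obs_stat`, `perm_stats`, and `p_value` are hypothetical.

```r
set.seed(1)  # reproducible illustration only

# Hypothetical stand-in for the real data: 200 paired county populations.
n <- 200
pop2000 <- round(rlnorm(n, meanlog = 10, sdlog = 1))
pop2010 <- round(pop2000 * rnorm(n, mean = 1.05, sd = 0.10))

# Observed test statistic: difference in sample means, which for paired
# data is the same as the mean of the within-county differences.
d <- pop2010 - pop2000
obs_stat <- mean(d)

# Under H0 the sign of each paired difference is arbitrary,
# so each permutation flips the signs at random.
n_perm <- 10000
perm_stats <- replicate(
  n_perm,
  mean(d * sample(c(-1, 1), n, replace = TRUE))
)

# Upper-tailed p-value, matching Ha: mean population has increased.
p_value <- mean(perm_stats >= obs_stat)

# Null distribution with the observed statistic marked.
hist(perm_stats, breaks = 40,
     main = "Permutation null distribution (simulated data)",
     xlab = "Mean of sign-flipped differences")
abline(v = obs_stat, col = "red", lwd = 2)
```

If the two population columns were instead treated as independent samples, the permutation scheme would shuffle group labels rather than flip signs; which structure applies is part of what the problem asks you to identify.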
-----
### part b
Is county poverty level related to population density? Create an appropriate fully labeled professional-standard scatterplot of poverty level ($y$) versus population density ($x$) for these data. Comment on the nature of any patterns you see, indicating the apparent strength and direction of any trend.
```{r, message=FALSE, warning=FALSE}
```
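
One possible shape for such a plot is sketched below with ggplot2, using simulated stand-in data rather than the worksheet's file. The data frame `sim`, the log-scaled x axis, and the label wording are assumptions made for illustration; whether a log scale suits the real `density` values is a judgment to make after looking at them.

```r
library(ggplot2)

set.seed(2)
# Hypothetical stand-in data; the worksheet's columns are `density` and `poverty`.
sim <- data.frame(
  density = rlnorm(200, meanlog = 4, sdlog = 1.5),
  poverty = pmax(rnorm(200, mean = 15, sd = 6), 1)
)

p <- ggplot(sim, aes(x = density, y = poverty)) +
  geom_point(alpha = 0.6) +
  scale_x_log10() +  # optional: spreads out a right-skewed x variable
  labs(
    title = "County poverty level versus population density (simulated data)",
    x = "Population density (persons per square mile, log scale)",
    y = "Percent below poverty level"
  ) +
  theme_minimal()

print(p)
```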
-----
### part c
Run a permutation test to see if county poverty level is related to population density, using the Kendall correlation as your test statistic. Provide all the usual elements of the test, including an assessment of the strength and direction of the association. Use $\alpha = 0.05$. **ALSO:** provide a histogram of the generated null distribution with the observed test statistic value indicated.
```{r, message=FALSE, warning=FALSE}
```
- **Null hypothesis $H_{0}:$**
- **Alternative hypothesis $H_{a}:$**
- **Observed test statistic:**
- **Permutation p-value:**
- **Decision/conclusion:**
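
A minimal sketch of a permutation test built on Kendall's correlation follows, again on simulated stand-in data. It uses base R's `cor(..., method = "kendall")`; the vectors and the names `obs_tau`, `perm_tau`, and `p_value` are hypothetical, and the number of permutations is reduced here only to keep the illustration quick.

```r
set.seed(3)

# Hypothetical stand-in data for 200 counties.
n <- 200
density <- rlnorm(n, meanlog = 4, sdlog = 1.5)
poverty <- 20 - 1.2 * log(density) + rnorm(n, sd = 5)

# Observed test statistic: Kendall's tau between the two variables.
obs_tau <- cor(density, poverty, method = "kendall")

# Under H0 (no association) the pairing of x and y is arbitrary,
# so shuffle one variable and recompute tau each time.
n_perm <- 2000
perm_tau <- replicate(
  n_perm,
  cor(density, sample(poverty), method = "kendall")
)

# Two-sided p-value: proportion of permuted statistics at least as
# extreme in absolute value as the observed one.
p_value <- mean(abs(perm_tau) >= abs(obs_tau))

# Null distribution with the observed statistic marked.
hist(perm_tau, breaks = 40,
     main = "Permutation null distribution of Kendall's tau (simulated data)",
     xlab = "Kendall's tau under H0")
abline(v = obs_tau, col = "red", lwd = 2)
```

The sign of the observed tau indicates the direction of the association and its magnitude (on a scale from -1 to 1) indicates the strength.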
-----
### part d
Create a single professionally labeled and formatted side-by-side boxplot display that compares the variability in county-level per capita incomes between counties where [a] 5% or less of the county residents are foreign born vs. [b] more than 5% of the county residents are foreign born. You must first create the appropriate grouping variable from the available data, and your plot must also provide a visual indication of the number of counties in each group. In context, comment on any differences or patterns you see.
```{r, message=FALSE}
```
**WRITE COMMENTS HERE**
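
The grouping step and the count indication could be approached along the following lines. This is only a sketch on simulated stand-in data: the data frame `sim` and the columns `fb_group` and `fb_label` are hypothetical names, and showing group sizes through both axis labels and `varwidth = TRUE` is one choice among several.

```r
library(ggplot2)

set.seed(4)
# Hypothetical stand-in data; the worksheet's columns are
# `foreign_born` and `per_capita_income`.
sim <- data.frame(
  foreign_born = rexp(200, rate = 1 / 5),
  per_capita_income = rlnorm(200, meanlog = 10, sdlog = 0.25)
)

# Grouping variable: [a] 5% or less vs. [b] more than 5% foreign born.
sim$fb_group <- ifelse(sim$foreign_born <= 5, "5% or less", "More than 5%")

# Attach the group size to each axis label.
group_n <- table(sim$fb_group)
sim$fb_label <- paste0(sim$fb_group, "\n(n = ", group_n[sim$fb_group], ")")

p <- ggplot(sim, aes(x = fb_label, y = per_capita_income)) +
  geom_boxplot(varwidth = TRUE) +  # box width also reflects group size
  labs(
    title = "Per capita income by share of foreign-born residents (simulated data)",
    x = "Percent of county residents who are foreign born",
    y = "Per capita income (2010 dollars)"
  ) +
  theme_minimal()

print(p)
```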
-----
### part e
**Refer to part d.** Run a permutation test to compare the variability in county-level per capita incomes between counties where [a] 5% or less of the county residents are foreign born vs. [b] more than 5% of the county residents are foreign born.
You must: **[a]** choose the most appropriate test procedure from what we’ve done in class; **[b]** provide all correct code; and **[c]** provide all the usual elements of the test. Use $\alpha = 0.05$.
```{r, message=FALSE}
```
- **Null hypothesis $H_{0}:$**
- **Alternative hypothesis $H_{a}:$**
- **Observed test statistic:**
- **Permutation p-value:**
- **Decision/conclusion:**
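
The worksheet does not say which procedure was covered in class, so the sketch below should be read as one candidate rather than the expected answer. It assumes a two-sided test based on the ratio of mean absolute deviations from each group's median, applied to simulated stand-in data. The helper `ratio_stat` and all variable names are hypothetical.

```r
set.seed(5)

# Hypothetical stand-in incomes for the two groups.
income_a <- rnorm(120, mean = 22000, sd = 4000)  # [a] 5% or less foreign born
income_b <- rnorm(80,  mean = 26000, sd = 6000)  # [b] more than 5% foreign born

# Remove location differences: absolute deviations from each group's median.
dev_a <- abs(income_a - median(income_a))
dev_b <- abs(income_b - median(income_b))

# Two-sided statistic: larger mean deviation divided by the smaller one.
ratio_stat <- function(a, b) {
  max(mean(a), mean(b)) / min(mean(a), mean(b))
}
obs_stat <- ratio_stat(dev_a, dev_b)

# Under H0 (equal spread) the deviations are exchangeable between groups,
# so reassign them to groups of the original sizes at random.
pooled <- c(dev_a, dev_b)
n_a <- length(dev_a)
n_perm <- 10000
perm_stats <- replicate(n_perm, {
  idx <- sample(length(pooled), n_a)
  ratio_stat(pooled[idx], pooled[-idx])
})

# Large ratios are evidence against H0.
p_value <- mean(perm_stats >= obs_stat)
```

Centering on the median before permuting keeps a difference in typical income from being mistaken for a difference in spread.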
-----
### part f
A social researcher wants to run a simple linear regression model to see if there is significant evidence to conclude that home ownership rates are related to the percent of county population that is female. A colleague states that the usual regression assumptions of normality and constant variance are probably not satisfied, so she suggests testing the **simple linear regression slope** $\beta_{1}$ using a permutation test approach instead.
Create an appropriate fully labeled professional-standard scatterplot of home ownership rate ($y$) versus percent of county population that is female ($x$) for these data.
```{r, message=FALSE}
```
**Then,** run a permutation test of $H_{0}: \beta_{1}=0$ vs. $H_{a}: \beta_{1} \ne 0$. Use the sample slope $b_{1}$ as your test statistic. Provide all the usual elements of the test. Use $\alpha = 0.05$.
```{r, message=FALSE}
```
- **Null hypothesis $H_{0}: \beta_{1}=0$** (home ownership rates are not linearly related to the percent of population that is female)
- **Alternative hypothesis $H_{a}: \beta_{1} \ne 0$** (home ownership rates are linearly related to the percent of population that is female)
- **Observed test statistic:**
- **Permutation p-value:**
- **Decision/conclusion:**
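
A permutation test of the slope can be sketched as follows, using simulated stand-in data in place of the worksheet's `female` and `home_ownership` columns. The helper `slope` and the names `obs_b1`, `perm_b1`, and `p_value` are hypothetical. The sketch relies on the identity that the least-squares slope equals the sample covariance of $x$ and $y$ divided by the sample variance of $x$.

```r
set.seed(6)

# Hypothetical stand-in data for 200 counties.
n <- 200
female <- rnorm(n, mean = 50, sd = 2)
home_ownership <- 70 + 0.5 * (female - 50) + rnorm(n, sd = 7)

# Least-squares slope b1 = cov(x, y) / var(x).
slope <- function(x, y) cov(x, y) / var(x)
obs_b1 <- slope(female, home_ownership)

# Under H0: beta1 = 0 the response is unrelated to x,
# so shuffle y against a fixed x and recompute the slope.
n_perm <- 10000
perm_b1 <- replicate(n_perm, slope(female, sample(home_ownership)))

# Two-sided p-value, matching Ha: beta1 != 0.
p_value <- mean(abs(perm_b1) >= abs(obs_b1))
```

The observed slope from this helper should agree with `coef(lm(home_ownership ~ female))[2]`, which gives a quick check before running the permutations.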