Describe the frequency counts, the relative frequencies, the cumulative frequencies, and the cumulative percentage frequencies per category in this variable.

As a team of consultants working to understand the data generated by the COVID-19 pandemic, your mandate is to analyze the following dataset: WHO COVID Data 31012021.xlsx, obtained from the World Health Organization (www.who.int) on January 31, 2021.

Here is a description of the 11 variables providing information on 236 observations in the data set:

Column Description

A. Categorical Data: Country Name

B. Categorical Data: Region

C. Ratio Data: Cumulative Number of Cases (Total)

D. Ratio Data: Cumulative Number of Cases (Total per Million People)

E. Ratio Data: Newly Reported Number of Cases (in the Last 7 Days)

F. Ratio Data: Newly Reported Number of Cases (in the Last 24 Hours)

G. Ratio Data: Cumulative Number of Deaths (Total)

H. Ratio Data: Cumulative Number of Deaths (Total per Million People)

I. Ratio Data: Cumulative Number of Deaths (in the Last 7 Days)

J. Ratio Data: Cumulative Number of Deaths (in the Last 24 Hours)

K. Categorical Data: Transmission Classification

YOUR OBJECTIVE:

Use the following 10 questions to generate the information for a 1-page analytical report that clearly explains, clarifies, and informs your client about the continued spread of the novel coronavirus (COVID-19).

1. Create a frequency distribution table on the categorical variable Transmission Classification (column K).

i. Describe the frequency counts, the relative frequencies, the cumulative frequencies, and the cumulative percentage frequencies per category in this variable (create a table to summarize your findings).
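One way to build the four columns asked for in Question 1 is sketched here in Python, using only the standard library. The function name `frequency_table` and the sample labels are illustrative assumptions; the counts are made up and are not taken from the WHO file.

```python
from collections import Counter


def frequency_table(values):
    """Return rows of (category, count, relative frequency,
    cumulative count, cumulative percentage), sorted by descending count."""
    counts = Counter(values)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for category, count in counts.most_common():
        cumulative += count
        rows.append(
            (category, count, count / total, cumulative, 100 * cumulative / total)
        )
    return rows


# Hypothetical sample values for illustration only, NOT real WHO data.
sample = (
    ["Community transmission"] * 5
    + ["Clusters of cases"] * 3
    + ["Sporadic cases"] * 2
)

for row in frequency_table(sample):
    print(row)
```

Because the rows come out sorted from the most to the least frequent category, the same table can feed a Pareto chart directly: the counts give the bars and the last column gives the cumulative percentage points.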

2. Using the information from the frequency distribution table you created in Question 1, summarize Transmission Classification through a Pareto chart, enhanced by plotting a cumulative percentage point for each bar in the Pareto chart.

3. For each WHO region,

a. Calculate the proportion of Cumulative Deaths in the last 24 hours to the Cumulative Number of Deaths in the last 7 days (columns J and I). Are the proportions comparable across regions? Why or why not?

b. Repeat part (a) above, using Reported Number of Cases instead (columns F and E). Are the proportions comparable across regions? Why or why not?

4. For the variable in column H, Cumulative Number of Deaths (Total per Million People),

a. Determine the following descriptive statistics for each WHO Region:

i. The mean, the median, the mode, the standard deviation, and the coefficient of variation

COMM 215 Data Analysis Project, Winter 2021

ii. Describe the shape of the distribution of this variable per region (create distribution tables / graphs).

iii. How do these statistics compare between regions?

b. Repeat all questions of part (a) for the variable Cumulative Number of Cases (Total per Million People).
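The per-region statistics in Question 4 can be sketched with Python's standard `statistics` module. The helper name `describe_by_group`, the region labels, and the numbers are hypothetical. The sketch also assumes the sample standard deviation (divisor n - 1); if the 236 observations are treated as a full population, `statistics.pstdev` would be the counterpart.

```python
import statistics
from collections import defaultdict


def describe_by_group(pairs):
    """pairs: iterable of (region, value). Returns per-region statistics.
    Each region needs at least two values for the sample standard deviation."""
    groups = defaultdict(list)
    for region, value in pairs:
        groups[region].append(value)

    summary = {}
    for region, values in groups.items():
        mean = statistics.mean(values)
        sd = statistics.stdev(values)  # sample standard deviation (n - 1)
        summary[region] = {
            "mean": mean,
            "median": statistics.median(values),
            "modes": statistics.multimode(values),  # every most-frequent value
            "stdev": sd,
            # Coefficient of variation as a percentage of the mean.
            "cv_percent": 100 * sd / mean if mean else float("nan"),
        }
    return summary


# Hypothetical values for illustration only, NOT real WHO data.
data = [
    ("Region X", 10), ("Region X", 20), ("Region X", 20), ("Region X", 50),
    ("Region Y", 5), ("Region Y", 5), ("Region Y", 8),
]

for region, stats in describe_by_group(data).items():
    print(region, stats)
```

The coefficient of variation is the useful figure for comparing regions, since it expresses spread relative to each region's own mean rather than in absolute deaths or cases per million.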

5. For the variable Cumulative Number of Deaths (Total per Million People) (Column H),

a. Construct a Box-and-Whiskers plot to determine if there are extreme outliers (i.e., use outer fences). Discuss your findings.

b. Repeat part (a) but use the variable Cumulative Number of Cases (Total) (Column C).
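The outer-fence rule in Question 5 flags a value as an extreme outlier when it lies more than 3 interquartile ranges below the first quartile or above the third quartile. A minimal Python sketch follows; the function name `outer_fence_outliers` and the sample values are hypothetical. Quartile conventions differ between tools, and this sketch assumes the default "exclusive" method of `statistics.quantiles`, so fences computed in a spreadsheet may differ slightly.

```python
import statistics


def outer_fence_outliers(values):
    """Return ((lower_fence, upper_fence), extreme_outliers) using
    outer fences at Q1 - 3*IQR and Q3 + 3*IQR."""
    # quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower = q1 - 3 * iqr
    upper = q3 + 3 * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return (lower, upper), outliers


# Hypothetical values for illustration only, NOT real WHO data.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 100]

fences, extreme = outer_fence_outliers(sample)
print("outer fences:", fences)
print("extreme outliers:", extreme)
```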

6. Determine whether the WHO Regions (Column B) are independent of the Transmission Classifications (Column K).
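A common way to approach Question 6 is a chi-square test of independence on the region-by-classification contingency table; the brief does not name a method, so this choice is an assumption. The sketch computes the test statistic and degrees of freedom only, since the Python standard library has no chi-square distribution; the result would then be compared with a critical value from a chi-square table. The function name and the 2 x 2 counts are hypothetical.

```python
def chi_square_statistic(table):
    """table: list of rows of observed counts (rows = regions,
    columns = transmission classes). Returns (statistic, degrees_of_freedom).
    Rows or columns that sum to zero must be dropped beforehand."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical 2 x 2 counts for illustration only, NOT real WHO data.
observed = [[10, 20], [20, 10]]

statistic, df = chi_square_statistic(observed)
print("chi-square:", statistic, "df:", df)
```

With 1 degree of freedom the 5% critical value is 3.841, so a statistic above that would lead to rejecting independence for this toy table. On the real data, categories with expected counts below 5 may need to be merged before the approximation is trusted.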

7. From March 1, 2020 until May 31, 2020, 100 random samples of size n = 30 were taken on the deaths per million people in New Zealand. This sampling distribution was found to be approximately normally distributed, with a mean of 0.34 deaths per million people and a standard deviation of 0.11 deaths per million people.

a. Given the above data, what is the probability that you, as researchers, will randomly select a small sample of 30 days with an average number of deaths per million exceeding 0.5 deaths per million? Test your hypothesis at the 90th and 95th percentiles.

b. The same process was done in Australia (same time period / same number of samples / same sample size), and the Australian average number of deaths per million was found to equal 0.31 with a standard deviation of 0.5. Repeat part (a) for Australia and comment in your report on the differences in the distributions between the two neighbouring countries (why are they so different?).
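The calculation in Question 7 can be sketched with `statistics.NormalDist` from the Python standard library. The sketch assumes that the stated standard deviations (0.11 and 0.5) already describe the sampling distribution of the mean, so they are used directly and not divided by the square root of 30; if they are meant as day-level standard deviations, that division would be needed first. The function name `exceedance_report` is hypothetical.

```python
from statistics import NormalDist


def exceedance_report(mean, sd, threshold):
    """Probability that a sample mean exceeds `threshold`, plus the
    90th and 95th percentiles of the sampling distribution."""
    dist = NormalDist(mu=mean, sigma=sd)
    return {
        "p_exceed": 1 - dist.cdf(threshold),
        "p90": dist.inv_cdf(0.90),
        "p95": dist.inv_cdf(0.95),
    }


# Means and standard deviations as stated in Question 7.
new_zealand = exceedance_report(mean=0.34, sd=0.11, threshold=0.5)
australia = exceedance_report(mean=0.31, sd=0.5, threshold=0.5)

print("New Zealand:", new_zealand)
print("Australia:", australia)
```

Comparing the threshold of 0.5 with `p90` and `p95` shows whether a sample mean that high falls in the upper 10% or upper 5% tail of each country's sampling distribution.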

8. Create a scatter plot comparing the Cumulative Number of Deaths in the last 7 days (column I) to the Cumulative Number of Deaths in the last 24 hours (column J).

a. Comment on the correlation (strength / direction).

b. Repeat part (a) but use the variables in columns E and F.

9. Construct a simple regression equation that describes the relationship between the Cumulative Number of Deaths in the last 7 days (column I) and the Cumulative Number of Deaths in the last 24 hours (column J).

a. What is the relationship? (i.e., provide an equation)

b. Is the relationship statistically significant? How can you tell?

c. Discuss the 95% confidence interval of the model’s slope.

10. Using the information in Question 9, discuss the following points in your report:

a. If in a given week (7 days) a nation reports a total of 2000 new deaths due to the COVID-19 virus, what is the expected number of deaths over the next 24 hours for that nation?

b. Calculate a 95% confidence interval for the mean value of your dependent variable and discuss your findings.
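Questions 9 and 10 can be sketched as an ordinary least-squares fit with the 7-day deaths (column I) as the independent variable and the 24-hour deaths (column J) as the dependent variable, which is the direction Question 10 implies. The function names and the five data points are hypothetical. The standard library has no t distribution, so the critical value is passed in by hand; 3.182 is the two-sided 95% value for 3 degrees of freedom and fits only this five-point toy sample.

```python
import math


def fit_simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))  # standard error of the estimate
    return {"n": n, "x_bar": x_bar, "sxx": sxx,
            "slope": slope, "intercept": intercept, "s_e": s_e}


def slope_interval(fit, t_crit):
    """Confidence interval for the slope; t_crit uses n - 2 degrees of freedom."""
    half = t_crit * fit["s_e"] / math.sqrt(fit["sxx"])
    return fit["slope"] - half, fit["slope"] + half


def mean_response_interval(fit, x0, t_crit):
    """Point estimate and confidence interval for the mean of y at x = x0."""
    y_hat = fit["intercept"] + fit["slope"] * x0
    half = t_crit * fit["s_e"] * math.sqrt(
        1 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"]
    )
    return y_hat, y_hat - half, y_hat + half


# Hypothetical (7-day deaths, 24-hour deaths) pairs, NOT real WHO data.
weekly = [700, 1400, 2100, 2800, 3500]
daily = [90, 210, 290, 410, 500]

fit = fit_simple_regression(weekly, daily)
T_CRIT_DF3 = 3.182  # two-sided 95% t value for n - 2 = 3 degrees of freedom
print("slope 95% CI:", slope_interval(fit, T_CRIT_DF3))
print("mean response at 2000:", mean_response_interval(fit, 2000, T_CRIT_DF3))
```

If the slope interval excludes zero, the relationship is statistically significant at the 5% level, which is one way to answer Question 9(b). For the real data set the critical value must be looked up again for the actual number of observations minus 2.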
