8  Exam 2 Practice

Author

Melissa Wuellner, Jacob C. Cooper

Please complete the following problems as practice for Exam 2. Note that you may need to load specific packages from the previous exercises to get your code to work.

All questions must be answered in full sentences.

Example answers: Your conclusions should look like this:

The mean height of the class is significantly higher than the height of the general population (\(Z = 2.50\), \(p = 0.006\)).
The mean leg length of the high-elevation Woodhouse’s Toads (Anaxyrus woodhousii) is indistinguishable from that of the general population (\(Z = 0.12\), \(p = 0.45\)).
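
The \(p\)-values in these example answers match the upper-tail area of the standard normal distribution for each \(Z\) statistic. A minimal sketch, assuming the examples report one-tailed \(p\)-values (a two-tailed value would be twice as large); `pnorm` is part of base R, and the object names are only illustrative:

```r
# Upper-tail area of the standard normal for each example Z statistic.
# Assumption: the example answers report one-tailed p-values.
z_values <- c(class_height = 2.50, toad_leg = 0.12)

p_one_tailed <- pnorm(z_values, lower.tail = FALSE)
p_two_tailed <- 2 * p_one_tailed  # shown for comparison only

round(p_one_tailed, 3)
round(p_two_tailed, 3)
```

Whichever tail convention you use, state it in your written answer so the reported \(p\)-value can be checked.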

8.1 Minnows

Consider this scenario: You have discovered a never-before-documented population of minnow in the Kearney Canal near campus. During your first sampling trip, you notice that the total lengths (i.e., measured from the tip of the snout to the very tip of the tail) of the fish you measure appear to be smaller than the average total length of the species as recorded among all known individuals across their range. The mean total length noted in one publication is 85.00 mm with a standard deviation of 4.50 mm. Below are your data from the 20 minnows that you captured during your first sampling trip to the Kearney Canal:

Fish data for this problem.
Fish ID Length (mm)
1 89.58
2 75.44
3 86.86
4 74.71
5 69.70
6 100.34
7 73.70
8 69.56
9 96.24
10 79.35
11 61.37
12 62.82
13 95.45
14 98.71
15 100.34
16 57.57
17 70.54
18 78.65
19 65.39
20 65.57

NOTE: You will have to create the numeric object for this problem. Use c() to combine values into a vector; e.g.:

x <- c(1,5,8,9)
x
[1] 1 5 8 9
  1. What type of data are each of the variables (i.e., fish ID and length) – nominal, ordinal, interval, or ratio? Provide your answer along with a brief justification.
  2. Calculate the following for the length variable, rounding to the appropriate number of significant digits in all answers. Be sure to provide a written description of your answers as well as the code output.
    • Mean
    • Median
    • Mode
    • Q1
    • Q2
    • IQR
    • Variance
    • Standard deviation
    • Standard error
    • Coefficient of variation
  3. Calculate skewness and kurtosis for the length data. Provide an interpretation of these values.
  4. Create and compute the following for the length data:
    • Frequency histogram
    • Cumulative frequency plot
    • Shapiro-Wilk test
    • Using the above, provide an interpretation of whether data are normally distributed, and be sure to justify your answer
  5. Let’s say the mean and standard deviation of length across all populations of this minnow are \(\mu = 85.00\) and \(\sigma = 4.50\).
    • State the null and alternative hypotheses for this scenario.
    • Run the appropriate \(Z\)-test and report the \(p\)-value.
    • State your final conclusion: do you support or reject the null hypothesis? Assume \(\alpha = 0.05\). Justify your answer.
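
The summary statistics requested in question 2, and the shape measures in question 3, can be sketched in base R as follows. This is only an illustration on a toy vector (hypothetical values, not the minnow lengths). The mode and the moment-based skewness and kurtosis are computed by hand here because base R has no built-in functions for them; the package used in earlier exercises may define skewness and kurtosis slightly differently, so treat those formulas as an assumption.

```r
# Toy data (hypothetical values, not the minnow lengths).
x <- c(4, 8, 15, 16, 23, 42, 8)
n <- length(x)

x_mean   <- mean(x)
x_median <- median(x)

# Base R has no statistical mode function: tabulate the values and keep
# the most frequent one(s). Ties return more than one value.
counts <- table(x)
x_mode <- as.numeric(names(counts)[counts == max(counts)])

# Quartiles: Q2 is the median, and the IQR is Q3 - Q1.
# quantile() uses its default method (type = 7) here.
quartiles <- quantile(x, probs = c(0.25, 0.50, 0.75))
x_iqr <- IQR(x)

x_var <- var(x)                # sample variance (n - 1 denominator)
x_sd  <- sd(x)                 # sample standard deviation
x_se  <- x_sd / sqrt(n)        # standard error of the mean
x_cv  <- 100 * x_sd / x_mean   # coefficient of variation, in percent

# Moment-based skewness and excess kurtosis (one common definition;
# an assumption, since packages differ in the exact formula).
m2 <- mean((x - x_mean)^2)
m3 <- mean((x - x_mean)^3)
m4 <- mean((x - x_mean)^4)
skewness        <- m3 / m2^(3 / 2)
excess_kurtosis <- m4 / m2^2 - 3
```

Positive skewness indicates a longer right tail, and excess kurtosis near zero indicates tails similar to those of a normal distribution.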

Be sure to answer all questions in full sentences!
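
A \(Z\)-test of a sample mean against a known population mean and standard deviation can be computed by hand, since base R does not ship a dedicated \(Z\)-test function. The sketch below uses a toy sample and hypothetical values for `mu` and `sigma` (not the minnow values), and it shows both a one-tailed and a two-tailed \(p\)-value so you can pick the one that matches your alternative hypothesis.

```r
# Toy sample and hypothetical population parameters (not the minnow values).
x     <- c(9.8, 10.4, 9.1, 10.9, 9.5, 10.2)
mu    <- 10.5   # hypothetical population mean
sigma <- 0.8    # hypothetical population standard deviation

n       <- length(x)
se_mean <- sigma / sqrt(n)          # standard error of the mean, sigma known
z       <- (mean(x) - mu) / se_mean # test statistic

# One-tailed p-value for the alternative "the sample mean is smaller than mu".
p_lower <- pnorm(z)

# Two-tailed p-value for a non-directional alternative.
p_two <- 2 * pnorm(-abs(z))

c(z = z, p_lower = p_lower, p_two = p_two)
```

Compare the chosen \(p\)-value with \(\alpha\) and report both \(Z\) and \(p\) in your written conclusion.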

8.2 Time spent on Canvas

You are curious about how much time your classmates spend on Canvas for BIOL305. Let’s say you get the following data for the amount of time folks spend on Canvas:

library(dplyr)  # provides mutate()

# Hours each student spent on Canvas
time.hrs <- c(19.65, 69.29, 6.83, 5.50, 17.98,
              19.89, 8.52, 71.37, 12.62, 4.62,
              3.00, 5.69, 10.79, 6.59, 32.56,
              15.72, 3.67, 10.04, 5.45, 3.69,
              20.17, 12.99, 1.56, 2.40, 55.20)

# One ID per student
student <- 1:length(time.hrs)

# Combine into a data frame with numeric columns
time_on_canvas <- cbind(student, time.hrs) |>
  as.data.frame() |>
  mutate(time.hrs = as.numeric(time.hrs)) |>
  mutate(student = as.numeric(student))

head(time_on_canvas)
  student time.hrs
1       1    19.65
2       2    69.29
3       3     6.83
4       4     5.50
5       5    17.98
6       6    19.89
  1. What type of data are each of these variables (i.e., student ID [student] and time spent in hours [time.hrs]) – nominal, ordinal, interval, or ratio? Provide your answer along with a brief justification.
  2. Calculate the following for the time.hrs variable, rounding to the appropriate number of significant digits in all answers. Be sure to provide a written description of your answers as well as the code output.
    • Mean
    • Median
    • Mode
    • Q1
    • Q2
    • IQR
    • Variance
    • Standard deviation
    • Standard error
    • Coefficient of variation
  3. Calculate skewness and kurtosis for the time.hrs data. Provide an interpretation of these values.
  4. Create and compute the following for the time.hrs data:
    • Frequency histogram
    • Cumulative frequency plot
    • Shapiro-Wilk test
    • Using the above, provide an interpretation of whether data are normally distributed, and be sure to justify your answer
  5. Let’s say Canvas has been tracking users across all of the classes ever hosted on its platform. This includes many universities and many classes across all of those schools. The mean and standard deviation from this population-level dataset are \(\mu = 12.22\) and \(\sigma = 2.10\). You personally spend 15.45 hours on Canvas for one of your UNK classes. Is the number of hours you spend expected or not expected based on this information from the larger Canvas population? Assume \(\alpha = 0.05\). For this scenario, be sure to do the following:
    • State the null and alternative hypotheses
    • Run the appropriate \(Z\)-test and report the \(p\)-value
    • State the final conclusion of whether you support or reject the null hypothesis – be sure to support your answer!
    • Calculate the 95% confidence interval around the population mean
  6. Optional: Repeat steps 4 and 5 after performing a natural log transformation on the time.hrs data (use the command log1p, which computes log(1 + x)). Don’t forget to back-transform your confidence intervals using expm1.
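
The transformation and back-transformation in the optional question can be sketched as follows. This is only an illustration on toy right-skewed data (hypothetical values, not the Canvas times). It assumes a confidence interval for a sample mean built from the sample standard deviation and the normal critical value; if your problem supplies a known \(\sigma\), substitute it.

```r
# Toy right-skewed data (hypothetical values, not the Canvas times).
x <- c(1.2, 2.5, 3.1, 4.8, 6.0, 9.7, 15.3, 40.2)

x_log <- log1p(x)   # natural log of (1 + x)

# Round trip: expm1() undoes log1p().
round_trip_ok <- isTRUE(all.equal(expm1(x_log), x))

# Shapiro-Wilk p-values before and after the transformation.
p_raw <- shapiro.test(x)$p.value
p_log <- shapiro.test(x_log)$p.value

# 95% confidence interval for the mean on the log scale.
# Assumption: sample SD with the normal critical value.
n      <- length(x_log)
z_crit <- qnorm(0.975)
ci_log <- mean(x_log) + c(-1, 1) * z_crit * sd(x_log) / sqrt(n)

# Back-transform the interval endpoints to the original units.
ci_back <- expm1(ci_log)
ci_back
```

Note that the back-transformed center, `expm1(mean(x_log))`, is not the arithmetic mean of the raw data; it is typically smaller for right-skewed data, so describe the interval as back-transformed when you report it.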