Module 34 Dates, times, and malaria
Learning goals
- This is a review exercise. You’ll be using the skills you’ve developed with the dplyr, ggplot2, and lubridate packages.
1. Start a new script. Name it “malaria.R”.
2. Set up your workspace by loading the readr, ggplot2, dplyr, and lubridate packages.
3. Read in some malaria data by running the following:
pms <- read_csv('https://github.com/databrew/intro-to-data-science/blob/main/data/pms.csv?raw=true')
4. Take a look at the first few rows of the data. What is the unit of observation?
5. Create a new column/variable in pms named dow (as in, “day of week”). This should be the day of the week of date_visit.
6. How many visits were there to Catale on May 1 2022?
7. How many of those were for malaria?
8. Which age group has had the most malaria?
9. What day of the week has the most visits?
10. Which month has had the most malaria visits?
11. Which month has had the greatest percentage malaria visits?
12. Make a variable called hour of day.
13. Which hour of day has the most visits?
14. What do you think the function mdy_hms does?
15. Look up the documentation for mdy_hms.
16. Use mdy_hms to create a new variable in pms named date_time based on the variable start_time.
17. Use the hour function to create a variable named hour_of_day from the date_time variable. This should be the hour of the day.
18. Get the total number of malaria diagnoses by hour of day.
19. Visualize the total number of malaria diagnoses by hour of day.
20. Visualize the total number of malaria diagnoses by hour of day, but separated by day.
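If you get stuck on the date-time steps, the pattern below may help. It is only a minimal sketch on a small made-up table, not a solution using the real pms data: the toy values, the malaria column (assumed here to be TRUE/FALSE), and the assumption that start_time is text in month/day/year hour:minute:second order are all hypothetical, so check the actual columns before adapting it.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Toy data: hypothetical values, NOT the real pms data
toy <- tibble(
  start_time = c("05/01/2022 08:15:00", "05/01/2022 08:45:30",
                 "05/02/2022 14:05:10", "05/03/2022 14:59:59"),
  malaria = c(TRUE, FALSE, TRUE, TRUE)  # assumed logical column
)

toy <- toy %>%
  mutate(
    date_time = mdy_hms(start_time),       # parse text into a date-time
    hour_of_day = hour(date_time),         # hour component, 0 to 23
    dow = wday(date_time, label = TRUE)    # day of week as a labelled factor
  )

# Count visits and malaria cases for each hour of the day
by_hour <- toy %>%
  group_by(hour_of_day) %>%
  summarise(visits = n(), malaria_cases = sum(malaria))

# Bar chart of malaria cases by hour of day
p <- ggplot(by_hour, aes(x = hour_of_day, y = malaria_cases)) +
  geom_col()
```

To separate a chart like this by day, one option is to include dow in the group_by call and add facet_wrap(~ dow) to the plot.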