Module 46 Working with dates & times
Learning goals
- Be able to read dates, and convert objects to dates
- Be able to convert dates, extract useful information, and modify them
- Use date times
- Gain familiarity with the lubridate package
Hadley Wickham’s tutorial on dates starts with 3 simple questions:
- Does every year have 365 days?
- Does every day have 24 hours?
- Does every minute have 60 seconds?
"I’m sure you know that not every year has 365 days, but do you know the full rule for determining if a year is a leap year? (It has three parts.)
You might have remembered that many parts of the world use daylight savings time (DST), so that some days have 23 hours, and others have 25.
You might not have known that some minutes have 61 seconds because every now and then leap seconds are added because the Earth’s rotation is gradually slowing down.
Dates and times are hard because they have to reconcile two physical phenomena (the rotation of the Earth and its orbit around the sun) with a whole raft of geopolitical phenomena including months, time zones, and DST.
This chapter won’t teach you every last detail about dates and times, but it will give you a solid grounding of practical skills that will help you with common data analysis challenges."
The lubridate package
First, install the lubridate package.
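A minimal sketch of the install-and-load step, assuming a standard CRAN setup. The `requireNamespace()` guard is an addition for illustration, so the package is only downloaded when it is missing:

```r
# Install once per machine (skipped if lubridate is already installed)
if (!requireNamespace("lubridate", quietly = TRUE)) {
  install.packages("lubridate")
}

# Load the package in every new R session
library(lubridate)
```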
Getting familiar with the date type
Get today’s date:
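A minimal sketch of this step, assuming lubridate is installed. The variable name `d` is chosen here for illustration:

```r
library(lubridate)

d <- today()   # today's date in your system time zone
d              # prints like a string, e.g. "2022-06-17"
class(d)       # "Date"
unclass(d)     # the underlying number of days since 1970-01-01
```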
This looks like a simple character string, but it is not. It is a Date object, which R stores internally as the number of days since January 1, 1970, and that is what makes date calculations possible in the background.
To demonstrate this, let’s bring in a simple string:
Note that an object's class determines what you can do with it. The following causes an error…
… but this does not:
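A short sketch of the contrast described above, using a hypothetical example string `s`. The `try()` wrapper is an addition so that the failing line does not stop the script:

```r
library(lubridate)

s <- "2022-06-17"   # a plain character string (example value)
class(s)            # "character"

# Arithmetic on a character string fails with an error
res <- try(s + 1, silent = TRUE)
inherits(res, "try-error")   # TRUE

# The same text parsed into a Date supports arithmetic
d <- ymd(s)
class(d)            # "Date"
d + 1               # "2022-06-18"
```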
Common tasks
Converting to dates from strings
The lubridate package was built to handle dates in various input formats. The following functions convert a character string in a particular format into a standard datetime object:
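As a sketch, the parsing functions are named after the order of the components in the input (y = year, m = month, d = day). The example strings below are made up for illustration:

```r
library(lubridate)

a <- ymd("2022-06-17")   # year, month, day
b <- mdy("06/17/2022")   # month, day, year
c <- dmy("17-06-2022")   # day, month, year

# All three represent the same date: "2022-06-17"
a
b
c
```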
This also works if single-digit days or months are not padded with a 0:
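For instance, with hypothetical unpadded inputs:

```r
library(lubridate)

d1 <- ymd("2022-6-7")    # "2022-06-07"
d2 <- mdy("6/7/2022")    # "2022-06-07"
```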
Other formats can also be handled:
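A few further input styles, sketched with made-up values. Month names are matched against the current locale, so this assumes an English locale:

```r
library(lubridate)

e1 <- mdy("June 17, 2022")              # full month name
e2 <- dmy("17 Jun 2022")                # abbreviated month name
e3 <- ymd(20220617)                     # a plain number
e4 <- ymd_hms("2022-06-17 13:02:29")    # date plus time (UTC by default)
e5 <- mdy_hm("06/17/2022 13:02")        # date plus hours and minutes

class(e1)   # "Date"
class(e4)   # "POSIXct" "POSIXt"
```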
Extracting components from dates
Let’s practice extracting information from the following datetime object:
Get the month:
Get the day of month:
Get the day of year:
Get the day of week:
Get the name of the day of week:
Get the hour of the day:
Get the minute of the hour:
Get the seconds of the minute:
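The extractions listed above can be sketched as follows, using a hypothetical datetime `dt` chosen for illustration. The day-name label depends on the locale (English assumed), and the numeric day of week assumes lubridate's default of weeks starting on Sunday:

```r
library(lubridate)

# Example datetime (a Friday); UTC is the default time zone for ymd_hms()
dt <- ymd_hms("2022-06-17 13:02:29")

month(dt)                # 6
mday(dt)                 # 17  (day of month)
yday(dt)                 # 168 (day of year)
wday(dt)                 # 6   (day of week; 1 = Sunday by default)
wday(dt, label = TRUE)   # Fri (add abbr = FALSE for the full name)
hour(dt)                 # 13
minute(dt)               # 2
second(dt)               # 29
```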
Dealing with time zones
When working with dates and times in R, time zones can be a major pain, but the lubridate package tries to make this simpler.
Adjust time zones for dates:
# Today's date where I am
today()
[1] "2022-06-17"
# Today's date in New Zealand
today(tzone='NZ')
[1] "2022-06-17"
Adjust time zones for date-times:
# Time where I am
now()
[1] "2022-06-17 13:02:29 CEST"
# Time in UTC / GMT (equivalent for practical purposes)
now('UTC')
[1] "2022-06-17 11:02:29 UTC"
now('GMT')
[1] "2022-06-17 11:02:30 GMT"
Don’t know what time zone your computer is working in? Use the base R function Sys.timezone():
Sys.timezone()
[1] "Europe/Madrid"
To get a list of time zones accepted in R, use the function OlsonNames() (there are several hundred options, and the exact count depends on your system; the first 50 are shown here):
[1] "Africa/Abidjan" "Africa/Accra" "Africa/Addis_Ababa"
[4] "Africa/Algiers" "Africa/Asmara" "Africa/Asmera"
[7] "Africa/Bamako" "Africa/Bangui" "Africa/Banjul"
[10] "Africa/Bissau" "Africa/Blantyre" "Africa/Brazzaville"
[13] "Africa/Bujumbura" "Africa/Cairo" "Africa/Casablanca"
[16] "Africa/Ceuta" "Africa/Conakry" "Africa/Dakar"
[19] "Africa/Dar_es_Salaam" "Africa/Djibouti" "Africa/Douala"
[22] "Africa/El_Aaiun" "Africa/Freetown" "Africa/Gaborone"
[25] "Africa/Harare" "Africa/Johannesburg" "Africa/Juba"
[28] "Africa/Kampala" "Africa/Khartoum" "Africa/Kigali"
[31] "Africa/Kinshasa" "Africa/Lagos" "Africa/Libreville"
[34] "Africa/Lome" "Africa/Luanda" "Africa/Lubumbashi"
[37] "Africa/Lusaka" "Africa/Malabo" "Africa/Maputo"
[40] "Africa/Maseru" "Africa/Mbabane" "Africa/Mogadishu"
[43] "Africa/Monrovia" "Africa/Nairobi" "Africa/Ndjamena"
[46] "Africa/Niamey" "Africa/Nouakchott" "Africa/Ouagadougou"
[49] "Africa/Porto-Novo" "Africa/Sao_Tome"
At some point you may need to change the time zone of a datetime object without changing the date or clock time it displays. To do so, use the function force_tz():
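A small sketch with a made-up datetime `x`. For contrast it also shows lubridate's `with_tz()`, which is not mentioned above: `with_tz()` keeps the instant and changes only how it is displayed, whereas `force_tz()` keeps the clock time and therefore changes the instant:

```r
library(lubridate)

x <- ymd_hms("2022-06-17 13:02:29", tz = "Europe/Madrid")

# force_tz(): same clock time, new zone, so a different moment in time
forced <- force_tz(x, tzone = "UTC")
forced           # "2022-06-17 13:02:29 UTC"

# with_tz(): same moment in time, displayed in the new zone
shifted <- with_tz(x, tzone = "UTC")
shifted          # "2022-06-17 11:02:29 UTC"

forced == x      # FALSE
shifted == x     # TRUE
```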
Using timestamps instead
One way to avoid time zone issues is to convert a datetime object to a numeric timestamp.
Timestamps record the number of seconds that have passed since midnight GMT on January 1, 1970. It doesn’t matter which time zone you are standing in; the number of seconds that have passed since that moment is the same:
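As a sketch, `as.numeric()` on a datetime returns its timestamp. The two objects below (names chosen for illustration) show the same instant in different time zones and yield the same number:

```r
library(lubridate)

x_madrid <- ymd_hms("2022-06-17 13:02:29", tz = "Europe/Madrid")
x_utc    <- with_tz(x_madrid, tzone = "UTC")   # same instant, shown in UTC

ts_madrid <- as.numeric(x_madrid)   # seconds since 1970-01-01 00:00:00 UTC
ts_utc    <- as.numeric(x_utc)

ts_madrid == ts_utc                 # TRUE

# The current timestamp
as.numeric(now())
```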
Timestamps can simplify things when you are doing a lot of adding and subtracting with time. Timestamps are just seconds; they are just numbers. So they are much less of a black box than datetime objects.
You can always convert from a timestamp back into a datetime object:
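A minimal sketch using lubridate's `as_datetime()`, which interprets a number as seconds since January 1, 1970 and returns UTC unless a `tz` is given. The values and variable names are made up for illustration:

```r
library(lubridate)

ts <- as.numeric(ymd_hms("2022-06-17 11:02:29", tz = "UTC"))

# Plain arithmetic on the number: add one day's worth of seconds
ts_later <- ts + 60 * 60 * 24

back_utc    <- as_datetime(ts_later)                  # "2022-06-18 11:02:29 UTC"
back_madrid <- as_datetime(ts, tz = "Europe/Madrid")  # "2022-06-17 13:02:29 CEST"
```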
Exercises
Creating datetime objects
Use the appropriate lubridate function to parse each of the following dates:
1. January 1, 2010
2. 2015-Mar-07
3. 06-Jun-2017
4. c('August 19 (2015)', 'July 1 (2015)')
5. 12/30/14
Extracting datetime components
Work with this vector of dates:
dt <- c('2000-01-04 03:43:01',
'2007-09-29 12:18:59',
'2011-04-16 19:51:16',
'2015-12-13 21:24:48',
'2020-06-01 06:39:02')
6. Create a dataframe that has the following columns:
- raw (containing the original string)
- year
- month
- dom (day of month)
- doy (day of year)
- hour
- minutes
- seconds
7. Now add two more variables:
- timestamp
- diff (the difference, in days, between this time and midnight GMT on January 1, 1970)
Record of a child’s cough
First, download the data:
8. Create a dow (day of week) column.
9. Create a date (without time) column.
10. How many coughs happened each day?
11. Create a chart of coughs by day.
12. Look up floor_date. Use it to get the number of coughs by date-hour.
13. Create an hour variable.
14. Use the hour variable to create a night_day column indicating whether the cough occurred at night or during the day.
15. Does this child cough more at night or during the day?