Annotates dates as censored, uncertain, or approximate
Source:R/component_annotate.R
component_annotate.Rd
Some datasets have for example an arbitrary cut off point
for start and end points, but these are often coded as precise dates
when they are not necessarily the real start or end dates.
This collection of functions helps annotate uncertainty and
approximation to dates according to ISO2019E standards.
Inaccurate start or end dates can be represented by an affix
indicating "on or before", if used as a prefix (e.g. ..1816-01-01
),
or indicating "on or after", if used as a suffix (e.g. 2016-12-31..
).
Approximate dates are indicated by adding a tilde to year,
month, or day components, as well as groups of components or whole dates
to estimate values that are possibly correct (e.g. 2003-03-03~
).
Day, month, or year, uncertainty can be indicated by adding a question mark
to a possibly dubious date (e.g. 1916-10-10?
) or date
component (e.g. 1916-?10-10
).
Usage
on_or_before(x)
on_or_after(x)
as_approximate(x, component = NULL)
as_uncertain(x, component = NULL)
Arguments
- x
A date vector
- component
Annotation can be added on specific date components ("year", "month" or "day"), or to groups of date components (month and day ("md"), or year and month ("ym")). This must be specified. If unspecified, annotation will be added after the date (e.g.
1916-10-10?
), indicating the whole date is uncertain or approximate. For specific date components, uncertainty or approximation is annotated to the left of the date component. E.g. for "day":1916-10-?10
or1916-10-~10
. For groups of date components, uncertainty or approximation is annotated to the right of the group ("ym") or to both components ("md"). E.g. for "ym":1916-10~-10
; for "md":1916-?10-?10
.
Functions
on_or_before()
: prefixes dates with ".." where start date is uncertainon_or_after()
: suffixes dates with ".." where end date is uncertainas_approximate()
: adds tildes to indicate approximate dates/date componentsas_uncertain()
: adds question marks to indicate dubious dates/date components.
Examples
data <- data.frame(Beg = c("1816-01-01", "1916-01-01", "2016-01-01"),
End = c("1816-12-31", "1916-12-31", "2016-12-31"))
dplyr::mutate(data, Beg = ifelse(Beg <= "1816-01-01",
on_or_before(Beg), Beg))
#> Beg End
#> 1 ..1816-01-01 1816-12-31
#> 2 1916-01-01 1916-12-31
#> 3 2016-01-01 2016-12-31
dplyr::mutate(data, End = ifelse(End >= "2016-01-01",
on_or_after(End), End))
#> Beg End
#> 1 1816-01-01 1816-12-31
#> 2 1916-01-01 1916-12-31
#> 3 2016-01-01 2016-12-31..
dplyr::mutate(data, Beg = ifelse(Beg == "1916-01-01",
as_approximate(Beg), Beg))
#> Beg End
#> 1 1816-01-01 1816-12-31
#> 2 1916-01-01~ 1916-12-31
#> 3 2016-01-01 2016-12-31
dplyr::mutate(data, End = ifelse(End == "1916-12-31",
as_uncertain(End), End))
#> Beg End
#> 1 1816-01-01 1816-12-31
#> 2 1916-01-01 1916-12-31?
#> 3 2016-01-01 2016-12-31