Description
Feature Type
- Adding new functionality to datar
- Changing existing functionality in datar
- Removing existing functionality in datar
Problem Description
- https://github.com/tidyverse/dplyr/releases/tag/v1.1.0
- https://github.com/tidyverse/dplyr/releases/tag/v1.1.1
- https://github.com/tidyverse/dplyr/releases/tag/v1.1.2
Feature Description
- *_join()
  - A join specification can now be created through join_by(). This allows you to specify both the left and right hand side of a join using unquoted column names, such as join_by(sale_date == commercial_date). Join specifications can be supplied to any *_join() function as the by argument.
  - Join specifications allow for new types of joins:
    - Equality joins: The most common join, specified by ==. For example, join_by(sale_date == commercial_date).
    - Inequality joins: For joining on inequalities, i.e. >=, >, <, and <=. For example, use join_by(sale_date >= commercial_date) to find every commercial that aired before a particular sale.
    - Rolling joins: For "rolling" the closest match forward or backwards when there isn't an exact match, specified by using the rolling helper, closest(). For example, join_by(closest(sale_date >= commercial_date)) to find only the most recent commercial that aired before a particular sale.
    - Overlap joins: For detecting overlaps between sets of columns, specified by using one of the overlap helpers: between(), within(), or overlaps(). For example, use join_by(between(commercial_date, sale_date_lower, sale_date)) to find commercials that aired before a particular sale, as long as they occurred after some lower bound, such as 40 days before the sale was made.
  - multiple is a new argument for controlling what happens when a row in x matches multiple rows in y. For equality joins and rolling joins, where this is usually surprising, this defaults to signalling a "warning", but still returns all of the matches. For inequality joins, where multiple matches are usually expected, this defaults to returning "all" of the matches. You can also return only the "first" or "last" match, "any" of the matches, or you can "error".
  - keep now defaults to NULL rather than FALSE. NULL implies keep = FALSE for equality conditions, but keep = TRUE for inequality conditions, since you generally want to preserve both sides of an inequality join.
  - unmatched is a new argument for controlling what happens when a row would be dropped because it doesn't have a match. For backwards compatibility, the default is "drop", but you can also choose to "error" if dropped rows would be surprising.
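To make the difference between an inequality join and a rolling join concrete, here is a minimal pure-Python sketch of the two behaviours described above. It is not how dplyr or datar implements them; the function names inequality_join and rolling_join are hypothetical, and each table is reduced to a plain list of dates for illustration.

```python
from bisect import bisect_right
from datetime import date


def inequality_join(sales, commercials):
    """Hypothetical helper: every (sale, commercial) pair where
    sale_date >= commercial_date, i.e. all commercials aired on or
    before each sale. Sales without any match produce no pairs."""
    return [(s, c) for s in sales for c in commercials if s >= c]


def rolling_join(sales, commercials):
    """Hypothetical helper: for each sale keep only the closest
    commercial aired on or before it, mimicking
    closest(sale_date >= commercial_date). Left-join style: a sale
    with no earlier commercial is paired with None."""
    ordered = sorted(commercials)
    result = []
    for s in sales:
        # Index of the first commercial strictly after the sale date.
        i = bisect_right(ordered, s)
        result.append((s, ordered[i - 1] if i else None))
    return result


sales = [date(2023, 1, 10), date(2023, 1, 1)]
commercials = [date(2023, 1, 8), date(2023, 1, 5), date(2023, 2, 1)]

all_matches = inequality_join(sales, commercials)
closest_matches = rolling_join(sales, commercials)
```

With this data the inequality join pairs the 10 January sale with both earlier commercials, while the rolling join keeps only the 8 January one. This is also why multiple matches are expected for inequality joins but surprising for rolling joins.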
- consecutive_id() for creating groups based on contiguous runs of the same values.
- case_match() is a "vectorised switch" variant of case_when() that matches on values rather than logical expressions. It is like a SQL "simple" CASE WHEN statement, whereas case_when() is like a SQL "searched" CASE WHEN statement.
- cross_join() is a more explicit and slightly more correct replacement for using by = character() during a join.
- pick() makes it easy to access a subset of columns from the current group. pick() is intended as a replacement for across(.fns = NULL), cur_data(), and cur_data_all(). We feel that pick() is a much more evocative name when you are just trying to select a subset of columns from your data.
- symdiff() computes the symmetric difference.
- cur_data() and cur_data_all() are soft-deprecated in favour of pick().
- across(), c_across(), if_any(), and if_all() now require the _cols and _fns arguments. In general, we now recommend that you use pick() instead of an empty across() call or across() with no _fns (e.g. across(c(x, y))). (See also "Quietly deprecate optional .cols and .fns cases", tidyverse/dplyr#6523.)
- Passing **kwargs to across() is deprecated because it's ambiguous when those arguments are evaluated. (See also "Deprecate across(, ...)", tidyverse/dplyr#6073.)
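The intended behaviour of three of the new functions listed above can be sketched in plain Python. This is only an illustration of the semantics on lists, under the assumption that values are hashable; it is not datar's API, and the signatures below (in particular the cases argument of case_match) are hypothetical.

```python
from itertools import groupby


def consecutive_id(values):
    """Assign a new id (starting at 1) to each contiguous run of equal values."""
    ids = []
    for run_id, (_, run) in enumerate(groupby(values), start=1):
        ids.extend(run_id for _ in run)
    return ids


def case_match(values, cases, default=None):
    """Vectorised switch: `cases` is a list of (candidates, result) pairs.
    Each value gets the result of the first pair whose candidates contain
    it, or `default` when nothing matches."""
    out = []
    for v in values:
        for candidates, result in cases:
            if v in candidates:
                out.append(result)
                break
        else:
            out.append(default)
    return out


def symdiff(x, y):
    """Items present in exactly one of x or y, in order of first
    appearance, with duplicates dropped."""
    in_x, in_y = set(x), set(y)
    seen, out = set(), []
    for item in list(x) + list(y):
        if (item in in_x) != (item in in_y) and item not in seen:
            seen.add(item)
            out.append(item)
    return out


run_ids = consecutive_id(["a", "a", "b", "a"])
labels = case_match(["a", "b", "z"], [(("a", "b"), "early")], default="other")
only_one_side = symdiff([1, 2, 3], [3, 4])
```

Note that consecutive_id() differs from grouping by value: the final "a" starts a new run and so receives a new id rather than reusing the first one.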
Additional Context
No response