Feature request: concatenation function for iterables #45563

@mikmoore

Currently we have vcat, hcat, and the slightly more versatile cat function for performing concatenation of varargs. But we do not have /proper/ functions for efficiently concatenating collections of items.

For example, to concatenate a vector of vectors into a matrix one should use reduce(hcat,x::Vector{Vector{T}}) where T. However, this isn't really the reduce function. This method has been special-cased for this specific purpose. It feels more like a pun, since it never actually calls hcat at all. One can also splat hcat(vectorofvectors...), but this has wretched performance.
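To make the comparison concrete, here is a minimal sketch of the two spellings on a small input. The variable names are illustrative only, and no timings are claimed here; the point is just that both forms produce the same matrix.

```julia
vs = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # a Vector{Vector{Int}}

A = reduce(hcat, vs)   # hits the specialized reduce method
B = hcat(vs...)        # splatting: same result, one argument per inner vector

# Each inner vector becomes one column of the result.
@assert A == B == [1 4 7; 2 5 8; 3 6 9]
@assert size(A) == (3, 3)
```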

Personally, I hate specializations like reduce(::typeof(hcat),...) and would love for these to eventually be removed. They thwart the programmer's ability to reason about what may or may not be effective (without the specialization, the reduce approach would be horrible) without an encyclopedic knowledge of what specializations are available.

The reduce specializations work for this purpose, but are highly undiscoverable. Most people unaware of the specializations (and most people are unaware, in my experience) tend to reach for a splatted hcat. Further, these specializations are limited to just those two functions in that exact context. How do I concatenate a Vector{Matrix{T}} into an Array{T,3}? There is no builtin (that I am aware of) to do this efficiently and readably. I can start doing grotesque things like reshape(reduce(hcat,vectorofmatrices),size(first(vectorofmatrices))...,:) but I thought I left those monstrosities with MATLAB.
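For reference, the reshape workaround above can be checked on a toy input. This is only a sketch with made-up data (four 2×3 matrices); it relies on column-major layout, so each block of three columns in the hcat result becomes one slice along the third dimension.

```julia
ms = [fill(k, 2, 3) for k in 1:4]   # a Vector{Matrix{Int}}, each item 2×3

# The workaround from the text: hcat into a 2×12 matrix, then reshape to 2×3×4.
A = reshape(reduce(hcat, ms), size(first(ms))..., :)

@assert size(A) == (2, 3, 4)
@assert all(A[:, :, k] == ms[k] for k in eachindex(ms))
@assert A == cat(ms...; dims=3)   # the splatted cat builds the same array
```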

We have minimum for reducing a collection using min and many other such functions, but nothing for concatenation. I propose a function concat(itr;dims) (or concatenate) to generalize cat (which handles arbitrary dimensions - also multiple dimensions simultaneously but that is more than an initial implementation would need) to iterables. In keeping with similar functions, perhaps a predicate could be considered, but that implementation might be tricky given that the predicate might change the size of the items in a way that makes preallocation overly difficult. One can just use concat([f(x) for x in itr];dims) in the meantime.
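As a starting point for discussion, here is a rough sketch of what such a function might look like. Everything in it is an assumption: the name concat, the single-integer dims keyword, and the choice to collect the iterable up front so the output can be preallocated. It is not an existing Base function and not a proposed final implementation.

```julia
# Hypothetical sketch; `concat` does not exist in Base.
# Concatenates the arrays produced by `itr` along the single dimension `dims`.
function concat(itr; dims::Integer)
    arrays = collect(itr)   # materialize so sizes are known before allocating
    isempty(arrays) && throw(ArgumentError("cannot concatenate an empty collection"))
    a1 = first(arrays)
    N = max(ndims(a1), dims)   # the output may gain trailing dimensions
    for a in arrays, d in 1:max(N, ndims(a))
        # every dimension except `dims` must agree with the first item
        d == dims || size(a, d) == size(a1, d) ||
            throw(DimensionMismatch("items disagree in dimension $d"))
    end
    T = mapreduce(eltype, promote_type, arrays)
    total = sum(a -> size(a, dims), arrays)
    outsize = ntuple(d -> d == dims ? total : size(a1, d), N)
    out = Array{T}(undef, outsize)
    offset = 0
    for a in arrays
        n = size(a, dims)
        r = (offset + 1):(offset + n)   # slot for this item along `dims`
        idx = ntuple(d -> d == dims ? r : (1:size(a1, d)), N)
        out[idx...] = reshape(a, ntuple(d -> size(a, d), N))
        offset += n
    end
    return out
end

vs = [[1, 2, 3], [4, 5, 6]]
ms = [fill(k, 2, 3) for k in 1:4]

@assert concat(vs; dims=2) == [1 4; 2 5; 3 6]
@assert concat(ms; dims=3) == cat(ms...; dims=3)
# Mapping a function first, as suggested above, via a generator:
@assert concat((x .^ 2 for x in vs); dims=2) == [1 16; 4 25; 9 36]
```

A real implementation would likely want to avoid the collect for iterators of known length and to support multiple dims as cat does; both are left out of this sketch.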

Looking for discussion on name, functionality, and what else might be missing here. I've got a backlog of other PRs to work on in the near future, so someone feel free to go ahead and take a shot if you're feeling inspired.
