A generic function that creates an index for splitting an object into n batches, or into batches of a given size. The data.frame method batches by row.

Usage

batch(x, n = NULL, size = NULL, ..., balance = !is.null(n))

# S3 method for data.frame
batch(x, n = NULL, size = NULL, ..., balance = !is.null(n))

Arguments

x

A vector or data frame.

n

An integer. The number of batches to create.

size

An integer. The size of batches to create.

...

Arguments passed on to further methods.

balance

Logical. Should batch sizes be (approximately) balanced? Defaults to TRUE when n is supplied and FALSE otherwise.

Value

An integer vector of batch numbers, one per element (or row) of x, suitable for use as the index passed to split().
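
To make the interaction between n, size and balance concrete, here is a minimal base R sketch that reproduces the index vectors shown in the Examples for a given length. The name batch_sketch() and its len argument are hypothetical, and this is not the package's implementation; it only assumes the behaviour visible in the printed output (larger batches come first when balancing, and the last batch holds the remainder otherwise).

```r
# Hypothetical helper: builds a batch index for `len` items.
# Not the package's code; a sketch matching the documented examples.
batch_sketch <- function(len, n = NULL, size = NULL, balance = !is.null(n)) {
  stopifnot(!is.null(n) || !is.null(size))
  # If only `size` is given, derive the number of batches from it.
  if (is.null(n)) n <- ceiling(len / size)
  if (balance) {
    # Spread the remainder over the first batches so sizes differ by at most 1.
    base  <- len %/% n
    extra <- len %% n
    sizes <- rep(base, n) + (seq_len(n) <= extra)
    as.integer(rep(seq_len(n), times = sizes))
  } else {
    # Fill each batch to `size`; the last batch takes whatever is left.
    if (is.null(size)) size <- ceiling(len / n)
    as.integer((seq_len(len) - 1L) %/% size + 1L)
  }
}

table(batch_sketch(26, n = 8))                       # sizes 4 4 3 3 3 3 3 3
table(batch_sketch(26, size = 8))                    # sizes 8 8 8 2
table(batch_sketch(26, size = 8, balance = TRUE))    # sizes 7 7 6 6
```

With balance = TRUE and only size given, size acts as an upper bound: the number of batches is derived from it and the items are then spread evenly.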

Examples

batch(LETTERS, 8)
#>  [1] 1 1 1 1 2 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6 7 7 7 8 8 8
batch(LETTERS, size = 8)
#>  [1] 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 4 4
batch(LETTERS, size = 8, balance = TRUE)
#>  [1] 1 1 1 1 1 1 1 2 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4

# The data.frame method batches rows
split(iris, batch(iris, 2)) |> str()
#> List of 2
#>  $ 1:'data.frame':	75 obs. of  5 variables:
#>   ..$ Sepal.Length: num [1:75] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
#>   ..$ Sepal.Width : num [1:75] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
#>   ..$ Petal.Length: num [1:75] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
#>   ..$ Petal.Width : num [1:75] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
#>   ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
#>  $ 2:'data.frame':	75 obs. of  5 variables:
#>   ..$ Sepal.Length: num [1:75] 6.6 6.8 6.7 6 5.7 5.5 5.5 5.8 6 5.4 ...
#>   ..$ Sepal.Width : num [1:75] 3 2.8 3 2.9 2.6 2.4 2.4 2.7 2.7 3 ...
#>   ..$ Petal.Length: num [1:75] 4.4 4.8 5 4.5 3.5 3.8 3.7 3.9 5.1 4.5 ...
#>   ..$ Petal.Width : num [1:75] 1.4 1.4 1.7 1.5 1 1.1 1 1.2 1.6 1.5 ...
#>   ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 2 2 2 2 2 2 2 2 2 2 ...
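
Once an index like this exists, a common pattern is to process each batch separately and then recombine the results. The sketch below uses only base R; the idx vector is a stand-in, built by hand, for what an unbalanced call with size = 50 would be expected to return for iris (an assumption based on the vector examples above), so the block runs without the package loaded.

```r
# Stand-in index: rows 1-50 -> 1, 51-100 -> 2, 101-150 -> 3.
# With the package loaded, a call to batch() with size = 50 would take its place.
idx <- as.integer((seq_len(nrow(iris)) - 1L) %/% 50L + 1L)

# Split the rows by batch, then work on each piece independently.
parts <- split(iris, idx)
batch_rows <- vapply(parts, nrow, integer(1))
batch_means <- vapply(parts, function(d) mean(d$Sepal.Length), numeric(1))

# Recombine the pieces; row order is preserved because batches are contiguous.
recombined <- do.call(rbind, unname(parts))
```

Passing unname(parts) to rbind keeps the original row names instead of prefixing them with the batch number.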