Skip to content

[MNT] Switch from np.column_stack() to np.vstack().T for performance #31130

@scottshambaugh

Description

@scottshambaugh

From the discussion in #31001 (comment), it looks like np.column_stack() is generally a slow operation compared to np.vstack().T. This is because the former has to interleave elements in memory whereas the second does contiguous memory copies and returns a view.

It's unclear to me how many of these operations are driving time spent in the hot paths of the code, but we should see a performance improvement by switching things over.

Marking this as an easy first issue since it's largely a find-and-replace.

10,000 elements: 10 runs x 10,000 iterations

With broadcast:
- `np.column_stack(np.broadcast_arrays(x, y))`: 36.47 us
- `np.vstack(np.broadcast_arrays(x, y)).T`: 27.67 us
- `np.empty + assign`: 30.09 us

Without broadcast:
- `np.column_stack([x, y])`: 20.63 us
- `np.vstack((x, y)).T`: 13.18 us
"""Benchmark different array combination methods."""

import timeit
import numpy as np

N = 10_000
NUMBER = 10_000
REPEAT = 10

x = np.linspace(0, 1, N)
y = np.float64(2.0)

# Pre-matched arrays for non-broadcast cases
x_full = x
y_full = np.full(N, 2.0)


def broadcast_column_stack():
    return np.column_stack(np.broadcast_arrays(x, y))

def broadcast_vstack_T():
    return np.vstack(np.broadcast_arrays(x, y)).T

def broadcast_empty_assign():
    out = np.empty((N, 2))
    bx, by = np.broadcast_arrays(x, y)
    out[:, 0] = bx
    out[:, 1] = by
    return out

def no_broadcast_column_stack():
    return np.column_stack([x_full, y_full])

def no_broadcast_vstack_T():
    return np.vstack((x_full, y_full)).T


def bench(func):
    times = timeit.repeat(func, number=NUMBER, repeat=REPEAT)
    return min(times) / NUMBER


if __name__ == "__main__":
    print(f"{N:,} elements: {REPEAT} runs x {NUMBER:,} iterations")

    broadcast_cases = [
        ("np.column_stack(np.broadcast_arrays(x, y))", broadcast_column_stack),
        ("np.vstack(np.broadcast_arrays(x, y)).T", broadcast_vstack_T),
        ("np.empty + assign", broadcast_empty_assign),
    ]

    no_broadcast_cases = [
        ("np.column_stack([x, y])", no_broadcast_column_stack),
        ("np.vstack((x, y)).T", no_broadcast_vstack_T),
    ]

    print("\nWith broadcast:")
    for label, func in broadcast_cases:
        t = bench(func)
        print(f"- `{label}`: {t * 1e6:.2f} us")

    print("\nWithout broadcast:")
    for label, func in no_broadcast_cases:
        t = bench(func)
        print(f"- `{label}`: {t * 1e6:.2f} us")

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions