Differences in Array Dimensions in Julia, Python (NumPy, PyTorch)

Overview

When working with high-dimensional arrays in Julia and in NumPy or PyTorch (referred to collectively as Python below, for simplicity), pay attention to what each dimension means, because the conventions differ. The difference arises because Julia’s arrays are column-major, whereas Python’s arrays are row-major. Matlab is column-major like Julia, so readers familiar with Matlab have little to worry about, but readers accustomed to Python should take care not to make indexing errors.

Note that the word ‘dimension’ is used below both for the number of axes of an array and for the number of components of a vector, so take care to distinguish the two.

Explanation

1-Dimensional Arrays

In Julia, an array of size $n$ represents an $n$-dimensional column vector.

julia> ones(3)
3-element Vector{Float64}:
 1.0
 1.0
 1.0

In Python, an array of size $n$ represents an $n$-dimensional row vector.

>>> import numpy as np
>>> np.ones(3)
array([1., 1., 1.])

Although one is a column and the other a row, both are 1-dimensional arrays, so there is little to be careful about when indexing.
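As a small sketch of the point above (the variable names `v`, `row`, and `col` are only illustrative), a 1-dimensional NumPy array has a single axis, so it takes a single index. A row or column orientation only becomes explicit once the array is reshaped to 2 dimensions.

```python
import numpy as np

v = np.ones(3)           # 1-D array: one axis of length 3
row = v.reshape(1, 3)    # explicit 1 x 3 row vector (2-D)
col = v.reshape(3, 1)    # explicit 3 x 1 column vector (2-D)

print(v.shape, row.shape, col.shape)

# The 1-D array takes one index; the 2-D versions take two.
print(v[2], row[0, 2], col[2, 0])
```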

2-Dimensional Arrays

Up to 2 dimensions, the two may not look different at first glance. However, the dimensions mean different things, so caution is required. In Julia, new dimensions are appended at the end. That is, an $(m,n)$ array consists of $n$ 1-dimensional arrays (column vectors) of size $m$. For example, a $(3,2)$ array consists of two 3-dimensional column vectors.

julia> ones(3,2)
3×2 Matrix{Float64}:
 1.0  1.0
 1.0  1.0
 1.0  1.0

Additionally, because Julia is ‘column-major’, the linear index of the elements increases from top to bottom first, and then from left to right.

julia> A = reshape(range(1,6), (3,2))
3×2 reshape(::UnitRange{Int64}, 3, 2) with eltype Int64:
 1  4
 2  5
 3  6

julia> for i ∈ 1:6
           println(A[i])
       end
1
2
3
4
5
6
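The column-major rule can also be written as an offset formula. The following is a minimal sketch in plain Python, not Julia's actual implementation. The helper `column_major_offset` is a hypothetical name introduced here, and 1 is added to match Julia's 1-based linear indices.

```python
def column_major_offset(i, j, m):
    """0-based linear offset of element (i, j) in a column-major array with m rows."""
    return i + j * m

m, n = 3, 2
# Julia-style (1-based) linear index at every (row, column) position.
linear_index = [[column_major_offset(i, j, m) + 1 for j in range(n)]
                for i in range(m)]

for row in linear_index:
    print(row)
```

The printed rows reproduce the layout of `A` above: moving down a column increases the index by 1, and moving right by one column increases it by the number of rows $m$.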

In Python, on the other hand, new dimensions are added at the front. That is, an $(m,n)$ array consists of $m$ 1-dimensional arrays (row vectors) of size $n$. The output below shows that the array is grouped row by row.

>>> np.ones([3,2])
array([[1., 1.],
       [1., 1.],
       [1., 1.]])

Therefore, in both Julia and Python an $(m,n)$ array looks like an $m \times n$ matrix, but because of the difference between column-major and row-major order, the order of the elements changes. Elements are traversed top to bottom and then left to right in Julia, and left to right and then top to bottom in Python.

# In Julia, a 2-dimensional array is indexed top to bottom, then left to right
julia> A = reshape(range(1,6), (3,2))
3×2 reshape(::UnitRange{Int64}, 3, 2) with eltype Int64:
 1  4
 2  5
 3  6

# In Python, a 2-dimensional array is indexed left to right, then top to bottom
>>> np.arange(6).reshape(3,2)
array([[0, 1],
       [2, 3],
       [4, 5]])
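The two fill orders can be compared within NumPy itself, since `reshape` accepts an `order` argument: `"C"` (the default) is row-major and `"F"` is column-major. The sketch below uses the values 1 to 6 so that the result is easy to compare with the Julia output above. The variable names are only illustrative.

```python
import numpy as np

values = np.arange(1, 7)                      # 1, 2, ..., 6

c_order = values.reshape(3, 2)                # row-major fill (NumPy default)
f_order = values.reshape((3, 2), order="F")   # column-major fill, as in Julia

print(c_order)
print(f_order)

# Reading f_order back in column-major order recovers 1..6,
# which mirrors the loop over A[i] in the Julia example.
print(f_order.flatten(order="F"))
```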

3-Dimensional Arrays

In Julia, new dimensions are appended at the end. Thus, an $(m,n,k)$ array consists of $k$ arrays of size $(m,n)$.

julia> ones(3,2,4)
3×2×4 Array{Float64, 3}:
[:, :, 1] =
 1.0  1.0
 1.0  1.0
 1.0  1.0

[:, :, 2] =
 1.0  1.0
 1.0  1.0
 1.0  1.0

[:, :, 3] =
 1.0  1.0
 1.0  1.0
 1.0  1.0

[:, :, 4] =
 1.0  1.0
 1.0  1.0
 1.0  1.0

Conversely, in Python, an $(m,n,k)$ array consists of $m$ arrays of size $(n,k)$.

>>> np.ones([3,2,4])
array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.]]])
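The grouping can be checked by slicing. In the sketch below (array contents chosen arbitrarily for illustration), indexing the first axis yields one of the $m$ blocks of size $(n,k)$, whereas fixing the last axis yields the $(m,n)$ slice that Julia would display as `[:, :, 1]`.

```python
import numpy as np

a = np.arange(24).reshape(3, 2, 4)   # shape (m, n, k) = (3, 2, 4)

block = a[0]          # first of the m = 3 leading blocks
plane = a[:, :, 0]    # slice with the last axis fixed

print(len(a))         # number of leading blocks
print(block.shape)    # shape of one block
print(plane.shape)    # shape of the slice along the last axis

# In row-major storage the leading block is one contiguous chunk of memory,
# while the slice along the last axis is strided.
print(block.flags["C_CONTIGUOUS"], plane.flags["C_CONTIGUOUS"])
```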

In Machine Learning

Consider image data, where $H$ is the height, $W$ is the width, $C$ is the number of channels, and $B$ is the batch size. In PyTorch, a batch of images is a $(B,C,H,W)$ array, whereas in Julia it is an $(H,W,C,B)$ array.
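To make the correspondence between the two layouts concrete, the sketch below reorders the axes of a $(B,C,H,W)$ array into $(H,W,C,B)$ order using `np.transpose`. The sizes are hypothetical, and NumPy is used in place of PyTorch only to keep the example small. This illustrates the axis correspondence, not a recipe for passing data between the two languages.

```python
import numpy as np

B, C, H, W = 8, 3, 32, 64    # hypothetical batch size, channels, height, width

# Batch laid out in the PyTorch convention (B, C, H, W).
batch_bchw = np.arange(B * C * H * W).reshape(B, C, H, W)

# Axis i of the result is axis axes[i] of the input,
# so (2, 3, 1, 0) picks H, W, C, B in that order.
batch_hwcb = np.transpose(batch_bchw, (2, 3, 1, 0))

print(batch_hwcb.shape)

# The same pixel is addressed with the indices in the new order.
b, c, h, w = 5, 1, 10, 20
print(batch_bchw[b, c, h, w] == batch_hwcb[h, w, c, b])
```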

Environment

OS: Windows 11
Version: Julia 1.7.1, Python 3.9.2, NumPy 1.19.5