What Is a Pooling Layer in Deep Learning?
Overview
In artificial neural networks, a pooling layer is a function that reduces the size of its input by summarizing each small local region with a single value. Keeping only the maximum value within each region is called max pooling, while keeping the average value of each region is called average pooling.
Definition
For an $m \times m$ matrix $\mathbf{X} = [X_{ij}]$, the following function is called max pooling.
$$ \begin{align*} P_{\text{max}} : \mathbb{R}^{m \times m} &\to \mathbb{R} \\ \mathbf{X} &\mapsto \max \left\{ X_{ij} : 1 \le i, j \le m\right\} \end{align*} $$
The following function is called average pooling.
$$ \begin{align*} P_{\text{avg}} : \mathbb{R}^{m \times m} &\to \mathbb{R} \\ \mathbf{X} &\mapsto \frac{1}{m^2} \sum_{i=1}^{m} \sum_{j=1}^{m} X_{ij} \end{align*} $$
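As a quick check of the two definitions, the following is a minimal sketch in plain Julia. The matrix `X` is an arbitrary example chosen here for illustration, and `mean` comes from the `Statistics` standard library; under the definition above, max pooling and average pooling of a single matrix are nothing more than its maximum and its mean.

```julia
using Statistics  # provides mean

# Arbitrary 2×2 example matrix (not from the original text)
X = [1.0 3.0;
     2.0 8.0]

# Max pooling of the whole matrix: the largest entry
p_max = maximum(X)   # 8.0

# Average pooling of the whole matrix: (1 + 3 + 2 + 8) / 4
p_avg = mean(X)      # 3.5
```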
Explanation
The word “pooling” could in fact be omitted: by the definition above, simply saying maximum or average conveys the same meaning.
A pooling layer is a function that partitions the input data into small regions and applies a pooling operation to each region. It can be defined as follows. For $k, m \in \mathbb{N}$ and $n = km$, the function $P_{\text{max}} : \mathbb{R}^{n \times n} \to \mathbb{R}^{m \times m}$ defined below is called a max pooling layer.
$$ \begin{align*} \mathbf{Y} &= P_{\max}(\mathbf{X}) \\ Y_{ij} &= \max \{ X_{(i-1)k+p, (j-1)k+q} \mid 1 \le p, q \le k \}. \end{align*} $$
The function $P_{\text{avg}} : \mathbb{R}^{n \times n} \to \mathbb{R}^{m \times m}$ defined as follows is called an average pooling layer.
$$ \begin{align*} \mathbf{Y} &= P_{\operatorname{avg}}(\mathbf{X}) \\ Y_{ij} &= \dfrac{1}{k^{2}} \sum_{p=1}^{k} \sum_{q=1}^{k} X_{(i-1)k+p, (j-1)k+q} \end{align*} $$
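The two layer definitions can be sketched directly in plain Julia without any deep learning package. The helper name `pool` and its argument order are hypothetical, introduced here only for illustration; it assumes a square $n \times n$ input whose side is divisible by $k$, and it covers only the non-overlapping case described by the formulas, not strides or padding.

```julia
using Statistics  # provides mean

# Hypothetical helper: apply `op` to each non-overlapping k×k block of X.
# Block (i, j) covers rows (i-1)k+1 : ik and columns (j-1)k+1 : jk,
# matching the indices X_{(i-1)k+p, (j-1)k+q} with 1 ≤ p, q ≤ k.
function pool(op, X::AbstractMatrix, k::Integer)
    n = size(X, 1)
    size(X, 2) == n || throw(ArgumentError("X must be square"))
    n % k == 0 || throw(ArgumentError("k must divide the side length of X"))
    m = n ÷ k
    return [op(view(X, (i-1)*k+1:i*k, (j-1)*k+1:j*k)) for i in 1:m, j in 1:m]
end

# 4×4 matrix filled column by column with 1.0, 2.0, ..., 16.0
X = reshape(collect(1.0:16.0), 4, 4)

Y_max = pool(maximum, X, 2)  # max pooling layer with k = 2
Y_avg = pool(mean, X, 2)     # average pooling layer with k = 2
```

With $k = 2$ and $n = 4$, `Y_max` is `[6.0 14.0; 8.0 16.0]` and `Y_avg` is `[3.5 11.5; 5.5 13.5]`. Passing the reduction as the first argument keeps the block indexing in one place, so the max and average layers differ only in the function applied to each block.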
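With Flux.jl, a deep learning package for Julia, the pooling layers can be applied to a simple matrix as follows. Flux's pooling layers expect a 4-dimensional array in (width, height, channel, batch) order, so the $4 \times 4$ matrix is reshaped to size $4 \times 4 \times 1 \times 1$.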
julia> using Flux
julia> mp = MaxPool((2,2))
MaxPool((2, 2))
julia> ap = MeanPool((2,2))
MeanPool((2, 2))
julia> A = collect(1.0:16.0) |> x->reshape(x, 4,4,1,1)
4×4×1×1 Array{Float64, 4}:
[:, :, 1, 1] =
1.0 5.0 9.0 13.0
2.0 6.0 10.0 14.0
3.0 7.0 11.0 15.0
4.0 8.0 12.0 16.0
julia> mp(A)
2×2×1×1 Array{Float64, 4}:
[:, :, 1, 1] =
6.0 14.0
8.0 16.0
julia> ap(A)
2×2×1×1 Array{Float64, 4}:
[:, :, 1, 1] =
3.5 11.5
5.5 13.5