Machine Learning
Machine learning is the process of enabling a machine to learn how to identify features from an existing dataset so that it can effectively recognize those features in new data as well.
The above definition isn’t particularly rigorous, nor does it need to be. Simply put, the machine can be understood as a computer or a piece of code. The dataset already in hand and used for learning is called the training set (훈련집합). If we compare machine learning to a student studying for exams (a small code sketch follows the list):
- Machine: Student
- Training set: Past exam papers
- Features: Question patterns
- New data: Actual exam questions
- Learning: Solving past papers to be able to tackle actual exam questions
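To make the analogy concrete, here is a minimal sketch in plain Python; the data and the 1-nearest-neighbor rule are made up for illustration, not taken from any post below. The machine "learns" by storing the training set, then recognizes the same features in new data.

```python
# A toy "machine": it memorizes the training set (past exam papers)
# and labels new data (actual exam questions) by the closest known example.
training_set = [
    ((1.0, 1.2), "cat"), ((0.9, 1.1), "cat"),
    ((3.0, 3.3), "dog"), ((3.2, 2.9), "dog"),
]

def predict(x):
    """1-nearest-neighbor: return the label of the training point closest to x."""
    def dist2(p):
        return (p[0] - x[0]) ** 2 + (p[1] - x[1]) ** 2
    _, label = min(training_set, key=lambda pair: dist2(pair[0]))
    return label

print(predict((1.1, 1.0)))  # "cat": a pattern seen in the training set
print(predict((3.1, 3.0)))  # "dog": the same pattern recognized in new data
```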
As of 2021, the most widely used approach to machine learning is deep learning (심층학습), which increases the number of hidden layers in artificial neural networks. Artificial neural networks have recently seen dramatic improvements in performance and often deliver the best results. Before deep learning achieved satisfactory performance, models based on statistical theory dominated machine learning.
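Since the posts below mostly use PyTorch, here is a minimal sketch of what "increasing the hidden layers" means in code; the layer widths (10, 32, 1) are arbitrary choices for illustration.

```python
import torch.nn as nn

# A shallow network: a single hidden layer.
shallow = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))

# A "deep" network: same input and output, more hidden layers stacked in between.
deep = nn.Sequential(
    nn.Linear(10, 32), nn.ReLU(),
    nn.Linear(32, 32), nn.ReLU(),
    nn.Linear(32, 32), nn.ReLU(),
    nn.Linear(32, 1),
)
```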
To become proficient in machine learning, one needs to be skilled in mathematics, statistics, and programming: mathematical and statistical knowledge is required to understand the theory, and programming skills are needed to implement it. In particular, studying machine learning theory in depth calls for linear algebra (especially matrix theory), measure theory, functional analysis, and more. Recent research also connects artificial intelligence with fields such as geometry, graph theory, and partial differential equations.
The following articles are written to be as accessible as possible for mathematics majors.
Basics
Learning Concepts
- Supervised and Unsupervised Learning
- What are Training/Validation/Test Sets?
- Online Learning, Batch Learning, Mini-Batch Learning
Optimization
Sampling
- What is the Monte Carlo Method?
- Monte Carlo Integration (a short sketch follows this list)
- Rejection Sampling
- Importance Sampling
- Markov Chain Monte Carlo (MCMC)
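As a taste of the sampling posts, a minimal Monte Carlo integration sketch: estimate $\int_{0}^{1} x^{2} dx = 1/3$ as the sample mean of $f(U)$ for uniform $U$; the sample count is an arbitrary choice.

```python
import random

# Monte Carlo integration: if U is uniform on [0, 1], then E[f(U)] equals
# the integral of f over [0, 1], so a sample mean estimates the integral.
n = 100_000
estimate = sum(random.random() ** 2 for _ in range(n)) / n
print(estimate)  # close to 1/3
```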
Classical Machine Learning
- Positive Definite Kernels and Reproducing Kernel Hilbert Space $k$, $H_{k}$
- Proof of the Representer Theorem (the statement is sketched below)
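For orientation, a sketch of the theorem these two posts concern, in its standard form (the posts’ exact hypotheses and notation may differ): given data $(x_{1}, y_{1}), \dots, (x_{n}, y_{n})$, a loss $L$, and a strictly increasing regularizer $g$, every minimizer of

$$ \min_{f \in H_{k}} L \big( (x_{1}, y_{1}, f(x_{1})), \dots, (x_{n}, y_{n}, f(x_{n})) \big) + g \left( \lVert f \rVert_{H_{k}} \right) $$

admits the finite representation $f^{\ast} = \sum_{i=1}^{n} \alpha_{i} k(\cdot, x_{i})$ for some $\alpha_{1}, \dots, \alpha_{n} \in \mathbb{R}$.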
Linear Regression Models
Linear Classification Models
- Linear Classification Models
- Least Squares Method
- Fisher’s Linear Discriminant (the criterion is sketched after this list)
- Neyman-Pearson Criterion for Binary Classification
- Bayes Risk Classifier
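For Fisher’s linear discriminant above, a sketch of the criterion in standard notation (class means $m_{1}, m_{2}$ and within-class scatter matrix $S_{W}$; the post may use different symbols): choose the projection direction $w$ maximizing

$$ J(w) = \frac{\left( w^{\top} m_{2} - w^{\top} m_{1} \right)^{2}}{w^{\top} S_{W} w}, $$

whose maximizer satisfies $w \propto S_{W}^{-1} (m_{2} - m_{1})$.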
Clustering
Subfields
Reinforcement Learning
- 🔒 Mathematical Foundations for Reinforcement Learning
- 🔒 What is Reinforcement Learning
- 🔒 Multi-Armed Bandit Problem
- 🔒 Markov (Reward) Process
- 🔒 Markov Decision Process
Computer Vision
Deep Learning Theory
- Layer
- Linear Layer
- Convolutional Layer
- Skip Connection
- Activation Functions
- What is an Artificial Neural Network (ANN)?
- Meanings and Differences of ANN (Artificial Neural Network), DNN (Deep Neural Network), and FNN (Feedforward Neural Network)
- Definition of Perceptron
- What is Deep Learning?
- Automatic Differentiation
- Mathematical Foundations of Deep Learning, Proof of the Cybenko Theorem
- What is Continual Learning in Deep Learning
- Overfitting and Regularization
- Boltzmann Machine
- Restricted Boltzmann Machine
- Batch Learning Algorithm
- Online Learning Algorithm
- RBM for Classification
- Radial Basis Function
- MLP (Multilayer Perceptron)
- CNN (Convolutional Neural Network)
- PINN (Physics-Informed Neural Networks) Paper Review
- Autoencoder
- DeepONet (Deep Operator Networks) Paper Review
- Implementation in PyTorch
- Implementation in Julia
- U-Net Paper Review
- Implementation in Julia
- KAN (Kolmogorov-Arnold Neural Network) Paper Review
Deep Learning Frameworks
Python PyTorch
- How to Check Model/Tensor Device: `.get_device()`
- Random Sampling from a Given Distribution: `torch.distributions.Distribution().sample()`
- Creating and Using Custom Datasets with Numpy Arrays: `TensorDataset`, `DataLoader`
- Saving and Loading Weights, Models, and Optimizers: `torch.save(model.state_dict())` (a combined sketch follows this list)
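A combined minimal sketch of the APIs listed above; the file name, shapes, and distribution are arbitrary choices for illustration.

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader

# Wrap numpy arrays as a dataset and iterate over it in mini-batches.
x = torch.from_numpy(np.random.rand(100, 3).astype(np.float32))
y = torch.from_numpy(np.random.randint(0, 2, size=100))
loader = DataLoader(TensorDataset(x, y), batch_size=16, shuffle=True)

# Draw random samples from a distribution object.
normal = torch.distributions.Normal(0.0, 1.0)
z = normal.sample((5,))

# Save and load a model's weights via its state_dict.
model = torch.nn.Linear(3, 1)
torch.save(model.state_dict(), "weights.pt")
model.load_state_dict(torch.load("weights.pt"))

# Check where a tensor lives: .device names the device,
# .get_device() gives the CUDA index (-1 on CPU in recent versions).
print(x.device)
```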
Neural Networks
- Implementing a Multilayer Perceptron
- Defining Neural Networks with Lists and Loops: `nn.ModuleList`
- Accessing Model Weights and Biases: `.weight.data`, `.bias.data`
- Weight Initialization: `torch.nn.init` (a combined sketch follows this list)
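A minimal sketch combining the items above; the layer sizes and the Xavier initializer are arbitrary choices for illustration.

```python
import torch.nn as nn

# Build layers in a loop; nn.ModuleList registers each one as a submodule.
sizes = [4, 16, 16, 1]
layers = nn.ModuleList(
    [nn.Linear(m, n) for m, n in zip(sizes[:-1], sizes[1:])]
)

# Initialize the weights and zero the biases.
for layer in layers:
    nn.init.xavier_uniform_(layer.weight)
    nn.init.zeros_(layer.bias)

# Access the underlying tensors directly.
print(layers[0].weight.data.shape)  # torch.Size([16, 4])
print(layers[0].bias.data.shape)    # torch.Size([16])
```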
Tensors
- Modular Arithmetic: `fmod`, `remainder`
- Handling Dimensions and Sizes: `.dim()`, `.ndim`, `.view()`, `.reshape()`, `.shape`, `.size()`
- Creating Random Permutations and Shuffling Tensors: `torch.randperm`, `tensor[indices]`
- Deep Copying Tensors: `.clone()`
- Concatenating and Stacking Tensors: `torch.cat()`, `torch.stack()`
- Padding Tensors: `torch.nn.functional.pad()`
- Sorting Tensors: `torch.sort()`, `torch.argsort()` (a combined sketch follows this list)
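A combined minimal sketch of the tensor utilities above; every shape and value is an arbitrary choice for illustration.

```python
import torch
import torch.nn.functional as F

t = torch.arange(6).view(2, 3)     # reshape: t.shape == (2, 3), t.dim() == 2

perm = torch.randperm(t.size(0))   # random permutation of the row indices
shuffled = t[perm]                 # shuffle the rows

copy = t.clone()                   # deep copy into new memory

stacked = torch.stack([t, t])      # new dimension: shape (2, 2, 3)
joined = torch.cat([t, t], dim=0)  # existing dimension: shape (4, 3)

padded = F.pad(t, (1, 1))          # zero-pad the last dimension: shape (2, 5)

values, indices = torch.sort(t, dim=1, descending=True)

# fmod keeps the dividend's sign; remainder keeps the divisor's sign.
print(torch.fmod(torch.tensor([-5]), 3))       # tensor([-2])
print(torch.remainder(torch.tensor([-5]), 3))  # tensor([1])
```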
Troubleshooting
- Fixing ‘RuntimeError: grad can be implicitly created only for scalar outputs’
- Fixing ‘TypeError: can’t convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first’ with Lists
- Fixing ‘RuntimeError: Boolean value of Tensor with more than one value is ambiguous’
- Fixing ‘RuntimeError: Parent directory does not exist’ When Saving Models
Julia
- Julia’s Deep Learning Frameworks: `Flux.jl`, `Knet.jl`, `Lux.jl`
- Using Machine Learning Datasets: `MLDatasets.jl`
Flux
- Handling Hidden Layers
- Implementing MLP and Optimizing with Gradient Descent
- One-Hot Encoding: `onehot()`, `onehotbatch()`, `onecold()`
- Implementing MLP for Nonlinear Function Approximation
- Implementing MLP and Training on MNIST
- Using GPU
- Setting Neural Network Training/Test Modes: `trainmode!`, `testmode!`
References
- Christopher M. Bishop, Pattern Recognition and Machine Learning (2006)
- Simon Haykin, Neural Networks and Learning Machines (3rd Edition, 2009)
- Trevor Hastie, The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd Edition, 2017)
- 오일석, 기계 학습(MACHINE LEARNING) (2017)
- Richard S. Sutton, Reinforcement Learning: An Introduction (2nd Edition, 2018)
All posts
- Automatic Differentiation and Dual Numbers
- Implementing Forward-Mode Automatic Differentiation Using Dual Numbers in Julia
- Free Online Resources for Artificial Intelligence, Machine Learning, and Deep Learning
- Confusion Matrix, Sensitivity, and Specificity
- Cross-validation
- Drawing ROC Curves in R
- Finding the optimal cutoff using ROC curves
- Comparing Models Using the AUC of ROC Curves
- What is an Artificial Neural Network?
- Loss Functions in Machine Learning
- Gradient Descent and Stochastic Gradient Descent in Machine Learning
- What is Deep Learning?
- Activation Functions in Deep Learning
- Softmax Function in Deep Learning
- Dropout in Deep Learning
- Supervised and Unsupervised Learning
- k-Means Clustering
- What is Overfitting and Regularization in Machine Learning?
- Commonly Used Datasets in Machine Learning
- Paper Review: Do We Need Zero Training Loss After Achieving Zero Training Error?
- Continual Learning in Deep Learning
- What is Computer Vision
- Perceptron Definition
- What is a Sigmoid Function?
- What is a Logistic Function?
- Linear Models for Regression in Machine Learning
- What is a Discriminant Function?
- Mathematical Foundations of Deep Learning, Proof of the Cybenko Theorem
- What is Reinforcement Learning in Machine Learning
- Perceptron Convergence Theorem
- Back Propagation Algorithm
- PyTorch RuntimeError: "grad can be implicitly created only for scalar outputs" Solution
- How to Implement MLP in PyTorch
- Initializing Weights in PyTorch
- Creating and Using Custom Datasets from Numpy Arrays in PyTorch
- Saving and Loading Weights, Models, and Optimizers in PyTorch
- Creating Random Permutations and Shuffling Tensor Order in PyTorch
- How to Define Artificial Neural Network Layers with Lists and Loops in PyTorch
- How to Deep Copy Tensors in PyTorch
- How to Obtain the Weight Values of a Model in PyTorch
- How to Concatenate or Stack Tensors in PyTorch
- Using Machine Learning Datasets in Julia
- How to Pad PyTorch Tensors
- Handling the Dimensions and Sizes of PyTorch Tensors
- Handling Hidden Layers in Julia Flux
- Implementing MLP in Julia Flux and Optimizing with Gradient Descent
- How to Perform One-Hot Encoding in Julia Flux
- Implementing MLP in Julia Flux and Learning with MNIST
- Resolving 'TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.' with Lists in PyTorch
- Implementing MLP in Julia Flux to Approximate Nonlinear Functions
- Hierarchical Clustering
- Gradient Descent Learning of Linear Regression Models in Machine Learning
- Paper Review: Physics-Informed Neural Networks
- What is ReLU in Machine Learning?
- Training/Validation/Test Sets in Machine Learning
- What is a Softplus Function?
- How to Check the Device on which the PyTorch Model/Tensor is loaded
- Support Vector Machine
- Positive Definite Kernel and Reproducing Kernel Hilbert Space in Machine Learning
- What is One-Hot Encoding in Machine Learning?
- Proof of the Representer Theorem
- Automatic differentiation
- Various Deep Learning Frameworks of Julia
- MNIST Database
- Iris Dataset
- What is a Layer in Deep Learning?
- Sampling Randomly from a Given Distribution in PyTorch
- Solving 'RuntimeError: Boolean value of Tensor with more than one value is ambiguous' Error in PyTorch
- Functions for Tensor Sorting in PyTorch
- Flux-PyTorch-TensorFlow Cheat Sheet
- Solutions to 'RuntimeError: Parent directory does not exist' Error When Saving Models in PyTorch
- Monte Carlo Integration
- Rejection Sampling
- Monte Carlo Method
- What is Data Augmentation?
- Online Learning vs. Batch Learning in Machine Learning
- Modular Arithmetic in PyTorch
- Momentum Method in Gradient Descent
- Adaptive Learning Rates: AdaGrad, RMSProp, Adam
- How to Define and Train MLP with the Sequence Model and Functional API in TensorFlow and Keras
- Using AdaBelief Optimizer in PyTorch
- What is Skip Connection in Artificial Neural Networks?
- What are Weights in Machine Learning?
- Difference Between torch.nn and torch.nn.functional in PyTorch
- Grid Search, Brute Force, Hard Work
- Graduate Student Descent Method
- How to Use GPU in Julia Flux
- How to Set Training and Testing Modes for Neural Networks in Julia Flux
- Paper Review: Kolmogorov-Arnold Neural Network (KAN)
- Salt and Pepper Noise