PyTorch Tutorial

0. Foreword

The copyright belongs to the original author, and I am only using it for learning and sharing purposes.

Here is the author’s original address and related links.

Video: https://youtube.com/playlist?list=PLJV_el3uVTsOePyfmkfivYZ7Rqr2nMk3W

Course Home: https://speech.ee.ntu.edu.tw/~hylee/ml/2023-spring.php

Github: https://github.com/Fafa-DL/Lhy_Machine_Learning

PyTorch Official Document: https://pytorch.org/docs/stable/index.html

1. Background: Prerequisites & What is PyTorch?

2. Train

2. Dataset & DataLoader

Dataset: stores data samples and their expected values (labels).
DataLoader: groups data into batches and can load it with multiple worker processes.

dataset = MyDataset(file)
dataloader = DataLoader(dataset, batch_size, shuffle=True)

Customize MyDataset:

from torch.utils.data import Dataset, DataLoader

class MyDataset(Dataset):
    def __init__(self, file):
        # Read data & preprocess
        self.data = ...

    def __getitem__(self, index):
        # Returns one sample at a time
        return self.data[index]

    def __len__(self):
        # Returns the size of the dataset
        return len(self.data)

dataset = MyDataset(file)
dataloader = DataLoader(dataset, batch_size=5, shuffle=False)
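As a minimal sketch of the pattern above, assuming a toy in-memory dataset (the name `ToyDataset` and its synthetic contents are hypothetical, standing in for data read from a file), the following shows how a `DataLoader` groups samples into batches:

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyDataset(Dataset):
    """Hypothetical dataset: 12 samples, 3 features each, one target per sample."""

    def __init__(self, n_samples=12):
        # Synthetic data instead of reading a file (assumption for illustration)
        self.x = torch.arange(n_samples * 3, dtype=torch.float32).reshape(n_samples, 3)
        self.y = self.x.sum(dim=1)

    def __getitem__(self, index):
        # Returns one (sample, target) pair at a time
        return self.x[index], self.y[index]

    def __len__(self):
        # Returns the size of the dataset
        return len(self.x)


dataset = ToyDataset()
dataloader = DataLoader(dataset, batch_size=5, shuffle=False)

# Each iteration yields one batch; the last batch holds the 2 leftover samples
batch_shapes = [(tuple(x.shape), tuple(y.shape)) for x, y in dataloader]
print(batch_shapes)
```

With 12 samples and `batch_size=5`, the loader yields batches of 5, 5, and 2 samples. Passing `drop_last=True` to `DataLoader` would discard the incomplete final batch instead.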

3. Tensors

3.1 Tensors

Tensors are high-dimensional matrices (arrays).

3.2 Shape of Tensors

Check with .shape

Note: dim in PyTorch == axis in NumPy
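To illustrate the note above, here is a small sketch (the array contents are arbitrary) that reduces over the same dimension in both libraries:

```python
import numpy as np
import torch

a = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
t = torch.from_numpy(a)

print(t.shape)                   # shape of the tensor: 2 rows, 3 columns

np_sum = a.sum(axis=0)           # NumPy: reduce over axis 0 (the rows)
pt_sum = t.sum(dim=0)            # PyTorch: the same reduction, spelled dim=0

print(np_sum, pt_sum)            # both hold the column sums 3, 5, 7
```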

3.3 Creating Tensors

Directly from data (list or numpy.ndarray)

x = torch.tensor([[1, -1], [-1, 1]])
x = torch.from_numpy(np.array([[1, -1], [-1, 1]]))

Tensor of constant zeros & ones

x = torch.zeros([2, 2])
x = torch.ones([1, 2, 5])

3.4 Common Operations

Common arithmetic functions are supported, such as:

Addition
z = x + y
Subtraction
z = x - y
Power
y = x.pow(2)
Summation
y = x.sum()
Mean
y = x.mean()

Transpose: transpose two specified dimensions

>>> x = torch.zeros([2, 3])
>>> x.shape
torch.Size([2, 3])
>>> x = x.transpose(0, 1)
>>> x.shape
torch.Size([3, 2])

Squeeze: remove the specified dimension with length = 1

>>> x = torch.zeros([1, 2, 3])
>>> x.shape
torch.Size([1, 2, 3])
>>> x = x.squeeze(0)
>>> x.shape
torch.Size([2, 3])

Tip: The 0 in x.squeeze(0) refers to dimension 0.

Unsqueeze: insert a new dimension of length 1

>>> x = torch.zeros([2, 3])
>>> x.shape
torch.Size([2, 3])
>>> x = x.unsqueeze(1)
>>> x.shape
torch.Size([2, 1, 3])

Tip: The 1 in x.unsqueeze(1) refers to dimension 1.

Cat: concatenate multiple tensors

>>> x = torch.zeros([2, 1, 3])
>>> y = torch.zeros([2, 3, 3])
>>> z = torch.zeros([2, 2, 3])
>>> w = torch.cat([x, y, z], dim=1)
>>> w.shape
torch.Size([2, 6, 3])

Common initialization values:

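def normal(shape):
    # `device` is assumed to be defined beforehand, e.g. 'cpu' or 'cuda'
    return torch.randn(size=shape, device=device) * 0.01

Note: Multiplying by 0.01 scales the standard deviation of the initial values from 1 down to 0.01, which keeps them small.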

3.5 Data Type

Using different data types for model and data will cause errors.
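A minimal sketch of such a mismatch, assuming a default float32 layer and float64 input (as you would typically get from a float64 NumPy array). On the PyTorch versions I am aware of this raises a RuntimeError, and casting the data fixes it:

```python
import torch
import torch.nn as nn

layer = nn.Linear(4, 2)                       # parameters are float32 by default
x64 = torch.ones(3, 4, dtype=torch.float64)   # float64 data, e.g. converted from NumPy

try:
    layer(x64)                                # float64 input vs float32 weights
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True

out = layer(x64.float())                      # cast the data to float32 to match the model
print(mismatch_raised, out.dtype)
```

Casting the data with `.float()` is usually the cheaper fix. Converting the model with `.double()` also works but makes every computation run in float64.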

3.6 PyTorch vs. NumPy

Similar attributes

Many functions have the same names as well
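A few of these parallels, as a short sketch (the selection is mine, not an exhaustive list). The last pair shows a case where the operation matches but the name does not:

```python
import numpy as np
import torch

a = np.zeros((1, 2, 3), dtype=np.float32)
t = torch.zeros((1, 2, 3), dtype=torch.float32)

# Attributes with the same names
print(a.shape, tuple(t.shape))
print(a.dtype, t.dtype)

# Methods with the same names
a_flat, t_flat = a.reshape(6), t.reshape(6)
a_sq, t_sq = a.squeeze(), t.squeeze()

# Same idea, different name: adding a dimension of length 1
a_exp = np.expand_dims(a, 1)
t_exp = t.unsqueeze(1)
print(a_exp.shape, tuple(t_exp.shape))
```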

3.7 Device

Tensors & modules are computed on the CPU by default.

Use .to() to move tensors to the appropriate device.

CPU
x = x.to('cpu')
GPU
x = x.to('cuda')

3.8 Device (GPU)

Check whether your computer has an NVIDIA GPU that PyTorch can use
torch.cuda.is_available()
Multiple GPUs: specify 'cuda:0', 'cuda:1', 'cuda:2', …
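A common way to combine the two points above is to pick the device once and reuse it. This is a minimal sketch; the variable name `device` is a convention rather than anything PyTorch requires:

```python
import torch

# Fall back to the CPU when no usable GPU is found
device = 'cuda' if torch.cuda.is_available() else 'cpu'

x = torch.zeros([2, 2]).to(device)   # move the tensor to the chosen device
print(x.device)
```

Written this way, the same script runs unchanged on machines with and without a GPU.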

3.9 Gradient Calculation

>>> x = torch.tensor([[1., 0.], [-1., 1.]], requires_grad=True)
>>> z = x.pow(2).sum()
>>> z.backward()
>>> x.grad
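The example above computes z as the sum of all squared entries of x, so the derivative of z with respect to each entry is twice that entry. A small sketch that checks this against what autograd stores in `x.grad`:

```python
import torch

x = torch.tensor([[1., 0.], [-1., 1.]], requires_grad=True)
z = x.pow(2).sum()        # z = sum of x_ij ** 2
z.backward()              # fills x.grad with dz/dx

expected = 2 * x.detach() # analytic gradient: dz/dx_ij = 2 * x_ij
print(x.grad)
print(torch.equal(x.grad, expected))
```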

4. torch.nn: Models, Loss Functions

4.1 Network Layers

Linear Layer (Fully-connected Layer)
nn.Linear(in_features, out_features)

4.2 Network Parameters

>>> layer = torch.nn.Linear(32, 64)
>>> layer.weight.shape
torch.Size([64, 32])
>>> layer.bias.shape
torch.Size([64])
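The weight shape is (out_features, in_features) because the layer computes y = x Wᵀ + b. A short sketch, with an arbitrary batch size of 8, that reproduces the layer output by hand:

```python
import torch

layer = torch.nn.Linear(32, 64)
x = torch.randn(8, 32)                        # batch of 8 samples, 32 features each

y = layer(x)                                  # output shape: (8, 64)
manual = x @ layer.weight.T + layer.bias      # the same computation written out

print(y.shape)
print(torch.allclose(y, manual, atol=1e-5))
```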

4.3 Non-Linear Activation Functions

Sigmoid Activation
nn.Sigmoid()

m = torch.nn.Sigmoid()
x1 = torch.arange(-10, 10 + 1, 0.1)
y1 = m(x1)

ReLU Activation
nn.ReLU()

m = torch.nn.ReLU()
x1 = torch.arange(-10, 10 + 1, 0.1)
y1 = m(x1)

4.4 Build your own neural network

import torch.nn as nn


class MyModel(nn.Module):
    def __init__(self):
        """
        Initialize your model & define layers
        """
        super(MyModel, self).__init__()
        self.net = nn.Sequential(
            nn.Linear(10, 32),
            nn.Sigmoid(),
            nn.Linear(32, 1)
        )

    def forward(self, x):
        """
        Compute output of your NN
        """
        return self.net(x)

The following definition has the same effect as the one above, with each layer stored and called separately:

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.layer1 = nn.Linear(10, 32)
        self.layer2 = nn.Sigmoid()
        self.layer3 = nn.Linear(32, 1)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        return out
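To check a model like this, it helps to pass a dummy batch through it and look at the output shape and the parameter count. A minimal sketch using the nn.Sequential variant; the batch size of 4 is arbitrary:

```python
import torch
import torch.nn as nn


class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(10, 32),
            nn.Sigmoid(),
            nn.Linear(32, 1),
        )

    def forward(self, x):
        return self.net(x)


model = MyModel()
x = torch.randn(4, 10)          # dummy batch: 4 samples, 10 features
out = model(x)                  # calling the model invokes forward()

# (10*32 + 32) parameters in the first layer, (32*1 + 1) in the second
n_params = sum(p.numel() for p in model.parameters())
print(out.shape, n_params)
```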

4.5 Loss Functions

Mean Squared Error (for regression tasks)
criterion = nn.MSELoss()
Cross Entropy (for classification tasks)
criterion = nn.CrossEntropyLoss()

Either criterion is then called on the model output and the expected value:
loss = criterion(model_output, expected_value)
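The two criteria expect differently shaped inputs. The sketch below uses hand-picked values so the results can be checked by hand. Note that `nn.CrossEntropyLoss` takes raw scores (logits) and integer class indices, not probabilities or one-hot vectors:

```python
import math

import torch
import torch.nn as nn

# Regression: prediction and target have the same shape
mse = nn.MSELoss()
pred = torch.tensor([1.0, 2.0, 3.0])
target = torch.tensor([1.0, 2.0, 5.0])
mse_loss = mse(pred, target)          # mean of squared errors (0, 0, 4) = 4/3

# Classification: logits of shape (batch, n_classes), targets are class indices
ce = nn.CrossEntropyLoss()
logits = torch.zeros(4, 3)            # all-zero logits give a uniform distribution
labels = torch.tensor([0, 2, 1, 0])
ce_loss = ce(logits, labels)          # -log(1/3) = log(3) for every sample

print(mse_loss.item(), ce_loss.item(), math.log(3))
```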

5. torch.optim: Optimization

Gradient-based optimization algorithms adjust network parameters to reduce error.
E.g. Stochastic Gradient Descent (SGD)

optimizer = torch.optim.SGD(model.parameters(), lr, momentum=0)

For every batch of data:

Call optimizer.zero_grad() to reset gradients of model parameters.
Call loss.backward() to backpropagate gradients of prediction loss.
Call optimizer.step() to adjust model parameters.
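The three calls above can be traced on a single hypothetical parameter `w` with a toy loss, where the update is easy to verify by hand:

```python
import torch

w = torch.tensor([1.0, -2.0], requires_grad=True)     # hypothetical parameter
optimizer = torch.optim.SGD([w], lr=0.1, momentum=0)

optimizer.zero_grad()      # reset gradients left over from a previous step
loss = w.pow(2).sum()      # toy loss; its gradient is 2 * w = [2, -4]
loss.backward()            # backpropagate: fills w.grad
optimizer.step()           # w <- w - lr * w.grad = [0.8, -1.6]

print(w)
```

Without `zero_grad()`, gradients from successive `backward()` calls accumulate in `w.grad`, so each step would use the sum of all gradients computed so far.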

6. Entire Procedure

6.1 Neural Network `Training Setup`

dataset = MyDataset(file)  # read data via MyDataset
tr_set = DataLoader(dataset, 16, shuffle=True)  # put dataset into DataLoader
model = MyModel().to(device)  # construct model and move to device(cpu/cuda)
criterion = nn.MSELoss()  # set loss function
optimizer = torch.optim.SGD(model.parameters(), 0.1)  # set optimizer

6.2 Neural Network `Training` Loop

for epoch in range(n_epochs):  # iterate n_epochs
    model.train()  # set model to train mode
    for x, y in tr_set:  # iterate through the dataloader
        optimizer.zero_grad()  # set gradient to zero
        x, y = x.to(device), y.to(device)  # move data to device(cpu/cuda)
        pred = model(x)  # forward pass (compute output)
        loss = criterion(pred, y)  # compute loss
        loss.backward()  # compute gradient (backpropagation)
        optimizer.step()  # update model with optimizer

6.3 Neural Network `Validation` Loop

model.eval()  # set model to evaluation mode
total_loss = 0
for x, y in dv_set:  # iterate through the dataloader
    x, y = x.to(device), y.to(device)  # move data to device (cpu/cuda)
    with torch.no_grad():  # disable gradient calculation
        pred = model(x)  # forward pass (compute output)
        loss = criterion(pred, y)  # compute loss
    total_loss += loss.cpu().item() * len(x)  # accumulate loss
avg_loss = total_loss / len(dv_set.dataset)  # compute averaged loss (once, after the loop)

6.4 Neural Network `Testing` Loop

model.eval()  # set model to evaluation mode
preds = []
for x in tt_set:  # iterate through the dataloader
    x = x.to(device)
    with torch.no_grad():  # disable gradient calculation
        pred = model(x)  # forward pass (compute output)
        preds.append(pred.cpu())  # collect prediction
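The setup and the three loops can be put together into one runnable sketch. Everything data-related here is an assumption for illustration: the synthetic regression task, the 192/64 split, the `evaluate` helper, and the learning rate of 0.01 (smaller than the 0.1 used above, chosen to keep this toy problem stable).

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Synthetic regression data (assumption): the target is the sum of 10 features
x_all = torch.randn(256, 10)
y_all = x_all.sum(dim=1, keepdim=True)
tr_set = DataLoader(TensorDataset(x_all[:192], y_all[:192]), 16, shuffle=True)
dv_set = DataLoader(TensorDataset(x_all[192:], y_all[192:]), 16, shuffle=False)
tt_set = DataLoader(x_all[192:], 16, shuffle=False)   # test loader yields inputs only

model = nn.Sequential(nn.Linear(10, 32), nn.Sigmoid(), nn.Linear(32, 1)).to(device)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), 0.01)


def evaluate(loader):
    """Average per-sample loss over a dataloader, without tracking gradients."""
    model.eval()
    total_loss = 0.0
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            total_loss += criterion(model(x), y).item() * len(x)
    return total_loss / len(loader.dataset)


loss_before = evaluate(dv_set)

# Training loop
for epoch in range(30):
    model.train()
    for x, y in tr_set:
        optimizer.zero_grad()
        x, y = x.to(device), y.to(device)
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()

loss_after = evaluate(dv_set)
print(loss_before, loss_after)

# Testing loop: collect per-batch predictions, then join them into one tensor
model.eval()
preds = []
for x in tt_set:
    x = x.to(device)
    with torch.no_grad():
        preds.append(model(x).cpu())
preds = torch.cat(preds, dim=0)
print(preds.shape)
```

Concatenating with `torch.cat` at the end turns the list of per-batch predictions into a single tensor with one row per test sample.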

7. Save/load models

Save
torch.save(model.state_dict(), path)

Load

ckpt = torch.load(path)
model.load_state_dict(ckpt)
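A round trip through save and load, as a minimal sketch. The temporary checkpoint path and the single-layer model are placeholders, and `map_location='cpu'` is an optional addition that lets a checkpoint saved on a GPU be loaded on a machine without one:

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(10, 1)
path = os.path.join(tempfile.mkdtemp(), 'model.ckpt')   # hypothetical checkpoint path

# Save only the parameters (the state dict), not the whole model object
torch.save(model.state_dict(), path)

# Load: build the same architecture first, then copy the saved parameters in
restored = nn.Linear(10, 1)
ckpt = torch.load(path, map_location='cpu')
restored.load_state_dict(ckpt)
restored.eval()                                         # switch to evaluation mode

same = all(
    torch.equal(a, b)
    for a, b in zip(model.state_dict().values(), restored.state_dict().values())
)
print(same)
```

Because only the state dict is stored, the class that defines the architecture must be available when loading.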

8. More About PyTorch

Useful GitHub repositories using PyTorch
- Huggingface Transformers (transformer models: BERT, GPT, …)
- Fairseq (sequence modeling for NLP & speech)
- ESPnet (speech recognition, translation, synthesis, …)

PyTorch Tutorial

https://www.hardyhu.cn/2023/05/17/PyTorch-Tutorial/

Author

John Doe

Posted on

May 17, 2023
