A Classification of PyTorch Matrix Operations

百色野人 · Posted on 2025-10-15 10:56:00

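PyTorch Matrix Operations

Note:

Unless stated otherwise, both torch.f() and tensor.f() return a new Tensor and leave the original tensor unmodified. The exceptions are in-place methods, whose names end in an underscore (for example tensor.add_()), and view operations such as view and transpose, whose results share memory with the original.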
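A minimal sketch of the difference between out-of-place and in-place calls. The tensor values are chosen purely for illustration:

```python
import torch

t = torch.tensor([1.0, 2.0, 3.0])

# Out-of-place: both spellings return a new tensor and leave t untouched.
a = torch.add(t, 1)
b = t.add(1)
unchanged = torch.equal(t, torch.tensor([1.0, 2.0, 3.0]))

# In-place: the trailing underscore marks a method that modifies t itself.
t.add_(1)
changed = torch.equal(t, torch.tensor([2.0, 3.0, 4.0]))

print(unchanged, changed)
```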

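Single-tensor operations

Initialization

empty

Creates an uninitialized tensor. Its values are whatever was already in the allocated memory, so they are arbitrary rather than drawn from any distribution.

This differs from torch.randn, which samples its values from the standard normal distribution.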

torch.empty(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False, pin_memory=False, memory_format=torch.contiguous_format) → Tensor

torch.empty((2, 3))  # values are arbitrary and differ from run to run
tensor([[ 9.4064e+13,  2.8000e+01,  9.3493e+13],
        [ 7.5751e+18,  7.1428e+18,  7.5955e+18]])

zeros

torch.zeros(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False)

torch.zeros(2, 3)
tensor([[ 0.,  0.,  0.],
        [ 0.,  0.,  0.]])

randn

Each element is drawn independently from the standard normal distribution: \(out_i \sim \mathcal{N}(0, 1)\)

torch.randn(*size, *, generator=None, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False, pin_memory=False)

torch.randn(2, 3)
tensor([[ 1.5954,  2.8929, -1.0923],
        [ 1.1719, -0.4709, -0.1996]])

randint

Generates a tensor of shape size filled with random integers drawn uniformly from the half-open range [low, high)

torch.randint(low=0, high, size, *, generator=None, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor

torch.randint(3, 10, (2, 2))
tensor([[4, 5],
        [6, 7]])
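A quick way to see that high is exclusive is to draw many samples and inspect the extremes. This is only a sketch; the seed and the sample count are arbitrary choices:

```python
import torch

gen = torch.Generator().manual_seed(0)  # arbitrary seed, for reproducibility
samples = torch.randint(3, 10, (1000,), generator=gen)

smallest = samples.min().item()
largest = samples.max().item()

# Every sample lies in 3..9; the value 10 can never appear.
print(smallest, largest, samples.dtype)
```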

arange

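Works like Python's list(range()): the interval is half-open, so end itself is excluded. Unlike range, the arguments may also be floating-point numbers.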

torch.arange(start=0, end, step=1, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor

torch.arange(5)
tensor([ 0,  1,  2,  3,  4])
torch.arange(1, 4)
tensor([ 1,  2,  3])
torch.arange(1, 2.5, 0.5)
tensor([ 1.0000,  1.5000,  2.0000])

range(deprecated)

Used like list(range()), except that the last element returned by torch.range can be end itself, i.e. the interval is closed: [start, end]

torch.range(start=0, end, step=1, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False)

# 0.5 is the size of each step
torch.range(1, 4, 0.5)
tensor([ 1.0000,  1.5000,  2.0000,  2.5000,  3.0000,  3.5000,  4.0000])
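Since torch.range is deprecated, the same inclusive sequence can be built with torch.arange by pushing end slightly past the last wanted value. A minimal sketch, assuming a step that is exactly representable in floating point; with other step sizes, rounding may add or drop the final element:

```python
import torch

start, end, step = 1.0, 4.0, 0.5

# arange excludes its end, so extend it by half a step to keep 4.0.
inclusive = torch.arange(start, end + step / 2, step)
print(inclusive)
```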

linspace

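Unlike the step argument of torch.range, steps here is the number of points to generate; the spacing between points is derived from it as (end - start) / (steps - 1).

The first element of torch.linspace is always start and, whenever steps is greater than 1, the last element is end.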

torch.linspace(start, end, steps, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False)

torch.linspace(start=-10, end=10, steps=5)
tensor([-10.,  -5.,   0.,   5.,  10.])
torch.linspace(start=-10, end=10, steps=1)
tensor([-10.])
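The spacing used by linspace can be checked by rebuilding the same points by hand. A small sketch in which the variable names are illustrative only:

```python
import torch

start, end, steps = -10.0, 10.0, 5
points = torch.linspace(start, end, steps)

# steps points leave steps - 1 gaps between start and end.
spacing = (end - start) / (steps - 1)
manual = start + spacing * torch.arange(steps)

print(spacing)
print(torch.allclose(points, manual))
```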

eye

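Returns a 2-D tensor with ones on the main diagonal and zeros elsewhere; when m is omitted it defaults to n, which gives the identity matrix.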

torch.eye(n, m=None, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor

torch.eye(3)
tensor([[ 1.,  0.,  0.],
        [ 0.,  1.,  0.],
        [ 0.,  0.,  1.]])

full

Fills a tensor of the given shape with a single value; it generalizes ones and zeros:

torch.full((2,3), 0.0) is equivalent to torch.zeros((2,3))

torch.full((2,3), 1.0) is equivalent to torch.ones((2,3))

torch.full(size, fill_value, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor

torch.full((2, 3), 3.141592)
tensor([[ 3.1416,  3.1416,  3.1416],
        [ 3.1416,  3.1416,  3.1416]])
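The two equivalences stated for full can be verified directly with torch.equal, which compares both shape and values. A minimal check:

```python
import torch

zeros_match = torch.equal(torch.full((2, 3), 0.0), torch.zeros(2, 3))
ones_match = torch.equal(torch.full((2, 3), 1.0), torch.ones(2, 3))

print(zeros_match, ones_match)
```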

zeros_like

Returns a tensor filled with zeros that has the same shape as the input tensor (and, by default, the same dtype and device)

torch.zeros_like(input, *, dtype=None, layout=None, device=None, requires_grad=False, memory_format=torch.preserve_format) → Tensor

input = torch.empty(2, 3)
torch.zeros_like(input)
tensor([[ 0.,  0.,  0.],
        [ 0.,  0.,  0.]])

Changing shape

permute

Reorders the dimensions of a tensor

torch.permute(input, dims) -> Tensor

x = torch.randn(2, 3, 5)
x.size()
torch.Size([2, 3, 5])
torch.permute(x, (2, 0, 1)).size()
torch.Size([5, 2, 3])
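To make the reordering concrete, the sketch below checks how an index in the permuted tensor maps back to the original, and that permute returns a view rather than a copy. The 2×3×4 tensor is an arbitrary example:

```python
import torch

x = torch.arange(24).reshape(2, 3, 4)
p = x.permute(2, 0, 1)  # new dim 0 is old dim 2, and so on

# p[k, i, j] addresses the same element as x[i, j, k].
same_element = p[3, 1, 2].item() == x[1, 2, 3].item()

# No data is copied: both tensors start at the same memory address.
shares_memory = p.data_ptr() == x.data_ptr()

print(p.shape, same_element, shares_memory, p.is_contiguous())
```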

reshape

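Changes the shape of a tensor while keeping the number of elements and their values unchanged. A dimension given as -1 is inferred from the others.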

torch.reshape(input, shape) → Tensor

a = torch.arange(4.)
torch.reshape(a, (2, 2))
tensor([[ 0.,  1.],
        [ 2.,  3.]])
b = torch.tensor([[0, 1], [2, 3]])
torch.reshape(b, (-1,))
tensor([ 0,  1,  2,  3])

transpose

Swaps two given dimensions

For a 3-D tensor x, torch.permute(x, (0,2,1)) is equivalent to torch.transpose(x, 1, 2)

torch.transpose(input, dim0, dim1) -> Tensor

x = torch.randn(2, 3)
x
tensor([[ 1.0028, -0.9893,  0.5809],
        [-0.1669,  0.7299,  0.4942]])
torch.transpose(x, 0, 1)
tensor([[ 1.0028, -0.1669],
        [-0.9893,  0.7299],
        [ 0.5809,  0.4942]])
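The equivalence between transpose and a permute that swaps only two dimensions can be confirmed with torch.equal. A brief sketch on random data:

```python
import torch

x = torch.randn(2, 3, 5)
via_permute = torch.permute(x, (0, 2, 1))
via_transpose = torch.transpose(x, 1, 2)
same_3d = torch.equal(via_permute, via_transpose)

# For a 2-D tensor, .T is shorthand for transpose(0, 1).
m = torch.randn(2, 3)
same_2d = torch.equal(m.T, m.transpose(0, 1))

print(via_transpose.shape, same_3d, same_2d)
```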

view

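Tensor.view returns a tensor that is a view of the original tensor: it has a different shape but shares the same underlying data, so nothing is copied. When the original is a leaf tensor with requires_grad=True, the view is a non-leaf tensor and its .grad is not populated; gradients accumulate in the original tensor instead.

view is a method, Tensor.view, not a function torch.view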

Tensor.view(*shape) → Tensor

x = torch.randn(4, 4)
x.size()
torch.Size([4, 4])
y = x.view(16)
y.size()
torch.Size([16])
z = x.view(-1, 8)  # the size -1 is inferred from other dimensions
z.size()
torch.Size([2, 8])

b_view is just b seen under a different shape; gradients produced by computations on b_view are still stored in b

a = torch.randn(1,6)
b = torch.randn(3,2,requires_grad=True)
b_view = b.view(6,1)
loss = a@b_view
loss.backward()

b_view.grad  # None: b_view is a non-leaf tensor, so its .grad is not populated
b.grad
tensor([[-0.3020, -1.4392],
        [ 0.7194,  0.1363],
        [-1.3413, -0.2453]])
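With deterministic inputs the gradient can be predicted by hand: since loss is the dot product of a with the flattened b, the gradient with respect to b is a reshaped to b's shape. A sketch under that setup, with values chosen only for illustration:

```python
import torch

a = torch.arange(6.0).reshape(1, 6)
b = torch.ones(3, 2, requires_grad=True)
b_view = b.view(6, 1)

loss = a @ b_view  # shape (1, 1): a single element, so backward() needs no argument
loss.backward()

# The gradient lands on the leaf tensor b, laid out in b's own shape.
grad_matches = torch.equal(b.grad, a.reshape(3, 2))
leaf_flags = (b.is_leaf, b_view.is_leaf)

print(grad_matches, leaf_flags)
```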

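In addition, view works only when the requested shape is compatible with the tensor's memory layout, which is always the case for a contiguous tensor. Otherwise use reshape, which returns a view when possible and copies the data when it is not.

Transposing a tensor, for example, makes it non-contiguous:

>>> tensor = torch.randn(2,3)
>>> # Transpose the tensor, which makes it non-contiguous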
>>> tensor_transposed = tensor.transpose(0, 1)
>>> print("Transposed tensor:")
Transposed tensor:
>>> print(tensor_transposed)
tensor([[ 2.2194, -0.6988],
        [ 0.5496,  0.2167],
        [-0.2635, -2.5029]])
>>> print("Is the transposed tensor contiguous?", tensor_transposed.is_contiguous())
Is the transposed tensor contiguous? False
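The consequence for view and reshape can be sketched as follows: view raises an error on the transposed tensor, while reshape succeeds by copying, and calling contiguous() first makes view usable again. The small integer tensor is an arbitrary example:

```python
import torch

t = torch.arange(6).reshape(2, 3)
tt = t.transpose(0, 1)  # shape (3, 2), non-contiguous

try:
    tt.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True

flat_reshape = tt.reshape(6)         # works; has to copy in this case
flat_view = tt.contiguous().view(6)  # explicit copy first, then view

# The reshape result lives in new memory, not in t's storage.
copied = flat_reshape.data_ptr() != t.data_ptr()

print(view_failed, flat_reshape.tolist(), copied)
```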

squeeze

Removes dimensions of size 1

When dim is given, a squeeze operation is done only in the given dimension(s). If input is of shape (A×1×B), squeeze(input, 0) leaves the tensor unchanged, but squeeze(input, 1) will squeeze the tensor to the shape (A×B).

torch.squeeze(input: Tensor, dim: Optional[Union[int, List[int]]]) → Tensor

x = torch.zeros(2, 1, 2, 1, 2)
x.size()
torch.Size([2, 1, 2, 1, 2])
y = torch.squeeze(x)
y.size()
torch.Size([2, 2, 2])
y = torch.squeeze(x, 0)
y.size()
torch.Size([2, 1, 2, 1, 2])
y = torch.squeeze(x, 1)
y.size()
torch.Size([2, 2, 1, 2])
y = torch.squeeze(x, (1, 2, 3))
y.size()
torch.Size([2, 2, 2])

unsqueeze

Inserts a dimension of size 1 at the given position

x = torch.randn(4)
torch.unsqueeze(x, 0).size()
torch.Size([1, 4])
torch.unsqueeze(x, 1).size()
torch.Size([4, 1])
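unsqueeze is the inverse of squeeze on the same dimension, and it is commonly used to set up broadcasting. A short sketch; indexing with None is shown as an equivalent spelling:

```python
import torch

x = torch.randn(4)
row = x.unsqueeze(0)  # shape (1, 4)
col = x.unsqueeze(1)  # shape (4, 1)

same_as_indexing = torch.equal(col, x[:, None])
round_trip = torch.equal(col.squeeze(1), x)

# Broadcasting (4, 1) against (1, 4) yields the (4, 4) outer product.
outer = col * row

print(same_as_indexing, round_trip, outer.shape)
```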

size

t.size() is equivalent to t.shape. tuple(t.size()) returns the dimensions as a plain Python tuple
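A few equivalent ways of querying the shape, shown on an arbitrary 2×3×5 tensor:

```python
import torch

t = torch.zeros(2, 3, 5)

same = t.size() == t.shape    # both are torch.Size objects
as_tuple = tuple(t.size())    # plain Python tuple
second_dim = t.size(1)        # size of a single dimension, as an int

print(same, as_tuple, second_dim, t.dim(), t.numel())
```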

Indexing

To be updated...

Operations between multiple tensors

matmul

torch.matmul(input, other, *, out=None) → Tensor

# vector x vector
tensor1 = torch.randn(3)
tensor2 = torch.randn(3)
torch.matmul(tensor1, tensor2).size()
torch.Size([])
# matrix x vector
tensor1 = torch.randn(3, 4)
tensor2 = torch.randn(4)
torch.matmul(tensor1, tensor2).size()
torch.Size([3])
# batched matrix x broadcasted vector
tensor1 = torch.randn(10, 3, 4)
tensor2 = torch.randn(4)
torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3])
# batched matrix x batched matrix
tensor1 = torch.randn(10, 3, 4)
tensor2 = torch.randn(10, 4, 5)
torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3, 5])
# batched matrix x broadcasted matrix
tensor1 = torch.randn(10, 3, 4)
tensor2 = torch.randn(4, 5)
torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3, 5])

torch.mm supports only the product of two 2-D tensors (matrices); it does not broadcast and does not accept batched inputs
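The difference between matmul and mm shows up as soon as a batch dimension is involved. A sketch on random inputs, comparing matmul with the @ operator and with torch.bmm (the dedicated batched product) and confirming that mm rejects 3-D tensors:

```python
import torch

a = torch.randn(10, 3, 4)
b = torch.randn(10, 4, 5)

batched = torch.matmul(a, b)
same_as_operator = torch.allclose(batched, a @ b)
same_as_bmm = torch.allclose(batched, torch.bmm(a, b))

try:
    torch.mm(a, b)  # mm accepts 2-D tensors only
    mm_failed = False
except RuntimeError:
    mm_failed = True

print(batched.shape, same_as_operator, same_as_bmm, mm_failed)
```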

stack

Stacks tensors of identical shape along a new dimension

torch.stack(tensors, dim=0, *, out=None) → Tensor

x = torch.randn(2,3)
c = torch.stack((x,x), dim=0)
# c.size() == torch.Size([2, 2, 3])

cat

Concatenates tensors along an existing dimension

torch.cat(tensors, dim=0, *, out=None) → Tensor

x = torch.randn(2,3)
c = torch.cat((x,x), dim=0)
# c.size() == torch.Size([4, 3])
c = torch.cat((x,x), dim=1)
# c.size() == torch.Size([2, 6])
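One way to relate the two functions: stack behaves like cat applied to inputs that have first been given a new dimension of size 1. A minimal sketch of that relationship:

```python
import torch

x = torch.randn(2, 3)

stacked = torch.stack((x, x), dim=0)                           # new dim: (2, 2, 3)
via_cat = torch.cat((x.unsqueeze(0), x.unsqueeze(0)), dim=0)   # same result
concatenated = torch.cat((x, x), dim=0)                        # existing dim: (4, 3)

same = torch.equal(stacked, via_cat)
print(stacked.shape, concatenated.shape, same)
```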

split

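Splits a tensor along the given dimension and returns a tuple of tensors. An integer gives the size of each chunk (the last chunk may be smaller); a list gives the size of every chunk explicitly.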

torch.split(tensor, split_size_or_sections, dim=0)

a = torch.arange(10).reshape(5, 2)
a
tensor([[0, 1],
        [2, 3],
        [4, 5],
        [6, 7],
        [8, 9]])
torch.split(a, 2)
(tensor([[0, 1],
         [2, 3]]),
 tensor([[4, 5],
         [6, 7]]),
 tensor([[8, 9]]))
torch.split(a, [1, 4])
(tensor([[0, 1]]),
 tensor([[2, 3],
         [4, 5],
         [6, 7],
         [8, 9]]))
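split and cat are inverses along the same dimension, which gives an easy sanity check. The sketch below also splits along dim=1 to show the dim argument in use:

```python
import torch

a = torch.arange(10).reshape(5, 2)

parts = torch.split(a, 2)              # chunks of 2 rows; the last one is smaller
sizes = [p.shape[0] for p in parts]
restored = torch.cat(parts, dim=0)     # concatenating the pieces undoes the split

columns = torch.split(a, 1, dim=1)     # two tensors of shape (5, 1)

print(sizes, torch.equal(restored, a), len(columns))
```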

Reference: the official PyTorch API documentation

Source: https://www.cnblogs.com/qlhh/p/19142823
