SVD的数学原理及代码实现

0 需要线性代数的相关知识

考研的数学应该只学到了正交对角化,不过我下面的推导是基于考研的线性代数内容的,只要有基本的线代知识都可以看懂。

1 SVD的推导

SVD的推导过程及计算

2 SVD的代码实现及其应用

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import requests
from io import BytesIO
import os

# Download
def download_load_image():

if not os.path.exists("lena.jpg"):
url = "https://raw.githubusercontent.com/opencv/opencv/master/samples/data/lena.jpg"
response = requests.get(url)
img = Image.open(BytesIO(response.content))
img.save("lena.jpg")

else:
img = Image.open("lena.jpg")
return np.array(img)

# 2. convert to grayscale
def rgb_to_grayscale(rgb_img):
return np.dot(rgb_img[...,:3], [0.299, 0.587, 0.114])

# 3. Singular Value Decomposition and Reconstruction
def svd_compress(image, k):
U, s, Vt = np.linalg.svd(image, full_matrices=False)
# s is a vector which is sorted by singular values in descending order, so we need convert s to a diagonal matrix
return U[:,:k] @ np.diag(s[:k]) @ Vt[:k,:]


# download
lena_rgb = download_load_image()
lena_gray = rgb_to_grayscale(lena_rgb) / 255.0 # 归一化

# resize to 512x512 if needed
if lena_gray.shape != (512, 512):
from PIL import Image
img_pil = Image.fromarray((lena_gray * 255).astype(np.uint8))
lena_gray = np.array(img_pil.resize((512, 512))) / 255.0


plt.figure(figsize=(20, 10))

# raw image

plt.subplot(2, 4, 1)
plt.imshow(lena_gray, cmap='gray')
plt.title("Original Image")
plt.axis('off')

k_values = [1, 5, 10, 20, 30, 50, 100]

for i, k in enumerate(k_values):
compressed_img = svd_compress(lena_gray, k)

position = i + 2
plt.subplot(2, 4, position)
plt.imshow(compressed_img, cmap='gray')
plt.title(f"top {k} singular values", fontsize=12)
plt.axis('off')

plt.tight_layout()
plt.show()

1
2
3
4
5
6
7
8
9
lena_gray = rgb_to_grayscale(lena_rgb) / 255.0  # 归一化
print(f"the size of lena image is {lena_gray.shape}")
U, s, Vt = np.linalg.svd(lena_gray, full_matrices=False)

tol = 1e-12
print(f"rank of image {np.linalg.matrix_rank(lena_gray, tol=tol)}")
print(f"the number of nonzero-singular value is {np.count_nonzero(s > tol)}")
print(f"the rank of A.T@A is {np.linalg.matrix_rank(lena_gray.T @ lena_gray, tol=tol)}")
print("in theory, the number of non-zero singular values equals to the rank of the original matrix and the rank of A.T@A")

3 运行结果

1
2
3
4
5
the size of lena image is (512, 512)
rank of image 512
the number of nonzero-singular value is 512
the rank of A.T@A is 512
in theory, the number of non-zero singular values equals to the rank of the original matrix and the rank of A.T@A

我们可以看到一共有512个非零的奇异值(singular value),我们取了前30个奇异值之后,得到的图像就已经跟原图很逼近了,当取到前100个奇异值时,我们已经分不出哪一张是原图了,哈哈哈哈。