Study Notes_Computer Vision 1_Assignment 1_(2) Projective Transformation

(2) Projective Transformation

Assignment requirements:
[Figure: assignment description]
To put it briefly: given the 2D points and some additional information, reconstruct the 3D object, then project it back to a 2D image.

Step 1:

Conversion between Cartesian and homogeneous coordinates. The code is as follows:

import numpy as np

def cart2hom(points):
    """ Transforms from cartesian to homogeneous coordinates.

    Args:
        points: a np array of points in cartesian coordinates

    Returns:
        points_hom: a np array of points in homogeneous coordinates
    """
    one = np.ones((1, points.shape[1]))
    points_hom = np.r_[points, one]  # append a row of ones as the last row
    return points_hom

def hom2cart(points):
    """ Transforms from homogeneous to cartesian coordinates.

    Args:
        points: a np array of points in homogeneous coordinates

    Returns:
        points_cart: a np array of points in cartesian coordinates
    """
    # divide by the last row, then delete it
    points_cart = np.delete(points / points[-1], points.shape[0] - 1, axis=0)
    return points_cart

A few extra notes on homogeneous coordinates…\
(I originally wanted to write my own summary, but I came across an article that explains it really well, so I just link it here.)\
(A quick refresher version: ordinary Cartesian coordinates cannot represent points at infinity, so we add one dimension to the n-dimensional Cartesian coordinates to get n+1 dimensions. Taking 2D as an example, the original (X, Y) becomes (x, y, w), where X = x/w and Y = y/w; when w = 0 this represents a point at infinity in Cartesian space. With homogeneous coordinates, rotation, translation, scaling, and other coordinate transformations can each be expressed as a single matrix.)
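
As a quick sanity check, here is a minimal sketch using the cart2hom / hom2cart helpers above (the sample points are made up):

import numpy as np

# two 2D points stored column-wise, i.e. shape (2, N)
pts = np.array([[1.0, 4.0],
                [2.0, 6.0]])

pts_hom = cart2hom(pts)       # adds a row of ones -> shape (3, N)
print(pts_hom)                # [[1. 4.] [2. 6.] [1. 1.]]

# scaling a homogeneous coordinate does not change the Cartesian point it represents
print(hom2cart(3 * pts_hom))  # still [[1. 4.] [2. 6.]]

# a coordinate with w = 0, e.g. (1, 2, 0), stands for a point at infinity
# and has no Cartesian counterpart (the division by w would be a division by 0)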

Step 2:

Translation of 3D coordinates. In homogeneous coordinates a translation can be written as

(x', y', z', 1)^T = T · (x, y, z, 1)^T,

where T is the 4×4 identity matrix with the translation components v_x, v_y, v_z placed in the last column; the left-hand side is the translated homogeneous coordinate.\
The code is as follows:

def gettranslation(v):
    """ Returns translation matrix T in homogeneous coordinates for translation by v.

    Args:
        v: 3d translation vector

    Returns:
        T: translation matrix in homogeneous coordinates
    """
    # [x,y,z] --> [x,y,z,1]
    T = np.array([[1, 0, 0, v[0]],
                  [0, 1, 0, v[1]],
                  [0, 0, 1, v[2]],
                  [0, 0, 0, 1]])
    return T
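
A quick usage check (the point and vector below are made-up examples):

v = np.array([1.0, 2.0, 3.0])
T = gettranslation(v)

p = np.array([[0.0], [0.0], [0.0]])         # a single 3D point as a column vector
p_translated = hom2cart(T.dot(cart2hom(p)))
print(p_translated.ravel())                  # [1. 2. 3.]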

Step 3:

Obtain the rotation matrices about the x, y, and z axes. The rotation matrix for each axis is given in the comments of the code below:

def getxrotation(d):
    """ Returns rotation matrix Rx in homogeneous coordinates for a rotation of d degrees around the x axis.

    Args:
        d: degrees of the rotation

    Returns:
        Rx: rotation matrix
    """
    # R_x: [1    0        0    ]
    #      [0  cos(d)  -sin(d) ]
    #      [0  sin(d)   cos(d) ]
    n_x = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]])
    R_x = np.eye(3) + np.sin(d*np.pi/180)*n_x + (1 - np.cos(d*np.pi/180))*n_x.dot(n_x)
    Rx = np.c_[R_x, [0, 0, 0]]
    # matrix in homogeneous coordinates
    Rx = np.r_[Rx, [[0, 0, 0, 1]]]
    return Rx

def getyrotation(d):
    """ Returns rotation matrix Ry in homogeneous coordinates for a rotation of d degrees around the y axis.

    Args:
        d: degrees of the rotation

    Returns:
        Ry: rotation matrix
    """
    # R_y: [ cos(d)  0  sin(d)]
    #      [   0     1    0   ]
    #      [-sin(d)  0  cos(d)]
    n_y = np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]])
    R_y = np.eye(3) + np.sin(d*np.pi/180)*n_y + (1 - np.cos(d*np.pi/180))*n_y.dot(n_y)
    Ry = np.c_[R_y, [0, 0, 0]]
    # matrix in homogeneous coordinates
    Ry = np.r_[Ry, [[0, 0, 0, 1]]]
    return Ry

def getzrotation(d):
    """ Returns rotation matrix Rz in homogeneous coordinates for a rotation of d degrees around the z axis.

    Args:
        d: degrees of the rotation

    Returns:
        Rz: rotation matrix
    """
    # R_z: [cos(d)  -sin(d)  0]
    #      [sin(d)   cos(d)  0]
    #      [  0        0     1]
    n_z = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]])
    R_z = np.eye(3) + np.sin(d*np.pi/180)*n_z + (1 - np.cos(d*np.pi/180))*n_z.dot(n_z)
    Rz = np.c_[R_z, [0, 0, 0]]
    # matrix in homogeneous coordinates
    Rz = np.r_[Rz, [[0, 0, 0, 1]]]
    return Rz

A quick note on rotations about other axes:\
For a rotation about an arbitrary axis you can use the axis-angle formula (Rodrigues' rotation formula): given the rotation axis and the angle, the rotation can be computed directly. Again I'll just leave a link here, a video on Bilibili.
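
Below is a minimal sketch of the same Rodrigues construction for an arbitrary axis (getrotation is my own helper name, not part of the assignment; the x/y/z functions above are the special cases axis = (1,0,0), (0,1,0), (0,0,1)):

import numpy as np

def getrotation(axis, d):
    """Rotation by d degrees around an arbitrary axis, as a 4x4 homogeneous matrix."""
    a = np.asarray(axis, dtype=float)
    a = a / np.linalg.norm(a)               # normalize the rotation axis
    # skew-symmetric cross-product matrix of the axis
    n = np.array([[0, -a[2], a[1]],
                  [a[2], 0, -a[0]],
                  [-a[1], a[0], 0]])
    rad = d * np.pi / 180
    R = np.eye(3) + np.sin(rad) * n + (1 - np.cos(rad)) * n.dot(n)
    return np.r_[np.c_[R, [0, 0, 0]], [[0, 0, 0, 1]]]

# rotating around (1, 0, 0) reproduces getxrotation
print(np.allclose(getrotation([1, 0, 0], 30), getxrotation(30)))  # True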

Step 4:

Next, a central projection onto the xy-plane, with principal point [8, 10], focal length 8, and square pixels. This finally gives the 2D points in the image plane.\
The code is as follows:

def getcentralprojection(principal, focal):
    """ Returns the (3 x 4) matrix L that projects homogeneous camera coordinates on homogeneous
    image coordinates depending on the principal point and focal length.

    Args:
        principal: the principal point, 2d vector
        focal: focal length

    Returns:
        L: central projection matrix
    """
    # central: [f_x  0   u_0  0]
    #          [ 0  f_y  v_0  0]
    #          [ 0   0    1   0]
    square_L = np.array([[focal, 0, principal[0]],
                         [0, focal, principal[1]],
                         [0, 0, 1]])
    I = np.eye(3)
    I_0 = np.c_[I, [0, 0, 0]]  # [I|0]
    L = square_L.dot(I_0)
    return L

A brief explanation:\
f_x and f_y are the focal lengths (both 8 here); u_0 and v_0 are the coordinates of the principal point. The principal point is usually taken to be the image center, e.g. if the image width and height are W and H, the principal point is (W/2, H/2).
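
A small sketch of what L does to a single point in camera coordinates (the point is a made-up example; the principal point and focal length are the values given above):

K = getcentralprojection(np.array([8, 10]), 8)
X_cam = np.array([[2.0], [1.0], [4.0]])   # one 3D point in camera coordinates

x_hom = K.dot(cart2hom(X_cam))  # homogeneous image coordinates [f*X + u0*Z, f*Y + v0*Z, Z]
x_img = hom2cart(x_hom)         # divide by Z: (f*X/Z + u0, f*Y/Z + v0)
print(x_img.ravel())            # [12. 12.]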

Step 5:

Put the translation, rotation, and central projection matrices from above together:

def getfullprojection(T, Rx, Ry, Rz, L):
    """ Returns full projection matrix P and full extrinsic transformation matrix M.

    Args:
        T: translation matrix
        Rx: rotation matrix for rotation around the x-axis
        Ry: rotation matrix for rotation around the y-axis
        Rz: rotation matrix for rotation around the z-axis
        L: central projection matrix

    Returns:
        P: projection matrix
        M: matrix that summarizes extrinsic transformations
    """
    # The rotations are applied in the order: y-axis first, then x-axis, finally z-axis,
    # so the combined rotation is R = R_z R_x R_y.
    # Extrinsic transformation in homogeneous coordinates: translate first, then rotate.
    M = Rz.dot(Rx.dot(Ry.dot(T)))  # [4,4]
    # P = L.M
    P = L.dot(M)  # [3,4]
    return P, M

Note the order in which the matrix M is composed!
Here is a figure to make this easier to follow:
[Figure: coordinate transformation pipeline from world to image coordinates]
x^I^ is the image coordinate, x^C^ the camera coordinate, and x^W^ the 3D world coordinate; K·[I | 0] is the projection matrix L in the code.
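
Since the warning above is easy to miss, here is a small sketch showing that the order matters (the angles and translation are the ones used later in problem3):

Rx, Ry, Rz = getxrotation(-30), getyrotation(135), getzrotation(90)
T = gettranslation(np.array([-27.1, -2.9, -3.2]))

M_correct = Rz.dot(Rx).dot(Ry).dot(T)  # y first, then x, then z (matches getfullprojection)
M_other = Rx.dot(Ry).dot(Rz).dot(T)    # a different order gives a different extrinsic matrix

print(np.allclose(M_correct, M_other))  # False: these rotations do not commute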

Step 6:

Use the transformation matrices obtained above to map the 3D world coordinates to 2D image coordinates. The code is as follows:

def projectpoints(P, X):
    """ Apply full projection matrix P to 3D points X in cartesian coordinates.

    Args:
        P: projection matrix
        X: 3d points in cartesian coordinates

    Returns:
        x: 2d points in cartesian coordinates
    """
    X = cart2hom(X)
    x = P.dot(X)     # [3,2904], homogeneous image coordinates
    x = hom2cart(x)  # [2,2904], [x,y]
    return x

Step 7:

Some routine loading operations.\
Load the 2D points:

def loadpoints():
    """ Load 2D points from obj2d.npy.

    Returns:
        x: np array of points loaded from obj2d.npy
    """
    x = np.load('data/obj2d.npy')  # [2,2904], [x,y]
    return x

Load the z-coordinates:
def loadz():
    """ Load z-coordinates from zs.npy.

    Returns:
        z: np array containing the z-coordinates
    """
    z = np.load('data/zs.npy')  # [1,2904], [z]
    return z

Step 8:

Using the projection matrix and the z-coordinates, recover the 2D points above as 3D points (and display them). The code is as follows:

def invertprojection(L, P2d, z):
    """
    Invert just the projection L of cartesian image coordinates P2d with z-coordinates z.

    Args:
        L: central projection matrix
        P2d: 2d image coordinates of the projected points
        z: z-components of the homogeneous image coordinates

    Returns:
        P3d: 3d cartesian camera coordinates of the points
    """
    # square_L = np.delete(L, 3, axis=1) #(3,3)
    # pseudo-inverse of L
    pinv_L = np.linalg.pinv(L)
    # [x,y](2,2904) --> [x,y,1](3,2904)
    hom_point_xy = cart2hom(P2d)
    # scale each column by its z-value: [x,y,1] --> [z*x, z*y, z]
    for i in range(P2d.shape[1]):
        hom_point_xy[:, i] = hom_point_xy[:, i] * z[:, i]
    hom_point_xy = pinv_L.dot(hom_point_xy)

    # [x,y,z]
    P3d = np.delete(hom_point_xy, 3, axis=0)
    return P3d
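
Why the multiplication by z: for a point with depth Z, the homogeneous image coordinate produced by L is [Z·u, Z·v, Z], so scaling [u, v, 1] by z before applying the pseudo-inverse recovers the camera coordinates. The per-column loop can also be written with broadcasting; a minimal sketch with the same behavior (assuming z has shape (1, N), as loaded by loadz):

def invertprojection_vectorized(L, P2d, z):
    # broadcasting multiplies every column of the (3, N) array by its z-value
    hom_scaled = cart2hom(P2d) * z
    P3d_hom = np.linalg.pinv(L).dot(hom_scaled)  # (4, N), last row is all zeros
    return P3d_hom[:3]                           # drop the last row -> (3, N)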

The result looks like this:
[Figure: reconstructed 3D points]
while the original 2D points looked like this:
[Figure: the original 2D points]

Step 9:

Invert the extrinsic transformation to get the 3D points in world coordinates, so that the code from Step 6 can then be used to obtain the corresponding 2D points (the steps are written in a slightly messy order, but I'll leave it as is...). The code is as follows:

def inverttransformation(M, P3d):
    """ Invert just the model transformation in homogeneous coordinates
    for the 3D points P3d in cartesian coordinates.

    Args:
        M: matrix summarizing the extrinsic transformations
        P3d: 3d points in cartesian coordinates

    Returns:
        X: 3d points after the extrinsic transformations have been reverted
    """
    hom_P3d = cart2hom(P3d)   # (4,2904), (x,y,z,1)
    inv_M = np.linalg.inv(M)  # inverse matrix of M
    P3d = inv_M.dot(hom_P3d)

    return P3d  # (4,2904)
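
Since M is a rigid transformation, its inverse can also be written in closed form without np.linalg.inv. A minimal sketch, assuming M has the block form [R t; 0 1] with an orthonormal R (which is the case for the M returned by getfullprojection):

def invertrigid(M):
    """Closed-form inverse of a 4x4 rigid transformation [R t; 0 1]."""
    R = M[:3, :3]
    t = M[:3, 3]
    inv_M = np.eye(4)
    inv_M[:3, :3] = R.T         # the inverse of a rotation is its transpose
    inv_M[:3, 3] = -R.T.dot(t)  # and the translation is rotated back and negated
    return inv_M

print(np.allclose(invertrigid(M), np.linalg.inv(M)))  # True for such an M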

Step 10:

Putting the functions above to use:

def problem3():
    """Example code implementing the steps in Problem 3"""
    t = np.array([-27.1, -2.9, -3.2])
    principal_point = np.array([8, -10])
    focal_length = 8

    # model transformations
    T = gettranslation(t)
    Ry = getyrotation(135)
    Rx = getxrotation(-30)
    Rz = getzrotation(90)
    print(T)
    print(Ry)
    print(Rx)
    print(Rz)

    K = getcentralprojection(principal_point, focal_length)

    P, M = getfullprojection(T, Rx, Ry, Rz, K)
    print(P)
    print(M)

    points = loadpoints()
    #displaypoints2d(points)

    z = loadz()
    Xt = invertprojection(K, points, z)

    Xh = inverttransformation(M, Xt)

    worldpoints = hom2cart(Xh)
    displaypoints3d(worldpoints)

    points2 = projectpoints(P, worldpoints)
    displaypoints2d(points2)

    plt.show()
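
problem3 also uses plt and the helpers displaypoints2d / displaypoints3d, which are not shown in this post. A minimal sketch of what they might look like with matplotlib, purely my own guess and not the assignment's reference implementation:

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401, registers the 3d projection

def displaypoints2d(points):
    """Scatter-plot 2D points given as a (2, N) array."""
    plt.figure()
    plt.plot(points[0, :], points[1, :], '.b', markersize=3)

def displaypoints3d(points):
    """Scatter-plot 3D points given as a (3, N) array."""
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    ax.scatter(points[0, :], points[1, :], points[2, :], s=3)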

Summary:

The biggest takeaway was finally understanding homogeneous coordinates, plus the assorted transformation matrices.

Still to be done:

  1. Camera matrices, including lens distortion: I touched on this in my bachelor's thesis, but I've forgotten most of it by now. Still deciding whether it's worth a separate write-up...
------ End of post ------