numpy库学习笔记

整理python numpy库学习笔记

参考自 https://docs.scipy.org/doc/numpy/user/quickstart.html
http://python.jobbole.com/87471/

The Basics

ndarray.ndim:数组的维数(即数组轴的个数),等于秩。最常见的为二维数组(矩阵)。

ndarray.shape:数组的维度。为一个表示数组在每个维度上大小的整数元组。例如二维数组中,表示数组的“行数”和“列数”。ndarray.shape返回一个元组,这个元组的长度就是维度的数目,即ndim属性。

ndarray.size:数组元素的总个数,等于shape属性中元组元素的乘积。

ndarray.dtype:表示数组中元素类型的对象,可使用标准的Python类型创建或指定dtype。另外也可使用前一篇文章中介绍的NumPy提供的数据类型。

ndarray.itemsize:数组中每个元素的字节大小。例如,一个元素类型为float64的数组itemsiz属性值为8(float64占用64个bits,每个字节长度为8,所以64/8,占用8个字节),又如,一个元素类型为complex32的数组item属性为4(32/8)。

ndarray.data:包含实际数组元素的缓冲区,由于一般通过数组的索引获取元素,所以通常不需要使用这个属性。

example

1
2
3
4
5
6
import numpy as np
a = np.arange(15).reshape(3, 5)
a.shape
a.ndim
a.itemsize
a.dtype.name

Array Creation

1
2
3
4
5
6
7
8
9
10
a = np.array([2,3,4])
b = np.array([1.2, 3.5, 5.1])
np.zeros( (3,4) )
np.ones( (2,3,4), dtype=np.int16 )
np.empty( (2,3) )
np.arange( 10, 30, 5 )
np.arange( 0, 2, 0.3 )
np.linspace( 0, 2, 9 )
x = np.linspace( 0, 2*pi, 100 )
np.random.rand(3,2)

Printing Arrays

1
2
b = np.arange(12).reshape(4,3)
c = np.arange(24).reshape(2,3,4)

Basic Operations

Arithmetic operators

1
2
3
4
5
6
a = np.array( [20,30,40,50] )
b = np.arange( 4 )
c = a-b
b**2
10*np.sin(a)
a<35

The matrix product

1
2
3
4
5
6
A = np.array( [[1,1],
... [0,1]] )
B = np.array( [[2,0],
... [3,4]] )
A*B # elementwise product 元素积
A.dot(B) # matrix product 点积

+= and *=

1
2
3
4
5
6
7
8
9
10
11
>>> a = np.ones((2,3), dtype=int)
>>> b = np.random.random((2,3))
>>> a *= 3
>>> a
array([[3, 3, 3],
[3, 3, 3]])
>>> b += a
>>> b
array([[ 3.417022 , 3.72032449, 3.00011437],
[ 3.30233257, 3.14675589, 3.09233859]])
>>> a += b # b is not automatically converted to integer type

When operating with arrays of different types, the type of the resulting array corresponds to the more general or precise one (a behavior known as upcasting).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
>>> a = np.ones(3, dtype=np.int32)
>>> b = np.linspace(0,pi,3)
>>> b.dtype.name
'float64'
>>> c = a+b
>>> c
array([ 1. , 2.57079633, 4.14159265])
>>> c.dtype.name
'float64'
>>> d = np.exp(c*1j)
>>> d
array([ 0.54030231+0.84147098j, -0.84147098+0.54030231j,
-0.54030231-0.84147098j])
>>> d.dtype.name
'complex128'

1
2
3
4
5
6
7
8
a = np.random.random((2,3))
a.sum()
a.min()
a.max()
b = np.arange(12).reshape(3,4)
b.sum(axis=0) # sum of each column
b.min(axis=1) # min of each row
b.cumsum(axis=1) # cumulative sum along each row

Universal Functions

1
2
3
4
5
B = np.arange(3)
np.exp(B)
np.sqrt(B)
C = np.array([2., -1., 4.])
np.add(B, C)

Indexing, Slicing and Iterating

One-dimensional arrays

1
2
3
4
a = np.arange(10)**3
a[2:5]
a[:6:2] = -1000 # from start to position 6, exclusive, set every 2nd element to -1000
a[ : :-1] # reversed a

Multidimensional

1
2
3
4
5
6
7
8
9
10
11
12
13
14
b =array([[ 0, 1, 2, 3],
[10, 11, 12, 13],
[20, 21, 22, 23],
[30, 31, 32, 33],
[40, 41, 42, 43]])
b[2,3]
b[row,column]
b[0:5, 1] # each row in the second column of b
b[ : ,1] # equivalent to the previous example
b[1:3, : ] # each column in the second and third row of b
>>> for row in b:
... print(row)
>>> for element in b.flat:
... print(element)

1
2
3
4
5
6
7
8
9
10
11
12
>>> c = np.array( [[[ 0, 1, 2], # a 3D array (two stacked 2D arrays)
... [ 10, 12, 13]],
... [[100,101,102],
... [110,112,113]]])
>>> c.shape
(2, 2, 3)
>>> c[1,...] # same as c[1,:,:] or c[1]
array([[100, 101, 102],
[110, 112, 113]])
>>> c[...,2] # same as c[:,:,2]
array([[ 2, 13],
[102, 113]])

Shape Manipulation

1
2
3
4
5
a = np.floor(10*np.random.random((3,4)))
a.ravel() # returns the array, flattened
a.reshape(6,2) # returns the array with a modified shape
a.T # returns the array, transposed
a.resize((2,6))

Stacking together different arrays

1
2
3
4
5
6
a = np.floor(10*np.random.random((2,2)))
b = np.floor(10*np.random.random((2,2)))
np.vstack((a,b))
np.hstack((a,b))
np.column_stack((a,b)) # with 2D arrays
a[:,newaxis] # this allows to have a 2D columns vector

Splitting one array into several smaller ones

1
2
3
a = np.floor(10*np.random.random((2,12)))
np.hsplit(a,3) # Split a into 3
np.hsplit(a,(3,4)) # Split a after the third and the fourth column

Copies and Views

View or Shallow Copy

1
2
3
4
5
6
7
8
9
10
11
c = a.view()
c is a
c.base is a # c is a view of the data owned by a
c.flags.owndata
c[0,4] = 1234
>>> s = a[ : , 1:3] # spaces added for clarity; could also be written "s = a[:,1:3]"
>>> s[:] = 10 # s[:] is a view of s. Note the difference between s=10 and s[:]=10
>>> a
array([[ 0, 10, 10, 3],
[1234, 10, 10, 7],
[ 8, 10, 10, 11]])

deep copy

1
2
3
4
5
6
7
8
9
10
>>> d = a.copy() # a new array object with new data is created
>>> d is a
False
>>> d.base is a # d doesn't share anything with a
False
>>> d[0,0] = 9999
>>> a
array([[ 0, 10, 10, 3],
[1234, 10, 10, 7],
[ 8, 10, 10, 11]])

Fancy indexing and index tricks

Indexing with Arrays of Indices

1
2
3
4
5
6
7
8
9
>>> a = np.arange(12)**2 # the first 12 square numbers
>>> i = np.array( [ 1,1,3,8,5 ] ) # an array of indices
>>> a[i] # the elements of a at the positions i
array([ 1, 1, 9, 64, 25])
>>>
>>> j = np.array( [ [ 3, 4], [ 9, 7 ] ] ) # a bidimensional array of indices
>>> a[j] # the same shape as j
array([[ 9, 16],
[81, 49]])
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
>>> palette = np.array( [ [0,0,0], # black
... [255,0,0], # red
... [0,255,0], # green
... [0,0,255], # blue
... [255,255,255] ] ) # white
>>> image = np.array( [ [ 0, 1, 2, 0 ], # each value corresponds to a color in the palette
... [ 0, 3, 4, 0 ] ] )
>>> palette[image] # the (2,4,3) color image
array([[[ 0, 0, 0],
[255, 0, 0],
[ 0, 255, 0],
[ 0, 0, 0]],
[[ 0, 0, 0],
[ 0, 0, 255],
[255, 255, 255],
[ 0, 0, 0]]])
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
>>> a = np.arange(12).reshape(3,4)
>>> a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> i = np.array( [ [0,1], # indices for the first dim of a
... [1,2] ] )
>>> j = np.array( [ [2,1], # indices for the second dim
... [3,3] ] )
>>>
>>> a[i,j] # i and j must have equal shape
array([[ 2, 5],
[ 7, 11]])
>>> s = np.array( [i,j] )
>>> a[s] # not what we want
Traceback (most recent call last):
>>> a[tuple(s)] # same as a[i,j]
array([[ 2, 5],
[ 7, 11]])
1
2
3
4
5
6
7
8
>>> a = np.arange(5)
>>> a[[0,0,2]]=[1,2,3]
>>> a
array([2, 1, 3, 3, 4])
>>> a = np.arange(5)
>>> a[[0,0,2]]+=1
>>> a
array([1, 1, 3, 3, 4])

Indexing with Boolean Arrays

1
2
3
4
5
6
7
8
>>> a = np.arange(12).reshape(3,4)
>>> b = a > 4
>>> b # b is a boolean with a's shape
array([[False, False, False, False],
[False, True, True, True],
[ True, True, True, True]])
>>> a[b] # 1d array with the selected elements
array([ 5, 6, 7, 8, 9, 10, 11])
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
>>> a = np.arange(12).reshape(3,4)
>>> b1 = np.array([False,True,True]) # first dim selection
>>> b2 = np.array([True,False,True,False]) # second dim selection
>>>
>>> a[b1,:] # selecting rows
array([[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>>
>>> a[b1] # same thing
array([[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>>
>>> a[:,b2] # selecting columns
array([[ 0, 2],
[ 4, 6],
[ 8, 10]])

The ix_() function

The ix_ function can be used to combine different vectors so as to obtain the result for each n-uplet.

1
2
3
4
>>> a = np.array([2,3,4,5])
>>> b = np.array([8,5,4])
>>> c = np.array([5,4,6,8,3])
>>> ax,bx,cx = np.ix_(a,b,c)

additional information

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
t = s.transpose((2, 0, 1)) #按指定轴进行转置
'''
默认转置将维度倒序,对于2维就是横纵轴互换
array([[ 0, 4, 8],
[ 1, 5, 9],
[ 2, 6, 10],
[ 3, 7, 11]])
'''
u = a[0].transpose() # 或者u=a[0].T也是获得转置
'''
逆时针旋转90度,第二个参数是旋转次数
array([[ 3, 2, 1, 0],
[ 7, 6, 5, 4],
[11, 10, 9, 8]])
'''
v = np.rot90(u, 3)
'''
沿纵轴左右翻转
array([[ 8, 4, 0],
[ 9, 5, 1],
[10, 6, 2],
[11, 7, 3]])
'''
w = np.fliplr(u)
'''
沿水平轴上下翻转
array([[ 3, 7, 11],
[ 2, 6, 10],
[ 1, 5, 9],
[ 0, 4, 8]])
'''
x = np.flipud(u)
'''
按照一维顺序滚动位移
array([[11, 0, 4],
[ 8, 1, 5],
[ 9, 2, 6],
[10, 3, 7]])
'''
y = np.roll(u, 1)
'''
按照指定轴滚动位移
array([[ 8, 0, 4],
[ 9, 1, 5],
[10, 2, 6],
[11, 3, 7]])
'''
z = np.roll(u, 1, axis=1)
'''
vstack是指沿着纵轴拼接两个array,vertical
hstack是指沿着横轴拼接两个array,horizontal
更广义的拼接用concatenate实现,horizontal后的两句依次等效于vstack和hstack
stack不是拼接而是在输入array的基础上增加一个新的维度
'''
m = np.vstack((l0, l1))
p = np.hstack((l0, l1))
q = np.concatenate((l0, l1))
r = np.concatenate((l0, l1), axis=-1)
s = np.stack((l0, l1))

基础运算能力

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
# 绝对值,1
a = np.abs(-1)
# sin函数,1.0
b = np.sin(np.pi/2)
# tanh逆函数,0.50000107157840523
c = np.arctanh(0.462118)
# e为底的指数函数,20.085536923187668
d = np.exp(3)
# 2的3次方,8
f = np.power(2, 3)
# 点积,1*3+2*4=11
g = np.dot([1, 2], [3, 4])
# 开方,5
h = np.sqrt(25)
# 求和,10
l = np.sum([1, 2, 3, 4])
# 平均值,5.5
m = np.mean([4, 5, 6, 7])
# 标准差,0.96824583655185426
p = np.std([1, 2, 3, 2, 1, 3, 2, 0])

线性代数模块(linalg)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
a = np.array([3, 4])
np.linalg.norm(a)
b = np.array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
])
c = np.array([1, 0, 1])
# 矩阵和向量之间的乘法
np.dot(b, c) # array([ 4, 10, 16])
np.dot(c, b.T) # array([ 4, 10, 16])
np.trace(b) # 求矩阵的迹,15
np.linalg.det(b) # 求矩阵的行列式值,0
np.linalg.matrix_rank(b) # 求矩阵的秩,2,不满秩,因为行与行之间等差
d = np.array([
[2, 1],
[1, 2]
])
'''
对正定矩阵求本征值和本征向量
本征值为u,array([ 3., 1.])
本征向量构成的二维array为v,
array([[ 0.70710678, -0.70710678],
[ 0.70710678, 0.70710678]])
是沿着45°方向
eig()是一般情况的本征值分解,对于更常见的对称实数矩阵,
eigh()更快且更稳定,不过输出的值的顺序和eig()是相反的
'''
u, v = np.linalg.eig(d)
# Cholesky分解并重建
l = np.linalg.cholesky(d)
'''
array([[ 2., 1.],
[ 1., 2.]])
'''
np.dot(l, l.T)
e = np.array([
[1, 2],
[3, 4]
])
# 对不镇定矩阵,进行SVD分解并重建
U, s, V = np.linalg.svd(e)
S = np.array([
[s[0], 0],
[0, s[1]]
])
'''
array([[ 1., 2.],
[ 3., 4.]])
'''
np.dot(U, np.dot(S, V))

random 模块

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
import numpy.random as random
# 设置随机数种子
random.seed(42)
# 产生一个1x3,[0,1)之间的浮点型随机数
# array([[ 0.37454012, 0.95071431, 0.73199394]])
# 后面的例子就不在注释中给出具体结果了
random.rand(1, 3)
# 产生一个[0,1)之间的浮点型随机数
random.random()
# 下边4个没有区别,都是按照指定大小产生[0,1)之间的浮点型随机数array,不Pythonic…
random.random((3, 3))
random.sample((3, 3))
random.random_sample((3, 3))
random.ranf((3, 3))
# 产生10个[1,6)之间的浮点型随机数
5*random.random(10) + 1
random.uniform(1, 6, 10)
# 产生10个[1,6]之间的整型随机数
random.randint(1, 6, 10)
# 产生2x5的标准正态分布样本
random.normal(size=(5, 2))
# 产生5个,n=5,p=0.5的二项分布样本
random.binomial(n=5, p=0.5, size=5)
a = np.arange(10)
# 从a中有回放的随机采样7个
random.choice(a, 7)
# 从a中无回放的随机采样7个
random.choice(a, 7, replace=False)
# 对a进行乱序并返回一个新的array
b = random.permutation(a)
# 对a进行in-place乱序
random.shuffle(a)
# 生成一个长度为9的随机bytes序列并作为str返回
# '\x96\x9d\xd1?\xe6\x18\xbb\x9a\xec'
random.bytes(9)