## 9.11 Structured Data: NumPy's Structured Arrays

``````import numpy as np
``````

``````name = ['Alice', 'Bob', 'Cathy', 'Doug']
age = [25, 45, 37, 19]
weight = [55.0, 85.5, 68.0, 61.5]
``````

``````x = np.zeros(4, dtype=int)
``````

``````# Use a compound data type for structured arrays
data = np.zeros(4, dtype={'names':('name', 'age', 'weight'),
                          'formats':('U10', 'i4', 'f8')})
print(data.dtype)

# [('name', '<U10'), ('age', '<i4'), ('weight', '<f8')]
``````

``````data['name'] = name
data['age'] = age
data['weight'] = weight
print(data)

'''
[('Alice', 25, 55.0) ('Bob', 45, 85.5) ('Cathy', 37, 68.0)
('Doug', 19, 61.5)]
'''
``````

``````# Get all names
data['name']

'''
array(['Alice', 'Bob', 'Cathy', 'Doug'],
dtype='<U10')
'''

# Get the first row of data
data[0]

# ('Alice', 25, 55.0)

# Get the name from the last row
data[-1]['name']

# 'Doug'
``````

``````# Get the names of people whose age is under 30
data[data['age'] < 30]['name']

'''
array(['Alice', 'Doug'],
dtype='<U10')
'''
``````
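Boolean masks built from individual fields can also be combined. The following is a minimal sketch, not part of the original text: it rebuilds the same four-record array so it runs on its own, and the variable names `mask`, `selected`, and `subset` are illustrative only.

```python
import numpy as np

# Rebuild the four-record array from above so this block is self-contained.
data = np.zeros(4, dtype={'names': ('name', 'age', 'weight'),
                          'formats': ('U10', 'i4', 'f8')})
data['name'] = ['Alice', 'Bob', 'Cathy', 'Doug']
data['age'] = [25, 45, 37, 19]
data['weight'] = [55.0, 85.5, 68.0, 61.5]

# Combine two field-based masks with &; each comparison needs parentheses.
mask = (data['age'] < 40) & (data['weight'] > 60)
selected = data[mask]['name']

# Passing a list of field names selects several fields at once.
subset = data[['name', 'age']]

print(selected)
print(subset.dtype.names)
```

Because each field behaves like an ordinary NumPy array, the usual comparison and logical operators apply to it unchanged.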

### Creating Structured Arrays

``````np.dtype({'names':('name', 'age', 'weight'),
'formats':('U10', 'i4', 'f8')})

# dtype([('name', '<U10'), ('age', '<i4'), ('weight', '<f8')])
``````

``````np.dtype({'names':('name', 'age', 'weight'),
'formats':((np.str_, 10), int, np.float32)})

# dtype([('name', '<U10'), ('age', '<i8'), ('weight', '<f4')])
``````

``````np.dtype([('name', 'S10'), ('age', 'i4'), ('weight', 'f8')])

# dtype([('name', 'S10'), ('age', '<i4'), ('weight', '<f8')])
``````

``````np.dtype('S10,i4,f8')

# dtype([('f0', 'S10'), ('f1', '<i4'), ('f2', '<f8')])
``````
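The specification styles shown above describe the same kind of object. As a rough check (a sketch with illustrative variable names, using `U10` for the name field in all three cases), the dictionary and list-of-tuples forms compare equal when they carry the same names and formats, while the comma-separated string form falls back to the default names `f0`, `f1`, `f2`.

```python
import numpy as np

# The same layout written three ways.
dict_spec = np.dtype({'names': ('name', 'age', 'weight'),
                      'formats': ('U10', 'i4', 'f8')})
tuple_spec = np.dtype([('name', 'U10'), ('age', 'i4'), ('weight', 'f8')])
string_spec = np.dtype('U10,i4,f8')

# Dictionary and list-of-tuples forms describe an identical dtype.
same_layout = (dict_spec == tuple_spec)

# The string form has no place for names, so NumPy assigns defaults.
default_names = string_spec.names

# Unaligned record size: 10 UCS-4 characters (40 bytes) + 4 + 8 bytes.
record_size = dict_spec.itemsize

print(same_layout, default_names, record_size)
```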

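| Character | Description | Example |
|---|---|---|
| `'b'` | Byte | `np.dtype('b')` |
| `'i'` | Signed integer | `np.dtype('i4') == np.int32` |
| `'u'` | Unsigned integer | `np.dtype('u1') == np.uint8` |
| `'f'` | Floating point | `np.dtype('f8') == np.float64` |
| `'c'` | Complex floating point | `np.dtype('c16') == np.complex128` |
| `'S'`, `'a'` | Byte string | `np.dtype('S5')` |
| `'U'` | Unicode string | `np.dtype('U') == np.str_` |
| `'V'` | Raw data (void) | `np.dtype('V') == np.void` |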
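The numeric equalities listed above can be checked directly. This is a small sketch with a hypothetical `checks` dictionary; the number after each type character is the item size in bytes, except for `'U'`, where it counts characters stored at 4 bytes each.

```python
import numpy as np

# Map each type code to the NumPy scalar type it is expected to match.
checks = {
    'i4': np.int32,
    'u1': np.uint8,
    'f8': np.float64,
    'c16': np.complex128,
}

for code, scalar_type in checks.items():
    assert np.dtype(code) == scalar_type

# Item sizes in bytes for each numeric code.
sizes = {code: np.dtype(code).itemsize for code in checks}

# 'S5' holds 5 bytes; 'U5' holds 5 characters at 4 bytes each.
bytes_size = np.dtype('S5').itemsize
unicode_size = np.dtype('U5').itemsize

print(sizes, bytes_size, unicode_size)
```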

### More Advanced Compound Types

``````tp = np.dtype([('id', 'i8'), ('mat', 'f8', (3, 3))])
X = np.zeros(1, dtype=tp)
print(X[0])
print(X['mat'][0])

'''
(0, [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
[[ 0.  0.  0.]
[ 0.  0.  0.]
[ 0.  0.  0.]]
'''
``````
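A field that holds a matrix can be filled and processed like any other array. The sketch below reuses the `tp` dtype from above with two records instead of one; the values assigned and the use of `np.trace` are illustrative choices, not something the original example does.

```python
import numpy as np

# Each record holds an integer id and a 3x3 float matrix.
tp = np.dtype([('id', 'i8'), ('mat', 'f8', (3, 3))])
X = np.zeros(2, dtype=tp)

X['id'] = [1, 2]
# X['mat'] is a view, so assigning into it writes back into X.
X['mat'][0] = np.eye(3)
X['mat'][1] = np.arange(9).reshape(3, 3)

# The field stacks the per-record matrices along the first axis.
mat_shape = X['mat'].shape

# Trace of each record's matrix, computed over the last two axes.
traces = np.trace(X['mat'], axis1=1, axis2=2)

print(mat_shape)
print(traces)
```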

### Record Arrays: Structured Arrays with a Twist

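NumPy also provides the `np.recarray` class, which is almost identical to the structured arrays just described but has one additional feature: fields can be accessed as attributes rather than as dictionary keys.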

``````data['age']

# array([25, 45, 37, 19], dtype=int32)
``````

``````data_rec = data.view(np.recarray)
data_rec.age

# array([25, 45, 37, 19], dtype=int32)
``````
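Since `view` does not copy, the record array and the original structured array share the same memory. A minimal sketch of this, using a smaller two-record array with hypothetical values so the block runs on its own:

```python
import numpy as np

# A small structured array, kept separate from the running example.
data = np.zeros(2, dtype=[('name', 'U10'), ('age', 'i4')])
data['name'] = ['Alice', 'Bob']
data['age'] = [25, 45]

# Viewing as a recarray reinterprets the same buffer; nothing is copied.
data_rec = data.view(np.recarray)

# Writing through attribute access changes the original array too.
data_rec.age[0] = 26

shares = np.shares_memory(data, data_rec)
print(shares, data['age'])
```

If an independent record array is needed, calling `.copy()` on the view is one way to get it.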

``````%timeit data['age']
%timeit data_rec['age']
%timeit data_rec.age

'''
1000000 loops, best of 3: 241 ns per loop
100000 loops, best of 3: 4.61 µs per loop
100000 loops, best of 3: 7.27 µs per loop
'''
``````
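The `%timeit` lines above are IPython magic commands and will not run in a plain Python script. Outside IPython, a rough equivalent can be put together with the standard-library `timeit` module, as in the sketch below. The repetition count `n` is an arbitrary choice, and the absolute numbers depend on the machine and NumPy version, so no expected output is shown.

```python
import timeit

import numpy as np

# Same field layout as the running example; the values do not matter here.
data = np.zeros(4, dtype=[('name', 'U10'), ('age', 'i4'), ('weight', 'f8')])
data_rec = data.view(np.recarray)

n = 10000  # number of repetitions per measurement (arbitrary)

# Time the three access styles compared above.
timings = {
    "data['age']": timeit.timeit(lambda: data['age'], number=n),
    "data_rec['age']": timeit.timeit(lambda: data_rec['age'], number=n),
    "data_rec.age": timeit.timeit(lambda: data_rec.age, number=n),
}

for label, seconds in timings.items():
    print(f"{label}: {seconds / n * 1e9:.0f} ns per call")
```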