TensorFlow(Keras)

Posted on 2025-07-21 In Python, Machine Learning, TensorFlow

Summer study notes

tf (Keras)

Linear Layer

import numpy as np
import tensorflow as tf
linear_layer = tf.keras.layers.Dense(units=1, activation='linear')
linear_layer.get_weights()

'''
[]
'''

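In fact, the weights are created during the first forward pass (the first time the layer is called on an input), so at this point the layer has no parameters yet.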
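The lazy creation described above can be checked directly. This is a minimal sketch, assuming a single-feature input; the input value `[[1.0]]` and the variable names are placeholders for illustration, not part of the original notes.

```python
import numpy as np
import tensorflow as tf

layer = tf.keras.layers.Dense(units=1, activation="linear")

# Before the first call the layer is not built, so it has no weights.
weights_before = layer.get_weights()

# Calling the layer on a (1, 1) input builds it: a (1, 1) kernel and a (1,) bias.
_ = layer(np.array([[1.0]]))  # hypothetical input, only used to trigger the build
weights_after = layer.get_weights()

print(len(weights_before), [w.shape for w in weights_after])
```

Only after this first call does `set_weights` have variables to assign into.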

set_w = np.array([[200]])
set_b = np.array([100])

# set_weights takes a list of numpy arrays; the layer must already be built (called once on an input)
linear_layer.set_weights([set_w, set_b])
print(linear_layer.get_weights())

'''
[array([[200.]], dtype=float32), array([100.], dtype=float32)]
'''

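Once the layer has been initialized, its parameters can be set manually. Running the computation shows that the layer's output matches the NumPy result.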

prediction_tf = linear_layer(X_train)
prediction_np = np.dot(X_train, set_w) + set_b
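To make the comparison concrete, here is a self-contained sketch of the same steps. The two-sample `X_train` below is made up for illustration (the original training data is not shown in these notes); the weights are the ones set above.

```python
import numpy as np
import tensorflow as tf

X_train = np.array([[1.0], [2.0]])  # hypothetical data, one feature per sample

linear_layer = tf.keras.layers.Dense(units=1, activation="linear")
linear_layer(X_train)  # first call builds the layer so weights exist

set_w = np.array([[200.0]])
set_b = np.array([100.0])
linear_layer.set_weights([set_w, set_b])

# The layer computes X @ W + b, the same expression as the NumPy line.
prediction_tf = np.asarray(linear_layer(X_train))
prediction_np = np.dot(X_train, set_w) + set_b

print(np.allclose(prediction_tf, prediction_np))
```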

Sigmoid Layer

model = tf.keras.Sequential(
  [
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(1, activation='sigmoid', name='L1')
  ]
)

Calling model.summary() gives an overview of the network: each layer's name and type, its output shape, and its parameter count.
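The parameter counts reported in the summary can be reproduced by hand: a Dense layer has one weight per (input, unit) pair plus one bias per unit. The helper below is a hypothetical name used only for this sketch; the 2 → 3 → 1 sizes are taken from the weight shapes printed later in these notes.

```python
def dense_param_count(n_inputs, units):
    """Number of trainable parameters in a Dense layer: kernel plus bias."""
    return n_inputs * units + units

single = dense_param_count(1, 1)  # Dense(1) on a 1-feature input: 1 weight + 1 bias
hidden = dense_param_count(2, 3)  # 2 inputs -> 3 units
output = dense_param_count(3, 1)  # 3 inputs -> 1 unit

print(single, hidden, output, hidden + output)  # 2 9 4 13
```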

logistic_layer = model.get_layer('L1')
w, b = logistic_layer.get_weights()
print(w, b)
print(w.shape, b.shape)

'''
[[0.48]] [0.]
(1, 1) (1,)
'''

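The model here has already been initialized: the added Input layer tells tf the input shape in advance, so the parameters can be created right away.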

set_w = np.array([[2]])
set_b = np.array([-4.5])
# set_weights takes a list of numpy arrays
logistic_layer.set_weights([set_w, set_b])
print(logistic_layer.get_weights())

'''
[array([[2.]], dtype=float32), array([-4.5], dtype=float32)]
'''

As before, the parameters can be set by hand.

a1 = model.predict(X_train[0].reshape(1, 1))
print(a1)
alog = sigmoidnp(np.dot(set_w, X_train[0].reshape(1, 1)) + set_b)
print(alog)

'''
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 37ms/step
[[0.01]]
[[0.01]]
'''

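We use model.predict to run the forward pass. Its result matches the manual computation, where sigmoidnp is presumably a NumPy sigmoid helper defined elsewhere.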
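The manual side of that comparison can be sketched in plain NumPy. The `sigmoid` function here is written out for illustration, and the input `x` is a made-up value, since the original training data is not shown.

```python
import numpy as np

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z)), applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([[2.0]])   # same weight as set above
b = np.array([-4.5])    # same bias as set above
x = np.array([[0.0]])   # hypothetical single input, shape (1, 1)

a = sigmoid(np.dot(x, w) + b)  # forward pass of a one-unit sigmoid layer
print(np.round(a, 2))  # [[0.01]]
```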

Normalization Layer

X,Y = load_coffee_data()
print(X.shape, Y.shape)

print(f"Temperature Max, Min pre normalization: {np.max(X[:,0]):0.2f}, {np.min(X[:,0]):0.2f}")
print(f"Duration    Max, Min pre normalization: {np.max(X[:,1]):0.2f}, {np.min(X[:,1]):0.2f}")
norm_l = tf.keras.layers.Normalization(axis=-1)
norm_l.adapt(X)  # learns mean, variance
Xn = norm_l(X)
print(f"Temperature Max, Min post normalization: {np.max(Xn[:,0]):0.2f}, {np.min(Xn[:,0]):0.2f}")
print(f"Duration    Max, Min post normalization: {np.max(Xn[:,1]):0.2f}, {np.min(Xn[:,1]):0.2f}")

'''
(200, 2) (200, 1)

Temperature Max, Min pre normalization: 284.99, 151.32
Duration    Max, Min pre normalization: 15.45, 11.51
Temperature Max, Min post normalization: 1.66, -1.69
Duration    Max, Min post normalization: 1.79, -1.70
'''

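Normalization is a form of feature scaling: each feature is rescaled to roughly zero mean and unit variance, so its values fall in a small range around 0. This helps gradient descent converge and reduces the risk of exploding or vanishing gradients.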
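What the Normalization layer does after `adapt` can be approximated in NumPy as subtracting the per-feature mean and dividing by the per-feature standard deviation. The data below is randomly generated in ranges similar to the printed ones; it is a stand-in, not the coffee dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in data: column 0 ~ temperature, column 1 ~ duration.
X = np.column_stack([
    rng.uniform(150.0, 285.0, size=200),
    rng.uniform(11.5, 15.5, size=200),
])

mean = X.mean(axis=0)  # per-feature mean, what adapt() learns
var = X.var(axis=0)    # per-feature variance, what adapt() learns
Xn = (X - mean) / np.sqrt(var)

print(Xn.mean(axis=0).round(6), Xn.std(axis=0).round(6))
```

After scaling, both columns live on a comparable scale even though the raw ranges differ by an order of magnitude.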

Getting the weights of each layer

W1, b1 = model.get_layer("layer1").get_weights()
W2, b2 = model.get_layer("layer2").get_weights()
print(f"W1{W1.shape}:\n", W1, f"\nb1{b1.shape}:", b1)
print(f"W2{W2.shape}:\n", W2, f"\nb2{b2.shape}:", b2)

'''
W1(2, 3):
 [[ 0.8   0.94  0.54]
 [-0.16  0.8  -0.07]] 
b1(3,): [0. 0. 0.]
W2(3, 1):
 [[-0.67]
 [ 0.23]
 [ 0.84]] 
b2(1,): [0.]
'''

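Here is how to read these parameters: each column of the matrix holds the weights of one neuron, and the corresponding component of the bias vector is that neuron's bias. This can be a little confusing. In multiple linear regression we use $f(\vec{x}) = \vec{w} \cdot \vec{x} + b$, where $\vec{w}$ and $\vec{x}$ are both vectors; that formula is for a single data point.

To evaluate many data points at once, the natural move is to stack the $\vec{x}$ vectors into a matrix $X$ in which each row is one data point, which also matches how we usually work with 2D arrays. Because of how matrix multiplication is defined, keeping the form $WX + b$ would require transposing that input, with each row of $W$ holding one neuron's weights. In practice we pass $X$ in its usual row-per-sample form and store $W$ transposed instead, which is why the columns are the per-neuron weights, and the output is computed as $XW + b$. Since each data point $\vec{x}$ is normally stored as a row vector anyway, this layout is the more convenient one.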
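The layout can be verified numerically. The sketch below reuses the printed `W1` and `b1` values; the two input rows are made up for illustration.

```python
import numpy as np

# W1 and b1 as printed above: shape (2, 3), one column per neuron.
W1 = np.array([[0.8, 0.94, 0.54],
               [-0.16, 0.8, -0.07]])
b1 = np.zeros(3)

# Hypothetical batch: each row is one data point with 2 features.
X = np.array([[200.0, 13.0],
              [250.0, 12.0]])

Z = X @ W1 + b1  # batched form, shape (2, 3)

# Neuron 0 computed one sample at a time with w . x + b, using column 0 of W1.
z_neuron0 = np.array([np.dot(W1[:, 0], x) + b1[0] for x in X])

# The transposed convention gives the same numbers, just laid out column-per-sample.
Z_alt = (W1.T @ X.T).T + b1

print(Z.shape, np.allclose(Z[:, 0], z_neuron0), np.allclose(Z, Z_alt))
```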

Fitting

model.compile(
    loss = tf.keras.losses.BinaryCrossentropy(),
    optimizer = tf.keras.optimizers.Adam(learning_rate=0.01),
)

model.fit(
    Xt,Yt,            
    epochs=10,
)

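Here model.compile specifies the loss function and the optimizer, and model.fit then runs the training loop: gradient-based optimization with Adam, for 10 epochs in this case.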
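As a rough picture of what happens inside the training loop, here is a NumPy sketch of a single sigmoid unit trained with binary cross-entropy. It uses plain full-batch gradient descent rather than Adam, and the toy data, learning rate, and names are assumptions for illustration, so it is a simplification of what `fit` actually does.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y, p, eps=1e-7):
    """Mean binary cross-entropy; p is clipped to avoid log(0)."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

# Hypothetical toy data: label is 1 when x >= 3.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([[0.0], [0.0], [0.0], [1.0], [1.0], [1.0]])

w = np.zeros((1, 1))
b = np.zeros(1)
learning_rate = 0.1  # assumed value, not the Adam setting above
losses = []

for epoch in range(10):
    p = sigmoid(X @ w + b)                 # forward pass
    losses.append(binary_crossentropy(y, p))
    grad_z = (p - y) / len(X)              # dLoss/dz for sigmoid + cross-entropy
    w -= learning_rate * (X.T @ grad_z)    # gradient step on the weight
    b -= learning_rate * grad_z.sum(axis=0)  # gradient step on the bias

print(round(losses[0], 4), round(losses[-1], 4))
```

Adam replaces the fixed-step update with per-parameter adaptive steps, but the forward pass, loss, and gradient are the same ingredients.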