# Natural Language Processing 2025
## LM Specialised Translation

### Neural networks

We are using Keras.

In [None]:
# Do not forget to install keras the first time
# (this is already installed on colab)
# ! pip3 install keras

In [None]:
# Importing all the dependencies for this notebook
# All the keras library
import keras
# specifically the layers
from keras import layers

# we are going to produce a couple of plots
import matplotlib.pyplot as plt
# Euler's number e (used below to write the activation functions)
from math import e
# to produce numpy arrays (vectors)
import numpy as np

In [None]:
# Getting ready to plot a sigmoid function
x = np.arange(-10.0, 10.0, 0.01)
print(x)

In [None]:
# Create the figure in the same cell that draws on it; with the inline
# backend, a figure created in an earlier cell may be saved empty
fig = plt.figure(figsize=(5,5))
y = 1 / (1 + e**(-x))
plt.plot(x, y)
# Dotted guides at y=0.5 and x=0 (by default they span the whole axes)
plt.axhline(y=0.5, linestyle=":", linewidth=0.5)
plt.axvline(x=0, linestyle=":", linewidth=0.5)
# Save the figure to a file that you can download
fig.savefig("10_sigmoid.png")
# Display the figure
plt.show()

In [None]:
# Curious about what x and y contain?
print("x\ty")
for i, j in zip(x, y):
    print(i.round(2), "\t", j.round(5))
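
A few properties of the sigmoid can be read off the plot and the table: it crosses 0.5 at x = 0, it is symmetric around that point, and it saturates towards 0 and 1 at the extremes. The sketch below checks these numerically using only the standard library; the helper name `sigmoid` is ours, not part of Keras.

```python
from math import e

def sigmoid(x):
    """Logistic sigmoid, same formula as in the plot: 1 / (1 + e^-x)."""
    return 1 / (1 + e ** (-x))

# The curve crosses 0.5 exactly at x = 0
print("sigmoid(0) =", sigmoid(0))

# Symmetry around (0, 0.5): sigmoid(-x) equals 1 - sigmoid(x)
for x in (0.5, 2.0, 7.5):
    print(x, sigmoid(-x), 1 - sigmoid(x))

# Saturation: far from 0 the output is very close to 0 or 1
print("sigmoid(-10) =", sigmoid(-10))
print("sigmoid(10)  =", sigmoid(10))
```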

**Back to the slides**

## Neural network to learn the XOR logical function

With non-linear activation functions and more than one neuron, we can learn more sophisticated functions, such as XOR, which a single linear unit cannot represent.
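
To see why two layers are enough, here is a minimal sketch of a tiny network that computes XOR with weights chosen by hand. Nothing is learned here, and these are not the weights Keras will find: the `step` activation, the function names and all weight values are illustrative assumptions.

```python
def step(z):
    """Threshold activation: 1 if z > 0, else 0."""
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    # Hidden unit 1 behaves like OR: fires if at least one input is 1
    h_or = step(x1 + x2 - 0.5)
    # Hidden unit 2 behaves like AND: fires only if both inputs are 1
    h_and = step(x1 + x2 - 1.5)
    # Output unit: fires for "OR but not AND", which is XOR
    return step(h_or - h_and - 0.5)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, "->", xor_net(x1, x2))
```

Each hidden unit draws one straight line through the input space; the output unit combines the two regions, which a single unit cannot do on its own.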

In [None]:
# Instances for XOR
# input
x_train = np.array(
    [[0, 0],
    [0, 1],
    [1, 0],
    [1, 1]])

# desired output
y_train = np.array(
    [[0],
    [1],
    [1],
    [0]])

In [None]:
# Creating the model

# The first layer of the network will have 10 units
num_neurons = 10

# Sequential means a linear stack of layers (it does not mean that the input is a sequence)
model = keras.Sequential(
    [
        # Input: each instance has 2 features
        layers.Input(shape=(2,)),
        # Hidden layer: 2 inputs, 10 units, hyperbolic tangent activation function
        layers.Dense(num_neurons, activation="tanh", name="layer1"),
        # Output layer: How many units does it have? Which activation function?
        layers.Dense(1, activation="sigmoid", name="layer2")
    ]
)
model.summary()
# Let's go to the slides (after analysing this summary a bit)
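
The parameter counts in the summary can be reproduced by hand: a dense layer has one weight per input per unit, plus one bias per unit. The sketch below does that arithmetic for the two layers of this model; `dense_params` is a hypothetical helper, not a Keras function.

```python
def dense_params(n_inputs, n_units):
    """Weights (one per input per unit) plus one bias per unit."""
    return n_inputs * n_units + n_units

layer1 = dense_params(2, 10)   # 2*10 weights + 10 biases
layer2 = dense_params(10, 1)   # 10*1 weights + 1 bias
total_params = layer1 + layer2

print("layer1:", layer1)
print("layer2:", layer2)
print("total:", total_params)
```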

First of all, [what is e](https://en.wikipedia.org/wiki/E_(mathematical_constant))?

In [None]:
# The tanh (hyperbolic tangent)

fig = plt.figure(figsize=(5,5))
x = np.arange(-10.0, 10.0, 0.01)
y = (e**x - e**(-x))/(e**x + e**(-x))
plt.plot(x, y)
# Dotted guides at y=0 and x=0 (by default they span the whole axes)
plt.axhline(y=0, linestyle=":", linewidth=0.5)
plt.axvline(x=0, linestyle=":", linewidth=0.5)
fig.savefig("10_tanh.png")
plt.show()
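
The tanh is closely related to the sigmoid: it is a sigmoid stretched to the range (-1, 1) and centred at 0, since tanh(x) = 2·sigmoid(2x) − 1. The sketch below checks this identity numerically at a few points, using only the standard library; the helper names are ours.

```python
from math import e, tanh

def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^-x)."""
    return 1 / (1 + e ** (-x))

def tanh_from_formula(x):
    """Hyperbolic tangent written out as in the plotting cell."""
    return (e ** x - e ** (-x)) / (e ** x + e ** (-x))

# Columns: x, the explicit formula, the rescaled sigmoid, math.tanh
for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
    print(x, tanh_from_formula(x), 2 * sigmoid(2 * x) - 1, tanh(x))
```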

In [None]:
# Compiling the model with stochastic gradient descent and alpha=0.1
# Reminder: alpha is the learning rate (the Keras default for SGD is 0.01)
sgd = keras.optimizers.SGD(learning_rate=0.1)
model.compile(loss='binary_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])
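
For intuition about the loss chosen here, binary cross-entropy can be computed by hand as the mean of −[y·log(p) + (1−y)·log(1−p)] over the instances. The following is a minimal sketch in plain Python, not the Keras implementation; library implementations typically also clip the probabilities to avoid log(0), which this sketch does not do. The prediction values are made up for illustration.

```python
from math import log

def binary_crossentropy(y_true, y_pred):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all instances."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        total += -(y * log(p) + (1 - y) * log(1 - p))
    return total / len(y_true)

y_true = [0, 1, 1, 0]  # XOR targets

# An undecided model (all predictions at 0.5) has loss log(2)
loss_undecided = binary_crossentropy(y_true, [0.5, 0.5, 0.5, 0.5])
# Confident and correct predictions give a much lower loss
loss_confident = binary_crossentropy(y_true, [0.1, 0.9, 0.9, 0.1])

print("undecided:", loss_undecided)
print("confident:", loss_confident)
```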

In [None]:
# Predicting with this model (before training: "zero shot")
model.predict(x_train)

In [None]:
# Train (fit) the model (if it doesn't converge, add more epochs)
# (this cell can be run several times: each call continues training from the
# current weights, so the epochs accumulate)
model.fit(x_train, y_train, epochs=800)

print("\nCurrent predictions:\n")
model.predict(x_train)
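
What happens inside each training step can be sketched for a single sigmoid neuron: compute the prediction, measure the error, and move each weight a small step (alpha) against the gradient. For a sigmoid output with binary cross-entropy, the gradient with respect to the pre-activation is simply p − y. This is a simplified illustration on one instance, not what Keras runs internally; all names and starting values are assumptions.

```python
from math import exp, log

def sigmoid(z):
    return 1 / (1 + exp(-z))

alpha = 0.1            # learning rate
w = [0.0, 0.0]         # one weight per input
b = 0.0                # bias
x = [1.0, 0.0]         # a single training instance
y = 1                  # its desired output

losses = []
for _ in range(5):
    # Forward pass: weighted sum, then sigmoid
    p = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
    losses.append(-(y * log(p) + (1 - y) * log(1 - p)))
    # Gradient of the loss with respect to the weighted sum
    error = p - y
    # Update: step against the gradient, scaled by the learning rate
    w = [wi - alpha * error * xi for wi, xi in zip(w, x)]
    b = b - alpha * error

print(losses)
```

The loss shrinks a little at every step, which is the same trend the per-epoch loss printed by `fit` should show when training goes well.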

In [None]:
# Predicting with this model (after training)
y_pred = model.predict(x_train)
# np.where(condition, value if true, value if false), with the threshold at 0.5
np.where(y_pred > 0.5, 1, 0)

In [None]:
# Another way to get classes (if binary 0 vs 1 problem)
model.predict(x_train).round()

# Why this round works here:
# The threshold is at 0.5
# This is a binary problem
# The story is a bit different for multi-class (e.g., np.argmax(model.predict(x), axis=-1))
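
To illustrate why rounding is not enough in the multi-class case, here is a small sketch with made-up probabilities for three classes (they do not come from any model in this notebook). It is written in plain Python; `np.argmax(probs, axis=-1)` performs the same per-row selection on a NumPy array.

```python
# Each row holds the predicted probabilities for one instance (3 classes)
probs = [
    [0.10, 0.70, 0.20],
    [0.60, 0.30, 0.10],
    [0.40, 0.35, 0.25],
]

# Predicted class = index of the largest probability in each row
classes = [max(range(len(row)), key=row.__getitem__) for row in probs]
print("argmax:", classes)

# Rounding breaks down when no class reaches 0.5: the last row becomes all zeros
rounded = [[round(p) for p in row] for row in probs]
print("rounded:", rounded)
```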

Finally, you can save your model, for instance to deploy it later on.

In [None]:
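# The weights are stored in HDF5 format, which requires the h5py package
# (importing it here simply checks that it is installed)
import h5py
# Save the architecture (layers and their configuration) as JSON
model_structure = model.to_json()
with open("basic_model.json", "w") as json_file:
    json_file.write(model_structure)
# Save the learned weights separately
model.save_weights("basic.weights.h5")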

**end of the notebook**