Natural Language Processing 2023

Academic Year 2023/2024

This page describes a past edition of the course, not the current one (2024).

Visit the UniBO website of the lecture for official and administrative details.

Prerequisites

A gentle introduction to Python

This topic won't be covered in class.

if you are a student of TraTec:
  you had the intro to Python in PBR
elif you are a student of SpecTra:
  you had the intro to Python in APS
else:
  check the slides, notebooks, and 2021 video recordings

Regardless, you can find the materials on virtuale.

Whether or not you attended one of these introductions, I suggest that you do (or revisit) all the exercises as soon as possible.

Course contents

Although the contents may be (slightly) adapted to the students' skills and interests, the general structure of the course is as follows.

As of June 2024, the links are broken. I am working on transferring the contents from the old website.

1. Introduction to Natural Language Processing

2. Words and the vector space model

3. Naïve Bayes

4. Word vectors

5. From Word Counts to Meaning

6. Training and Evaluation

7. Intro to NN

  • 31/10/23 Slides on the perceptron
  • 31/10/23 Notebook on the perceptron
  • 06/11/23 Slides introducing neural networks and keras
  • 06/11/23 Notebook introducing neural networks and keras

8. Word Embeddings

9. Doc2Vec

10. Convolutions for text

11. Text is Sequential / LSTM

12. Text generation

13. Intro to Seq2Seq and Transformers; Closing Remarks

  • 05/12/23 Slides for part one

14. A brief intro to LLMs

Embed videos, podcasts, code, LaTeX math, and even test students!

On this page, you’ll find some examples of the types of technical content that can be rendered with Hugo Blox.

Video

Teach your course by sharing videos with your students. Choose from one of the following approaches:

Youtube:

{{< youtube w7Ft2ymGmfc >}}

Bilibili:

{{< bilibili id="BV1WV4y1r7DF" >}}

Video file

Videos may be added to a page by either placing them in your assets/media/ media library or in your page’s folder, and then embedding them with the video shortcode:

{{< video src="my_video.mp4" controls="yes" >}}

Podcast

You can add a podcast or music to a page by placing the MP3 file in the page’s folder or the media library folder and then embedding the audio on your page with the audio shortcode:

{{< audio src="ambient-piano.mp3" >}}

Test students

Provide a simple yet fun self-assessment by revealing the solutions to challenges with the spoiler shortcode:

{{< spoiler text="👉 Click to view the solution" >}}
You found me!
{{< /spoiler >}}

renders as

👉 Click to view the solution
You found me!

Math

Hugo Blox Builder supports a Markdown extension for $\LaTeX$ math. You can enable this feature by toggling the math option in your config/_default/params.yaml file.

To render inline or block math, wrap your LaTeX math with {{< math >}}$...${{< /math >}} or {{< math >}}$$...$${{< /math >}}, respectively.

We wrap the LaTeX math in the Hugo Blox math shortcode to prevent Hugo rendering our math as Markdown.

Example math block:

{{< math >}}
$$
\gamma_{n} = \frac{ \left | \left (\mathbf x_{n} - \mathbf x_{n-1} \right )^T \left [\nabla F (\mathbf x_{n}) - \nabla F (\mathbf x_{n-1}) \right ] \right |}{\left \|\nabla F(\mathbf{x}_{n}) - \nabla F(\mathbf{x}_{n-1}) \right \|^2}
$$
{{< /math >}}

renders as

$$\gamma_{n} = \frac{ \left | \left (\mathbf x_{n} - \mathbf x_{n-1} \right )^T \left [\nabla F (\mathbf x_{n}) - \nabla F (\mathbf x_{n-1}) \right ] \right |}{\left \|\nabla F(\mathbf{x}_{n}) - \nabla F(\mathbf{x}_{n-1}) \right \|^2}$$
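To make the formula concrete, here is a minimal Python sketch that evaluates $\gamma_{n}$ from two consecutive points and their gradients. The expression has the form of the Barzilai-Borwein step size used in gradient descent. The function names `step_size` and `grad_F`, the example function $F$, and the two sample points are assumptions chosen for illustration; they do not come from this page.

```python
def step_size(x_curr, x_prev, grad_curr, grad_prev):
    """Evaluate gamma_n for vectors given as plain lists of floats."""
    # Differences x_n - x_{n-1} and grad F(x_n) - grad F(x_{n-1}).
    dx = [a - b for a, b in zip(x_curr, x_prev)]
    dg = [a - b for a, b in zip(grad_curr, grad_prev)]

    # Numerator: absolute value of the inner product of the two differences.
    numerator = abs(sum(a * b for a, b in zip(dx, dg)))
    # Denominator: squared Euclidean norm of the gradient difference.
    denominator = sum(g * g for g in dg)

    if denominator == 0.0:
        raise ValueError("gradient difference is zero; gamma_n is undefined")
    return numerator / denominator


def grad_F(x):
    # Hypothetical example: F(x) = x_1^2 + 2 * x_2^2, so grad F = (2 x_1, 4 x_2).
    return [2.0 * x[0], 4.0 * x[1]]


# Two arbitrary consecutive iterates, chosen only for illustration.
x_prev = [1.0, 1.0]
x_curr = [0.5, 0.0]

gamma = step_size(x_curr, x_prev, grad_F(x_curr), grad_F(x_prev))
print(gamma)
```

For these sample values the numerator is 4.5 and the denominator is 17, so the script prints a value close to 0.2647.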

Example inline math {{< math >}}$\nabla F(\mathbf{x}_{n})${{< /math >}} renders as $\nabla F(\mathbf{x}_{n})$.

Example multi-line math using the math linebreak (\\):

{{< math >}}
$$f(k;p_{0}^{*}) = \begin{cases}p_{0}^{*} & \text{if }k=1, \\
1-p_{0}^{*} & \text{if }k=0.\end{cases}$$
{{< /math >}}

renders as

$$ f(k;p_{0}^{*}) = \begin{cases}p_{0}^{*} & \text{if }k=1, \\ 1-p_{0}^{*} & \text{if }k=0.\end{cases} $$
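The piecewise definition above is the probability mass function of a Bernoulli variable with success probability $p_{0}^{*}$. As a minimal sketch, it can be written as a small Python function; the name `bernoulli_pmf` and the sample value 0.3 are assumptions used only for illustration.

```python
def bernoulli_pmf(k, p):
    """Return f(k; p): p if k == 1, and 1 - p if k == 0."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if k == 1:
        return p
    if k == 0:
        return 1.0 - p
    # The definition above only covers k = 0 and k = 1.
    raise ValueError("k must be 0 or 1")


p_star = 0.3  # hypothetical value of the parameter
total = bernoulli_pmf(0, p_star) + bernoulli_pmf(1, p_star)
print(bernoulli_pmf(1, p_star), bernoulli_pmf(0, p_star), total)
```

The two cases sum to 1 (up to floating-point rounding), as expected for a probability distribution over two outcomes.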

Code

Hugo Blox Builder utilises Hugo’s Markdown extension for highlighting code syntax. The code theme can be selected in the config/_default/params.yaml file.

```python
import pandas as pd
data = pd.read_csv("data.csv")
data.head()
```

renders as

import pandas as pd
data = pd.read_csv("data.csv")
data.head()

Inline Images

{{< icon name="python" >}} Python

renders as

Python
