Welcome to HuangFuSL's blog

Blog timeline

Implementing Llama-2

2024-09-07

In this section, we implement a Llama-2 model.

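Normalization

2024-09-07

Normalization rescales data proportionally, changing its distribution. Depending on the dimension along which it is applied, normalization can be divided into the following types: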

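Positional encoding

2024-09-07

Although the attention mechanism can capture dependencies between different positions in a sequence, it cannot tell which position an element occupies. To address this, the Transformer model introduces positional encoding.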

Transformer variants

2024-09-07

Transformer variants mainly include the following:

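Activation functions

2024-09-02

Activation functions introduce nonlinearity into a neural network. Commonly used activation functions include the following: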

Attention mechanism

2024-09-02

Implementing BERT

2024-09-02

A model in Hugging Face usually comes with a config that stores the model's hyperparameters.

Custom transformer

2024-09-02

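Encoder and decoder

2024-09-02

Building on the attention mechanism, Vaswani et al. proposed two transformer architectures: the encoder and the decoder. The encoder uses self-attention to encode the input sequence, while the decoder uses both self-attention and cross-attention to predict the next element of the sequence.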

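Loss functions

2024-09-02

A loss function measures the difference between a model's predictions and the true values. An optimization algorithm then adjusts the model's parameters to minimize the value of the loss function.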
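
As a small illustration of the loss-then-optimize loop described in this excerpt, the sketch below fits a one-variable linear model with mean squared error and plain gradient descent. It is not taken from the post itself: the toy data, the learning rate, and the names `mse_loss` and `gradient_step` are assumptions chosen for the example.

```python
def mse_loss(w, b, xs, ys):
    """Mean squared error of the linear model y = w * x + b."""
    n = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n


def gradient_step(w, b, xs, ys, lr):
    """One gradient descent update of (w, b) on the MSE loss."""
    n = len(xs)
    # Analytic gradients of the MSE with respect to w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return w - lr * grad_w, b - lr * grad_b


# Hypothetical toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = 0.0, 0.0
initial_loss = mse_loss(w, b, xs, ys)
for _ in range(2000):
    w, b = gradient_step(w, b, xs, ys, lr=0.05)
final_loss = mse_loss(w, b, xs, ys)
```

After the loop, `final_loss` should be far below `initial_loss`, and `w` and `b` should be close to the slope and intercept used to generate the data.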

Recent updates

Currently working on

Customization

Click on the buttons to change the primary color.

Click on the buttons to change the accent color.

However, if you switch from dark mode to light mode or vice versa, changes to the primary color and accent color will be lost.

Building documentation

Run git clone https://github.com/HuangFuSL/HuangFuSL.github.io.git to get the source code.

Bootstrap icon installation

The site uses bootstrap icons, which are added as submodules in third_party/icons. You have to manually initialize the submodule.

git submodule update --init --recursive --remote

LaTeX support

The site uses xelatex and dvisvgm to render TeX documents to SVG images embedded in the markdown files. However, because the SVG images are ignored by .gitignore, you have to perform the conversion manually.

For GitHub repository clones:

  • Run git submodule update --init --recursive --remote to receive the template.
  • Make sure you have installed and correctly configured xelatex and dvisvgm.
  • Add the ./template directory to the $TEXINPUTS environment variable.
  • Execute ci/convert.py in the root directory of the repository.
  • Run mkdocs serve to view the images.

The template is located at HuangFuSL/latex-template
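
The $TEXINPUTS step in the list above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper name `with_template_on_texinputs` is hypothetical and not part of the repository, and the sketch relies on the TeX path-search convention that an empty entry in $TEXINPUTS (here, a trailing separator) stands for the default search directories.

```python
import os


def with_template_on_texinputs(env, template_dir="./template"):
    """Return a copy of env with template_dir prepended to TEXINPUTS.

    Hypothetical helper for illustration; it does not modify env itself.
    """
    new_env = dict(env)
    existing = new_env.get("TEXINPUTS", "")
    # When TEXINPUTS was unset, the result ends with a path separator,
    # which TeX's path search expands to its default directories.
    new_env["TEXINPUTS"] = template_dir + os.pathsep + existing
    return new_env
```

The returned mapping could then be passed as the `env` argument of `subprocess.run` when launching ci/convert.py from the repository root, so the setting applies only to that one process.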

GitHub workflow

You need to install the dependencies stored in requirements.txt before you can start building the site:

pip install -r requirements.txt

There are cross-links in the site which require metadata defined in the page, so the project should be built before mkdocs serve is executed. The exported metadata is saved in meta.json after a build is successfully executed. To build the site, execute the following command:

mkdocs build -d build

Then execute mkdocs serve; the built site will be available at http://127.0.0.1:8000

Acknowledgements

The blog relies on the following open-source projects:

The blog uses the following mkdocs plugins to function correctly.

Unless otherwise noted, content in this blog is shared under the CC BY-NC-SA 4.0 license.

Version information

commit 219ee659c16f2c9a7cda05fb62e5d2ce79ba57fe
Merge: ef3ecd01 a73de723
Author: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Date:   Sat Sep 7 16:20:23 2024 +0800

    Merge pull request #383 from HuangFuSL/ml-from-scratch

    Update: Llama implementation
