
Welcome to HuangFuSL's blog


Blog timeline

RL Lab 9: Advantage Actor-Critic

2025-05-12

A Balatro policy algorithm based on A2C.

Policy Gradient Methods

2025-05-06

DQN, which is built on Q-learning, faces the following problems:

RL Lab 8: REINFORCE

2025-05-06

A Balatro policy algorithm based on REINFORCE.

Deep Q-Networks

2025-05-04

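If the environment's state space and action space are large, or continuous, the action-value function \(Q(s, a)\) can no longer be stored in a table. Function approximation can be used instead: a parameterized function \(Q(s, a; \theta)\) approximates the action-value function \(Q(s, a)\), where \(\theta\) denotes the parameters of the function. In the discrete case, the update rule for the \(Q\) function is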

RL Lab 7: Dueling DQN

2025-05-04

A Balatro policy algorithm based on Dueling DQN.

RL Lab 6: Deep Q Networks

2025-05-04

A Balatro policy algorithm based on DQN.

RL Lab 5: Jimbo Game

2025-05-04

Setting up the environment for the Balatro problem.

Temporal Difference Learning

2025-05-02

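In more complex reinforcement learning environments, the state transition probabilities are usually unknown, so the \(\sum_{s'} P(s'|s, a)V^*(s')\) term of the Bellman equation cannot be computed and dynamic programming cannot be used to find the optimal policy. In that case we can only interact with the environment and collect rewards, using them to estimate the state-value function \(V(s)\) and the action-value function \(Q(s, a)\). The two most common approaches in reinforcement learning are the Monte Carlo method and temporal difference learning.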

RL Lab 4: Temporal Difference Learning

2025-05-02

A grid maze solver based on temporal difference learning.

Dynamic Programming

2025-05-01

For simple reinforcement learning environments whose state transitions are fully known, the problem can be solved with dynamic programming.

Recent updates

Currently working on

Customization

Click on the buttons to change the primary color.

Click on the buttons to change the accent color.

However, if you switch from dark mode to light mode or vice versa, changes to the primary color and accent color will be lost.

Building documentation

Run git clone https://github.com/HuangFuSL/HuangFuSL.github.io.git to get the source code.

Bootstrap icon installation

The site uses Bootstrap Icons, which are added as a submodule in third_party/icons. You have to initialize the submodule manually.

git submodule update --init --recursive --remote

LaTeX support

The site uses xelatex and dvisvgm to render TeX documents into SVG images embedded in the markdown files. However, because the SVG images are ignored by .gitignore, you have to perform the conversion manually.

For GitHub repository clones:

  • Run git submodule update --init --recursive --remote to receive the template.
  • Make sure you have installed and correctly configured xelatex and dvisvgm.
  • Add the ./template directory to the $TEXINPUTS environment variable.
  • Execute ci/convert.py in the root directory of the repository.
  • Run mkdocs serve to view the images.

The template is located at HuangFuSL/latex-template
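The conversion steps above can be checked with a small preflight script before running ci/convert.py. This is only a sketch: the helper names check_tool and with_template are hypothetical, and it assumes the template submodule is checked out at ./template relative to the current directory. It relies on the kpathsea convention that an empty entry in TEXINPUTS (here, a trailing colon) stands for the default search path.

```shell
#!/bin/sh
# Preflight sketch for the LaTeX-to-SVG conversion.
# Assumption: run from the repository root, with the template in ./template.

# Report whether a required tool is on PATH; returns non-zero if it is missing.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Print a TEXINPUTS value with the given directory prepended.
# When TEXINPUTS is empty, the result ends in ':' so that kpathsea
# still appends its default search path.
with_template() {
    printf '%s:%s\n' "$1" "${TEXINPUTS:-}"
}

missing=0
for tool in xelatex dvisvgm; do
    check_tool "$tool" || missing=1
done

TEXINPUTS=$(with_template "$PWD/template")
export TEXINPUTS
echo "TEXINPUTS=$TEXINPUTS"

if [ "$missing" -eq 0 ]; then
    echo "ready: execute ci/convert.py, then run mkdocs serve"
else
    echo "install the missing tools before executing ci/convert.py"
fi
```

The script only reports and exports; it deliberately does not invoke ci/convert.py itself, since how that script expects to be called is not stated here.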

GitHub workflow

You need to install the dependencies stored in requirements.txt before you can start building the site:

pip install -r requirements.txt

There are cross-links in the site which require metadata defined in the page, so the project should be built before mkdocs serve is executed. The exported metadata is saved in meta.json after a build is successfully executed. To build the site, execute the following command:

mkdocs build -d build

Run mkdocs serve; the built site will be available at http://127.0.0.1:8000
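Because mkdocs serve depends on the metadata exported by an earlier build, it can help to check for meta.json first. The following is a minimal sketch, not part of the repository: the function names has_metadata and next_steps are hypothetical, and it assumes meta.json is written to the directory you pass in (the repository root by default).

```shell
#!/bin/sh
# Sketch: decide whether a build is needed before serving.
# Assumption: meta.json lives in the directory given as the first argument.

# Succeeds when the metadata file from a previous build exists and is non-empty.
has_metadata() {
    [ -s "${1:-.}/meta.json" ]
}

# Print the commands that still need to be run, in order.
next_steps() {
    if has_metadata "$1"; then
        echo "mkdocs serve"
    else
        echo "mkdocs build -d build"
        echo "mkdocs serve"
    fi
}

next_steps .
```

The script only prints the commands rather than running them, so it is safe to try in any directory.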

Acknowledgements

The blog relies on the following open-source projects:

The blog uses the following mkdocs plugins to function correctly.

Unless otherwise noted, content in this blog is shared under the CC BY-NC-SA 4.0 license.

Version information

commit a805efb77e12dbfeafa821c2c6939c84d908f188
Merge: 58f36bf09 9c19e3094
Author: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Date:   Fri May 16 18:11:17 2025 +0800

    Merge pull request #524 from HuangFuSL/reinforcement-learning

    Update: Fix RL intro
