Welcome to HuangFuSL's blog¶
Table of contents¶
Blog timeline¶
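If the environment's state space and action space are large, or continuous, the action-value function \(Q(s, a)\) can no longer be stored in a table. Instead, function approximation can be used: a parameterized function \(Q(s, a; \theta)\), where \(\theta\) denotes the parameters of the function, approximates the action-value function \(Q(s, a)\). In the discrete case, the \(Q\) function is updated as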
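In more complex reinforcement learning environments, we usually do not know the environment's state transition probability distribution, so the term \(\sum_{s'} P(s'|s, a)V^*(s')\) in the Bellman equation cannot be computed and dynamic programming cannot be used to solve for the optimal policy. We can then only interact with the environment and collect rewards, using them to estimate the state-value function \(V(s)\) and the action-value function \(Q(s, a)\). The two most commonly used methods in reinforcement learning are the Monte Carlo method and temporal difference learning.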
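As a rough illustration of how the two estimation approaches differ, the sketch below applies one every-visit Monte Carlo update and one TD(0) update to the same sampled episode. The episode, the state names, the step size `alpha`, and the discount `gamma` are all made-up assumptions for illustration, not taken from the linked post.

```python
# Hypothetical toy episode: (state, reward received after leaving that state).
# States, rewards, alpha and gamma are illustrative assumptions.
episode = [("A", 0.0), ("B", 0.0), ("C", 1.0)]


def mc_update(V, episode, alpha=0.1, gamma=0.9):
    """Every-visit Monte Carlo: move V(s) toward the full sampled return G_t."""
    G = 0.0
    # Walk the episode backwards so the return can be accumulated in one pass.
    for state, reward in reversed(episode):
        G = reward + gamma * G
        V[state] += alpha * (G - V[state])
    return V


def td0_update(V, episode, alpha=0.1, gamma=0.9):
    """TD(0): move V(s) toward the one-step target r + gamma * V(s')."""
    for t, (state, reward) in enumerate(episode):
        # The value after the final step (terminal state) is taken to be 0.
        next_value = V[episode[t + 1][0]] if t + 1 < len(episode) else 0.0
        V[state] += alpha * (reward + gamma * next_value - V[state])
    return V


V_mc = mc_update({"A": 0.0, "B": 0.0, "C": 0.0}, episode)
V_td = td0_update({"A": 0.0, "B": 0.0, "C": 0.0}, episode)
print(V_mc)
print(V_td)
```

In this toy run, the Monte Carlo update moves the estimate of every visited state toward the discounted final reward, while TD(0) changes only the state directly before the reward. The earlier states would pick up value only over repeated episodes, through bootstrapping. Neither function needs the transition probabilities \(P(s'|s, a)\), only sampled transitions.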
Recent updates¶
- 2025-05-16: Basic Concepts of Reinforcement Learning, Dynamic Programming, Temporal Difference Learning
- 2025-05-12: Policy Gradient Methods, RL Lab 9: Advantage Actor-Critic
- 2025-05-06: RL Lab 8: REINFORCE
- 2025-05-04: Deep Q-Networks, RL Lab 7: Dueling DQN, RL Lab 6: Deep Q Networks, RL Lab 4: Temporal Difference Learning, RL Lab 5: Jimbo Game
- 2025-05-02: RL Lab 1: Environment
Currently working on¶
Customization¶
Click on the buttons to change the primary color.
Click on the buttons to change the accent color.
However, if you switch between dark mode and light mode, any changes to the primary and accent colors will be lost.
Building documentation¶
Run git clone https://github.com/HuangFuSL/HuangFuSL.github.io.git to get the source code.
Bootstrap icon installation¶
The site uses Bootstrap icons, which are added as a submodule in third_party/icons. You have to initialize the submodule manually (the --init flag is needed if the repository was cloned without submodules):
git submodule update --init --recursive --remote
LaTeX support¶
The site uses xelatex and dvisvgm to render TeX documents to SVG images embedded in the markdown files. However, as the SVG images are ignored by .gitignore, you have to perform the conversion manually.
For GitHub repository clones:
- Run git submodule update --recursive --remote to receive the template.
- Make sure you have installed and correctly configured xelatex and dvisvgm.
- Add the ./template directory to the $TEXINPUTS environment variable.
- Execute ci/convert.py in the root directory of the repository.
- Run mkdocs serve to view the images.
The template is located at HuangFuSL/latex-template
GitHub workflow¶
You need to install the dependencies listed in requirements.txt before you can start building the site:
pip install -r requirements.txt
There are cross-links in the site which require metadata defined in the page, so the project should be built before mkdocs serve is executed. The exported metadata is saved in meta.json after a build is successfully executed. To build the site, execute the following command:
mkdocs build -d build
Then execute mkdocs serve; the built site will appear at http://127.0.0.1:8000
Acknowledgements¶
The blog relies on the following open-source projects:
The blog uses the following mkdocs plugins to function correctly.
- Neoteroi/mkdocs-plugins
- lukasgeiter/mkdocs-awesome-pages-plugin
- timvink/mkdocs-git-revision-date-localized-plugin
- zhaoterryy/mkdocs-git-revision-date-plugin
- squidfunk/mkdocs-material
- facelessuser/mkdocs-material-extensions
- fralau/mkdocs_macros_plugin
- danielfrg/mkdocs-jupyter
- prcr/mkdocs-meta-descriptions-plugin
Unless noted, content in this blog is shared under the CC-BY-NC-SA 4.0 license.
Version information¶
commit a805efb77e12dbfeafa821c2c6939c84d908f188
Merge: 58f36bf09 9c19e3094
Author: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Date: Fri May 16 18:11:17 2025 +0800
Merge pull request #524 from HuangFuSL/reinforcement-learning
Update: Fix RL intro