HRM (Hierarchical Reasoning Model) Deployment and Training Notes

风中芦苇 posted on 2025-8-7 02:22:00


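A couple of days ago a friend came across an article about HRM, a 27M-parameter model, and asked me to try deploying and training it. This post records the deployment process. 
Preparation 
Clone the repository 
sapientinc/HRM 
Install CUDA 
I already had CUDA 12.8 installed, so the installation process is not covered here. 
Install torch 
The torch version I used is Version: 2.7.1+cu128 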
<code>pip install torch torchvision torchaudio -f https://mirrors.aliyun.com/pytorch-wheels/cu128/</code>
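The link above is the Aliyun mirror of the torch wheels built for CUDA 12.8. 
Install Flash Attention 
Since my system is Windows, I used flash-attention-for-windows. 
GitHub link 
<code>pip install flash_attn-2.8.2+cu128torch2.7.1cxx11abiFALSEfullbackward-cp311-cp311-win_amd64.whl</code> 
In the wheel filename, cu128 refers to CUDA 12.8, 
torch2.7.1 means the torch version is 2.7.1, 
and cp311 means Python 3.11. 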
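Picking the wrong wheel is an easy mistake, so the tags in the filename can be checked programmatically. The following is a minimal sketch; the helper name `parse_flash_attn_wheel` is hypothetical, and it assumes the local version segment follows the `cu<digits>torch<version>` pattern seen in the filename above, with the last CUDA digit being the minor version.

```python
import re


def parse_flash_attn_wheel(filename: str) -> dict:
    """Extract the CUDA, torch, Python, and platform tags from a wheel filename.

    Assumes the standard wheel layout name-version-python-abi-platform.whl
    and a local version segment such as "cu128torch2.7.1...".
    """
    stem = filename.removesuffix(".whl")  # requires Python 3.9+
    name, version, python_tag, abi_tag, platform_tag = stem.split("-")
    public_version, _, local = version.partition("+")

    cuda = torch = None
    match = re.match(r"cu(\d+)torch(\d+(?:\.\d+)*)", local)
    if match:
        digits = match.group(1)
        # Assumption: "128" means major 12, minor 8.
        cuda = f"{digits[:-1]}.{digits[-1]}"
        torch = match.group(2)

    return {
        "name": name,
        "version": public_version,
        "cuda": cuda,
        "torch": torch,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform": platform_tag,
    }


info = parse_flash_attn_wheel(
    "flash_attn-2.8.2+cu128torch2.7.1cxx11abiFALSEfullbackward-cp311-cp311-win_amd64.whl"
)
print(info)
```

Comparing `info["cuda"]`, `info["torch"]`, and `info["python_tag"]` against the local environment before running `pip install` helps avoid installing a wheel built for a different combination.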
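Install dependencies 
<code>pip install -r requirements.txt</code> 
Register and create a wandb key 
The project logs training data with wandb, so you need to register a wandb account and keep the network connection working during training. 
<code>pip install wandb</code> 
After installation, log in with your key 
<code>wandb login</code> 
Install triton 
The original instructions do not mention it, but I found during deployment that it is required, so I am adding it here. 
Note that the triton, torch, and CUDA versions must be compatible with one another. 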
<img alt="image" loading="lazy" src="https://img2024.cnblogs.com/blog/3685372/202508/3685372-20250807021841905-1644369472.png" class="lazyload">
Again, I used the Windows build here. 
GitHub link 
I used the 3.3 series: Version: 3.3.1.post19 
<code>pip install -U "triton-windows<3.4"</code> 
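Because these packages have to match each other, it can help to print what is actually installed before starting. This is a minimal sketch using only the standard library; the distribution names in the loop are assumptions taken from the pip commands in this post, and the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Distribution names assumed from the pip commands used in this post.
for dist in ("torch", "flash_attn", "triton-windows", "wandb"):
    print(f"{dist}: {installed_version(dist) or 'not installed'}")
```

For a CUDA build of torch, the reported version normally carries the CUDA suffix (for example the `+cu128` shown earlier), which makes a mismatch easy to spot.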
Running the experiment 
Download and build the Sudoku dataset 
<code>python dataset/build_sudoku_dataset.py --output-dir data/sudoku-extreme-1k-aug-1000 --subsample-size 1000 --num-aug 1000</code> 
Before training, the code in pretrain.py needs to be changed. 
Change 
<code>from adam_atan2 import AdamATan2</code> 
to 
<code>from adam_atan2_pytorch import AdamAtan2 as AdamATan2</code> 
and install adam_atan2_pytorch 
<code>pip install adam-atan2-pytorch</code> 
Also change
<details>
<summary>Click to view code</summary>
<pre><code>AdamATan2(
 model.parameters(),

 lr=0,# Needs to be set by scheduler
 weight_decay=config.weight_decay,
 betas=(config.beta1, config.beta2)
 )
</code></pre>
</details>
to
<details>
<summary>Click to view code</summary>
<pre><code>AdamATan2(
 model.parameters(),

 lr=config.lr,# Needs to be set by scheduler
 weight_decay=config.weight_decay,
 betas=(config.beta1, config.beta2)
 )
</code></pre>
</details>
Otherwise adam_atan2_pytorch fails with the error assert lr > 0 
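The `# Needs to be set by scheduler` comment suggests that the training loop overwrites the optimizer's learning rate at every step, so the value passed to the constructor mainly has to satisfy the `lr > 0` check. The pattern can be sketched without any framework as follows. This is an illustration under assumptions: the function name `lr_at_step`, the warmup length, and the warmup-plus-cosine shape are hypothetical and not taken from pretrain.py.

```python
import math


def lr_at_step(step, base_lr, warmup_steps, total_steps, min_ratio=0.0):
    """Linear warmup to base_lr, then cosine decay towards base_lr * min_ratio."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return base_lr * (min_ratio + (1.0 - min_ratio) * cosine)


# Stand-in for optimizer.param_groups: a list of dicts holding an "lr" entry.
# The initial value is positive only so that a constructor-time check passes.
param_groups = [{"lr": 7e-5}]

for step in range(1, 6):
    new_lr = lr_at_step(step, base_lr=7e-5, warmup_steps=4, total_steps=100)
    for group in param_groups:
        group["lr"] = new_lr  # the scheduler overrides the initial value
    print(step, new_lr)
```

Under this pattern the constructor's `lr` is replaced before it influences an update, which would explain why the original code could pass `lr=0` to an optimizer without the positivity check.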
Start training 
<code>set OMP_NUM_THREADS=8</code> 
<code>python pretrain.py data_path=data/sudoku-extreme-1k-aug-1000 epochs=20000 eval_interval=2000 global_batch_size=384 lr=7e-5 puzzle_emb_lr=7e-5 weight_decay=1.0 puzzle_emb_weight_decay=1.0</code> 
My GPU is a 3070, and the run took 15 hours to finish. 
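The `set` syntax above only works in cmd.exe; PowerShell and other shells use a different syntax for environment variables. One shell-independent option is a small Python launcher. This is a sketch: the helper names `build_training_command` and `launch` are hypothetical, while the arguments are the ones from the command above.

```python
import os
import subprocess
import sys

# Overrides copied from the training command in this post, kept as strings
# so that values such as "7e-5" are passed through unchanged.
OVERRIDES = {
    "data_path": "data/sudoku-extreme-1k-aug-1000",
    "epochs": "20000",
    "eval_interval": "2000",
    "global_batch_size": "384",
    "lr": "7e-5",
    "puzzle_emb_lr": "7e-5",
    "weight_decay": "1.0",
    "puzzle_emb_weight_decay": "1.0",
}


def build_training_command(overrides):
    """Build the argv list for pretrain.py using key=value overrides."""
    return [sys.executable, "pretrain.py"] + [f"{k}={v}" for k, v in overrides.items()]


def launch(omp_threads=8):
    """Run pretrain.py with OMP_NUM_THREADS set only for the child process."""
    env = dict(os.environ, OMP_NUM_THREADS=str(omp_threads))
    return subprocess.run(build_training_command(OVERRIDES), env=env, check=True)


print(" ".join(build_training_command(OVERRIDES)[1:]))
```

Calling `launch()` from the repository root starts the run; it is left uncalled here because the training takes many hours.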
The training results are attached below. 
<img alt="image" loading="lazy" src="https://img2024.cnblogs.com/blog/3685372/202508/3685372-20250807020914759-1410385170.png" class="lazyload"> 
<img alt="image" loading="lazy" src="https://img2024.cnblogs.com/blog/3685372/202508/3685372-20250807020902396-1415969398.png" class="lazyload"> 
Source: https://www.cnblogs.com/ltruth/p/19026137
