Goal: sync the most recent posts from my personal blog to a designated section of my GitHub profile page.

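Create a repository on GitHub whose name matches your username. For example, my ID is cs7eric, so I created a repository named cs7eric; the README in that repository is rendered directly on my GitHub profile page.
To have the profile page update itself with links to the posts on my blog, you can use GitHub Actions, the CI tool built into GitHub.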
The overall approach:
- Use Python to scrape the article links from the blog
- Write a GitHub Actions workflow configuration

Scraping the blog's articles with Python

```html
<h3 class="home-article-title">
  <a href="/2023/07/03/less/" data-pjax-state="">
    less
  </a>
</h3>
```
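This is the HTML structure of an article entry on my blog; the title and link inside the block above are the information we want.
With Python's BeautifulSoup library, this information can be extracted quickly.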
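Note that the `href` in this markup is root-relative (`/2023/07/03/less/`), so it has to be resolved against the blog's base URL before it can go into the README. The following is a minimal sketch of that step using only the standard library; the helper name `to_markdown_item` is hypothetical and only serves the illustration.

```python
from urllib.parse import urljoin

BASE_URL = "https://blog.cccs7.icu/"
HREF = "/2023/07/03/less/"  # root-relative href taken from the markup above

# Plain concatenation keeps both slashes, giving "...icu//2023/...".
naive = BASE_URL + HREF

# urljoin resolves the href against the base URL the way a browser would.
resolved = urljoin(BASE_URL, HREF)


def to_markdown_item(title, href, base_url=BASE_URL):
    """Hypothetical helper: build one Markdown list item for the README."""
    return f"- [{title.strip()}]({urljoin(base_url, href)})"


print(naive)     # https://blog.cccs7.icu//2023/07/03/less/
print(resolved)  # https://blog.cccs7.icu/2023/07/03/less/
print(to_markdown_item(" less ", HREF))
```

Many servers tolerate the doubled slash, but resolving the link properly keeps the README URLs identical to the ones the blog itself uses.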
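```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

url = "https://blog.cccs7.icu/"

# Fetch the blog home page and fail early on an HTTP error.
response = requests.get(url, timeout=10)
response.raise_for_status()

soup = BeautifulSoup(response.text, "html.parser")

# Take the five most recent article headings and turn each into a Markdown list item.
# urljoin avoids the doubled slash that url + href would produce.
articles = soup.find_all("h3", class_="home-article-title")[:5]
article_list = [
    f"- [{article.text.strip()}]({urljoin(url, article.find('a', href=True)['href'])})\n"
    for article in articles
]

with open("README.md", "r", encoding="utf-8") as f:
    lines = f.readlines()

# Zero-based slice bounds of the README region reserved for the post list.
# These values are specific to this README and must match its layout.
start_index = 55
end_index = 60

# Swap the old region for the freshly scraped list and write the file back.
new_lines = lines[:start_index] + article_list + lines[end_index:]
with open("README.md", "w", encoding="utf-8") as f:
    f.writelines(new_lines)
```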
That is the complete Python script.
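The script locates the post list by fixed line numbers (55 to 60), which stops matching as soon as the README gains or loses a line above that region. One possible alternative, not what the script above does, is to wrap the region in two HTML comment markers and replace whatever sits between them. The marker strings and the function name below are assumptions chosen for this sketch.

```python
# Hypothetical marker comments; they are invisible in the rendered README.
START_MARKER = "<!-- BLOG-POSTS:START -->"
END_MARKER = "<!-- BLOG-POSTS:END -->"


def replace_between_markers(text, new_items):
    """Replace everything between the two markers with new_items.

    str.index raises ValueError if either marker is missing, so a
    misconfigured README fails loudly instead of being overwritten.
    """
    start = text.index(START_MARKER) + len(START_MARKER)
    end = text.index(END_MARKER, start)
    body = "\n" + "\n".join(new_items) + "\n"
    return text[:start] + body + text[end:]


# Small in-memory README standing in for the real file.
readme = "\n".join([
    "# Hi there",
    START_MARKER,
    "- [old post](https://example.com/old)",
    END_MARKER,
    "Footer",
])

items = ["- [less](https://blog.cccs7.icu/2023/07/03/less/)"]
updated = replace_between_markers(readme, items)
print(updated)
```

Running the replacement twice with the same items yields the same text, so a scheduled run with no new posts leaves the README unchanged.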

Writing the Actions configuration file

```yaml
name: Sync Blog Posts to GitHub Pages

on:
  schedule:
    - cron: "0 0 * * *" # every day at 00:00 UTC
  workflow_dispatch:

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v2

      - name: Setup Python
        uses: actions/setup-python@v2
        with:
          python-version: "3.9"

      - name: Install Dependencies
        run: |
          pip install beautifulsoup4 requests

      - name: Sync Blog Posts
        run: |
          python sync_blog_posts.py

      - name: Commit Changes
        run: |
          git config --global user.email "csq020611@gmail.com"
          git config --global user.name "cs7eric"
          git add -A
          # Commit only when something is staged, so a run with no new posts does not fail.
          git diff --cached --quiet || git commit -m "Sync blog posts"

      - name: Push Changes
        uses: ad-m/github-push-action@v0.6.0
        with:
          repository: cs7eric/cs7eric
          branch: main
          github_token: ${{ secrets.FOR_ACTIONS }}
```
That is the GitHub workflow configuration.

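Place the Python script from earlier in the repository root as sync_blog_posts.py (the file name the workflow runs), add the workflow file under .github/workflows, then run the workflow to test it.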
