Python Problem Log #15 - Scraping Data from an AJAX Website: Medium Titles Example

by Gemma

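Learning the basics of web crawlers and how to apply them

  • How to include browser information when visiting a site and fetch its data
  • Using Beautifulsoup to scrape page titles
  • Adding cookies to the request
  • Using a loop to scrape multiple pages
  • AJAX page practice: KKDAY
  • AJAX page practice: Medium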

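Goal

Scrape the article titles from the Medium home page

Send a request to get the site's text data

Note 1: Check the request in the browser's developer tools. The Medium request has a Payload, and its Headers include a Content-Type parameter, so both must be added to the request code.

Note 2: Changing the number after limit in the Payload returns up to 20 articles.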

import urllib.request as req
import json

url = "https://medium.com/_/graphql"
# Placeholder: paste the complete Payload copied from the browser's developer tools here.
# It begins with "operationName".
requestData = {}
# Passing data makes this a POST request; the body is the payload encoded as JSON.
request = req.Request(url, headers={
    "content-type": "application/json; charset=utf-8",
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36",
}, data=json.dumps(requestData).encode("utf-8"))
with req.urlopen(request) as response:
    result = response.read().decode("utf-8")

print(result)

Once the data prints successfully, the final print line can be commented out.
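Note 1 comes down to two things: passing `data` to `req.Request` is what turns the call into a POST, and the Content-Type header tells the server that the body is JSON. Below is a minimal sketch that builds the request without sending it, so both can be checked offline. `build_json_request` and the `{"operationName": "Example"}` payload are hypothetical stand-ins for illustration, not Medium's real payload.

```python
import json
import urllib.request as req


def build_json_request(url, payload, user_agent="Mozilla/5.0"):
    """Build (but do not send) a POST request with a JSON body.

    Hypothetical helper for illustration; it is not part of urllib.
    """
    body = json.dumps(payload).encode("utf-8")
    return req.Request(
        url,
        data=body,  # supplying data makes urllib send a POST instead of a GET
        headers={
            "content-type": "application/json; charset=utf-8",
            "user-agent": user_agent,
        },
    )


# Stand-in payload: the real one must be copied from the browser's Payload tab.
sample_payload = {"operationName": "Example"}
request = build_json_request("https://medium.com/_/graphql", sample_payload)

print(request.get_method())                # POST
print(request.get_header("Content-type"))  # urllib stores header names capitalized
```

Nothing is sent until the request is passed to `req.urlopen`, so the method and headers can be inspected first.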

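Parse the JSON data, extract the article title, and confirm that the first article's title prints.

Note: items is a list, so the index 0 is an integer and must not be quoted as "0".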

# Parse the JSON text into Python dicts and lists; result itself stays a string.
data=json.loads(result)
print(data["data"]["webRecommendedFeed"]["items"][0]["post"]["title"])

Output

The Web We Have to Save
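The note about not quoting 0 can be checked on a small stand-in structure. This sketch uses made-up sample data that only mirrors the nesting shown above; it is not real Medium output.

```python
# Made-up sample that mirrors the nesting used in this post; not real Medium data.
sample = {
    "data": {
        "webRecommendedFeed": {
            "items": [
                {"post": {"title": "First sample title"}},
                {"post": {"title": "Second sample title"}},
            ]
        }
    }
}

items = sample["data"]["webRecommendedFeed"]["items"]

# A list is indexed by position, so an integer works.
first_title = items[0]["post"]["title"]
print(first_title)

# A string index on a list raises TypeError, even if it looks like a number.
try:
    items["0"]
    string_index_failed = False
except TypeError:
    string_index_failed = True
print(string_index_failed)
```

The quoted keys such as "data" and "post" work because those levels are dictionaries; only the items level is a list.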

Use a loop to extract the titles of all 20 articles

# result must be the raw JSON text returned by the request; parse it only once.
data=json.loads(result)
items=data["data"]["webRecommendedFeed"]["items"]
for item in items:
    print(item["post"]["title"])

Output

How Reading Can Literally Change Your Brain Chemistry
31 Photos From September 11th That You Have Never Seen
Most People Don’t Know the Difference Between “Feelings” and “Emotions”
The Crossroads of Should and Must
System Design Blueprint: The Ultimate Guide
Late Loves Are Better Loves
How ‘Should’ Makes Us Stupid — And How to Get Smart Again
Your portfolios are fucking boring
How To Write With AI Without Sounding Like AI — ChatGPT Canvas
9 Strategies I Used To Get 237,000 Website Visits From Google Discover
The Insanity of Relying on Vector Embeddings: Why RAG Fails
GenAI with Python: Build Agents from Scratch (Complete Tutorial)
You’re Using ChatGPT Wrong! Here’s How to Be Ahead of 99% of ChatGPT Users
Smoking Too Much Weed Almost Ruined My Life
Why can’t we read anymore?
Forget LangChain, CrewAI and AutoGen — Try This Framework and Never Look Back
How to Think About Your Career
The UX job market REALLY sucks right now
I’m Now Terrified of AI, And You Should Be Too
It’s Not Our Place To Change Them.
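The loop above assumes every item has a post with a title. I have not confirmed whether Medium ever returns items without one, so the following is only a defensive sketch under that assumption. `extract_titles` is a hypothetical helper, and the sample data is made up.

```python
def extract_titles(parsed):
    """Return the titles found in a parsed response, skipping incomplete items.

    Hypothetical helper; the key names follow the structure used in this post.
    """
    feed = (parsed.get("data") or {}).get("webRecommendedFeed") or {}
    titles = []
    for item in feed.get("items") or []:
        post = item.get("post") or {}
        title = post.get("title")
        if title:  # skip entries with no post or an empty title
            titles.append(title)
    return titles


# Made-up sample data, including one entry without a post.
sample = {
    "data": {
        "webRecommendedFeed": {
            "items": [
                {"post": {"title": "Sample A"}},
                {"post": None},
                {"post": {"title": "Sample B"}},
            ]
        }
    }
}

titles = extract_titles(sample)
for number, title in enumerate(titles, start=1):
    print(number, title)
```

With the real response, the call would be `extract_titles(json.loads(result))`, where result is the raw JSON text from the request step.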
