
Crawling WeChat Official Account Articles

優采云, published: 2020-08-28 03:12

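I finally found a way to fetch every article of a WeChat official account in one go. It is a clumsy approach, but it will do for now, since I still have not worked out how to automate the login.

Reference link:

Step 1: Register an official account

After registering, log in to the home page and find the 素材管理 (Material Management) entry.

You will then see the following page.

Click the link that the green arrow points to.

Remember to open the browser's developer tools.

Then search for the official account you want to crawl.

This request returns the official accounts that match the search. The one we want is in this list; assume it is the first entry.

We need its fakeid to identify the official account.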
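As a rough illustration of that step, the snippet below pulls the fakeid out of a parsed search response. The shape of the payload (a `list` of entries, each with a `fakeid` key) is inferred from how the code later in this post reads it; the helper name `pick_fakeid` and the sample value are hypothetical.

```python
def pick_fakeid(search_result, index=0):
    """Return the fakeid of the account at position `index` in a search result."""
    accounts = search_result.get("list") or []
    if not 0 <= index < len(accounts):
        raise LookupError("no account at that position in the search result")
    return accounts[index]["fakeid"]


# Illustrative payload only: the fakeid value is made up.
sample = {"total": 1, "list": [{"fakeid": "EXAMPLE_FAKEID=="}]}
print(pick_fakeid(sample))
```

Raising an explicit error when the list is empty makes an expired token or a mistyped account name easier to spot than a bare `IndexError`.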

Next, select the account and click it.

The article list is then returned.

The whole process needs only the token, the official account's name, and the cookie.
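To make the role of those three values concrete, here is a minimal sketch of how they might be combined into the search request. The endpoint and query parameters are the ones used in the code in this post; the helper name `build_search_request` is hypothetical, and sending the cookie as a raw `Cookie` header is an assumption about how you copied it from the developer tools.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://mp.weixin.qq.com/cgi-bin/searchbiz"


def build_search_request(name, token, cookie_string):
    """Return (url, headers) for the account search; nothing is sent here."""
    params = {
        "action": "search_biz",
        "begin": 0,
        "count": 5,
        "query": name,    # urlencode() escapes non-ASCII account names
        "token": token,
        "lang": "zh_CN",
        "f": "json",
        "ajax": 1,
    }
    url = SEARCH_ENDPOINT + "?" + urlencode(params)
    headers = {
        "Host": "mp.weixin.qq.com",
        "Cookie": cookie_string,  # the whole cookie string, unchanged
    }
    return url, headers


url, headers = build_search_request("my account", "12345", "a=1; b=2")
print(url)
```

Building the query string with `urlencode` rather than string formatting keeps account names with spaces or Chinese characters valid in the URL.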

Finally, here is the code.

# -*- coding: utf-8 -*-
import pymysql
import requests
from fake_useragent import UserAgent
from retrying import retry

BASE_URL = "https://mp.weixin.qq.com/cgi-bin"
PAGE_SIZE = 5  # number of entries requested per call

# The database connection is not included in this post; create `conn` and `cur`
# yourself before save_mysql() is called. The values below are placeholders:
# conn = pymysql.connect(host="...", user="...", password="...", database="...", charset="utf8mb4")
# cur = conn.cursor()


def list_params(fakeid, begin):
    """Query parameters for one page of an account's article list."""
    return {
        "action": "list_ex", "begin": begin, "count": PAGE_SIZE, "fakeid": fakeid,
        "type": 9, "query": "", "token": token, "lang": "zh_CN", "f": "json", "ajax": 1,
    }


@retry(stop_max_attempt_number=5)
def get_author_list():
    """
    Get the list of official accounts returned by the search.
    :return: the parsed JSON response
    """
    params = {
        "action": "search_biz", "begin": 0, "count": PAGE_SIZE, "query": name,
        "token": token, "lang": "zh_CN", "f": "json", "ajax": 1,
    }
    try:
        response = s.get(BASE_URL + "/searchbiz", params=params, headers=headers)
        text = response.json()
        # print('Found {} official accounts in total'.format(text['total']))
        return text
    except Exception as e:
        print(e)
        reset_parmas()
        raise  # re-raise so that @retry makes another attempt with the new values


@retry(stop_max_attempt_number=5)
def get_first_author_params(text):
    """
    Get the parameters of a single official account.
    :param text: the account list returned by the search
    :return: text1, the first page of the account's article list;
             fake_id, the account entry (its 'fakeid' key identifies the account)
    """
    fake_id = text['list'][0]  # the first hit is usually the one we want
    # print(text['list'][0])
    try:
        response = s.get(BASE_URL + "/appmsg", params=list_params(fake_id['fakeid'], 0), headers=headers)
        text1 = response.json()
        return text1, fake_id
    except Exception as e:
        print(e)
        reset_parmas()
        raise  # re-raise so that @retry makes another attempt with the new values


# No @retry here: failures are handled per page, so pages that were already
# saved are not fetched and inserted a second time.
def get_title_url(text, fake_id):
    """
    Print and save the title and link of every article of the account.
    :param text: the first page of the article list (used for the total count)
    :param fake_id: the account entry returned by the search
    :return:
    """
    total = int(text['app_msg_cnt'])
    pages = (total + PAGE_SIZE - 1) // PAGE_SIZE  # round up to whole pages
    for i in range(pages):
        try:
            response = s.get(BASE_URL + "/appmsg", params=list_params(fake_id['fakeid'], i * PAGE_SIZE), headers=headers)
            text = response.json()
            for article in text['app_msg_list']:
                print('Title: {}'.format(article['title']))
                print('Link: {}'.format(article['link']))
                save_mysql(name, article['title'], article['link'])
        except Exception as e:
            print(e)
            reset_parmas()


def save_mysql(name, title, url):
    """
    Save one row to the database.
    :param name: name of the official account
    :param title: article title
    :param url: article link
    :return:
    """
    try:
        # Pass the values separately so the driver escapes them;
        # titles often contain quotes.
        sql = "INSERT INTO title_url(author_name, title, url) VALUES (%s, %s, %s)"
        conn.ping(reconnect=True)  # reconnect if the connection has dropped
        cur.execute(sql, (name, title, url))
        conn.commit()
    except Exception as e:
        print(e)


def reset_parmas():
    """
    Ask for the parameters again once they have expired.
    :return:
    """
    global name, token, s, headers
    name = input("Name of the official account to look up: ")
    token = input("Token of your own logged-in official account: ")
    cookies = input("Cookie of the current page: ")
    s = requests.session()
    headers = {
        "User-Agent": UserAgent().random,
        "Host": "mp.weixin.qq.com",
        "Cookie": cookies,  # the raw cookie string copied from the developer tools
    }


def run():
    reset_parmas()
    text = get_author_list()
    text1, fake_id = get_first_author_params(text)
    get_title_url(text1, fake_id)


if __name__ == "__main__":
    run()

One note: you will have to connect to the database yourself; I have not included that code.
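If you only want to try the storage step without a MySQL server, a stand-in such as the following may be enough. It is a sketch, not the setup used in this post: it swaps in Python's built-in `sqlite3` for MySQL, the helper names `open_store` and `save_article` are hypothetical, and the column types are assumptions. Only the table and column names come from the code above.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a SQLite database holding the title_url table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS title_url "
        "(author_name TEXT, title TEXT, url TEXT)"
    )
    return conn


def save_article(conn, name, title, url):
    """Insert one article row; the driver escapes the values."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO title_url (author_name, title, url) VALUES (?, ?, ?)",
            (name, title, url),
        )


conn = open_store()
save_article(conn, "demo account", "A title with 'quotes'", "https://example.com/a")
print(conn.execute("SELECT COUNT(*) FROM title_url").fetchone()[0])
```

With pymysql the same parameterized pattern applies, except that the placeholder is `%s` instead of `?`.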

Also, the interface seems to change frequently, so what matters is that you understand the logic of the code.
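The paging logic is one piece that does not depend on the exact interface. As a small sketch, assuming the list is fetched five entries at a time and that the response reports the total number of articles, the `begin` offsets can be computed like this (the helper name `page_offsets` is hypothetical):

```python
def page_offsets(total, page_size=5):
    """Return the `begin` offset of every page needed to cover `total` items."""
    pages = (total + page_size - 1) // page_size  # ceiling division
    return [page * page_size for page in range(pages)]


print(page_offsets(12))  # [0, 5, 10]
print(page_offsets(10))  # [0, 5]
```

Rounding up unconditionally matters: if the page count is only computed when the total is not a multiple of five, a total of 10 would be treated as 10 pages rather than 2.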

