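The data scraped from the web looks like %u6B63%u5F0F%u4EBA%u5458 and has to be read and converted into Python objects. I did not want to call out to JavaScript and eval it, so I translated the escapes myself.
Core code: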
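import re

# Match JavaScript escape() sequences such as %u6B63 (four hex digits, either case).
pattern = re.compile(r'%u([0-9A-Fa-f]{4})')

# Python 3 version: chr() replaces Python 2's unichr(); the output is written as UTF-8.
with open('d:\\p', 'r') as f, open('d:\\new.txt', 'w', encoding='utf-8') as n:
    for l in f:
        # Replace every %uXXXX in the line with the character for that code point.
        n.write(pattern.sub(lambda m: chr(int(m.group(1), 16)), l))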
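The same idea can be wrapped in a reusable function that works on strings instead of files. This is only a minimal sketch for Python 3: the name `unescape_js` is made up for illustration, and the handling of two-digit `%XX` escapes assumes the input came from JavaScript's `escape()`, which encodes code points below 256 that way.

```python
import re

# %uXXXX (four hex digits) or %XX (two hex digits), as produced by JavaScript escape().
_ESCAPE_RE = re.compile(r'%u([0-9A-Fa-f]{4})|%([0-9A-Fa-f]{2})')


def unescape_js(text):
    """Hypothetical helper: decode %uXXXX and %XX escapes into characters."""
    def _decode(match):
        # Exactly one of the two groups matched; both hold a hexadecimal code point.
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))

    return _ESCAPE_RE.sub(_decode, text)


if __name__ == '__main__':
    print(unescape_js('%u6B63%u5F0F%u4EBA%u5458'))
```

Using `re.sub` with a callback decodes each escape in a single pass, which avoids calling `str.replace` once per match. Text that contains no escapes is returned unchanged.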