
[scrapy framework] Case study: a scrapy crawler for Qiushibaike

Source: CSDN  Date: 2023-03-31 08:01:52

Environment

Architecture: arm64
Toolchain: gcc-linaro-5.3.1-2016.05-x86_64_aarch64-linux-gnu
Kernel: linux-5.4
Log file: generated in a Win7 environment
decodecode script: run in an Ubuntu environment

Background



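While analyzing an oops, I came across a script called decodecode. Given the oops log as input, it can disassemble the code around the faulting location without source code or a symbol table. When I ran the script, however, the following errors appeared: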

    $ ARCH=arm64
    $ CROSS_COMPILE=gcc-linaro-5.3.1-2016.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-
    $ ./scripts/decodecode < panic_test.txt
    [ 0.734345] Code: d0002881 912f9c21 94067e68 d2800001 (b900003f)
    aarch64-linux-gnu-strip: "/tmp/tmp.5Y9eybnnSi.o": No such file
    aarch64-linux-gnu-objdump: "/tmp/tmp.5Y9eybnnSi.o": No such file
    All code
    ========
       0:   d0002881        adrp    x1, 0x512000
       4:   912f9c21        add     x1, x1, #0xbe7
       8:   94067e68        bl      0x19f9a8
       c:   d2800001        mov     x1, #0x0                        // #0
      10:   b900003f        str     wzr, [x1]

    Code starting with the faulting instruction
    ===========================================
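The input the script works on is the Code: line of the oops, in which the faulting instruction word is wrapped in brackets. As a rough illustration of that first parsing step, here is a minimal sketch; it is not the script's actual code, and the variable names line, code and faulting are made up for this example.

```shell
#!/bin/sh
# Sample "Code:" line, copied from the oops log used in this article.
line='[    0.734345] Code: d0002881 912f9c21 94067e68 d2800001 (b900003f)'

# Drop everything up to and including "Code: " so only the opcode words remain.
code=$(printf '%s\n' "$line" | sed -e 's/.*Code: //')

# The faulting word is the one wrapped in ( ) or < >; print only that word.
faulting=$(printf '%s\n' "$code" | sed -n -e 's/.*[<(]\([0-9a-fA-F]*\)[>)].*/\1/p')

echo "all words:     $code"
echo "faulting word: $faulting"
```

Any stray character that ends up between the closing bracket and the end of the line will interfere with this kind of pattern matching.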

The contents of panic_test.txt are as follows:

    [    0.508246] Unable to handle kernel write to read-only memory at virtual address 0000000000000000
    [    0.517073] Mem abort info:
    [    0.519835]   ESR = 0x96000045
    [    0.522881]   EC = 0x25: DABT (current EL), IL = 32 bits
    [    0.528166]   SET = 0, FnV = 0
    [    0.531189]   EA = 0, S1PTW = 0
    [    0.534318] Data abort info:
    [    0.537169]   ISV = 0, ISS = 0x00000045
    [    0.540992]   CM = 0, WnR = 1
    [    0.543929] [0000000000000000] user address but active_mm is swapper
    [    0.550269] Internal error: Oops: 96000045 [#1] PREEMPT SMP
    [    0.555804] Modules linked in:
    [    0.558842] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.99-00006-g02e5b77f5cd8-dirty #14
    [    0.567069] Hardware name: sun50iw10 (DT)
    [    0.571059] pstate: 80400005 (Nzcv daif +PAN -UAO)
    [    0.575833] pc : reg_fixed_voltage_probe+0xdc/0x148
    [    0.580677] lr : reg_fixed_voltage_probe+0xd8/0x148
    [    0.585529] sp : ffffffc01002bb40
    [    0.588822] x29: ffffffc01002bb40 x28: 0000000000000000
    [    0.594109] x27: ffffffc01101c000 x26: ffffffc010faf000
    [    0.599395] x25: 0000000000000000 x24: 0000000000000000
    [    0.604682] x23: ffffffc011046000 x22: ffffff803c483400
    [    0.609968] x21: ffffffc010730000 x20: ffffffc011008000
    [    0.615255] x19: ffffffc010e88000 x18: 000000000000000a
    [    0.620542] x17: 00000000e45a70be x16: 00000000e90dbb24
    [    0.625829] x15: 000000000007a823 x14: ffffffc09002b877
    [    0.631115] x13: ffffffffffffffff x12: 0000000000000030
    [    0.636402] x11: 0000000000000004 x10: 0101010101010101
    [    0.641688] x9 : 0000000000000002 x8 : 0000000000000003
    [    0.646975] x7 : 0000000000000005 x6 : 00000000001b0b13
    [    0.652262] x5 : 130b1b0000000000 x4 : 0000000000000000
    [    0.657548] x3 : 0000000000000069 x2 : ffffff803df20040
    [    0.662835] x1 : 0000000000000000 x0 : 00000000ffffffea
    [    0.668122] Call trace:
    [    0.670551]  reg_fixed_voltage_probe+0xdc/0x148
    [    0.675060]  platform_drv_probe+0x54/0xa4
    [    0.679044]  really_probe+0x1d8/0x468
    [    0.682684]  driver_probe_device+0xec/0x12c
    [    0.686844]  device_driver_attach+0x54/0x78
    [    0.691004]  __driver_attach+0x130/0x148
    [    0.694907]  bus_for_each_dev+0x80/0xc8
    [    0.698717]  driver_attach+0x30/0x3c
    [    0.702270]  bus_add_driver+0x130/0x200
    [    0.706084]  driver_register+0xb0/0xfc
    [    0.709811]  __platform_driver_register+0x58/0x64
    [    0.714496]  regulator_pmc_voltage_init+0x20/0x28
    [    0.719173]  do_one_initcall+0xbc/0x224
    [    0.722985]  kernel_init_freeable+0x158/0x1f8
    [    0.727320]  kernel_init+0x18/0x108
    [    0.730785]  ret_from_fork+0x10/0x18
    [    0.734345] Code: d0002881 912f9c21 94067e68 d2800001 (b900003f)
    [    0.740417] ---[ end trace f73e218fc7aa2872 ]---
    [    0.745016] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
    [    0.752626] SMP: stopping secondary CPUs
    [    0.756537] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

Experiment

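Since the script did not decode the log correctly, the only option was to read the script and add debug prints. Prints placed before and after the following line finally pinned down the problem: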

    echo "code before:$code"
    code=`echo $code |sed -e "s/ [<(]>)] / /;s/ /,0x/g; s/[>)]$//"`
    echo "code after:$code"

The output is as follows:

    $ ARCH=arm64 CROSS_COMPILE=gcc-linaro-5.3.1-2016.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-  ./scripts/decodecode < panic_test.txt
    [ 0.734345] Code: d0002881 912f9c21 94067e68 d2800001 (b900003f)
    code before:b900003f)
    code after:b900003f)
    aarch64-linux-gnu-strip: "/tmp/tmp.TxEaGdxHgA.o": No such file
    aarch64-linux-gnu-objdump: "/tmp/tmp.TxEaGdxHgA.o": No such file
    All code
    ========
       0:   d0002881        adrp    x1, 0x512000
       4:   912f9c21        add     x1, x1, #0xbe7
       8:   94067e68        bl      0x19f9a8
       c:   d2800001        mov     x1, #0x0                        // #0
      10:   b900003f        str     wzr, [x1]

    Code starting with the faulting instruction
    ===========================================

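The output shows that "code before" and "code after" are identical, although the processing step is presumably meant to strip the trailing ). The code between the two prints does handle ), so I isolated that step and ran the following experiment: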

    $ code="b900003f)" && echo $code |sed -e "s/ [<(]>)] / /;s/ /,0x/g; s/[>)]$//"
    b900003f

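Run in isolation, the step works as expected. Could the text itself differ in some way? I created a test file, temp.back, containing b900003f) and ran another experiment: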

    $ cat temp.back |sed -e "s/ [<(]>)] / /;s/ /,0x/g; s/[>)]$//"
    b900003f
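Both experiments above use input typed on the Ubuntu side. One way to check whether two strings that look the same on a terminal really are the same is to dump their bytes. The sketch below builds two sample strings by hand, the second with a trailing carriage return added purely for illustration, and compares them with od -c and wc -c; the variable names are hypothetical.

```shell
#!/bin/sh
# Two sample strings built by hand for illustration.
unix_code='b900003f)'
dos_code=$(printf 'b900003f)\r')   # same text plus a trailing carriage return

# od -c prints every byte; a carriage return is shown as \r.
printf '%s' "$unix_code" | od -c
printf '%s' "$dos_code" | od -c

# The byte counts differ by one even though both strings print alike.
unix_len=$(printf '%s' "$unix_code" | wc -c)
dos_len=$(printf '%s' "$dos_code" | wc -c)
echo "unix: $unix_len bytes, dos: $dos_len bytes"
```

The same od -c check can be pointed at a line of the real log file to see which bytes it actually ends with.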

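Feeding the text from a file still worked. At this point my suspicion shifted to the file format. Text files produced on Windows end every line with CR LF, whereas Ubuntu uses LF alone and treats the extra carriage return (CR) as a stray character. Once the log is copied to Ubuntu, that CR sits at the end of every line. This also explains the earlier output: with a CR following the ), the $ anchor in s/[>)]$// no longer sits directly after the bracket, so nothing is removed. The stray character can be made visible in vim by entering :e ++ff=unix %, after which it is displayed as ^M.

With the root cause known, the next step is to get rid of the ^M. This is the patch I submitted to the community after fixing it: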

    Date: Mon, 27 Sep 2021 15:41:34 +0800
    Subject: [PATCH] scripts/decodecode: fix faulting instruction no print when opps.file is DOS format

    If opps.file is in DOS format, faulting instruction cannot be printed:
    / # ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
    / # ./scripts/decodecode < oops.file
    [ 0.734345] Code: d0002881 912f9c21 94067e68 d2800001 (b900003f)
    aarch64-linux-gnu-strip: "/tmp/tmp.5Y9eybnnSi.o": No such file
    aarch64-linux-gnu-objdump: "/tmp/tmp.5Y9eybnnSi.o": No such file
    All code
    ========
       0:   d0002881        adrp    x1, 0x512000
       4:   912f9c21        add     x1, x1, #0xbe7
       8:   94067e68        bl      0x19f9a8
       c:   d2800001        mov     x1, #0x0                        // #0
      10:   b900003f        str     wzr, [x1]

    Code starting with the faulting instruction
    ===========================================

    Background:
    The compilation environment is Ubuntu, and the test environment is Windows.
    Most logs are generated in the Windows environment.
    In this way, CR (carriage return) will inevitably appear,
    which will affect the use of decodecode in the Ubuntu environment.
    The repaired effect is as follows:
    / # ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
    / # ./scripts/decodecode < oops.file
    [ 0.734345] Code: d0002881 912f9c21 94067e68 d2800001 (b900003f)
    All code
    ========
       0:   d0002881        adrp    x1, 0x512000
       4:   912f9c21        add     x1, x1, #0xbe7
       8:   94067e68        bl      0x19f9a8
       c:   d2800001        mov     x1, #0x0                        // #0
      10:*  b900003f        str     wzr, [x1]               <-- trapping instruction

    Code starting with the faulting instruction
    ===========================================
       0:   b900003f        str     wzr, [x1]

    Cc: Borislav Petkov
    Cc: Andrew Morton
    Cc: Marc Zyngier
    Cc: Will Deacon
    Cc: Rabin Vincent
    Cc: lkml
    Signed-off-by: weidonghui
    ---
     scripts/decodecode | 2 +-
     1 file changed, 1 insertion(+), 1 deletion(-)

    diff --git a/scripts/decodecode b/scripts/decodecode
    index 31d884e35f2f..c711a196511c 100755
    --- a/scripts/decodecode
    +++ b/scripts/decodecode
    @@ ... @@ if [ $marker -ne 0 ]; then
     fi
     echo Code starting with the faulting instruction  > $T.aa
     echo =========================================== >> $T.aa
    -code=`echo $code | sed -e "s/ [<(]>)] / /;s/ /,0x/g; s/[>)]$//"`
    +code=`echo $code | sed -e "s/\r//;s/ [<(]>)] / /;s/ /,0x/g; s/[>)]$//"`
     echo -n "      .$type 0x" > $T.s
     echo $code >> $T.s
     disas $T 0
    --
    2.22.0.windows.1
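The effect of the added s/\r// rule can be checked outside the script. The sketch below is an illustration only: it builds a sample string with a trailing carriage return by hand and applies just the trailing-bracket rule, with and without the CR removal in front of it. It assumes GNU sed, which understands \r in a pattern; the tr-based variant is an alternative that does not rely on that. The variable names are hypothetical.

```shell
#!/bin/sh
# Sample input built by hand: the faulting word followed by a carriage return,
# as it would look when taken from a CRLF-terminated log line.
code=$(printf 'b900003f)\r')

# Trailing-bracket rule only, without and with the CR removal in front of it.
before=$(printf '%s\n' "$code" | sed -e 's/[>)]$//')
after=$(printf '%s\n' "$code" | sed -e 's/\r//;s/[>)]$//')   # \r assumes GNU sed

# Alternative that does not depend on sed understanding \r: strip CRs with tr.
portable=$(printf '%s\n' "$code" | tr -d '\r' | sed -e 's/[>)]$//')

echo "before:   ${#before} characters"
echo "after:    $after"
echo "portable: $portable"
```

With GNU sed, after and portable both come out as b900003f, while before still carries the ) and the CR. For a one-off run on an unpatched tree, the log could also be cleaned on the way in, for example with tr -d '\r' < panic_test.txt | ./scripts/decodecode.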
