Over the past few days I have been improving my Base64 & UUE encoded-file generation tool. I found that it is very slow when processing large files. After analyzing it, I found that the string concatenation and splitting code is the bottleneck. See the following code:
Private Sub Command1_Click()
Dim fL As Long, enfp As Integer, defp As Integer, enfn, defn
Dim B() As Byte, tmpstr As String, outStr As String
Dim timx As Single
timx = Timer
enfn = Text1.Text
defn = Text2.Text
enfp = FreeFile
Open enfn For Binary As #enfp
fL = LOF(enfp)
ReDim B(fL - 1)
Get #enfp, , B
Close #enfp
tmpstr = StrConv(B, vbUnicode)
defp = FreeFile
Open defn For Output As #defp
Do While Len(tmpstr) > 60
outStr = "M" & Mid(tmpstr, 1, 60)
tmpstr = Mid(tmpstr, 61) 'This statement causes the slowdown (2022-05-22)
Print #defp, outStr
DoEvents
Loop
Print #defp, tmpstr
Close #defp
MsgBox "Processed: " & fL & " bytes, elapsed: " & Timer - timx & " s"
End Sub
When the encoded result string is split into fixed-length lines, the problem is this statement:
tmpstr = Mid(tmpstr, 61) 'This statement causes the slowdown (2022-05-22)
The intent is to keep the remainder of the string after cutting off the first 60 characters. This costs little when the string is short, but every call copies the whole remainder into a new string, so the loop gets slower and slower as the string grows. So I thought of a new way:
Private Sub Command2_Click()
Dim fL As Long, enfp As Integer, defp As Integer, enfn, defn
Dim B() As Byte, tmpstr As String, outStr As String
Dim E
Dim timx As Single
timx = Timer
enfn = Text1.Text
defn = Text2.Text
enfp = FreeFile
Open enfn For Binary As #enfp
fL = LOF(enfp)
ReDim B(fL - 1)
Get #enfp, , B
Close #enfp
tmpstr = StrConv(B, vbUnicode)
defp = FreeFile
E = 1
Open defn For Output As #defp
Do While (fL - E + 1) > 60 'more than 60 characters remain
outStr = "M" & Mid(tmpstr, E, 60)
Print #defp, outStr
DoEvents
E = E + 60
Loop
outStr = Mid(tmpstr, E) 'remaining tail, at most 60 characters
Print #defp, outStr
Close #defp
MsgBox "Processed: " & fL & " bytes, elapsed: " & Format(Timer - timx, "0.000000") & " s"
End Sub
This version only extracts the specified number of characters from the original string and never modifies the original string. Efficiency improved hundreds of times at once (the longer the string, the larger the gain), because each pass now copies 60 characters instead of the entire remainder.
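To make the difference concrete, here is a minimal benchmark sketch that runs both splitting strategies on a synthetic string. The procedure name CompareChunking and its parameter are hypothetical and not part of the original tool. For n characters and 60-character lines, the first loop copies roughly n * n / 120 characters in total, while the second copies only n.

```vb
Option Explicit

' Hypothetical benchmark; CompareChunking is an illustrative name,
' not part of the original tool. Place it in a standard module.
Public Sub CompareChunking(ByVal totalLen As Long)
    Const CHUNK As Long = 60
    Dim src As String, rest As String, piece As String
    Dim pos As Long, t As Single

    src = String$(totalLen, "A")        ' synthetic input of totalLen characters

    ' Approach 1: cut off the head and copy the remainder on every pass.
    rest = src
    t = Timer
    Do While Len(rest) > CHUNK
        piece = Left$(rest, CHUNK)
        rest = Mid$(rest, CHUNK + 1)    ' copies Len(rest) - CHUNK characters
    Loop
    Debug.Print "Re-copy remainder: "; Format$(Timer - t, "0.000"); " s"

    ' Approach 2: leave the source untouched and advance an index.
    pos = 1
    t = Timer
    Do While Len(src) - pos + 1 > CHUNK
        piece = Mid$(src, pos, CHUNK)   ' copies only CHUNK characters
        pos = pos + CHUNK
    Loop
    piece = Mid$(src, pos)              ' remaining tail, at most CHUNK characters
    Debug.Print "Advance an index:  "; Format$(Timer - t, "0.000"); " s"
End Sub
```

It can be called from the Immediate window, for example with CompareChunking 500000. Start with a few hundred thousand characters, since the first loop slows down quadratically as the length grows.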
'================================================================
In addition, to read an entire file I originally used Line Input:
Open defn For Input As #defp
Do While Not EOF(defp)
Line Input #defp, tmpstr
EnStr = EnStr & tmpstr
Loop
Close #defp
In the same way, the concatenation statement EnStr = EnStr & tmpstr makes reading extremely slow, so I thought of using Adodb.Stream to read the entire file at once. Again, the difference is not noticeable for small files, but for files larger than 2 MB, obj.ReadText is extremely slow: an 8.27 MB file took 7.32 seconds.
Private Sub Command3_Click()
Dim str, stm, enfn, defn
Dim timx As Single, tmpstr As String
timx = Timer
enfn = Text1.Text
defn = Text2.Text
Set stm = CreateObject("Adodb.Stream")
stm.Type = 2 '1 bin,2 txt
stm.Mode = 3
stm.Open
stm.Charset = "GB2312"
stm.LoadFromFile enfn
str = stm.readtext '------ slow: 7.32 s
' str = stm.Read '-------- fast: 0.015 s
stm.Close
Set stm = Nothing
' tmpstr = StrConv(str, vbUnicode)
MsgBox "File read completed, elapsed: " & Timer - timx & " s" '& Chr(str(0))
End Sub
So I changed it to Obj.Read (with stm.Type = 1, binary mode), and the read immediately became nearly 500 times faster (7.32 s down to 0.015 s).
Private Sub Command3_Click()
Dim str, stm, enfn, defn
Dim timx As Single, tmpstr As String
timx = Timer
enfn = Text1.Text
defn = Text2.Text
Set stm = CreateObject("Adodb.Stream")
stm.Type = 1 '1 bin,2 txt
stm.Mode = 3
stm.Open
' stm.Charset = "GB2312"
stm.LoadFromFile enfn
' str = stm.readtext '------ slow: 7.32 s
str = stm.Read '-------- fast: 0.015 s
stm.Close
Set stm = Nothing
tmpstr = StrConv(str, vbUnicode)
MsgBox "File read completed, elapsed: " & Timer - timx & " s" '& Chr(str(0))
End Sub
This suggests that the slowness of ReadText again comes from building the string. Even so, Adodb.Stream's Obj.Read is still slower than the method I used earlier, which reads the complete file directly into a byte array. With the same 8.27 MB file, the time taken by the following code was too small to measure, almost 0. So I switched to a 75.7 MB file: Adodb.Stream's Obj.Read took 0.109 seconds, and the following code took 0.023 seconds. Reading the entire file with the Open statement is therefore more than 4 times as fast as Adodb.Stream's Obj.Read.
Private Sub Command4_Click()
Dim fL As Long, enfp As Integer, defp As Integer, enfn, defn
Dim B() As Byte, tmpstr As String, outStr As String
Dim timx As Single
timx = Timer
enfn = Text1.Text
defn = Text2.Text
enfp = FreeFile
Open enfn For Binary As #enfp
fL = LOF(enfp)
ReDim B(fL - 1) '---- faster than Adodb.Stream
Get #enfp, , B
Close #enfp
' tmpstr = StrConv(B, vbUnicode)
MsgBox "File read completed, elapsed: " & Format((Timer - timx), "0.000000") & " s" '& Chr(B(0))
End Sub
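The same single-read idea can also replace the Line Input loop when the result is needed as a String rather than a byte array. The following is a minimal sketch, assuming the file holds plain ANSI/ASCII text such as Base64 or UUE output. The function name ReadAllText is hypothetical and not part of the original tool.

```vb
' Hypothetical helper; ReadAllText is an illustrative name,
' not part of the original tool.
Public Function ReadAllText(ByVal fileName As String) As String
    Dim fp As Integer, s As String

    fp = FreeFile
    Open fileName For Binary Access Read As #fp
    If LOF(fp) > 0 Then
        s = Space$(LOF(fp))     ' size the buffer once, one character per byte
        Get #fp, , s            ' a single read fills the whole string
    End If
    Close #fp
    ReadAllText = s
End Function
```

Unlike the Line Input loop, this keeps the line breaks in the result. If they are unwanted, something like EnStr = Replace(ReadAllText(defn), vbCrLf, "") removes them in one pass, assuming the file uses CR+LF line endings as written by Print #.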
'============================================
Likewise, when assembling the Base64 encoding result, I originally used direct character concatenation (see the post "A Base64 + UUE encoding program source code written in VBS, the encoding table can be customized" on jessezappy's CSDN blog): ret = ret & Chr(Base64EncMap((first \ 4) And 63)), returning the whole string once all concatenation was finished. This also became extremely slow as the amount of data grew. I later changed it to store the encoded result in a byte array first:
ReDim Preserve ret(retLength + 4)
ret(retLength + 1) = (Base64EncMap((first \ 4) And 63))
ret(retLength + 2) = (Base64EncMap(((first * 16) And 48) + ((second \ 16) And 15)))
ret(retLength + 3) = (Base64EncMap(((second * 4) And 60) + ((third \ 64) And 3)))
ret(retLength + 4) = (Base64EncMap(third And 63))
Finally, the byte array is converted directly into a string with StrConv(ret, vbUnicode). Compared with concatenation, this is nearly a thousand times faster (the exact ratio depends on the length of the encoded data).
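As a possible further step, the output array can be sized once up front, because Base64 always produces 4 output bytes for every 3 input bytes (rounded up). The sketch below is not the original encoder: the function name EncodeBase64 is hypothetical, it assumes the standard Base64 alphabet with "=" padding rather than a customized table, and it assumes the input array holds at least one byte.

```vb
' Hypothetical sketch; EncodeBase64 is an illustrative name.
' Assumes the standard alphabet, "=" padding, and a non-empty input array.
Public Function EncodeBase64(ByRef src() As Byte) As String
    Const ALPHABET As String = _
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
    Dim Base64EncMap(0 To 63) As Byte
    Dim ret() As Byte
    Dim n As Long, i As Long, o As Long
    Dim first As Long, second As Long, third As Long

    ' Build the lookup table of ASCII codes.
    For i = 0 To 63
        Base64EncMap(i) = Asc(Mid$(ALPHABET, i + 1, 1))
    Next i

    n = UBound(src) - LBound(src) + 1
    ReDim ret(0 To ((n + 2) \ 3) * 4 - 1)   ' allocate the whole result once

    i = LBound(src)
    o = 0
    Do While i <= UBound(src)
        first = src(i)
        If i + 1 <= UBound(src) Then second = src(i + 1) Else second = 0
        If i + 2 <= UBound(src) Then third = src(i + 2) Else third = 0

        ret(o) = Base64EncMap(first \ 4)
        ret(o + 1) = Base64EncMap(((first And 3) * 16) + (second \ 16))
        ret(o + 2) = Base64EncMap(((second And 15) * 4) + (third \ 64))
        ret(o + 3) = Base64EncMap(third And 63)

        i = i + 3
        o = o + 4
    Loop

    ' Overwrite the trailing positions with "=" (ASCII 61) for a partial group.
    Select Case n Mod 3
        Case 1
            ret(o - 2) = 61
            ret(o - 1) = 61
        Case 2
            ret(o - 1) = 61
    End Select

    EncodeBase64 = StrConv(ret, vbUnicode)  ' one conversion at the end
End Function
```

To try it, assign the input to a byte array first, for example Dim b() As Byte: b = StrConv("Man", vbFromUnicode), and then Debug.Print EncodeBase64(b) prints TWFu.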
'===========================================
To sum up, repeated string concatenation and cutting is the culprit behind the low efficiency of the code above.
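Where text really does have to be assembled piece by piece, one common workaround is to write into a preallocated buffer with the Mid$ statement instead of using the & operator. The following is a minimal sketch of that idea. All names in it (mBuf, mLen, BufAppend, BufToString, BufReset) are hypothetical and not part of the original tool.

```vb
Option Explicit

' Hypothetical helpers for a standard module; all names are illustrative.
Private mBuf As String      ' preallocated storage
Private mLen As Long        ' number of characters actually used

Public Sub BufAppend(ByRef s As String)
    Dim need As Long, grow As Long

    need = mLen + Len(s)
    If need > Len(mBuf) Then
        ' Grow by at least the current size, so the total amount of
        ' copying stays proportional to the final length.
        grow = Len(mBuf)
        If grow < need - Len(mBuf) Then grow = need - Len(mBuf)
        If grow < 1024 Then grow = 1024
        mBuf = mBuf & Space$(grow)
    End If

    ' The Mid$ statement overwrites characters in place, without reallocating.
    If Len(s) > 0 Then Mid$(mBuf, mLen + 1, Len(s)) = s
    mLen = need
End Sub

Public Function BufToString() As String
    BufToString = Left$(mBuf, mLen)     ' trim the unused tail
End Function

Public Sub BufReset()
    mBuf = vbNullString
    mLen = 0
End Sub
```

In the Line Input loop, for instance, EnStr = EnStr & tmpstr would become BufAppend tmpstr, followed by EnStr = BufToString() after the loop ends.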