Go --- Determining a file's type from its file header

  We usually judge a file's type by its suffix. If a file has been renamed, that judgment can be wrong, so checking the file header (the first few bytes of the content) is more accurate.


 

Commonly used file headers

The file headers of commonly used file types are as follows (hexadecimal):

JPEG (jpg), file header: FFD8FFE0, FFD8FFE1, FFD8FFE8
GIF (gif), file header: 47494638
PNG (png), file header: 89504E47
TIFF (tif), file header: 49492A00
Windows Bitmap (bmp), file header: 424DC001
CAD (dwg), file header: 41433130
Adobe Photoshop (psd), file header: 38425053
Rich Text Format (rtf), file header: 7B5C727466
XML (xml), file header: 3C3F786D6C
HTML (html), file header: 68746D6C3E
Email [thorough only] (eml), file header: 44656C69766572792D646174653A
Outlook Express (dbx), file header: CFAD12FEC5FD746F
Outlook (pst), file header: 2142444E
MS Word/Excel (xls or doc), file header: D0CF11E0
MS Access (mdb), file header: 5374616E64617264204A
WordPerfect (wpd), file header: FF575043
Adobe Acrobat (pdf), file header: 255044462D312E
Quicken (qdf), file header: AC9EBD8F
Windows Password (pwl), file header: E3828596
ZIP Archive (zip), file header: 504B0304
RAR Archive (rar), file header: 52617221
Wave (wav), file header: 57415645
AVI (avi), file header: 41564920
Real Audio (ram), file header: 2E7261FD
Real Media (rm), file header: 2E524D46
MPEG (mpg), file header: 000001BA
MPEG (mpg), file header: 000001B3
Quicktime (mov), file header: 6D6F6F76
Windows Media (asf), file header: 3026B2758E66CF11
MIDI (mid), file header: 4D546864

 

Go code example

  This example only checks whether the file is jpg/jpeg/png. If you need to detect other types, add their headers to the map yourself.

package main

import (
	"encoding/hex"
	"fmt"
	"io"
	"log"
	"os"
	"strings"
)

// picMap maps a lowercase hex file header to the image type it identifies.
var picMap = map[string]string{
	"ffd8ffe0": "jpg",
	"ffd8ffe1": "jpg",
	"ffd8ffe8": "jpg",
	"89504e47": "png",
}

func main() {
	file, err := os.Open("pic/type/test.jpeg")
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()
	result := judgeType(file)
	fmt.Println("Result:", result)
}

// judgeType reports whether the file starts with one of the headers in picMap.
func judgeType(file *os.File) bool {
	// The first 20 bytes are more than enough for the headers above.
	buf := make([]byte, 20)
	n, err := file.Read(buf)
	if err != nil && err != io.EOF {
		return false
	}

	fileCode := bytesToHexString(buf[:n])
	for k := range picMap {
		if strings.HasPrefix(fileCode, k) {
			return true
		}
	}
	return false
}

// bytesToHexString returns the lowercase hex encoding of at most the first 100 bytes of src.
func bytesToHexString(src []byte) string {
	if len(src) > 100 {
		src = src[:100]
	}
	// hex.EncodeToString always emits two digits per byte, so no manual zero padding is needed.
	return hex.EncodeToString(src)
}
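
If you want to obtain the type rather than just a yes/no answer, one possible variant is sketched below. It is only an illustration, not part of the original program: the names `signature`, `signatures`, `mustHex`, and `detectType` are made up for this example, and the table holds just a few entries taken from the header list earlier in the article. It compares raw bytes instead of hex strings.

```go
package main

import (
	"bytes"
	"encoding/hex"
	"fmt"
)

// signature pairs a file header (as raw bytes) with the type it identifies.
type signature struct {
	header []byte
	name   string
}

// mustHex decodes a hex string such as "89504E47"; it panics on malformed input.
func mustHex(s string) []byte {
	b, err := hex.DecodeString(s)
	if err != nil {
		panic(err)
	}
	return b
}

// signatures is a hypothetical table built from a few entries of the header list.
var signatures = []signature{
	{mustHex("FFD8FFE0"), "jpg"},
	{mustHex("FFD8FFE1"), "jpg"},
	{mustHex("FFD8FFE8"), "jpg"},
	{mustHex("89504E47"), "png"},
	{mustHex("47494638"), "gif"},
	{mustHex("504B0304"), "zip"},
}

// detectType returns the type whose header is a prefix of data,
// and false if no entry in the table matches.
func detectType(data []byte) (string, bool) {
	for _, sig := range signatures {
		if bytes.HasPrefix(data, sig.header) {
			return sig.name, true
		}
	}
	return "", false
}

func main() {
	name, ok := detectType([]byte("GIF89a..."))
	fmt.Println(name, ok) // prints "gif true"
}
```

A slice is used instead of a map so that the lookup order is fixed, which matters if one header is ever a prefix of another.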

 
 

Some pitfalls

  Reading a file with methods such as Read() and ReadAll() consumes the stream: bytes that have already been read are not returned again. If the complete file is needed afterwards, what is left will be incomplete.

  If possible, copy the first 20 bytes of the file stream into a separate []byte for the check, or put the bytes that were read back in front of the stream afterwards.

  For files received in a request (gin.Context, fasthttp, etc.), you can copy the request first so that the source data is not destroyed.

var c *gin.Context
copyC := c.Copy() // work on a copy of the context
fileHeader, err := copyC.FormFile("file_name")
......

 
Reference link:
https://blog.miuyun.work
 
  If there is something wrong, please point it out, thank you~


Origin blog.csdn.net/my_miuye/article/details/125137194