Python: extract files of a specific type from a directory

Use Python's 'os' and 're' modules to extract files of a specific type from a directory. Both modules ship with Python, so there is nothing to install.

Idea:

Use os.listdir to get the names of all entries in a folder, join each name with the folder path to form a full path, and then check what that path refers to. If it is a file, use a regular-expression function from the re library to keep only files with the desired suffix; if it is a folder, process that folder recursively.

Note:

The code below extracts 'xlsx' files. To extract other file types, replace the regular expression passed to re.compile('str').
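
As a rough sketch of how the pattern could be swapped, the snippet below builds the regular expression from a list of suffixes. The helper name `make_suffix_pattern` is hypothetical and not part of the original code; it only illustrates that `re.escape` plus an end-of-string anchor keeps names such as `d.xlsx.bak` from matching.

```python
import re


def make_suffix_pattern(suffixes):
    """Hypothetical helper: compile a pattern for names ending in any given suffix."""
    # re.escape guards against suffixes that contain regex metacharacters.
    alternatives = '|'.join(re.escape(s) for s in suffixes)
    # \. matches the literal dot; $ anchors the suffix to the end of the name.
    return re.compile(r'.*\.(?:' + alternatives + r')$')


ptn = make_suffix_pattern(['xls', 'xlsx'])
names = ['a.xlsx', 'b.xls', 'c.doc', 'd.xlsx.bak']
matched = [n for n in names if ptn.match(n)]
print(matched)
```

Passing `['doc']` instead would select Word documents in the same way.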

Source:

import os
import re

fileList = []

# Collect *.xlsx files from a directory tree into fileList
def _getfiles(dirPath):
    """
    dirPath: str, the path of the directory
    """
    # list the entries of the directory
    files = os.listdir(dirPath)
    # match *.xlsx; change 'xlsx' to 'doc' or another suffix for other file types
    ptn = re.compile(r'.*\.xlsx$')
    for f in files:
        fullPath = os.path.join(dirPath, f)
        # directory: recurse into it
        if os.path.isdir(fullPath):
            _getfiles(fullPath)
        # regular file: keep it if the name matches the pattern
        elif os.path.isfile(fullPath):
            if ptn.match(f) is not None:
                fileList.append(fullPath)
        # neither a directory nor a regular file (e.g. a broken link)
        else:
            fileList.append(os.path.join(dirPath, 'invalid file'))


# Function called from outside
def getfiles(dirPath):
    """
    dirPath: str, the path of the directory
    """
    # start from an empty list so repeated calls do not accumulate results
    fileList.clear()
    _getfiles(dirPath)
    return list(fileList)

if __name__ == "__main__":
    path = 'D:\\pyfiles\\test'
    res = getfiles(path)
    print('extraction result:')
    for f in res:
        print(f)
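
For comparison, here is a minimal sketch of the same task written with `os.walk` from the standard library, which visits subdirectories itself. This is an alternative technique, not the approach used in the code above; the function name `find_files` and the temporary test tree are assumptions made for illustration.

```python
import os
import tempfile


def find_files(root, suffix):
    """Hypothetical alternative: return paths under root whose names end with suffix."""
    found = []
    # os.walk yields root and every subdirectory, so no explicit recursion is needed.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            # lower() makes the suffix comparison case-insensitive.
            if name.lower().endswith(suffix):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


# Build a small throwaway tree to try the function on.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'sub'))
    for rel in ('a.xlsx', 'b.txt', os.path.join('sub', 'c.xlsx')):
        open(os.path.join(root, rel), 'w').close()
    result = [os.path.relpath(p, root) for p in find_files(root, '.xlsx')]

print(result)
```

Because the result list is local to the function, calling it several times does not mix results from different directories.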

Origin www.cnblogs.com/yocichen/p/11693240.html