Recognizing Chinese date and time formats in Excel cells with POI

This article addresses two problems:

1. Getting POI to recognize Excel date/time cells whose format contains Chinese characters, for example 2021年1月22日 (January 22, 2021)

2. Getting POI to handle time data in cells whose Excel type is Date or Time

Foreword:

Both problems are worked through with examples. To keep the focus on the code that actually solves them, only .xlsx workbooks are covered.

1. Recognizing times that contain Chinese characters

Foreword:
Dates written in the form xxxx年xx月xx日 (year, month, day) are used as the running example.

Background:
Since the POI 3.x releases, POI has not recognized date and time formats in Excel that contain Chinese characters. Such formats still show up in our spreadsheets, so we need a way to recognize them.

Analysis:

POI decides whether a cell holds a date mainly through DateUtil.isCellDateFormatted(Cell cell), a method of the class org.apache.poi.ss.usermodel.DateUtil. Reading its source shows that isADateFormat, the method it uses to inspect the format string, only handles separator symbols. It does not account for the Chinese characters that appear in such formats, such as 年 (year), 月 (month), and 日 (day).

// Excerpt from the source of isADateFormat: the part that analyzes the format string
if (separatorIndex < length - 1) {
    char nc = fs.charAt(separatorIndex + 1);
    if (c == '\\') {
        switch (nc) {
            case ' ':
            case ',':
            case '-':
            case '.':
            case '\\':
                continue;
        }
    } else if (c == ';' && nc == '@') {
        ++separatorIndex;
        continue;
    }
}

Solution:

To make isCellDateFormatted handle Chinese date formats, we re-implement the method in our own utility class. The modified version below can be copied and used directly.

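// Create a DateFormatUtil utility class. Whenever a cell has to be checked for a date/time type,
// call DateFormatUtil.isCellDateFormatted(Cell cell).
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellStyle;

public class DateFormatUtil {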
    public static boolean isCellDateFormatted(Cell cell)
    {
        if (cell == null) {
            return false;
        }
        boolean bDate = false;

        double d = cell.getNumericCellValue();
        if (isValidExcelDate(d)) {
            CellStyle style = cell.getCellStyle();
            if (style == null) {
                return false;
            }
            int i = style.getDataFormat();
            String f = style.getDataFormatString();
            bDate = isADateFormat(i, f);
        }
        return bDate;
    }

    public static boolean isADateFormat(int formatIndex, String formatString)
    {
        if (isInternalDateFormat(formatIndex)) {
            return true;
        }

        if ((formatString == null) || (formatString.length() == 0)) {
            return false;
        }

        String fs = formatString;
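        // The next line was added by hand (wingzing) to support formats containing Chinese characters:
        // it strips quotes first, then the Chinese unit characters. Inside [...] every character is
        // matched individually, so no | separators are needed.
        fs = fs.replaceAll("[\"']", "").replaceAll("[年月日时分秒毫微]", "");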
        fs = fs.replaceAll("\\\\-", "-");
        fs = fs.replaceAll("\\\\,", ",");
        fs = fs.replaceAll("\\\\.", ".");
        fs = fs.replaceAll("\\\\ ", " ");
        fs = fs.replaceAll(";@", "");
        fs = fs.replaceAll("^\\[\\$\\-.*?\\]", "");
        fs = fs.replaceAll("^\\[[a-zA-Z]+\\]", "");
        return (fs.matches("^[yYmMdDhHsS\\-/,. :]+[ampAMP/]*$"));
    }

    public static boolean isInternalDateFormat(int format)
    {
        switch (format) {
            case 14:
            case 15:
            case 16:
            case 17:
            case 18:
            case 19:
            case 20:
            case 21:
            case 22:
            case 45:
            case 46:
            case 47:
            case 57:
            case 58:
                return true;
            default:
                // Everything else, including 23 to 44, is not treated as a built-in date format here
                return false;
        }
    }

    public static boolean isValidExcelDate(double value)
    {
        // -Double.MIN_VALUE is the same constant as -4.940656458412465E-324
        return value > -Double.MIN_VALUE;
    }
}
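
To see what the added replaceAll line changes, the format-string check can be exercised on plain strings without POI. The sketch below copies the regular expressions from isADateFormat above; the class and method names (FormatStringCheck, matchesWithoutStripping, matchesWithStripping) are made up for this illustration and are not part of POI.

```java
final class FormatStringCheck {
    // Same final pattern as the last line of isADateFormat above.
    private static final String DATE_PATTERN = "^[yYmMdDhHsS\\-/,. :]+[ampAMP/]*$";

    /** Check without the extra line: quotes and Chinese unit characters stay in place. */
    static boolean matchesWithoutStripping(String formatString) {
        return normalize(formatString).matches(DATE_PATTERN);
    }

    /** Check with quotes and Chinese unit characters stripped first. */
    static boolean matchesWithStripping(String formatString) {
        String fs = formatString.replaceAll("[\"']", "")
                                .replaceAll("[年月日时分秒毫微]", "");
        return normalize(fs).matches(DATE_PATTERN);
    }

    // Shared clean-up steps, copied from isADateFormat.
    private static String normalize(String fs) {
        fs = fs.replaceAll("\\\\-", "-");
        fs = fs.replaceAll("\\\\,", ",");
        fs = fs.replaceAll("\\\\.", ".");
        fs = fs.replaceAll("\\\\ ", " ");
        fs = fs.replaceAll(";@", "");
        fs = fs.replaceAll("^\\[\\$\\-.*?\\]", "");
        fs = fs.replaceAll("^\\[[a-zA-Z]+\\]", "");
        return fs;
    }

    public static void main(String[] args) {
        String chinese = "yyyy\"年\"m\"月\"d\"日\""; // format string as Excel stores it
        System.out.println(matchesWithoutStripping(chinese)); // false
        System.out.println(matchesWithStripping(chinese));    // true
        System.out.println(matchesWithStripping("0.00"));     // false: a plain number format
    }
}
```

Once the quotes and unit characters are removed, only the letters y, m, d remain, so the unchanged final pattern accepts the string.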

2. Recognizing time data in cells of the Date or Time type

2.1 Cell types

Before looking at how POI identifies time data in an Excel sheet, we first need to know the different ways a cell holding time data can be formatted in Excel.

[Figures: three screenshots of Excel's cell format settings; the images are not preserved here]

As the figures show, Excel has three cell format categories that can represent time: Date, Time, and Custom.

2.2 Custom Type Cell – Time Processing

Of the three categories above, time data stored in Custom-type cells can be identified directly from the value returned by getDataFormat(). Commonly used time formats and their format values are listed below:

Time format       Format value
yyyy-MM-dd        14
yyyy年m月d日      31
yyyy年m月         57
m月d日            58
HH:mm             20
h时mm分           32

Below is my code for handling several common time types

// First get the cell's format value
short format = row.getCell(j).getCellStyle().getDataFormat();
SimpleDateFormat sdf = null;
// Choose the output pattern according to the format value
if (format == 20 || format == 32) {
    sdf = new SimpleDateFormat("HH:mm");
} else if (format == 14 || format == 31 || format == 57 || format == 58) {
    sdf = new SimpleDateFormat("yyyy-MM-dd");
} else {
    sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
}
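
If the same mapping is needed in more than one place, it can be pulled out into a small helper. This is only a sketch: FormatIndexPatterns and patternFor are hypothetical names, and the format values are the ones listed in the table above.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

final class FormatIndexPatterns {
    /** Maps a cell's format value to a SimpleDateFormat pattern (hypothetical helper). */
    static String patternFor(short format) {
        switch (format) {
            case 20:
            case 32:
                return "HH:mm";               // time-only formats
            case 14:
            case 31:
            case 57:
            case 58:
                return "yyyy-MM-dd";          // date-only formats
            default:
                return "yyyy-MM-dd HH:mm:ss"; // fallback for everything else
        }
    }

    public static void main(String[] args) {
        // Format the current time as if it came from a cell with format value 31
        SimpleDateFormat sdf = new SimpleDateFormat(patternFor((short) 31));
        System.out.println(sdf.format(new Date()));
    }
}
```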

2.3 Date- or Time-type cells – detection and processing

Analysis:

Unlike Custom-type cells, Date-type and Time-type cells cannot be identified by their format value. With the code above, values that should be displayed as "yyyy-MM-dd" or "HH:mm" therefore fall through to the "yyyy-MM-dd HH:mm:ss" pattern, which looks cluttered and does not match the format we need.

Solution: (this is only my own way of handling it; it may not be rigorous and is for reference only)
Steps:

  1. First format the value as "yyyy-MM-dd HH:mm:ss"
  2. Take the resulting String, split it on space, : and -, and store the pieces in a String array
  3. Use the hour, minute, and second entries of the array to decide the final display pattern: when hour, minute, and second are all 0, show only the date; when hour and minute are both non-zero and second is 0, show only the time

The code:

// First get the cell's format value
short format = row.getCell(j).getCellStyle().getDataFormat();
SimpleDateFormat sdf = null;
// Choose the output pattern according to the format value
if (format == 20 || format == 32) {
    sdf = new SimpleDateFormat("HH:mm");
} else if (format == 14 || format == 31 || format == 57 || format == 58) {
    sdf = new SimpleDateFormat("yyyy-MM-dd");
} else {
    /*
     * The branches above cover time formats from the Custom category. This branch covers
     * cells formatted as Date or Time, which cannot be identified by the format value:
     * 1. Format the value as "yyyy-MM-dd HH:mm:ss".
     * 2. Split the formatted String on space, ':' and '-' into a String array.
     * 3. Use the hour, minute, and second entries to choose the final pattern:
     *    all three are 0 -> show only the date;
     *    hour and minute are non-zero and second is 0 -> show only the time.
     */
    sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    // split() takes a regular expression, so several separators can be combined with |
    // Array layout (index: content): 0 year, 1 month, 2 day, 3 hour, 4 minute, 5 second
    String[] str = sdf.format(DateUtil.getJavaDate(row.getCell(j).getNumericCellValue())).split("-|:| ");
    int hour = Integer.parseInt(str[3]);
    int minute = Integer.parseInt(str[4]);
    int second = Integer.parseInt(str[5]);
    if (hour == 0 && minute == 0 && second == 0) {
        // In this sheet, hour/minute/second all 0 means the cell holds a "yyyy/MM/dd" value: show only the date
        sdf = new SimpleDateFormat("yyyy-MM-dd");
    } else if (hour != 0 && minute != 0 && second == 0) {
        // In this sheet, non-zero hour and minute mean the cell holds an "HH:mm" value: the date is not needed
        sdf = new SimpleDateFormat("HH:mm");
    }
    // Any other combination keeps the full "yyyy-MM-dd HH:mm:ss" pattern
}
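
An alternative to splitting the formatted string can be sketched from the raw number returned by getNumericCellValue(), relying on the fact that Excel stores a date as a number whose integer part counts days and whose fractional part is the time of day. This is not the method used in the code above: ExcelValuePattern and choosePattern are hypothetical names, and the rule "no day part means a pure time" is an assumption that may not fit every sheet.

```java
final class ExcelValuePattern {
    /** Picks a display pattern from the raw numeric cell value (hypothetical helper). */
    static String choosePattern(double excelValue) {
        double dayPart = Math.floor(excelValue);  // whole days
        double timePart = excelValue - dayPart;   // fraction of a day
        if (dayPart == 0 && timePart > 0) {
            return "HH:mm";                       // no date part: a pure time
        }
        if (timePart == 0) {
            return "yyyy-MM-dd";                  // no time part: a pure date
        }
        return "yyyy-MM-dd HH:mm:ss";             // both parts present
    }

    public static void main(String[] args) {
        System.out.println(choosePattern(44218.0)); // yyyy-MM-dd
        System.out.println(choosePattern(0.5));     // HH:mm
        System.out.println(choosePattern(44218.5)); // yyyy-MM-dd HH:mm:ss
    }
}
```

Unlike the hour and minute test, this also classifies a time such as 10:00 as time-only, but it drops the seconds of a pure time value, so it needs the same case-by-case checking against your own data.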

The comments in the code explain each step. The example only covers a limited range of cases and is for reference; adapt the rules to your own data.

Full code:

Finally, here is the complete code for reading an .xlsx workbook:

// Imports used by the code below
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import org.apache.poi.ss.usermodel.CellType;
import org.apache.poi.ss.usermodel.DateUtil;
import org.apache.poi.xssf.usermodel.XSSFCell;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

// 1. Open the workbook at the given file path
XSSFWorkbook xssfworkbook = new XSSFWorkbook(path);

// 2. Get the sheet by index (indexes start at 0)
XSSFSheet sheet = xssfworkbook.getSheetAt(1);
int lastRowNum = sheet.getLastRowNum();
for (int i = 2; i <= lastRowNum; i++) {
    XSSFRow row = sheet.getRow(i);

    if (row != null) {
        List<String> list = new ArrayList<>();
        for (int j = 0; j < 13; j++) {
            XSSFCell cell = row.getCell(j);
            // Both conditions must hold (&&); with || a null cell would cause a NullPointerException
            if (cell != null && !cell.toString().trim().equals("")) {
                String value = null;
                // Check whether the cell holds numeric data
                if (cell.getCellType() == CellType.NUMERIC) {
                    // Numeric data: read the format value to decide how to convert it
                    short format = cell.getCellStyle().getDataFormat();
                    // Uses the re-implemented isCellDateFormatted from the DateFormatUtil class
                    if (DateFormatUtil.isCellDateFormatted(cell)) {
                        SimpleDateFormat sdf = null;
                        if (format == 20 || format == 32) {
                            sdf = new SimpleDateFormat("HH:mm");
                        } else if (format == 14 || format == 31 || format == 57 || format == 58) {
                            sdf = new SimpleDateFormat("yyyy-MM-dd");
                        } else {
                            // Date- or Time-type cell: it cannot be identified by the format value, so
                            // format as "yyyy-MM-dd HH:mm:ss", split on space, ':' and '-', and decide
                            // the final pattern from the hour, minute, and second entries
                            sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
                            String[] str = sdf.format(DateUtil.getJavaDate(cell.getNumericCellValue())).split("-|:| ");
                            int hour = Integer.parseInt(str[3]);
                            int minute = Integer.parseInt(str[4]);
                            int second = Integer.parseInt(str[5]);
                            if (hour == 0 && minute == 0 && second == 0) {
                                // In this sheet, all zero means the cell is a production date: show only the date
                                sdf = new SimpleDateFormat("yyyy-MM-dd");
                            } else if (hour != 0 && minute != 0 && second == 0) {
                                // In this sheet, non-zero hour and minute mean an initial or final setting time: no date needed
                                sdf = new SimpleDateFormat("HH:mm");
                            }
                            // Any other combination keeps the full "yyyy-MM-dd HH:mm:ss" pattern
                        }
                        Date date = DateUtil.getJavaDate(cell.getNumericCellValue());
                        value = sdf.format(date);
                    } else {
                        value = String.valueOf(cell);
                    }
                } else {
                    value = String.valueOf(cell);
                }
                list.add(value);
            }
        }
        // Loop over list and copy the values into the corresponding bean object
    }
}
// Finally, wrap up and return whatever data your own method needs to return

Origin blog.csdn.net/xiri_/article/details/112986449