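Parsing Time Formats in Python

Overview

This post records the time-parsing cases I ran into while writing a crawler. Websites display timestamps in all sorts of formats, but the format we store in the database has to be a single, consistent one.

Below are the time formats I encountered on one forum site and how I parsed each of them.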

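Walkthrough

Every operation below converts the time into one standard format, %Y-%m-%d %H:%M:%S, for example: 2018-07-26 18:56:42

In the sample code, s_time is the raw string extracted from the page that still needs processing, and result_time is the processed result. The snippets are branches of a single if/elif chain, are written for Python 3, and assume import re, import time, and from datetime import datetime, timedelta.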
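To make the target format concrete, here is a minimal sketch that round-trips the example value through strptime and strftime. The names STANDARD_FORMAT and normalize are chosen for this illustration only and do not appear in the crawler code.

```python
from datetime import datetime

# Hypothetical constant name for the storage format used throughout this post.
STANDARD_FORMAT = "%Y-%m-%d %H:%M:%S"

def normalize(dt):
    """Render a datetime object in the standard storage format."""
    return dt.strftime(STANDARD_FORMAT)

# Parse the example string and render it again unchanged.
parsed = datetime.strptime("2018-07-26 18:56:42", STANDARD_FORMAT)
print(normalize(parsed))

# A date without a time component is padded with 00:00:00.
print(normalize(datetime(2017, 6, 15)))
```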

1. s_time = "2017-06-15"

# fullmatch with a 4-digit year avoids partial matches that strptime would reject.
if re.fullmatch(r'\d{4}-\d{1,2}-\d{1,2}', s_time):
    result_time = time.strftime("%Y-%m-%d %H:%M:%S", time.strptime(s_time, "%Y-%m-%d"))

2. s_time = "6天前" (6 days ago)

elif '天前' in s_time:
    # Raw string so that \d is not treated as an invalid escape sequence.
    days = re.findall(r'(\d+)天前', s_time)[0]
    result_time = (datetime.now() - timedelta(days=int(days))).strftime("%Y-%m-%d %H:%M:%S")

3. s_time = "昨天 18:03" (yesterday 18:03)

elif '昨天' in s_time:
    # Extract the HH:MM part and attach it to yesterday's date.
    last_time = re.findall(r'(\d{1,2}:\d{1,2})', s_time)[0]
    days_ago = datetime.now() - timedelta(days=1)
    _time = days_ago.strftime("%Y-%m-%d") + ' ' + last_time
    result_time = time.strftime("%Y-%m-%d %H:%M:%S", time.strptime(_time, "%Y-%m-%d %H:%M"))
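The "yesterday" case can also be written without building an intermediate string. The sketch below is one possible variant, not the code used in the crawler: parse_yesterday and its now parameter are hypothetical names, and now is only there so that the result can be checked against a fixed clock.

```python
import re
from datetime import datetime, timedelta

def parse_yesterday(s_time, now=None):
    """Turn '昨天 HH:MM' into 'YYYY-MM-DD HH:MM:00' relative to `now`."""
    now = now or datetime.now()
    match = re.search(r'(\d{1,2}):(\d{1,2})', s_time)
    if match is None:
        raise ValueError("no HH:MM part in %r" % s_time)
    hour, minute = int(match.group(1)), int(match.group(2))
    # Step back one day, then overwrite the time-of-day fields.
    yesterday = now - timedelta(days=1)
    result = yesterday.replace(hour=hour, minute=minute, second=0, microsecond=0)
    return result.strftime("%Y-%m-%d %H:%M:%S")

print(parse_yesterday("昨天 18:03", now=datetime(2018, 7, 26, 9, 30)))
# -> 2018-07-25 18:03:00
```

Because the subtraction is done on a datetime object, month and year boundaries are handled by timedelta rather than by string arithmetic.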

4. s_time = "28分钟前" (28 minutes ago)

elif '分钟前' in s_time:
    minutes = re.findall(r'(\d+)分钟前', s_time)[0]
    minutes_ago = (datetime.now() - timedelta(minutes=int(minutes))).strftime("%Y-%m-%d %H:%M:%S")
    result_time = minutes_ago

5. s_time = "06-29"

# Month and day only: assume the current year.
elif re.fullmatch(r'\d{1,2}-\d{1,2}', s_time):
    now_year = str(datetime.now().year)
    _time = now_year + '-' + s_time
    result_time = time.strftime("%Y-%m-%d %H:%M:%S", time.strptime(_time, "%Y-%m-%d"))
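One thing to watch with month-day strings is the turn of the year: a post shown as "12-30" and crawled in early January would get the wrong year if the current year is always used. The sketch below shows one possible way to handle that. It rests on an assumption the original code does not make, namely that the forum never shows dates in the future. parse_month_day and now are hypothetical names.

```python
from datetime import datetime

def parse_month_day(s_time, now=None):
    """Turn 'MM-DD' into 'YYYY-MM-DD 00:00:00', guessing the year."""
    now = now or datetime.now()
    month, day = (int(part) for part in s_time.split('-'))
    year = now.year
    # Assumption: a date later than today must belong to the previous year.
    if (month, day) > (now.month, now.day):
        year -= 1
    return datetime(year, month, day).strftime("%Y-%m-%d %H:%M:%S")

print(parse_month_day("06-29", now=datetime(2018, 7, 26)))
print(parse_month_day("12-30", now=datetime(2019, 1, 2)))
```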

6. s_time = "1小时前" (1 hour ago)

elif '小时前' in s_time:
    hours = re.findall(r'(\d+)小时前', s_time)[0]
    hours_ago = (datetime.now() - timedelta(hours=int(hours))).strftime("%Y-%m-%d %H:%M:%S")
    result_time = hours_ago
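Cases 2, 4, and 6 differ only in the unit that is subtracted, so they could be folded into a single helper. The following is a rough sketch of that idea rather than the code used in the crawler. UNITS, RELATIVE, parse_relative, and the now parameter are all names invented for this example.

```python
import re
from datetime import datetime, timedelta

# Maps the Chinese unit found in the page text to a timedelta keyword argument.
UNITS = {'天': 'days', '小时': 'hours', '分钟': 'minutes'}
RELATIVE = re.compile(r'(\d+)\s*(天|小时|分钟)前')

def parse_relative(s_time, now=None):
    """Return the standard-format time for 'N天前', 'N小时前', or 'N分钟前', else None."""
    now = now or datetime.now()
    match = RELATIVE.search(s_time)
    if match is None:
        return None
    amount, unit = int(match.group(1)), match.group(2)
    delta = timedelta(**{UNITS[unit]: amount})
    return (now - delta).strftime("%Y-%m-%d %H:%M:%S")

fixed_now = datetime(2018, 7, 26, 18, 56, 42)
for text in ("6天前", "28分钟前", "1小时前"):
    print(text, "->", parse_relative(text, now=fixed_now))
```

Passing a fixed now makes the helper deterministic, which is convenient when checking the results by hand.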

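7. s_time = "1532573387"

# Use fullmatch instead of findall(...)[0], which raises IndexError when nothing matches.
elif re.fullmatch(r'\d{10}|\d{13}', s_time):
    _t = int(s_time)
    # A 13-digit value is in milliseconds; convert it to seconds first.
    if len(s_time) == 13:
        _t //= 1000
    _time = time.localtime(_t)
    result_time = time.strftime("%Y-%m-%d %H:%M:%S", _time)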
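The timestamp case depends on the time zone of the machine running the crawler, because time.localtime uses the local zone. As a sketch of how to make that explicit, the helper below uses datetime.fromtimestamp with an optional tz argument. parse_epoch is a hypothetical name, and treating 13-digit values as milliseconds is an assumption about the site's data.

```python
from datetime import datetime, timezone

def parse_epoch(s_time, tz=None):
    """Convert a 10-digit (seconds) or 13-digit (milliseconds) epoch string."""
    if not s_time.isdigit() or len(s_time) not in (10, 13):
        raise ValueError("not an epoch timestamp: %r" % s_time)
    seconds = int(s_time)
    if len(s_time) == 13:
        # Assumption: 13 digits means milliseconds since the epoch.
        seconds //= 1000
    # tz=None gives local time, matching time.localtime().
    return datetime.fromtimestamp(seconds, tz=tz).strftime("%Y-%m-%d %H:%M:%S")

print(parse_epoch("1532573387", tz=timezone.utc))
print(parse_epoch("1532573387000", tz=timezone.utc))
```

With tz left as None the result follows the machine's local zone, so the same input can produce different strings on different servers.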

Those are the basic time formats used by this forum. The complete code follows.

    # Requires at module level:
    #   import re
    #   import time
    #   from datetime import datetime, timedelta
    def parse_time(self, s_time):
        """Convert a forum time string to '%Y-%m-%d %H:%M:%S'; return '' if unrecognized."""
        result_time = ''
        # 1. Full date: 2017-06-15
        if re.fullmatch(r'\d{4}-\d{1,2}-\d{1,2}', s_time):
            result_time = time.strftime("%Y-%m-%d %H:%M:%S", time.strptime(s_time, "%Y-%m-%d"))

        # 2. Days ago: 6天前
        elif '天前' in s_time:
            days = re.findall(r'(\d+)天前', s_time)[0]
            result_time = (datetime.now() - timedelta(days=int(days))).strftime("%Y-%m-%d %H:%M:%S")

        # 3. Yesterday with a time: 昨天 18:03
        elif '昨天' in s_time:
            last_time = re.findall(r'(\d{1,2}:\d{1,2})', s_time)[0]
            days_ago = datetime.now() - timedelta(days=1)
            _time = days_ago.strftime("%Y-%m-%d") + ' ' + last_time
            result_time = time.strftime("%Y-%m-%d %H:%M:%S", time.strptime(_time, "%Y-%m-%d %H:%M"))

        # 4. Minutes ago: 28分钟前
        elif '分钟前' in s_time:
            minutes = re.findall(r'(\d+)分钟前', s_time)[0]
            result_time = (datetime.now() - timedelta(minutes=int(minutes))).strftime("%Y-%m-%d %H:%M:%S")

        # 5. Month and day only (current year assumed): 06-29
        elif re.fullmatch(r'\d{1,2}-\d{1,2}', s_time):
            _time = str(datetime.now().year) + '-' + s_time
            result_time = time.strftime("%Y-%m-%d %H:%M:%S", time.strptime(_time, "%Y-%m-%d"))

        # 6. Hours ago: 1小时前
        elif '小时前' in s_time:
            hours = re.findall(r'(\d+)小时前', s_time)[0]
            result_time = (datetime.now() - timedelta(hours=int(hours))).strftime("%Y-%m-%d %H:%M:%S")

        # 7. Unix timestamp in seconds (10 digits) or milliseconds (13 digits): 1532573387
        elif re.fullmatch(r'\d{10}|\d{13}', s_time):
            _t = int(s_time)
            if len(s_time) == 13:
                _t //= 1000
            result_time = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(_t))

        return result_time
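Since parse_time returns an empty string when no branch matches, it can be worth checking the result before writing it to the database. The helper below is a small sketch of such a check. is_standard_time is a name made up for this example and is not part of the original crawler.

```python
from datetime import datetime

STANDARD_FORMAT = "%Y-%m-%d %H:%M:%S"

def is_standard_time(value):
    """Return True only if `value` is exactly in the standard storage format."""
    try:
        parsed = datetime.strptime(value, STANDARD_FORMAT)
    except (TypeError, ValueError):
        # Not a string, or a string that does not fit the format.
        return False
    # strptime tolerates missing zero padding, so compare the round trip.
    return parsed.strftime(STANDARD_FORMAT) == value

print(is_standard_time("2018-07-26 18:56:42"))
print(is_standard_time(""))
```

A crawler could log or skip any record for which this check fails instead of storing an empty or half-parsed value.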

The screenshot below shows the results from one crawl run.

[Image: parse_time]


Reposted from blog.csdn.net/yunlongl/article/details/81225635