Python: Detecting Chinese Characters in a String, and unicode
Example code (Python 2):
# -*- coding: utf-8 -*-
import re


class YGCommon(object):

    # Classify a domain-name prefix:
    # default (0), letters (1), digits (2), letters and digits (3),
    # Chinese (4), hyphen (7), error (9)
    @classmethod
    def check_domain_name_type(cls, domain_name_prefix):
        domain_name_type = 0
        try:
            # Precise check (with debug output)
            if type(domain_name_prefix).__name__ == "unicode":
                print domain_name_prefix, "\t\t unicode"
                # Note: the result of this call is discarded; the value is already unicode
                domain_name_prefix.encode('utf-8').decode('unicode_escape')    # u'abc中国123'
            else:
                print domain_name_prefix, "\t\t no unicode"
                domain_name_prefix = unicode(domain_name_prefix, 'utf-8')      # 'abc中国123'

            # Concise check (does the same conversion; redundant after the branch above)
            if type(domain_name_prefix).__name__ != "unicode":
                domain_name_prefix = unicode(domain_name_prefix, 'utf-8')      # 'abc中国123'

            # Matches one or more characters in the range U+4E00..U+9FA5
            zhPattern = re.compile(u'[\u4e00-\u9fa5]+')
            zhMatch = zhPattern.search(domain_name_prefix)

            if re.match('^[a-z]+$', domain_name_prefix):        # 1 - letters
                domain_name_type = 1
            elif re.match('^[0-9]+$', domain_name_prefix):      # 2 - digits
                domain_name_type = 2
            elif re.match('^[0-9a-z]+$', domain_name_prefix):   # 3 - letters and digits
                domain_name_type = 3
            elif zhMatch:                                       # 4 - contains Chinese
                domain_name_type = 4
            elif "-" in domain_name_prefix:                     # 7 - hyphen (-)
                domain_name_type = 7
        except Exception:
            # Any failure (for example, input that is not valid UTF-8) is reported as 9
            domain_name_type = 9
        return domain_name_type
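The function above depends on Python 2's `unicode` type and `print` statement, neither of which exists in Python 3. As a rough sketch only, and not the author's code, the same classification could look like this in Python 3, where `str` is already Unicode and only `bytes` input needs decoding. The standalone function form, the use of `re.fullmatch` in place of `re.match('^...$')`, and the early returns are assumptions made for this illustration.

```python
import re

# Same character range as the original pattern: U+4E00..U+9FA5.
ZH_PATTERN = re.compile('[\u4e00-\u9fa5]')


def check_domain_name_type(domain_name_prefix):
    """Classify a domain-name prefix.

    Returns 1 (letters), 2 (digits), 3 (letters and digits),
    4 (contains Chinese), 7 (contains a hyphen), 0 (none of these),
    or 9 (input could not be handled).
    """
    # In Python 3 only bytes need decoding; str is already Unicode text.
    if isinstance(domain_name_prefix, bytes):
        try:
            domain_name_prefix = domain_name_prefix.decode('utf-8')
        except UnicodeDecodeError:
            return 9
    if not isinstance(domain_name_prefix, str):
        return 9

    if re.fullmatch('[a-z]+', domain_name_prefix):
        return 1
    if re.fullmatch('[0-9]+', domain_name_prefix):
        return 2
    if re.fullmatch('[0-9a-z]+', domain_name_prefix):
        return 3
    if ZH_PATTERN.search(domain_name_prefix):
        return 4
    if '-' in domain_name_prefix:
        return 7
    return 0


if __name__ == '__main__':
    for sample in ['abcxyz', '098', 'abc123', 'abc中国123',
                   'abc中国123'.encode('utf-8'), 'abc-123']:
        print(sample, '\t\t', check_domain_name_type(sample))
```

The order of the checks is kept from the original, so a prefix that contains both Chinese characters and a hyphen is still reported as 4 rather than 7.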
Test example:
if __name__ == '__main__':
    # YGDTime is defined elsewhere (not shown in this post)
    print("get_local_ip: " + YGDTime.get_local_ip())

    testStr = 'abcxyz'
    print testStr, "\t\t", YGCommon.check_domain_name_type(testStr)

    testStr = '098'
    print testStr, "\t\t", YGCommon.check_domain_name_type(testStr)

    testStr = 'abc123'
    print testStr, "\t\t", YGCommon.check_domain_name_type(testStr)

    testStr = u'abc中国123'
    print testStr, "\t\t", YGCommon.check_domain_name_type(testStr)

    testStr = 'abc中国123'
    print testStr, "\t\t", YGCommon.check_domain_name_type(testStr)

    testStr = 'abc-123'
    print testStr, "\t\t", YGCommon.check_domain_name_type(testStr)
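Output (the function prints its own debug line partway through each test line, so the return value appears on the following line):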
abcxyz abcxyz no unicode
1
098 098 no unicode
2
abc123 abc123 no unicode
3
abc中国123 abc中国123 unicode
4
abc中国123 abc中国123 no unicode
4
abc-123 abc-123 no unicode
7
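Type 4 in the results above means the string contains at least one Chinese character, because the pattern is applied with `search`. The short Python 3 sketch below contrasts "contains Chinese" with "consists only of Chinese", using the same `[\u4e00-\u9fa5]` range; the helper names `contains_chinese` and `is_all_chinese` are made up for this illustration. Note that this range does not cover every CJK character: for example, U+3400 (CJK Extension A) and the full stop `。` (U+3002) fall outside it.

```python
import re

# Same range as the article's pattern.
ZH_CHAR = re.compile('[\u4e00-\u9fa5]')
ZH_ONLY = re.compile('[\u4e00-\u9fa5]+')


def contains_chinese(text):
    """True if at least one character falls in U+4E00..U+9FA5."""
    return ZH_CHAR.search(text) is not None


def is_all_chinese(text):
    """True if the string is non-empty and every character is in the range."""
    return ZH_ONLY.fullmatch(text) is not None


if __name__ == '__main__':
    for sample in ['abc中国123', '中国', 'abc123', '\u3400', '\u3002']:
        print(repr(sample), contains_chinese(sample), is_all_chinese(sample))
```

If characters outside this range matter for a given use case, the character class would need to be widened accordingly.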
Application example
Mimvp Domain (米扑域名): https://domain.mimvp.com
Recommended reading:
The differences between ASCII, GB2312, Unicode, and UTF-8 in Python
Copyright: This article was published on the Mimvp blog (米扑博客) as original, reposted, excerpted, or revised content. Last updated 2020-11-17 20:27:32.