
Regex greedy mode: how do I make sense of it? It always matches everything that follows in one go

2017-09-07 09:05:24  Source: shared by a user
t=edc9624908fac06b969fc8b3454d4d75; cna=rz39DUM8tSgCAXLjRlx1SJH8; l=Ah8fI61XmUXv5/-aXUJ9Fiy3732phHMm; lzstat_uv=7058553501377199257|1774292@1774054; cookie2=fa6653f49ed68b38b469988a4f1a2ac7; _tb_token_=obU42Ba54Lp; v=0; cookie32=d2874877ef98c1c142fc1ddff656fb50; cookie31=MTA5NzY3MjEsa2lzdGVhcixqc3djQDE2My5jb20sbnVsbA%3D%3D; alimamapwag=TW96aWxsYS80LjAgKGNvbXBhdGlibGU7IE1TSUUgNy4wOyBXaW5kb3dzIE5UIDYuMjsgV09XNjQ7IFRyaWRlbnQvNy4wOyAuTkVUNC4wQzsgLk5FVDQuMEU7IC5ORVQgQ0xSIDIuMC41MDcyNzsgLk5FVCBDTFIgMy4wLjMwNzI5OyAuTkVUIENMUiAzLjUuMzA3Mjkp; login=W5iHLLyFOGW7aA%3D%3D; alimamapw=DggWRANZRTAHBVcABwAEBQRbUFZcAgVeB1oEBwAHU1AEA1INVAUAUg%3D%3D

From the cookies above, I want to use a regex to match the value _tb_token_=obU42Ba54Lp;
but with my expression it matched the whole long run that follows as well,
and the result became _tb_token_=obU42Ba54Lp; v=0; cookie32=d2874877ef98c1c142fc1ddff656fb50; cookie31=MTA5NzY3MjEsa2lzdGVhcixqc3djQDE2My5jb20sbnVsbA%3D%3D; alimamapwag=TW96aWxsYS80LjAgKGNvbXBhdGlibGU7IE1TSUUgNy4wOyBXaW5kb3dzIE5UIDYuMjsgV09XNjQ7IFRyaWRlbnQvNy4wOyAuTkVUNC4wQzsgLk5FVDQuMEU7IC5ORVQgQ0xSIDIuMC41MDcyNzsgLk5FVCBDTFIgMy4wLjMwNzI5OyAuTkVUIENMUiAzLjUuMzA3Mjkp; login=W5iHLLyFOGW7aA%3D%3D; alimamapw=DggWRANZRTAHBVcABwAEBQRbUFZcAgVeB1oEBwAHU1AEA1INVAUAUg%3D%3D
I know \W can handle it, but if I insist on using .*, how should I write it so that it only matches up to the first ;?

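Solution

1. Either findall or search works; pay attention to what each method returns
2. When writing the pattern, mind the difference between greedy and non-greedy matching
3. A heredoc (triple-quoted string) is a good choice when the string is long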
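
Points 1 and 2 can be illustrated with a minimal sketch. The cookie string below is a shortened, made-up stand-in (not the one from the question), and the variable names are only for illustration. The greedy `.*` runs to the end of the string and backtracks to the last `;`, while the non-greedy `.*?` stops at the first `;` it can.

```python
import re

# Shortened, hypothetical cookie string used only for illustration.
sample = "a=1; _tb_token_=abc123; v=0; login=xyz;"

# Greedy: .* grabs as much as possible, so the match ends at the LAST ';'.
greedy = re.search(r"_tb_token_=(.*);", sample)
# Non-greedy: .*? grabs as little as possible, so the match ends at the FIRST ';'.
lazy = re.search(r"_tb_token_=(.*?);", sample)

# search returns a match object (or None); group(1) is the captured value.
greedy_value = greedy.group(1)
lazy_value = lazy.group(1)

# findall returns a list of captured groups (an empty list if nothing matches).
all_values = re.findall(r"_tb_token_=(.*?);", sample)

# A pattern that does not occur: search gives None, so test before calling group().
missing = re.search(r"nope=(.*?);", sample)

print(greedy_value)
print(lazy_value)
print(all_values)
print(missing)
```

Because search can return None and findall can return an empty list, both results should be checked before they are indexed or before group() is called.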

import re

cookies = """t=edc9624908fac06b969fc8b3454d4d75; cna=rz39DUM8tSgCAXLjRlx1SJH8; l=Ah8fI61XmUXv5/-aXUJ9Fiy3732phHMm; lzstat_uv=7058553501377199257|1774292@1774054; cookie2=fa6653f49ed68b38b469988a4f1a2ac7; _tb_token_=obU42Ba54Lp; v=0; cookie32=d2874877ef98c1c142fc1ddff656fb50; cookie31=MTA5NzY3MjEsa2lzdGVhcixqc3djQDE2My5jb20sbnVsbA%3D%3D; alimamapwag=TW96aWxsYS80LjAgKGNvbXBhdGlibGU7IE1TSUUgNy4wOyBXaW5kb3dzIE5UIDYuMjsgV09XNjQ7IFRyaWRlbnQvNy4wOyAuTkVUNC4wQzsgLk5FVDQuMEU7IC5ORVQgQ0xSIDIuMC41MDcyNzsgLk5FVCBDTFIgMy4wLjMwNzI5OyAuTkVUIENMUiAzLjUuMzA3Mjkp; login=W5iHLLyFOGW7aA%3D%3D; alimamapw=DggWRANZRTAHBVcABwAEBQRbUFZcAgVeB1oEBwAHU1AEA1INVAUAUg%3D%3D"""

# The non-greedy (.*?) stops at the first ';' after the token value.
matches = re.findall(r"_tb_token_=(.*?);", cookies, re.S)

# findall returns an empty list when nothing matches, so check before indexing.
if matches:
    print(matches[0])
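
As a side note, and not what the question asked for (it insisted on .*), a common alternative is a negated character class. `[^;]*` can never cross a `;`, so greediness stops mattering, and no trailing `;` is required, which helps when the cookie is the last pair in the string. The helper name `get_cookie` and the sample string are hypothetical, chosen only for this sketch.

```python
import re

def get_cookie(cookies, name):
    """Return the value of cookie `name`, or None if it is absent (hypothetical helper)."""
    # (?:^|;\s*) anchors the name at the start of a pair, so "v" cannot
    # match inside another cookie's name. [^;]* stops before the next ';'.
    pattern = r"(?:^|;\s*)" + re.escape(name) + r"=([^;]*)"
    match = re.search(pattern, cookies)
    return match.group(1) if match else None

# Shortened, made-up cookie string; note there is no ';' after the last pair.
sample = "a=1; _tb_token_=abc123; v=0; login=xyz"

token = get_cookie(sample, "_tb_token_")
last = get_cookie(sample, "login")
absent = get_cookie(sample, "missing")

print(token)
print(last)
print(absent)
```

The non-greedy pattern `_tb_token_=(.*?);` would find nothing for `login` in this sample, because no `;` follows its value.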