A Regular Expression for an html2RSS Service

I have had the idea for h2R (html2RSS) for a long time, and today I finally started working on it.
It relies mainly on regular-expression text matching:

    XmlNode channles = root.FirstChild;

    Regex r;
    Match m;

    r = new Regex("href\\s*=\\s*(?:\"(?<1>[^\"]*)\"|(?<1>\\S+))\\s+\\S+\\s+title\\s*=\\s*(?:\"(?<2>[^\"]*)\"|(?<2>\\S+))", RegexOptions.IgnoreCase | RegexOptions.Compiled);

    for (m = r.Match(str); m.Success; m = m.NextMatch())
    {
        // rst += "link=" + m.Groups[1] + "\ntitle=" + m.Groups[2] + "\n";
        XmlElement oitem = xml.CreateElement("item");

        // <title> comes from capture group 2
        XmlElement o = xml.CreateElement("title");
        o.InnerText = m.Groups[2].Value;
        oitem.AppendChild(o);

        // <link> comes from capture group 1
        o = xml.CreateElement("link");
        o.InnerText = m.Groups[1].Value;
        oitem.AppendChild(o);

        channles.AppendChild(oitem);
    }
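
The loop above relies on `xml` and `root`, which the post does not show. The following is only a minimal sketch of how they might be set up, assuming a bare `<rss><channel>` skeleton; the class name `RssSkeleton`, the methods `CreateFeed` and `AddItem`, and the sample title and link are hypothetical and not part of the original code.

```csharp
using System;
using System.Xml;

public static class RssSkeleton
{
    // Creates an empty feed: an <rss version="2.0"> root with one <channel> child.
    // This layout is an assumption; the post does not show how its document is built.
    public static XmlDocument CreateFeed()
    {
        var xml = new XmlDocument();
        XmlElement root = xml.CreateElement("rss");
        root.SetAttribute("version", "2.0");
        xml.AppendChild(root);
        root.AppendChild(xml.CreateElement("channel"));
        return xml;
    }

    // Appends <item><title>...</title><link>...</link></item> to the channel,
    // mirroring the body of the loop in the post.
    public static void AddItem(XmlDocument xml, string title, string link)
    {
        XmlNode channel = xml.DocumentElement.FirstChild;
        XmlElement item = xml.CreateElement("item");

        XmlElement o = xml.CreateElement("title");
        o.InnerText = title;
        item.AppendChild(o);

        o = xml.CreateElement("link");
        o.InnerText = link;
        item.AppendChild(o);

        channel.AppendChild(item);
    }

    public static void Main()
    {
        XmlDocument xml = CreateFeed();
        AddItem(xml, "Sample notice", "/news/1?opendocument");
        Console.WriteLine(xml.OuterXml);
    }
}
```

Setting `InnerText` rather than `InnerXml` means characters such as `&` or `<` in a title are escaped automatically when the document is serialized.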

Example str:

    <tr><td><tr height=19><td align=center width=14><img src=/icons/info/dot_h.gif width=5 height=5></td><td align=left><a href=/zzh/30630.nsf/(AllDocsByUnid)/C81ECBA70F9A8795C82570990031DE28?opendocument target=_blank title="IC card student card and paper student card lost property list">IC card student card and paper student card lost property list</a></td><td align=right width=80><font color=#000066>10-13 18:04</font></td></tr>
    <tr height=19><td align=center width=14><img src=/icons/info/dot_h.gif width=5 height=5></td><td align=left><a href=/zzh/30630.nsf/(AllDocsByUnid)/81BF13BCCAB992A1C825709900300465?opendocument target=_blank title="Notice on applying for the “Excellent SRT Program Award”">Notice on the “Excellent SRT Program Award”...</a></td><td align=right width=80><font color=#000066>10-13 17:44</font></td></tr>
    <tr height=19><td align=center width=14><img src=/icons/info/dot_h.gif width=5 height=5></td><td align=left><a href=/zzh/30630.nsf/(AllDocsByUnid)/713C777073ED05DBC8257099002FE71B?opendocument target=_blank title="New round of SRT project application notice">New round of SRT project application notice</a></td>

How the regular expression works:
1. href\\s*=\\s*
Matches href. There may be no whitespace on either side of the =, or there may be several spaces.
2. (?:\"(?<1>[^\"]*)\"|(?<1>\\S+))
Captures data item 1, the link, whether or not it is enclosed in quotation marks.
3. \\s+\\S+\\s+
Matches at least one whitespace character, followed by at least one non-whitespace character, followed by at least one whitespace character.
In practice, what this matches is target=_blank.
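
To make the three parts described above concrete, here is a small sketch that runs the same pattern against a single row and returns the captured link and title. The class name `LinkExtractor`, the method `Extract`, and the sample HTML are assumptions made up for illustration; only the pattern itself comes from the post. It is written as a verbatim string (`@"..."`), where `""` stands for one double quote and backslashes are not doubled.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkExtractor
{
    // Same pattern as in the post, split into the parts explained above.
    private static readonly Regex LinkPattern = new Regex(
        @"href\s*=\s*" +                                // part 1: href and the = sign
        @"(?:""(?<1>[^""]*)""|(?<1>\S+))" +             // part 2: link, quoted or unquoted
        @"\s+\S+\s+" +                                  // part 3: one attribute in between
        @"title\s*=\s*(?:""(?<2>[^""]*)""|(?<2>\S+))",  // title, quoted or unquoted
        RegexOptions.IgnoreCase);

    // Returns (link, title) pairs, one per match found in the given HTML.
    public static List<KeyValuePair<string, string>> Extract(string html)
    {
        var result = new List<KeyValuePair<string, string>>();
        for (Match m = LinkPattern.Match(html); m.Success; m = m.NextMatch())
        {
            result.Add(new KeyValuePair<string, string>(
                m.Groups[1].Value, m.Groups[2].Value));
        }
        return result;
    }

    public static void Main()
    {
        // Hypothetical input, shaped like one row of the example str.
        string html = "<td align=left><a href=/news/1?opendocument " +
                      "target=_blank title=\"Sample notice\">Sample notice</a></td>";
        foreach (KeyValuePair<string, string> pair in Extract(html))
        {
            Console.WriteLine("link=" + pair.Key);
            Console.WriteLine("title=" + pair.Value);
        }
    }
}
```

Note that part 3 accepts exactly one whitespace-free attribute between href and title, so a row whose attributes appear in a different order, or with a different number of attributes in between, would not match.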

The work is still in progress.
I have found that regular expressions are remarkably powerful: they are practically the SQL of text processing, and even more powerful than SQL!
Right now I cannot tell whether I am learning Regex in order to build the h2R service, or building the h2R service as an exercise for learning Regex.
Either way is fine.

Reposted from: https://www.cnblogs.com/civ3/archive/2005/10/16/256119.html


Origin blog.csdn.net/weixin_33775572/article/details/93571508