C# Web Crawler Basics

URL encoding and decoding

  • First, add a reference to the System.Web.dll assembly.

To decode the value of a URL parameter, call the following method:
System.Web.HttpUtility.UrlDecode(string)
To encode the value of a URL parameter:
string s = "[1,2]";
string result = System.Web.HttpUtility.UrlEncode(s);
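The two calls above can be combined into a quick round trip to check that a value survives encoding and decoding. This is a minimal sketch; the class name UrlCodecDemo and the method RoundTrip are made up for illustration and do not come from the original post.

```csharp
using System;
using System.Web;

public static class UrlCodecDemo
{
    // Hypothetical helper: encodes a parameter value, prints the
    // encoded form, then decodes it back to the original text.
    public static string RoundTrip(string value)
    {
        string encoded = HttpUtility.UrlEncode(value);
        Console.WriteLine("encoded: " + encoded);
        return HttpUtility.UrlDecode(encoded);
    }

    public static void Main()
    {
        string original = "[1,2]";
        string decoded = RoundTrip(original);
        // The decoded value should match the input exactly.
        Console.WriteLine(decoded == original);
    }
}
```

Note that HttpUtility.UrlEncode turns a space into "+", and UrlDecode reverses that, so values containing spaces also round-trip.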

Using HttpWebRequest and HttpWebResponse

string url = "https://www.baidu.com"; // WebRequest.Create needs an absolute URI, scheme included
HttpWebRequest request = WebRequest.Create(url) as HttpWebRequest;
// request.Accept = ... (set as your case requires)
// request.Method = ... (set as your case requires)
using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
using (Stream s = response.GetResponseStream())
using (StreamReader reader = new StreamReader(s))
{
    // Leaving the using blocks disposes the reader, stream, and response, which also closes them.
    string data = reader.ReadToEnd();
}

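Note that both Stream and HttpWebResponse implement the IDisposable interface, so wrap them in a using statement or call Dispose() yourself. If you do not use using, remember to call Close() on each of them once you are done, so that the connection is released.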
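The request/response pattern can be packaged into a small helper. The following is only a sketch under a few assumptions: the names PageFetcher, ReadAll and Fetch are hypothetical, the response body is assumed to be UTF-8, and the 10-second timeout is an arbitrary choice.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class PageFetcher
{
    // Reads the whole stream as text (UTF-8 is an assumption).
    // Disposing the StreamReader also closes the underlying stream.
    public static string ReadAll(Stream stream)
    {
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    // Downloads the body at the given absolute URL.
    // The using blocks dispose the response and its stream even
    // if reading throws, so no explicit Close() call is needed.
    public static string Fetch(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        request.Timeout = 10000; // milliseconds; arbitrary value for this sketch

        using (var response = (HttpWebResponse)request.GetResponse())
        using (Stream body = response.GetResponseStream())
        {
            return ReadAll(body);
        }
    }
}
```

Calling PageFetcher.Fetch("https://www.baidu.com") would return the page source as a string. GetResponse() throws a WebException on network errors or non-success status codes, so a real crawler would probably wrap the call in a try/catch.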

Parsing HTML with Html Agility Pack

Reposted from www.cnblogs.com/Laggage/p/10740012.html