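The core of my recent project is POI (point of interest) data from the Baidu API, so the project has to call the Baidu API. This post walks through the whole process, the problems I ran into, and how I worked around them.
Start with the Baidu API documentation: http://lbsyun.baidu.com/index.php?title=webapi The project mainly uses the Place API and the coordinate conversion API.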
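As soon as I started calling the API I hit a major problem: a query returns at most 400 results. In other words, when crawling the POI data of an area, you get back at most 400 results, and which 400 appears to be arbitrary. The Place API has a tag parameter, so my first thought was to crawl at a finer granularity by splitting the queries by tag. However, the Place API offers three kinds of region search: within a city, within a rectangle, and within a circle. A city-wide search covers far too large an area, so tags alone clearly cannot get around the result limit; that leaves the rectangle and circle searches. At that point I thought of the GIS package ArcGIS, which can tile an area into blocks so that each small block can be visited in turn. I divided the whole city into smaller cells (600 m × 600 m squares) and exported the geodetic coordinates of each cell's boundary.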
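The tiling in this project was done in ArcGIS, but the idea can be sketched in a few lines of Java. Everything below is illustrative: GridSplitter, Cell, and split are hypothetical names that do not exist in the project, the bounding box in main is arbitrary, and the conversion from metres to degrees assumes roughly 111,320 m per degree of latitude, scaled by the cosine of the latitude for longitude.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original project.
final class GridSplitter {

    /** One grid cell: south-west (min) and north-east (max) corners in degrees. */
    static final class Cell {
        final double minLat, minLng, maxLat, maxLng;

        Cell(double minLat, double minLng, double maxLat, double maxLng) {
            this.minLat = minLat;
            this.minLng = minLng;
            this.maxLat = maxLat;
            this.maxLng = maxLng;
        }
    }

    // Assumed approximation: metres per degree of latitude.
    private static final double METERS_PER_DEGREE_LAT = 111_320.0;

    /** Splits a bounding box into cells of roughly cellMeters x cellMeters. */
    static List<Cell> split(double minLat, double minLng,
                            double maxLat, double maxLng, double cellMeters) {
        List<Cell> cells = new ArrayList<>();
        double latStep = cellMeters / METERS_PER_DEGREE_LAT;
        // A degree of longitude shrinks with cos(latitude); use the box's middle latitude.
        double midLat = Math.toRadians((minLat + maxLat) / 2.0);
        double lngStep = cellMeters / (METERS_PER_DEGREE_LAT * Math.cos(midLat));

        int rows = (int) Math.ceil((maxLat - minLat) / latStep);
        int cols = (int) Math.ceil((maxLng - minLng) / lngStep);
        for (int r = 0; r < rows; r++) {
            double lat0 = minLat + r * latStep;
            double lat1 = Math.min(lat0 + latStep, maxLat); // clip the last row
            for (int c = 0; c < cols; c++) {
                double lng0 = minLng + c * lngStep;
                double lng1 = Math.min(lng0 + lngStep, maxLng); // clip the last column
                cells.add(new Cell(lat0, lng0, lat1, lng1));
            }
        }
        return cells;
    }

    public static void main(String[] args) {
        // Arbitrary example box; the real city bounds are not given in this post.
        List<Cell> cells = split(39.90, 116.30, 39.95, 116.40, 600.0);
        System.out.println(cells.size() + " cells");
    }
}
```

Each cell's two corners are what a rectangle search needs as its bounds. A rectangular grid like this also covers area outside an irregular city boundary, which is one reason a later filtering pass is useful.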
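Because Baidu Maps POIs use the Baidu coordinate system, the geodetic coordinates have to be converted to Baidu coordinates first. The conversion service is at http://api.map.baidu.com/geoconv/v1/ and its parameters are described in the Baidu API documentation.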
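As a rough illustration of the conversion request, the sketch below only builds the URL and does not send it. GeoconvUrl and build are hypothetical names. The parameter values are assumptions to verify against the documentation: coords in longitude,latitude order, from=1 for WGS84 (GPS) input, and to=5 for Baidu longitude/latitude output. It also assumes the geodetic coordinates exported from ArcGIS are WGS84.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Locale;

// Hypothetical helper, not part of the original project.
final class GeoconvUrl {

    /** Builds a geoconv request URL for a single point (assumed parameter values). */
    static String build(double lng, double lat, String ak) {
        // Locale.ROOT keeps '.' as the decimal separator on every machine.
        String coords = String.format(Locale.ROOT, "%.6f,%.6f", lng, lat);
        try {
            return "http://api.map.baidu.com/geoconv/v1/?coords="
                    + URLEncoder.encode(coords, "UTF-8")
                    + "&from=1&to=5&output=json&ak="
                    + URLEncoder.encode(ak, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "YOUR_AK" is a placeholder for a real access key.
        System.out.println(build(116.404, 39.915, "YOUR_AK"));
    }
}
```

The listing further down delegates this step to a ConvertCoordinate class whose source is not shown in this post, so the sketch should not be read as that class's implementation.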
The next step is to call the Baidu API. The Java code is as follows:
package com.baiduapi;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import com.baiduapi.ConvertCoordinate;
import com.bean.CommunityInf;
import com.bean.FoodInf;
import com.conndatabase.DBOperateFood;
import net.sf.json.JSONObject;
import net.sf.json.JSONArray;
import net.sf.json.JSONException;
public class GetFoodDetail {
String ak = "";
int count = 0;
int num = 0;
String query = null;
String data = null;
String la1 = null;
String ln1 = null;
String la2 = null;
String ln2 = null;
URL myURL = null;
InputStreamReader insr = null;
BufferedReader br = null;
CommunityInf comm = null;
String food_Uid;
String food_Name;
String food_Addr;
String food_Lat;
String food_Lng;
String food_StreetId;
String food_Tag;
String food_Type;
String food_Price;
String food_OverAllRating;
String food_TasteRating;
String food_ServiceRating;
String food_EnviorRating;
String food_ImageNum;
String food_GrouponNum;
String food_CommentNum;
String food_Location;
String detail = null;
public void getFood(String q, String lat1, String lng1, String lat2,String lng2, String region) throws IOException
// Fetch Baidu food POI data for one rectangular cell
{
try {
query = java.net.URLEncoder.encode(q, "UTF-8");
la1 = java.net.URLEncoder.encode(lat1, "UTF-8");
ln1 = java.net.URLEncoder.encode(lng1, "UTF-8");
la2 = java.net.URLEncoder.encode(lat2, "UTF-8");
ln2 = java.net.URLEncoder.encode(lng2, "UTF-8");
} catch (UnsupportedEncodingException e1) {
e1.printStackTrace();
}
for (int j = 0; j < 20; j++) {
String url = String
.format("http://api.map.baidu.com/place/v2/search?query=%s&page_size=20&page_num="
+ j + "&bounds=%s,%s,%s,%s&output=json&ak=%s",
query, la1, ln1, la2, ln2, ak);
// System.out.println(url);
URLConnection conn = null;
try {
myURL = new URL(url);
} catch (MalformedURLException e) {
e.printStackTrace();
}
try {
conn = (URLConnection) myURL.openConnection();
if (conn != null) {
insr = new InputStreamReader(conn.getInputStream(), "UTF-8");
br = new BufferedReader(insr);
StringBuilder sb = new StringBuilder("");
while ((data = br.readLine()) != null) {
System.out.println(data);
sb.append(data.trim());
}
String str = sb.toString();
JSONObject jsonObj = JSONObject.fromObject(str);
// System.out.println(jsonObj);
num = num + 1;
System.out.println("#" + num);
if (jsonObj.getInt("total") == 0) {
break;
}
JSONArray results = jsonObj.getJSONArray("results");
// System.out.println(results.size());
for (int i = 0; i < results.size(); i++) {
JSONObject results0 = results.getJSONObject(i);
food_Uid = results0.getString("uid");
getDetail(food_Uid, region);
}
}
} catch (IOException e) {
e.printStackTrace();
} finally {
if (insr != null) {
insr.close();
}
if (br != null) {
br.close();
}
}
}
}
private void getDetail(String uid, String region) throws IOException,
net.sf.json.JSONException {
// Fetch the detail record of a food POI, parse it, and store it in the local database
String detail_url = String
.format("http://api.map.baidu.com/place/v2/detail?uid=%s&output=json&scope=2&ak=%s",
uid, ak);
URLConnection conn = null;
try {
myURL = new URL(detail_url);
} catch (MalformedURLException e) {
e.printStackTrace();
}
try {
conn = (URLConnection) myURL.openConnection();
if (conn != null) {
insr = new InputStreamReader(conn.getInputStream(), "UTF-8");
br = new BufferedReader(insr);
StringBuilder sb = new StringBuilder();
while ((data = br.readLine()) != null) {
System.out.println(data);
sb.append(data.trim());
}
String detail = sb.toString();
System.out.println("##" + detail);
try {
JSONObject results0 = JSONObject.fromObject(detail)
.getJSONObject("result");
food_Name = results0.getString("name");
food_Addr = results0.getString("address");
try {
// Assign the value; the original read it and discarded the result.
food_StreetId = results0.getString("street_id");
} catch (JSONException jsonExp) {
food_StreetId = "-";
}
try {
// Read lat/lng by key instead of slicing the serialized string by position,
// which depended on the order of the keys inside "location".
JSONObject location = results0.getJSONObject("location");
food_Lng = location.getString("lng");
food_Lat = location.getString("lat");
} catch (JSONException jsonExp) {
food_Lng = "-";
food_Lat = "-";
}
// A missing detail_info throws JSONException; the outer catch then fills in "-" defaults.
detail = results0.getString("detail_info");
JSONObject jsonObj1 = JSONObject.fromObject(detail);
// System.out.println(jsonObj1);
// optString returns the "-" default when a key is absent, replacing one try/catch per field.
food_ImageNum = jsonObj1.optString("image_num", "-");
food_Price = jsonObj1.optString("price", "-");
food_Tag = jsonObj1.optString("tag", "-");
food_Type = jsonObj1.optString("type", "-");
food_OverAllRating = jsonObj1.optString("overall_rating", "-");
food_TasteRating = jsonObj1.optString("taste_rating", "-");
food_ServiceRating = jsonObj1.optString("service_rating", "-");
food_EnviorRating = jsonObj1.optString("environment_rating", "-");
food_GrouponNum = jsonObj1.optString("groupon_num", "-");
food_CommentNum = jsonObj1.optString("comment_num", "-");
} catch (JSONException jsonExp) {
food_ImageNum = "-";
food_Price = "-";
food_Tag = "-";
food_Type = "-";
food_OverAllRating = "-";
food_TasteRating = "-";
food_ServiceRating = "-";
food_EnviorRating = "-";
food_GrouponNum = "-";
food_CommentNum = "-";
}
FoodInf food = new FoodInf();
food.setFood_Addr(food_Addr);
food.setFood_CommentNum(food_CommentNum);
food.setFood_EnviorRating(food_EnviorRating);
food.setFood_GrouponNum(food_GrouponNum);
food.setFood_ImageNum(food_ImageNum);
food.setFood_Lat(food_Lat);
food.setFood_Lng(food_Lng);
food.setFood_Name(food_Name);
food.setFood_OverAllRating(food_OverAllRating);
food.setFood_Price(food_Price);
food.setFood_ServiceRating(food_ServiceRating);
food.setFood_StreetId(food_StreetId);
food.setFood_Tag(food_Tag);
food.setFood_TasteRating(food_TasteRating);
food.setFood_Type(food_Type);
food.setFood_Uid(uid);
food.setFood_region(region);
// System.out.println(community.getCommunity_Uid()+"#"+community.getCommunity_Name()
// + "#"
// +community.getCommunity_Addr()+"#"+community.getCommunity_StreetId()+"#"+community.getCommunity_Lng()+"#"+community.getCommunity_Lat()+"#"+community.getCommunity_Image_Num()+"#"+community.getCommunity_Price()+"#"+community.getCommunity_Tag()+"#"+community.getCommunity_Type()+"#"+community.getCommunity_Overall_Rating()+"#"+community.getCommunity_Region());
DBOperateFood operate = new DBOperateFood();
operate.inserData(food);
}
} catch (IOException e) {
e.printStackTrace();
} finally {
if (insr != null) {
insr.close();
}
if (br != null) {
br.close();
}
}
}
public void setCoor(String la1, String ln1, String la2, String ln2,
String region) throws IOException {
// Convert both corners of the cell to the Baidu coordinate system
ConvertCoordinate coor = new ConvertCoordinate();
Object[] llocation = coor.getCoordinate(la1, ln1);
String lat1 = llocation[0].toString();
String lng1 = llocation[1].toString();
Object[] rlocation = coor.getCoordinate(la2, ln2);
String lat2 = rlocation[0].toString();
String lng2 = rlocation[1].toString();
// System.out.println(lat1+"#"+lng1+"#"+lat2+"#"+lng2);
GetFoodDetail foodDetail = new GetFoodDetail();
// "美食" is the Chinese search keyword for food
foodDetail.getFood("美食", lat1, lng1, lat2, lng2, region);
}
}
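The j < 20 loop with page_size=20 in getFood is where the 400-result ceiling shows up in the code: 20 pages of 20 results each. A small helper can make that explicit and flag cells that probably need further splitting. This is only a sketch: PagePlanner and its methods are hypothetical names, and it assumes the total field in the search response is a usable estimate of the number of matches, which should be checked against real responses.

```java
// Hypothetical helper, not part of the original project.
final class PagePlanner {
    static final int PAGE_SIZE = 20;              // page_size used in getFood
    static final int MAX_PAGES = 20;              // loop bound used in getFood
    static final int CAP = PAGE_SIZE * MAX_PAGES; // 400 results at most

    /** Number of pages worth requesting for a reported total, never more than MAX_PAGES. */
    static int pagesToFetch(int total) {
        if (total <= 0) {
            return 0;
        }
        int pages = (total + PAGE_SIZE - 1) / PAGE_SIZE; // integer ceiling
        return Math.min(pages, MAX_PAGES);
    }

    /** True when the cell may contain more POIs than the API will hand back. */
    static boolean mayBeTruncated(int total) {
        return total >= CAP;
    }

    public static void main(String[] args) {
        int total = 45; // example value, as if read from the first page of a response
        System.out.println("pages: " + pagesToFetch(total)
                + ", truncated: " + mayBeTruncated(total));
    }
}
```

Reading total from the first page and stopping after pagesToFetch(total) pages would avoid requesting empty pages, and cells where mayBeTruncated returns true can be logged for a second pass.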
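The code is fairly rough. It implements the overall functionality, but a few issues remain to watch for and resolve:
1. The regions produced by ArcGIS are not perfectly accurate, and cells along the edge may spill over the boundary, so the crawled data needs a further filtering pass.
2. Even within a 600 m × 600 m cell, a query may match more than 400 POIs. The only workaround I can think of so far is to split the crawl by POI tag as far as possible.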
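For the first point, one common way to do the filtering pass is a point-in-polygon test against the city boundary. The sketch below uses the standard ray-casting technique. BoundaryFilter and contains are hypothetical names, and it assumes the boundary is available as a simple polygon whose vertices are in the same coordinate system as the POIs. Where that boundary comes from is outside this post.

```java
// Hypothetical helper, not part of the original project.
final class BoundaryFilter {

    /**
     * Ray casting: counts how many polygon edges a ray going east from the
     * point crosses; an odd count means the point is inside.
     * lngs[i], lats[i] are the polygon's vertices in order.
     */
    static boolean contains(double[] lngs, double[] lats, double lng, double lat) {
        boolean inside = false;
        for (int i = 0, j = lngs.length - 1; i < lngs.length; j = i++) {
            // Does edge j -> i straddle the point's latitude?
            boolean crosses = (lats[i] > lat) != (lats[j] > lat);
            if (crosses) {
                // Longitude at which the edge reaches the point's latitude.
                double lngAtLat = lngs[j]
                        + (lat - lats[j]) * (lngs[i] - lngs[j]) / (lats[i] - lats[j]);
                if (lng < lngAtLat) {
                    inside = !inside;
                }
            }
        }
        return inside;
    }

    public static void main(String[] args) {
        // Unit square used as a stand-in for a real boundary.
        double[] lngs = {0.0, 1.0, 1.0, 0.0};
        double[] lats = {0.0, 0.0, 1.0, 1.0};
        System.out.println(contains(lngs, lats, 0.5, 0.5));
        System.out.println(contains(lngs, lats, 1.5, 0.5));
    }
}
```

POIs for which contains returns false can be dropped after the crawl. Points lying exactly on an edge may fall on either side, which is usually acceptable for this kind of cleanup.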