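GeoHash: Principles and a Java Implementation
How GeoHash works
GeoHash is one of the mainstream techniques for building location-based services today. The algorithm encodes a two-dimensional longitude/latitude pair as a single string, which is essentially a dimensionality reduction.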
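Sample data (each coordinate bisected 15 times)
Location | Longitude, latitude | Geohash |
---|---|---|
Beijing Railway Station | 116.433589, 39.910508 | wx4g19 |
Tiananmen | 116.403874, 39.913884 | wx4g0f |
Beijing Capital Airport | 116.606819, 40.086109 | wx4uj3 |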
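The idea behind GeoHash
Longitude runs from 180° east to 180° west, and latitude from 90° south to 90° north. Treating west longitude and south latitude as negative, longitude on Earth spans [-180, 180] and latitude spans [-90, 90]. Using the prime meridian and the equator as boundaries, the Earth can be divided into four parts.
The idea behind GeoHash is to map these four parts onto a two-dimensional coordinate plane.
For latitude, [-90°, 0°) maps to 0 and [0°, 90°] maps to 1; for longitude, [-180°, 0°) maps to 0 and [0°, 180°] maps to 1.
Mapped onto a two-dimensional plane, the four parts look like the figure below.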
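As a rough illustration of this first split, the sketch below maps a coordinate to its two-bit quadrant code. The names QuadrantDemo and quadrant are hypothetical, and putting the longitude bit first is an assumption chosen to match the merge order the article uses later.

```java
final class QuadrantDemo {
    /** Returns the longitude bit followed by the latitude bit after one split. */
    static String quadrant(double lat, double lon) {
        char lonBit = lon >= 0 ? '1' : '0'; // [-180, 0) -> 0, [0, 180] -> 1
        char latBit = lat >= 0 ? '1' : '0'; // [-90, 0)  -> 0, [0, 90]  -> 1
        return "" + lonBit + latBit;
    }

    public static void main(String[] args) {
        System.out.println(quadrant(39.913884, 116.403874)); // Tiananmen: north-east
        System.out.println(quadrant(-33.87, 151.21));        // a south-east point
    }
}
```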
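Such a coarse division is of little use, though. To make GeoHash more precise, each part has to be bisected further.
As the figure above shows, further bisection turns the coarse division into a much finer one, so a position can be located more precisely. GeoHash is built on this idea: every additional split halves the width of a cell, so the more times the space is recursively divided, the more precise the result.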
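To get a feel for how quickly precision improves, the sketch below computes the width of one cell, in degrees, after a given number of bisections. CellSize and cellWidth are illustrative names and are not part of the article's code.

```java
final class CellSize {
    /** Width of one cell, in degrees, after bisecting a range `splits` times. */
    static double cellWidth(double rangeDegrees, int splits) {
        return rangeDegrees / Math.pow(2, splits);
    }

    public static void main(String[] args) {
        for (int splits : new int[]{1, 5, 10, 15}) {
            // Longitude covers 360 degrees in total, latitude covers 180 degrees.
            System.out.printf("splits=%d  lon=%.10f  lat=%.10f%n",
                    splits, cellWidth(360, splits), cellWidth(180, splits));
        }
    }
}
```

With 15 splits per coordinate, a cell spans roughly 0.011° of longitude and 0.0055° of latitude.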
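The three steps of the GeoHash algorithm
The GeoHash algorithm has three main steps: 1. compute the binary representation of the latitude and of the longitude; 2. merge the two bit sequences; 3. Base32-encode the merged bits.
- Compute the binary representation of the latitude and longitude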
// Computes the bit sequence for a coordinate within the given range
private BitSet getBits(double l, double floor, double ceiling) {
    BitSet buffer = new BitSet(numbits);
    for (int i = 0; i < numbits; i++) {
        double mid = (floor + ceiling) / 2;
        if (l >= mid) {
            buffer.set(i);   // upper half: bit is 1
            floor = mid;
        } else {
            ceiling = mid;   // lower half: bit stays 0
        }
    }
    return buffer;
}
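In the code above, numbits is declared as: private static int numbits = 3 * 5; // number of bits used to encode each coordinate
In other words, each coordinate range is bisected 15 times.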
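Note: the BitSet class deserves a closer look here, because the code is confusing if you have never used it.
To understand BitSet, it is enough to understand its set() and get() methods.
- BitSet's set method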
/**
 * Sets the bit at the specified index to {@code true}.
 *
 * @param bitIndex a bit index
 * @throws IndexOutOfBoundsException if the specified index is negative
 * @since JDK1.0
 */
public void set(int bitIndex) {
    if (bitIndex < 0)
        throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);
    int wordIndex = wordIndex(bitIndex);
    expandTo(wordIndex);
    words[wordIndex] |= (1L << bitIndex); // Restores invariants
    checkInvariants();
}
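Inside set, wordIndex(bitIndex) simply shifts bitIndex right by 6 bits and returns the result; ADDRESS_BITS_PER_WORD is the constant 6. Shifting right by 6 is a division by 64, the number of bits in one long word.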
/**
 * Given a bit index, return word index containing it.
 */
private static int wordIndex(int bitIndex) {
    return bitIndex >> ADDRESS_BITS_PER_WORD;
}
expandTo(wordIndex), also called from set, only checks whether the backing array needs to grow.
/**
 * Ensures that the BitSet can accommodate a given wordIndex,
 * temporarily violating the invariants. The caller must
 * restore the invariants before returning to the user,
 * possibly using recalculateWordsInUse().
 * @param wordIndex the index to be accommodated.
 */
private void expandTo(int wordIndex) {
    int wordsRequired = wordIndex+1;
    if (wordsInUse < wordsRequired) {
        ensureCapacity(wordsRequired);
        wordsInUse = wordsRequired;
    }
}
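The key line in set is words[wordIndex] |= (1L << bitIndex); only the |= operator needs explaining here. a |= b is shorthand for a = a | b: a and b are combined with a bitwise OR, so each result bit is 0 only when both input bits are 0, and 1 otherwise.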
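A small sketch of the same operation on a plain long may make this concrete. OrAssignDemo and setBit are hypothetical names used only for illustration. The last call also shows why BitSet can shift by bitIndex directly: for a long, Java uses only the low 6 bits of the shift distance.

```java
final class OrAssignDemo {
    /** Returns `word` with the bit at `bitIndex` set; all other bits are unchanged. */
    static long setBit(long word, int bitIndex) {
        word |= (1L << bitIndex); // same as: word = word | (1L << bitIndex)
        return word;
    }

    public static void main(String[] args) {
        long word = 0b0101L;
        word = setBit(word, 1);                          // 0101 | 0010 = 0111
        System.out.println(Long.toBinaryString(word));   // prints 111
        System.out.println(setBit(word, 0) == word);     // setting a set bit changes nothing: true
        System.out.println(setBit(0L, 70));              // shift distance 70 acts as 6: prints 64
    }
}
```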
- BitSet's get method
/**
 * Returns the value of the bit with the specified index. The value
 * is {@code true} if the bit with the index {@code bitIndex}
 * is currently set in this {@code BitSet}; otherwise, the result
 * is {@code false}.
 *
 * @param bitIndex the bit index
 * @return the value of the bit with the specified index
 * @throws IndexOutOfBoundsException if the specified index is negative
 */
public boolean get(int bitIndex) {
    if (bitIndex < 0)
        throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);
    checkInvariants();
    int wordIndex = wordIndex(bitIndex);
    return (wordIndex < wordsInUse)
        && ((words[wordIndex] & (1L << bitIndex)) != 0);
}
In one sentence: get returns true if the bit at the given index is set, and false otherwise.
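The following sketch exercises set() and get() on a small BitSet. BitSetDemo and toBitString are illustrative names. Note that length() reports the index of the highest set bit plus one, not the number of bits you intended to store, so loops over a fixed-width code should use their own bit count.

```java
import java.util.BitSet;

final class BitSetDemo {
    /** Renders the first `n` bits of `bits` as a string of '0' and '1'. */
    static String toBitString(BitSet bits, int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append(bits.get(i) ? '1' : '0');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet(5);
        bits.set(0);
        bits.set(2);
        System.out.println(toBitString(bits, 5)); // prints 10100
        System.out.println(bits.get(1));          // prints false
        System.out.println(bits.get(100));        // beyond the stored words: prints false
        System.out.println(bits.length());        // highest set bit + 1: prints 3
    }
}
```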
Take the coordinates of Tiananmen as an example (latitude 39.913884, longitude 116.403874):
double lat = 39.913884;
double lon = 116.403874;
BitSet latbits = getBits(lat, -90, 90);
BitSet lonbits = getBits(lon, -180, 180);
// Latitude
for (int i = 0; i < numbits; i++) {
    System.out.print(latbits.get(i) + " ");
}
// Longitude
for (int i = 0; i < numbits; i++) {
    System.out.print(lonbits.get(i) + " ");
}
The latitude converts to:
true false true true true false false false true true false false false true false
Written as bits:
1 0 1 1 1 0 0 0 1 1 0 0 0 1 0
The longitude converts to:
true true false true false false true false true true false false false true true
Written as bits:
1 1 0 1 0 0 1 0 1 1 0 0 0 1 1
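- Merge the latitude and longitude bits
Merge rule: longitude bits take the even positions and latitude bits take the odd positions. In other words, the two sequences are interleaved, and position 0 of the result is bit 0 of the longitude.
The merged bits, in groups of five, are:
11100 11101 00100 01111 00000 01110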
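The merge rule can be checked with a short sketch that interleaves the two bit strings from the Tiananmen example. InterleaveDemo and interleave are hypothetical names; bits are held as strings here purely for readability.

```java
final class InterleaveDemo {
    /** Interleaves two equal-length bit strings, longitude first. */
    static String interleave(String lonBits, String latBits) {
        if (lonBits.length() != latBits.length()) {
            throw new IllegalArgumentException("bit strings must have the same length");
        }
        StringBuilder merged = new StringBuilder(lonBits.length() * 2);
        for (int i = 0; i < lonBits.length(); i++) {
            merged.append(lonBits.charAt(i)); // even position: longitude
            merged.append(latBits.charAt(i)); // odd position: latitude
        }
        return merged.toString();
    }

    public static void main(String[] args) {
        String lon = "110100101100011"; // longitude bits of Tiananmen
        String lat = "101110001100010"; // latitude bits of Tiananmen
        System.out.println(interleave(lon, lat));
        // prints 111001110100100011110000001110
    }
}
```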
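- Base32-encode the merged bits
- Code implementation
// Interleaves the coordinate bits and Base32-encodes the result
public String encode(double lat, double lon) {
    BitSet latbits = getBits(lat, -90, 90);
    BitSet lonbits = getBits(lon, -180, 180);
    StringBuilder buffer = new StringBuilder();
    for (int i = 0; i < numbits; i++) {
        buffer.append((lonbits.get(i)) ? '1' : '0'); // even position: longitude
        buffer.append((latbits.get(i)) ? '1' : '0'); // odd position: latitude
    }
    StringBuilder code = new StringBuilder(base32(Long.parseLong(buffer.toString(), 2)));
    // Going through a long drops leading zero groups; left-pad to the full length
    while (code.length() < numbits * 2 / 5) {
        code.insert(0, '0');
    }
    return code.toString();
}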
The example coordinates used in this article encode to:
wx4g0f
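To make the Base32 step visible on its own, the sketch below maps each group of five bits to one character of the geohash alphabet. Base32Demo is a hypothetical class written for this illustration and is not the article's base32 method, which works on a long instead.

```java
final class Base32Demo {
    // The geohash alphabet: digits and lowercase letters without a, i, l and o.
    private static final String ALPHABET = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Encodes a bit string whose length is a multiple of 5, one character per 5 bits. */
    static String encode(String bits) {
        if (bits.length() % 5 != 0) {
            throw new IllegalArgumentException("bit length must be a multiple of 5");
        }
        StringBuilder out = new StringBuilder(bits.length() / 5);
        for (int i = 0; i < bits.length(); i += 5) {
            int value = Integer.parseInt(bits.substring(i, i + 5), 2); // 0..31
            out.append(ALPHABET.charAt(value));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // 11100 11101 00100 01111 00000 01110 -> 28 29 4 15 0 14
        System.out.println(encode("111001110100100011110000001110")); // prints wx4g0f
    }
}
```

Working group by group also keeps leading all-zero groups, which a numeric conversion would drop unless the result is padded.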
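A follow-up problem
Suppose we use this to implement a "people nearby" feature. If the red dot is the current user, then after GeoHash partitioning only the green dots in the same region, 0011, are recommended. As the figure below shows, however, the blue dot is closer to the user than the green dots are, which exposes the weakness of partitioning by region.
To address this, we can also fetch the users in the eight regions surrounding the red user's region 0011. That is, along with 0011 we also query 0100, 0110, 1100, 0001, 1001, 0000, 0010 and 1000.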
- Code implementation
// Returns the geohashes of the cell containing (lat, lon) and its eight neighbours
public ArrayList<String> getArroundGeoHash(double lat, double lon) {
    ArrayList<String> list = new ArrayList<>();
    // minLat and minLng are the height and width of one cell, in degrees
    double uplat = lat + minLat;
    double downLat = lat - minLat;
    double leftlng = lon - minLng;
    double rightLng = lon + minLng;
    String leftUp = encode(uplat, leftlng);
    list.add(leftUp);
    String leftMid = encode(lat, leftlng);
    list.add(leftMid);
    String leftDown = encode(downLat, leftlng);
    list.add(leftDown);
    String midUp = encode(uplat, lon);
    list.add(midUp);
    String midMid = encode(lat, lon);
    list.add(midMid);
    String midDown = encode(downLat, lon);
    list.add(midDown);
    String rightUp = encode(uplat, rightLng);
    list.add(rightUp);
    String rightMid = encode(lat, rightLng);
    list.add(rightMid);
    String rightDown = encode(downLat, rightLng);
    list.add(rightDown);
    return list;
}
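Then compute the great-circle distance between the red user and each user in the surrounding regions, and use those distances to implement the people-nearby feature.
- Java code for computing the distance between two latitude/longitude pairs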
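// Assumed value: mean Earth radius in metres; change it to get another unit
private static final double EARTH_RADIUS = 6371000;
static double getDistance(double lat1, double lon1, double lat2, double lon2) {
    // Convert degrees to radians, as required by Math.cos and Math.sin
    double radiansAX = Math.toRadians(lon1); // longitude of A in radians
    double radiansAY = Math.toRadians(lat1); // latitude of A in radians
    double radiansBX = Math.toRadians(lon2); // longitude of B in radians
    double radiansBY = Math.toRadians(lat2); // latitude of B in radians
    // The "cosβ1·cosβ2·cos(α1-α2) + sinβ1·sinβ2" part of the formula: the cosine of ∠AOB
    double cos = Math.cos(radiansAY) * Math.cos(radiansBY) * Math.cos(radiansAX - radiansBX)
            + Math.sin(radiansAY) * Math.sin(radiansBY);
    // Clamp to [-1, 1] so that rounding error cannot make Math.acos return NaN
    double acos = Math.acos(Math.max(-1.0, Math.min(1.0, cos))); // central angle in radians
    return EARTH_RADIUS * acos; // distance, in the same unit as EARTH_RADIUS
}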
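For points that are very close together, which is the usual case in a people-nearby query, the haversine formula is often preferred because it is less sensitive to rounding. The following is a sketch of that alternative, not the method used in this article; the class name, the method name and the 6,371,000 m radius are assumptions.

```java
final class Haversine {
    // Assumed value: mean Earth radius in metres.
    static final double EARTH_RADIUS_M = 6_371_000.0;

    /** Great-circle distance in metres between two points given in degrees. */
    static double distance(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = Math.toRadians(lat2 - lat1);
        double dLambda = Math.toRadians(lon2 - lon1);
        double sinHalfPhi = Math.sin(dPhi / 2);
        double sinHalfLambda = Math.sin(dLambda / 2);
        double a = sinHalfPhi * sinHalfPhi
                + Math.cos(phi1) * Math.cos(phi2) * sinHalfLambda * sinHalfLambda;
        // Guard against a rounding slightly above 1 before taking asin.
        double c = 2 * Math.asin(Math.min(1.0, Math.sqrt(a)));
        return EARTH_RADIUS_M * c;
    }

    public static void main(String[] args) {
        // Tiananmen to Beijing Railway Station, coordinates from the sample table
        double d = distance(39.913884, 116.403874, 39.910508, 116.433589);
        System.out.printf("%.0f m%n", d);
    }
}
```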
Complete GeoHash implementation
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;

public class GeoHash {
    public static final double MINLAT = -90;
    public static final double MAXLAT = 90;
    public static final double MINLNG = -180;
    public static final double MAXLNG = 180;
    private static int numbits = 3 * 5; // number of bits used to encode each coordinate
    private static double minLat; // height of one cell, in degrees
    private static double minLng; // width of one cell, in degrees
    private final static char[] digits = {'0', '1', '2', '3', '4', '5', '6', '7', '8',
            '9', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'j', 'k', 'm', 'n', 'p',
            'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'};
    // Maps each Base32 character back to its numeric value
    final static HashMap<Character, Integer> lookup = new HashMap<Character, Integer>();
    // Populate the lookup table
    static {
        int i = 0;
        for (char c : digits)
            lookup.put(c, i++);
    }
    public GeoHash() {
        setMinLatLng();
    }
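    // Interleaves the coordinate bits and Base32-encodes the result
    public String encode(double lat, double lon) {
        BitSet latbits = getBits(lat, -90, 90);
        BitSet lonbits = getBits(lon, -180, 180);
        StringBuilder buffer = new StringBuilder();
        for (int i = 0; i < numbits; i++) {
            buffer.append((lonbits.get(i)) ? '1' : '0'); // even position: longitude
            buffer.append((latbits.get(i)) ? '1' : '0'); // odd position: latitude
        }
        StringBuilder code = new StringBuilder(base32(Long.parseLong(buffer.toString(), 2)));
        // Going through a long drops leading zero groups; left-pad to the full length
        while (code.length() < numbits * 2 / 5) {
            code.insert(0, '0');
        }
        return code.toString();
    }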
    public ArrayList<String> getArroundGeoHash(double lat, double lon) {
        ArrayList<String> list = new ArrayList<>();
        double uplat = lat + minLat;
        double downLat = lat - minLat;
        double leftlng = lon - minLng;
        double rightLng = lon + minLng;
        String leftUp = encode(uplat, leftlng);
        list.add(leftUp);
        String leftMid = encode(lat, leftlng);
        list.add(leftMid);
        String leftDown = encode(downLat, leftlng);
        list.add(leftDown);
        String midUp = encode(uplat, lon);
        list.add(midUp);
        String midMid = encode(lat, lon);
        list.add(midMid);
        String midDown = encode(downLat, lon);
        list.add(midDown);
        String rightUp = encode(uplat, rightLng);
        list.add(rightUp);
        String rightMid = encode(lat, rightLng);
        list.add(rightMid);
        String rightDown = encode(downLat, rightLng);
        list.add(rightDown);
        return list;
    }
    // Computes the bit sequence for a coordinate within the given range
    private BitSet getBits(double l, double floor, double ceiling) {
        BitSet buffer = new BitSet(numbits);
        for (int i = 0; i < numbits; i++) {
            double mid = (floor + ceiling) / 2;
            if (l >= mid) {
                buffer.set(i);
                floor = mid;
            } else {
                ceiling = mid;
            }
        }
        return buffer;
    }
    // Base32-encodes the merged bits using the geohash alphabet.
    // Like Long.toString, this does not emit leading zero digits.
    private String base32(long i) {
        char[] buf = new char[65];
        int charPos = 64;
        boolean negative = (i < 0);
        if (!negative) {
            i = -i;
        }
        while (i <= -32) {
            buf[charPos--] = digits[(int) (-(i % 32))];
            i /= 32;
        }
        buf[charPos] = digits[(int) (-i)];
        if (negative) {
            buf[--charPos] = '-';
        }
        return new String(buf, charPos, (65 - charPos));
    }
    // Computes the size of one cell by halving the full range numbits times
    private void setMinLatLng() {
        minLat = MAXLAT - MINLAT;
        for (int i = 0; i < numbits; i++) {
            minLat /= 2.0;
        }
        minLng = MAXLNG - MINLNG;
        for (int i = 0; i < numbits; i++) {
            minLng /= 2.0;
        }
    }
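    // Decodes a bit sequence back into a coordinate within the given range
    private double decode(BitSet bs, double floor, double ceiling) {
        // bs.length() ignores trailing 0 bits, so loop over the fixed bit count instead
        for (int i = 0; i < numbits; i++) {
            double mid = (floor + ceiling) / 2;
            if (bs.get(i))
                floor = mid;
            else
                ceiling = mid;
        }
        // Return the centre of the final cell
        return (floor + ceiling) / 2;
    }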
    // Decodes a geohash string into {latitude, longitude}
    public double[] decode(String geohash) {
        StringBuilder buffer = new StringBuilder();
        for (char c : geohash.toCharArray()) {
            // Adding 32 forces a leading 1, so substring(1) always yields exactly 5 bits
            int i = lookup.get(c) + 32;
            buffer.append(Integer.toString(i, 2).substring(1));
        }
        BitSet lonset = new BitSet();
        BitSet latset = new BitSet();
        // Even positions: longitude
        int j = 0;
        for (int i = 0; i < numbits * 2; i += 2) {
            boolean isSet = false;
            if (i < buffer.length())
                isSet = buffer.charAt(i) == '1';
            lonset.set(j++, isSet);
        }
        // Odd positions: latitude
        j = 0;
        for (int i = 1; i < numbits * 2; i += 2) {
            boolean isSet = false;
            if (i < buffer.length())
                isSet = buffer.charAt(i) == '1';
            latset.set(j++, isSet);
        }
        double lon = decode(lonset, -180, 180);
        double lat = decode(latset, -90, 90);
        return new double[]{lat, lon};
    }
    public static void main(String[] args) {
        GeoHash geoHash = new GeoHash();
        // Beijing Railway Station
        String encode = geoHash.encode(39.910508, 116.433589);
        System.out.println(encode);
        // Tiananmen
        System.out.println(geoHash.encode(39.913884, 116.403874));
        // Beijing Capital Airport
        System.out.println(geoHash.encode(40.086109, 116.606819));
        BitSet latbits = geoHash.getBits(39.913884, -90, 90);
        BitSet lonbits = geoHash.getBits(116.403874, -180, 180);
        // Print the latitude bits of Tiananmen
        for (int i = 0; i < numbits; i++) {
            System.out.print(latbits.get(i) ? '1' : '0');
            System.out.print(" ");
        }
        System.out.println();
        // Print the interleaved bits, longitude first
        StringBuilder buffer = new StringBuilder();
        for (int i = 0; i < numbits; i++) {
            buffer.append((lonbits.get(i)) ? '1' : '0');
            buffer.append((latbits.get(i)) ? '1' : '0');
        }
        System.out.println(buffer.toString());
        System.out.println(geoHash.encode(39.913884, 116.403874));
    }
}
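Because a geohash names a cell rather than a point, it can be useful to decode it into that cell's bounds. The sketch below does so without BitSet; GeoHashBox and boundingBox are hypothetical names and the class is independent of the listing above.

```java
final class GeoHashBox {
    private static final String ALPHABET = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Returns {minLat, maxLat, minLon, maxLon} of the cell named by `geohash`. */
    static double[] boundingBox(String geohash) {
        double minLat = -90, maxLat = 90, minLon = -180, maxLon = 180;
        boolean isLon = true; // bits alternate, starting with longitude
        for (char c : geohash.toCharArray()) {
            int value = ALPHABET.indexOf(c);
            if (value < 0) {
                throw new IllegalArgumentException("invalid geohash character: " + c);
            }
            // Walk the 5 bits of this character from most to least significant.
            for (int mask = 16; mask > 0; mask >>= 1) {
                boolean bit = (value & mask) != 0;
                if (isLon) {
                    double mid = (minLon + maxLon) / 2;
                    if (bit) minLon = mid; else maxLon = mid;
                } else {
                    double mid = (minLat + maxLat) / 2;
                    if (bit) minLat = mid; else maxLat = mid;
                }
                isLon = !isLon;
            }
        }
        return new double[]{minLat, maxLat, minLon, maxLon};
    }

    public static void main(String[] args) {
        double[] box = boundingBox("wx4g0f"); // the Tiananmen cell
        System.out.printf("lat [%.6f, %.6f]  lon [%.6f, %.6f]%n",
                box[0], box[1], box[2], box[3]);
    }
}
```

The centre of the box is the natural single-point decoding of the hash.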
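Final words
If implementing GeoHash yourself seems like too much trouble but you still want to use it, no problem.
Redis knows you are lazy: see the GeoHash usage on the official Redis site.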