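This blog (http://blog.csdn.net/livelylittlefish) posts the notes that the author (阿波) has taken while studying and researching related topics. Corrections from readers are welcome!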
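Content
0. Preface
1. The hash structures
1.1 The ngx_hash_t structure
1.2 The ngx_hash_init_t structure
1.3 The ngx_hash_key_t structure
1.4 Logical structure of the hash
2. Hash operations
2.1 The NGX_HASH_ELT_SIZE macro
2.2 Hash functions
2.3 Hash initialization
2.4 Hash lookup
3. An example
3.1 Code
3.2 How to compile
3.3 Results
3.3.1 bucket_size = 64 bytes
3.3.2 bucket_size = 256 bytes
4. Summary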
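0. Preface
This article continues the series on nginx data structures with the hash structure.
Hash implementation files: ./src/core/ngx_hash.h/.c. Here . denotes the nginx-1.0.4 source directory, which is /usr/src/nginx-1.0.4 in this article.
1. The hash structures
The nginx hash structure is somewhat more complex than its list, array, and queue structures. The figure below shows the hash-related data structures, which are introduced one by one in the following sections.
1.1 The ngx_hash_t structure
The nginx hash structure is ngx_hash_t, and the hash element structure is ngx_hash_elt_t. They are defined as follows.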
typedef struct {                 //hash element structure
    void             *value;     //the value that a key maps to, i.e. the value in <key,value>
    u_short           len;       //length of name
    u_char            name[1];   //the data being hashed (a string in nginx), i.e. the key in <key,value>
} ngx_hash_elt_t;
typedef struct {                 //hash structure
    ngx_hash_elt_t  **buckets;   //hash buckets (there are size of them)
    ngx_uint_t        size;      //number of hash buckets
} ngx_hash_t;
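On a 32-bit platform, sizeof(ngx_hash_t) = 8 and sizeof(ngx_hash_elt_t) = 8. The name field of ngx_hash_elt_t holds the key taken from the ngx_hash_key_t structure, as can be seen in ngx_hash_init(); see the analysis below. This structure is used frequently when modules parse their configuration.
1.2 The ngx_hash_init_t structure
ngx_hash_init_t is the hash initialization structure. It bundles the relevant data so that it can be passed as a parameter to ngx_hash_init() or ngx_hash_wildcard_init(). These two functions are used mainly in the http modules, for example in ngx_http_server_names() (optimizing the HTTP server names), ngx_http_merge_types() (merging HTTP types), and ngx_http_fastcgi_merge_loc_conf() (merging the FastCGI location configuration), where the structure appears as a parameter or as a local object/variable. These will be covered in later articles.
The ngx_hash_init_t structure is shown below. On a 32-bit platform, sizeof(ngx_hash_init_t) = 28.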
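typedef struct {                    //hash initialization structure
    ngx_hash_t       *hash;         //points to the hash structure to be initialized
    ngx_hash_key_pt   key;          //pointer to the hash function
    ngx_uint_t        max_size;     //maximum number of buckets
    ngx_uint_t        bucket_size;  //space available to each bucket, in bytes
    char             *name;         //name of this hash structure (used only in error logs)
    ngx_pool_t       *pool;         //memory pool from which the hash structure is allocated
    ngx_pool_t       *temp_pool;    //memory pool for temporary data
} ngx_hash_init_t;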
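1.3 The ngx_hash_key_t structure
This structure holds the data to be hashed, that is, the key-value pairs <key,value>. In practice, multiple key-value pairs are usually stored in an array of ngx_hash_key_t structures, which is then passed to ngx_hash_init() or ngx_hash_wildcard_init(). It is defined as follows.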
typedef struct {               //hash key structure
    ngx_str_t    key;          //the key, an nginx string
    ngx_uint_t   key_hash;     //hash value computed from the key (by a hash function such as ngx_hash_key_lc())
    void        *value;        //the value for this key; together they form a pair <key,value>
} ngx_hash_key_t;
typedef struct {               //string structure
    size_t       len;          //string length
    u_char      *data;         //string content
} ngx_str_t;
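On a 32-bit platform, sizeof(ngx_hash_key_t) = 16. In typical use, the value pointer may point into static storage (for example a global array or a string constant) or into the heap (for example a dynamically allocated region that holds the value). See the example later in this article.
The ngx_table_elt_t and ngx_hash_keys_arrays_t structures contribute little to the hash structure itself. They were designed mainly for module configuration, referer validation, and similar purposes, for example in the http core module, the map module, the referer module, and the SSI filter module. They are not covered here and will be introduced in later articles.
1.4 Logical structure of the hash
The ngx_hash_init_t structure refers to the ngx_pool_t structure, so the logical diagram of the related structures below was drawn with reference to the article nginx-1.0.4源码分析—内存池结构ngx_pool_t及内存管理 (nginx-1.0.4 source analysis: the ngx_pool_t memory pool structure and memory management). Note: the diagram is drawn in UML style.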
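2. Hash operations
2.1 The NGX_HASH_ELT_SIZE macro
The NGX_HASH_ELT_SIZE macro computes the size that one ngx_hash_elt_t element occupies in a bucket. It is defined as follows.
//the name argument is a pointer to an ngx_hash_key_t (the macro reads name->key.len);
//the second term is aligned to sizeof(void *), i.e. to 4 bytes on a 32-bit platform
#define NGX_HASH_ELT_SIZE(name) \
    (sizeof(void *) + ngx_align((name)->key.len + 2, sizeof(void *)))
On a 32-bit platform, sizeof(void *) = 4. (name)->key.len is the length of the content that will be stored in the name array of ngx_hash_elt_t, and the "+2" accounts for the len field (of type u_short) in that structure.
Therefore NGX_HASH_ELT_SIZE(name) = 4 + ngx_align((name)->key.len + 2, 4), where the second term is (name)->key.len + 2 rounded up to a multiple of 4 bytes.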
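To make the arithmetic concrete, here is a minimal sketch of the same computation as plain functions. The names align_up and hash_elt_size are hypothetical helpers introduced for illustration and are not part of nginx; the pointer size is passed in explicitly so that the 32-bit figures used in this article can be reproduced on any machine.

```c
#include <stddef.h>

/* Round d up to a multiple of a (a must be a power of two).
 * This mirrors the rounding done by nginx's ngx_align macro. */
static size_t align_up(size_t d, size_t a)
{
    return (d + (a - 1)) & ~(a - 1);
}

/* Bytes occupied by one hash element whose key is key_len bytes long.
 * ptr_size stands in for sizeof(void *): pass 4 to model the 32-bit
 * platform discussed in the text. The "+ 2" is the u_short len field. */
static size_t hash_elt_size(size_t key_len, size_t ptr_size)
{
    return ptr_size + align_up(key_len + 2, ptr_size);
}
```

For instance, hash_elt_size(13, 4) yields 20 for a 13-byte key such as "www.baidu.com", and hash_elt_size(15, 4) yields 24 for "www.sina.com.cn". With ptr_size = 8 the same 13-byte key would take 24 bytes.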
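2.2 Hash functions
nginx-1.0.4 provides the following hash functions.
#define ngx_hash(key, c) ((ngx_uint_t) key * 31 + c) //the hash macro
ngx_uint_t ngx_hash_key(u_char *data, size_t len);
ngx_uint_t ngx_hash_key_lc(u_char *data, size_t len); //lc means lower case: the string is converted to lower case before hashing
ngx_uint_t ngx_hash_strlow(u_char *dst, u_char *src, size_t n);
The hash functions are all simple. Each of the three functions above uses the ngx_hash macro, which yields an unsigned integer of type ngx_uint_t. Only the first function is shown here; it is defined as follows.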
ngx_uint_t
ngx_hash_key(u_char *data, size_t len)
{
    ngx_uint_t  i, key;

    key = 0;

    for (i = 0; i < len; i++) {
        key = ngx_hash(key, data[i]);
    }

    return key;
}
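The computation performed by ngx_hash_key can therefore be written as the following formulas.
Key[0] = data[0]
Key[1] = data[0]*31 + data[1]
Key[2] = (data[0]*31 + data[1])*31 + data[2]
...
Key[len-1] = ((((data[0]*31 + data[1])*31 + data[2])*31) ... data[len-2])*31 + data[len-1]
Key[len-1] is the hash value of the data argument. All arithmetic is done in ngx_uint_t, so on a 32-bit platform the result wraps around modulo 2^32.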
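The recurrence above can be tried out in isolation. The following is a minimal sketch, not the nginx source: hash_step, hash_key and hash_key_lc are hypothetical names, uint32_t is an assumption that models ngx_uint_t on a 32-bit platform, and the lower-casing only handles ASCII letters.

```c
#include <stddef.h>
#include <stdint.h>

/* One step of the multiply-by-31 hash (the role of the ngx_hash macro).
 * Unsigned arithmetic wraps modulo 2^32, which is well defined in C. */
static uint32_t hash_step(uint32_t key, unsigned char c)
{
    return key * 31u + c;
}

/* Hash len bytes of data exactly as given. */
static uint32_t hash_key(const unsigned char *data, size_t len)
{
    uint32_t key = 0;
    size_t   i;

    for (i = 0; i < len; i++) {
        key = hash_step(key, data[i]);
    }
    return key;
}

/* Same hash, but every ASCII upper-case letter is folded to lower case
 * first, so "WWW.QQ.COM" and "www.qq.com" hash to the same value. */
static uint32_t hash_key_lc(const unsigned char *data, size_t len)
{
    uint32_t key = 0;
    size_t   i;

    for (i = 0; i < len; i++) {
        unsigned char c = data[i];
        if (c >= 'A' && c <= 'Z') {
            c = (unsigned char) (c | 0x20);
        }
        key = hash_step(key, c);
    }
    return key;
}
```

As a small check of the formula, hashing the two bytes "ab" gives 97*31 + 98 = 3105, and hash_key_lc on "AB" gives the same value.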
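2.3 Hash initialization
Hash initialization is done by ngx_hash_init(). Its names parameter is an array of ngx_hash_key_t structures, that is, an array of key-value pairs <key,value>, and nelts is the number of elements in that array. The array must therefore be prepared before the function is called. One way to build it is with nginx's ngx_array_t structure, as in the example later in this article.
The result of initialization is that the key-value pairs <key,value> in the names array are hashed into one or more hash buckets (buckets in the code). The hash function used is typically ngx_hash_key_lc. Each bucket holds a pointer to an ngx_hash_elt_t structure (a hash element pointer), and that pointer refers to a mostly contiguous data region. The region stores the hashed key-value pairs <key',value'>, that is, the <name,value> fields of ngx_hash_elt_t. Each such region may hold one or more pairs.
A few questions need to be answered here.
Question 1: Why "mostly" contiguous?
Answer: When NGX_HASH_ELT_SIZE computes the total length of a hash element, the result includes padding up to a multiple of sizeof(void *). So when the key of a pair <key,value> in the names array is copied into the name[1] array of ngx_hash_elt_t, the space reserved for that element is not necessarily used up completely, and the region is only mostly contiguous. See also the structure figure later in this section and the example later in this article.
Question 2: Where are these mostly contiguous regions allocated from?
Answer: From the memory pool that the pool field of the first parameter, ngx_hash_init_t, points to.
Question 3: How does <key',value'> differ from <key,value>?
Answer: key holds only a pointer to the string, whereas key' is a copy of key written into name[1] (converted to lower case by ngx_strlow). Both value and value' are pointers. As noted in section 1.3, the value pointer may point into static storage (for example a global array or a string constant) or into the heap (for example a dynamically allocated region that holds the value). See the example later in this article.
Question 4: How is the bucket for a given pair <key,value> determined?
Answer: By this computation in the code: key = names[n].key_hash % size; The result, key, is the index of the bucket that the pair goes into (from 0 to size-1).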
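The bucket selection in Question 4 can be checked against the sample run shown in section 3.3.1. The sketch below is illustrative only: bucket_index is a hypothetical helper, and the hash values are copied from that run's output. The run prints them with %ld, so some appear negative; here they are reinterpreted as unsigned 32-bit numbers, which is an assumption matching the 32-bit platform used in this article.

```c
#include <stddef.h>
#include <stdint.h>

/* key_hash values from the sample run (section 3.3.1), as unsigned 32-bit. */
static const uint32_t sample_hashes[7] = {
    270263191u,              /* www.baidu.com   */
    1528635686u,             /* www.sina.com.cn */
    (uint32_t) -702889725,   /* www.google.com  */
    203430122u,              /* www.qq.com      */
    (uint32_t) -640386838,   /* www.163.com     */
    1313636595u,             /* www.sohu.com    */
    1884209457u              /* www.abo321.org  */
};

/* Index of the bucket that an element with this hash goes into,
 * for a table with size buckets (size must be non-zero). */
static size_t bucket_index(uint32_t key_hash, size_t size)
{
    return key_hash % size;
}
```

With size = 3 this gives buckets 1, 2, 1, 2, 0, 0, 0 for the seven URLs in order, which agrees with the "buckets N:" lines printed by the sample program.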
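The code of the function follows. The comments after // were added by the author to explain the tricky points; see also the hash structure figure in this section.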
//nelts is the (actual) number of elements in the names array
ngx_int_t
ngx_hash_init(ngx_hash_init_t *hinit, ngx_hash_key_t *names, ngx_uint_t nelts)
{
    u_char          *elts;
    size_t           len;
    u_short         *test;
    ngx_uint_t       i, n, key, size, start, bucket_size;
    ngx_hash_elt_t  *elt, **buckets;

    for (n = 0; n < nelts; n++) { //check every element of names: can a bucket hold it at all?
        if (hinit->bucket_size < NGX_HASH_ELT_SIZE(&names[n]) + sizeof(void *))
        { //if a bucket is too small for any single element, give up
            ngx_log_error(NGX_LOG_EMERG, hinit->pool->log, 0,
                          "could not build the %s, you should "
                          "increase %s_bucket_size: %i",
                          hinit->name, hinit->name, hinit->bucket_size);
            return NGX_ERROR;
        }
    }

    //allocate 2*max_size bytes for bookkeeping (not taken from the nginx memory pool, because test is only temporary)
    test = ngx_alloc(hinit->max_size * sizeof(u_short), hinit->pool->log);
    if (test == NULL) {
        return NGX_ERROR;
    }

    bucket_size = hinit->bucket_size - sizeof(void *); //sizeof(void *) = 4 on a 32-bit platform

    //initial guess for the number of buckets: an element takes at least 2*sizeof(void *) bytes
    start = nelts / (bucket_size / (2 * sizeof(void *)));
    start = start ? start : 1;

    if (hinit->max_size > 10000 && hinit->max_size / nelts < 100) {
        start = hinit->max_size - 1000;
    }

    for (size = start; size < hinit->max_size; size++) {

        ngx_memzero(test, size * sizeof(u_short));

        //mark 1: this block checks whether, with size buckets, every bucket can hold its elements
        for (n = 0; n < nelts; n++) {
            if (names[n].key.data == NULL) {
                continue;
            }

            //compute the bucket index and accumulate the element sizes of that bucket in test[key]
            key = names[n].key_hash % size; //if size = 1, key is always 0
            test[key] = (u_short) (test[key] + NGX_HASH_ELT_SIZE(&names[n]));

            if (test[key] > (u_short) bucket_size) { //a bucket overflowed: try the next candidate size
                goto next;
            }
        }

        goto found;

    next:

        continue;
    }

    //no suitable number of buckets was found, give up
    ngx_log_error(NGX_LOG_EMERG, hinit->pool->log, 0,
                  "could not build the %s, you should increase "
                  "either %s_max_size: %i or %s_bucket_size: %i",
                  hinit->name, hinit->name, hinit->max_size,
                  hinit->name, hinit->bucket_size);

    ngx_free(test);

    return NGX_ERROR;

found: //a suitable number of buckets was found

    for (i = 0; i < size; i++) { //initialize the first size entries of test to sizeof(void *) (4 here)
        test[i] = sizeof(void *);
    }

    /* mark 2: almost the same as the code at mark 1, but this block recomputes the
       total length of the data in every bucket (the check at mark 1 has already passed).
       Since test[i] starts at 4 here, each total includes one extra void pointer. */
    for (n = 0; n < nelts; n++) {
        if (names[n].key.data == NULL) {
            continue;
        }

        //compute the bucket index and accumulate the element sizes of that bucket in test[key]
        key = names[n].key_hash % size; //if size = 1, key is always 0
        test[key] = (u_short) (test[key] + NGX_HASH_ELT_SIZE(&names[n]));
    }

    //compute the total length of the hash data
    len = 0;

    for (i = 0; i < size; i++) {
        if (test[i] == sizeof(void *)) { //test[i] still has its initial value 4: the bucket is empty, skip it
            continue;
        }

        //align test[i] to ngx_cacheline_size (32 in the example later in this article)
        test[i] = (u_short) (ngx_align(test[i], ngx_cacheline_size));

        len += test[i];
    }

    if (hinit->hash == NULL) { //allocate the hash header and the buckets array (size pointers to ngx_hash_elt_t) from the pool
        hinit->hash = ngx_pcalloc(hinit->pool, sizeof(ngx_hash_wildcard_t)
                                             + size * sizeof(ngx_hash_elt_t *));
        if (hinit->hash == NULL) {
            ngx_free(test);
            return NGX_ERROR;
        }

        //compute the start of buckets (right after the ngx_hash_wildcard_t structure)
        buckets = (ngx_hash_elt_t **)
                      ((u_char *) hinit->hash + sizeof(ngx_hash_wildcard_t));

    } else { //allocate only the buckets array (size pointers to ngx_hash_elt_t) from the pool
        buckets = ngx_pcalloc(hinit->pool, size * sizeof(ngx_hash_elt_t *));
        if (buckets == NULL) {
            ngx_free(test);
            return NGX_ERROR;
        }
    }

    //allocate elts with len + ngx_cacheline_size bytes; the extra 32 bytes leave room for the alignment below
    elts = ngx_palloc(hinit->pool, len + ngx_cacheline_size);
    if (elts == NULL) {
        ngx_free(test);
        return NGX_ERROR;
    }

    //align the elts address to ngx_cacheline_size = 32
    elts = ngx_align_ptr(elts, ngx_cacheline_size);

    for (i = 0; i < size; i++) { //point each non-empty bucket at its region inside elts
        if (test[i] == sizeof(void *)) {
            continue;
        }

        buckets[i] = (ngx_hash_elt_t *) elts;
        elts += test[i];

    }

    for (i = 0; i < size; i++) { //reset the test array to 0
        test[i] = 0;
    }

    for (n = 0; n < nelts; n++) { //store every element passed in into the hash table
        if (names[n].key.data == NULL) {
            continue;
        }

        //compute key, the bucket this element goes into, and its position inside that bucket
        key = names[n].key_hash % size;
        elt = (ngx_hash_elt_t *) ((u_char *) buckets[key] + test[key]);

        //fill in the ngx_hash_elt_t structure
        elt->value = names[n].value;
        elt->len = (u_short) names[n].key.len;

        ngx_strlow(elt->name, names[n].key.data, names[n].key.len);

        //advance the offset for the next element of this bucket
        test[key] = (u_short) (test[key] + NGX_HASH_ELT_SIZE(&names[n]));
    }

    for (i = 0; i < size; i++) {
        if (buckets[i] == NULL) {
            continue;
        }

        //test[i] is now the total length of the elements in bucket i; write the NULL terminator after them
        elt = (ngx_hash_elt_t *) ((u_char *) buckets[i] + test[i]);

        elt->value = NULL;
    }

    ngx_free(test); //release the temporary array

    hinit->hash->buckets = buckets;
    hinit->hash->size = size;

    return NGX_OK;
}
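The "hash data length" above means the length of an ngx_hash_elt_t structure after it has been filled in. The nelts elements are stored in the names array. After ngx_hash_init() has run, these nelts elements are stored in the ngx_hash_elt_t data regions pointed to by the size hash buckets, and together these regions hold nelts hash elements. In other words, each bucket (buckets[i]) stores the start address of an ngx_hash_elt_t data region, and that region holds the hashed elements. Each hash element ends with a string starting at name[0], which is the key of some element of the names array (the key of a pair <key,value>), followed by a few bytes of padding caused by alignment.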
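The search for a workable number of buckets (the loop at mark 1) can be reproduced outside nginx. The following is a simplified sketch under stated assumptions, not the nginx implementation: find_bucket_count, elt_size, PTR_SIZE and MAX_BUCKETS are hypothetical names, the pointer size is fixed at 4 to model the 32-bit platform, the special case for max_size > 10000 is left out, and the hash values and key lengths are copied from the sample run in section 3.3.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PTR_SIZE     4u      /* assumption: models sizeof(void *) on 32-bit */
#define MAX_BUCKETS  1024u   /* upper limit supported by this sketch */

/* key_hash values and key lengths of the seven URLs in the sample run. */
static const uint32_t sample_hashes[7] = {
    270263191u, 1528635686u, (uint32_t) -702889725, 203430122u,
    (uint32_t) -640386838, 1313636595u, 1884209457u
};
static const size_t sample_lens[7] = { 13, 15, 14, 10, 11, 12, 14 };

/* Element size, as computed by NGX_HASH_ELT_SIZE with 4-byte pointers. */
static size_t elt_size(size_t key_len)
{
    return PTR_SIZE + ((key_len + 2 + (PTR_SIZE - 1)) & ~(size_t) (PTR_SIZE - 1));
}

/* Return the first bucket count, starting from the initial guess, for which
 * no bucket overflows; return 0 if there is none below max_size. */
static size_t find_bucket_count(const uint32_t *hashes, const size_t *key_lens,
                                size_t n, size_t bucket_size, size_t max_size)
{
    size_t used[MAX_BUCKETS];
    size_t usable, start, size, i;

    if (n == 0 || max_size > MAX_BUCKETS || bucket_size < 3 * PTR_SIZE) {
        return 0;
    }

    usable = bucket_size - PTR_SIZE;         /* room left after the NULL terminator */
    start = n / (usable / (2 * PTR_SIZE));   /* an element needs at least 8 bytes */
    if (start == 0) {
        start = 1;
    }

    for (size = start; size < max_size; size++) {
        memset(used, 0, size * sizeof(used[0]));

        for (i = 0; i < n; i++) {
            size_t key = hashes[i] % size;
            used[key] += elt_size(key_lens[i]);
            if (used[key] > usable) {
                break;                       /* overflow: try the next size */
            }
        }

        if (i == n) {
            return size;                     /* every element fitted */
        }
    }
    return 0;
}
```

Called with the sample data, bucket_size = 64 and max_size = 1024, this returns 3, and with bucket_size = 256 it returns 1. Both values match the size field printed in the two runs of section 3.3.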
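A typical physical layout of an initialized hash is shown below. See the example later in this article for details.
2.4 Hash lookup
Hash lookup is done by ngx_hash_find(), shown below. The comments after // were added by the author.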
//look up, in the hash table that hash points to, the value for the given key, name and len
void *
ngx_hash_find(ngx_hash_t *hash, ngx_uint_t key, u_char *name, size_t len)
{
    ngx_uint_t       i;
    ngx_hash_elt_t  *elt;

    elt = hash->buckets[key % hash->size]; //find the bucket from key (the bucket stores the address of its elements)

    if (elt == NULL) {
        return NULL;
    }

    while (elt->value) {
        if (len != (size_t) elt->len) { //compare the lengths first
            goto next;
        }

        for (i = 0; i < len; i++) {
            if (name[i] != elt->name[i]) { //then compare the name, byte by byte
                goto next;
            }
        }

        return elt->value; //match: return the value field of this ngx_hash_elt_t

    next:

        //the offset is taken from the address of elt->name[0], so adding this element's len is enough;
        //the result is then aligned to sizeof(void *) (4 bytes on a 32-bit platform)
        elt = (ngx_hash_elt_t *) ngx_align_ptr(&elt->name[0] + elt->len,
                                               sizeof(void *));
        continue;
    }

    return NULL;
}
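Lookup is straightforward. The bucket is computed directly from key, and that bucket stores the start address of its ngx_hash_elt_t data region. The elements in the region are then compared first by length and then by name; on a match, the value field of that ngx_hash_elt_t is the result.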
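The walk through one bucket can be illustrated without nginx. This is a minimal sketch, assuming a hand-built bucket: elt_t, align_ptr, bucket_put, bucket_end, bucket_find and demo_lookup are hypothetical names, and the three entries are the ones that landed in bucket 0 in the run of section 3.3.1. As in nginx, elements are packed back to back, each padded to a multiple of sizeof(void *), and an element whose value is NULL ends the bucket.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    const void     *value;    /* NULL marks the end of the bucket */
    unsigned short  len;      /* length of name */
    unsigned char   name[1];  /* key bytes, stored inline past this field */
} elt_t;

/* Round a pointer up to a multiple of a (a power of two). */
static void *align_ptr(const void *p, size_t a)
{
    return (void *) (((uintptr_t) p + (a - 1)) & ~(uintptr_t) (a - 1));
}

/* Write one element at pos and return the position of the next one. */
static unsigned char *bucket_put(unsigned char *pos, const char *name,
                                 const void *value)
{
    elt_t  *elt = (elt_t *) pos;
    size_t  len = strlen(name);

    elt->value = value;
    elt->len = (unsigned short) len;
    memcpy(elt->name, name, len);
    return align_ptr(&elt->name[0] + len, sizeof(void *));
}

/* Terminate the bucket with a NULL value. */
static void bucket_end(unsigned char *pos)
{
    ((elt_t *) pos)->value = NULL;
}

/* Walk the bucket: compare lengths first, then the bytes of the name. */
static const void *bucket_find(const void *bucket, const char *name, size_t len)
{
    const elt_t *elt = bucket;

    while (elt->value) {
        if (len == elt->len && memcmp(name, elt->name, len) == 0) {
            return elt->value;
        }
        elt = align_ptr(&elt->name[0] + elt->len, sizeof(void *));
    }
    return NULL;
}

/* Build a three-element bucket, look up name, and return the IP or NULL. */
static const char *demo_lookup(const char *name)
{
    unsigned char *bucket = malloc(256);   /* malloc memory is suitably aligned */
    unsigned char *pos = bucket;
    const char    *ip;

    if (bucket == NULL) {
        return NULL;
    }
    pos = bucket_put(pos, "www.163.com", "123.103.14.237");
    pos = bucket_put(pos, "www.sohu.com", "219.234.82.50");
    pos = bucket_put(pos, "www.abo321.org", "117.40.196.26");
    bucket_end(pos);

    ip = bucket_find(bucket, name, strlen(name));
    free(bucket);                          /* the values are string literals */
    return ip;
}
```

Here demo_lookup("www.sohu.com") returns "219.234.82.50", while demo_lookup("www.csdn.net") returns NULL because the walk reaches the terminating element without a match.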
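3. An example
This section gives a simple example that creates a memory pool, allocates the hash structure, the hash buckets, and the hash elements from it, and adds key-value pairs <key,value> to the hash structure.
The example implements the following application: given several URLs and their IP addresses as pairs <url,ip>, it treats them as <key,value>, hashes them during initialization, and then looks up the IP address for a given URL, printing a message if the URL is not found. It shows how the nginx hash is used.
3.1 Code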
/**
* ngx_hash_t test
* in this example, it will first save URLs into the memory pool, and IPs saved in static memory.
* then, give some examples to find IP according to a URL.
*/
#include <stdio.h>
#include "ngx_config.h"
#include "ngx_conf_file.h"
#include "nginx.h"
#include "ngx_core.h"
#include "ngx_string.h"
#include "ngx_palloc.h"
#include "ngx_array.h"
#include "ngx_hash.h"
#define Max_Num 7
#define Max_Size 1024
#define Bucket_Size 64 //256, 64
#define NGX_HASH_ELT_SIZE(name) \
(sizeof(void *) + ngx_align((name)->key.len + 2, sizeof(void *)))
/* for hash test */
static ngx_str_t urls[Max_Num] = {
ngx_string("www.baidu.com"), //220.181.111.147
ngx_string("www.sina.com.cn"), //58.63.236.35
ngx_string("www.google.com"), //74.125.71.105
ngx_string("www.qq.com"), //60.28.14.190
ngx_string("www.163.com"), //123.103.14.237
ngx_string("www.sohu.com"), //219.234.82.50
ngx_string("www.abo321.org") //117.40.196.26
};
static char* values[Max_Num] = {
"220.181.111.147",
"58.63.236.35",
"74.125.71.105",
"60.28.14.190",
"123.103.14.237",
"219.234.82.50",
"117.40.196.26"
};
#define Max_Url_Len 15
#define Max_Ip_Len 15
#define Max_Num2 2
/* for finding test */
static ngx_str_t urls2[Max_Num2] = {
ngx_string("www.china.com"), //60.217.58.79
ngx_string("www.csdn.net") //117.79.157.242
};
ngx_hash_t* init_hash(ngx_pool_t *pool, ngx_array_t *array);
void dump_pool(ngx_pool_t* pool);
void dump_hash_array(ngx_array_t* a);
void dump_hash(ngx_hash_t *hash, ngx_array_t *array);
ngx_array_t* add_urls_to_array(ngx_pool_t *pool);
void find_test(ngx_hash_t *hash, ngx_str_t addr[], int num);
/* for passing compiling */
volatile ngx_cycle_t *ngx_cycle;
void ngx_log_error_core(ngx_uint_t level, ngx_log_t *log, ngx_err_t err, const char *fmt, ...)
{
}
int main(/* int argc, char **argv */)
{
ngx_pool_t *pool = NULL;
ngx_array_t *array = NULL;
ngx_hash_t *hash;
printf("--------------------------------\n");
printf("create a new pool:\n");
printf("--------------------------------\n");
pool = ngx_create_pool(1024, NULL);
dump_pool(pool);
printf("--------------------------------\n");
printf("create and add urls to it:\n");
printf("--------------------------------\n");
array = add_urls_to_array(pool); //in fact, here should validate array
dump_hash_array(array);
printf("--------------------------------\n");
printf("the pool:\n");
printf("--------------------------------\n");
dump_pool(pool);
hash = init_hash(pool, array);
if (hash == NULL)
{
printf("Failed to initialize hash!\n");
return -1;
}
printf("--------------------------------\n");
printf("the hash:\n");
printf("--------------------------------\n");
dump_hash(hash, array);
printf("\n");
printf("--------------------------------\n");
printf("the pool:\n");
printf("--------------------------------\n");
dump_pool(pool);
//find test
printf("--------------------------------\n");
printf("find test:\n");
printf("--------------------------------\n");
find_test(hash, urls, Max_Num);
printf("\n");
find_test(hash, urls2, Max_Num2);
//release
ngx_array_destroy(array);
ngx_destroy_pool(pool);
return 0;
}
ngx_hash_t* init_hash(ngx_pool_t *pool, ngx_array_t *array)
{
ngx_int_t result;
ngx_hash_init_t hinit;
ngx_cacheline_size = 32; //here this variable for nginx must be defined
hinit.hash = NULL; //if hinit.hash is NULL, it will alloc memory for it in ngx_hash_init
hinit.key = &ngx_hash_key_lc; //hash function
hinit.max_size = Max_Size;
hinit.bucket_size = Bucket_Size;
hinit.name = "my_hash_sample";
hinit.pool = pool; //the hash table exists in the memory pool
hinit.temp_pool = NULL;
result = ngx_hash_init(&hinit, (ngx_hash_key_t*)array->elts, array->nelts);
if (result != NGX_OK)
return NULL;
return hinit.hash;
}
void dump_pool(ngx_pool_t* pool)
{
while (pool)
{
printf("pool = 0x%x\n", pool);
printf(" .d\n");
printf(" .last = 0x%x\n", pool->d.last);
printf(" .end = 0x%x\n", pool->d.end);
printf(" .next = 0x%x\n", pool->d.next);
printf(" .failed = %d\n", pool->d.failed);
printf(" .max = %d\n", pool->max);
printf(" .current = 0x%x\n", pool->current);
printf(" .chain = 0x%x\n", pool->chain);
printf(" .large = 0x%x\n", pool->large);
printf(" .cleanup = 0x%x\n", pool->cleanup);
printf(" .log = 0x%x\n", pool->log);
printf("available pool memory = %d\n\n", pool->d.end - pool->d.last);
pool = pool->d.next;
}
}
void dump_hash_array(ngx_array_t* a)
{
char prefix[] = " ";
if (a == NULL)
return;
printf("array = 0x%x\n", a);
printf(" .elts = 0x%x\n", a->elts);
printf(" .nelts = %d\n", a->nelts);
printf(" .size = %d\n", a->size);
printf(" .nalloc = %d\n", a->nalloc);
printf(" .pool = 0x%x\n", a->pool);
printf(" elements:\n");
ngx_hash_key_t *ptr = (ngx_hash_key_t*)(a->elts);
for (; ptr < (ngx_hash_key_t*)(a->elts) + a->nelts; ptr++) //walk only the nelts elements actually in use
{
printf(" 0x%x: {key = (\"%s\"%.*s, %d), key_hash = %-10ld, value = \"%s\"%.*s}\n",
ptr, ptr->key.data, Max_Url_Len - ptr->key.len, prefix, ptr->key.len,
ptr->key_hash, ptr->value, Max_Ip_Len - strlen(ptr->value), prefix);
}
printf("\n");
}
/**
* pass array pointer to read elts[i].key_hash, then for getting the position - key
*/
void dump_hash(ngx_hash_t *hash, ngx_array_t *array)
{
int loop;
char prefix[] = " ";
u_short test[Max_Num] = {0};
ngx_uint_t key;
ngx_hash_key_t* elts;
int nelts;
if (hash == NULL)
return;
printf("hash = 0x%x: **buckets = 0x%x, size = %d\n", hash, hash->buckets, hash->size);
for (loop = 0; loop < hash->size; loop++)
{
ngx_hash_elt_t *elt = hash->buckets[loop];
printf(" 0x%x: buckets[%d] = 0x%x\n", &(hash->buckets[loop]), loop, elt);
}
printf("\n");
elts = (ngx_hash_key_t*)array->elts;
nelts = array->nelts;
for (loop = 0; loop < nelts; loop++)
{
char url[Max_Url_Len + 1] = {0};
key = elts[loop].key_hash % hash->size;
ngx_hash_elt_t *elt = (ngx_hash_elt_t *) ((u_char *) hash->buckets[key] + test[key]);
ngx_strlow(url, elt->name, elt->len);
printf(" buckets %d: 0x%x: {value = \"%s\"%.*s, len = %d, name = \"%s\"%.*s}\n",
key, elt, (char*)elt->value, Max_Ip_Len - strlen((char*)elt->value), prefix,
elt->len, url, Max_Url_Len - elt->len, prefix); //replace elt->name with url
test[key] = (u_short) (test[key] + NGX_HASH_ELT_SIZE(&elts[loop]));
}
}
ngx_array_t* add_urls_to_array(ngx_pool_t *pool)
{
int loop;
char prefix[] = " ";
ngx_array_t *a = ngx_array_create(pool, Max_Num, sizeof(ngx_hash_key_t));
for (loop = 0; loop < Max_Num; loop++)
{
ngx_hash_key_t *hashkey = (ngx_hash_key_t*)ngx_array_push(a);
hashkey->key = urls[loop];
hashkey->key_hash = ngx_hash_key_lc(urls[loop].data, urls[loop].len);
hashkey->value = (void*)values[loop];
/** for debug
printf("{key = (\"%s\"%.*s, %d), key_hash = %-10ld, value = \"%s\"%.*s}, added to array\n",
hashkey->key.data, Max_Url_Len - hashkey->key.len, prefix, hashkey->key.len,
hashkey->key_hash, hashkey->value, Max_Ip_Len - strlen(hashkey->value), prefix);
*/
}
return a;
}
void find_test(ngx_hash_t *hash, ngx_str_t addr[], int num)
{
ngx_uint_t key;
int loop;
char prefix[] = " ";
for (loop = 0; loop < num; loop++)
{
key = ngx_hash_key_lc(addr[loop].data, addr[loop].len);
void *value = ngx_hash_find(hash, key, addr[loop].data, addr[loop].len);
if (value)
{
printf("(url = \"%s\"%.*s, key = %-10ld) found, (ip = \"%s\")\n",
addr[loop].data, Max_Url_Len - addr[loop].len, prefix, key, (char*)value);
}
else
{
printf("(url = \"%s\"%.*s, key = %-10d) not found!\n",
addr[loop].data, Max_Url_Len - addr[loop].len, prefix, key);
}
}
}
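3.2 How to compile
See the article nginx-1.0.4源码分析—内存池结构ngx_pool_t及内存管理 (nginx-1.0.4 source analysis: the ngx_pool_t memory pool structure and memory management). The makefile written for this article is as follows.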
CXX = gcc
CXXFLAGS += -g -Wall -Wextra
NGX_ROOT = /usr/src/nginx-1.0.4
TARGETS = ngx_hash_t_test
TARGETS_C_FILE = $(TARGETS).c
CLEANUP = rm -f $(TARGETS) *.o
all: $(TARGETS)
clean:
$(CLEANUP)
CORE_INCS = -I. \
-I$(NGX_ROOT)/src/core \
-I$(NGX_ROOT)/src/event \
-I$(NGX_ROOT)/src/event/modules \
-I$(NGX_ROOT)/src/os/unix \
-I$(NGX_ROOT)/objs
NGX_PALLOC = $(NGX_ROOT)/objs/src/core/ngx_palloc.o
NGX_STRING = $(NGX_ROOT)/objs/src/core/ngx_string.o
NGX_ALLOC = $(NGX_ROOT)/objs/src/os/unix/ngx_alloc.o
NGX_ARRAY = $(NGX_ROOT)/objs/src/core/ngx_array.o
NGX_HASH = $(NGX_ROOT)/objs/src/core/ngx_hash.o
$(TARGETS): $(TARGETS_C_FILE)
$(CXX) $(CXXFLAGS) $(CORE_INCS) $(NGX_PALLOC) $(NGX_STRING) $(NGX_ALLOC) $(NGX_ARRAY) $(NGX_HASH) $^ -o $@
3.3 Results
3.3.1 bucket_size = 64 bytes
With bucket_size = 64 bytes, the output is as follows.
# ./ngx_hash_t_test
--------------------------------
create a new pool:
--------------------------------
pool = 0x8870020
.d
.last = 0x8870048
.end = 0x8870420
.next = 0x0
.failed = 0
.max = 984
.current = 0x8870020
.chain = 0x0
.large = 0x0
.cleanup = 0x0
.log = 0x0
available pool memory = 984
--------------------------------
create and add urls to it:
--------------------------------
array = 0x8870048
.elts = 0x887005c
.nelts = 7
.size = 16
.nalloc = 7
.pool = 0x8870020
elements:
0x887005c: {key = ("www.baidu.com" , 13), key_hash = 270263191 , value = "220.181.111.147"}
0x887006c: {key = ("www.sina.com.cn", 15), key_hash = 1528635686, value = "58.63.236.35" }
0x887007c: {key = ("www.google.com" , 14), key_hash = -702889725, value = "74.125.71.105" }
0x887008c: {key = ("www.qq.com" , 10), key_hash = 203430122 , value = "60.28.14.190" }
0x887009c: {key = ("www.163.com" , 11), key_hash = -640386838, value = "123.103.14.237" }
0x88700ac: {key = ("www.sohu.com" , 12), key_hash = 1313636595, value = "219.234.82.50" }
0x88700bc: {key = ("www.abo321.org" , 14), key_hash = 1884209457, value = "117.40.196.26" }
--------------------------------
the pool:
--------------------------------
pool = 0x8870020
.d
.last = 0x88700cc
.end = 0x8870420
.next = 0x0
.failed = 0
.max = 984
.current = 0x8870020
.chain = 0x0
.large = 0x0
.cleanup = 0x0
.log = 0x0
available pool memory = 852
--------------------------------
the hash:
--------------------------------
hash = 0x88700cc: **buckets = 0x88700d8, size = 3
0x88700d8: buckets[0] = 0x8870100
0x88700dc: buckets[1] = 0x8870140
0x88700e0: buckets[2] = 0x8870180
buckets 1: 0x8870140: {value = "220.181.111.147", len = 13, name = "www.baidu.com" }
buckets 2: 0x8870180: {value = "58.63.236.35" , len = 15, name = "www.sina.com.cn"}
buckets 1: 0x8870154: {value = "74.125.71.105" , len = 14, name = "www.google.com" }
buckets 2: 0x8870198: {value = "60.28.14.190" , len = 10, name = "www.qq.com" }
buckets 0: 0x8870100: {value = "123.103.14.237" , len = 11, name = "www.163.com" }
buckets 0: 0x8870114: {value = "219.234.82.50" , len = 12, name = "www.sohu.com" }
buckets 0: 0x8870128: {value = "117.40.196.26" , len = 14, name = "www.abo321.org" }
--------------------------------
the pool:
--------------------------------
pool = 0x8870020
.d
.last = 0x88701c4
.end = 0x8870420
.next = 0x0
.failed = 0
.max = 984
.current = 0x8870020
.chain = 0x0
.large = 0x0
.cleanup = 0x0
.log = 0x0
available pool memory = 604
--------------------------------
find test:
--------------------------------
(url = "www.baidu.com" , key = 270263191 ) found, (ip = "220.181.111.147")
(url = "www.sina.com.cn", key = 1528635686) found, (ip = "58.63.236.35")
(url = "www.google.com" , key = -702889725) found, (ip = "74.125.71.105")
(url = "www.qq.com" , key = 203430122 ) found, (ip = "60.28.14.190")
(url = "www.163.com" , key = -640386838) found, (ip = "123.103.14.237")
(url = "www.sohu.com" , key = 1313636595) found, (ip = "219.234.82.50")
(url = "www.abo321.org" , key = 1884209457) found, (ip = "117.40.196.26")
(url = "www.china.com" , key = -1954599725) not found!
(url = "www.csdn.net" , key = -1667448544) not found!
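The output above is for bucket_size = 64 bytes. It shows that the program distributed the 7 given URLs into 3 buckets; see the output for details. The physical layout of the hash for this example is shown below.
3.3.2 bucket_size = 256 bytes
With bucket_size = 256 bytes, the output is as follows.
# ./ngx_hash_t_test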
--------------------------------
create a new pool:
--------------------------------
pool = 0x8b74020
.d
.last = 0x8b74048
.end = 0x8b74420
.next = 0x0
.failed = 0
.max = 984
.current = 0x8b74020
.chain = 0x0
.large = 0x0
.cleanup = 0x0
.log = 0x0
available pool memory = 984
--------------------------------
create and add urls to it:
--------------------------------
array = 0x8b74048
.elts = 0x8b7405c
.nelts = 7
.size = 16
.nalloc = 7
.pool = 0x8b74020
elements:
0x8b7405c: {key = ("www.baidu.com" , 13), key_hash = 270263191 , value = "220.181.111.147"}
0x8b7406c: {key = ("www.sina.com.cn", 15), key_hash = 1528635686, value = "58.63.236.35" }
0x8b7407c: {key = ("www.google.com" , 14), key_hash = -702889725, value = "74.125.71.105" }
0x8b7408c: {key = ("www.qq.com" , 10), key_hash = 203430122 , value = "60.28.14.190" }
0x8b7409c: {key = ("www.163.com" , 11), key_hash = -640386838, value = "123.103.14.237" }
0x8b740ac: {key = ("www.sohu.com" , 12), key_hash = 1313636595, value = "219.234.82.50" }
0x8b740bc: {key = ("www.abo321.org" , 14), key_hash = 1884209457, value = "117.40.196.26" }
--------------------------------
the pool:
--------------------------------
pool = 0x8b74020
.d
.last = 0x8b740cc
.end = 0x8b74420
.next = 0x0
.failed = 0
.max = 984
.current = 0x8b74020
.chain = 0x0
.large = 0x0
.cleanup = 0x0
.log = 0x0
available pool memory = 852
--------------------------------
the hash:
--------------------------------
hash = 0x8b740cc: **buckets = 0x8b740d8, size = 1
0x8b740d8: buckets[0] = 0x8b740e0
buckets 0: {value = "220.181.111.147", len = 13, name = "www.baidu.com" }
buckets 0: {value = "58.63.236.35" , len = 15, name = "www.sina.com.cn"}
buckets 0: {value = "74.125.71.105" , len = 14, name = "www.google.com" }
buckets 0: {value = "60.28.14.190" , len = 10, name = "www.qq.com" }
buckets 0: {value = "123.103.14.237" , len = 11, name = "www.163.com" }
buckets 0: {value = "219.234.82.50" , len = 12, name = "www.sohu.com" }
buckets 0: {value = "117.40.196.26" , len = 14, name = "www.abo321.org" }
--------------------------------
the pool:
--------------------------------
pool = 0x8b74020
.d
.last = 0x8b7419c
.end = 0x8b74420
.next = 0x0
.failed = 0
.max = 984
.current = 0x8b74020
.chain = 0x0
.large = 0x0
.cleanup = 0x0
.log = 0x0
available pool memory = 644
--------------------------------
find test:
--------------------------------
(url = "www.baidu.com" , key = 270263191 ) found, (ip = "220.181.111.147")
(url = "www.sina.com.cn", key = 1528635686) found, (ip = "58.63.236.35")
(url = "www.google.com" , key = -702889725) found, (ip = "74.125.71.105")
(url = "www.qq.com" , key = 203430122 ) found, (ip = "60.28.14.190")
(url = "www.163.com" , key = -640386838) found, (ip = "123.103.14.237")
(url = "www.sohu.com" , key = 1313636595) found, (ip = "219.234.82.50")
(url = "www.abo321.org" , key = 1884209457) found, (ip = "117.40.196.26")
(url = "www.china.com" , key = -1954599725) not found!
(url = "www.csdn.net" , key = -1667448544) not found!
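The output above is for bucket_size = 256 bytes. It shows that the program placed all 7 given URLs into a single bucket, that is, size = 1 in ngx_hash_init(). The hash elements for the 7 URLs take only 140 bytes in total, so one bucket, buckets[0], is enough.
The table below lists some of the values computed inside ngx_hash_init(). The physical layout figure is omitted; see the figure above.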
url | computed length | value of test[0]
www.baidu.com | 4+ngx_align(13+2,4)=20 | 20
www.sina.com.cn | 4+ngx_align(15+2,4)=24 | 44
www.google.com | 4+ngx_align(14+2,4)=20 | 64
www.qq.com | 4+ngx_align(10+2,4)=16 | 80
www.163.com | 4+ngx_align(11+2,4)=20 | 100
www.sohu.com | 4+ngx_align(12+2,4)=20 | 120
www.abo321.org | 4+ngx_align(14+2,4)=20 | 140
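4. Summary
This article analyzed the hash structure of nginx-1.0.4 in some depth, covering the hash structure, the hash element structure, and the hash initialization structure, as well as the main hash operations: initialization and lookup. It then used a simple example to show how the nginx hash is used, gave the detailed output, and drew the physical layout of the hash to show readers its design and principles. Along the way it also showed how to compile and test nginx code.
Stay tuned for the following analyses. Thank you!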
Reference
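Nginx代码研究计划 (RainX1982) [Nginx code study plan]
nginx-1.0.4源码分析—内存池结构ngx_pool_t及内存管理 (阿波) [nginx-1.0.4 source analysis: the ngx_pool_t memory pool structure and memory management]
nginx-1.0.4源码分析—数组结构ngx_array_t (阿波) [nginx-1.0.4 source analysis: the ngx_array_t array structure]
nginx-1.0.4源码分析—链表结构ngx_list_t (阿波) [nginx-1.0.4 source analysis: the ngx_list_t list structure]
nginx-1.0.4源码分析—队列结构ngx_queue_t (阿波) [nginx-1.0.4 source analysis: the ngx_queue_t queue structure]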