Source Address: http://peopleyun.com/?p=665
This article analyzes the BigTable data model in depth and describes how it is called through its API.
Data Model
As I said before, and as the name BigTable suggests, it is a very large table: one capable of storing billions of rows (Row) and several thousand columns (Column). How large can such a table get? Two simple examples: a table holding every site on the Internet and its content, or a table holding the personal information of all Chinese citizens. The overall size of such tables can exceed the PB level, and they keep growing over time, so obviously a distributed approach is needed rather than a single machine carrying such a huge and growing table. First, the basic unit of the BigTable data model is introduced: the table.
Table
This is the Table. Although the screenshot above shows only three Rows and five Columns, this table is meant to store the personal information of all Chinese citizens, so in practice it will hold more than 1.3 billion Rows and hundreds of Columns. Next, two features that improve access efficiency and scalability are introduced: Column Family (column group) and Tablet (sheet).
Column Family
Each table has hundreds of Columns, and most queries need only a few of them, so fetching every Column on every query would be wasteful. For this reason Google's BigTable design introduces the Column Family, which groups multiple Columns together. For example, the "home address" and "work address" Columns in the figure both belong to the "address" Column Family. The biggest benefit is that the Columns of one family can be stored together, which improves access efficiency and also avoids reading unnecessary Columns: a read can select just one Column Family.
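The grouping idea can be sketched in a few lines of C++. This is only an illustration, not BigTable's implementation: the names Family, Row, SetCell and ReadFamily and the in-memory layout are hypothetical, chosen to show that a read naming one Column Family never touches the others.

```cpp
#include <map>
#include <string>

// Hypothetical in-memory row: family name -> (column qualifier -> value).
// Columns that belong to the same family are stored together.
using Family = std::map<std::string, std::string>;
using Row = std::map<std::string, Family>;

// Writes a value under a "family:qualifier" column key.
void SetCell(Row& row, const std::string& column_key, const std::string& value) {
  const std::string::size_type colon = column_key.find(':');
  // A key without ':' is treated as a family with an empty qualifier.
  const std::string family = column_key.substr(0, colon);
  const std::string qualifier =
      colon == std::string::npos ? "" : column_key.substr(colon + 1);
  row[family][qualifier] = value;
}

// Returns only the columns of one family; other families are never visited.
Family ReadFamily(const Row& row, const std::string& family) {
  const auto it = row.find(family);
  return it == row.end() ? Family{} : it->second;
}
```

For instance, after storing "address:home", "address:work" and "contact:phone" in one row, ReadFamily(row, "address") returns just the two address columns.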
Tablet
This is very easy to understand: the BigTable system automatically splits a table by ranges of Row Name and distributes the resulting pieces, called Tablets, to different servers.
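A minimal sketch of range-based placement follows. It assumes each Tablet is identified by the last Row Name it covers, and the class TabletDirectory and the server names are hypothetical. It shows only how a Row Name can be routed to the server that holds its range, not how BigTable actually tracks Tablets.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical tablet directory: maps the last Row Name of each tablet
// (inclusive) to the server that holds it. Tablets cover contiguous,
// non-overlapping ranges of the sorted row names.
class TabletDirectory {
 public:
  // Registers a tablet covering all rows after the previous tablet's end
  // key, up to and including end_row.
  void AddTablet(const std::string& end_row, const std::string& server) {
    tablets_[end_row] = server;
  }

  // Returns the server for a row: the first tablet whose end key is >= row.
  // Rows beyond the last end key are assigned to the last tablet.
  const std::string& Locate(const std::string& row) const {
    assert(!tablets_.empty());
    auto it = tablets_.lower_bound(row);
    if (it == tablets_.end()) --it;
    return it->second;
  }

 private:
  std::map<std::string, std::string> tablets_;
};
```

Because the directory is sorted by Row Name, a lookup is a single ordered search, and neighbouring rows land on the same server.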
Timestamp
To help with synchronization and backup of data, each Cell can be given its own Timestamp, and the system can perform GC (Garbage Collection) based on the Timestamp.
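As a rough illustration of timestamp-based GC, the sketch below keeps several versions of one Cell keyed by Timestamp and drops the versions older than a cutoff. The names Cell, Latest and CollectOlderThan are hypothetical, and the cutoff rule is just one possible policy assumed for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <iterator>
#include <map>
#include <string>

// Hypothetical versioned cell: timestamp -> value, ordered newest first.
using Cell = std::map<std::int64_t, std::string, std::greater<std::int64_t>>;

// Returns the newest value, or an empty string for an empty cell.
std::string Latest(const Cell& cell) {
  return cell.empty() ? std::string() : cell.begin()->second;
}

// One possible GC policy: drop every version older than min_timestamp.
// Returns the number of versions removed.
std::size_t CollectOlderThan(Cell& cell, std::int64_t min_timestamp) {
  // With descending order, upper_bound finds the first key < min_timestamp.
  const auto first_old = cell.upper_bound(min_timestamp);
  const std::size_t removed =
      static_cast<std::size_t>(std::distance(first_old, cell.end()));
  cell.erase(first_old, cell.end());
  return removed;
}
```

A version whose Timestamp equals the cutoff is kept; only strictly older versions are collected.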
Call Interface
Google's BigTable is called mainly through an API. Below is some sample code, adapted mainly from the BigTable paper.
// Open the table
Table *T = OpenOrDie("/peopletable");
// Find the appropriate Row and make the appropriate update
RowMutation r1(T, "310101");
r1.Set("address:home", "SH88");
// Perform the update
Operation op;
Apply(&op, &r1);
// Create a Scanner for queries
Scanner scanner(T);
ScanStream *stream;
// The code below: 1. restricts the scan to the "address" Column Family; 2. returns all versions; 3. looks up the Row whose Row Name is "310101"
stream = scanner.FetchColumnFamily("address");
stream->SetReturnAllVersions();
scanner.Lookup("310101");
for (; !stream->Done(); stream->Next()) {
  printf("%s %s %lld %s\n", scanner.RowName(), stream->ColumnName(),
         stream->MicroTimestamp(), stream->Value());
}
Part II will focus on BigTable's storage model.
Reproduced in: https://www.cnblogs.com/licheng/archive/2010/09/09/1821903.html