Git study notes (1): Introduction to Git

1. Introduction to Git (from wiki)

git (/ɡɪt/) is a distributed version control system originally created by Linus Torvalds and released under the GPL in 2005. It was originally designed to better manage Linux kernel development. It should not be confused with GNU Interactive Tools, a file manager with a Norton Commander-like interface that also went by the name git.
Git drew its initial inspiration from BitKeeper and Monotone. It was originally developed only as a backend to be wrapped by other frontends (such as Cogito or StGit), but the git core has since matured enough to be used on its own for version control. Many well-known projects use git for version control, including the Linux kernel, the X.Org Server, and the OLPC kernel.


Origin of the name

Linus Torvalds chose the name "git" self-deprecatingly; it is a British slang word meaning roughly "jerk." In his words:
"I'm an egotistical bastard, and I name all my projects after myself. First Linux, now git." The official git wiki also gives a variety of other explanations for the name.


2. History

In 2002, Linus Torvalds decided to use BitKeeper as the main version control system for maintaining the Linux kernel code. Because BitKeeper was proprietary software, this decision was long questioned in the community. Within the Linux community, Richard Stallman and members of the Free Software Foundation in particular advocated using open source software as the version control system at the heart of Linux. Linus Torvalds had considered adopting existing software (such as Monotone) as the version control system, but these tools had problems, especially poor performance. He was also critical of the architecture of existing solutions such as CVS.
In 2005, Andrew Tridgell wrote a simple program that could connect to BitKeeper repositories. BitKeeper copyright owner Larry McVoy believed that Tridgell had reverse-engineered BitKeeper's internal protocol, and decided to withdraw the license that allowed BitKeeper to be used free of charge. The Linux kernel development team negotiated with BitMover, the company behind BitKeeper, but could not resolve their differences. Linus Torvalds then decided to develop his own version control system to replace BitKeeper, and wrote the first version of git in ten days.


3. Main functions

git was created as a version control tool for Linux kernel development. Unlike centralized version control tools such as CVS and Subversion, it uses distributed repositories and can perform version control without any server-side software, which makes publishing and exchanging source code extremely convenient. git is fast, which naturally matters for large projects like the Linux kernel. The best thing about git is its merge tracking capability.
In fact, when the kernel development team decided to start developing git and using it as the version control system for kernel development, there were many objections from the open source community worldwide. The biggest one was that git was too difficult to understand. Judged by git's internal working mechanism, that is indeed true. But as development progressed, everyday use of git came to be handled by a set of friendlier commands, which makes git very easy to use. More and more well-known projects now use git to manage their development, such as Wine and U-Boot.
As a project rooted in free and open source principles, git itself imposes no permission restrictions on browsing or modifying a repository; limited access control can be added through other tools such as gitosis or CodeBeamer MR. Git originally ran only on Linux/Unix platforms, but its use on Windows is becoming more and more mature, mainly thanks to the Cygwin and msysgit environments and easy-to-use GUI tools such as TortoiseGit. The git source code has also added support for the Cygwin and MinGW build environments and is gradually improving, which is good news for Windows users.


4. Implementation principles

There are many differences between git and other version control systems (such as CVS). Git itself cares about whether a file as a whole has changed, whereas most version control systems such as CVS or Subversion track differences in file content. git does not record a revision relationship for each file, so viewing the history of a single file requires traversing the history snapshots. git also handles file renames implicitly: a file with the same name is assumed to be the predecessor, and if there is no file with the same name, git searches the previous version for a file with similar content.
Git behaves more like a file system: it reads data directly on the local machine, without having to connect to a host. Each developer has a local copy of the entire development history, and changes are replicated from this local repository to other developers. These changes are imported as new development branches that can be merged with the local development branch.
Branches are very lightweight: a branch is just a reference to a commit.
git is written in C for maximum performance. git runs garbage collection automatically, and it can also be invoked directly with the command git gc --prune.
git stores each newly created object as a separate file. To save storage space, the pack operation compresses many files into a single file (a packfile) using delta compression, relying on the heuristic that files with similar names often have similar content, and creates a corresponding index file that records the offset of each object within the packfile. Newly created objects still exist as separate files. Repacking is very time-consuming, and git performs it automatically as needed. You can also start a repack directly with the command git gc. Both the packfile and the index file use SHA-1 as their checksum and as their filename. The git fsck command verifies integrity using these checksums.
A Git server (the git daemon, which serves the git:// protocol) typically listens on TCP port 9418.
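
The storage of a newly created object as a separate file, and the kind of check that git fsck performs, can be illustrated with a minimal Python sketch. It assumes the commonly documented loose-object encoding: a header made of the object type, a space, the content size and a NUL byte, followed by the content, all deflated with zlib. The function names encode_loose and verify_loose are invented for this example and are not git commands or APIs; this is not git's own implementation, only a way to see why damaged content is detectable.

```python
import hashlib
import zlib
from typing import Tuple


def encode_loose(obj_type: str, content: bytes) -> Tuple[str, bytes]:
    """Return (object name, compressed bytes) for one loose object."""
    # Assumed encoding: "<type> <size>" + NUL byte, then the raw content.
    store = f"{obj_type} {len(content)}".encode() + b"\x00" + content
    name = hashlib.sha1(store).hexdigest()
    return name, zlib.compress(store)


def verify_loose(name: str, compressed: bytes) -> bool:
    """Recompute the SHA-1 of the stored bytes and compare it with the name."""
    try:
        store = zlib.decompress(compressed)
    except zlib.error:
        return False  # the data no longer inflates at all
    return hashlib.sha1(store).hexdigest() == name


name, data = encode_loose("blob", b"hello world\n")
print(verify_loose(name, data))  # True

# Simulate damage: different content stored under the old name.
_, tampered = encode_loose("blob", b"hello world!\n")
print(verify_loose(name, tampered))  # False
```

Because the name is derived from the stored bytes, any change to those bytes makes the recomputed hash disagree with the filename, which is what makes the check possible.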


5. Repository directory

  • hooks: Folder where hook scripts are stored
  • logs: Folder where logs are stored
  • refs: Folder of files that store pointers (SHA-1 identifiers) to individual branches
  • objects: Folder where git objects are stored
  • config: File that stores the repository's settings
  • HEAD: File that points to the current branch, generally by naming a file under refs
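
How HEAD leads to a file under refs, and from there to a commit, can be sketched in Python. The sketch builds a throwaway directory that mimics the layout listed above rather than touching a real repository. It assumes that HEAD holds a line of the form "ref: refs/heads/master" and that the branch file holds a 40-character hash; the function name resolve_head and the placeholder hash are made up for illustration. Real repositories may also keep refs in a packed-refs file, which this sketch does not handle.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_head(git_dir: Path) -> Optional[str]:
    """Follow HEAD to the commit hash it currently designates, if any."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        # Symbolic HEAD: the rest of the line names a file under refs/.
        ref_file = git_dir / head[len("ref: "):]
        if not ref_file.exists():
            return None  # no commit on this branch yet, or the ref is packed
        return ref_file.read_text().strip()
    return head  # detached HEAD: the file holds a commit hash directly


# Build a temporary directory with just the files needed for the example.
with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp)
    (git_dir / "refs" / "heads").mkdir(parents=True)
    (git_dir / "HEAD").write_text("ref: refs/heads/master\n")
    fake_commit = "a" * 40  # placeholder, not a real commit hash
    (git_dir / "refs" / "heads" / "master").write_text(fake_commit + "\n")
    resolved = resolve_head(git_dir)
    print(resolved == fake_commit)  # True
```

The branch itself is nothing more than the small file under refs/heads, which is why creating or moving a branch is so cheap.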

6. Data structure

Git has two data structures: a mutable index (also called the stage or cache), which buffers information about the working directory and the next revision to be committed, and an immutable, append-only object database.
The object database contains four types of objects:

  • A blob (binary large object) is the content of a file. A blob carries no filename, timestamp, or other metadata. A blob's internal name is the hash of its contents.
  • A tree object is the equivalent of a directory. It contains a list of filenames, each with its type bits and a reference to a blob or tree object. A tree object is thus a snapshot of the source tree. Together these objects form a Merkle tree.
  • A commit object links tree objects together into a history. It contains the name of the tree object for the top-level source directory, a timestamp, a log message, and the names of zero or more parent commit objects.
  • A tag object is a container that holds a reference to another object and can carry additional metadata about it. It is most often used to store a digital signature for a commit object that marks a particular release and needs to be traceable later.
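
The way blobs, trees and commits reference one another by hash can be sketched in Python. The sketch assumes the commonly documented encodings: each object is hashed as a header (type, space, size, NUL byte) plus a body; a tree entry is the mode, a space, the filename, a NUL byte and the 20 raw bytes of the referenced object's name; a commit body is a few text lines followed by the message. The filename README, the author identity, and all function names are placeholders invented for this example.

```python
import hashlib


def object_name(obj_type: str, body: bytes) -> str:
    """SHA-1 over the assumed header ("<type> <size>" + NUL) plus the body."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries: list) -> bytes:
    """Serialize (mode, filename, hex object name) entries, sorted by name."""
    out = b""
    for mode, name, hex_name in sorted(entries, key=lambda e: e[1]):
        out += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(hex_name)
    return out


def commit_body(tree: str, parents: list, message: str) -> bytes:
    """Serialize a commit; the identity and timestamp are fixed placeholders."""
    who = "Example Author <author@example.com> 0 +0000"
    lines = [f"tree {tree}"] + [f"parent {p}" for p in parents]
    lines += [f"author {who}", f"committer {who}", "", message]
    return "\n".join(lines).encode() + b"\n"


def snapshot(file_content: bytes) -> tuple:
    """Return the (blob, tree, commit) names for a one-file project."""
    blob = object_name("blob", file_content)
    tree = object_name("tree", tree_body([("100644", "README", blob)]))
    commit = object_name("commit", commit_body(tree, [], "initial commit"))
    return blob, tree, commit


v1 = snapshot(b"hello\n")
v2 = snapshot(b"hello, world\n")
# Changing one file changes the blob, the tree above it, and the commit.
print([a != b for a, b in zip(v1, v2)])  # [True, True, True]
```

This is the Merkle property in action: a parent's name depends on the names of its children, so a change anywhere in the source tree propagates up to the commit. The simple sort by filename is only adequate here because the example tree contains plain files and no subdirectories.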

Each object is identified by the SHA-1 hash of its contents. The object is placed in the directory identified by the first two characters of its hash value, and the remaining hash characters are used as the object's filename.
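
As a small illustration of this naming scheme, the Python sketch below computes the name of an empty blob and splits it into directory and filename. One detail beyond the text above: git hashes a short header (for a blob, the word "blob", a space, the content size and a NUL byte) together with the contents, not the contents alone. The function names blob_name and loose_object_path are hypothetical helpers for this example.

```python
import hashlib
from pathlib import PurePosixPath


def blob_name(content: bytes) -> str:
    """SHA-1 of the blob header followed by the file content."""
    header = f"blob {len(content)}".encode() + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(hex_name: str) -> PurePosixPath:
    """First two hex characters name the directory, the rest name the file."""
    return PurePosixPath("objects") / hex_name[:2] / hex_name[2:]


name = blob_name(b"")
print(name)                     # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(loose_object_path(name))  # objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Two files with identical content always produce the same name, so the object database stores that content only once.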
Git also stores mutable labels called references (refs) that point to objects. Objects in the Git database that are not reachable from any reference are cleaned up by garbage collection. Git commands can create, move, and delete references, and "git show-ref" lists all refs. Some reference types:

  • heads: refer to local objects; each head is a pointer to a commit and can point to any commit. A repository can contain any number of heads, while "HEAD" (all uppercase) refers only to the currently active head. By default, each repository has one head, called master.
  • remotes: refer to an object in a remote repository
  • stash: refers to an object that has not yet been committed
  • meta: metadata such as configuration or user permissions in a bare repository, for example the refs/meta/config namespace
  • tags: refer to tagged objects (see the tag object described above)

7. Portability

msysgit and TortoiseGit are available on Windows platforms; TortoiseGit also provides a GUI.
The git project now offers a Windows version for download as well.
Visual Studio has had built-in Git support since version 2013.


8. Git GUI clients

  • GitHub Desktop: can also be used with third-party Git hosting sites
  • Git for Windows
  • TortoiseGit
  • SourceTree
  • GitEye

9. Projects using git

Amarok, Android, Arch Linux, Aquamacs Emacs, BlueZ
Btrfs, Clojure, CakePHP, Debian, Digg
DragonFly BSD, Drupal, ELinks, Fedora, FFmpeg
Freenet, git, GIMP, GNOME, GPM, GStreamer
gThumb, GTK+, Hurd, jQuery, Laika (EHR testing framework)
LilyPond (music typesetting), Linux kernel, Linux Mint, LMMS (music production software), Maemo
MeeGo, Merb, MooTools, One Laptop Per Child (OLPC), OpenFOAM
openSUSE, Perl, PHP, phpBB, PostgreSQL
Prototype.js, Qt, Reddit, rsync, Ruby on Rails
Samba, SproutCore, Sugar, SWI-Prolog, VLC
Wine, Xiph, X.Org Server, x264, YUI

10. Differences between Git and SVN

Git is not just a version control system; it can also serve as a content management system (CMS), a work management system, and so on.
The main differences between Git and SVN:

  • Git is distributed, SVN is not: this is the core difference between Git and non-distributed version control systems such as SVN and CVS.
  • Git stores content as metadata, while SVN stores it as files: every source control system hides file metadata in a folder such as .svn or .cvs.
  • Git branches differ from SVN branches: in SVN a branch is nothing special, just another directory in the repository.
  • Git has no global revision number, while SVN does: this is by far the biggest feature that Git lacks compared to SVN.
  • Git's content integrity is stronger than SVN's: Git names stored content using the SHA-1 hash algorithm, which ensures the integrity of the code and reduces damage to the repository in the event of disk failures or network problems.

Appendix

Reference URL: http://www.runoob.com/git/git-tutorial.html
