Git Tools - Submodules: Use of submodule and subtree

In daily use of Git, one project usually corresponds to one Git repository. So what do we usually do when business-level code needs to be reused across projects?

For example, a project at work needs to include and use another project: perhaps a third-party library, or a library you developed yourself that is used in multiple parent projects.

It then becomes necessary to extract a common library for use by multiple projects, but how can this library be managed easily with Git?

Here is the problem: you want to treat them as two separate projects, yet still use one inside the other.

Roughly speaking, there are two options:

  1. Extract the shared code into NPM packages for reuse;

  2. Use Git sub-repositories to reuse the code.

However, some pages or functions in the two programs overlap. In the front-end field, for example, the react, vue, and angular versions of AntDesign or element-UI share the same styles, but the components are different.

If you develop two sets of code for the overlapping parts, a lot of effort is wasted.

I personally recommend using Git submodules for development. The parent repository depends on two shared submodules, and each submodule is developed together with the parent repository, which avoids version problems and duplicated development.

For more complex projects, we may split the code into different submodules by function. The main project depends on the submodules but does not care about the details of their internal development process.

The general structure may look like this:

project
|--moduleA
|--submoduleC
|--submoduleD
|--moduleB

Of the project and modules A, B, C, and D, the submodules C and D live in separate Git repositories. In this case, you need Git's submodule feature.

Git Tools - Submodules

Git solves this problem through submodules. A submodule lets you keep one Git repository as a subdirectory of another Git repository: you can clone another repository into your own project while keeping its commits independent.

Put simply, several other Git repositories are placed inside one Git repository, and those repositories are sub-repositories of the parent repository.

Git submodule is a good tool when multiple projects share a common library. The library lives in its own repository and is stored inside the parent project as a separate Git project, so it keeps its own independent commit, push, and pull. The parent project includes the subproject as a submodule and records which commit of the subproject it uses, so each commit in the parent project carries the submodule information. The submodules can be initialized when the parent project is cloned.

You can check the official website: https://git-scm.com/book/zh/v2/Git-Tools-Submodule

Multiple parent repositories can all depend on the same sub-repository. The sub-repository is not modified on its own; it is updated and released along with a parent project, and the other projects that depend on it only pull the updates.

Two ways to use Git sub-repositories

  1. git submodule

  2. git subtree

git submodule (submodule)

git submodule lets us use one or more Git repositories as subdirectories of another Git repository. You can clone another repository into your project while keeping its commits relatively independent.

Get started with submodules

git clone https://github.com/zhoulujun/zhoulujun.cn-phpcms.git zhoulujun
cd zhoulujun
git submodule add <repository-url> tools
git submodule add https://github.com/zhoulujun/zhoulujun.cn-tools-vue.git tools-vue

After adding the submodule, run git status. You can see that a .gitmodules file has been added to the directory. This file stores the submodule information.

$ git status
On branch master

Initial commit

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

    new file:   .gitmodules
    new file:   assets

To experiment locally, use git init --bare to create two bare repositories that represent the main repository and the sub-repository it depends on. We name the main repository main and the sub-repository lib. git subtree uses the same initialization, so it is not repeated below.
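The local setup described above can be sketched as follows. This is only an illustration under assumptions: the temporary directory, the lib-work and main-work clones, the file names, and the demo identity are made up, and protocol.file.allow=always is used only because the demo submodule URL is a local path, which recent Git versions block by default.

```shell
#!/bin/bash
set -e

# Hypothetical sandbox: all paths, file names, and the demo identity are made up.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Two bare repositories play the role of the remote main and lib repositories.
git init -q --bare main.git
git init -q --bare lib.git

# Seed lib with one commit so that there is something to reference.
git clone -q lib.git lib-work
(
  cd lib-work
  echo "console.log('lib');" > lib.js
  git add lib.js
  git commit -q -m "feat: create lib.js in sub-repository"
  git push -q origin HEAD
)

# Seed main, then add lib as a submodule under lib/.
git clone -q main.git main-work
cd main-work
echo "console.log('main');" > index.js
git add index.js
git commit -q -m "feat: create index.js in parent repository"
# protocol.file.allow=always is needed only because the demo URL is a local path.
git -c protocol.file.allow=always submodule add "$work/lib.git" lib
git commit -q -m "feat: add sub-repository dependency"
git push -q origin HEAD

cat .gitmodules        # the path and url of the submodule
git submodule status   # the commit of lib that main currently records
```

With a real remote URL instead of a local path, the -c protocol.file.allow=always part can be dropped.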

Common commands for git submodule

  • View submodules: git submodule

  • Update submodules:

    • Check out, in each submodule, the commit recorded by the parent project: git submodule update

    • Update each submodule to the latest commit of its remote branch: git submodule update --remote

  • Clone a project that contains submodules:

    • Clone the parent project: git clone https://github.com/demo.git assets

      • Initialize the submodules: git submodule init

      • Check out the submodules: git submodule update

    • Clone the project together with all of its submodules: git clone https://github.com/demo.git assets --recursive

    • Run git pull in every submodule: git submodule foreach git pull

  • Delete a submodule: git rm --cached subModulesA, then rm -rf subModulesA, then remove its section from .gitmodules

--recursive tells git clone to also clone, recursively, every sub-repository that the parent repository depends on.
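The clone and delete commands above can be tried end to end with a sketch like the following. The repositories lib and parent, the clones clone-a and clone-b, and the demo identity are hypothetical. The removal at the end uses git submodule deinit followed by git rm, which is an alternative to the git rm --cached sequence listed above, not the article's own method.

```shell
#!/bin/bash
set -e

# Hypothetical sandbox; names, paths, and identity are made up for the demo.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Only needed because the demo uses local paths as submodule URLs.
allow="protocol.file.allow=always"

# A sub-repository with one commit, and a parent that includes it as a submodule.
git init -q lib
echo "console.log('lib');" > lib/lib.js
git -C lib add lib.js
git -C lib commit -q -m "feat: create lib.js"

git init -q parent
echo "console.log('main');" > parent/index.js
git -C parent add index.js
git -C parent commit -q -m "feat: create index.js"
git -C parent -c "$allow" submodule add "$work/lib" lib
git -C parent commit -q -m "feat: add lib as a submodule"

# 1. A plain clone leaves the submodule directory empty ...
git clone -q parent clone-a
ls -A clone-a/lib
# ... until the submodule is initialized and checked out.
git -C clone-a -c "$allow" submodule update --init
ls -A clone-a/lib

# 2. One-step alternative: clone the parent and its submodules together.
git -c "$allow" clone -q --recursive parent clone-b

# 3. Remove the submodule again (here in clone-b).
cd clone-b
git submodule deinit -f lib     # unregister it and empty its working directory
git rm -q -f lib                # drop the gitlink and the .gitmodules entry
rm -rf .git/modules/lib         # delete the cached submodule repository
git commit -q -m "chore: remove lib submodule"
```

git submodule update --init combines the separate init and update steps from the list into one command.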

git subtree (subtree merge)

The git submodule introduced above is a built-in feature of Git. The git subtree we introduce next is a contrib script contributed by third-party developers. It is not part of core Git: the contrib directory holds experimental and third-party tools that are maintained by their respective authors.

This also tells us that git subtree is not a natively implemented Git command but a high-level script that third-party developers wrote on top of Git's low-level commands, so everything it does can also be done with basic Git commands.

Subtree serves the same purpose as submodule but appeared later. It was created to address two commonly cited problems of submodule:

  1. Changing the sub-repository's code from inside the parent repository is awkward with submodule: the change has to be committed and pushed in the sub-repository itself, so the workflow feels one-way;

  2. Removing a submodule takes several manual steps rather than a single command.

Subtree supports modification in both directions, and some guides recommend subtree over submodule.

I won't go into more detail here.

Using Git subtree commands

  • Add a remote for the sub-repository

    • Syntax: `git remote add -f <sub-repository name> <sub-repository address>`

    • Example: `git remote add -f component [email protected]`

  • Create the local directory from the remote

    • Syntax: `git subtree add --prefix=<local directory> <sub-repository name> <branch> --squash`

    • Example: `git subtree add --prefix=component component master --squash`

  • Use (pull & push)

    • pull: `git subtree pull --prefix=component component master --squash`

    • push: `git subtree push --prefix=component component master`

Note: these commands must be executed from the top-level directory of the parent repository (the directory that contains `component`), which is not very convenient.

A project with submodules can be cloned in one step by adding the --recursive parameter. With subtree, the sub-repository's files are ordinary files of the parent repository, so a plain clone already contains them, but the remote for the sub-repository is not cloned and has to be added again manually.
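A minimal sketch of the subtree workflow, under assumptions: the component-src and parent repositories, the file contents, and the demo identity are hypothetical, and the script simply exits if the git-subtree contrib script is not installed next to the other Git commands. The directory is created with git subtree add, which has to run once before pull or push can be used on that prefix.

```shell
#!/bin/bash
set -e

# git subtree is a contrib script and may be missing from some installations.
if ! [ -x "$(git --exec-path)/git-subtree" ]; then
  echo "git subtree is not installed, skipping the demo"
  exit 0
fi

# Hypothetical sandbox; names, paths, and identity are made up for the demo.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# The sub-repository with one commit.
git init -q component-src
echo "export const version = 1;" > component-src/index.js
git -C component-src add index.js
git -C component-src commit -q -m "feat: component v1"
branch=$(git -C component-src symbolic-ref --short HEAD)

# The parent repository needs at least one commit before a subtree can be added.
git init -q parent
cd parent
echo "console.log('main');" > index.js
git add index.js
git commit -q -m "feat: create index.js in parent repository"

# Register the remote, then copy its content into component/ as a squashed commit.
git remote add -f component "$work/component-src"
git subtree add --prefix=component --squash component "$branch"

# A later change in the sub-repository is brought in with subtree pull.
echo "export const version = 2;" > "$work/component-src/index.js"
git -C "$work/component-src" commit -q -am "feat: component v2"
git subtree pull --prefix=component --squash -m "chore: update component" component "$branch"

cat component/index.js
```

Unlike a submodule, component/ here holds ordinary tracked files of the parent repository, which is the "copy" half of the link versus copy comparison.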

I am used to submodule, so I naturally find subtree complicated and hard to use. It is a bit like being used to IntelliJ and having no patience for VS Code, haha!

One summary of the difference between submodule and subtree is quite vivid: submodule is a link; subtree is a copy.

Recommended reading: Detailed explanation of Git application, Lecture 10: Git sub-library: submodule and subtree (AhuntSun's blog, CSDN)

Analysis of how Git sub-repositories work

If you do not understand the underlying principles well, sub-repositories easily become confusing: it is unclear whether to commit the parent repository or the sub-repository first.

Analysis of git submodule principle

We know that Git is built on roughly four types of objects, which form the basis for how Git tracks file content:

  • blob: a binary large object that stores the content of one version of a file.

  • tree: records blob objects and other tree objects; it can be understood as a directory.

  • commit: records the tree object of this commit, the parent commit object(s), and the commit message.

  • tag: records a name (for example a version) attached to a commit.

For more detail, see "In-depth Understanding of Git".

We rely on a print_all_object helper function here. It prints the objects reachable from all refs in the Git repository, following reverse commit history. It can be placed in a directory on PATH for global use:

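#!/bin/bash

# Print every object reachable from any ref, newest commits first.
print_all_object() {
  # rev-list prints "<sha> <path>"; keep only the object name.
  for object in $(git rev-list --objects --all | cut -d ' ' -f 1); do
    echo "SHA1: $object"
    git cat-file -p "$object"
    echo '-------------------------------'
  done
}

print_all_object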

We run print_all_object in the main repository:

# This is the point in time right after we committed the submodule
# The long hashes are truncated, which does not affect readability
print_all_object

SHA1: a1cfd26e
tree c77ba9c2
parent ab118b8

feat: Add sub-repository dependency
-------------------------------
SHA1: ab118b8
tree f5771cd

feat: Create index.js in parent repository
-------------------------------
SHA1: c77ba9c2
100644 blob d8c9fb4    .gitmodules
100644 blob ddd81ae    index.js
160000 commit 40f8536    lib
-------------------------------
SHA1: d8c9fb4
[submodule "lib"]
        path = lib
        url = /path/to/repos/lib.git
-------------------------------
SHA1: ddd81ae
console.log('main');
-------------------------------
SHA1: f5771cd
100644 blob ddd81ae    index.js
-------------------------------

The index.js file is a blob object with file mode 100644, whereas the lib sub-repository is recorded as a commit object with file mode 160000. This is a special mode in Git: it means the entry records a commit, rather than a subdirectory or a file.

This is the core principle of git submodule. When Git processes a submodule reference, it does not scan the files inside the sub-repository for changes. Instead, it takes the hash of the commit that the sub-repository's HEAD currently points to. After we commit a change in the sub-repository, Git sees that the submodule's commit hash has changed and records the change of that pointer.

git status then shows a "new commits" hint for the submodule. Git does not care how the files of the submodule changed; it only records the hash of the submodule's commit in the current commit of the parent. Later, when the sub-repository is pulled through the parent repository, Git checks out the commit whose hash is stored in the parent's commit and thereby restores the code of the whole repository.
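The pointer behavior described above can be observed with a short sketch. The lib and main repositories, the file contents, and the demo identity are hypothetical, and protocol.file.allow=always is needed only because the submodule URL is a local path.

```shell
#!/bin/bash
set -e

# Hypothetical sandbox; names, paths, and identity are made up for the demo.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q lib
echo "console.log('lib');" > lib/lib.js
git -C lib add lib.js
git -C lib commit -q -m "feat: create lib.js"

git init -q main
cd main
echo "console.log('main');" > index.js
git add index.js
git commit -q -m "feat: create index.js in parent repository"
git -c protocol.file.allow=always submodule add "$work/lib" lib
git commit -q -m "feat: add sub-repository dependency"

# The tree of the parent commit lists lib with mode 160000 and type commit.
git ls-tree HEAD
old=$(git rev-parse HEAD:lib)

# Commit a change inside the submodule; the parent only sees a different hash.
echo "// changed" >> lib/lib.js
git -C lib commit -q -am "feat: change lib.js"
git status --short            # lib is reported as modified
git --no-pager diff lib       # old and new "Subproject commit" hashes

# Recording the new pointer is an ordinary add + commit in the parent.
git add lib
git commit -q -m "chore: point lib at its new commit"
new=$(git rev-parse HEAD:lib)
echo "lib pointer: $old -> $new"
```

The parent's diff contains no file content from lib at all, only the two commit hashes, which is why the parent and the sub-repository have to be committed separately.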

Notes on git submodule

Although git submodule brings a lot of convenience to development, it also invites some easy mistakes. We have sorted them out so that everyone can avoid the pitfalls:

  1. The submodule's commit is not pushed to its remote repository, but the parent records the new submodule commit and is pushed. When someone else pulls the code, Git reports that the submodule's commit does not exist: fatal: reference is not a tree.

  2. If the submodule is checked out on a detached HEAD and you modify the sub-repository's code from within the main repository, then running git submodule update to pull the latest code overwrites the changes you made on the detached HEAD.

  3. Suppose the main branch of the main repository does not use submodules, and the sub-repository is only used on a development branch. When you switch from the development branch back to the branch without submodules, Git does not automatically delete the submodule directory; you need to delete it manually.
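One way to guard against the first pitfall is git push --recurse-submodules=check, which is not mentioned in the notes above and is shown here only as a possible safeguard. In this sketch the bare repositories, the working clones, and the demo identity are hypothetical; the first push is expected to be refused because the lib commit has not been pushed yet.

```shell
#!/bin/bash
set -e

# Hypothetical sandbox; names, paths, and identity are made up for the demo.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Bare "remotes" for both repositories, with lib seeded by one commit.
git init -q --bare lib.git
git init -q --bare main.git
git clone -q lib.git lib-seed
echo "console.log('lib');" > lib-seed/lib.js
git -C lib-seed add lib.js
git -C lib-seed commit -q -m "feat: create lib.js"
git -C lib-seed push -q origin HEAD

git clone -q main.git main-work
cd main-work
echo "console.log('main');" > index.js
git add index.js
git commit -q -m "feat: create index.js in parent repository"
git -c protocol.file.allow=always submodule add "$work/lib.git" lib
git commit -q -m "feat: add sub-repository dependency"
git push -q origin HEAD

# Pitfall 1: commit in the submodule, record it in the parent, forget to push lib.
echo "// changed" >> lib/lib.js
git -C lib commit -q -am "feat: change lib.js"
git add lib
git commit -q -m "chore: point lib at its new commit"

# With check, Git verifies that the recorded lib commit exists on a remote of lib.
if git push --recurse-submodules=check origin HEAD; then
  first_attempt=pushed
else
  first_attempt=refused
fi

# Push the submodule first, then push the parent again.
git -C lib push -q origin HEAD
if git push -q --recurse-submodules=check origin HEAD; then
  second_attempt=pushed
else
  second_attempt=refused
fi
echo "first: $first_attempt, second: $second_attempt"
```

The same check can be made the default for a repository with git config push.recurseSubmodules check, so that the flag does not have to be typed on every push.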

Reference articles:

Git submodule management and use of submodules - Jianshu

Use Git Submodule to manage submodules - Jiang Jiazhi - SegmentFault

Git sub-repository explained in simple terms - Juejin

Submodule - Git Book, Chinese version

Use git subtree & submodule to manage multiple subprojects - Jianshu

Detailed explanation of Git application, Lecture 10: Git sub-library: submodule and subtree - AhuntSun's blog, CSDN

git submodule vs git subtree https://efe.baidu.com/blog/git-submodule-vs-git-subtree/

When reprinting the article "Git Tools - Submodules: Use of submodule and subtree" from this site, please indicate the source: Zhou Junjun's personal website.

Origin blog.csdn.net/u012244479/article/details/130049839