A disruptive performance improvement? When the remote repository is too large

2020-03-17 06:08

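About the author

Method 1

Wang Zhenwei is a member of the CODING founding team. He has many years of systems software development experience, specializes in Linux, Golang, Java, Ruby, Docker, and related technical areas, and has worked on system architecture and operations at CODING for nearly seven years.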

http://stackoverflow.com/questions/25815202/git-fetch-a-single-commit

Preface

The git fetch command delivers references (names, not raw commit-IDs) to the remote, more or less.

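Recently Google published an article describing an update to Git's transport protocol, which caused quite a stir in the domestic tech community (for related articles, search Baidu for "Git v2 performance improvement"). Many people in the community have been forwarding the news. So how large is the performance improvement, and what are the details behind it? In fact, this change improves performance only in extreme cases; in the vast majority of cases users will not notice any difference. Many of the uncritical reposts are probably down to Google's brand effect :)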

 

What is Git?

(More specifically, use git ls-remote remotename to see what the remote is willing to give you in terms of names.

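To explain why, let us first briefly introduce the protocols Git uses. If you are not yet familiar with Git and want to learn more, see its official website: . You can also come here to learn how to use a fast, high-quality Git hosting service in China.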

 

Git transport protocols

This produces a list of SHA-1s on the left with names on the right, and the only thing your fetch can ask for is the names-on-the-right.

Git commonly uses three protocols: SSH, HTTP(S), and Git. The first two are the most widely used.

 

Let's look at usage examples of the HTTP(S) and SSH protocols.

At which point you'll get the ID-on-the-left if the name on the remote still points to that ID, so it depends on how actively that remote gets updated.)

git clone

 

Cloning into 'coding-demo'...

 

remote: Counting objects: 3, done.

 

remote: Total 3 (delta 0), reused 0 (delta 0)

It is possible, in various ways, to deliver raw commit-IDs to a remote and ask that remote what is visible starting from that point,

Unpacking objects: 100% (3/3), done.

 

git clone git@git.coding.net:wzw/coding-demo.git

and sometimes working backwards through history as well, but not via git fetch.

Cloning into 'coding-demo'...

 

remote: Counting objects: 3, done.

(You can use git archive but the remote can decide whether or not to allow you to access via raw commit-IDs;

remote: Total 3 (delta 0), reused 0 (delta 0)

 

Receiving objects: 100% (3/3), done.

or with remotes that have web server access, including to specific commits, you can often just view the top-level contents of a commit,

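As you can see, for a fresh clone the two processes are essentially identical.

In fact, Git's underlying handling is the same for every application-layer protocol, whether HTTP(S), SSH, or the Git protocol.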

and use that to "drill down", as they say, to the various pieces. But that is a very slow way to do it.)

Let's take a closer look at what Git does during the transfer.

 

GIT_TRACE=1 GIT_TRACE_PACKET=1 git clone

 

17:48:21.767799 git.c:344               trace: built-in: git 'clone' ''

 

Cloning into 'coding-demo'...

If you'd like to use git fetch to get some particular commit, probably the easiest way to do that is to have someone with access to the remote attach a name—most likely a tag—to that commit ID.

17:48:21.797959 run-command.c:626       trace: run_command: 'git-remote-https' 'origin' ''

 

17:48:22.278880 pkt-line.c:80           packet:          git< # service=git-upload-pack

Then you can have your git fetch bring over that refspec, and put it under any other refspec you like.

17:48:22.279390 pkt-line.c:80           packet:          git< 0000

 

17:48:22.279405 pkt-line.c:80           packet:          git< fdacba1d541c75bd48f2cd742ee18f77ea3517a1 HEAD\0multi_ack thin-pack side-band side-band-64k ofs-delta shallow deepen-since deepen-not deepen-relative no-progress include-tag multi_ack_detailed no-done symref=HEAD:refs/heads/master agent=git/2.15.0

For instance, suppose you can ssh directly to whatever hosts origin:

17:48:22.279419 pkt-line.c:80           packet:          git< fdacba1d541c75bd48f2cd742ee18f77ea3517a1 refs/heads/master

 

17:48:22.279431 pkt-line.c:80           packet:  

$ ssh our.origin.host 'cd /repos/repo.git; git tag temporary f1e32e1'
[enter password, etc; observe tag created]
$ git fetch origin refs/tags/temporary:refs/heads/newbranch
[observe fetch happen; now you have local branch 'newbranch']
$ ssh our.origin.host 'cd /repos/repo.git; git tag -d temporary'


 

OK, that completes the background. Have you noticed that blockchain, at a technical level, has similarities to Git's storage? :)

Note that the name need not be a branch, it need only be a reference you can pull over with git fetch and see with git ls-remote.

During a clone, the server first advertises a list of refs to the client. This is the part that Git protocol v2 claims to improve, as explained later.

Like this:

You then use a name that will match that on the left-hand-side of your refspec when fetching.

17:49:19.772436 pkt-line.c:80           packet:        clone< fdacba1d541c75bd48f2cd742ee18f77ea3517a1 refs/heads/master

 

17:49:19.772527 pkt-line.c:80           packet:        clone< 1536ad10fc0a188c50680932ca191c8da46938c4 refs/heads/test-abc

The name created in your repo is controlled by the right-hand-side of the refspec (refs/heads/newbranch in the example above).
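As a rough illustration of the left/right split described above, the following shell sketch pulls a refspec apart. The helper names refspec_src and refspec_dst are made up for this example, and it assumes the simple src:dst form with exactly one colon and no leading +; real refspecs allow more forms than this handles.

```shell
#!/bin/sh
# Split a simple "<src>:<dst>" refspec into its two halves.
# Assumption: exactly one colon and no leading "+".
# refspec_src / refspec_dst are hypothetical helper names, not git commands.

# Left-hand side: the name as it exists on the remote.
refspec_src() { printf '%s\n' "${1%%:*}"; }

# Right-hand side: the name that will be created locally.
refspec_dst() { printf '%s\n' "${1#*:}"; }

spec='refs/tags/temporary:refs/heads/newbranch'
echo "remote name (left): $(refspec_src "$spec")"
echo "local name (right): $(refspec_dst "$spec")"
```

With the refspec from the example above, the left half is refs/tags/temporary and the right half is refs/heads/newbranch.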

17:49:19.772549 pkt-line.c:80           packet:        clone< 1536ad10fc0a188c50680932ca191c8da46938c4 refs/heads/test-bcd

 

17:49:19.772566 pkt-line.c:80           packet:        clone< 30eb4b0d813c662c4d7e87c4d3b4cf561e544f8e refs/tags/v1.0

 

17:49:19.772863 pkt-line.c:80           packet:        clone< 1536ad10fc0a188c50680932ca191c8da46938c4 refs/tags/v1.0^{}

 

Clearly, each 40-digit hexadecimal number above is the ID of the object that the ref on the same line points to.

This is also the answer to your last paragraph question: you can only name things that have names on the remote (this is partly intended to avoid "leaking" unnamed commits that remain in a repository before garbage-collection, so it's considered a feature rather than a bug). These names go on the LHS of the refspec. Your own names go on the right.

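The client, for its part, only needs to work out what it is missing from the refs it is interested in and the objects already in its local object store (for pull and fetch a local object store exists; for clone there is none yet, so it needs every object of interest).

Once the client has computed the list of objects it is interested in, it tells the remote server using want commands.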

 

17:49:19.776185 pkt-line.c:80           packet:        clone> want fdacba1d541c75bd48f2cd742ee18f77ea3517a1 multi_ack_detailed side-band-64k thin-pack ofs-delta deepen-since deepen-not agent=git/2.15.1.(Apple.Git-101)

 

17:49:19.776215 pkt-line.c:80           packet:        clone> want fdacba1d541c75bd48f2cd742ee18f77ea3517a1

Your name on the right is assumed to be a branch or tag name (based on what the name on the left matches, though you can explicitly spell out refs/heads/ or refs/tags/ to override it), so even though f1e32e1... is a valid SHA-1, it's treated as a branch name here; the missing name on the left translates to HEAD, as missing names almost always do, and git fetch creates a branch whose name is disturbingly SHA-1-ish. (Incidentally I once created a branch name that looked like an SHA-1, and later confused myself. I forget exactly what the name was, something like de-bead without the hyphen. I renamed it to the hyphenated version just to make it clear I didn't mean a raw commit ID! :-) )

17:49:19.776224 pkt-line.c:80           packet:        clone> want 1536ad10fc0a188c50680932ca191c8da46938c4

 

17:49:19.776232 pkt-line.c:80           packet:        clone> want 1536ad10fc0a188c50680932ca191c8da46938c4

There appears to be no way around this: the remote must first give the commit a name, by creating either a branch or a tag.

17:49:19.776239 pkt-line.c:80           packet:        clone> want 30eb4b0d813c662c4d7e87c4d3b4cf561e544f8e

Only then can the commit be obtained via fetch.

If the client is running pull or fetch, it also tells the remote which objects it already has (a later part of this article covers this point specifically).
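The want lines in the trace above can be approximated from the ref advertisement with a few lines of shell. This is only a sketch under stated assumptions: it treats every advertised ref as wanted (as in a fresh clone), skips the peeled tag entry ending in ^{} because it is not a ref of its own, and ignores the capability list on the first want. The function name wants_from_advertisement is made up for this example.

```shell
#!/bin/sh
# Derive "want" lines from a ref advertisement.
# Input on stdin: "<object-id> <refname>" per line, copied from the trace above.
# Assumptions: fresh clone (every ref is wanted, nothing is held locally);
# peeled entries ending in "^{}" are skipped.
# wants_from_advertisement is a hypothetical helper name.
wants_from_advertisement() {
    awk 'substr($2, length($2) - 2) != "^{}" { print "want " $1 }'
}

advert='fdacba1d541c75bd48f2cd742ee18f77ea3517a1 refs/heads/master
1536ad10fc0a188c50680932ca191c8da46938c4 refs/heads/test-abc
1536ad10fc0a188c50680932ca191c8da46938c4 refs/heads/test-bcd
30eb4b0d813c662c4d7e87c4d3b4cf561e544f8e refs/tags/v1.0
1536ad10fc0a188c50680932ca191c8da46938c4 refs/tags/v1.0^{}'

printf '%s\n' "$advert" | wants_from_advertisement
```

For the five advertised lines this prints four want lines, one per ref. Duplicate object IDs are kept, which matches the repeated IDs visible in the trace.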

 

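Based on the objects the client wants and the objects the client already has, the remote server consults its own object store and the dependencies between objects, gathers the objects the client needs, and sends them to the client as a compressed pack.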
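Each line in the packet traces above travels on the wire as a pkt-line: four hexadecimal digits giving the total length (the prefix itself included), followed by the payload. The 0000 seen in the trace is the special flush packet. The sketch below frames two of the traced lines; the function names pkt_line and pkt_flush are made up for this example, and it assumes ASCII payloads so that character count equals byte count.

```shell
#!/bin/sh
# Frame a line of text as a pkt-line.
# Length = 4 (hex prefix) + payload + 1 (trailing LF), written as 4 hex digits.
# Assumption: ASCII payload, so ${#1} equals the byte count.
# pkt_line / pkt_flush are hypothetical helper names.
pkt_line() {
    printf '%04x%s\n' "$(( ${#1} + 5 ))" "$1"
}

# A flush packet is the literal length 0000 with no payload.
pkt_flush() {
    printf '0000'
}

pkt_line '# service=git-upload-pack'
pkt_line 'want fdacba1d541c75bd48f2cd742ee18f77ea3517a1'
pkt_flush
```

The first call yields the prefix 001e and the second yields 0032, since 25 + 5 = 30 and 45 + 5 = 50 bytes respectively.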

1. Initialize a repository locally

After receiving the pack, the client unpacks it, verifies the objects, and updates the corresponding refs.

git init

What Google did in protocol version 2

2. Add the remote

The full protocol version 2 specification is here:

git remote add origin

Here we explain its main changes, of which there are three:

3. Proceed as described in the link above

Server-side ref filtering

 git ls-remote origin >origin.txt

Easier extensibility for new features (for example, the ability to declare which refs are wanted)

This writes the output of git ls-remote origin to the file origin.txt.

Simplified client-side handling of the HTTP protocol

 

What many clickbait headlines have overstated is mainly the first point: server-side ref filtering.

d6602ec5194c87b0fc87103ca4d67251c76f233a   refs/tags/v0.99

Google's official blog describes this part as follows:

$ git pull origin v0.99
remote: Counting objects: 4508, done.
remote: Compressing objects: 100% (234/234), done.
remote: Total 4508 (delta 1498), reused 1409 (delta 1409), pack-reused 2865
Receiving objects: 100% (4508/4508), 1.08 MiB | 334.00 KiB/s, done.
Resolving deltas: 100% (3056/3056), done.
From
* tag v0.99 -> FETCH_HEAD
error: Trying to write non-commit object d6602ec5194c87b0fc87103ca4d67251c76f233a to branch refs/heads/master
fatal: Cannot update the ref 'HEAD'.

The main motivation for the new protocol was to enable server sid

Or

git fetch origin v0.99

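This command does not fail, but there is still no code in the working tree, and I could not find a way to get the code out. [Because the local repository does not have a single commit, there is no HEAD.]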

 

$ git rev-list FETCH_HEAD --count
1076

// At version 0.99 there were 1076 commits in total

 

$ git reset --hard d6602ec5194c87b0fc87103ca4d67251c76f233a   [this did not help much either]
HEAD is now at a3eb250 [PATCH] alternate object store and fsck

 

// HEAD and master point to the same commit

8d530c4d64ffcc853889f7b385f554d53db375ed HEAD    

// 5 branches
ee6ad5f4d56e697c972af86cbefdf269b386e470 refs/heads/maint
8d530c4d64ffcc853889f7b385f554d53db375ed refs/heads/master
c07a1e8782dadcedeffd389aa9bce4fda5b0983c refs/heads/next
9b9e9adbccf975e4ffc7af213fbf55f187e752bf refs/heads/pu
49a1e5ee48904bfe562388041bdcdb3d8ad21d10 refs/heads/todo
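Since fetch can only ask for names, a quick way to check whether a commit is reachable by name is to look up its ID in a listing like the one above. The sketch below does this over the sample lines; the function name names_for_id is made up for this example, and in real use the input would come from git ls-remote origin rather than a variable.

```shell
#!/bin/sh
# List the remote names that currently point at a given object ID.
# Input on stdin: "<object-id> <name>" lines, as printed by git ls-remote.
# names_for_id is a hypothetical helper name.
names_for_id() {
    # Appending "" forces string comparison even if an ID looks numeric.
    awk -v id="$1" '($1 "") == (id "") { print $2 }'
}

# Sample data copied from the listing above.
listing='8d530c4d64ffcc853889f7b385f554d53db375ed HEAD
ee6ad5f4d56e697c972af86cbefdf269b386e470 refs/heads/maint
8d530c4d64ffcc853889f7b385f554d53db375ed refs/heads/master
c07a1e8782dadcedeffd389aa9bce4fda5b0983c refs/heads/next
9b9e9adbccf975e4ffc7af213fbf55f187e752bf refs/heads/pu
49a1e5ee48904bfe562388041bdcdb3d8ad21d10 refs/heads/todo'

printf '%s\n' "$listing" | names_for_id 8d530c4d64ffcc853889f7b385f554d53db375ed
```

For the ID shown this prints HEAD and refs/heads/master. An ID that prints nothing has no name on the remote, so it cannot be requested by name with git fetch.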