This repository has been archived by the owner on Sep 11, 2020. It is now read-only.
When cloning large repositories (large in terms of occupied space, and less so in terms of commit count), go-git seems to use a different strategy than git, resulting in a massive memory footprint and very long clone times. Here, cloning a repo that unpacks into 1.5 GB and contains ca. 110k commits, go-git uses up to 5 GB of RAM and runs for over 4m, while git uses 290 MB and runs in about 1m (tested with geat, which is based on go-git):
➜ date && geat clone git@gitserver:myrepo && date
Mit Jun 21 14:17:28 CEST 2017
=> clone: myrepo cloned from git@gitserver:myrepo into origin
Mit Jun 21 14:21:42 CEST 2017
➜ du -s myrepo
1463492 myrepo
➜ date && git clone git@gitserver:myrepo && date
Mit Jun 21 14:23:58 CEST 2017
Cloning into 'myrepo'...
remote: Counting objects: 974758, done.
remote: Compressing objects: 100% (167444/167444), done.
remote: Total 974758 (delta 798392), reused 973743 (delta 797550)
Receiving objects: 100% (974758/974758), 791.39 MiB | 67.73 MiB/s, done.
Resolving deltas: 100% (798392/798392), done.
Mit Jun 21 14:25:09 CEST 2017
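For reference, the elapsed times implied by the `date` output above can be worked out with a few lines of Go. This is only a small arithmetic sketch: the helper name `elapsed` is made up for illustration, and it assumes both timestamps fall on the same day.

```go
package main

import (
	"fmt"
	"time"
)

// elapsed returns the duration between two wall-clock times given as
// "HH:MM:SS" strings on the same day, as printed by `date` in the logs.
// (Hypothetical helper, not part of go-git or geat.)
func elapsed(start, end string) (time.Duration, error) {
	const layout = "15:04:05"
	s, err := time.Parse(layout, start)
	if err != nil {
		return 0, err
	}
	e, err := time.Parse(layout, end)
	if err != nil {
		return 0, err
	}
	return e.Sub(s), nil
}

func main() {
	// Timestamps copied from the geat (go-git) and git runs above.
	goGit, err := elapsed("14:17:28", "14:21:42")
	if err != nil {
		panic(err)
	}
	git, err := elapsed("14:23:58", "14:25:09")
	if err != nil {
		panic(err)
	}
	fmt.Printf("go-git: %v, git: %v, ratio: %.1fx\n",
		goGit, git, goGit.Seconds()/git.Seconds())
	// prints "go-git: 4m14s, git: 1m11s, ratio: 3.6x"
}
```

So the clone with go-git took 4m14s against 1m11s for git, roughly 3.6x slower for this repository.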
Memory requirements scale more or less linearly with the commit count and repository size; e.g. for the smaller repo below, which has quite a lot of commits, go-git uses about 8x more memory than git. On the performance side, growth in repository size leads to much faster degradation: for the 1.5 GB repo the difference is about 4x, while for the 10 times smaller repo below the times are about the same for git and go-git, and git shows approximately the same times as for the 1.5 GB repo.
Cloning github.com:moby/moby, with 32k commits and 170 MB overall unpacked size, takes about the same 1m20s with both git and go-git. Memory-wise, go-git uses a maximum of 320 MB (2x the repo size) and git 45 MB (0.25x the repo size):
➜ date && geat clone git@github.com:moby/moby && date
Mit Jun 21 13:32:03 CEST 2017
=> clone: moby cloned from git@github.com:moby/moby into origin
Mit Jun 21 13:32:38 CEST 2017
➜ date && git clone git@github.com:moby/moby && date
Mit Jun 21 13:33:05 CEST 2017
Cloning into 'moby'...
remote: Counting objects: 229544, done.
remote: Compressing objects: 100% (35/35), done.
remote: Total 229544 (delta 23), reused 17 (delta 14), pack-reused 229495
Receiving objects: 100% (229544/229544), 127.47 MiB | 5.22 MiB/s, done.
Resolving deltas: 100% (152573/152573), done.
Mit Jun 21 13:33:35 CEST 2017
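For anyone who wants to reproduce numbers like these from inside a Go program, here is a minimal sketch of one way to time a call and sample its peak heap usage. It is not how the figures above were obtained (the report does not say which tool was used), and `measure` and `fakeClone` are hypothetical names. The sampled value is the Go runtime's `HeapInuse`, which is related to, but not the same as, the process RSS that tools like `top` report.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// measure runs fn and returns its wall-clock time together with the
// highest HeapInuse value observed while it ran. The heap is sampled
// every 10ms, so very short spikes can be missed.
func measure(fn func() error) (time.Duration, uint64, error) {
	done := make(chan struct{})
	result := make(chan uint64)

	go func() {
		var m runtime.MemStats
		var peak uint64
		ticker := time.NewTicker(10 * time.Millisecond)
		defer ticker.Stop()
		for {
			runtime.ReadMemStats(&m)
			if m.HeapInuse > peak {
				peak = m.HeapInuse
			}
			select {
			case <-done:
				result <- peak
				return
			case <-ticker.C:
			}
		}
	}()

	start := time.Now()
	err := fn()
	elapsed := time.Since(start)
	close(done)
	return elapsed, <-result, err
}

// fakeClone is a stand-in workload that holds a 64 MiB buffer for a
// short while. In a real test it would be replaced by the clone call,
// e.g. (assuming go-git v4) git.PlainClone(dir, false, &git.CloneOptions{URL: url}).
func fakeClone() error {
	buf := make([]byte, 64<<20)
	for i := 0; i < len(buf); i += 4096 {
		buf[i] = 1 // touch each page so the memory is really used
	}
	time.Sleep(50 * time.Millisecond)
	runtime.KeepAlive(buf)
	return nil
}

func main() {
	elapsed, peak, err := measure(fakeClone)
	if err != nil {
		panic(err)
	}
	fmt.Printf("elapsed: %v, peak heap in use: %d MiB\n",
		elapsed.Round(time.Millisecond), peak>>20)
}
```

Sampling in a background goroutine keeps the measured function untouched, at the cost of `runtime.ReadMemStats` briefly stopping the world on every sample; for long clones a coarser interval would reduce that overhead.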
@osklyar Thanks for the report.
Given that we're approaching a stable release of v4, it's time to focus on performance and fix long-standing issues on that front. So we'll be working on this soon.
I have hit some performance issues during clone as well. My repository's .git dir is ~24 MB after git gc --aggressive; git repack -a -d, but cloning takes about 1m15s on my Core i7 based MacBook. Using the standard git tools, the same clone finishes in less than 1s. Watching the clone progress via: