Discussion:
Git is not scalable with too many refs/*
NAKAMURA Takumi
2011-06-09 03:44:42 UTC
Permalink
Hello, Git. This is my first post here.

I have tried tagging each commit as "refs/tags/rXXXXXX" on a git-svn
repo locally (over 100k refs/tags).
Indeed, it made everything extremely slow, even with packed refs and
packed objects.
So I gave up on pushing the tags upstream. (it must be terrifying) :p

I know it might be crazy in the Git way, but it would bring me conveniences.
(e.g. git log --oneline --decorate shows me each svn revision)
I would like to work on making Git live with many tags.

* Issues as far as I have investigated:

- git show --decorate is always slow.
  In decorate.c, every commit is inspected.
- git rev-list --quiet --objects $upstream --not --all spends so much time,
  even when it is expected to return 0.
  As you know, it is used in builtin/fetch.c.
- git-upload-pack shows "all" refs to me if upstream has too many refs.

I would like to work on the items below if they would be valuable.

- Get rid of inspecting commits from packed-refs in the decorate code.
- Implement sort-by-hash packed-refs (not sort-by-name).
- Implement more effective pruning of --not --all in revision.c.
- Think about enhancing the protocol to transfer many refs more efficiently.

I am happy to discuss the issue, thank you.

...Takumi
Sverre Rabbelier
2011-06-09 06:50:27 UTC
Permalink
Heya,

[+shawn, who runs into something similar with Gerrit]
Post by NAKAMURA Takumi
Hello, Git. This is my first post here.
I have tried tagging each commit as "refs/tags/rXXXXXX" on a git-svn
repo locally (over 100k refs/tags).
Indeed, it made everything extremely slow, even with packed refs and
packed objects.
So I gave up on pushing the tags upstream. (it must be terrifying) :p
I know it might be crazy in the Git way, but it would bring me conveniences.
(e.g. git log --oneline --decorate shows me each svn revision)
I would like to work on making Git live with many tags.
* Issues as far as I have investigated:
- git show --decorate is always slow.
  In decorate.c, every commit is inspected.
- git rev-list --quiet --objects $upstream --not --all spends so much time,
  even when it is expected to return 0.
  As you know, it is used in builtin/fetch.c.
- git-upload-pack shows "all" refs to me if upstream has too many refs.
I would like to work on the items below if they would be valuable.
- Get rid of inspecting commits from packed-refs in the decorate code.
- Implement sort-by-hash packed-refs (not sort-by-name).
- Implement more effective pruning of --not --all in revision.c.
- Think about enhancing the protocol to transfer many refs more efficiently.
I am happy to discuss the issue, thank you.
--
Cheers,

Sverre Rabbelier
Shawn Pearce
2011-06-09 15:23:44 UTC
Permalink
Post by Sverre Rabbelier
[+shawn, who runs into something similar with Gerrit]
Post by NAKAMURA Takumi
Hello, Git. This is my first post here.
I have tried tagging each commit as "refs/tags/rXXXXXX" on a git-svn
repo locally (over 100k refs/tags).
As Jakub pointed out, use git notes for this. They were designed to
scale to >100,000 annotations.
Post by Sverre Rabbelier
Post by NAKAMURA Takumi
Indeed, it made everything extremely slow, even with packed refs and
packed objects.
Having a reference to every commit in the repository is horrifically
slow. We run into this with Gerrit Code Review and I need to find
another solution. Git just wasn't meant to process repositories like
this.
--
Shawn.
A Large Angry SCM
2011-06-09 15:52:18 UTC
Permalink
Post by Shawn Pearce
Post by Sverre Rabbelier
[+shawn, who runs into something similar with Gerrit]
Post by NAKAMURA Takumi
Hello, Git. This is my first post here.
I have tried tagging each commit as "refs/tags/rXXXXXX" on a git-svn
repo locally (over 100k refs/tags).
As Jakub pointed out, use git notes for this. They were designed to
scale to >100,000 annotations.
Post by Sverre Rabbelier
Post by NAKAMURA Takumi
Indeed, it made everything extremely slow, even with packed refs and
packed objects.
Having a reference to every commit in the repository is horrifically
slow. We run into this with Gerrit Code Review and I need to find
another solution. Git just wasn't meant to process repositories like
this.
Assuming a very large number of refs, what is it that makes git so
horrifically slow? Is there a design or implementation lesson here?
Shawn Pearce
2011-06-09 15:56:50 UTC
Permalink
Post by A Large Angry SCM
Post by Shawn Pearce
Having a reference to every commit in the repository is horrifically
slow. We run into this with Gerrit Code Review and I need to find
another solution. Git just wasn't meant to process repositories like
this.
Assuming a very large number of refs, what is it that makes git so
horrifically slow? Is there a design or implementation lesson here?
A few things.

Git does a sequential scan of all references when it first needs to
access references for an operation. This requires reading the entire
packed-refs file, and the recursive scan of the "refs/" subdirectory
for any loose refs that might override the packed-refs file.

A lot of operations toss every commit that a reference points at into
the revision walker's LRU queue. If you have a tag pointing to every
commit, then the entire project history enters the LRU queue at once,
up front. That queue is managed with O(N^2) insertion time. And the
entire queue has to be filled before anything can be output.
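
To sketch the pattern Shawn describes (simplified stand-in types, not
the actual commit_list code from git):

/* Insert one commit into a list kept sorted newest-first. Each
 * insert walks the list to find its slot, so it is O(list length);
 * pushing N commits through this at once, as happens when every
 * commit has a ref pointing at it, costs O(N^2) overall. */
#include <stdlib.h>

struct commit { unsigned long date; };

struct commit_list {
	struct commit *item;
	struct commit_list *next;
};

static struct commit_list *insert_by_date(struct commit *item,
					  struct commit_list **list)
{
	struct commit_list **pp = list;
	struct commit_list *node;

	while (*pp && (*pp)->item->date >= item->date)
		pp = &(*pp)->next;	/* walk until we find the slot */

	node = malloc(sizeof(*node));
	node->item = item;
	node->next = *pp;
	*pp = node;
	return node;
}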
--
Shawn.
Jeff King
2011-06-09 16:26:04 UTC
Permalink
Post by Shawn Pearce
A lot of operations toss every commit that a reference points at into
the revision walker's LRU queue. If you have a tag pointing to every
commit, then the entire project history enters the LRU queue at once,
up front. That queue is managed with O(N^2) insertion time. And the
entire queue has to be filled before anything can be output.
We ran into this recently at github. Since our many-refs repos were
mostly forks, we had a lot of duplicate commits, and were able to solve
it with ea5f220 (fetch: avoid repeated commits in mark_complete,
2011-05-19).

However, I also worked up a faster priority queue implementation that
would work in the general case:

http://thread.gmane.org/gmane.comp.version-control.git/174003/focus=174005

I suspect it would speed up the original poster's slow fetch. The
problem is that a fast priority queue doesn't have quite the same access
patterns as a linked list, so replacing all of the commit_lists in git
with the priority queue would be quite a painful undertaking. So we are
left with using the fast queue only in specific hot-spots.
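
For illustration, the core of such a queue might look like this: a
minimal binary-heap sketch keyed by commit date, not the actual patch
linked above.

/* Array-backed max-heap ordered by commit date (newest first).
 * push/pop are O(log N) instead of the O(N) insert of a sorted
 * linked list. Types are simplified stand-ins. */
#include <stdlib.h>

struct commit { unsigned long date; };

struct commit_queue {
	struct commit **heap;
	size_t len, alloc;
};

static void queue_push(struct commit_queue *q, struct commit *c)
{
	size_t i;

	if (q->len == q->alloc) {
		q->alloc = q->alloc ? 2 * q->alloc : 64;
		q->heap = realloc(q->heap, q->alloc * sizeof(*q->heap));
	}
	i = q->len++;
	while (i && q->heap[(i - 1) / 2]->date < c->date) {
		q->heap[i] = q->heap[(i - 1) / 2];	/* sift hole up */
		i = (i - 1) / 2;
	}
	q->heap[i] = c;
}

static struct commit *queue_pop(struct commit_queue *q)
{
	struct commit *top, *last;
	size_t i = 0, child;

	if (!q->len)
		return NULL;
	top = q->heap[0];
	last = q->heap[--q->len];
	for (;;) {				/* sift hole down */
		child = 2 * i + 1;
		if (child >= q->len)
			break;
		if (child + 1 < q->len &&
		    q->heap[child + 1]->date > q->heap[child]->date)
			child++;
		if (last->date >= q->heap[child]->date)
			break;
		q->heap[i] = q->heap[child];
		i = child;
	}
	if (q->len)
		q->heap[i] = last;
	return top;
}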

-Peff
NAKAMURA Takumi
2011-06-10 03:59:47 UTC
Permalink
Good afternoon, Git! Thank you for the comments, guys.

Jakub and Shawn,

Sure, notes should be used in this case, I agree.
Post by NAKAMURA Takumi
(e.g. git log --oneline --decorate shows me each svn revision)
My example might have misled you. I intended that tags could show me a
pretty abbreviation everywhere in Git. I would be happier if tags could
act as a bi-directional alias, as Stephen mentions.

It would be better if git-svn could record its metadata into notes too, I think. :D

Stephen,
Post by Stephen
1) Hacking on some code in Git the programmer finds something wrong.
Using Git tools he can pickaxe/bisect/etc. and find that the problem
traces back to a commit imported from Subversion.
2) The programmer finds something wrong, asks a coworker, the coworker
says "see bug XYZ", bug XYZ says "Fixed in r20356".
I agree notes is the right answer for (1), but for (2) you really want
a cross-reference table from Subversion rev number to Git commit.

That is the point I wanted to make, thank you! I am working with svn
people. They often speak in svn revision numbers. (And then I have to
tell them the svn revs.)
Post by Stephen
In our office we created the cross-reference table once by walking the
Git tree and storing it as a file (we had some degenerate cases where
one SVN rev mapped to multiple Git commits, but I don't remember the
details), but it's not really usable from Git. Lightweight tags would
be an awesome solution (if they worked). Perhaps a custom subcommand
is a reasonable middle ground.

Reconstructing the svn-rev-to-commit mapping can be done by git-svn
itself. Unfortunately, git-svn's .rev-map is sorted by revision number.
I think it would be useless to make subcommands unless they were
pluggable into Git as a "smart-tag resolver".

Peff,

At first, thank you for your work at GitHub! Awesome!
I didn't know GitHub had refs issues. (yeah, I should not push 100k
tags to GitHub for now :p )

I am working on Linux and Windows. A many-refs repo can make Git
awfully slow on Windows (much slower than on Linux!). I hope I can
work on improving the various performance issues on Windows too.

FYI, I have tweaked git rev-list not to sort commits by date with
--quiet. It improves git-fetch (git rev-list --not --all) performance
when objects are well packed.


...Takumi
Jeff King
2011-06-13 22:27:34 UTC
Permalink
Post by NAKAMURA Takumi
1) Hacking on some code in Git the programmer finds something wrong.
Using Git tools he can pickaxe/bisect/etc. and find that the problem
traces back to a commit imported from Subversion.
2) The programmer finds something wrong, asks a coworker, the coworker
says "see bug XYZ", bug XYZ says "Fixed in r20356".
I agree notes is the right answer for (1), but for (2) you really want
a cross-reference table from Subversion rev number to Git commit.
That is the point I wanted to make, thank you! I am working with svn
people. They often speak in svn revision numbers. (And then I have to
tell them the svn revs.)
Yeah, there is no simple way to do the bi-directional mapping in git.
If all you want are decorations on commits, notes are definitely the way
to go. They are optimized for lookup in the commit -> data direction.
But if you want data -> commit, the only mapping we have is refs, and
they are not well optimized for the many-refs use case.

Packed-refs are better than loose refs, but I think right now we just
load them all into an in-memory linked list. We could load them into a
more efficient in-memory data structure, or we could perhaps even mmap
the packed-refs file and binary search it in place.
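
The mmap-and-binary-search idea could be as simple as this sketch
(assuming one well-formed "<40-hex-sha1> SP <refname> LF" record per
line, sorted by refname, with the header and peeled "^" lines ignored
for brevity; this is not git's actual code):

#include <string.h>

/* buf/len describe the mmap'd packed-refs contents; returns a
 * pointer to the matching record, or NULL. */
static const char *find_packed_ref(const char *buf, size_t len,
				   const char *refname)
{
	size_t lo = 0, hi = len, namelen = strlen(refname);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2, thislen;
		const char *name, *eol;
		int cmp;

		/* back up to the start of the line containing mid */
		while (mid > lo && buf[mid - 1] != '\n')
			mid--;
		name = buf + mid + 41;		/* skip sha1 + space */
		eol = memchr(name, '\n', len - (mid + 41));
		thislen = eol ? (size_t)(eol - name) : len - (mid + 41);
		cmp = memcmp(name, refname,
			     thislen < namelen ? thislen : namelen);
		if (!cmp)
			cmp = (thislen > namelen) - (thislen < namelen);
		if (!cmp)
			return buf + mid;	/* found the record */
		if (cmp < 0)
			lo = eol ? (size_t)(eol - buf) + 1 : hi;
		else
			hi = mid;
	}
	return NULL;
}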

But lookup is only part of the problem. There are algorithms that want
to look at all the refs (notably fetching and pushing), which are going
to be a bit slower. We don't have a way to tell those algorithms that
those refs are uninteresting for reachability analysis, because they are
just pointing to parts of the graph that are already reachable by
regular refs. Maybe there could be a part of the refs namespace that is
ignored by "--all". I dunno. That seems like a weird inconsistency.
Post by NAKAMURA Takumi
FYI, I have tweaked git rev-list not to sort commits by date with
--quiet. It improves git-fetch (git rev-list --not --all) performance
when objects are well packed.
I'm not sure that is a good solution. Even with --quiet, we will be
walking the commit graph to find merge bases to see if things are
connected. The walking code expects date-sorting; I'm not sure what
changing that assumption will do to the code.

-Peff
Andreas Ericsson
2011-06-14 00:17:58 UTC
Permalink
Post by NAKAMURA Takumi
Good afternoon, Git! Thank you for the comments, guys.
Jakub and Shawn,
Sure, notes should be used in this case, I agree.
Post by NAKAMURA Takumi
(e.g. git log --oneline --decorate shows me each svn revision)
My example might have misled you. I intended that tags could show me a
pretty abbreviation everywhere in Git. I would be happier if tags could
act as a bi-directional alias, as Stephen mentions.
It would be better if git-svn could record its metadata into notes too, I think. :D
Stephen,
Post by Stephen
1) Hacking on some code in Git the programmer finds something wrong. Using Git tools he can pickaxe/bisect/etc. and find that the problem traces back to a commit imported from Subversion.
2) The programmer finds something wrong, asks a coworker, the coworker says "see bug XYZ", bug XYZ says "Fixed in r20356".
I agree notes is the right answer for (1), but for (2) you really want a cross-reference table from Subversion rev number to Git commit.
If you're using svn metadata in the commit text, you can always do
"git log -p --grep=@20356" to get the commits relevant to that one.
It's not as fast as "git show svn-20356", but it's not exactly
glacial either and would avoid the problems you're having now.
--
Andreas Ericsson ***@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.
Jeff King
2011-06-14 00:30:29 UTC
Permalink
Post by Andreas Ericsson
If you're using svn metadata in the commit text, you can always do
"git log -p --grep=@20356" to get the commits relevant to that one.
It's not as fast as "git show svn-20356", but it's not exactly
glacial either and would avoid the problems you're having now.
If we do end up putting this data into notes eventually (which I think
we _should_ do, because then you aren't locked into having this svn
cruft in your commit messages for all time, but can rather choose
whether or not to display it), it would be nice to have a --grep-notes
feature in git-log. Or maybe --grep should look in notes by default,
too, if we are showing them.

I suspect the feature would be really easy to implement, if somebody is
looking for a gentle introduction to git, or a fun way to spend an hour.
:)

-Peff
Junio C Hamano
2011-06-14 04:41:10 UTC
Permalink
Post by Jeff King
I suspect the feature would be really easy to implement, if somebody is
looking for a gentle introduction to git, or a fun way to spend an hour.
I would rather want to see if somebody can come up with a flexible reverse
mapping feature around notes. It does not have to be completely generic,
just being flexible enough is fine.
Sverre Rabbelier
2011-06-14 07:26:10 UTC
Permalink
Heya,
Post by Junio C Hamano
I would rather want to see if somebody can come up with a flexible reverse
mapping feature around notes. It does not have to be completely generic,
just being flexible enough is fine.
Wouldn't it be enough to simply create a note on 'r651235' with the
git ref as its contents?
--
Cheers,

Sverre Rabbelier
Johan Herland
2011-06-14 10:02:46 UTC
Permalink
Post by Sverre Rabbelier
Heya,
Post by Junio C Hamano
I would rather want to see if somebody can come up with a flexible
reverse mapping feature around notes. It does not have to be
completely generic, just being flexible enough is fine.
Wouldn't it be enough to simply create a note on 'r651235' with the
git ref as its contents?
Not quite sure what you mean by "create a note on 'r651235'". You could
devise a scheme where you SHA1('r651235'), and then create a note on the
resulting hash.

Notes are named by the SHA1 of the object they annotate, but there is no
hard requirement (as long as you stay away from "git notes prune") that the
SHA1 annotated actually exists as a valid Git object in your repo.

Hence, you can use notes to annotate _anything_ that can be uniquely reduced
to a SHA1 hash.
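
For example, deriving such a key (here with OpenSSL's SHA1; git has
its own SHA-1 routines, this is just for illustration):

#include <stdio.h>
#include <string.h>
#include <openssl/sha.h>

int main(void)
{
	const char *key = "r651235";
	unsigned char sha1[SHA_DIGEST_LENGTH];
	int i;

	SHA1((const unsigned char *)key, strlen(key), sha1);

	/* prints the 40-hex name you could then annotate via git notes */
	for (i = 0; i < SHA_DIGEST_LENGTH; i++)
		printf("%02x", sha1[i]);
	putchar('\n');
	return 0;
}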


...Johan
--
Johan Herland, <***@herland.net>
www.herland.net
Sverre Rabbelier
2011-06-14 10:34:47 UTC
Permalink
Heya,
Post by Johan Herland
Not quite sure what you mean by "create a note on 'r651235'". You could
devise a scheme where you SHA1('r651235'), and then create a note on the
resulting hash.
I was thinking they could annotate anything, even non-SHAs, but in
that case, yes, the SHA of the revision would work just as well.
--
Cheers,

Sverre Rabbelier
Jeff King
2011-06-14 17:02:14 UTC
Permalink
Post by Johan Herland
Post by Sverre Rabbelier
Wouldn't it be enough to simply create a note on 'r651235' with the
git ref as its contents?
Not quite sure what you mean by "create a note on 'r651235'". You could
devise a scheme where you SHA1('r651235'), and then create a note on the
resulting hash.
Notes are named by the SHA1 of the object they annotate, but there is no
hard requirement (as long as you stay away from "git notes prune") that the
SHA1 annotated actually exists as a valid Git object in your repo.
Hence, you can use notes to annotate _anything_ that can be uniquely reduced
to a SHA1 hash.
I lean against that as a solution. I think "git gc" will probably
eventually learn to do "git notes prune", at which point we would start
losing people's data. So I think it is better to keep the definition of
notes a little tighter now, and say "the left-hand side of a notes
mapping must be a referenced object". We can always loosen it later.

On top of that, though, the sha1 solution is not all that pleasant. It
lets you do exact lookups, but you have no way of iterating over the
list of svn revisions.

I also think we can do something a little more lightweight. The user has
already created and is maintaining a mapping in one direction via the
notes. We just need the inverse mapping, which we can generate
programmatically. So it can be a straight cache, with the sha1 of the
notes tree determining the cache validity (i.e., if the forward mapping
in the notes tree changes, you regenerate the cache from scratch).
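
The validity check itself could be a one-liner of a sketch, assuming a
hypothetical cache file whose first line records the hex SHA-1 of the
notes tree it was built from:

#include <stdio.h>
#include <string.h>

static int reverse_cache_is_valid(const char *cache_path,
				  const char *notes_tree_hex)
{
	char built_from[41] = "";
	FILE *f = fopen(cache_path, "r");

	if (!f)
		return 0;		/* no cache yet: (re)build it */
	if (!fgets(built_from, sizeof(built_from), f))
		built_from[0] = '\0';
	fclose(f);
	/* stale unless it was built from exactly this notes tree */
	return !strncmp(built_from, notes_tree_hex, 40);
}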

We would want to store the cache in an on-disk format that could be
searched easily. Possibly something like the packed-refs format would be
sufficient, if we mmap'd and binary searched it. It would be dirt simple
if we used an existing key/value store like gdbm or tokyocabinet, but we
usually try to avoid extra dependencies.

-Peff
Shawn Pearce
2011-06-14 19:20:29 UTC
Permalink
Post by Jeff King
I also think we can do something a little more lightweight. The user has
already created and is maintaining a mapping in one direction via the
notes. We just need the inverse mapping, which we can generate
programmatically. So it can be a straight cache, with the sha1 of the
notes tree determining the cache validity (i.e., if the forward mapping
in the notes tree changes, you regenerate the cache from scratch).
We would want to store the cache in an on-disk format that could be
searched easily. Possibly something like the packed-refs format would be
sufficient, if we mmap'd and binary searched it. It would be dirt simple
if we used an existing key/value store like gdbm or tokyocabinet, but we
usually try to avoid extra dependencies.
Yea, not a bad idea. Use a series of SSTable-like things, like Hadoop
uses. It doesn't need to be as complex as the Hadoop SSTable concept.
But a simple sorted string-to-string mapping file that is immutable,
with edits applied by creating an overlay file that contains
new/updated entries.

As you point out, we can use the notes tree to tell us the validity of
the cache, and do incremental updates. If the current cache doesn't
match the notes ref, compute the tree diff between the current cache's
source tree and the new tree, and create a new SSTable-like thing that
has the relevant updates as an overlay of the existing tables. After
some time you will have many of these little overlay files, and a GC
can just merge them down to a single file.

The only problem is, you probably want this "reverse notes index" to
be indexing a portion of the note blob text, not all of it. That is,
we want the SVN note text to say something like "SVN Revision: r1828"
so `git log --notes=svn` shows us something more useful than just
"r1828". But in the reverse index, we may only want the key to be
"r1828". So you need some sort of small mapping function to decide
what to put into that reverse index.
--
Shawn.
Jeff King
2011-06-14 19:47:49 UTC
Permalink
Post by Shawn Pearce
Post by Jeff King
We would want to store the cache in an on-disk format that could be
searched easily. Possibly something like the packed-refs format would be
sufficient, if we mmap'd and binary searched it. It would be dirt simple
if we used an existing key/value store like gdbm or tokyocabinet, but we
usually try to avoid extra dependencies.
Yea, not a bad idea. Use a series of SSTable-like things, like Hadoop
uses. It doesn't need to be as complex as the Hadoop SSTable concept.
But a simple sorted string-to-string mapping file that is immutable,
with edits applied by creating an overlay file that contains
new/updated entries.
As you point out, we can use the notes tree to tell us the validity of
the cache, and do incremental updates. If the current cache doesn't
match the notes ref, compute the tree diff between the current cache's
source tree and the new tree, and create a new SSTable-like thing that
has the relevant updates as an overlay of the existing tables. After
some time you will have many of these little overlay files, and a GC
can just merge them down to a single file.
I was really hoping that it would be fast enough that we could simply
blow away the old mapping and recreate it from scratch. That gets us out
of writing any journaling-type code with overlays. For something like
svn revisions, it's probably fine to take an extra second or two to
build the cache after we do a fetch. But it wouldn't scale to something
that was getting updated frequently.

If we're going to start doing clever database-y things, I'd much rather
use a proven key/value db solution like tokyocabinet. I'm just not sure
how to degrade gracefully when the db library isn't available. Don't
allow reverse mappings? Fall back to something slow?
Post by Shawn Pearce
The only problem is, you probably want this "reverse notes index" to
be indexing a portion of the note blob text, not all of it. That is,
we want the SVN note text to say something like "SVN Revision: r1828"
so `git log --notes=svn` shows us something more useful than just
"r1828". But in the reverse index, we may only want the key to be
"r1828". So you need some sort of small mapping function to decide
what to put into that reverse index.
I had assumed that we would just be writing r1828 into the note. The
output via git log is actually pretty readable:

  $ git notes --ref=svn/revisions add -m r1828
  $ git show --notes=svn/revisions
  ...
  Notes (svn/revisions):
      r1828

Of course this is just one use case.

For that matter, we have to figure out how one would actually reference
the reverse mapping. If we have a simple, pure-reverse mapping, we can
just generate and cache them on the fly, and give a special syntax.
Like:

$ git log notes/svn/revisions@{revnote:r1828}

which would invert the notes/svn/revisions tree, search for r1828, and
reference the resulting commit.

If you had something more heavyweight that actually needed to parse
during the mapping, you might have something like:

$ : set up the mapping
$ git config revnote.svn.map 'SVN Revision: (r[0-9]+)'

$ : do the reverse; we should be able to build the cache on the fly
$ git notes reverse r1828
346ab9aaa1cf7b1ed2dd2c0a67bccc5b8ec23f7c

$ : so really you could have a similar ref syntax like, though
$ : this would require some ref parser updates, as we currently
$ : assume anything to the left of @{} is a real ref
$ git log r1828@{revnote:svn}

The syntaxes are not as nice as having a real ref. In the last example,
we could probably look for the contents of "@{}" as a possible revnote
mapping (since we've already had to name it via the configuration), to
make it "r1828@{svn}". Or you could even come up with a default set of
revnotes to consider, so that if we look up "r1828" and it isn't a real
ref, we fall back to trying r1828@{revnote:svn}.

I dunno. I'm just throwing ideas out at this point.

-Peff
Shawn Pearce
2011-06-14 20:12:07 UTC
Permalink
Post by Jeff King
Post by Shawn Pearce
Post by Jeff King
We would want to store the cache in an on-disk format that could be
searched easily. Possibly something like the packed-refs format would be
sufficient, if we mmap'd and binary searched it. It would be dirt simple
if we used an existing key/value store like gdbm or tokyocabinet, but we
usually try to avoid extra dependencies.
Yea, not a bad idea. Use a series of SSTable-like things, like Hadoop
uses. It doesn't need to be as complex as the Hadoop SSTable concept.
But a simple sorted string-to-string mapping file that is immutable,
with edits applied by creating an overlay file that contains
new/updated entries.
As you point out, we can use the notes tree to tell us the validity of
the cache, and do incremental updates. If the current cache doesn't
match the notes ref, compute the tree diff between the current cache's
source tree and the new tree, and create a new SSTable-like thing that
has the relevant updates as an overlay of the existing tables. After
some time you will have many of these little overlay files, and a GC
can just merge them down to a single file.
I was really hoping that it would be fast enough that we could simply
blow away the old mapping and recreate it from scratch. That gets us out
of writing any journaling-type code with overlays. For something like
svn revisions, it's probably fine to take an extra second or two to
build the cache after we do a fetch. But it wouldn't scale to something
that was getting updated frequently.
If we're going to start doing clever database-y things, I'd much rather
use a proven key/value db solution like tokyocabinet. I'm just not sure
how to degrade gracefully when the db library isn't available. Don't
allow reverse mappings? Fall back to something slow?
This is why I would prefer to build the solution into Git.

It's not that bad to do a sorted string file. Take something simple
that is similar to a pack file:

GSST | vers | rcnt | srctree | base
[ klen key vlen value ]*
[ roff ]*
SHA1(all_of_above)

Where vers is a version number, rcnt is the number of records, srctree
is the SHA-1 of the notes tree this thing was indexed from, and base is
the SHA-1 of the notes tree this "applies on top of". There are then
rcnt records in the file, each using variable-length key-length and
value-length fields (klen, vlen), with variable-length keys and values.
At the end of the file are rcnt 4-byte offsets to the start of each
key.
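
Written out as C structs, that header might look like this (a sketch
to pin down the layout, not code from git; struct padding and
endianness are ignored for brevity):

#include <stdint.h>

struct gsst_header {
	char     magic[4];	/* "GSST" */
	uint32_t vers;		/* format version */
	uint32_t rcnt;		/* number of records */
	uint8_t  srctree[20];	/* notes tree this file was indexed from */
	uint8_t  base[20];	/* notes tree this file overlays;
				   all zeros means "no base" */
};

/* Followed by rcnt variable-length records (klen, key, vlen, value),
 * then rcnt 4-byte offsets pointing back at each record's klen, and
 * finally a SHA-1 over everything above as a trailing checksum. */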

When writing the file, write all of the above to a temporary file,
then rename it to $GIT_DIR/cache/db-$SHA1.db, as that is a unique name.
It's easy to prepare the list of entries in memory as an array of
structs of key/value pairs, sort them with qsort(), write them out and
update offsets as you go, then dump out the offset table at the end.
One could compress the offset table by only storing every Nth offset;
readers perform binary search until they find the first key that is
before their desired key, then sequentially scan records until they
locate the correct entry... but I'm not sure the space savings is
really worthwhile here.

When reading, scan the directory and read the headers of each file. If
the file has your target srctree, your cache is current and you can
read it. If a key isn't in this file, you open the file named
$GIT_DIR/cache/db-$base.db and try again there, walking back along
that base chain until base is '0'x40. (or some other marker in the
header to denote there is no base file).

GC is just a matter of merging the sorted files together. Follow along
all of the base pointers, open all of them, scan through the records
and write out the first key that is defined. I guess we need a small
"delete" bit in the record to indicate a particular key/value was
removed from the database. Since this is a reverse mapping, duplicates
are possible, and readers that want all values need to scan back to
the base file, but skip base entries that were marked deleted in a
newer file.

Updating is just preparing a new file that uses the current srctree as
your base, and only inserting/sorting the paths that were different in
the notes.

We probably need to store these files keyed by their notes ref, so we
can find "svn/revisions" differently from "bugzilla" (another
hypothetical mapping of Bugzilla bug ids to commit SHA-1s, based on
notes that attached bug numbers to commits).

I don't think it's that bad. Maybe it's a bit too much complexity for
version 1 to support these incremental update files, but it
shouldn't be that hard.
Post by Jeff King
Post by Shawn Pearce
The only problem is, you probably want this "reverse notes index" to
be indexing a portion of the note blob text, not all of it. That is,
we want the SVN note text to say something like "SVN Revision: r1828"
so `git log --notes=svn` shows us something more useful than just
"r1828". But in the reverse index, we may only want the key to be
"r1828". So you need some sort of small mapping function to decide
what to put into that reverse index.
I had assumed that we would just be writing r1828 into the note. The
output via git log is actually pretty readable:
  $ git notes --ref=svn/revisions add -m r1828
  $ git show --notes=svn/revisions
  ...
  Notes (svn/revisions):
      r1828
Of course this is just one use case.
Thanks, I keep forgetting that git notes prints the note ref name out
before the text, so it's already got this annotation present. This
makes it much more likely that the bare "r1828" text is acceptable in
the note, and that the reverse index is just the entire content of the
blob as the key. :-)
Post by Jeff King
For that matter, we have to figure out how one would actually reference
the reverse mapping. If we have a simple, pure-reverse mapping, we can
just generate and cache them on the fly, and give a special syntax.
Uhm. Ick.
Post by Jeff King
The syntaxes are not as nice as having a real ref. In the last example,
we could probably look for the contents of "@{}" as a possible revnote
mapping (since we've already had to name it via the configuration), to
make it "r1828@{svn}". Or you could even come up with a default set of
revnotes to consider, so that if we look up "r1828" and it isn't a real
ref, we fall back to trying r1828@{revnote:svn}.
Or, what about setting up a fake ref namespace:

git config ref.refs/remotes/svn/*.from refs/notes/svn/revisions

Then `git log svn/r1828` works. But these aren't real references. We
would only want to consider them if a request matched the glob, so
`git for-each-ref` and `git upload-pack` aren't reporting these things
by default, and neither is `git log --all` or `gitk --all`.

I agree a syntax that works out of the box without a configuration
file change would be nicer. But we are running out of operators to do
that with. `git log notes/svn/revisions@{revnote:r1828}` as you
propose above is at least workable...

--
Shawn.
Martin Fick
2011-09-08 19:53:39 UTC
Permalink
Just thought that I should add some numbers to this thread as it seems that
the later versions of git are worse off by several orders of magnitude on
this one.

We have a Gerrit repo with just under 100K refs in refs/changes/*. When I
fetch them all with git 1.7.6 it does not seem to complete. Even after 5
days, it is just under halfway through the ref numbers! It appears, though I am
not sure, that it is getting slower with time, so it may not even
complete after 10 days; I couldn't wait any longer. However, the same
command completes in under 8 minutes with git 1.7.3.3 on the same machine!

Syncing 100K refs:

  git 1.7.6    > 8 days?
  git 1.7.3.3  ~8 mins

That is quite a difference! Have there been any obvious changes to git that
could cause this? If needed, I can bisect git to find out where things go
sour, but I thought that perhaps there is someone who already
understands the problem and why older versions aren't nearly as bad as
recent ones.

Some more things that I have tried: after syncing the repo locally with all
100K refs under refs/changes, I cloned it locally again and tried fetching
locally with both git 1.7.6 and 1.7.3.3. I got the same results as
remotely, so it does not appear to be related to round trips.

The original git remote syncing takes just a bit of time, and then it
outputs lines like these:
...
* [new branch] refs/changes/13/66713/2 -> refs/changes/13/66713/2
* [new branch] refs/changes/13/66713/3 -> refs/changes/13/66713/3
* [new branch] refs/changes/13/66713/4 -> refs/changes/13/66713/4
* [new branch] refs/changes/13/66713/5 -> refs/changes/13/66713/5
...

This is the part that takes forever. The lines seem to scroll by slower and
slower (with git 1.7.6). In the beginning, the lines might go by at a screenful
a minute; after 5 days, about one line a minute. My CPU is pegged at 100%
during this time (one core). Since I have some good test data for this, let
me know if I should test anything specific.

Thanks,

-Martin

Employee of Qualcomm Innovation Center, Inc. which is a member of Code
Aurora Forum


Martin Fick
2011-09-09 00:52:02 UTC
Permalink
An update, I bisected it down to this commit:

88a21979c5717e3f37b9691e90b6dbf2b94c751a

fetch/pull: recurse into submodules when necessary

Since this can be disabled with the --no-recurse-submodules switch, I tried
that, and indeed, even with the latest 1.7.7-rc it becomes fast (~8 mins)
again. The strange part is that the repository does not have any
submodules. Anyway, I hope this can be useful to others, since it is a
workaround which speeds things up enormously. Let me know if you have any
other tests that you want me to perform,

-Martin

Thomas Rast
2011-09-09 01:05:15 UTC
Permalink
Post by Martin Fick
88a21979c5717e3f37b9691e90b6dbf2b94c751a
fetch/pull: recurse into submodules when necessary
Since this can be disabled with the --no-recurse-submodules switch, I tried
that, and indeed, even with the latest 1.7.7-rc it becomes fast (~8 mins)
again. The strange part is that the repository does not have any
submodules. Anyway, I hope this can be useful to others, since it is a
workaround which speeds things up enormously. Let me know if you have any
other tests that you want me to perform,
Jens should know about this, so let's Cc him.

I took a quick look and I'm guessing that there's at least one
quadratic behaviour: in check_for_new_submodule_commits(), I see

+ const char *argv[] = {NULL, NULL, "--not", "--all", NULL};
+ int argc = ARRAY_SIZE(argv) - 1;
+
+ init_revisions(&rev, NULL);

which means that the --all needs to walk all commits reachable from
all refs and flag them as uninteresting. But that function is called
for every ref update, so IIUC the time spent is on the order of
#ref updates * #commits.
--
Thomas Rast
trast@{inf,student}.ethz.ch
Thomas Rast
2011-09-09 01:13:56 UTC
Permalink
Post by Thomas Rast
+ const char *argv[] = {NULL, NULL, "--not", "--all", NULL};
+ int argc = ARRAY_SIZE(argv) - 1;
+
+ init_revisions(&rev, NULL);
which means that the --all needs to walk all commits reachable from
all refs and flag them as uninteresting.
Scratch that, it "only" needs to mark every tip commit and then walk
them back to about where the interesting commits end.

In any case, since the uninteresting set only gets larger, it should
be possible to reuse the same revision walker.
--
Thomas Rast
trast@{inf,student}.ethz.ch
Jens Lehmann
2011-09-09 15:59:20 UTC
Permalink
Post by Martin Fick
88a21979c5717e3f37b9691e90b6dbf2b94c751a
fetch/pull: recurse into submodules when necessary
Since this can be disabled with the --no-recurse-submodules switch, I tried
that, and indeed, even with the latest 1.7.7-rc it becomes fast (~8 mins)
again. The strange part is that the repository does not have any
submodules. Anyway, I hope this can be useful to others, since it is a
workaround which speeds things up enormously. Let me know if you have any
other tests that you want me to perform,
Thanks for nailing that one down. I'm currently looking into bringing back
decent performance here.
Martin Fick
2011-09-25 20:43:27 UTC
Permalink
A coworker of mine pointed out to me that a simple

git checkout

can also take a rather long time (> 3 mins) when run
on a repo with ~100K refs.

While this is not massive like the other problem I reported,
it still seems like more than one would expect. So, I
tried an older version of git, and to my surprise/delight,
it was much faster (0.2s). So, I bisected this issue also,
and it seems that the "offending" commit is
680955702990c1d4bfb3c6feed6ae9c6cb5c3c07:


commit 680955702990c1d4bfb3c6feed6ae9c6cb5c3c07
Author: Christian Couder <***@tuxfamily.org>
Date: Fri Jan 23 10:06:53 2009 +0100

    replace_object: add mechanism to replace objects found
    in "refs/replace/"

    The code implementing this mechanism has been copied
    more-or-less from the commit graft code.

    This mechanism is used in "read_sha1_file". sha1s passed
    to this function that match a ref name in "refs/replace/"
    are replaced by the sha1 that has been read in the ref.

    We "die" if the replacement recursion depth is too high
    or if we can't read the replacement object.

    Signed-off-by: Christian Couder <***@tuxfamily.org>
    Signed-off-by: Junio C Hamano <***@pobox.com>



Now, I suspect this commit is desirable, but I was hoping
that perhaps a look at it might inspire someone to find an
obvious problem with it.

Thanks,

-Martin
--
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum
Christian Couder
2011-09-26 12:41:04 UTC
Permalink
Post by Martin Fick
A coworker of mine pointed out to me that a simple
git checkout
can also take a rather long time (> 3 mins) when run
on a repo with ~100K refs.
Are all these refs packed?
Post by Martin Fick
While this is not massive like the other problem I reported,
it still seems like more than one would expect. So, I
tried an older version of git, and to my surprise/delight,
it was much faster (0.2s). So, I bisected this issue also,
and it seems that the "offending" commit is
commit 680955702990c1d4bfb3c6feed6ae9c6cb5c3c07
Date: Fri Jan 23 10:06:53 2009 +0100
replace_object: add mechanism to replace objects found
in "refs/replace/"
[...]
Post by Martin Fick
Now, I suspect this commit is desirable, but I was hoping
that perhaps a look at it might inspire someone to find an
obvious problem with it.
I don't think there is an obvious problem with it, but it would be
nice if you could dig a bit deeper.

The first thing that could take a lot of time is the call to
for_each_replace_ref() in this function:

+static void prepare_replace_object(void)
+{
+ static int replace_object_prepared;
+
+ if (replace_object_prepared)
+ return;
+
+ for_each_replace_ref(register_replace_ref, NULL);
+ replace_object_prepared = 1;
+}

Another thing is calling replace_object_pos() repeatedly in
lookup_replace_object().

Thanks,
Christian.
Martin Fick
2011-09-26 17:47:56 UTC
Permalink
On Monday, September 26, 2011 06:41:04 am Christian Couder
On Sun, Sep 25, 2011 at 10:43 PM, Martin Fick
Post by Martin Fick
A coworker of mine pointed out to me that a simple
git checkout
can also take a rather long time (> 3 mins) when
run on a repo with ~100K refs.
Are all these refs packed?
I think so, is there a way to find out for sure?

-Martin
Christian Couder
2011-09-26 18:56:04 UTC
Permalink
Post by Martin Fick
On Monday, September 26, 2011 06:41:04 am Christian Couder
Post by Christian Couder
Are all these refs packed?
I think so, is there a way to find out for sure?
After "git pack-refs --all" I get:

$ find .git/refs/ -type f
.git/refs/remotes/origin/HEAD
.git/refs/stash

So I suppose that if such a find gives you only a few files, all (or
most) of your refs are packed.

Best regards,
Christian.
Martin Fick
2011-09-30 16:41:13 UTC
Permalink
On Monday, September 26, 2011 12:56:04 pm Christian Couder
OK. So many great improvements in ref scalability; thanks,
everyone!

It is getting so good that I had to take a step back and
re-evaluate what we consider good/bad. On doing so, I can't
help but think that fetches still need some improvement.

Fetches had the worst regression of all (> 8 days), so the
massive fix to bring it down to 7.5 mins was awesome.
7-8 mins sounded pretty good 2 weeks ago, especially when a
checkout took 5+ mins! But now that almost every other
operation has been sped up, that is starting to feel a bit
on the slow side still. My spidey sense tells me something
is still not quite right in the fetch path.

Here is some more data to back up my spidey sense: after all
the improvements, a noop fetch of all the changes (noop
meaning they are all already up to date) takes around
3 mins in the non-gced (non-packed-refs) case. The same
noop only takes ~12s in the gced (packed-refs) case!

I dug into this a bit further. I took a non-gced, non-packed-refs
repo and this time, instead of gcing it to get packed refs,
I only ran the above git pack-refs --all, so that
objects did not get gced. With this, the noop fetch was
also only around 12s. This confirmed that the non-gced
objects are not interfering with the noop fetch; the problem
really is just the unpacked refs. Just to confirm that the
FS is not horribly slow, I did a "find .git/refs" and it
only takes about 0.4s for about 80K results!

So, while I understand that a full fetch will actually have
to transfer quite a bit of data, the noop fetch seems like
it is still suffering in the non-gced (non-packed-ref) case.
If that time were improved, I suspect that the full fetch
would improve at least by an equivalent amount, if not more.

Any thoughts?

-Martin
--
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum
Martin Fick
2011-09-28 19:38:04 UTC
Permalink
On Monday, September 26, 2011 06:41:04 am Christian Couder
On Sun, Sep 25, 2011 at 10:43 PM, Martin Fick
<***@codeaurora.org> wrote:
...
Post by Martin Fick
git checkout
can also take rather long periods of time > 3 mins when
run on a repo with ~100K refs.
...
Post by Martin Fick
So, I bisected this issue also, and it seems that the
"offending" commit is
...
Post by Martin Fick
commit 680955702990c1d4bfb3c6feed6ae9c6cb5c3c07
replace_object: add mechanism to replace objects
found in "refs/replace/"
...
I don't think there is an obvious problem with it, but it
would be nice if you could dig a bit deeper.
The first thing that could take a lot of time is the call to
for_each_replace_ref() in this function:
+static void prepare_replace_object(void)
+{
+ static int replace_object_prepared;
+
+ if (replace_object_prepared)
+ return;
+
+ for_each_replace_ref(register_replace_ref, NULL);
+ replace_object_prepared = 1;
+}
The time was actually spent in for_each_replace_ref(),
which calls get_loose_refs(), which has the recursive bug
that Julian Phillips fixed 2 days ago. Good to see that
this fix helps other use cases too.

So with that bug fixed, the thing taking the most time now
for a git checkout with ~100K refs seems to be the orphan
check, as Thomas predicted. The strange part with this is
that the orphan check seems to take only about ~20s in the
repo where the refs aren't packed. However, in the repo
where they are packed, this check takes at least 5 min! This
seems a bit unusual, doesn't it? Is the filesystem that
much better at indexing refs than git's pack mechanism?
Seems unlikely; the unpacked refs take 312M in the FS, the
packed ones only take about 4.3M. I suspect there is
something else unexpected going on here in the packed-ref
case.

Any thoughts? I will dig deeper...

-Martin
--
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum
Martin Fick
2011-09-28 22:10:48 UTC
Permalink
On Wednesday, September 28, 2011 01:38:04 pm Martin Fick
Post by Martin Fick
On Monday, September 26, 2011 06:41:04 am Christian
Couder
On Sun, Sep 25, 2011 at 10:43 PM, Martin Fick
...
Post by Martin Fick
git checkout
can also take rather long periods of time > 3 mins
when run on a repo with ~100K refs.
...
Post by Martin Fick
So, I bisected this issue also, and it seems that
the
"offending" commit is
...
Post by Martin Fick
commit 680955702990c1d4bfb3c6feed6ae9c6cb5c3c07
replace_object: add mechanism to replace objects
found in "refs/replace/"
...
I don't think there is an obvious problem with it, but
it would be nice if you could dig a bit deeper.
The first thing that could take a lot of time is the call to
for_each_replace_ref() in this function:
+static void prepare_replace_object(void)
+{
+ static int replace_object_prepared;
+
+ if (replace_object_prepared)
+ return;
+
+ for_each_replace_ref(register_replace_ref, NULL);
+ replace_object_prepared = 1;
+}
The time was actually spent in for_each_replace_ref()
which calls get_loose_refs() which has the recursive bug
that Julian Phillips fixed 2 days ago. Good to see that
this fix helps other use cases too.
So with that bug fixed, the thing taking the most time
now for a git checkout with ~100K refs seems to be the
orphan check as Thomas predicted. The strange part with
this, is that the orphan check seems to take only about
~20s in the repo where the refs aren't packed. However,
in the repo where they are packed, this check takes at
least 5min! This seems a bit unusual, doesn't it? Is
the filesystem that much better at indexing refs than
git's pack mechanism? Seems unlikely, the unpacked refs
take 312M in the FS, the packed ones only take about
4.3M. I suspect there is something else unexpected
going on here in the packed ref case.
Any thoughts? I will dig deeper...
I think the problem is that resolve_ref() walks a linked
list of searching for the packed ref. Does this mean that
packed refs are not indexed at all?
Post by Martin Fick
-Martin
--
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum
Julian Phillips
2011-09-29 00:54:00 UTC
Permalink
Post by Martin Fick
On Wednesday, September 28, 2011 01:38:04 pm Martin Fick
-- snip --
Post by Martin Fick
Post by Martin Fick
So with that bug fixed, the thing taking the most time
now for a git checkout with ~100K refs seems to be the
orphan check as Thomas predicted. The strange part with
this, is that the orphan check seems to take only about
~20s in the repo where the refs aren't packed. However,
in the repo where they are packed, this check takes at
least 5min! This seems a bit unusual, doesn't it? Is
the filesystem that much better at indexing refs than
git's pack mechanism? Seems unlikely, the unpacked refs
take 312M in the FS, the packed ones only take about
4.3M. I suspect there is something else unexpected
going on here in the packed ref case.
Any thoughts? I will dig deeper...
I think the problem is that resolve_ref() walks a linked
list of searching for the packed ref. Does this mean that
packed refs are not indexed at all?
Are you sure that it is walking the linked list that is the problem?
I've created a test repo with ~100k refs/changes/... style refs, and
~40000 refs/heads/... style refs, and checkout can walk the list of
~140k refs seven times in 85ms user time, including doing whatever other
processing is needed for checkout. The real time is only 114ms, but
then my test repo has no real data in it.

If resolve_ref() walking the linked list of refs was the problem, then
I would expect my test repo to show the same problem. It doesn't: a
pre-ref-packing checkout took minutes (~0.5s user time), whereas a
ref-packed checkout takes ~0.1s. So, I would suggest that the problem
lies elsewhere.

Have you tried running a checkout whilst profiling?
--
Julian
Martin Fick
2011-09-29 01:37:18 UTC
Permalink
Post by Julian Phillips
Post by Martin Fick
Post by Martin Fick
So with that bug fixed, the thing taking the most time
now for a git checkout with ~100K refs seems to be the
orphan check as Thomas predicted. The strange part with
this, is that the orphan check seems to take only about
~20s in the repo where the refs aren't packed. However,
in the repo where they are packed, this check takes at
least 5min! This seems a bit unusual, doesn't it? Is
the filesystem that much better at indexing refs than
git's pack mechanism? Seems unlikely, the unpacked refs
take 312M in the FS, the packed ones only take about
4.3M. I suspect there is something else unexpected
going on here in the packed ref case.
Any thoughts? I will dig deeper...
I think the problem is that resolve_ref() walks a linked
list of searching for the packed ref. Does this mean that
packed refs are not indexed at all?
Are you sure that it is walking the linked list that is the problem?
It sure seems like it.
Post by Julian Phillips
I've created a test repo with ~100k refs/changes/... style refs, and
~40000 refs/heads/... style refs, and checkout can walk the list of
~140k refs seven times in 85ms user time including doing whatever other
processing is needed for checkout. The real time is only 114ms - but
then my test repo has no real data in.
If I understand what you are saying, it sounds like you do not have a
very good test case. The amount of time it takes for checkout depends
on how long it takes to find a ref with the sha1 that you are on. If
that sha1 is so early in the list of refs that it only took you seven
traversals to find it, then that is not a very good test case. I think
that you should probably try making an orphaned ref (check out a
detached head, commit to it); that is probably the worst test case,
since it should then have to search all 140K refs to eventually give
up.

Again, if I understand what you are saying, if it took 85ms for seven
traversals, then it takes approximately 10ms per traversal; that's
only 100/s! If you have to traverse it 140K times, that should work
out to 1400s, ~23 mins.
Post by Julian Phillips
If resolve_ref() walking the linked list of refs was the problem, then
I would expect my test repo to show the same problem. It doesn't: a
pre-ref-packing checkout took minutes (~0.5s user time), whereas a
ref-packed checkout takes ~0.1s. So, I would suggest that the problem
lies elsewhere.
Have you tried running a checkout whilst profiling?
No, to be honest, I am not familiar with any profiling tools.

-Martin

Employee of Qualcomm Innovation Center, Inc., which is a member of Code Aurora Forum
Julian Phillips
2011-09-29 02:19:16 UTC
Permalink
-- snip --
Post by Martin Fick
Post by Julian Phillips
I've created a test repo with ~100k refs/changes/... style refs, and
~40000 refs/heads/... style refs, and checkout can walk the list of
~140k refs seven times in 85ms user time, including doing whatever other
processing is needed for checkout. The real time is only 114ms, but
then my test repo has no real data in it.
If I understand what you are saying, it sounds like you do not have a
very good test case. The amount of time it takes for checkout depends
on how long it takes to find a ref with the sha1 that you are on. If
that sha1 is so early in the list of refs that it only took you seven
traversals to find it, then that is not a very good test case. I think
that you should probably try making an orphaned ref (check out a
detached head, commit to it); that is probably the worst test case,
since it should then have to search all 140K refs to eventually give
up.
Again, if I understand what you are saying, if it took 85ms for seven
traversals, then it takes approximately 10ms per traversal; that's
only 100/s! If you have to traverse it 140K times, that should work
out to 1400s, ~23 mins.
Well, it's no more than 10ms per traversal - since the rest of the work
presumably takes some time too ...

However, I had forgotten to make the orphaned commit as you suggest,
and then _bang_, 7N^2: it tries seven different variants of each ref
(which is silly, as they are all fully qualified), and with packed refs
it has to search for them each time, all to turn names into hashes that
we already know to start with.

So, yes, it is that list traversal.

Does the following help?

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 5e356a6..f0f4ca1 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -605,7 +605,7 @@ static int add_one_ref_to_rev_list_arg(const char *refname,
 					int flags,
 					void *cb_data)
 {
-	add_one_rev_list_arg(cb_data, refname);
+	add_one_rev_list_arg(cb_data, strdup(sha1_to_hex(sha1)));
 	return 0;
 }
--
Julian
Martin Fick
2011-09-29 16:38:44 UTC
Permalink
On Wednesday, September 28, 2011 08:19:16 pm Julian Phillips
Post by Julian Phillips
On Wednesday 28 September 2011 18:59:09 Martin Fick
-- snip --
Post by Julian Phillips
I've created a test repo with ~100k refs/changes/...
style refs, and ~40000 refs/heads/... style refs, and
checkout can walk the list of ~140k refs seven times
in 85ms user time including doing whatever other
processing is needed for checkout. The real time is
only 114ms - but then my test repo has no real data
in.
If I understand what you are saying, it sounds like you
do not have a very good test case. The amount of time
it takes for checkout depends on how long it takes to
find a ref with the sha1 that you are on. If that sha1
is so early in the list of refs that it only took you
7 traversals to find it, then that is not a very good
testcase. I think that you should probably try making
an orphaned ref (checkout a detached head, commit to
it), that is probably the worst testcase since it
should then have to search all 140K refs to eventually
give up.
Again, if I understand what you are saying, if it took
85ms for 7 traversals, then it takes approximately
10ms per traversal, that's only 100/s! If you have to
traverse it 140K times, that should work out to 1400s
~ 23mins.
Well, it's no more than 10ms per traversal - since the
rest of the work presumably takes some time too ...
However, I had forgotten to make the orphaned commit as
you suggest - and then _bang_ 7N^2, it tries seven
different variants of each ref (which is silly as they
are all fully qualified), and with packed refs it has to
search for them each time, all to turn names into hashes
that we already know to start with.
So, yes - it is that list traversal.
Does the following help?
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 5e356a6..f0f4ca1 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -605,7 +605,7 @@ static int add_one_ref_to_rev_list_arg(const char *refname,
 					int flags,
 					void *cb_data)
 {
-	add_one_rev_list_arg(cb_data, refname);
+	add_one_rev_list_arg(cb_data, strdup(sha1_to_hex(sha1)));
 	return 0;
 }
Yes, but in some strange ways. :)

First, let me clarify that all the tests here involve your
"sort fix" from 2 days ago applied first.

In the packed-ref repo, it brings the time down to about
~10s (from > 5 mins). In the unpacked-ref repo, it brings
it down to about the same ~10s, but it was only
starting at about ~20s.

So, I have to ask: what does that change do? I don't quite
understand it. Does it just do only one lookup per ref by
normalizing it? Is the list still being traversed, just
about 7 times less now? Should the packed_ref list simply be
put in an array which could be binary searched instead? It
is a fixed list once loaded, no?

I prototyped a packed_ref implementation using the hash.c
provided in the git sources, and it seemed to speed a
checkout up to almost instantaneous, but I was getting a few
collisions, so the implementation was not good enough. That
is when I started to wonder if an array wouldn't be better
in this case.



Now I also decided to go back and test a noop fetch (a
refetch) of all the changes (since this use case is still
taking way longer than I think it should, even with the
submodule fix posted earlier). Up until this point, even
the sorting fix did not help. So I tried it with this fix.
In the unpacked-ref case, it did not seem to change (2-4 mins).
However, in the packed-ref case (which was previously also
about 2-4 mins), this now only takes about 10-15s!

Any clues as to why the unpacked refs would still be so slow
on noop fetches and not be sped up by this?


-Martin
--
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum
Julian Phillips
2011-09-29 18:26:24 UTC
Permalink
Post by Martin Fick
On Wednesday, September 28, 2011 08:19:16 pm Julian Phillips
-- snip --
Post by Martin Fick
Post by Julian Phillips
However, I had forgotten to make the orphaned commit as
you suggest - and then _bang_ 7N^2, it tries seven
different variants of each ref (which is silly as they
are all fully qualified), and with packed refs it has to
search for them each time, all to turn names into hashes
that we already know to start with.
So, yes - it is that list traversal.
Does the following help?
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 5e356a6..f0f4ca1 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -605,7 +605,7 @@ static int add_one_ref_to_rev_list_arg(const char *refname,
 					int flags,
 					void *cb_data)
 {
-	add_one_rev_list_arg(cb_data, refname);
+	add_one_rev_list_arg(cb_data, strdup(sha1_to_hex(sha1)));
 	return 0;
 }
Yes, but in some strange ways. :)
First, let me clarify that all the tests here involve your
"sort fix" from 2 days ago applied first.
In the packed ref repo, it brings the time down to about
~10s (from > 5 mins). In the unpacked ref repo, it brings
it down to about the same thing ~10s, but it was only
starting at about ~20s.
So, I have to ask: what does that change do? I don't quite
understand it. Does it just do only one lookup per ref by
normalizing it? Is the list still being traversed, just
about 7 times less now?
In order to check for orphaned commits, checkout effectively calls
rev-list passing it a list of the names of all the refs as input. The
rev-list code then has to go through this list and convert each entry
into an actual hash that it can look up in the object database. This is
where the N^2 comes in for packed refs, as it calls resolve_ref() for
each ref in the list (N), which then loops through the list of all refs
(N) to find a match. However, the code that creates the list of refs to
pass to the rev-list code already knows the hash for each ref. So the
change above passes the hashes to rev-list, which then doesn't need to
lookup the ref - it just converts the string form hash back to binary
form, avoiding the N^2 work altogether. This is why packed and unpacked
are about the same speed, as they are now doing the same amount of work.
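To put the same point in code, here is a simplified sketch (modelled on git's get_sha1_hex(), but not the exact internals) of the constant-time conversion that replaces the per-ref list search once rev-list is handed hex object names instead of refnames:

#include <stdio.h>

static int hexval(char c)
{
	if (c >= '0' && c <= '9') return c - '0';
	if (c >= 'a' && c <= 'f') return c - 'a' + 10;
	return -1;
}

/* O(1) per ref: parse a 40-char lowercase hex SHA-1 into binary,
 * instead of searching the whole ref list for a name match. */
static int get_sha1_hex_sketch(const char *hex, unsigned char *sha1)
{
	int i;
	for (i = 0; i < 20; i++) {
		int hi = hexval(hex[2 * i]);
		int lo = hi < 0 ? -1 : hexval(hex[2 * i + 1]);
		if (hi < 0 || lo < 0)
			return -1;
		sha1[i] = hi * 16 + lo;
	}
	return 0;
}

int main(void)
{
	unsigned char sha1[20];
	/* made-up hash, for illustration only */
	int ret = get_sha1_hex_sketch(
		"0123456789abcdef0123456789abcdef01234567", sha1);
	printf("parsed: %d\n", ret);
	return 0;
}

So the cost per ref drops from a list scan to a fixed 40-character parse, independent of how many refs exist.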
Post by Martin Fick
Should the packed_ref list simply be
put in an array which could be binary searched instead? It
is a fixed list once loaded, no?
A quick look at the code suggests that probably both the list of loose
refs, and the list of packed refs could both be stored as binary
searchable arrays, or in an ordered hash table. Though whether it is
actually necessary I don't know. So far, it seems to have been possible
to fix performance issues whilst keeping the simple lists ...
Post by Martin Fick
I prototyped a packed_ref implementation using the hash.c
provided in the git sources, and it seemed to make a
checkout almost instantaneous, but I was getting a few
collisions, so the implementation was not good enough. That
is when I started to wonder whether an array wouldn't be
better in this case.
Now I also decided to go back and test a no-op fetch (a
refetch) of all the changes (since this use case is still
taking way longer than I think it should, even with the
submodule fix posted earlier). Up until this point, even
the sorting fix did not help. So I tried it with this fix.
In the unpacked-ref case, it did not seem to change (2-4 mins).
However, in the packed-ref case (which was previously also
about 2-4 mins), this now only takes about 10-15s!
Any clues as to why the unpacked refs would still be so slow
on no-op fetches and not be sped up by this?
Not really. I wouldn't expect this change to have any effect on fetch,
but I haven't actually looked into it.
--
Julian
René Scharfe
2011-09-29 18:27:43 UTC
Permalink
Post by Julian Phillips
Does the following help?

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 5e356a6..f0f4ca1 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -605,7 +605,7 @@ static int add_one_ref_to_rev_list_arg(const char *refname,
int flags,
void *cb_data)
{
- add_one_rev_list_arg(cb_data, refname);
+ add_one_rev_list_arg(cb_data, strdup(sha1_to_hex(sha1)));
return 0;
}
Hmm. Can we get rid of the multiple ref lookups fixed by the above
*and* the overhead of dealing with a textual argument list at the same
time by calling add_pending_object directly, like this? (Factoring
out add_pending_sha1 should be a separate patch..)

René

---
builtin/checkout.c | 39 ++++++++++++---------------------------
revision.c | 11 ++++++++---
revision.h | 1 +
3 files changed, 21 insertions(+), 30 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 5e356a6..84e0cdc 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -588,24 +588,11 @@ static void update_refs_for_switch(struct checkout_opts *opts,
report_tracking(new);
}

-struct rev_list_args {
- int argc;
- int alloc;
- const char **argv;
-};
-
-static void add_one_rev_list_arg(struct rev_list_args *args, const char *s)
-{
- ALLOC_GROW(args->argv, args->argc + 1, args->alloc);
- args->argv[args->argc++] = s;
-}
-
-static int add_one_ref_to_rev_list_arg(const char *refname,
- const unsigned char *sha1,
- int flags,
- void *cb_data)
+static int add_pending_uninteresting_ref(const char *refname,
+ const unsigned char *sha1,
+ int flags, void *cb_data)
{
- add_one_rev_list_arg(cb_data, refname);
+ add_pending_sha1(cb_data, refname, sha1, flags | UNINTERESTING);
return 0;
}

@@ -685,19 +672,17 @@ static void suggest_reattach(struct commit *commit, struct rev_info *revs)
*/
static void orphaned_commit_warning(struct commit *commit)
{
- struct rev_list_args args = { 0, 0, NULL };
struct rev_info revs;
-
- add_one_rev_list_arg(&args, "(internal)");
- add_one_rev_list_arg(&args, sha1_to_hex(commit->object.sha1));
- add_one_rev_list_arg(&args, "--not");
- for_each_ref(add_one_ref_to_rev_list_arg, &args);
- add_one_rev_list_arg(&args, "--");
- add_one_rev_list_arg(&args, NULL);
+ struct object *object = &commit->object;

init_revisions(&revs, NULL);
- if (setup_revisions(args.argc - 1, args.argv, &revs, NULL) != 1)
- die(_("internal error: only -- alone should have been left"));
+ setup_revisions(0, NULL, &revs, NULL);
+
+ object->flags &= ~UNINTERESTING;
+ add_pending_object(&revs, object, sha1_to_hex(object->sha1));
+
+ for_each_ref(add_pending_uninteresting_ref, &revs);
+
if (prepare_revision_walk(&revs))
die(_("internal error in revision walk"));
if (!(commit->object.flags & UNINTERESTING))
diff --git a/revision.c b/revision.c
index c46cfaa..2e8aa33 100644
--- a/revision.c
+++ b/revision.c
@@ -185,6 +185,13 @@ static struct object *get_reference(struct rev_info *revs, const char *name, con
return object;
}

+void add_pending_sha1(struct rev_info *revs, const char *name,
+ const unsigned char *sha1, unsigned int flags)
+{
+ struct object *object = get_reference(revs, name, sha1, flags);
+ add_pending_object(revs, object, name);
+}
+
static struct commit *handle_commit(struct rev_info *revs, struct object *object, const char *name)
{
unsigned long flags = object->flags;
@@ -832,9 +839,7 @@ struct all_refs_cb {
static int handle_one_ref(const char *path, const unsigned char *sha1, int flag, void *cb_data)
{
struct all_refs_cb *cb = cb_data;
- struct object *object = get_reference(cb->all_revs, path, sha1,
- cb->all_flags);
- add_pending_object(cb->all_revs, object, path);
+ add_pending_sha1(cb->all_revs, path, sha1, cb->all_flags);
return 0;
}

diff --git a/revision.h b/revision.h
index 3d64ada..4541265 100644
--- a/revision.h
+++ b/revision.h
@@ -191,6 +191,7 @@ extern void add_object(struct object *obj,
const char *name);

extern void add_pending_object(struct rev_info *revs, struct object *obj, const char *name);
+extern void add_pending_sha1(struct rev_info *revs, const char *name, const unsigned char *sha1, unsigned int flags);

extern void add_head_to_pending(struct rev_info *);

--
1.7.7.rc1
Junio C Hamano
2011-09-29 19:10:06 UTC
Permalink
Post by René Scharfe
Hmm. Can we get rid of the multiple ref lookups fixed by the above
*and* the overhead of dealing with a textual argument list at the same
time by calling add_pending_object directly, like this? (Factoring
out add_pending_sha1 should be a separate patch..)
I haven't tested it or thought it through, but it smells right ;-)

Also we would probably want to drop "next" field from "struct ref_list"
(i.e. making it not a linear list), introduce a new "struct ref_array"
that is an ALLOC_GROW() managed array of pointers to "struct ref_list",
make get_packed_refs() and get_loose_refs() return a pointer to "struct
ref_array" after sorting the array contents by "name". Then resolve_ref()
can do a bisection search in the packed refs array when it does not find a
loose ref.
Martin Fick
2011-09-29 20:44:32 UTC
Permalink
On Thursday, September 29, 2011 01:10:06 pm Junio C Hamano
Post by Junio C Hamano
Also we would probably want to drop "next" field from
"struct ref_list" (i.e. making it not a linear list),
introduce a new "struct ref_array" that is a
ALLOC_GROW() managed array of pointers to "struct
ref_list", make get_packed_refs() and get_loose_refs()
return a pointer to "struct ref_array" after sorting the
array contents by "name". Then resolve_ref() can do a
bisection search in the packed refs array when it does
not find a loose ref.
That would be nice, and I suspect it would shave a bit more
off the orphan check and possibly even a fetch. If I
understood all that, I might try. But I might need some
hand holding, my C is pretty rusty... Is there a bisection
search library in git already to use? Is there a git
sorting library for the array also?
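(For what it's worth, the standard C library already provides both pieces: qsort(3) and bsearch(3) from <stdlib.h>, which is what the patch below uses. A toy example, with made-up ref names, independent of git's real ref structures:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int cmp_name(const void *a, const void *b)
{
	const char *const *one = a;
	const char *const *two = b;
	return strcmp(*one, *two);
}

int main(void)
{
	const char *refs[] = {
		"refs/tags/r3", "refs/heads/master", "refs/tags/r1",
	};
	const char *key = "refs/tags/r1";
	const char **hit;

	/* sort once after loading, then each lookup is O(log N) */
	qsort(refs, 3, sizeof(*refs), cmp_name);
	hit = bsearch(&key, refs, 3, sizeof(*refs), cmp_name);
	printf("%s\n", hit ? *hit : "not found");
	return 0;
}

The same comparison function serves both calls, which is the pattern the patch below follows with ref_entry_cmp().)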

-Martin
--
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum
Julian Phillips
2011-09-29 04:18:10 UTC
Permalink
Currently we linearly search through lists of refs when we need to
find a specific ref. This can be very slow if we need to lookup a
large number of refs. By changing to a binary search we can make this
faster.

In order to be able to use a binary search we need to change from
using linked lists to arrays, which we can manage using ALLOC_GROW.

We can now also use the standard library qsort function to sort the
refs arrays.

Signed-off-by: Julian Phillips <***@quantumfyre.co.uk>
---

Something like this?

refs.c | 328 ++++++++++++++++++++++++++--------------------------------------
1 files changed, 131 insertions(+), 197 deletions(-)

diff --git a/refs.c b/refs.c
index a49ff74..e411bea 100644
--- a/refs.c
+++ b/refs.c
@@ -8,14 +8,18 @@
#define REF_KNOWS_PEELED 04
#define REF_BROKEN 010

-struct ref_list {
- struct ref_list *next;
+struct ref_entry {
unsigned char flag; /* ISSYMREF? ISPACKED? */
unsigned char sha1[20];
unsigned char peeled[20];
char name[FLEX_ARRAY];
};

+struct ref_array {
+ int nr, alloc;
+ struct ref_entry **refs;
+};
+
static const char *parse_ref_line(char *line, unsigned char *sha1)
{
/*
@@ -44,108 +48,55 @@ static const char *parse_ref_line(char *line, unsigned char *sha1)
return line;
}

-static struct ref_list *add_ref(const char *name, const unsigned char *sha1,
- int flag, struct ref_list *list,
- struct ref_list **new_entry)
+static void add_ref(const char *name, const unsigned char *sha1,
+ int flag, struct ref_array *refs,
+ struct ref_entry **new_entry)
{
int len;
- struct ref_list *entry;
+ struct ref_entry *entry;

/* Allocate it and add it in.. */
len = strlen(name) + 1;
- entry = xmalloc(sizeof(struct ref_list) + len);
+ entry = xmalloc(sizeof(struct ref) + len);
hashcpy(entry->sha1, sha1);
hashclr(entry->peeled);
memcpy(entry->name, name, len);
entry->flag = flag;
- entry->next = list;
if (new_entry)
*new_entry = entry;
- return entry;
+ ALLOC_GROW(refs->refs, refs->nr + 1, refs->alloc);
+ refs->refs[refs->nr++] = entry;
}

-/* merge sort the ref list */
-static struct ref_list *sort_ref_list(struct ref_list *list)
+static int ref_entry_cmp(const void *a, const void *b)
{
- int psize, qsize, last_merge_count, cmp;
- struct ref_list *p, *q, *l, *e;
- struct ref_list *new_list = list;
- int k = 1;
- int merge_count = 0;
-
- if (!list)
- return list;
-
- do {
- last_merge_count = merge_count;
- merge_count = 0;
-
- psize = 0;
-
- p = new_list;
- q = new_list;
- new_list = NULL;
- l = NULL;
+ struct ref_entry *one = *(struct ref_entry **)a;
+ struct ref_entry *two = *(struct ref_entry **)b;
+ return strcmp(one->name, two->name);
+}

- while (p) {
- merge_count++;
+static void sort_ref_array(struct ref_array *array)
+{
+ qsort(array->refs, array->nr, sizeof(*array->refs), ref_entry_cmp);
+}

- while (psize < k && q->next) {
- q = q->next;
- psize++;
- }
- qsize = k;
-
- while ((psize > 0) || (qsize > 0 && q)) {
- if (qsize == 0 || !q) {
- e = p;
- p = p->next;
- psize--;
- } else if (psize == 0) {
- e = q;
- q = q->next;
- qsize--;
- } else {
- cmp = strcmp(q->name, p->name);
- if (cmp < 0) {
- e = q;
- q = q->next;
- qsize--;
- } else if (cmp > 0) {
- e = p;
- p = p->next;
- psize--;
- } else {
- if (hashcmp(q->sha1, p->sha1))
- die("Duplicated ref, and SHA1s don't match: %s",
- q->name);
- warning("Duplicated ref: %s", q->name);
- e = q;
- q = q->next;
- qsize--;
- free(e);
- e = p;
- p = p->next;
- psize--;
- }
- }
+static struct ref_entry *search_ref_array(struct ref_array *array, const char *name)
+{
+ struct ref_entry *e, **r;
+ int len;

- e->next = NULL;
+ len = strlen(name) + 1;
+ e = xmalloc(sizeof(struct ref) + len);
+ memcpy(e->name, name, len);

- if (l)
- l->next = e;
- if (!new_list)
- new_list = e;
- l = e;
- }
+ r = bsearch(&e, array->refs, array->nr, sizeof(*array->refs), ref_entry_cmp);

- p = q;
- };
+ free(e);

- k = k * 2;
- } while ((last_merge_count != merge_count) || (last_merge_count != 1));
+ if (r == NULL)
+ return NULL;

- return new_list;
+ return *r;
}

/*
@@ -155,38 +106,37 @@ static struct ref_list *sort_ref_list(struct ref_list *list)
static struct cached_refs {
char did_loose;
char did_packed;
- struct ref_list *loose;
- struct ref_list *packed;
+ struct ref_array loose;
+ struct ref_array packed;
} cached_refs, submodule_refs;
-static struct ref_list *current_ref;
+static struct ref_entry *current_ref;

-static struct ref_list *extra_refs;
+static struct ref_array extra_refs;

-static void free_ref_list(struct ref_list *list)
+static void free_ref_array(struct ref_array *array)
{
- struct ref_list *next;
- for ( ; list; list = next) {
- next = list->next;
- free(list);
- }
+ int i;
+ for (i = 0; i < array->nr; i++)
+ free(array->refs[i]);
+ free(array->refs);
+ array->nr = array->alloc = 0;
+ array->refs = NULL;
}

static void invalidate_cached_refs(void)
{
struct cached_refs *ca = &cached_refs;

- if (ca->did_loose && ca->loose)
- free_ref_list(ca->loose);
- if (ca->did_packed && ca->packed)
- free_ref_list(ca->packed);
- ca->loose = ca->packed = NULL;
+ if (ca->did_loose)
+ free_ref_array(&ca->loose);
+ if (ca->did_packed)
+ free_ref_array(&ca->packed);
ca->did_loose = ca->did_packed = 0;
}

static void read_packed_refs(FILE *f, struct cached_refs *cached_refs)
{
- struct ref_list *list = NULL;
- struct ref_list *last = NULL;
+ struct ref_entry *last = NULL;
char refline[PATH_MAX];
int flag = REF_ISPACKED;

@@ -205,7 +155,7 @@ static void read_packed_refs(FILE *f, struct cached_refs *cached_refs)

name = parse_ref_line(refline, sha1);
if (name) {
- list = add_ref(name, sha1, flag, list, &last);
+ add_ref(name, sha1, flag, &cached_refs->packed, &last);
continue;
}
if (last &&
@@ -215,21 +165,20 @@ static void read_packed_refs(FILE *f, struct cached_refs *cached_refs)
!get_sha1_hex(refline + 1, sha1))
hashcpy(last->peeled, sha1);
}
- cached_refs->packed = sort_ref_list(list);
+ sort_ref_array(&cached_refs->packed);
}

void add_extra_ref(const char *name, const unsigned char *sha1, int flag)
{
- extra_refs = add_ref(name, sha1, flag, extra_refs, NULL);
+ add_ref(name, sha1, flag, &extra_refs, NULL);
}

void clear_extra_refs(void)
{
- free_ref_list(extra_refs);
- extra_refs = NULL;
+ free_ref_array(&extra_refs);
}

-static struct ref_list *get_packed_refs(const char *submodule)
+static struct ref_array *get_packed_refs(const char *submodule)
{
const char *packed_refs_file;
struct cached_refs *refs;
@@ -237,7 +186,7 @@ static struct ref_list *get_packed_refs(const char *submodule)
if (submodule) {
packed_refs_file = git_path_submodule(submodule, "packed-refs");
refs = &submodule_refs;
- free_ref_list(refs->packed);
+ free_ref_array(&refs->packed);
} else {
packed_refs_file = git_path("packed-refs");
refs = &cached_refs;
@@ -245,18 +194,17 @@ static struct ref_list *get_packed_refs(const char *submodule)

if (!refs->did_packed || submodule) {
FILE *f = fopen(packed_refs_file, "r");
- refs->packed = NULL;
if (f) {
read_packed_refs(f, refs);
fclose(f);
}
refs->did_packed = 1;
}
- return refs->packed;
+ return &refs->packed;
}

-static struct ref_list *get_ref_dir(const char *submodule, const char *base,
- struct ref_list *list)
+static void get_ref_dir(const char *submodule, const char *base,
+ struct ref_array *array)
{
DIR *dir;
const char *path;
@@ -299,7 +247,7 @@ static struct ref_list *get_ref_dir(const char *submodule, const char *base,
if (stat(refdir, &st) < 0)
continue;
if (S_ISDIR(st.st_mode)) {
- list = get_ref_dir(submodule, ref, list);
+ get_ref_dir(submodule, ref, array);
continue;
}
if (submodule) {
@@ -314,12 +262,11 @@ static struct ref_list *get_ref_dir(const char *submodule, const char *base,
hashclr(sha1);
flag |= REF_BROKEN;
}
- list = add_ref(ref, sha1, flag, list, NULL);
+ add_ref(ref, sha1, flag, array, NULL);
}
free(ref);
closedir(dir);
}
- return list;
}

struct warn_if_dangling_data {
@@ -356,21 +303,21 @@ void warn_dangling_symref(FILE *fp, const char *msg_fmt, const char *refname)
for_each_rawref(warn_if_dangling_symref, &data);
}

-static struct ref_list *get_loose_refs(const char *submodule)
+static struct ref_array *get_loose_refs(const char *submodule)
{
if (submodule) {
- free_ref_list(submodule_refs.loose);
- submodule_refs.loose = get_ref_dir(submodule, "refs", NULL);
- submodule_refs.loose = sort_ref_list(submodule_refs.loose);
- return submodule_refs.loose;
+ free_ref_array(&submodule_refs.loose);
+ get_ref_dir(submodule, "refs", &submodule_refs.loose);
+ sort_ref_array(&submodule_refs.loose);
+ return &submodule_refs.loose;
}

if (!cached_refs.did_loose) {
- cached_refs.loose = get_ref_dir(NULL, "refs", NULL);
- cached_refs.loose = sort_ref_list(cached_refs.loose);
+ get_ref_dir(NULL, "refs", &cached_refs.loose);
+ sort_ref_array(&cached_refs.loose);
cached_refs.did_loose = 1;
}
- return cached_refs.loose;
+ return &cached_refs.loose;
}

/* We allow "recursive" symbolic refs. Only within reason, though */
@@ -381,8 +328,8 @@ static int resolve_gitlink_packed_ref(char *name, int pathlen, const char *refna
{
FILE *f;
struct cached_refs refs;
- struct ref_list *ref;
- int retval;
+ struct ref_entry *ref;
+ int retval = -1;

strcpy(name + pathlen, "packed-refs");
f = fopen(name, "r");
@@ -390,17 +337,12 @@ static int resolve_gitlink_packed_ref(char *name, int pathlen, const char *refna
return -1;
read_packed_refs(f, &refs);
fclose(f);
- ref = refs.packed;
- retval = -1;
- while (ref) {
- if (!strcmp(ref->name, refname)) {
- retval = 0;
- memcpy(result, ref->sha1, 20);
- break;
- }
- ref = ref->next;
+ ref = search_ref_array(&refs.packed, refname);
+ if (ref != NULL) {
+ memcpy(result, ref->sha1, 20);
+ retval = 0;
}
- free_ref_list(refs.packed);
+ free_ref_array(&refs.packed);
return retval;
}

@@ -501,15 +443,13 @@ const char *resolve_ref(const char *ref, unsigned char *sha1, int reading, int *
git_snpath(path, sizeof(path), "%s", ref);
/* Special case: non-existing file. */
if (lstat(path, &st) < 0) {
- struct ref_list *list = get_packed_refs(NULL);
- while (list) {
- if (!strcmp(ref, list->name)) {
- hashcpy(sha1, list->sha1);
- if (flag)
- *flag |= REF_ISPACKED;
- return ref;
- }
- list = list->next;
+ struct ref_array *packed = get_packed_refs(NULL);
+ struct ref_entry *r = search_ref_array(packed, ref);
+ if (r != NULL) {
+ hashcpy(sha1, r->sha1);
+ if (flag)
+ *flag |= REF_ISPACKED;
+ return ref;
}
if (reading || errno != ENOENT)
return NULL;
@@ -584,7 +524,7 @@ int read_ref(const char *ref, unsigned char *sha1)

#define DO_FOR_EACH_INCLUDE_BROKEN 01
static int do_one_ref(const char *base, each_ref_fn fn, int trim,
- int flags, void *cb_data, struct ref_list *entry)
+ int flags, void *cb_data, struct ref_entry *entry)
{
if (prefixcmp(entry->name, base))
return 0;
@@ -630,18 +570,12 @@ int peel_ref(const char *ref, unsigned char *sha1)
return -1;

if ((flag & REF_ISPACKED)) {
- struct ref_list *list = get_packed_refs(NULL);
+ struct ref_array *array = get_packed_refs(NULL);
+ struct ref_entry *r = search_ref_array(array, ref);

- while (list) {
- if (!strcmp(list->name, ref)) {
- if (list->flag & REF_KNOWS_PEELED) {
- hashcpy(sha1, list->peeled);
- return 0;
- }
- /* older pack-refs did not leave peeled ones */
- break;
- }
- list = list->next;
+ if (r != NULL && r->flag & REF_KNOWS_PEELED) {
+ hashcpy(sha1, r->peeled);
+ return 0;
}
}

@@ -660,36 +594,39 @@ fallback:
static int do_for_each_ref(const char *submodule, const char *base, each_ref_fn fn,
int trim, int flags, void *cb_data)
{
- int retval = 0;
- struct ref_list *packed = get_packed_refs(submodule);
- struct ref_list *loose = get_loose_refs(submodule);
+ int retval = 0, i, p = 0, l = 0;
+ struct ref_array *packed = get_packed_refs(submodule);
+ struct ref_array *loose = get_loose_refs(submodule);

- struct ref_list *extra;
+ struct ref_array *extra = &extra_refs;

- for (extra = extra_refs; extra; extra = extra->next)
- retval = do_one_ref(base, fn, trim, flags, cb_data, extra);
+ for (i = 0; i < extra->nr; i++)
+ retval = do_one_ref(base, fn, trim, flags, cb_data, extra->refs[i]);

- while (packed && loose) {
- struct ref_list *entry;
- int cmp = strcmp(packed->name, loose->name);
+ while (p < packed->nr && l < loose->nr) {
+ struct ref_entry *entry;
+ int cmp = strcmp(packed->refs[p]->name, loose->refs[l]->name);
if (!cmp) {
- packed = packed->next;
+ p++;
continue;
}
if (cmp > 0) {
- entry = loose;
- loose = loose->next;
+ entry = loose->refs[l++];
} else {
- entry = packed;
- packed = packed->next;
+ entry = packed->refs[p++];
}
retval = do_one_ref(base, fn, trim, flags, cb_data, entry);
if (retval)
goto end_each;
}

- for (packed = packed ? packed : loose; packed; packed = packed->next) {
- retval = do_one_ref(base, fn, trim, flags, cb_data, packed);
+ if (l < loose->nr) {
+ p = l;
+ packed = loose;
+ }
+
+ for (; p < packed->nr; p++) {
+ retval = do_one_ref(base, fn, trim, flags, cb_data, packed->refs[p]);
if (retval)
goto end_each;
}
@@ -1005,24 +942,24 @@ static int remove_empty_directories(const char *file)
}

static int is_refname_available(const char *ref, const char *oldref,
- struct ref_list *list, int quiet)
-{
- int namlen = strlen(ref); /* e.g. 'foo/bar' */
- while (list) {
- /* list->name could be 'foo' or 'foo/bar/baz' */
- if (!oldref || strcmp(oldref, list->name)) {
- int len = strlen(list->name);
+ struct ref_array *array, int quiet)
+{
+ int i, namlen = strlen(ref); /* e.g. 'foo/bar' */
+ for (i = 0; i < array->nr; i++ ) {
+ struct ref_entry *entry = array->refs[i];
+ /* entry->name could be 'foo' or 'foo/bar/baz' */
+ if (!oldref || strcmp(oldref, entry->name)) {
+ int len = strlen(entry->name);
int cmplen = (namlen < len) ? namlen : len;
- const char *lead = (namlen < len) ? list->name : ref;
- if (!strncmp(ref, list->name, cmplen) &&
+ const char *lead = (namlen < len) ? entry->name : ref;
+ if (!strncmp(ref, entry->name, cmplen) &&
lead[cmplen] == '/') {
if (!quiet)
error("'%s' exists; cannot create '%s'",
- list->name, ref);
+ entry->name, ref);
return 0;
}
}
- list = list->next;
}
return 1;
}
@@ -1129,18 +1066,13 @@ static struct lock_file packlock;

static int repack_without_ref(const char *refname)
{
- struct ref_list *list, *packed_ref_list;
- int fd;
- int found = 0;
+ struct ref_array *packed;
+ struct ref_entry *ref;
+ int fd, i;

- packed_ref_list = get_packed_refs(NULL);
- for (list = packed_ref_list; list; list = list->next) {
- if (!strcmp(refname, list->name)) {
- found = 1;
- break;
- }
- }
- if (!found)
+ packed = get_packed_refs(NULL);
+ ref = search_ref_array(packed, refname);
+ if (ref == NULL)
return 0;
fd = hold_lock_file_for_update(&packlock, git_path("packed-refs"), 0);
if (fd < 0) {
@@ -1148,17 +1080,19 @@ static int repack_without_ref(const char *refname)
return error("cannot delete '%s' from packed refs", refname);
}

- for (list = packed_ref_list; list; list = list->next) {
+ for (i = 0; i < packed->nr; i++) {
char line[PATH_MAX + 100];
int len;

- if (!strcmp(refname, list->name))
+ ref = packed->refs[i];
+
+ if (!strcmp(refname, ref->name))
continue;
len = snprintf(line, sizeof(line), "%s %s\n",
- sha1_to_hex(list->sha1), list->name);
+ sha1_to_hex(ref->sha1), ref->name);
/* this should not happen but just being defensive */
if (len > sizeof(line))
- die("too long a refname '%s'", list->name);
+ die("too long a refname '%s'", ref->name);
write_or_die(fd, line, len);
}
return commit_lock_file(&packlock);
--
1.7.6.1
Junio C Hamano
2011-09-29 21:57:53 UTC
Permalink
Post by Julian Phillips
Currently we linearly search through lists of refs when we need to
find a specific ref. This can be very slow if we need to lookup a
large number of refs. By changing to a binary search we can make this
faster.
In order to be able to use a binary search we need to change from
using linked lists to arrays, which we can manage using ALLOC_GROW.
We can now also use the standard library qsort function to sort the
refs arrays.
---
Something like this?
refs.c | 328 ++++++++++++++++++++++++++--------------------------------------
1 files changed, 131 insertions(+), 197 deletions(-)
diff --git a/refs.c b/refs.c
index a49ff74..e411bea 100644
--- a/refs.c
+++ b/refs.c
@@ -8,14 +8,18 @@
#define REF_KNOWS_PEELED 04
#define REF_BROKEN 010
-struct ref_list {
- struct ref_list *next;
+struct ref_entry {
unsigned char flag; /* ISSYMREF? ISPACKED? */
unsigned char sha1[20];
unsigned char peeled[20];
char name[FLEX_ARRAY];
};
+struct ref_array {
+ int nr, alloc;
+ struct ref_entry **refs;
+};
+
Yeah, I can say "something like that" without looking at the rest of the
patch ;-) The rest should naturally follow from the above data structures.
Junio C Hamano
2011-09-29 22:06:03 UTC
Permalink
Post by Julian Phillips
+static void add_ref(const char *name, const unsigned char *sha1,
+ int flag, struct ref_array *refs,
+ struct ref_entry **new_entry)
{
int len;
- struct ref_list *entry;
+ struct ref_entry *entry;
/* Allocate it and add it in.. */
len = strlen(name) + 1;
- entry = xmalloc(sizeof(struct ref_list) + len);
+ entry = xmalloc(sizeof(struct ref) + len);
This should be sizeof(struct ref_entry), no? There is another such
misallocation in search_ref_array() where it prepares a temporary.
Julian Phillips
2011-09-29 22:11:42 UTC
Permalink
Currently we linearly search through lists of refs when we need to
find a specific ref. This can be very slow if we need to lookup a
large number of refs. By changing to a binary search we can make this
faster.

In order to be able to use a binary search we need to change from
using linked lists to arrays, which we can manage using ALLOC_GROW.

We can now also use the standard library qsort function to sort the
refs arrays.

Signed-off-by: Julian Phillips <***@quantumfyre.co.uk>
---
Post by Junio C Hamano
Post by Julian Phillips
+static void add_ref(const char *name, const unsigned char *sha1,
+ int flag, struct ref_array *refs,
+ struct ref_entry **new_entry)
{
int len;
- struct ref_list *entry;
+ struct ref_entry *entry;
/* Allocate it and add it in.. */
len = strlen(name) + 1;
- entry = xmalloc(sizeof(struct ref_list) + len);
+ entry = xmalloc(sizeof(struct ref) + len);
This should be sizeof(struct ref_entry), no? There is another such
misallocation in search_ref_array() where it prepares a temporary.
Indeed, thanks.

Looks like two instances of not noticing that "struct ref" already existed
managed to survive. Drat. Of course since "struct ref" is bigger than "struct
ref_entry", everthing worked fine ... so no failed tests to tip me off.

refs.c | 329 ++++++++++++++++++++++++++--------------------------------------
1 files changed, 133 insertions(+), 196 deletions(-)

diff --git a/refs.c b/refs.c
index a49ff74..4c01d79 100644
--- a/refs.c
+++ b/refs.c
@@ -8,14 +8,18 @@
#define REF_KNOWS_PEELED 04
#define REF_BROKEN 010

-struct ref_list {
- struct ref_list *next;
+struct ref_entry {
unsigned char flag; /* ISSYMREF? ISPACKED? */
unsigned char sha1[20];
unsigned char peeled[20];
char name[FLEX_ARRAY];
};

+struct ref_array {
+ int nr, alloc;
+ struct ref_entry **refs;
+};
+
static const char *parse_ref_line(char *line, unsigned char *sha1)
{
/*
@@ -44,108 +48,58 @@ static const char *parse_ref_line(char *line, unsigned char *sha1)
return line;
}

-static struct ref_list *add_ref(const char *name, const unsigned char *sha1,
- int flag, struct ref_list *list,
- struct ref_list **new_entry)
+static void add_ref(const char *name, const unsigned char *sha1,
+ int flag, struct ref_array *refs,
+ struct ref_entry **new_entry)
{
int len;
- struct ref_list *entry;
+ struct ref_entry *entry;

/* Allocate it and add it in.. */
len = strlen(name) + 1;
- entry = xmalloc(sizeof(struct ref_list) + len);
+ entry = xmalloc(sizeof(struct ref_entry) + len);
hashcpy(entry->sha1, sha1);
hashclr(entry->peeled);
memcpy(entry->name, name, len);
entry->flag = flag;
- entry->next = list;
if (new_entry)
*new_entry = entry;
- return entry;
+ ALLOC_GROW(refs->refs, refs->nr + 1, refs->alloc);
+ refs->refs[refs->nr++] = entry;
}

-/* merge sort the ref list */
-static struct ref_list *sort_ref_list(struct ref_list *list)
+static int ref_entry_cmp(const void *a, const void *b)
{
- int psize, qsize, last_merge_count, cmp;
- struct ref_list *p, *q, *l, *e;
- struct ref_list *new_list = list;
- int k = 1;
- int merge_count = 0;
-
- if (!list)
- return list;
-
- do {
- last_merge_count = merge_count;
- merge_count = 0;
-
- psize = 0;
+ struct ref_entry *one = *(struct ref_entry **)a;
+ struct ref_entry *two = *(struct ref_entry **)b;
+ return strcmp(one->name, two->name);
+}

- p = new_list;
- q = new_list;
- new_list = NULL;
- l = NULL;
+static void sort_ref_array(struct ref_array *array)
+{
+ qsort(array->refs, array->nr, sizeof(*array->refs), ref_entry_cmp);
+}

- while (p) {
- merge_count++;
+static struct ref_entry *search_ref_array(struct ref_array *array, const char *name)
+{
+ struct ref_entry *e, **r;
+ int len;

- while (psize < k && q->next) {
- q = q->next;
- psize++;
- }
- qsize = k;
-
- while ((psize > 0) || (qsize > 0 && q)) {
- if (qsize == 0 || !q) {
- e = p;
- p = p->next;
- psize--;
- } else if (psize == 0) {
- e = q;
- q = q->next;
- qsize--;
- } else {
- cmp = strcmp(q->name, p->name);
- if (cmp < 0) {
- e = q;
- q = q->next;
- qsize--;
- } else if (cmp > 0) {
- e = p;
- p = p->next;
- psize--;
- } else {
- if (hashcmp(q->sha1, p->sha1))
- die("Duplicated ref, and SHA1s don't match: %s",
- q->name);
- warning("Duplicated ref: %s", q->name);
- e = q;
- q = q->next;
- qsize--;
- free(e);
- e = p;
- p = p->next;
- psize--;
- }
- }
+ if (name == NULL)
+ return NULL;

- e->next = NULL;
+ len = strlen(name) + 1;
+ e = xmalloc(sizeof(struct ref_entry) + len);
+ memcpy(e->name, name, len);

- if (l)
- l->next = e;
- if (!new_list)
- new_list = e;
- l = e;
- }
+ r = bsearch(&e, array->refs, array->nr, sizeof(*array->refs), ref_entry_cmp);

- p = q;
- };
+ free(e);

- k = k * 2;
- } while ((last_merge_count != merge_count) || (last_merge_count != 1));
+ if (r == NULL)
+ return NULL;

- return new_list;
+ return *r;
}

/*
@@ -155,38 +109,37 @@ static struct ref_list *sort_ref_list(struct ref_list *list)
static struct cached_refs {
char did_loose;
char did_packed;
- struct ref_list *loose;
- struct ref_list *packed;
+ struct ref_array loose;
+ struct ref_array packed;
} cached_refs, submodule_refs;
-static struct ref_list *current_ref;
+static struct ref_entry *current_ref;

-static struct ref_list *extra_refs;
+static struct ref_array extra_refs;

-static void free_ref_list(struct ref_list *list)
+static void free_ref_array(struct ref_array *array)
{
- struct ref_list *next;
- for ( ; list; list = next) {
- next = list->next;
- free(list);
- }
+ int i;
+ for (i = 0; i < array->nr; i++)
+ free(array->refs[i]);
+ free(array->refs);
+ array->nr = array->alloc = 0;
+ array->refs = NULL;
}

static void invalidate_cached_refs(void)
{
struct cached_refs *ca = &cached_refs;

- if (ca->did_loose && ca->loose)
- free_ref_list(ca->loose);
- if (ca->did_packed && ca->packed)
- free_ref_list(ca->packed);
- ca->loose = ca->packed = NULL;
+ if (ca->did_loose)
+ free_ref_array(&ca->loose);
+ if (ca->did_packed)
+ free_ref_array(&ca->packed);
ca->did_loose = ca->did_packed = 0;
}

static void read_packed_refs(FILE *f, struct cached_refs *cached_refs)
{
- struct ref_list *list = NULL;
- struct ref_list *last = NULL;
+ struct ref_entry *last = NULL;
char refline[PATH_MAX];
int flag = REF_ISPACKED;

@@ -205,7 +158,7 @@ static void read_packed_refs(FILE *f, struct cached_refs *cached_refs)

name = parse_ref_line(refline, sha1);
if (name) {
- list = add_ref(name, sha1, flag, list, &last);
+ add_ref(name, sha1, flag, &cached_refs->packed, &last);
continue;
}
if (last &&
@@ -215,21 +168,20 @@ static void read_packed_refs(FILE *f, struct cached_refs *cached_refs)
!get_sha1_hex(refline + 1, sha1))
hashcpy(last->peeled, sha1);
}
- cached_refs->packed = sort_ref_list(list);
+ sort_ref_array(&cached_refs->packed);
}

void add_extra_ref(const char *name, const unsigned char *sha1, int flag)
{
- extra_refs = add_ref(name, sha1, flag, extra_refs, NULL);
+ add_ref(name, sha1, flag, &extra_refs, NULL);
}

void clear_extra_refs(void)
{
- free_ref_list(extra_refs);
- extra_refs = NULL;
+ free_ref_array(&extra_refs);
}

-static struct ref_list *get_packed_refs(const char *submodule)
+static struct ref_array *get_packed_refs(const char *submodule)
{
const char *packed_refs_file;
struct cached_refs *refs;
@@ -237,7 +189,7 @@ static struct ref_list *get_packed_refs(const char *submodule)
if (submodule) {
packed_refs_file = git_path_submodule(submodule, "packed-refs");
refs = &submodule_refs;
- free_ref_list(refs->packed);
+ free_ref_array(&refs->packed);
} else {
packed_refs_file = git_path("packed-refs");
refs = &cached_refs;
@@ -245,18 +197,17 @@ static struct ref_list *get_packed_refs(const char *submodule)

if (!refs->did_packed || submodule) {
FILE *f = fopen(packed_refs_file, "r");
- refs->packed = NULL;
if (f) {
read_packed_refs(f, refs);
fclose(f);
}
refs->did_packed = 1;
}
- return refs->packed;
+ return &refs->packed;
}

-static struct ref_list *get_ref_dir(const char *submodule, const char *base,
- struct ref_list *list)
+static void get_ref_dir(const char *submodule, const char *base,
+ struct ref_array *array)
{
DIR *dir;
const char *path;
@@ -299,7 +250,7 @@ static struct ref_list *get_ref_dir(const char *submodule, const char *base,
if (stat(refdir, &st) < 0)
continue;
if (S_ISDIR(st.st_mode)) {
- list = get_ref_dir(submodule, ref, list);
+ get_ref_dir(submodule, ref, array);
continue;
}
if (submodule) {
@@ -314,12 +265,11 @@ static struct ref_list *get_ref_dir(const char *submodule, const char *base,
hashclr(sha1);
flag |= REF_BROKEN;
}
- list = add_ref(ref, sha1, flag, list, NULL);
+ add_ref(ref, sha1, flag, array, NULL);
}
free(ref);
closedir(dir);
}
- return list;
}

struct warn_if_dangling_data {
@@ -356,21 +306,21 @@ void warn_dangling_symref(FILE *fp, const char *msg_fmt, const char *refname)
for_each_rawref(warn_if_dangling_symref, &data);
}

-static struct ref_list *get_loose_refs(const char *submodule)
+static struct ref_array *get_loose_refs(const char *submodule)
{
if (submodule) {
- free_ref_list(submodule_refs.loose);
- submodule_refs.loose = get_ref_dir(submodule, "refs", NULL);
- submodule_refs.loose = sort_ref_list(submodule_refs.loose);
- return submodule_refs.loose;
+ free_ref_array(&submodule_refs.loose);
+ get_ref_dir(submodule, "refs", &submodule_refs.loose);
+ sort_ref_array(&submodule_refs.loose);
+ return &submodule_refs.loose;
}

if (!cached_refs.did_loose) {
- cached_refs.loose = get_ref_dir(NULL, "refs", NULL);
- cached_refs.loose = sort_ref_list(cached_refs.loose);
+ get_ref_dir(NULL, "refs", &cached_refs.loose);
+ sort_ref_array(&cached_refs.loose);
cached_refs.did_loose = 1;
}
- return cached_refs.loose;
+ return &cached_refs.loose;
}

/* We allow "recursive" symbolic refs. Only within reason, though */
@@ -381,8 +331,8 @@ static int resolve_gitlink_packed_ref(char *name, int pathlen, const char *refna
{
FILE *f;
struct cached_refs refs;
- struct ref_list *ref;
- int retval;
+ struct ref_entry *ref;
+ int retval = -1;

strcpy(name + pathlen, "packed-refs");
f = fopen(name, "r");
@@ -390,17 +340,12 @@ static int resolve_gitlink_packed_ref(char *name, int pathlen, const char *refna
return -1;
read_packed_refs(f, &refs);
fclose(f);
- ref = refs.packed;
- retval = -1;
- while (ref) {
- if (!strcmp(ref->name, refname)) {
- retval = 0;
- memcpy(result, ref->sha1, 20);
- break;
- }
- ref = ref->next;
+ ref = search_ref_array(&refs.packed, refname);
+ if (ref != NULL) {
+ memcpy(result, ref->sha1, 20);
+ retval = 0;
}
- free_ref_list(refs.packed);
+ free_ref_array(&refs.packed);
return retval;
}

@@ -501,15 +446,13 @@ const char *resolve_ref(const char *ref, unsigned char *sha1, int reading, int *
git_snpath(path, sizeof(path), "%s", ref);
/* Special case: non-existing file. */
if (lstat(path, &st) < 0) {
- struct ref_list *list = get_packed_refs(NULL);
- while (list) {
- if (!strcmp(ref, list->name)) {
- hashcpy(sha1, list->sha1);
- if (flag)
- *flag |= REF_ISPACKED;
- return ref;
- }
- list = list->next;
+ struct ref_array *packed = get_packed_refs(NULL);
+ struct ref_entry *r = search_ref_array(packed, ref);
+ if (r != NULL) {
+ hashcpy(sha1, r->sha1);
+ if (flag)
+ *flag |= REF_ISPACKED;
+ return ref;
}
if (reading || errno != ENOENT)
return NULL;
@@ -584,7 +527,7 @@ int read_ref(const char *ref, unsigned char *sha1)

#define DO_FOR_EACH_INCLUDE_BROKEN 01
static int do_one_ref(const char *base, each_ref_fn fn, int trim,
- int flags, void *cb_data, struct ref_list *entry)
+ int flags, void *cb_data, struct ref_entry *entry)
{
if (prefixcmp(entry->name, base))
return 0;
@@ -630,18 +573,12 @@ int peel_ref(const char *ref, unsigned char *sha1)
return -1;

if ((flag & REF_ISPACKED)) {
- struct ref_list *list = get_packed_refs(NULL);
+ struct ref_array *array = get_packed_refs(NULL);
+ struct ref_entry *r = search_ref_array(array, ref);

- while (list) {
- if (!strcmp(list->name, ref)) {
- if (list->flag & REF_KNOWS_PEELED) {
- hashcpy(sha1, list->peeled);
- return 0;
- }
- /* older pack-refs did not leave peeled ones */
- break;
- }
- list = list->next;
+ if (r != NULL && r->flag & REF_KNOWS_PEELED) {
+ hashcpy(sha1, r->peeled);
+ return 0;
}
}

@@ -660,36 +597,39 @@ fallback:
static int do_for_each_ref(const char *submodule, const char *base, each_ref_fn fn,
int trim, int flags, void *cb_data)
{
- int retval = 0;
- struct ref_list *packed = get_packed_refs(submodule);
- struct ref_list *loose = get_loose_refs(submodule);
+ int retval = 0, i, p = 0, l = 0;
+ struct ref_array *packed = get_packed_refs(submodule);
+ struct ref_array *loose = get_loose_refs(submodule);

- struct ref_list *extra;
+ struct ref_array *extra = &extra_refs;

- for (extra = extra_refs; extra; extra = extra->next)
- retval = do_one_ref(base, fn, trim, flags, cb_data, extra);
+ for (i = 0; i < extra->nr; i++)
+ retval = do_one_ref(base, fn, trim, flags, cb_data, extra->refs[i]);

- while (packed && loose) {
- struct ref_list *entry;
- int cmp = strcmp(packed->name, loose->name);
+ while (p < packed->nr && l < loose->nr) {
+ struct ref_entry *entry;
+ int cmp = strcmp(packed->refs[p]->name, loose->refs[l]->name);
if (!cmp) {
- packed = packed->next;
+ p++;
continue;
}
if (cmp > 0) {
- entry = loose;
- loose = loose->next;
+ entry = loose->refs[l++];
} else {
- entry = packed;
- packed = packed->next;
+ entry = packed->refs[p++];
}
retval = do_one_ref(base, fn, trim, flags, cb_data, entry);
if (retval)
goto end_each;
}

- for (packed = packed ? packed : loose; packed; packed = packed->next) {
- retval = do_one_ref(base, fn, trim, flags, cb_data, packed);
+ if (l < loose->nr) {
+ p = l;
+ packed = loose;
+ }
+
+ for (; p < packed->nr; p++) {
+ retval = do_one_ref(base, fn, trim, flags, cb_data, packed->refs[p]);
if (retval)
goto end_each;
}
@@ -1005,24 +945,24 @@ static int remove_empty_directories(const char *file)
}

static int is_refname_available(const char *ref, const char *oldref,
- struct ref_list *list, int quiet)
-{
- int namlen = strlen(ref); /* e.g. 'foo/bar' */
- while (list) {
- /* list->name could be 'foo' or 'foo/bar/baz' */
- if (!oldref || strcmp(oldref, list->name)) {
- int len = strlen(list->name);
+ struct ref_array *array, int quiet)
+{
+ int i, namlen = strlen(ref); /* e.g. 'foo/bar' */
+ for (i = 0; i < array->nr; i++ ) {
+ struct ref_entry *entry = array->refs[i];
+ /* entry->name could be 'foo' or 'foo/bar/baz' */
+ if (!oldref || strcmp(oldref, entry->name)) {
+ int len = strlen(entry->name);
int cmplen = (namlen < len) ? namlen : len;
- const char *lead = (namlen < len) ? list->name : ref;
- if (!strncmp(ref, list->name, cmplen) &&
+ const char *lead = (namlen < len) ? entry->name : ref;
+ if (!strncmp(ref, entry->name, cmplen) &&
lead[cmplen] == '/') {
if (!quiet)
error("'%s' exists; cannot create '%s'",
- list->name, ref);
+ entry->name, ref);
return 0;
}
}
- list = list->next;
}
return 1;
}
@@ -1129,18 +1069,13 @@ static struct lock_file packlock;

static int repack_without_ref(const char *refname)
{
- struct ref_list *list, *packed_ref_list;
- int fd;
- int found = 0;
+ struct ref_array *packed;
+ struct ref_entry *ref;
+ int fd, i;

- packed_ref_list = get_packed_refs(NULL);
- for (list = packed_ref_list; list; list = list->next) {
- if (!strcmp(refname, list->name)) {
- found = 1;
- break;
- }
- }
- if (!found)
+ packed = get_packed_refs(NULL);
+ ref = search_ref_array(packed, refname);
+ if (ref == NULL)
return 0;
fd = hold_lock_file_for_update(&packlock, git_path("packed-refs"), 0);
if (fd < 0) {
@@ -1148,17 +1083,19 @@ static int repack_without_ref(const char *refname)
return error("cannot delete '%s' from packed refs", refname);
}

- for (list = packed_ref_list; list; list = list->next) {
+ for (i = 0; i < packed->nr; i++) {
char line[PATH_MAX + 100];
int len;

- if (!strcmp(refname, list->name))
+ ref = packed->refs[i];
+
+ if (!strcmp(refname, ref->name))
continue;
len = snprintf(line, sizeof(line), "%s %s\n",
- sha1_to_hex(list->sha1), list->name);
+ sha1_to_hex(ref->sha1), ref->name);
/* this should not happen but just being defensive */
if (len > sizeof(line))
- die("too long a refname '%s'", list->name);
+ die("too long a refname '%s'", ref->name);
write_or_die(fd, line, len);
}
return commit_lock_file(&packlock);
--
1.7.6.1
Junio C Hamano
2011-09-29 23:48:00 UTC
Permalink
This version looks sane, although I have a suspicion that it may have
some interaction with what Michael may be working on.

Thanks.
Michael Haggerty
2011-09-30 15:30:04 UTC
Permalink
Post by Junio C Hamano
This version looks sane, although I have a suspicion that it may have
some interaction with what Michael may be working on.
Indeed, I have almost equivalent changes in the giant patch series that
I am working on [1]. The branch is very experimental. The tip
currently passes all the tests, but it has a known performance
regression in the connectivity check when "git fetch" is used to fetch
many commits.


But before comparing ref-related optimizations, we have an *urgent* need
for a decent performance test suite. There are many slightly different
scenarios that have very different performance characteristics, and we
have to be sure that we are optimizing for the whole palette of
many-reference use cases. So I made an attempt at a kludgey but
somewhat flexible performance-testing script [2]. I don't know whether
something like this should be integrated into the git project, and if so
where; suggestions are welcome.


To run the tests, from the root of the git source tree:

make # make sure git is up-to-date
t/make-refperf-repo --help
t/make-refperf-repo [OPTIONS]
t/refperf
cat refperf.times # See the results

The default repo has 5k commits in a linear series with one reference on
each commit. (These numbers can both be adjusted.)

The reference namespace can be laid out a few ways:

* Many references in a single "directory" vs. sharded over many
"directories"

* In lexicographic order by commit, in reverse order, or "shuffled".

By default, the repo is written to "refperf-repo".

The time it takes to create the test repository is itself also an
interesting benchmark. For example, on the maint branch it is terribly
slow unless it is passed either the --pack-refs-interval=N (with N, say
100) or --no-replace-object option. I also noticed that if it is run like

t/make-refperf-repo --refs=5000 --commits=5000 \
--pack-refs-interval=100

(one ref per commit), git-pack-refs becomes precipitously slower after
the 2000th commit.

I haven't had time yet for systematic benchmarks of other git versions.

See the refperf script to see what sorts of benchmarks that I have built
into it so far. The refperf test is non-destructive; it always copies
from "refperf-repo" to "refperf-repo-copy" and does its tests in the
copy; therefore a test repo can be reused. The timing data are written
to "refperf.times" and other output to "refperf.log".

Here are my refperf results for the "maint" branch on my notebook with
the default "make-refperf-repo" arguments (times in seconds):

3.36 git branch (cold)
0.01 git branch (warm)
0.04 git for-each-ref
3.08 git checkout (cold)
0.01 git checkout (warm)
0.00 git checkout --orphan (warm)
0.15 git checkout from detached orphan
0.12 git pack-refs
1.17 git branch (cold)
0.00 git branch (warm)
0.17 git for-each-ref
0.95 git checkout (cold)
0.00 git checkout (warm)
0.00 git checkout --orphan (warm)
0.21 git checkout from detached orphan
0.18 git branch -a --contains
7.67 git clone
0.06 git fetch (nothing)
0.01 git pack-refs
0.05 git fetch (nothing, packed)
0.10 git clone of a ref-packed repo
0.63 git fetch (everything)

Probably we should test with even more references than this, but this
test already shows that some commands are quite sluggish.

There are some more things that could be added, like:

* Branches vs. annotated tags

* References on the tips of branches in a more typical "branchy" repository.

* git describe --all

* git log --decorate

* git gc

* git filter-branch
(This has very different performance characteristics because it is a
script that invokes git many times.)

I suggest that we try to do systematic benchmarking of any changes that
we claim are performance optimizations and share before/after results in
the cover letter for the patch series.

Michael

[1] branch hierarchical-refs at git://github.com/mhagger/git.git
[2] branch refperf at git://github.com/mhagger/git.git
--
Michael Haggerty
***@alum.mit.edu
http://softwareswirl.blogspot.com/
Junio C Hamano
2011-09-30 16:38:54 UTC
Permalink
Post by Michael Haggerty
Post by Junio C Hamano
This version looks sane, although I have a suspicion that it may have
some interaction with what Michael may be working on.
Indeed, I have almost equivalent changes in the giant patch series that
I am working on [1].
Good; that was the primary thing I wanted to know. I want to take
Julian's patch early but if the approach and data structures were
drastically different from what you are cooking, that would force
an unnecessary reroll on your part, which I wanted to avoid.

Thanks.
Julian Phillips
2011-09-30 17:56:57 UTC
Permalink
The previous custom merge sort would drop duplicate entries as part of
the sort. It would also die if the duplicate entries had different
sha1 values. The standard library qsort doesn't do this, so we have
to do it manually afterwards.

Signed-off-by: Julian Phillips <***@quantumfyre.co.uk>
---
Post by Junio C Hamano
Post by Michael Haggerty
Post by Junio C Hamano
This version looks sane, although I have a suspicion that it may have
some interaction with what Michael may be working on.
Indeed, I have almost equivalent changes in the giant patch series that
I am working on [1].
Good; that was the primary thing I wanted to know. I want to take
Julian's patch early but if the approach and data structures were
drastically different from what you are cooking, that would force
unnecessary reroll on your part, which I wanted to avoid.
Thanks.
I had a quick look at Michael's code, and it reminded me that I had missed one
thing out. If we want to keep the duplicate detection & removal from the
original merge sort then this patch is needed on top of v3 of the binary search.

Though I never could figure out how duplicate refs were supposed to appear ... I
tested by editing packed-refs, but I assume that isn't "supported".

refs.c | 22 ++++++++++++++++++++++
1 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/refs.c b/refs.c
index 4c01d79..cf080ee 100644
--- a/refs.c
+++ b/refs.c
@@ -77,7 +77,29 @@ static int ref_entry_cmp(const void *a, const void *b)

static void sort_ref_array(struct ref_array *array)
{
+ int i = 0, j = 1;
+
+ /* Nothing to sort unless there are at least two entries */
+ if (array->nr < 2)
+ return;
+
qsort(array->refs, array->nr, sizeof(*array->refs), ref_entry_cmp);
+
+ /* Remove any duplicates from the ref_array */
+ for (; j < array->nr; j++) {
+ struct ref_entry *a = array->refs[i];
+ struct ref_entry *b = array->refs[j];
+ if (!strcmp(a->name, b->name)) {
+ if (hashcmp(a->sha1, b->sha1))
+ die("Duplicated ref, and SHA1s don't match: %s",
+ a->name);
+ warning("Duplicated ref: %s", a->name);
+ continue;
+ }
+ i++;
+ array->refs[i] = array->refs[j];
+ }
+ array->nr = i + 1;
}

static struct ref_entry *search_ref_array(struct ref_array *array, const char *name)
--
1.7.6.1
Martin Fick
2011-09-30 01:13:08 UTC
Permalink
On Thursday, September 29, 2011 04:11:42 pm Julian Phillips
Post by Julian Phillips
Currently we linearly search through lists of refs when
we need to find a specific ref. This can be very slow
if we need to lookup a large number of refs. By
changing to a binary search we can make this faster.

In order to be able to use a binary search we need to
change from using linked lists to arrays, which we can
manage using ALLOC_GROW.

We can now also use the standard library qsort function
to sort the refs arrays.
This works for me; however, unfortunately, I cannot find any
scenarios where it improves anything over the previous fix
by René. :(

I tested many things: clones, fetches, fetch no-ops,
checkouts, garbage collection. I am a bit surprised,
because I thought that my hack of a hash map still improved
checkouts on packed refs, but it could just be that
my hack was buggy and did not actually do a full orphan
check.

Thanks,

-Martin
Junio C Hamano
2011-09-30 03:44:40 UTC
Permalink
Post by Martin Fick
This works for me; however, unfortunately, I cannot find any
scenarios where it improves anything over the previous fix
by René. :(
Nevertheless, I would appreciate it if you can try this _without_ René's
patch. This attempts to make resolve_ref() cheap for _any_ caller. René's
patch avoids calling it in one specific callchain.

They address different issues. René's patch is probably an independently
good change (I haven't thought about the interactions with the topics in
flight and its implications on the future direction), but would not help
other/new callers that make many calls to resolve_ref().
Julian Phillips
2011-09-30 08:04:02 UTC
Permalink
Post by Martin Fick
This works for me; however, unfortunately, I cannot find any
scenarios where it improves anything over the previous fix
by René. :(
Nevertheless, I would appreciate it if you can try this _without_ René's
patch. This attempts to make resolve_ref() cheap for _any_ caller. René's
patch avoids calling it in one specific callchain.
They address different issues. René's patch is probably an independently
good change (I haven't thought about the interactions with the topics in
flight and its implications on the future direction), but would not help
other/new callers that make many calls to resolve_ref().
It certainly helps with my test repo (~140k refs, of which ~40k are
branches). User times for checkout starting from an orphaned commit are:

No fix : ~16m8s
+ Binary Search : ~4s
+ René's patch : ~2s

(The 2s includes both patches, though the timing is the same for René's
patch alone)

--
Julian
Martin Fick
2011-09-30 15:45:34 UTC
Permalink
On Thursday, September 29, 2011 09:44:40 pm Junio C Hamano
Post by Junio C Hamano
Post by Martin Fick
This works for me; however, unfortunately, I cannot find
any scenarios where it improves anything over the
previous fix by René. :(

Nevertheless, I would appreciate it if you can try this
_without_ René's patch. This attempts to make
resolve_ref() cheap for _any_ caller. René's patch
avoids calling it in one specific callchain.

They address different issues. René's patch is probably
an independently good change (I haven't thought about
the interactions with the topics in flight and its
implications on the future direction), but would not
help other/new callers that make many calls to
resolve_ref().
Agreed. Here is what I am seeing without René's patch.

Checkout in a NON-packed-ref repo takes about 20s; with patch
v3 of the binary search, it takes about 11s (1s slower than
René's patch).

Checkout in a packed-ref repo takes about 5:30min; with patch
v3 of the binary search, it takes about 10s (also 1s slower than
René's patch).

I'd say that's not bad; it seems like the 1s difference is
doing the search 60K+ times (my tests don't quite scan the
full list), so the search seems to scale well with patch v3.

-Martin

--
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum
Julian Phillips
2011-09-29 22:04:34 UTC
Permalink
Currently we linearly search through lists of refs when we need to
find a specific ref. This can be very slow if we need to lookup a
large number of refs. By changing to a binary search we can make this
faster.

In order to be able to use a binary search we need to change from
using linked lists to arrays, which we can manage using ALLOC_GROW.

We can now also use the standard library qsort function to sort the
refs arrays.

Signed-off-by: Julian Phillips <***@quantumfyre.co.uk>
---

Previous version caused a regression in the test suite ... :$

refs.c | 329 ++++++++++++++++++++++++++--------------------------------------
1 files changed, 133 insertions(+), 196 deletions(-)

diff --git a/refs.c b/refs.c
index a49ff74..35bba97 100644
--- a/refs.c
+++ b/refs.c
@@ -8,14 +8,18 @@
#define REF_KNOWS_PEELED 04
#define REF_BROKEN 010

-struct ref_list {
- struct ref_list *next;
+struct ref_entry {
unsigned char flag; /* ISSYMREF? ISPACKED? */
unsigned char sha1[20];
unsigned char peeled[20];
char name[FLEX_ARRAY];
};

+struct ref_array {
+ int nr, alloc;
+ struct ref_entry **refs;
+};
+
static const char *parse_ref_line(char *line, unsigned char *sha1)
{
/*
@@ -44,108 +48,58 @@ static const char *parse_ref_line(char *line, unsigned char *sha1)
return line;
}

-static struct ref_list *add_ref(const char *name, const unsigned char *sha1,
- int flag, struct ref_list *list,
- struct ref_list **new_entry)
+static void add_ref(const char *name, const unsigned char *sha1,
+ int flag, struct ref_array *refs,
+ struct ref_entry **new_entry)
{
int len;
- struct ref_list *entry;
+ struct ref_entry *entry;

/* Allocate it and add it in.. */
len = strlen(name) + 1;
- entry = xmalloc(sizeof(struct ref_list) + len);
+ entry = xmalloc(sizeof(struct ref) + len);
hashcpy(entry->sha1, sha1);
hashclr(entry->peeled);
memcpy(entry->name, name, len);
entry->flag = flag;
- entry->next = list;
if (new_entry)
*new_entry = entry;
- return entry;
+ ALLOC_GROW(refs->refs, refs->nr + 1, refs->alloc);
+ refs->refs[refs->nr++] = entry;
}

-/* merge sort the ref list */
-static struct ref_list *sort_ref_list(struct ref_list *list)
+static int ref_entry_cmp(const void *a, const void *b)
{
- int psize, qsize, last_merge_count, cmp;
- struct ref_list *p, *q, *l, *e;
- struct ref_list *new_list = list;
- int k = 1;
- int merge_count = 0;
-
- if (!list)
- return list;
-
- do {
- last_merge_count = merge_count;
- merge_count = 0;
-
- psize = 0;
+ struct ref_entry *one = *(struct ref_entry **)a;
+ struct ref_entry *two = *(struct ref_entry **)b;
+ return strcmp(one->name, two->name);
+}

- p = new_list;
- q = new_list;
- new_list = NULL;
- l = NULL;
+static void sort_ref_array(struct ref_array *array)
+{
+ qsort(array->refs, array->nr, sizeof(*array->refs), ref_entry_cmp);
+}

- while (p) {
- merge_count++;
+static struct ref_entry *search_ref_array(struct ref_array *array, const char *name)
+{
+ struct ref_entry *e, **r;
+ int len;

- while (psize < k && q->next) {
- q = q->next;
- psize++;
- }
- qsize = k;
-
- while ((psize > 0) || (qsize > 0 && q)) {
- if (qsize == 0 || !q) {
- e = p;
- p = p->next;
- psize--;
- } else if (psize == 0) {
- e = q;
- q = q->next;
- qsize--;
- } else {
- cmp = strcmp(q->name, p->name);
- if (cmp < 0) {
- e = q;
- q = q->next;
- qsize--;
- } else if (cmp > 0) {
- e = p;
- p = p->next;
- psize--;
- } else {
- if (hashcmp(q->sha1, p->sha1))
- die("Duplicated ref, and SHA1s don't match: %s",
- q->name);
- warning("Duplicated ref: %s", q->name);
- e = q;
- q = q->next;
- qsize--;
- free(e);
- e = p;
- p = p->next;
- psize--;
- }
- }
+ if (name == NULL)
+ return NULL;

- e->next = NULL;
+ len = strlen(name) + 1;
+ e = xmalloc(sizeof(struct ref_entry) + len);
+ memcpy(e->name, name, len);

- if (l)
- l->next = e;
- if (!new_list)
- new_list = e;
- l = e;
- }
+ r = bsearch(&e, array->refs, array->nr, sizeof(*array->refs), ref_entry_cmp);

- p = q;
- };
+ free(e);

- k = k * 2;
- } while ((last_merge_count != merge_count) || (last_merge_count != 1));
+ if (r == NULL)
+ return NULL;

- return new_list;
+ return *r;
}

/*
@@ -155,38 +109,37 @@ static struct ref_list *sort_ref_list(struct ref_list *list)
static struct cached_refs {
char did_loose;
char did_packed;
- struct ref_list *loose;
- struct ref_list *packed;
+ struct ref_array loose;
+ struct ref_array packed;
} cached_refs, submodule_refs;
-static struct ref_list *current_ref;
+static struct ref_entry *current_ref;

-static struct ref_list *extra_refs;
+static struct ref_array extra_refs;

-static void free_ref_list(struct ref_list *list)
+static void free_ref_array(struct ref_array *array)
{
- struct ref_list *next;
- for ( ; list; list = next) {
- next = list->next;
- free(list);
- }
+ int i;
+ for (i = 0; i < array->nr; i++)
+ free(array->refs[i]);
+ free(array->refs);
+ array->nr = array->alloc = 0;
+ array->refs = NULL;
}

static void invalidate_cached_refs(void)
{
struct cached_refs *ca = &cached_refs;

- if (ca->did_loose && ca->loose)
- free_ref_list(ca->loose);
- if (ca->did_packed && ca->packed)
- free_ref_list(ca->packed);
- ca->loose = ca->packed = NULL;
+ if (ca->did_loose)
+ free_ref_array(&ca->loose);
+ if (ca->did_packed)
+ free_ref_array(&ca->packed);
ca->did_loose = ca->did_packed = 0;
}

static void read_packed_refs(FILE *f, struct cached_refs *cached_refs)
{
- struct ref_list *list = NULL;
- struct ref_list *last = NULL;
+ struct ref_entry *last = NULL;
char refline[PATH_MAX];
int flag = REF_ISPACKED;

@@ -205,7 +158,7 @@ static void read_packed_refs(FILE *f, struct cached_refs *cached_refs)

name = parse_ref_line(refline, sha1);
if (name) {
- list = add_ref(name, sha1, flag, list, &last);
+ add_ref(name, sha1, flag, &cached_refs->packed, &last);
continue;
}
if (last &&
@@ -215,21 +168,20 @@ static void read_packed_refs(FILE *f, struct cached_refs *cached_refs)
!get_sha1_hex(refline + 1, sha1))
hashcpy(last->peeled, sha1);
}
- cached_refs->packed = sort_ref_list(list);
+ sort_ref_array(&cached_refs->packed);
}

void add_extra_ref(const char *name, const unsigned char *sha1, int flag)
{
- extra_refs = add_ref(name, sha1, flag, extra_refs, NULL);
+ add_ref(name, sha1, flag, &extra_refs, NULL);
}

void clear_extra_refs(void)
{
- free_ref_list(extra_refs);
- extra_refs = NULL;
+ free_ref_array(&extra_refs);
}

-static struct ref_list *get_packed_refs(const char *submodule)
+static struct ref_array *get_packed_refs(const char *submodule)
{
const char *packed_refs_file;
struct cached_refs *refs;
@@ -237,7 +189,7 @@ static struct ref_list *get_packed_refs(const char *submodule)
if (submodule) {
packed_refs_file = git_path_submodule(submodule, "packed-refs");
refs = &submodule_refs;
- free_ref_list(refs->packed);
+ free_ref_array(&refs->packed);
} else {
packed_refs_file = git_path("packed-refs");
refs = &cached_refs;
@@ -245,18 +197,17 @@ static struct ref_list *get_packed_refs(const char *submodule)

if (!refs->did_packed || submodule) {
FILE *f = fopen(packed_refs_file, "r");
- refs->packed = NULL;
if (f) {
read_packed_refs(f, refs);
fclose(f);
}
refs->did_packed = 1;
}
- return refs->packed;
+ return &refs->packed;
}

-static struct ref_list *get_ref_dir(const char *submodule, const char *base,
- struct ref_list *list)
+static void get_ref_dir(const char *submodule, const char *base,
+ struct ref_array *array)
{
DIR *dir;
const char *path;
@@ -299,7 +250,7 @@ static struct ref_list *get_ref_dir(const char *submodule, const char *base,
if (stat(refdir, &st) < 0)
continue;
if (S_ISDIR(st.st_mode)) {
- list = get_ref_dir(submodule, ref, list);
+ get_ref_dir(submodule, ref, array);
continue;
}
if (submodule) {
@@ -314,12 +265,11 @@ static struct ref_list *get_ref_dir(const char *submodule, const char *base,
hashclr(sha1);
flag |= REF_BROKEN;
}
- list = add_ref(ref, sha1, flag, list, NULL);
+ add_ref(ref, sha1, flag, array, NULL);
}
free(ref);
closedir(dir);
}
- return list;
}

struct warn_if_dangling_data {
@@ -356,21 +306,21 @@ void warn_dangling_symref(FILE *fp, const char *msg_fmt, const char *refname)
for_each_rawref(warn_if_dangling_symref, &data);
}

-static struct ref_list *get_loose_refs(const char *submodule)
+static struct ref_array *get_loose_refs(const char *submodule)
{
if (submodule) {
- free_ref_list(submodule_refs.loose);
- submodule_refs.loose = get_ref_dir(submodule, "refs", NULL);
- submodule_refs.loose = sort_ref_list(submodule_refs.loose);
- return submodule_refs.loose;
+ free_ref_array(&submodule_refs.loose);
+ get_ref_dir(submodule, "refs", &submodule_refs.loose);
+ sort_ref_array(&submodule_refs.loose);
+ return &submodule_refs.loose;
}

if (!cached_refs.did_loose) {
- cached_refs.loose = get_ref_dir(NULL, "refs", NULL);
- cached_refs.loose = sort_ref_list(cached_refs.loose);
+ get_ref_dir(NULL, "refs", &cached_refs.loose);
+ sort_ref_array(&cached_refs.loose);
cached_refs.did_loose = 1;
}
- return cached_refs.loose;
+ return &cached_refs.loose;
}

/* We allow "recursive" symbolic refs. Only within reason, though */
@@ -381,8 +331,8 @@ static int resolve_gitlink_packed_ref(char *name, int pathlen, const char *refna
{
FILE *f;
struct cached_refs refs;
- struct ref_list *ref;
- int retval;
+ struct ref_entry *ref;
+ int retval = -1;

strcpy(name + pathlen, "packed-refs");
f = fopen(name, "r");
@@ -390,17 +340,12 @@ static int resolve_gitlink_packed_ref(char *name, int pathlen, const char *refna
return -1;
read_packed_refs(f, &refs);
fclose(f);
- ref = refs.packed;
- retval = -1;
- while (ref) {
- if (!strcmp(ref->name, refname)) {
- retval = 0;
- memcpy(result, ref->sha1, 20);
- break;
- }
- ref = ref->next;
+ ref = search_ref_array(&refs.packed, refname);
+ if (ref != NULL) {
+ memcpy(result, ref->sha1, 20);
+ retval = 0;
}
- free_ref_list(refs.packed);
+ free_ref_array(&refs.packed);
return retval;
}

@@ -501,15 +446,13 @@ const char *resolve_ref(const char *ref, unsigned char *sha1, int reading, int *
git_snpath(path, sizeof(path), "%s", ref);
/* Special case: non-existing file. */
if (lstat(path, &st) < 0) {
- struct ref_list *list = get_packed_refs(NULL);
- while (list) {
- if (!strcmp(ref, list->name)) {
- hashcpy(sha1, list->sha1);
- if (flag)
- *flag |= REF_ISPACKED;
- return ref;
- }
- list = list->next;
+ struct ref_array *packed = get_packed_refs(NULL);
+ struct ref_entry *r = search_ref_array(packed, ref);
+ if (r != NULL) {
+ hashcpy(sha1, r->sha1);
+ if (flag)
+ *flag |= REF_ISPACKED;
+ return ref;
}
if (reading || errno != ENOENT)
return NULL;
@@ -584,7 +527,7 @@ int read_ref(const char *ref, unsigned char *sha1)

#define DO_FOR_EACH_INCLUDE_BROKEN 01
static int do_one_ref(const char *base, each_ref_fn fn, int trim,
- int flags, void *cb_data, struct ref_list *entry)
+ int flags, void *cb_data, struct ref_entry *entry)
{
if (prefixcmp(entry->name, base))
return 0;
@@ -630,18 +573,12 @@ int peel_ref(const char *ref, unsigned char *sha1)
return -1;

if ((flag & REF_ISPACKED)) {
- struct ref_list *list = get_packed_refs(NULL);
+ struct ref_array *array = get_packed_refs(NULL);
+ struct ref_entry *r = search_ref_array(array, ref);

- while (list) {
- if (!strcmp(list->name, ref)) {
- if (list->flag & REF_KNOWS_PEELED) {
- hashcpy(sha1, list->peeled);
- return 0;
- }
- /* older pack-refs did not leave peeled ones */
- break;
- }
- list = list->next;
+ if (r != NULL && r->flag & REF_KNOWS_PEELED) {
+ hashcpy(sha1, r->peeled);
+ return 0;
}
}

@@ -660,36 +597,39 @@ fallback:
static int do_for_each_ref(const char *submodule, const char *base, each_ref_fn fn,
int trim, int flags, void *cb_data)
{
- int retval = 0;
- struct ref_list *packed = get_packed_refs(submodule);
- struct ref_list *loose = get_loose_refs(submodule);
+ int retval = 0, i, p = 0, l = 0;
+ struct ref_array *packed = get_packed_refs(submodule);
+ struct ref_array *loose = get_loose_refs(submodule);

- struct ref_list *extra;
+ struct ref_array *extra = &extra_refs;

- for (extra = extra_refs; extra; extra = extra->next)
- retval = do_one_ref(base, fn, trim, flags, cb_data, extra);
+ for (i = 0; i < extra->nr; i++)
+ retval = do_one_ref(base, fn, trim, flags, cb_data, extra->refs[i]);

- while (packed && loose) {
- struct ref_list *entry;
- int cmp = strcmp(packed->name, loose->name);
+ while (p < packed->nr && l < loose->nr) {
+ struct ref_entry *entry;
+ int cmp = strcmp(packed->refs[p]->name, loose->refs[l]->name);
if (!cmp) {
- packed = packed->next;
+ p++;
continue;
}
if (cmp > 0) {
- entry = loose;
- loose = loose->next;
+ entry = loose->refs[l++];
} else {
- entry = packed;
- packed = packed->next;
+ entry = packed->refs[p++];
}
retval = do_one_ref(base, fn, trim, flags, cb_data, entry);
if (retval)
goto end_each;
}

- for (packed = packed ? packed : loose; packed; packed = packed->next) {
- retval = do_one_ref(base, fn, trim, flags, cb_data, packed);
+ if (l < loose->nr) {
+ p = l;
+ packed = loose;
+ }
+
+ for (; p < packed->nr; p++) {
+ retval = do_one_ref(base, fn, trim, flags, cb_data, packed->refs[p]);
if (retval)
goto end_each;
}
@@ -1005,24 +945,24 @@ static int remove_empty_directories(const char *file)
}

static int is_refname_available(const char *ref, const char *oldref,
- struct ref_list *list, int quiet)
-{
- int namlen = strlen(ref); /* e.g. 'foo/bar' */
- while (list) {
- /* list->name could be 'foo' or 'foo/bar/baz' */
- if (!oldref || strcmp(oldref, list->name)) {
- int len = strlen(list->name);
+ struct ref_array *array, int quiet)
+{
+ int i, namlen = strlen(ref); /* e.g. 'foo/bar' */
+ for (i = 0; i < array->nr; i++ ) {
+ struct ref_entry *entry = array->refs[i];
+ /* entry->name could be 'foo' or 'foo/bar/baz' */
+ if (!oldref || strcmp(oldref, entry->name)) {
+ int len = strlen(entry->name);
int cmplen = (namlen < len) ? namlen : len;
- const char *lead = (namlen < len) ? list->name : ref;
- if (!strncmp(ref, list->name, cmplen) &&
+ const char *lead = (namlen < len) ? entry->name : ref;
+ if (!strncmp(ref, entry->name, cmplen) &&
lead[cmplen] == '/') {
if (!quiet)
error("'%s' exists; cannot create '%s'",
- list->name, ref);
+ entry->name, ref);
return 0;
}
}
- list = list->next;
}
return 1;
}
@@ -1129,18 +1069,13 @@ static struct lock_file packlock;

static int repack_without_ref(const char *refname)
{
- struct ref_list *list, *packed_ref_list;
- int fd;
- int found = 0;
+ struct ref_array *packed;
+ struct ref_entry *ref;
+ int fd, i;

- packed_ref_list = get_packed_refs(NULL);
- for (list = packed_ref_list; list; list = list->next) {
- if (!strcmp(refname, list->name)) {
- found = 1;
- break;
- }
- }
- if (!found)
+ packed = get_packed_refs(NULL);
+ ref = search_ref_array(packed, refname);
+ if (ref == NULL)
return 0;
fd = hold_lock_file_for_update(&packlock, git_path("packed-refs"), 0);
if (fd < 0) {
@@ -1148,17 +1083,19 @@ static int repack_without_ref(const char *refname)
return error("cannot delete '%s' from packed refs", refname);
}

- for (list = packed_ref_list; list; list = list->next) {
+ for (i = 0; i < packed->nr; i++) {
char line[PATH_MAX + 100];
int len;

- if (!strcmp(refname, list->name))
+ ref = packed->refs[i];
+
+ if (!strcmp(refname, ref->name))
continue;
len = snprintf(line, sizeof(line), "%s %s\n",
- sha1_to_hex(list->sha1), list->name);
+ sha1_to_hex(ref->sha1), ref->name);
/* this should not happen but just being defensive */
if (len > sizeof(line))
- die("too long a refname '%s'", list->name);
+ die("too long a refname '%s'", ref->name);
write_or_die(fd, line, len);
}
return commit_lock_file(&packlock);
--
1.7.6.1
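
As a minimal, standalone sketch of the pattern this patch adopts (hypothetical demo_* names, not git's actual refs.c types): keep the entries in one growable array of pointers, sort it once with qsort(), then look up refs with bsearch() using the same comparator.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct demo_entry {
	char name[64];
};

struct demo_array {
	int nr, alloc;
	struct demo_entry **items;
};

static int entry_cmp(const void *a, const void *b)
{
	const struct demo_entry *one = *(const struct demo_entry *const *)a;
	const struct demo_entry *two = *(const struct demo_entry *const *)b;
	return strcmp(one->name, two->name);
}

static void add_entry(struct demo_array *arr, const char *name)
{
	struct demo_entry *e = malloc(sizeof(*e));
	snprintf(e->name, sizeof(e->name), "%s", name);
	if (arr->nr == arr->alloc) {
		arr->alloc = arr->alloc ? arr->alloc * 2 : 16;
		arr->items = realloc(arr->items, arr->alloc * sizeof(*arr->items));
	}
	arr->items[arr->nr++] = e;
}

static struct demo_entry *find_entry(struct demo_array *arr, const char *name)
{
	struct demo_entry key, *kp = &key, **found;
	snprintf(key.name, sizeof(key.name), "%s", name);
	found = bsearch(&kp, arr->items, arr->nr, sizeof(*arr->items), entry_cmp);
	return found ? *found : NULL;
}

int main(void)
{
	struct demo_array arr = { 0, 0, NULL };
	struct demo_entry *hit;

	add_entry(&arr, "refs/heads/master");
	add_entry(&arr, "refs/changes/01/101/1");
	add_entry(&arr, "refs/tags/v1.0");
	qsort(arr.items, arr.nr, sizeof(*arr.items), entry_cmp);	/* sort once */
	hit = find_entry(&arr, "refs/tags/v1.0");			/* O(log n) lookup */
	printf("found: %s\n", hit ? hit->name : "(none)");
	return 0;
}

Note that the bsearch key is itself a pointer to an entry, so the one comparator serves both the sorting and the searching.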
Julian Phillips
2011-09-29 19:10:57 UTC
Permalink
Post by René Scharfe
Post by Julian Phillips
Does the following help?
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 5e356a6..f0f4ca1 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -605,7 +605,7 @@ static int add_one_ref_to_rev_list_arg(const char *refname,
int flags,
void *cb_data)
{
- add_one_rev_list_arg(cb_data, refname);
+ add_one_rev_list_arg(cb_data, strdup(sha1_to_hex(sha1)));
return 0;
}
Hmm. Can we get rid of the multiple ref lookups fixed by the above
*and* the overhead of dealing with a textual argument list at the same
time by calling add_pending_object directly, like this? (Factoring
out add_pending_sha1 should be a separate patch..)
Seems like a good idea. I get the same sort of times as with my patch,
but it makes the code _feel_ much nicer (and slightly smaller). Mine
was definitely more of a "it's 2am, but I think the problem is here"
type of patch ;)

--
Julian
Martin Fick
2011-09-29 20:11:06 UTC
Permalink
On Thursday, September 29, 2011 12:27:43 pm René Scharfe
Post by René Scharfe
Hmm. Can we get rid of the multiple ref lookups fixed by
the above *and* the overhead of dealing with a textual
argument list at the same time by calling
add_pending_object directly, like this? (Factoring out
add_pending_sha1 should be a separate patch..)

René,

Your patch works well for me. It achieves about the same
gains as Julian's patch. Thanks!

After all the performance fixes get merged for large ref
counts, it sure should help the Gerrit community. I wonder
how it might impact Gerrit mirroring...

-Martin


Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum
René Scharfe
2011-09-30 09:12:08 UTC
Permalink
Hi Martin,
Post by Martin Fick
Your patch works well for me. It achieves about the same
gains as Julian's patch. Thanks!
OK, and what happens if you apply the following patch on top of my first
one? It avoids going through all the refs a second time during cleanup,
at the cost of going through the list of all known objects. I wonder if
that's any faster in your case.

Thanks,
René


diff --git a/builtin/checkout.c b/builtin/checkout.c
index 84e0cdc..a4b1003 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -596,15 +596,14 @@ static int add_pending_uninteresting_ref(const char *refname,
return 0;
}

-static int clear_commit_marks_from_one_ref(const char *refname,
- const unsigned char *sha1,
- int flags,
- void *cb_data)
+static void clear_commit_marks_for_all(unsigned int mark)
{
- struct commit *commit = lookup_commit_reference_gently(sha1, 1);
- if (commit)
- clear_commit_marks(commit, -1);
- return 0;
+ unsigned int i, max = get_max_object_index();
+ for (i = 0; i < max; i++) {
+ struct object *object = get_indexed_object(i);
+ if (object && object->type == OBJ_COMMIT)
+ object->flags &= ~mark;
+ }
}

static void describe_one_orphan(struct strbuf *sb, struct commit *commit)
@@ -690,8 +689,7 @@ static void orphaned_commit_warning(struct commit *commit)
else
describe_detached_head(_("Previous HEAD position was"), commit);

- clear_commit_marks(commit, -1);
- for_each_ref(clear_commit_marks_from_one_ref, NULL);
+ clear_commit_marks_for_all(ALL_REV_FLAGS);
}

static int switch_branches(struct checkout_opts *opts, struct branch_info *new)
Martin Fick
2011-09-30 16:09:05 UTC
Permalink
On Friday, September 30, 2011 03:12:08 am René Scharfe
Post by René Scharfe
OK, and what happens if you apply the following patch on
top of my first one? It avoids going through all the
refs a second time during cleanup, at the cost of going
through the list of all known objects. I wonder if
that's any faster in your case.
This patch helps a bit more. It seems to shave about
another 0.5s off in both the packed and non-packed cases,
with or without binary search.

-Martin
Post by René Scharfe
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 84e0cdc..a4b1003 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -596,15 +596,14 @@ static int add_pending_uninteresting_ref(const char *refname,
return 0;
}

-static int clear_commit_marks_from_one_ref(const char *refname,
- const unsigned char *sha1,
- int flags,
- void *cb_data)
+static void clear_commit_marks_for_all(unsigned int mark)
{
- struct commit *commit = lookup_commit_reference_gently(sha1, 1);
- if (commit)
- clear_commit_marks(commit, -1);
- return 0;
+ unsigned int i, max = get_max_object_index();
+ for (i = 0; i < max; i++) {
+ struct object *object = get_indexed_object(i);
+ if (object && object->type == OBJ_COMMIT)
+ object->flags &= ~mark;
+ }
}

static void describe_one_orphan(struct strbuf *sb, struct commit *commit)
@@ -690,8 +689,7 @@ static void orphaned_commit_warning(struct commit *commit)
else
describe_detached_head(_("Previous HEAD position was"), commit);

- clear_commit_marks(commit, -1);
- for_each_ref(clear_commit_marks_from_one_ref, NULL);
+ clear_commit_marks_for_all(ALL_REV_FLAGS);
}

static int switch_branches(struct checkout_opts *opts, struct branch_info *new)
--
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum
Junio C Hamano
2011-09-30 16:52:34 UTC
Permalink
Post by René Scharfe
Hi Martin,
Post by Martin Fick
Your patch works well for me. It achieves about the same
gains as Julian's patch. Thanks!
OK, and what happens if you apply the following patch on top of my first
one? It avoids going through all the refs a second time during cleanup,
at the cost of going through the list of all known objects. I wonder if
that's any faster in your case.
...
static void describe_one_orphan(struct strbuf *sb, struct commit *commit)
@@ -690,8 +689,7 @@ static void orphaned_commit_warning(struct commit *commit)
else
describe_detached_head(_("Previous HEAD position was"), commit);

- clear_commit_marks(commit, -1);
- for_each_ref(clear_commit_marks_from_one_ref, NULL);
+ clear_commit_marks_for_all(ALL_REV_FLAGS);
}
The function already clears all the flag bits from commits near the tip of
all the refs (i.e. whatever commit it traverses until it gets to the fork
point), so it cannot be reused in other contexts where the caller

- first marks commit objects with some flag bits for its own purpose,
unrelated to the "orphaned"-ness check;
- calls this function to issue a warning; and then
- uses the flag it earlier set to do something useful.

which requires the "cleaning after yourself, by clearing only the bits you
used without disturbing other bits that you do not use" pattern.

It might be a better solution to not bother to clear the marks at all;
would it break anything in this codepath?
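
A minimal standalone sketch of the "clear only your own bits" pattern described above; the flag bits here (SEEN, UNINTERESTING, USER_FLAG) are hypothetical stand-ins, not git's actual definitions.

#include <stdio.h>

#define SEEN          (1u << 0)
#define UNINTERESTING (1u << 1)
#define USER_FLAG     (1u << 7)	/* hypothetically owned by another caller */

struct object { unsigned flags; };

/* clear only the bits this caller used, leaving other marks intact */
static void clear_own_marks(struct object *o)
{
	o->flags &= ~(SEEN | UNINTERESTING);
}

int main(void)
{
	struct object o = { SEEN | UNINTERESTING | USER_FLAG };
	clear_own_marks(&o);
	printf("flags after: 0x%x\n", o.flags);	/* USER_FLAG survives */
	return 0;
}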
René Scharfe
2011-09-30 18:17:26 UTC
Permalink
Post by René Scharfe
Hi Martin,
Post by Martin Fick
Your patch works well for me. It achieves about the same
gains as Julian's patch. Thanks!
OK, and what happens if you apply the following patch on top of my first
one? It avoids going through all the refs a second time during cleanup,
at the cost of going through the list of all known objects. I wonder if
that's any faster in your case.
...
static void describe_one_orphan(struct strbuf *sb, struct commit *commit)
@@ -690,8 +689,7 @@ static void orphaned_commit_warning(struct commit *commit)
else
describe_detached_head(_("Previous HEAD position was"), commit);

- clear_commit_marks(commit, -1);
- for_each_ref(clear_commit_marks_from_one_ref, NULL);
+ clear_commit_marks_for_all(ALL_REV_FLAGS);
}

The function already clears all the flag bits from commits near the tip of
all the refs (i.e. whatever commit it traverses until it gets to the fork
point), so it cannot be reused in other contexts where the caller

- first marks commit objects with some flag bits for its own purpose,
unrelated to the "orphaned"-ness check;
- calls this function to issue a warning; and then
- uses the flag it earlier set to do something useful.

which requires the "cleaning after yourself, by clearing only the bits you
used without disturbing other bits that you do not use" pattern.
Yes, clear_commit_marks_for_all is a bit brutal. Callers could clear
specific bits (e.g. SEEN|UNINTERESTING) instead of ALL_REV_FLAGS, though.
It might be a better solution to not bother to clear the marks at all;
would it break anything in this codepath?
Unfortunately, yes; the cleanup part was added by 5c08dc48 later, when
it became apparent that it's really needed.

However, since the patch only buys us a 5% speedup I'm not sure it's
worth it in its current form.

René

Martin Fick
2011-09-26 15:15:29 UTC
Permalink
OK, I have found what I believe is another performance
regression for large ref counts (~100K).

When I run git br on my repo which only has one branch, but
has ~100K refs under ref/changes (a gerrit repo), it takes
normally 3-6mins depending on whether my caches are fresh or
not. After bisecting some older changes, I noticed that
this commit seems to be where things start to get slow:
c774aab98ce6c5ef7aaacbef38da0a501eb671d4


commit c774aab98ce6c5ef7aaacbef38da0a501eb671d4
Author: Julian Phillips <***@quantumfyre.co.uk>
Date: Tue Apr 17 02:42:50 2007 +0100

refs.c: add a function to sort a ref list, rather then
sorting on add

Rather than sorting the refs list while building it, sort in one
go after it is built using a merge sort. This has a large
performance boost with large numbers of refs.

It shouldn't happen that we read duplicate entries into the same
list, but just in case sort_ref_list drops them if the SHA1s are
the same, or dies, as we have no way of knowing which one is the
correct one.

Signed-off-by: Julian Phillips
<***@quantumfyre.co.uk>
Acked-by: Linus Torvalds <***@linux-foundation.org>
Signed-off-by: Junio C Hamano <***@cox.net>



which is a bit strange since that commit's purpose was to
actually speed things up in the case of many refs. Just to
verify, I reverted the commit on 1.7.7.rc0.73 and sure
enough, things speed up down to the 14-20s range depending
on caching.

If this change does not actually speed things up, should it
be reverted? Or was there a bug in the change that makes it
not do what it was supposed to do?

Thanks,

-Martin
--
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum
Sverre Rabbelier
2011-09-26 15:21:30 UTC
Permalink
Heya,
Post by Martin Fick
If this change does not actually speed things up, should it
be reverted? Or was there a bug in the change that makes it
not do what it was supposed to do?
It probably looks at the refs in refs/changes while it shouldn't,
hence worsening your performance compared to not looking at those
refs. I assume that it does improve your situation if you have all
those refs under say refs/heads.

--
Cheers,

Sverre Rabbelier
Martin Fick
2011-09-26 15:48:25 UTC
Permalink
On Monday, September 26, 2011 09:21:30 am Sverre Rabbelier
Heya,
On Mon, Sep 26, 2011 at 17:15, Martin Fick
Post by Martin Fick
If this change does not actually speed things up,
should it be reverted? Or was there a bug in the
change that makes it not do what it was supposed to
do?
It probably looks at the refs in refs/changes while it
shouldn't, hence worsening your performance compared to
not looking at those refs. I assume that it does improve
your situation if you have all those refs under say
refs/heads.
Hmm, I was thinking that too, and I just did a test.

Instead of storing the changes under refs/changes, I fetched
them under refs/heads/changes and then ran git 1.7.6 and it
took about 3 mins. Then, I ran the 1.7.7.rc0.73 with
c774aab98ce6c5ef7aaacbef38da0a501eb671d4 reverted and it
only took 13s! So, if this indeed tests what you were
suggesting, I think it shows that even in the intended case
this change slowed things down?

-Martin
--
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum
Sverre Rabbelier
2011-09-26 15:56:50 UTC
Permalink
Heya,
Post by Martin Fick
Hmm, I was thinking that too, and I just did a test.
Instead of storing the changes under refs/changes, I fetched
them under refs/heads/changes and then ran git 1.7.6 and it
took about 3 mins. Then, I ran the 1.7.7.rc0.73 with
c774aab98ce6c5ef7aaacbef38da0a501eb671d4 reverted and it
only took 13s! So, if this indeed tests what you were
suggesting, I think it shows that even in the intended case
this change slowed things down?
And if you run 1.7.7 without that commit reverted?

--
Cheers,

Sverre Rabbelier
Martin Fick
2011-09-26 16:38:34 UTC
Permalink
On Monday, September 26, 2011 09:56:50 am Sverre Rabbelier
Heya,
On Mon, Sep 26, 2011 at 17:48, Martin Fick
Post by Martin Fick
Hmm, I was thinking that too, and I just did a test.
Instead of storing the changes under refs/changes, I
fetched them under refs/heads/changes and then ran git
1.7.6 and it took about 3 mins. Then, I ran the
1.7.7.rc0.73 with
c774aab98ce6c5ef7aaacbef38da0a501eb671d4 reverted and
it only took 13s! So, if this indeed tests what you
were suggesting, I think it shows that even in the
intended case this change slowed things down?
And if you run 1.7.7 without that commit reverted?
Sorry, I probably confused things by mentioning 1.7.6; the
bad commit was way before that, early 1.5 days...

As for 1.7.7, I don't think that exists yet, so did you mean
the 1.7.7.rc0.73 version that I mentioned above without the
revert? Strangely enough, that ends up being
1.7.7.rc0.72.g4b5ea. That is also slow with
refs/heads/changes > 3mins.

-Martin
--
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum
Julian Phillips
2011-09-26 16:49:06 UTC
Permalink
Post by Martin Fick
On Monday, September 26, 2011 09:56:50 am Sverre Rabbelier
Heya,
On Mon, Sep 26, 2011 at 17:48, Martin Fick
Post by Martin Fick
Hmm, I was thinking that too, and I just did a test.
Instead of storing the changes under refs/changes, I
fetched them under refs/heads/changes and then ran git
1.7.6 and it took about 3 mins. Then, I ran the
1.7.7.rc0.73 with
c774aab98ce6c5ef7aaacbef38da0a501eb671d4 reverted and
it only took 13s! So, if this indeed tests what you
were suggesting, I think it shows that even in the
intended case this change slowed things down?
And if you run 1.7.7 without that commit reverted?
Sorry, I probably confused things by mentioning 1.7.6; the
bad commit was way before that, early 1.5 days...
As for 1.7.7, I don't think that exists yet, so did you mean
the 1.7.7.rc0.73 version that I mentioned above without the
revert? Strangely enough, that ends up being
1.7.7.rc0.72.g4b5ea. That is also slow with
refs/heads/changes > 3mins.
Hmm ... something interesting is going on.

I created a little test repo with ~100k unpacked refs.

I tried "time git branch" with three versions of git, and I got (hot
cache times):

git version 1.7.6.1: ~1.2s
git version 1.7.7.rc3: ~1.2s
git version 1.7.7.rc3.1.gbc93f: ~40s

Where the third was with the commit reverted. That was almost 40s of
100% CPU - my poor laptop had to turn the fans up to noisy ...
Post by Martin Fick
-Martin
--
Julian
Martin Fick
2011-09-26 18:07:52 UTC
Permalink
Post by Martin Fick
OK, I have found what I believe is another performance
regression for large ref counts (~100K).
When I run git br on my repo which only has one branch,
but has ~100K refs under ref/changes (a gerrit repo), it
takes normally 3-6mins depending on whether my caches
are fresh or not. After bisecting some older changes, I
noticed that this commit seems to be where things start to
get slow: c774aab98ce6c5ef7aaacbef38da0a501eb671d4
commit c774aab98ce6c5ef7aaacbef38da0a501eb671d4
Date: Tue Apr 17 02:42:50 2007 +0100
refs.c: add a function to sort a ref list, rather
then sorting on add
Rather than sorting the refs list while building it, sort in one
go after it is built using a merge sort. This has a large
performance boost with large numbers of refs.
It shouldn't happen that we read duplicate entries into the same
list, but just in case sort_ref_list drops them if the SHA1s are
the same, or dies, as we have no way of knowing which one is the
correct one.
Signed-off-by: Julian Phillips
Acked-by: Linus Torvalds
which is a bit strange since that commit's purpose was to
actually speed things up in the case of many refs. Just
to verify, I reverted the commit on 1.7.7.rc0.73 and
sure enough, things speed up down to the 14-20s range
depending on caching.
If this change does not actually speed things up, should
it be reverted? Or was there a bug in the change that
makes it not do what it was supposed to do?
Ahh, I think I have some more clues. So while this change
does not speed things up for me normally, I found a case
where it does! I set my .git/config to have

[core]
compression = 0

and ran git-gc on my repo. Now, with a modern git with this
optimization in it (1.7.6, 1.7.7.rc0...), 'git branch' is
almost instantaneous (.05s)! But, if I revert c774aa it
takes > ~15s.

So, it appears that this optimization is foiled by
compression? In the case when this optimization helps, it
saves about 15s; when it hurts (with compression), it seems
to cost > 3mins. I am not sure this optimization is worth
it? Would there be some way for it to adjust to the repo
conditions?


Thanks,

-Martin
--
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum
Julian Phillips
2011-09-26 18:37:10 UTC
Permalink
On Mon, 26 Sep 2011 12:07:52 -0600, Martin Fick wrote:
-- snip --
Post by Martin Fick
Ahh, I think I have some more clues. So while this change
does not speed things up for me normally, I found a case
where it does! I set my .git/config to have
[core]
compression = 0
and ran git-gc on my repo. Now, with a modern git with this
optimization in it (1.7.6, 1.7.7.rc0...), 'git branch' is
almost instantaneous (.05s)! But, if I revert c774aa it
takes > ~15s.
I don't understand this. I don't see why core.compression should have
anything to do with refs ...
Post by Martin Fick
So, it appears that this optimization is foiled by
compression? In the case when this optimization helps, it
saves about 15s; when it hurts (with compression), it seems
to cost > 3mins. I am not sure this optimization is worth
it? Would there be some way for it to adjust to the repo
conditions?
Well, in the case I tried it was 1.2s vs 40s. It would seem that you
have managed to find some corner case. It doesn't seem right to punish
everyone who has large numbers of refs by making their commands take
orders of magnitude longer to save one person 3m. Much better to find,
understand and fix the actual cause.

I really can't see what effect core.compression can have on loading the
ref_list. Certainly the sort doesn't load anything from the object
database. It would be really good to profile and find out what is
taking all the time - I am assuming that the CPU is at 100% for the 3+
minutes?

Random thought. What happens to the with compression case if you leave
the commit in, but add a sleep(15) to the end of sort_refs_list?
--
Julian
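
A standalone sketch of the sort of instrumentation being suggested here (nothing in it is git code): count the sort calls and accumulate their total time, rather than sleeping inside the sort.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static unsigned long sort_calls;
static double sort_seconds;

static int cmp_int(const void *a, const void *b)
{
	return (*(const int *)a > *(const int *)b) -
	       (*(const int *)a < *(const int *)b);
}

/* wrap the sort so every call is counted and timed */
static void instrumented_sort(int *v, size_t n)
{
	clock_t start = clock();
	qsort(v, n, sizeof(*v), cmp_int);
	sort_seconds += (double)(clock() - start) / CLOCKS_PER_SEC;
	sort_calls++;
	fprintf(stderr, "sort call #%lu (%.3fs total)\n",
		sort_calls, sort_seconds);
}

int main(void)
{
	int v[] = { 3, 1, 2 };
	instrumented_sort(v, 3);
	return 0;
}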
Martin Fick
2011-09-26 20:01:38 UTC
Permalink
On Monday, September 26, 2011 12:37:10 pm Julian Phillips
Post by Julian Phillips
-- snip --
Post by Martin Fick
Ahh, I think I have some more clues. So while this
change does not speed things up for me normally, I
found a case where it does! I set my .git/config to
have
[core]
compression = 0
and ran git-gc on my repo. Now, with a modern git with
this optimization in it (1.7.6, 1.7.7.rc0...), 'git
branch' is almost instantaneous (.05s)! But, if I
revert c774aa it takes > ~15s.
I don't understand this. I don't see why
core.compression should have anything to do with refs
...
Post by Martin Fick
So, it appears that this optimization is foiled by
compression? In the case when this optimization helps,
it saves about 15s; when it hurts (with compression),
it seems to cost > 3mins. I am not sure this
optimization is worth it? Would there be some way for
it to adjust to the repo conditions?
Well, in the case I tried it was 1.2s vs 40s. It would
seem that you have managed to find some corner case. It
doesn't seem right to punish everyone who has large
numbers of refs by making their commands take orders of
magnitude longer to save one person 3m. Much better to
find, understand and fix the actual cause.
I am not sure mine is the corner case, it is a real repo
(albeit a Gerrit repo with strange refs/changes), while it
sounds like yours is a test repo. It seems likely that
whatever you did to create the test repo makes it perform
well? I am also guessing that it is not the refs which are
the problem but the objects since the refs don't get
compressed do they? Does your repo have real data in it
(not just 100K refs)?

My repo compressed is about ~2G and uncompressed is ~1.1G
Yes, the compressed one is larger than the uncompressed one.
Since the compressed repo above was larger, I thought that I
should at lest gc it. After git gc, it is ~1.1G, so it
looks like the size difference was really because of not
having gced it at first after fetching the 100K refs.

After a gc, the repo does perform similarly to the
uncompressed one (which was achieved via gc). After gc, it
takes ~.05s to do a 'git branch' with 1.7.6 and
git.1.7.7.rc0.72.g4b5ea. It also takes a bit more than 15s
with the patch reverted. So it appears that compression is
not likely the culprit, but rather the need to be gced.

So, maybe you are correct, maybe my repo is the corner case?
Is a repo which needs to be gced considered a corner case?
Should git be able to detect that the repo is so in
desperate need of gcing? Is it normal for git to need to gc
right after a clone and then fetching ~100K refs?

I am not sure what is right here: if this patch makes a repo
which needs gcing degrade 5 to 10 times worse than the
benefit of this patch, it still seems questionable to me.
Post by Julian Phillips
I really can't see what effect core.compression can have
on loading the ref_list. Certainly the sort doesn't
load anything from the object database. It would be
really good to profile and find out what is taking all
the time - I am assuming that the CPU is at 100% for the
3+ minutes?
Yes, 100% CPU (I mostly run the tests at least twice and
have 8G of RAM, so I think the entire repo gets cached).
Post by Julian Phillips
Random thought. What happens to the with compression
case if you leave the commit in, but add a sleep(15) to
the end of sort_refs_list?
Why, what are you thinking? Hmm, I am trying this on the
non gced repo and it doesn't seem to be completing (no cpu
usage)! It appears that perhaps it is being called many
times (the sleeping would explain no cpu usage)?!? This
could be a real problem, this should only get called once
right?

-Martin
--
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum
Junio C Hamano
2011-09-26 20:07:25 UTC
Permalink
Post by Martin Fick
After a gc, the repo does perform similarly to the
uncompressed one (which was achieved via gc). After gc, it
takes ~.05s to do a 'git branch' with 1.7.6 and
git.1.7.7.rc0.72.g4b5ea. It also takes a bit more than 15s
with the patch reverted. So it appears that compression is
not likely the culprit, but rather the need to be gced.
Isn't packing refs part of "gc" these days?
Julian Phillips
2011-09-26 20:28:53 UTC
Permalink
On Mon, 26 Sep 2011 14:01:38 -0600, Martin Fick wrote:
-- snip --
Post by Martin Fick
So, maybe you are correct, maybe my repo is the corner case?
Is a repo which needs to be gced considered a corner case?
Should git be able to detect that the repo is so in
desperate need of gcing? Is it normal for git to need to gc
right after a clone and then fetching ~100K refs?
Were your 100k refs packed before the gc? If not, perhaps your refs are
causing a lot of trouble for the merge sort? They will be written out
sorted to the packed-refs file, so the merge sort won't have to do any
real work when loading them after that...
Post by Martin Fick
I am not sure what is right here: if this patch makes a repo
which needs gcing degrade 5 to 10 times worse than the
benefit of this patch, it still seems questionable to me.
Well - it does this _for your repo_, that doesn't automatically mean
that it does generally, or frequently. For instance, none of my normal
repos that have a lot of refs are Gerrit ones, and I wouldn't be
surprised if they benefitted from the merge sort (assuming that I am
right that the merge sort is taking a long time on your gerrit refs).

Besides, you would be better off running gc, and thus getting the
benefit too.
Post by Martin Fick
Post by Julian Phillips
Random thought. What happens to the with compression
case if you leave the commit in, but add a sleep(15) to
the end of sort_refs_list?
Why, what are you thinking? Hmm, I am trying this on the
non gced repo and it doesn't seem to be completing (no cpu
usage)! It appears that perhaps it is being called many
times (the sleeping would explain no cpu usage)?!? This
could be a real problem, this should only get called once
right?
I was just wondering if the time taken to get the refs was changing the
interaction with something else. Not very likely, but ...

I added a print statement, and it was called four times when I had
unpacked refs, and once with packed. So, maybe you are hitting some
nasty case with unpacked refs. If you use a print statement instead of
a sleep, how many times does sort_refs_lists get called in your unpacked
case? It may well also be worth calculating the time taken to do the
sort.
--
Julian
Martin Fick
2011-09-26 21:39:33 UTC
Permalink
On Monday, September 26, 2011 02:28:53 pm Julian Phillips
Post by Julian Phillips
-- snip --
Post by Martin Fick
So, maybe you are correct, maybe my repo is the corner
case? Is a repo which needs to be gced considered a
corner case? Should git be able to detect that the
repo is so in desperate need of gcing? Is it normal
for git to need to gc right after a clone and then
fetching ~100K refs?
Were your 100k refs packed before the gc? If not, perhaps
your refs are causing a lot of trouble for the merge
sort? They will be written out sorted to the
packed-refs file, so the merge sort won't have to do any
real work when loading them after that...
I am not sure how to determine that (?), but I think they
were packed. Under .git/objects/pack there were 2 large
files, both close to 500MB. Those 2 files constituted most
of the space in the repo (I was wrong about the repo sizes,
that included the working dir, so think about half the
quoted sizes for all of .git). So does that mean it is
mostly packed? Aside from the pack and idx files, there was
nothing else under the objects dir. After gcing, it is down
to just one ~500MB pack file.
Post by Julian Phillips
Post by Martin Fick
I am not sure what is right here, if this patch makes a
repo which needs gcing degrade 5 to 10 times worse
than the benefit of this patch, it still seems
questionable to me.
Well - it does this _for your repo_, that doesn't
automatically mean that it does generally, or
frequently.
Oh, def agreed! I just didn't want to discount it so quickly
as being a corner case.
Post by Julian Phillips
For instance, none of my normal repos that
have a lot of refs are Gerrit ones, and I wouldn't be
surprised if they benefitted from the merge sort
(assuming that I am right that the merge sort is taking
a long time on your gerrit refs).
Besides, you would be better off running gc, and thus
getting the benefit too.
Agreed, which is why I was asking if git should have noticed
my "degenerate" case and auto gced? But hopefully, there is
an actual bug here somewhere and we both will get to eat our
cake. :)
Post by Julian Phillips
Post by Martin Fick
Post by Julian Phillips
Random thought. What happens to the with compression
case if you leave the commit in, but add a sleep(15)
to the end of sort_refs_list?
Why, what are you thinking? Hmm, I am trying this on
the non gced repo and it doesn't seem to be completing
(no cpu usage)! It appears that perhaps it is being
called many times (the sleeping would explain no cpu
usage)?!? This could be a real problem, this should
only get called once right?
I was just wondering if the time taken to get the refs
was changing the interaction with something else. Not
very likely, but ...
I added a print statement, and it was called four times
when I had unpacked refs, and once with packed. So,
maybe you are hitting some nasty case with unpacked
refs. If you use a print statement instead of a sleep,
how many times does sort_refs_lists get called in your
unpacked case? It may well also be worth calculating
the time taken to do the sort.
In my case it was called 18785 times! Any other tests I
should run?

-Martin
--
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum
Martin Fick
2011-09-26 21:52:04 UTC
Permalink
Post by Martin Fick
On Monday, September 26, 2011 02:28:53 pm Julian Phillips
Post by Julian Phillips
Post by Martin Fick
Post by Julian Phillips
Random thought. What happens to the with
compression case if you leave the commit in, but
add a sleep(15) to the end of sort_refs_list?
Why, what are you thinking? Hmm, I am trying this on
the non gced repo and it doesn't seem to be
completing (no cpu usage)! It appears that perhaps
it is being called many times (the sleeping would
explain no cpu usage)?!? This could be a real
problem, this should only get called once right?
I was just wondering if the time taken to get the refs
was changing the interaction with something else. Not
very likely, but ...
I added a print statement, and it was called four times
when I had unpacked refs, and once with packed. So,
maybe you are hitting some nasty case with unpacked
refs. If you use a print statement instead of a sleep,
how many times does sort_refs_lists get called in your
unpacked case? It may well also be worth calculating
the time taken to do the sort.
In my case it was called 18785 times! Any other tests I
should run?
Gerrit stores the changes in directories under refs/changes
named after the last 2 digits of the change. Then under
each change it stores each patchset. So it looks like this:
refs/changes/dd/change_num/ps_num

I noticed that:

ls refs/changes/* | wc -l
-> 18876

somewhat close, but not super close to 18785, I am not sure
if that is a clue. It's almost like each change is causing
a re-sort,


-Martin
--
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum
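
A tiny standalone sketch of that naming scheme (the change and patchset numbers are made up): the shard directory is just the change number modulo 100.

#include <stdio.h>

int main(void)
{
	int change = 18785, patchset = 1;
	/* last two digits of the change number pick the shard directory */
	printf("refs/changes/%02d/%d/%d\n", change % 100, change, patchset);
	return 0;
}

With ~100K changes that layout yields tens of thousands of directories under refs/changes, one per change, which is what makes any per-directory work so expensive.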
Julian Phillips
2011-09-26 23:26:55 UTC
Permalink
Post by Martin Fick
Post by Martin Fick
On Monday, September 26, 2011 02:28:53 pm Julian Phillips
Post by Julian Phillips
Post by Martin Fick
Post by Julian Phillips
Random thought. What happens to the with
compression case if you leave the commit in, but
add a sleep(15) to the end of sort_refs_list?
Why, what are you thinking? Hmm, I am trying this on
the non gced repo and it doesn't seem to be
completing (no cpu usage)! It appears that perhaps
it is being called many times (the sleeping would
explain no cpu usage)?!? This could be a real
problem, this should only get called once right?
I was just wondering if the time taken to get the refs
was changing the interaction with something else. Not
very likely, but ...
I added a print statement, and it was called four times
when I had unpacked refs, and once with packed. So,
maybe you are hitting some nasty case with unpacked
refs. If you use a print statement instead of a sleep,
how many times does sort_refs_lists get called in your
unpacked case? It may well also be worth calculating
the time taken to do the sort.
In my case it was called 18785 times! Any other tests I
should run?
Gerrit stores the changes in directories under refs/changes
named after the last 2 digits of the change. Then under
refs/changes/dd/change_num/ps_num
ls refs/changes/* | wc -l
-> 18876
somewhat close, but not super close to 18785, I am not sure
if that is a clue. It's almost like each change is causing
a re-sort,
basically, it is ...

Back when I made that change, I failed to notice that get_ref_dir was
recursive for subdirectories ... sorry ...

Hopefully this should speed things up. My test repo went from ~17m
user time to ~2.5s.
Packing still makes things much faster of course.

diff --git a/refs.c b/refs.c
index a615043..212e7ec 100644
--- a/refs.c
+++ b/refs.c
@@ -319,7 +319,7 @@ static struct ref_list *get_ref_dir(const char *submodule, c
free(ref);
closedir(dir);
}
- return sort_ref_list(list);
+ return list;
}

struct warn_if_dangling_data {
@@ -361,11 +361,13 @@ static struct ref_list *get_loose_refs(const char *submodu
if (submodule) {
free_ref_list(submodule_refs.loose);
submodule_refs.loose = get_ref_dir(submodule, "refs", NULL);
+ submodule_refs.loose = sort_refs_list(submodule_refs.loose);
return submodule_refs.loose;
}

if (!cached_refs.did_loose) {
cached_refs.loose = get_ref_dir(NULL, "refs", NULL);
+ cached_refs.loose = sort_refs_list(cached_refs.loose);
cached_refs.did_loose = 1;
}
return cached_refs.loose;
Post by Martin Fick
-Martin
--
Julian
David Michael Barr
2011-09-26 23:37:59 UTC
Permalink
On Tue, Sep 27, 2011 at 9:26 AM, Julian Phillips
Post by Julian Phillips
Post by Martin Fick
Post by Martin Fick
On Monday, September 26, 2011 02:28:53 pm Julian Phillips
Post by Julian Phillips
Random thought. What happens to the with
compression case if you leave the commit in, but
add a sleep(15) to the end of sort_refs_list?
Why, what are you thinking? Hmm, I am trying this on
the non gced repo and it doesn't seem to be
completing (no cpu usage)! It appears that perhaps
it is being called many times (the sleeping would
explain no cpu usage)?!? This could be a real
problem, this should only get called once right?
I was just wondering if the time taken to get the refs
was changing the interaction with something else. Not
very likely, but ...
I added a print statement, and it was called four times
when I had unpacked refs, and once with packed. So,
maybe you are hitting some nasty case with unpacked
refs. If you use a print statement instead of a sleep,
how many times does sort_refs_lists get called in your
unpacked case? It may well also be worth calculating
the time taken to do the sort.
In my case it was called 18785 times! Any other tests I
should run?
Gerrit stores the changes in directories under refs/changes
named after the last 2 digits of the change. Then under
refs/changes/dd/change_num/ps_num
ls refs/changes/* | wc -l
-> 18876
somewhat close, but not super close to 18785, I am not sure
if that is a clue. It's almost like each change is causing
a re-sort,
basically, it is ...
Back when I made that change, I failed to notice that get_ref_dir was
recursive for subdirectories ... sorry ...
Post by Julian Phillips
Hopefully this should speed things up. My test repo went from ~17m
user time to ~2.5s.
Packing still makes things much faster of course.
diff --git a/refs.c b/refs.c
index a615043..212e7ec 100644
--- a/refs.c
+++ b/refs.c
@@ -319,7 +319,7 @@ static struct ref_list *get_ref_dir(const char *submodule, c
free(ref);
closedir(dir);
}
- return sort_ref_list(list);
+ return list;
}
struct warn_if_dangling_data {
@@ -361,11 +361,13 @@ static struct ref_list *get_loose_refs(const char *submodu
if (submodule) {
free_ref_list(submodule_refs.loose);
submodule_refs.loose = get_ref_dir(submodule, "refs", NULL);
+ submodule_refs.loose = sort_refs_list(submodule_refs.loose);
return submodule_refs.loose;
}
if (!cached_refs.did_loose) {
cached_refs.loose = get_ref_dir(NULL, "refs", NULL);
+ cached_refs.loose = sort_refs_list(cached_refs.loose);
cached_refs.did_loose = 1;
}
return cached_refs.loose;
Post by Martin Fick
-Martin
--
Julian
Well done! I'll try to compose a patch attributed to Julian with the
information from this thread.

--
David Barr
David Barr
2011-09-27 01:01:23 UTC
Permalink
Martin Fick reported:
OK, I have found what I believe is another performance
regression for large ref counts (~100K).

When I run git br on my repo which only has one branch, but
has ~100K refs under ref/changes (a gerrit repo), it takes
normally 3-6mins depending on whether my caches are fresh or
not. After bisecting some older changes, I noticed that
this commit seems to be where things start to get slow:
v1.5.2-rc0~21^2 (refs.c: add a function to sort a ref list,
rather then sorting on add) (Julian Phillips, Apr 17, 2007)

Martin Fick observed that sort_refs_lists() was called almost
as many times as there were directories of loose refs.

Julian Phillips commented:
Back when I made that change, I failed to notice that get_ref_dir
was recursive for subdirectories ... sorry ...

Hopefully this should speed things up. My test repo went from
~17m user time, to ~2.5s.
Packing still make things much faster of course.

Martin Fick acked:
Excellent! This works (almost, in my refs.c it is called
sort_ref_list, not sort_refs_list). So, on the non garbage
collected repo, git branch now takes ~.5s, and in the
garbage collected one it takes only ~.05s!

[db: summarised transcript, rewrote patch to fix callee not callers]

[attn jch: patch applies to maint]

Analyzed-by: Martin Fick <***@codeaurora.org>
Inspired-by: Julian Phillips <***@quantumfyre.co.uk>
Acked-by: Martin Fick <***@codeaurora.org>
Signed-off-by: David Barr <***@google.com>
---
refs.c | 14 ++++++++++----
1 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/refs.c b/refs.c
index 4c1fd47..e40a09c 100644
--- a/refs.c
+++ b/refs.c
@@ -255,8 +255,8 @@ static struct ref_list *get_packed_refs(const char *submodule)
return refs->packed;
}

-static struct ref_list *get_ref_dir(const char *submodule, const char *base,
- struct ref_list *list)
+static struct ref_list *walk_ref_dir(const char *submodule, const char *base,
+ struct ref_list *list)
{
DIR *dir;
const char *path;
@@ -299,7 +299,7 @@ static struct ref_list *get_ref_dir(const char *submodule, const char *base,
if (stat(refdir, &st) < 0)
continue;
if (S_ISDIR(st.st_mode)) {
- list = get_ref_dir(submodule, ref, list);
+ list = walk_ref_dir(submodule, ref, list);
continue;
}
if (submodule) {
@@ -319,7 +319,13 @@ static struct ref_list *get_ref_dir(const char *submodule, const char *base,
free(ref);
closedir(dir);
}
- return sort_ref_list(list);
+ return list;
+}
+
+static struct ref_list *get_ref_dir(const char *submodule, const char *base,
+ struct ref_list *list)
+{
+ return sort_ref_list(walk_ref_dir(submodule, base, list));
}

struct warn_if_dangling_data {
--
1.7.5.75.g69330
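
A standalone sketch (hypothetical names, not git's refs.c) of the shape of the fix above: the recursive walk only accumulates entries, and the single sort happens once at the top level. Sorting inside the recursion would re-sort the whole growing list once per directory, which is the behaviour the thread diagnosed.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct dir {
	const char **names;		/* NULL-terminated entries */
	const struct dir *sub;		/* optional subdirectory */
};

static int cmp_str(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* the walk itself never sorts; it just appends */
static int walk(const struct dir *d, const char **out, int n)
{
	int i;
	for (i = 0; d->names[i]; i++)
		out[n++] = d->names[i];
	if (d->sub)
		n = walk(d->sub, out, n);
	return n;
}

int main(void)
{
	const char *leaf_names[] = { "refs/tags/v2", "refs/tags/v1", NULL };
	const char *root_names[] = { "refs/heads/master", NULL };
	const struct dir leaf = { leaf_names, NULL };
	const struct dir root = { root_names, &leaf };
	const char *out[8];
	int i, n = walk(&root, out, 0);

	qsort(out, n, sizeof(*out), cmp_str);	/* one sort, after the walk */
	for (i = 0; i < n; i++)
		printf("%s\n", out[i]);
	return 0;
}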
David Michael Barr
2011-09-27 02:04:43 UTC
Permalink
+cc Shawn O. Pearce

I used the following to generate a test repo shaped like
a gerrit mirror with unpacked refs (10k, because life is too short for
100k tests):

cd test.git
git init
touch empty
git add empty
git commit -m 'empty'
REV=`git rev-parse HEAD`
for ((d=0;d<100;++d)); do
  for ((n=0;n<100;++n)); do
    let r=n*100+d
    mkdir -p .git/refs/changes/$d/$r
    echo $REV > .git/refs/changes/$d/$r/1
  done
done
time git branch xyz

With warm caches...

Git 1.7.6.4:
real 0m8.232s
user 0m7.842s
sys 0m0.385s

Git 1.7.6.4, with patch below:
real 0m0.394s
user 0m0.069s
sys 0m0.324s
 OK, I have found what I believe is another performance
 regression for large ref counts (~100K).
 When I run git br on my repo which only has one branch, but
 has ~100K refs under ref/changes (a gerrit repo), it takes
 normally 3-6mins depending on whether my caches are fresh or
 not. After bisecting some older changes, I noticed that
 v1.5.2-rc0~21^2 (refs.c: add a function to sort a ref list,
 rather then sorting on add) (Julian Phillips, Apr 17, 2007)
Martin Fick observed that sort_refs_lists() was called almost
as many times as there were directories of loose refs.
 Back when I made that change, I failed to notice that get_ref_dir
 was recursive for subdirectories ... sorry ...
 Hopefully this should speed things up. My test repo went from
 ~17m user time to ~2.5s.
 Packing still makes things much faster of course.
 Excellent! This works (almost, in my refs.c it is called
 sort_ref_list, not sort_refs_list). So, on the non garbage
 collected repo, git branch now takes ~.5s, and in the
 garbage collected one it takes only ~.05s!
[db: summarised transcript, rewrote patch to fix callee not callers]
[attn jch: patch applies to maint]
---
 refs.c | 14 ++++++++++----
 1 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/refs.c b/refs.c
index 4c1fd47..e40a09c 100644
--- a/refs.c
+++ b/refs.c
@@ -255,8 +255,8 @@ static struct ref_list *get_packed_refs(const char *submodule)
return refs->packed;
}
-static struct ref_list *get_ref_dir(const char *submodule, const char *base,
- struct ref_list *list)
+static struct ref_list *walk_ref_dir(const char *submodule, const char *base,
+ struct ref_list *list)
{
DIR *dir;
const char *path;
@@ -299,7 +299,7 @@ static struct ref_list *get_ref_dir(const char *submodule, const char *base,
if (stat(refdir, &st) < 0)
continue;
if (S_ISDIR(st.st_mode)) {
- list = get_ref_dir(submodule, ref, list);
+ list = walk_ref_dir(submodule, ref, list);
continue;
}
if (submodule) {
@@ -319,7 +319,13 @@ static struct ref_list *get_ref_dir(const char *submodule, const char *base,
free(ref);
closedir(dir);
}
- return sort_ref_list(list);
+ return list;
+}
+
+static struct ref_list *get_ref_dir(const char *submodule, const char *base,
+ struct ref_list *list)
+{
+ return sort_ref_list(walk_ref_dir(submodule, base, list));
}
struct warn_if_dangling_data {
--
1.7.5.75.g69330
--

David Barr | Software Engineer | ***@google.com | 614-3438-8348
Junio C Hamano
2011-09-26 23:38:54 UTC
Permalink
Post by Julian Phillips
Back when I made that change, I failed to notice that get_ref_dir was
recursive for subdirectories ... sorry ...
Aha, I also was blind while I was watching this discussion from the
sideline, and I thought I re-read the codepath involved X-<. Indeed
we were sorting the list way too early and the patch looks correct.

Thanks.
Post by Julian Phillips
Hopefully this should speed things up. My test repo went from ~17m
user time to ~2.5s.
Packing still makes things much faster of course.
diff --git a/refs.c b/refs.c
index a615043..212e7ec 100644
--- a/refs.c
+++ b/refs.c
@@ -319,7 +319,7 @@ static struct ref_list *get_ref_dir(const char *submodule, c
free(ref);
closedir(dir);
}
- return sort_ref_list(list);
+ return list;
}
struct warn_if_dangling_data {
@@ -361,11 +361,13 @@ static struct ref_list *get_loose_refs(const char *submodu
if (submodule) {
free_ref_list(submodule_refs.loose);
submodule_refs.loose = get_ref_dir(submodule, "refs", NULL);
+ submodule_refs.loose = sort_refs_list(submodule_refs.loose);
return submodule_refs.loose;
}
if (!cached_refs.did_loose) {
cached_refs.loose = get_ref_dir(NULL, "refs", NULL);
+ cached_refs.loose = sort_refs_list(cached_refs.loose);
cached_refs.did_loose = 1;
}
return cached_refs.loose;
Post by Martin Fick
-Martin
Julian Phillips
2011-09-27 00:00:09 UTC
Permalink
get_ref_dir is called recursively for subdirectories, which means that
we were calling sort_ref_list for each directory of refs instead of
once for all the refs. This is a massive waste of processing, so now
just call sort_ref_list on the result of the top-level get_ref_dir, so
that the sort is only done once.

In the common case of only a few different directories of refs the
difference isn't very noticeable, but it becomes very noticeable when
you have a large number of directories containing refs (e.g. as
created by Gerrit).

Reported by Martin Fick.

Signed-off-by: Julian Phillips <***@quantumfyre.co.uk>
---

This time the typos are fixed too ... perhaps I wrote the original
commit at 1am too ... :$

 refs.c | 4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/refs.c b/refs.c
index a615043..a49ff74 100644
--- a/refs.c
+++ b/refs.c
@@ -319,7 +319,7 @@ static struct ref_list *get_ref_dir(const char *submodule, const char *base,
 		free(ref);
 		closedir(dir);
 	}
-	return sort_ref_list(list);
+	return list;
 }
 
 struct warn_if_dangling_data {
@@ -361,11 +361,13 @@ static struct ref_list *get_loose_refs(const char *submodule)
 	if (submodule) {
 		free_ref_list(submodule_refs.loose);
 		submodule_refs.loose = get_ref_dir(submodule, "refs", NULL);
+		submodule_refs.loose = sort_ref_list(submodule_refs.loose);
 		return submodule_refs.loose;
 	}
 
 	if (!cached_refs.did_loose) {
 		cached_refs.loose = get_ref_dir(NULL, "refs", NULL);
+		cached_refs.loose = sort_ref_list(cached_refs.loose);
 		cached_refs.did_loose = 1;
 	}
 	return cached_refs.loose;
--
1.7.6.1
Martin Fick
2011-09-27 00:12:31 UTC
Permalink
On Monday, September 26, 2011 05:26:55 pm Julian Phillips
Post by Julian Phillips
On Monday, September 26, 2011 03:39:33 pm Martin Fick
Post by Martin Fick
On Monday, September 26, 2011 02:28:53 pm Julian Phillips
In my case it was called 18785 times! Any other tests
I should run?
Gerrit stores the changes in directories under
refs/changes named after the last 2 digits of the
change. Then under each change it stores each patch set:
refs/changes/dd/change_num/ps_num
ls refs/changes/* | wc -l
-> 18876
somewhat close, but not super close to 18785, I am not
sure if that is a clue. It's almost like each change
is causing a re-sort,
basically, it is ...
Back when I made that change, I failed to notice that
get_ref_dir was recursive for subdirectories ... sorry
...
Hopefully this should speed things up. My test repo went
from ~17m user time to ~2.5s.
Packing still makes things much faster of course.
Excellent! This works (almost, in my refs.c it is called
sort_ref_list, not sort_refs_list). So, on the non-garbage-collected
repo, git branch now takes ~.5s, and in the garbage-collected one it
takes only ~.05s!

Thanks way much!!!

-Martin
--
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum
Julian Phillips
2011-09-27 00:22:32 UTC
Permalink
Post by Martin Fick
On Monday, September 26, 2011 05:26:55 pm Julian Phillips
-- snip --
Post by Martin Fick
Post by Julian Phillips
Back when I made that change, I failed to notice that
get_ref_dir was recursive for subdirectories ... sorry
...
Hopefully this should speed things up. My test repo went
from ~17m user time to ~2.5s.
Packing still makes things much faster of course.
Excellent! This works (almost, in my refs.c it is called
sort_ref_list, not sort_refs_list).
Yeah, in mine too ;) It's late and I got the compile/send mail
sequence backwards. :$
It's fixed in the proper patch email.
Post by Martin Fick
So, on the non garbage
collected repo, git branch now takes ~.5s, and in the
garbage collected one it takes only ~.05s!
That sounds a lot better. Hopefully other commands should be faster
now too.
Post by Martin Fick
Thanks way much!!!
No problem. Thank you for all the time you've put in to help chase
this down. Makes it so much easier when the person with the original
problem mucks in with the investigation.
Just think how much time you've saved for anyone with a large number of
those Gerrit change refs ;)
--
Julian
Martin Fick
2011-09-27 02:34:02 UTC
Permalink
Post by Julian Phillips
That sounds a lot better. Hopefully other commands should be faster
now too.
Yeah, I will try this in a few other places to see.
Post by Julian Phillips
Post by Martin Fick
Thanks way much!!!
No problem. Thank you for all the time you've put in to help chase
this down. Makes it so much easier when the person with the original
problem mucks in with the investigation.
Just think how much time you've saved for anyone with a large number of
those Gerrit change refs ;)
Perhaps this is a naive question, but why are all these refs being put into a list to be sorted, only to be discarded soon thereafter anyway? After all, git branch knows that it isn't going to print these, and the refs are stored precategorized, so why not only grab the refs which matter upfront?

-Martin
Julian Phillips
2011-09-27 07:59:24 UTC
Permalink
Post by Martin Fick
Post by Julian Phillips
That sounds a lot better. Hopefully other commands should be faster
now too.
Yeah, I will try this in a few other places to see.
Post by Julian Phillips
Post by Martin Fick
Thanks way much!!!
No problem. Thank you for all the time you've put in to help chase
this down. Makes it so much easier when the person with original
problem mucks in with the investigation.
Just think how much time you've saved for anyone with a large number of
those Gerrit change refs ;)
Perhaps this is a naive question, but why are all these refs being
put into a list to be sorted, only to be discarded soon thereafter
anyway? After all, git branch knows that it isn't going to print
these, and the refs are stored precategorized, so why not only grab
the refs which matter upfront?
I can't say that I am aware of a specific decision having been taken on
the subject, but I'll have a guess at the reason:

The extra code it would take to have an API for getting a list of only
a subset of the refs has never been considered worth the cost. It would
take effort to implement, test and maintain - and it would have to be
done separately for packed and unpacked cases to avoid still loading and
discarding unwanted refs. All that to not do something that no-one has
noticed taking any time? Until now, I doubt anyone has considered it
something that was a problem - and now that even with 100k refs it takes
less than a second, I doubt anyone will feel all that inclined to have a
crack at it now either.
--
Julian
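To make Julian's point concrete, a subset-refs API might look roughly
like the sketch below. This is hypothetical code, not anything posted
in the thread: it assumes the singly linked ref_list discussed above,
and it only filters an already-loaded list, so the real saving Julian
describes would additionally need packed- and loose-specific paths
that never load the unwanted refs in the first place.

#include <string.h>

struct ref_list {
	struct ref_list *next;
	unsigned char sha1[20];
	char *name;
};

typedef int (*each_ref_fn)(const char *refname,
			   const unsigned char *sha1, void *cb_data);

/* Invoke fn only for refs under the given prefix, e.g. "refs/heads/". */
static int for_each_ref_with_prefix(struct ref_list *refs,
				    const char *prefix,
				    each_ref_fn fn, void *cb_data)
{
	size_t len = strlen(prefix);
	int ret;

	for (; refs; refs = refs->next) {
		if (strncmp(refs->name, prefix, len))
			continue;	/* not in the wanted namespace */
		ret = fn(refs->name, refs->sha1, cb_data);
		if (ret)
			return ret;	/* allow early abort, git-style */
	}
	return 0;
}

With something like this, "git branch" would at least skip the sorting
and callback work for refs/changes/*; avoiding reading them from disk
entirely is the larger win Julian alludes to.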
Sverre Rabbelier
2011-09-27 08:20:29 UTC
Permalink
Heya,
Post by Julian Phillips
Back when I made that change, I failed to notice that get_ref_dir was
recursive for subdirectories ... sorry ...
Hopefully this should speed things up. My test repo went from ~17m user
time to ~2.5s.
Packing still makes things much faster of course.
Can we perhaps also have some tests to prevent this from happening again?

--
Cheers,

Sverre Rabbelier
Julian Phillips
2011-09-27 09:01:56 UTC
Permalink
Heya,
On Tue, Sep 27, 2011 at 01:26, Julian Phillips
Back when I made that change, I failed to notice that get_ref_dir was
recursive for subdirectories ... sorry ...
Hopefully this should speed things up. My test repo went from ~17m user
time to ~2.5s.
Packing still makes things much faster of course.
Can we perhaps also have some tests to prevent this from happening again?
Um ... any suggestion what to test?

It has to be hot-cache, otherwise time taken to read the refs from disk
will mean that it is always slow. On my Mac it seems to _always_ be
slow reading the refs from disk, so even the "fast" case still takes
~17m.

Also, what counts as ok, and what as broken?

--
Julian
Sverre Rabbelier
2011-09-27 10:01:24 UTC
Permalink
Heya,
It has to be hot-cache, otherwise time taken to read the refs from disk will
mean that it is always slow. On my Mac it seems to _always_ be slow reading
the refs from disk, so even the "fast" case still takes ~17m.
Ah, that seems unfortunate. Not sure how to test it then.

--
Cheers,

Sverre Rabbelier
Nguyen Thai Ngoc Duy
2011-09-27 10:25:08 UTC
Permalink
Post by Sverre Rabbelier
Heya,
It has to be hot-cache, otherwise time taken to read the refs from disk will
mean that it is always slow. On my Mac it seems to _always_ be slow reading
the refs from disk, so even the "fast" case still takes ~17m.
Ah, that seems unfortunate. Not sure how to test it then.
If you care about performance, a perf test suite could be made,
perhaps as a separate project. The output would be charts or
spreadsheets that interested parties can look at and point out
regressions. We may start with a set of commonly used operations.
--
Duy
Michael Haggerty
2011-09-27 11:07:15 UTC
Permalink
Post by Julian Phillips
It has to be hot-cache, otherwise time taken to read the refs from disk
will mean that it is always slow. On my Mac it seems to _always_ be
slow reading the refs from disk, so even the "fast" case still takes ~17m.
This case should be helped by lazy-loading of loose references, which I
am working on. So if you develop some benchmarking code, it would help
me with my work.

Michael
--
Michael Haggerty
***@alum.mit.edu
http://softwareswirl.blogspot.com/
Julian Phillips
2011-09-27 12:10:25 UTC
Permalink
Post by Michael Haggerty
Post by Julian Phillips
It has to be hot-cache, otherwise time taken to read the refs from disk
will mean that it is always slow. On my Mac it seems to _always_ be
slow reading the refs from disk, so even the "fast" case still takes ~17m.
This case should be helped by lazy-loading of loose references, which I
am working on. So if you develop some benchmarking code, it would help
me with my work.
The attached script creates the repo structure I was testing with ...

If you create a repo with 100k refs it takes quite a while to read the
refs from disk. If you are lazy-loading then it should take practically
no time, since the only interesting ref is refs/heads/master.

The following is the hot-cache timing for "./refs-stress c 40000", with
the sorting patch applied (wasn't prepared to wait for numbers with 100k
refs).

***@rayne: refs>(cd c; time ~/misc/git/git/git branch)
* master

real 0m0.885s
user 0m0.161s
sys 0m0.722s

After doing "rm -rf c/.git/refs/changes/*", I get:

***@rayne: refs>(cd c; time ~/misc/git/git/git branch)
* master

real 0m0.004s
user 0m0.001s
sys 0m0.002s
--
Julian
Julian Phillips
2011-09-26 22:30:46 UTC
Permalink
Post by Martin Fick
On Monday, September 26, 2011 02:28:53 pm Julian Phillips
Post by Julian Phillips
-- snip --
Post by Martin Fick
So, maybe you are correct, maybe my repo is the corner
case? Is a repo which needs to be gced considered a
corner case? Should git be able to detect that the
repo is so in desperate need of gcing? Is it normal
for git to need to gc right after a clone and then
fetching ~100K refs?
Were your 100k refs packed before the gc? If not, perhaps
your refs are causing a lot of trouble for the merge
sort? They will be written out sorted to the
packed-refs file, so the merge sort won't have to do any
real work when loading them after that...
I am not sure how to determine that (?), but I think they
were packed. Under .git/objects/pack there were 2 large
files, both close to 500MB. Those 2 files constituted most
of the space in the repo (I was wrong about the repo sizes,
that included the working dir, so think about half the
quoted sizes for all of .git). So does that mean it is
mostly packed? Aside from the pack and idx files, there was
nothing else under the objects dir. After gcing, it is down
to just one ~500MB pack file.
If refs are listed under .git/refs/... they are unpacked, if they are
listed in .git/packed-refs they are packed.
They can be in both if updated since the last pack.
Post by Martin Fick
Post by Julian Phillips
Post by Martin Fick
I am not sure what is right here, if this patch makes a
repo which needs gcing degrade 5 to 10 times worse
than the benefit of this patch, it still seems
questionable to me.
Well - it does this _for your repo_, that doesn't
automatically mean that it does generally, or
frequently.
Oh, def agreed! I just didn't want to discount it so quickly
as being a corner case.
Post by Julian Phillips
For instance, none of my normal repos that
have a lot of refs are Gerrit ones, and I wouldn't be
surprised if they benefitted from the merge sort
(assuming that I am right that the merge sort is taking
a long time on your gerrit refs).
Besides, you would be better off running gc, and thus
getting the benefit too.
Agreed, which is why I was asking if git should have noticed
my "degenerate" case and auto gced? But hopefully, there is
an actual bug here somewhere and we both will get to eat our
cake. :)
I think automatic gc is currently only triggered by unpacked objects,
not unpacked refs ... perhaps the auto-gc should cover refs too?
Post by Martin Fick
Post by Julian Phillips
Post by Martin Fick
Post by Julian Phillips
Random thought. What happens to the with compression
case if you leave the commit in, but add a sleep(15)
to the end of sort_refs_list?
Why, what are you thinking? Hmm, I am trying this on
the non gced repo and it doesn't seem to be completing
(no cpu usage)! It appears that perhaps it is being
called many times (the sleeping would explain no cpu
usage)?!? This could be a real problem, this should
only get called once right?
I was just wondering if the time taken to get the refs
was changing the interaction with something else. Not
very likely, but ...
I added a print statement, and it was called four times
when I had unpacked refs, and once with packed. So,
maybe you are hitting some nasty case with unpacked
refs. If you use a print statement instead of a sleep,
how many times does sort_refs_list get called in your
unpacked case? It may well also be worth calculating
the time taken to do the sort.
In my case it was called 18785 times! Any other tests I
should run?
That's a lot of sorts. I really can't see why there would need to be
more than one ...

I've created a new test repo, using a more complicated method to
construct the 100k refs, and it took ~40m to run "git branch" instead of
the 1.2s for the previous repo. So, I think the ref naming pattern used
by Gerrit is definitely triggering something odd. However, progress is
a bit slow - now that it takes over 1/2 an hour to try things out ...
--
Julian
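For anyone wanting to repeat this kind of measurement, a debug patch in
the spirit of the print statement used above might look like the
following sketch. It is hypothetical, not what was actually posted in
the thread; sort_ref_list_1 stands for the real merge-sort body,
renamed out of the way:

#include <stdio.h>
#include <sys/time.h>

struct ref_list;
static struct ref_list *sort_ref_list_1(struct ref_list *list);

static unsigned long sort_calls;
static double sort_seconds;

/* Wrap the real sort to count calls and accumulate wall-clock time. */
static struct ref_list *sort_ref_list(struct ref_list *list)
{
	struct timeval tv0, tv1;
	struct ref_list *ret;

	gettimeofday(&tv0, NULL);
	ret = sort_ref_list_1(list);	/* the original merge sort */
	gettimeofday(&tv1, NULL);

	sort_calls++;
	sort_seconds += (tv1.tv_sec - tv0.tv_sec) +
			(tv1.tv_usec - tv0.tv_usec) / 1e6;
	fprintf(stderr, "sort_ref_list: call %lu, %.6fs total\n",
		sort_calls, sort_seconds);
	return ret;
}

Run against an unpacked Gerrit-style repo, something like this would
have shown the 18785 calls directly, along with how much of the total
runtime they account for.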
Michael Haggerty
2011-09-26 15:32:14 UTC
Permalink
Post by Martin Fick
A coworker of mine pointed out to me that a simple
git checkout
can also take rather long periods of time > 3 mins when run
on a repo with ~100K refs.
While this is not massive like the other problem I reported,
it still seems like it is more than one would expect. So, I
tried an older version of git, and to my surprise/delight,
it was much faster (.2s). So, I bisected this issue also,
and it seems that the "offending" commit is
I'm still working on changes to store references hierarchically in the
cache and read them lazily. I hope that it will help some scaling
problems with large numbers of refs.

Unfortunately I keep getting tangled up in side issues, so it is taking
a lot longer than expected. But there's still hope.

Michael
--
Michael Haggerty
***@alum.mit.edu
http://softwareswirl.blogspot.com/
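As an illustration of the direction Michael describes, a hierarchical,
lazily-populated cache might be structured something like this rough
sketch. All of the names and the layout here are invented for the
example; the design that eventually lands may differ:

#include <string.h>

struct ref_entry {
	unsigned char sha1[20];
	char *name;
};

struct ref_dir {
	char *path;		/* e.g. "refs" or "refs/heads" */
	int loaded;		/* 0 until this level is read from disk */
	int nr_subdirs, nr_refs;
	struct ref_dir **subdirs;
	struct ref_entry **refs;
};

/* Read one directory level from disk; deliberately does not recurse. */
static void read_one_level(struct ref_dir *dir);

/* Look up a subdirectory, reading this level on first access only. */
static struct ref_dir *lookup_subdir(struct ref_dir *dir, const char *name)
{
	int i;

	if (!dir->loaded) {
		read_one_level(dir);
		dir->loaded = 1;
	}
	for (i = 0; i < dir->nr_subdirs; i++)
		if (!strcmp(dir->subdirs[i]->path, name))
			return dir->subdirs[i];
	return NULL;
}

The payoff for Julian's benchmark is that resolving refs/heads/master
descends through "refs" and "refs/heads" only, and never triggers a
read of the refs/changes hierarchy at all.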
Martin Fick
2011-09-26 15:42:08 UTC
Permalink
On Monday, September 26, 2011 09:32:14 am Michael Haggerty
Post by Michael Haggerty
Post by Martin Fick
A coworker of mine pointed out to me that a simple
git checkout
can also take rather long periods of time > 3 mins when
run on a repo with ~100K refs.
While this is not massive like the other problem I
reported, it still seems like it is more than one
would expect. So, I tried an older version of git,
and to my surprise/delight, it was much faster (.2s).
So, I bisected this issue also, and it seems that the
"offending" commit is
I'm still working on changes to store references
hierarchically in the cache and read them lazily. I
hope that it will help some scaling problems with large
number of refs.
Unfortunately I keep getting tangled up in side issues,
so it is taking a lot longer than expected. But there's
still hope.
Michael
Thanks Michael, I look forward to those changes. In the
meantime however, I will try to take advantage of the
current inefficiencies of large ref counts to attempt to
find places where there are obvious problems in the code
paths. I suspect that there are several commands in git
which inadvertently scan all the refs when they probably
shouldn't. Since this is likely very slow now, it should be
easy to find those; if it were faster, this might get
overlooked. I feel like git checkout is one of those cases;
it does not seem like git checkout should be affected by the
number of refs in a repo?

-Martin
--
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum
Thomas Rast
2011-09-26 16:25:39 UTC
Permalink
Post by Martin Fick
I suspect that there are several commands in git
which inadvertently scan all the refs when they probably
shouldn't. [...] I feel like git checkout is one of those cases,
it does not seem like git checkout should be affected by the
number of refs in a repo?
git-checkout checks whether you are leaving any unreferenced
(orphaned) commits behind when you leave a detached HEAD, which
requires that it scan the history of all refs for the commit you just
left.

So unless you disable that warning it'll be pretty expensive
regardless.
--
Thomas Rast
trast@{inf,student}.ethz.ch
Michael Haggerty
2011-09-09 13:50:37 UTC
Permalink
Post by Martin Fick
Just thought that I should add some numbers to this thread as it seems that
the later versions of git are worse off by several orders of magnitude on
this one.
We have a Gerrit repo with just under 100K refs in refs/changes/*. When I
fetch them all with git 1.7.6 it does not seem to complete. Even after 5
days, it is just under half way through the ref #s! [...]
I recently reported very slow performance when doing a "git
filter-branch" involving only about 1000 tags, with hints of O(N^3)
scaling [1]. That could certainly explain enormous runtimes for 100k refs.

References are cached in git in a single linked list, so it is easy to
imagine O(N^2) all over the place (which is bad enough for 100k
references). I am working on improving the situation by reorganizing
how the reference cache is stored in memory, but progress is slow.

I'm not sure whether your problem is related. For example, it is not
obvious to me why the commit that you cite (88a21979) would make the
reference problem so dramatically worse.

I suggest the following experiments to characterize the problem:

1. Fetch the references in batches of a few hundred each, and see if
that dramatically decreases the total time.

2. Same as (1), except run "git pack-refs --all --prune" between the
batches. In my experiments, packing references made a dramatic
difference in runtimes.

3. Try using the --no-replace-objects option (I assume that it can be
used like "git --no-replace-objects fetch ..."). In my case this option
made a dramatic improvement in the runtimes.

4. Try a test using a repository generated something like the test
script that I posted in [1]. If it also gives pathologically bad
performance, then it can serve as a test case to use while we debug the
problem.

Yours,
Michael

[1] http://comments.gmane.org/gmane.comp.version-control.git/177103
--
Michael Haggerty
***@alum.mit.edu
http://softwareswirl.blogspot.com/
Michael Haggerty
2011-09-09 15:51:04 UTC
Permalink
Post by Michael Haggerty
3. Try using the --no-replace-objects option (I assume that it can be
used like "git --no-replace-objects fetch ..."). In my case this option
made a dramatic improvement in the runtimes.
This does not seem to help much.
Post by Michael Haggerty
4. Try a test using a repository generated something like the test
script that I posted in [1]. If it also gives pathologically bad
performance, then it can serve as a test case to use while we debug the
problem.
Yes, a simple test repo like that created by the script is enough to
reproduce the problem. The slowdown becomes very obvious after only a
few hundred references.

Curiously, "git clone" is very fast under the same circumstances that
"git fetch" is excruciatingly slow.

According to strace, git seems to be repopulating the ref cache after
each new ref is created (it walks through the whole refs subdirectory
and reads every file). Apparently the ref cache is being discarded
completely whenever a ref is added (which can and should be fixed) and
then being reloaded for some reason (though single refs can be inspected
much faster without reading the cache). This situation should be
improved by the hierarchical refcache changes that I'm working on plus
smarter updating (rather than discarding) of the cache when a new
reference is created.

Some earlier speculation in this thread was that the slowdowns might be
caused by "pessimal" ordering of revisions in the walker queue. But my
test repository shards the references in such a way that the lexical
order of the refnames does not correspond to the topological order of
the commits. So that can't be the whole story.

Michael
--
Michael Haggerty
***@alum.mit.edu
http://softwareswirl.blogspot.com/
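The "smarter updating" mentioned above could be as simple as splicing
the new ref into the cached, sorted list rather than throwing the cache
away. A sketch, assuming the name-sorted ref_list cache discussed
earlier in the thread (the helper name is invented for the example):

#include <string.h>

struct ref_list {
	struct ref_list *next;
	unsigned char sha1[20];
	char *name;
};

/* Splice one new ref into a name-sorted cache, keeping it sorted. */
static void cache_add_ref(struct ref_list **cache, struct ref_list *new_ref)
{
	struct ref_list **p = cache;

	while (*p && strcmp((*p)->name, new_ref->name) < 0)
		p = &(*p)->next;
	new_ref->next = *p;
	*p = new_ref;
	/* O(n) pointer walk per ref, but no rescan of the refs
	   directory and no re-sort, so a fetch of 100k refs no
	   longer rereads the whole tree once per created ref. */
}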
Jens Lehmann
2011-09-09 16:03:31 UTC
Permalink
Post by Michael Haggerty
Post by Martin Fick
Just thought that I should add some numbers to this thread as it seems that
the later versions of git are worse off by several orders of magnitude on
this one.
We have a Gerrit repo with just under 100K refs in refs/changes/*. When I
fetch them all with git 1.7.6 it does not seem to complete. Even after 5
days, it is just under half way through the ref #s! [...]
I recently reported very slow performance when doing a "git
filter-branch" involving only about 1000 tags, with hints of O(N^3)
scaling [1]. That could certainly explain enormous runtimes for 100k refs.
References are cached in git in a single linked list, so it is easy to
imagine O(N^2) all over the place (which is bad enough for 100k
references). I am working on improving the situation by reorganizing
how the reference cache is stored in memory, but progress is slow.
I'm not sure whether your problem is related. For example, it is not
obvious to me why the commit that you cite (88a21979) would make the
reference problem so dramatically worse.
88a21979 is the reason, as since then a "git rev-list <sha1> --not --all" is
run for *every* updated ref to find out all new commits fetched for that ref.
And if you have 100K of them ...
Andreas Ericsson
2011-06-10 07:41:53 UTC
Permalink
Post by Shawn Pearce
Post by A Large Angry SCM
Post by Shawn Pearce
Having a reference to every commit in the repository is horrifically
slow. We run into this with Gerrit Code Review and I need to find
another solution. Git just wasn't meant to process repositories like
this.
Assuming a very large number of refs, what is it that makes git so
horrifically slow? Is there a design or implementation lesson here?
A few things.
Git does a sequential scan of all references when it first needs to
access references for an operation. This requires reading the entire
packed-refs file, and the recursive scan of the "refs/" subdirectory
for any loose refs that might override the packed-refs file.
A lot of operations toss every commit that a reference points at into
the revision walker's LRU queue. If you have a tag pointing to every
commit, then the entire project history enters the LRU queue at once,
up front. That queue is managed with O(N^2) insertion time. And the
entire queue has to be filled before anything can be output.
Hmm. Since we're using pre-hashed data with an obvious lookup method
we should be able to do much, much better than O(n^2) for insertion
and better than O(n) for worst-case lookups. I'm thinking a 1-byte
trie, resulting in a depth, lookup and insertion complexity of 20. It
would waste some memory but it might be worth it for fixed asymptotic
complexity for both insertion and lookup.
--
Andreas Ericsson ***@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.
Shawn Pearce
2011-06-10 19:41:39 UTC
Permalink
Post by Andreas Ericsson
Post by Shawn Pearce
A lot of operations toss every commit that a reference points at into
the revision walker's LRU queue. If you have a tag pointing to every
commit, then the entire project history enters the LRU queue at once,
up front. That queue is managed with O(N^2) insertion time. And the
entire queue has to be filled before anything can be output.
Hmm. Since we're using pre-hashed data with an obvious lookup method
we should be able to do much, much better than O(n^2) for insertion
and better than O(n) for worst-case lookups. I'm thinking a 1-byte
trie, resulting in a depth, lookup and insertion complexity of 20. It
would waste some memory but it might be worth it for fixed asymptotic
complexity for both insertion and lookup.
Not really.

The queue isn't sorting by SHA-1. It's sorting by commit timestamp,
descending. Those aren't pre-hashed. The O(N^2) insertion is because
the code is trying to find where this commit belongs in the list of
commits as sorted by commit timestamp.

There are some priority queue data structures designed for this sort of
work, e.g. a calendar queue might help. But it's not as simple as a
1-byte trie.
--
Shawn.
Jakub Narebski
2011-06-10 20:12:12 UTC
Permalink
Post by Shawn Pearce
Post by Andreas Ericsson
Post by Shawn Pearce
A lot of operations toss every commit that a reference points at into
the revision walker's LRU queue. If you have a tag pointing to every
commit, then the entire project history enters the LRU queue at once,
up front. That queue is managed with O(N^2) insertion time. And the
entire queue has to be filled before anything can be output.
Hmm. Since we're using pre-hashed data with an obvious lookup method
we should be able to do much, much better than O(n^2) for insertion
and better than O(n) for worst-case lookups. I'm thinking a 1-byte
trie, resulting in a depth, lookup and insertion complexity of 20. It
would waste some memory but it might be worth it for fixed asymptotic
complexity for both insertion and lookup.
Not really.
The queue isn't sorting by SHA-1. It's sorting by commit timestamp,
descending. Those aren't pre-hashed. The O(N^2) insertion is because
the code is trying to find where this commit belongs in the list of
commits as sorted by commit timestamp.
There are some priority queue data structures designed for this sort of
work, e.g. a calendar queue might help. But it's not as simple as a
1-byte trie.
In the case of Subversion revision numbers (the revision-number-to-hash
mapping), sorted by name (in version order, at least) means sorted by
date. I wonder if there is a data structure for which this is the
optimal insertion order (like almost-sorted data is the best case for
insertion sort).
--
Jakub Narebski
Poland
ShadeHawk on #git
Jeff King
2011-06-10 20:35:45 UTC
Permalink
Post by Shawn Pearce
Not really.
The queue isn't sorting by SHA-1. It's sorting by commit timestamp,
descending. Those aren't pre-hashed. The O(N^2) insertion is because
the code is trying to find where this commit belongs in the list of
commits as sorted by commit timestamp.
There are some priority queue data structures designed for this sort of
work, e.g. a calendar queue might help. But it's not as simple as a
1-byte trie.
All you really need is a heap-based priority queue, which gives O(lg n)
insertion and popping (and O(1) peeking at the top). I even wrote one
and posted it recently (I won't dig up the reference, but I posted it
elsewhere in this thread, I think).

The problem is that many parts of the code assume that commit_list is a
linked list and do fast iterations, or even splicing. It's nothing you
couldn't get around with some work, but it turns out to involve a lot
of code changes.

-Peff
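For reference, the data structure Peff describes is small. A minimal
binary heap keyed on commit date (not his posted code, and with struct
commit cut down to the one field that matters here) fits in a few dozen
lines:

#include <stdlib.h>

struct commit {
	unsigned long date;
};

struct commit_heap {
	struct commit **c;
	int nr, alloc;
};

/* O(log n) insert: bubble up so the newest date stays on top. */
static void heap_push(struct commit_heap *h, struct commit *commit)
{
	int i = h->nr++;

	if (h->nr > h->alloc) {
		h->alloc = h->alloc ? 2 * h->alloc : 64;
		h->c = realloc(h->c, h->alloc * sizeof(*h->c));
	}
	while (i > 0 && h->c[(i - 1) / 2]->date < commit->date) {
		h->c[i] = h->c[(i - 1) / 2];
		i = (i - 1) / 2;
	}
	h->c[i] = commit;
}

/* O(log n) pop of the newest commit, matching a date-descending walk. */
static struct commit *heap_pop(struct commit_heap *h)
{
	struct commit *top, *last;
	int i = 0, child;

	if (!h->nr)
		return NULL;
	top = h->c[0];
	last = h->c[--h->nr];
	while ((child = 2 * i + 1) < h->nr) {
		if (child + 1 < h->nr &&
		    h->c[child + 1]->date > h->c[child]->date)
			child++;
		if (last->date >= h->c[child]->date)
			break;
		h->c[i] = h->c[child];
		i = child;
	}
	h->c[i] = last;
	return top;
}

Pushing all N tagged commits would then cost O(N log N) overall instead
of O(N^2) list insertion, though as Peff says, the real cost is
converting every commit_list user.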
Andreas Ericsson
2011-06-13 07:08:09 UTC
Permalink
Post by Shawn Pearce
Post by Andreas Ericsson
Post by Shawn Pearce
A lot of operations toss every commit that a reference points at into
the revision walker's LRU queue. If you have a tag pointing to every
commit, then the entire project history enters the LRU queue at once,
up front. That queue is managed with O(N^2) insertion time. And the
entire queue has to be filled before anything can be output.
Hmm. Since we're using pre-hashed data with an obvious lookup method
we should be able to do much, much better than O(n^2) for insertion
and better than O(n) for worst-case lookups. I'm thinking a 1-byte
trie, resulting in a depth, lookup and insertion complexity of 20. It
would waste some memory but it might be worth it for fixed asymptotic
complexity for both insertion and lookup.
Not really.
The queue isn't sorting by SHA-1. It's sorting by commit timestamp,
descending. Those aren't pre-hashed. The O(N^2) insertion is because
the code is trying to find where this commit belongs in the list of
commits as sorted by commit timestamp.
Hmm. We should still be able to do better than that, and particularly
for the "tag-each-commit" workflow. Since it's most likely those tags
are generated using incrementing numbers, we could have a cut-off where
we first parse all the refs and make an optimistic assumption that an
alphabetical sort of the refs provides a map of insertion-points for
the commits. Since the best case behaviour is still O(1) for insertion
sort and it's unlikely that thousands of refs are in random order, that
should cause the vast majority of the refs we insert to follow the best
case scenario.

This will fall on its arse when people start doing hg-ref -> git-commit
tags of course, but that doesn't seem to be happening, or at least not
to the same extent as with the svn-revision -> git-commit mapping.

We're still not improving the asymptotic complexity, but it's a pretty
safe bet that for the vast majority of cases we'd improve wallclock
runtime by a hefty amount with relatively minor effort.
--
Andreas Ericsson ***@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.
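One cheap variant of what Andreas suggests (purely illustrative, not
code from the thread) is to remember the previous insertion point, so
runs of refs whose commits arrive in date order insert in O(1):

struct commit {
	struct commit *next;	/* list link, date-descending */
	unsigned long date;
};

static struct commit *list_head;
static struct commit *last_pos;	/* hint: previous insertion point */

static void insert_by_date_hinted(struct commit *c)
{
	struct commit **p;

	/* Fast path: resume from the hint if the new commit can only
	   belong at or after it in the date-descending list. */
	if (last_pos && last_pos->date >= c->date)
		p = &last_pos->next;
	else
		p = &list_head;

	while (*p && (*p)->date > c->date)
		p = &(*p)->next;
	c->next = *p;
	*p = c;
	last_pos = c;
}

For git-svn's refs/tags/rNNNNNN, where ref order tracks commit-date
order, almost every insertion would hit one of the O(1) paths, without
changing the worst-case behaviour for unordered input.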
Jakub Narebski
2011-06-09 11:18:09 UTC
Permalink
Post by NAKAMURA Takumi
Hello, Git. It is my 1st post here.
I have tried tagging each commit as "refs/tags/rXXXXXX" on git-svn
repo locally. (over 100k refs/tags.)
[...]

That's insane. You would do much better to mark each commit with a
note. Notes are designed to be scalable. See e.g. this thread:

[RFD] Proposal for git-svn: storing SVN metadata (git-svn-id) in notes
http://article.gmane.org/gmane.comp.version-control.git/174657
--
Jakub Narebski
Poland
ShadeHawk on #git
Stephen Bash
2011-06-09 15:42:30 UTC
Permalink
----- Original Message -----
Sent: Thursday, June 9, 2011 7:18:09 AM
Subject: Re: Git is not scalable with too many refs/*
Post by NAKAMURA Takumi
Hello, Git. It is my 1st post here.
I have tried tagging each commit as "refs/tags/rXXXXXX" on git-svn
repo locally. (over 100k refs/tags.)
[...]
That's insane. You would do much better to mark each commit with a
note. Notes are designed to be scalable. See e.g. this thread:
[RFD] Proposal for git-svn: storing SVN metadata (git-svn-id) in notes
http://article.gmane.org/gmane.comp.version-control.git/174657
As a reformed SVN user (i.e. not using it anymore ;]) I agree that 100k tags seems crazy, but I was contemplating doing the exact same thing as Takumi. Skimming that thread, I didn't see the key point (IMO): notes can map from commits to a "name" (or other information), tags map from a "name" to commits.

I've seen two different workflows develop:
1) Hacking on some code in Git, the programmer finds something wrong. Using Git tools he can pickaxe/bisect/etc. and find that the problem traces back to a commit imported from Subversion.
2) The programmer finds something wrong, asks a coworker, the coworker says "see bug XYZ", and bug XYZ says "Fixed in r20356".

I agree notes are the right answer for (1), but for (2) you really want a cross-reference table from Subversion rev number to Git commit.

In our office we created the cross-reference table once by walking the Git tree and storing it as a file (we had some degenerate cases where one SVN rev mapped to multiple Git commits, but I don't remember the details), but it's not really usable from Git. Lightweight tags would be an awesome solution (if they worked). Perhaps a custom subcommand is a reasonable middle ground.

Thanks,
Stephen