Discussion:
missing object on git-gc
Yossi Leybovich
2007-11-08 00:59:02 UTC
Permalink
I am running the git-gc tool over my repository and get the following error:

git-gc
...
deltifying 3308 objects...
error: corrupt loose object '<sha1>'
fatal: object <sha1> cannot be read .
error: failed to run repack

where <sha1> is a 40-character number.

Does anyone know how I can solve this issue?

Thanks
Yossi
Christian Couder
2007-11-08 05:03:40 UTC
Permalink
There is an entry "How to fix a broken repository?" in the Git Faq on the
wiki:

http://git.or.cz/gitwiki/GitFaq#head-ac11406480d09e2df98588e800e41b7256602074

Maybe it can help you.

The same topic has been discussed at least 3 times on the mailing list.
By the way, if you find these discussions on gmane, please tell us so that
we can add the links to the FAQ entry. (You can also add them yourself.)

Thanks,
Christian.
Post by Yossi Leybovich
I am running the git-gc tool over my repository and get the following
error:
git-gc
...
deltifying 3308 objects...
error: corrupt loose object '<sha1>'
fatal: object <sha1> cannot be read .
error: failed to run repack

when sha1 is 40 bytes number

Does any one know how I can solve thus issue?

Thanks
YOssi
Yossi Leybovich
2007-11-08 23:59:47 UTC
Permalink
Hi

I wonder if someone can help with this error.
I tried to run git-gc and got an error about a corrupted object.

I did the following:

$ git-gc
Generating pack...
Done counting 3037 objects.
Deltifying 3037 objects...
error: corrupt loose object '4b9458b3786228369c63936db65827de3cc06200'
fatal: object 4b9458b3786228369c63936db65827de3cc06200 cannot be read
error: failed to run repack

***@SLEYBO-LT /w/work/EMC/ib.071030.001/ib
$ cd .git/objects/4b/

***@SLEYBO-LT /w/work/EMC/ib.071030.001/ib/.git/objects/4b
$ git-fsck-objects.exe 9458b3786228369c63936db65827de3cc06200
error: corrupt loose object '4b9458b3786228369c63936db65827de3cc06200'
error: 4b9458b3786228369c63936db65827de3cc06200: object corrupt or missing
error: invalid parameter: expected sha1, got '9458b3786228369c63936db65827de3cc06200'
missing blob 4b9458b3786228369c63936db65827de3cc06200

Unfortunately I can't get this object from backup directories as advised
in the FAQ.
Can this object be fixed manually by any tool? (The object is attached.) I
don't even know which file/tree/commit it belongs to. How can I find that
out?

Thanks
Yossi
Christian Couder
2007-11-09 05:13:17 UTC
Permalink
Unfortunately I can't get this object from backup directories as advise
in the FAQ.
Can this object manually fixed by any tool? (the object is attached) I
don't even know to which file/tree/commit it belong how can I know that
?
Could you try something like:

git-cat-file -p 4b9458b3786228369c63936db65827de3cc06200

in your repository ?

Thanks,
Christian.
Yossi Leybovich
2007-11-09 12:16:39 UTC
Permalink
Just tried it :

***@SLEYBO-LT /w/work/EMC/ib.071030.001/ib
$ git-cat-file.exe -p 4b9458b3786228369c63936db65827de3cc06200
error: corrupt loose object '4b9458b3786228369c63936db65827de3cc06200'
fatal: Cannot read object 4b9458b3786228369c63936db65827de3cc06200

Does this tell us anything?

Yossi
Junio C Hamano
2007-11-09 17:45:56 UTC
Permalink
Post by Yossi Leybovich
$ git-cat-file.exe -p 4b9458b3786228369c63936db65827de3cc06200
error: corrupt loose object '4b9458b3786228369c63936db65827de3cc06200'
fatal: Cannot read object 4b9458b3786228369c63936db65827de3cc06200
Is this say something ?
Linus gave a good description of how to diagnose and assess the
extent of damage and potentially recover, so I won't repeat it,
but I am more interested in understanding how the object got
corrupted in the first place.

One thing the above says, with the .exe extension, is that you
are using it on some DOS derived platform. Is this Cygwin? Is
this WinGit? Is this (infamous) "text mount"?
Alex Riesen
2007-11-09 08:10:35 UTC
Permalink
Post by Yossi Leybovich
I wonder if someone can help in this error
I tried to do git-gc and got error on corrupted object.
$ git-gc
Generating pack...
Done counting 3037 objects.
Deltifying 3037 objects...
error: corrupt loose object '4b9458b3786228369c63936db65827de3cc06200'
It is loose. Nothing uses it in this repository. What do you need to
repair it for?
Post by Yossi Leybovich
fatal: object 4b9458b3786228369c63936db65827de3cc06200 cannot be read
error: failed to run repack
$ cd .git/objects/4b/
$ git-fsck-objects.exe 9458b3786228369c63936db65827de3cc06200
error: corrupt loose object '4b9458b3786228369c63936db65827de3cc06200'
error: 4b9458b3786228369c63936db65827de3cc06200: object corrupt or
missing
error: invalid parameter: expected sha1, got
'9458b3786228369c63936db65827de3cc06200'
missing blob 4b9458b3786228369c63936db65827de3cc06200
The directories directly under .git/objects are named after the first byte
of the sha1 (its first two hex digits), to use the filesystem in a more
efficient way. git-fsck expects a full sha1 (or a reference).

Try moving the corrupt object (with its *whole* name) some place else and
run git-fsck --all.
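
As a rough illustration of the layout described above, here is a minimal shell sketch that maps an object name to the path of its loose object file. The helper name loose_object_path is made up for this example and is not a git command, and the default git directory of .git is an assumption.

```shell
#!/bin/sh
# Sketch: map a 40-hex-digit object name to its loose-object path.
# loose_object_path is a hypothetical helper name, not a git command.
loose_object_path() {
    sha=$1
    gitdir=${2:-.git}     # assumed location of the git directory
    # The first two hex digits name the directory, the remaining 38 the file.
    dir=$(printf '%s' "$sha" | cut -c1-2)
    file=$(printf '%s' "$sha" | cut -c3-)
    printf '%s/objects/%s/%s\n' "$gitdir" "$dir" "$file"
}

loose_object_path 4b9458b3786228369c63936db65827de3cc06200
```

This is why passing only the 38-character file name to git-fsck-objects is rejected: the object name is the directory name and the file name taken together.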
Yossi Leybovich
2007-11-09 12:23:11 UTC
Permalink
Hi

I know it's loose, but I still think there are references in the
repository to this object.
How can I remove it from the repository?


***@SLEYBO-LT /w/work/EMC/ib.071030.001/ib
$ mv .git/objects/4b ..

***@SLEYBO-LT /w/work/EMC/ib.071030.001/ib
$ git-fsck --full
broken link from tree ca8022a21a064d075d71a342744e584024fd2782
to blob 4b920d658a05a66a9d18dd34b51d6e3a9f229ce1
broken link from commit 2ca27acf05f1631586718b68ce43f0a0400e1f9b
to commit 4b1aabfe3ecc12007535369a2ba17bcee776df64
dangling commit 0d43a63623237385e432572bf61171713dcd8e98
dangling tree b303c073c5d6c30de761a5ecce39ab30da81e98a
dangling tree f3c333f9756e824e6b51e585d734e410790e7dc5
dangling tree 10a4688d94ab6b1fb1bb3aee7e77255a0e41ae94
broken link from tree eea47bf0788a38ac0988de26eddafa8d60caaa58
to blob 4b920d658a05a66a9d18dd34b51d6e3a9f229ce1
broken link from commit 06858a6c8d5a6b1ffbc203057d023c48567dd83e
to tree 4b89da873ce6e1b36a818d70d4665b3074f2354c
dangling commit 4fc6b1127e4a7f4ff5b65a2dd8a90779b5aff3e0
dangling commit 7da607374fe2b1ae09228d2035dd608c73dad7c8
missing tree 4b89da873ce6e1b36a818d70d4665b3074f2354c
broken link from tree 380b2b78d10136cc2b6e1578f4906fccb3e432b1
to blob 4b920d658a05a66a9d18dd34b51d6e3a9f229ce1

Thanks
Yossi
Andreas Ericsson
2007-11-09 12:56:45 UTC
Permalink
Post by Yossi Leybovich
Hi
I know its loose but still I think there are references in the
repository to this object.
How I can remove it from the repository ?
$ mv .git/objects/4b ..
That was not a very good idea. You just moved ALL objects whose hash
begins with 4b out of the object database.

Try only moving the offending file out of the 4b directory.
Post by Yossi Leybovich
$ git-fsck --full
to blob 4b920d658a05a66a9d18dd34b51d6e3a9f229ce1
to commit 4b1aabfe3ecc12007535369a2ba17bcee776df64
to blob 4b920d658a05a66a9d18dd34b51d6e3a9f229ce1
to tree 4b89da873ce6e1b36a818d70d4665b3074f2354c
to blob 4b920d658a05a66a9d18dd34b51d6e3a9f229ce1
Notice how all of these start with 4b? Move the directory back and
get rid of just the object that causes errors.
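
A minimal sketch of moving only the one offending file, while keeping it under its full 40-digit name so it can still be identified later. The helper name quarantine_object and the destination directory are assumptions made for this example; nothing here is a git command.

```shell
#!/bin/sh
# Sketch: move ONE loose object out of the object database and keep it under
# its full 40-digit name. quarantine_object is a hypothetical helper name.
quarantine_object() {
    sha=$1                # full object name, e.g. 4b9458b3...
    dest=$2               # directory that receives the saved copy
    gitdir=${3:-.git}     # assumed location of the git directory
    dir=$(printf '%s' "$sha" | cut -c1-2)
    file=$(printf '%s' "$sha" | cut -c3-)
    src="$gitdir/objects/$dir/$file"
    if [ ! -f "$src" ]; then
        echo "no loose object at $src" >&2
        return 1
    fi
    # Only this one file moves; the rest of objects/4b stays in place.
    mkdir -p "$dest" && mv "$src" "$dest/$sha"
}

# Usage (the destination directory is an arbitrary choice):
#   quarantine_object 4b9458b3786228369c63936db65827de3cc06200 ../corrupt-objects
```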
--
Andreas Ericsson ***@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
Nicolas Pitre
2007-11-09 16:17:33 UTC
Permalink
Post by Alex Riesen
Post by Yossi Leybovich
I wonder if someone can help in this error
I tried to do git-gc and got error on corrupted object.
$ git-gc
Generating pack...
Done counting 3037 objects.
Deltifying 3037 objects...
error: corrupt loose object '4b9458b3786228369c63936db65827de3cc06200'
It is loose. Nothing uses it in this repository.
Very wrong. Loose object != unreachable object.


Nicolas
Yossi Leybovich
2007-11-09 13:38:39 UTC
Permalink
Post by Andreas Ericsson
Post by Yossi Leybovich
Hi
I know its loose but still I think there are references in the
repository to this object.
How I can remove it from the repository ?
That was not a very good idea. You just moved ALL objects whose hash
begin with 4b out of the object database.
Try only moving the offending file out of the 4b directory.
That did not help; the repository still looks for this object.
Does anyone know how I can track this object down and find out which file it is?



ib]$ mv .git/objects/4b/9458b3786228369c63936db65827de3cc06200 ../

ib]$ git-fsck --full
dangling commit 0d43a63623237385e432572bf61171713dcd8e98
dangling commit 4fc6b1127e4a7f4ff5b65a2dd8a90779b5aff3e0
dangling commit 7da607374fe2b1ae09228d2035dd608c73dad7c8
dangling commit 004ef09ae022c60a30f9cd61f90d18df5db3628e
dangling commit 85112c6fabb6b8913ab244a8645d67380616eba6
broken link from tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
to blob 4b9458b3786228369c63936db65827de3cc06200
missing blob 4b9458b3786228369c63936db65827de3cc06200
dangling commit bd98481afa93356fa6daa4b6f88c4e631ae2fd72
dangling commit e81e3d2c9c25e5bf5b31327b10b23f9bd0a6d056
dangling commit 92ff9b8cbc771345c9cde0c7fef2c23bb79242b9
Andreas Ericsson
2007-11-09 13:46:10 UTC
Permalink
Post by Yossi Leybovich
Post by Andreas Ericsson
Post by Yossi Leybovich
Hi
I know its loose but still I think there are references in the
repository to this object.
How I can remove it from the repository ?
That was not a very good idea. You just moved ALL objects whose hash
begin with 4b out of the object database.
Try only moving the offending file out of the 4b directory.
Did not help still the repository look for this object?
Any one know how can I track this object and understand which file is it
Is this a super-secret project, or can you make a tarball of the .git
directory and send it to me? Trying to track down the cause through
email is decidedly slow.
Post by Yossi Leybovich
ib]$ mv .git/objects/4b/9458b3786228369c63936db65827de3cc06200 ../
ib]$ git-fsck --full
dangling commit 0d43a63623237385e432572bf61171713dcd8e98
dangling commit 4fc6b1127e4a7f4ff5b65a2dd8a90779b5aff3e0
dangling commit 7da607374fe2b1ae09228d2035dd608c73dad7c8
dangling commit 004ef09ae022c60a30f9cd61f90d18df5db3628e
dangling commit 85112c6fabb6b8913ab244a8645d67380616eba6
broken link from tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
to blob 4b9458b3786228369c63936db65827de3cc06200
One tree uses the object. I'm not sure if any commit-objects
use the tree. Try

for b in $(git branch --no-color -a | cut -b3-); do
for rev in $(git rev-list HEAD); do
git ls-tree -r $rev | grep -q 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
test $? -eq 0 && echo $rev && break
done
done

If it turns up empty, you *should* be able to safely delete
2d9263c6d23595e7cb2a21e5ebbb53655278dff8 and
4b9458b3786228369c63936db65827de3cc06200

Make sure to take a backup first though.
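
A variant of the loop above, offered only as a sketch. It differs in two ways: it walks every ref rather than HEAD each time, and it passes -t to git ls-tree, because with -r alone only blob entries are listed and a tree name would never match. The function name commits_using_tree is made up for this example.

```shell
#!/bin/sh
# Sketch: print "<ref> <commit>" for the first commit found on each ref
# (walking back from the tip) whose tree contains the given tree object.
# commits_using_tree is a hypothetical helper name.
commits_using_tree() {
    tree=$1
    for ref in $(git for-each-ref --format='%(refname)'); do
        for rev in $(git rev-list "$ref" 2>/dev/null); do
            # The tree may be the commit's top-level tree, or a subtree;
            # -t makes "ls-tree -r" list tree entries as well as blobs.
            if [ "$(git rev-parse "$rev^{tree}")" = "$tree" ] ||
               git ls-tree -r -t "$rev" | grep -q "$tree"; then
                echo "$ref $rev"
                break
            fi
        done
    done
}

# Usage, inside the repository:
#   commits_using_tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
```

An empty result only says that no commit reachable from a ref uses the tree; reflog entries and other trees may still point to it.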
--
Andreas Ericsson ***@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
Yossi Leybovich
2007-11-09 15:01:29 UTC
Permalink
Post by Andreas Ericsson
Is this a super-secret project or you can make a tarball of the .git
directory and send it to me? Trying to track down the cause through
email is decidedly slow.
Actually yes. I am not sure I can send the repository; I will
check that further.
Post by Andreas Ericsson
One tree uses the object. I'm not sure if any commit-objects
use the tree. Try
for b in $(git branch --no-color -a | cut -b3-); do
for rev in $(git rev-list HEAD); do
git ls-tree -r $rev | grep -q 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
test $? -eq 0 && echo $rev && break
done
done
I tried this and it returned nothing:

[***@mellanox-compile ib]$
[***@mellanox-compile ib]$ for b in $(git branch --no-color -a |
cut -b3-); do
Post by Andreas Ericsson
for rev in $(git rev-list HEAD); do
git ls-tree -r $rev | grep -q 2d9263c6d23595e7cb2a21e5ebbb53655278dff8;
test $? -eq 0 && echo $rev && break;
done; done
[***@mellanox-compile ib]$
[***@mellanox-compile ib]$

[BTW I noticed that you don't use the b variable, so I also tried
git rev-list $b, but the result was still empty.]
I also tried to remove the object and the tree, and apparently other trees
and commits reference these objects:

mv ../9458b3786228369c63936db65827de3cc06200 ../4b/
mv: cannot stat `../9458b3786228369c63936db65827de3cc06200': No such file or directory
[***@mellanox-compile ib]$ mv .git/objects/4b/9458b3786228369c63936db65827de3cc06200 ../4b/
[***@mellanox-compile ib]$ mv .git/objects/2d/9263c6d23595e7cb2a21e5ebbb53655278dff8 ../2d/
[***@mellanox-compile ib]$ git-fsck --full
broken link from tree e5a0044c4ccae7635f07414c1f155bac72d25fd9
to tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
dangling commit 0d43a63623237385e432572bf61171713dcd8e98
dangling commit 4fc6b1127e4a7f4ff5b65a2dd8a90779b5aff3e0
dangling commit 7da607374fe2b1ae09228d2035dd608c73dad7c8
dangling commit 004ef09ae022c60a30f9cd61f90d18df5db3628e
broken link from tree 8bd00402b2a20024f4556107b8a729b0205657db
to tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
dangling commit 85112c6fabb6b8913ab244a8645d67380616eba6
missing tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
dangling commit bd98481afa93356fa6daa4b6f88c4e631ae2fd72
dangling commit e81e3d2c9c25e5bf5b31327b10b23f9bd0a6d056
dangling commit 92ff9b8cbc771345c9cde0c7fef2c23bb79242b9
Post by Andreas Ericsson
If it turns up empty, you *should* be able to safely delete
2d9263c6d23595e7cb2a21e5ebbb53655278dff8 and
4b9458b3786228369c63936db65827de3cc06200
Make sure to take a backup first though.
A lot of commits and trees point to these objects.
Johannes Sixt
2007-11-09 15:34:38 UTC
Permalink
[about corrupt loose object '4b9458b3786228369c63936db65827de3cc06200']
You can try to create a clone (after you have fixed up the artificial
breakages that you made). If that goes well, then the bad object is
referenced only from reflogs.

-- Hannes
Yossi Leybovich
2007-11-09 15:53:46 UTC
Permalink
Post by Johannes Sixt
[about corrupt loose object '4b9458b3786228369c63936db65827de3cc06200']
You can try to create a clone (after you have fixed up the artificial
breakages that you made). If that goes well, then the bad object is
referenced only from reflogs.
git clone ib ib-clone
Initialized empty Git repository in /home/mellanox/work/symm/ib-clone/.git/
0 blocks



[***@mellanox-compile ib-clone]$ cd ib-clone/

[***@mellanox-compile ib-clone]$ git branch -a
* mlx4
origin/HEAD
origin/master
origin/mlx4
origin/mlx4-work
origin/mthca
origin/second_port


[***@mellanox-compile ib-clone]$ git-gc
Generating pack...
Done counting 3288 objects.
Deltifying 3288 objects...
error: corrupt loose object '4b9458b3786228369c63936db65827de3cc06200'
fatal: object 4b9458b3786228369c63936db65827de3cc06200 cannot be read
error: failed to run repack


So I still can't pack my repository.
Post by Johannes Sixt
-- Hannes
Johannes Sixt
2007-11-09 16:03:23 UTC
Permalink
Post by Yossi Leybovich
Post by Johannes Sixt
[about corrupt loose object '4b9458b3786228369c63936db65827de3cc06200']
You can try to create a clone (after you have fixed up the artificial
breakages that you made). If that goes well, then the bad object is
referenced only from reflogs.
git clone ib ib-clone
Initialized empty Git repository in /home/mellanox/work/symm/ib-clone/.git/
0 blocks
Make this:

git clone file:///home/mellanox/work/symm/ib ib-clone

otherwise you get a hard-linked identical copy, but you want to use the git
protocol to create the clone.
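
A small sketch of building such a URL from a directory name, on the assumption that a file:// URL should carry an absolute path as in the command above. The helper name file_url is made up for this example.

```shell
#!/bin/sh
# Sketch: turn a (possibly relative) directory into a file:// URL.
# file_url is a hypothetical helper name.
file_url() {
    # Resolve to an absolute path inside a subshell so the caller's
    # working directory is left alone.
    abs=$(cd "$1" && pwd) || return 1
    printf 'file://%s\n' "$abs"
}

# Usage (ib and ib-clone are the directory names from this thread):
#   git clone "$(file_url ib)" ib-clone
```

Cloning this way makes the source side build a pack from its objects, so an unreadable object that is part of the history shows up as an error instead of being hard-linked into the copy.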

-- Hannes
Nicolas Pitre
2007-11-09 16:03:35 UTC
Permalink
Post by Yossi Leybovich
Post by Johannes Sixt
[about corrupt loose object '4b9458b3786228369c63936db65827de3cc06200']
You can try to create a clone (after you have fixed up the artificial
breakages that you made). If that goes well, then the bad object is
referenced only from reflogs.
git clone ib ib-clone
Initialized empty Git repository in /home/mellanox/work/symm/ib-clone/.git/
0 blocks
Please try "file://ib" instead. Otherwise the clone will only hardlink
files to the original repository.


Nicolas
Yossi Leybovich
2007-11-09 16:31:51 UTC
Permalink
Post by Nicolas Pitre
Post by Yossi Leybovich
Post by Johannes Sixt
[about corrupt loose object '4b9458b3786228369c63936db65827de3cc06200']
You can try to create a clone (after you have fixed up the artificial
breakages that you made). If that goes well, then the bad object is
referenced only from reflogs.
git clone ib ib-clone
Initialized empty Git repository in /home/mellanox/work/symm/ib-clone/.git/
0 blocks
Please try "file://ib" instead. Otherwise the clone will only hardlink
files to the original repository.
And again the corruption pops up, so the clone does not help:


[***@mellanox-compile ib]$ git clone file://ib ib-clone
Initialized empty Git repository in /home/mellanox/work/symm/ib-clone/.git/
remote: Generating pack...
remote: Counting objects: 276
Done counting 3288 objects.
remote: Deltifying 3288 objects...
remote: error: remote: corrupt loose object
'4b9458b3786228369c63936db65827de3cc06200'remote:
remote: fatal: remote: object 4b9458b3786228369c63936db65827de3cc06200
cannot be readremote:
error: git-upload-pack: git-pack-objects died with error.
remote: aborting due to possible repository corruption on the remote side.
fatal: early EOF
fatal: index-pack died with error code 128
fetch-pack from 'file://ib' failed.
fatal: git-upload-pack: aborting due to possible repository corruption
on the remote side.
Post by Nicolas Pitre
Nicolas
Nicolas Pitre
2007-11-09 16:52:17 UTC
Permalink
Post by Yossi Leybovich
Post by Nicolas Pitre
Post by Yossi Leybovich
Post by Johannes Sixt
[about corrupt loose object '4b9458b3786228369c63936db65827de3cc06200']
You can try to create a clone (after you have fixed up the artificial
breakages that you made). If that goes well, then the bad object is
referenced only from reflogs.
git clone ib ib-clone
Initialized empty Git repository in /home/mellanox/work/symm/ib-clone/.git/
0 blocks
Please try "file://ib" instead. Otherwise the clone will only hardlink
files to the original repository.
And agian the corruption pop up again , so clone does not help
OK that means that the object is really part of your active history.

Linus just posted a nice summary of your only option left. If you
manage to recreate the damaged object then it would be nice of you if
you could provide us with both the bad and the good one for analysis.


Nicolas
Linus Torvalds
2007-11-09 16:28:38 UTC
Permalink
Post by Yossi Leybovich
Did not help still the repository look for this object?
Any one know how can I track this object and understand which file is it
So exactly *because* the SHA1 hash is cryptographically secure, the hash
itself doesn't actually tell you anything, in order to fix a corrupt
object you basically have to find the "original source" for it.

The easiest way to do that is almost always to have backups, and find the
same object somewhere else. Backups really are a good idea, and git makes
it pretty easy (if nothing else, just clone the repository somewhere else,
and make sure that you do *not* use a hard-linked clone, and preferably
not the same disk/machine).

But since you don't seem to have backups right now, the good news is that
especially with a single blob being corrupt, these things *are* somewhat
debuggable.

First off, move the corrupt object away, and *save* it. The most common
cause of corruption so far has been memory corruption, but even so, there
are people who would be interested in seeing the corruption - but it's
basically impossible to judge the corruption until we can also see the
original object, so right now the corrupt object is useless, but it's very
interesting for the future, in the hope that you can re-create a
non-corrupt version.
Post by Yossi Leybovich
ib]$ mv .git/objects/4b/9458b3786228369c63936db65827de3cc06200 ../
This is the right thing to do, although it's usually best to save it under
its full SHA1 name (you just dropped the "4b" from the result ;).
Post by Yossi Leybovich
ib]$ git-fsck --full
broken link from tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
to blob 4b9458b3786228369c63936db65827de3cc06200
missing blob 4b9458b3786228369c63936db65827de3cc06200
Ok, I removed the "dangling commit" messages, because they are just
messages about the fact that you probably have rebased etc, so they're not
at all interesting. But what remains is still very useful. In particular,
we now know which tree points to it!

Now you can do

git ls-tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8

which will show something like

100644 blob 8d14531846b95bfa3564b58ccfb7913a034323b8 .gitignore
100644 blob ebf9bf84da0aab5ed944264a5db2a65fe3a3e883 .mailmap
100644 blob ca442d313d86dc67e0a2e5d584b465bd382cbf5c COPYING
100644 blob ee909f2cc49e54f0799a4739d24c4cb9151ae453 CREDITS
040000 tree 0f5f709c17ad89e72bdbbef6ea221c69807009f6 Documentation
100644 blob 1570d248ad9237e4fa6e4d079336b9da62d9ba32 Kbuild
100644 blob 1c7c229a092665b11cd46a25dbd40feeb31661d9 MAINTAINERS
...

and you should now have a line that looks like

100644 blob 4b9458b3786228369c63936db65827de3cc06200 my-magic-file

in the output. This already tells you a *lot*: it tells you what file the
corrupt blob came from!

Now, it doesn't tell you quite enough, though: it doesn't tell what
*version* of the file didn't get correctly written! You might be really
lucky, and it may be the version that you already have checked out in your
working tree, in which case fixing this problem is really simple, just do

git hash-object -w my-magic-file

again, and if it outputs the missing SHA1 (4b945..) you're now all done!
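
To see why the right file content reproduces exactly the missing name, here is a hedged sketch that computes a blob's object name by hand: the SHA-1 of the header "blob <size>" followed by a NUL byte and then the file contents. It assumes a sha1sum program is available (on some systems the equivalent is shasum), and blob_sha1 is a made-up helper name. Running git hash-object without -w prints the same value without writing anything.

```shell
#!/bin/sh
# Sketch: compute the object name git gives to a file stored as a blob.
# blob_sha1 is a hypothetical helper; sha1sum availability is an assumption.
blob_sha1() {
    # Size in bytes; tr strips the padding some wc implementations add.
    size=$(wc -c < "$1" | tr -d ' ')
    # Hash the header "blob <size>\0" followed by the raw file contents.
    { printf 'blob %s\0' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}

# Usage: compare a candidate against the missing object name before writing.
#   [ "$(blob_sha1 my-magic-file)" = 4b9458b3786228369c63936db65827de3cc06200 ]
```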

But that's the really lucky case, so let's assume that it was some older
version that was broken. How do you tell which version it was?

The easiest way to do it is to do

git log --raw --all --full-history -- subdirectory/my-magic-file

and that will show you the whole log for that file (please realize that
the tree you had may not be the top-level tree, so you need to figure out
which subdirectory it was in on your own), and because you're asking for
raw output, you'll now get something like

commit abc
Author:
Date:
..
:100644 100644 4b9458b... newsha... M somedirectory/my-magic-file


commit xyz
Author:
Date:

..
:100644 100644 oldsha... 4b9458b... M somedirectory/my-magic-file

and this actually tells you what the *previous* and *subsequent* versions
of that file were! So now you can look at those ("oldsha" and "newsha"
respectively), and hopefully you have done commits often, and can
re-create the missing my-magic-file version by looking at those older and
newer versions!

If you can do that, you can now recreate the missing object with

git hash-object -w <recreated-file>

and your repository is good again!
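
When several reconstructed candidates are being tried, it may help to write an object only once its name matches. A minimal sketch, assuming it is run inside the damaged repository; restore_if_match is a made-up helper name.

```shell
#!/bin/sh
# Sketch: write a candidate file into the object database only if it hashes
# to the wanted object name. restore_if_match is a hypothetical helper name.
restore_if_match() {
    wanted=$1       # the missing object name reported by git-fsck
    candidate=$2    # a file that might hold the original contents
    # Without -w, hash-object only prints the name and writes nothing.
    actual=$(git hash-object "$candidate") || return 1
    if [ "$actual" = "$wanted" ]; then
        git hash-object -w "$candidate" >/dev/null && echo "restored $wanted"
    else
        echo "no match: $candidate hashes to $actual" >&2
        return 1
    fi
}

# Usage:
#   restore_if_match 4b9458b3786228369c63936db65827de3cc06200 my-magic-file
```

The neighbouring versions ("oldsha" and "newsha") can be dumped with git cat-file -p to serve as starting points for the candidate.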

(Btw, you could have ignored the fsck, and started with doing a

git log --raw --all

and just looked for the sha of the missing object (4b9458b..) in that
whole thing. It's up to you - git does *have* a lot of information, it is
just missing one particular blob version.)
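
Searching the whole log by eye gets tedious, so here is a sketch that filters it. It adds --no-abbrev so that the raw lines carry full object names, and prints each commit together with the raw line that mentions the object. The function name commits_touching_blob is made up for this example.

```shell
#!/bin/sh
# Sketch: list commits whose raw diff mentions a given object name, either
# as the old or as the new side. commits_touching_blob is a hypothetical name.
commits_touching_blob() {
    git log --all --raw --no-abbrev --format='commit %H' |
    awk -v sha="$1" '
        /^commit / { commit = $2; next }          # remember the current commit
        index($0, sha) { print commit; print $0 } # raw line naming the object
    '
}

# Usage, inside the repository:
#   commits_touching_blob 4b9458b3786228369c63936db65827de3cc06200
```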

Trying to recreate trees and especially commits is *much* harder. So you
were lucky that it's a blob. It's quite possible that you can recreate the
thing.

Linus
Nicolas Pitre
2007-11-09 17:28:19 UTC
Permalink
Extracted from a post by Linus on the mailing list.

Signed-off-by: Nicolas Pitre <***@cam.org>
---
Post by Linus Torvalds
But since you don't seem to have backups right now, the good news is that
especially with a single blob being corrupt, these things *are* somewhat
debuggable.
I was in the process of writing a similar message, but Linus was quicker
and his version is actually much nicer. Certainly good howto material.

diff --git a/Documentation/howto/recover-corrupted-blob-object.txt b/Documentation/howto/recover-corrupted-blob-object.txt
new file mode 100644
index 0000000..9b6853c
--- /dev/null
+++ b/Documentation/howto/recover-corrupted-blob-object.txt
@@ -0,0 +1,134 @@
+Date: Fri, 9 Nov 2007 08:28:38 -0800 (PST)
+From: Linus Torvalds <***@linux-foundation.org>
+Subject: corrupt object on git-gc
+Abstract: Some tricks to reconstruct blob objects in order to fix
+ a corrupted repository.
+
+On Fri, 9 Nov 2007, Yossi Leybovich wrote:
+>
+> Did not help still the repository look for this object?
+> Any one know how can I track this object and understand which file is it
+
+So exactly *because* the SHA1 hash is cryptographically secure, the hash
+itself doesn't actually tell you anything, in order to fix a corrupt
+object you basically have to find the "original source" for it.
+
+The easiest way to do that is almost always to have backups, and find the
+same object somewhere else. Backups really are a good idea, and git makes
+it pretty easy (if nothing else, just clone the repository somewhere else,
+and make sure that you do *not* use a hard-linked clone, and preferably
+not the same disk/machine).
+
+But since you don't seem to have backups right now, the good news is that
+especially with a single blob being corrupt, these things *are* somewhat
+debuggable.
+
+First off, move the corrupt object away, and *save* it. The most common
+cause of corruption so far has been memory corruption, but even so, there
+are people who would be interested in seeing the corruption - but it's
+basically impossible to judge the corruption until we can also see the
+original object, so right now the corrupt object is useless, but it's very
+interesting for the future, in the hope that you can re-create a
+non-corrupt version.
+
+So:
+
+> ib]$ mv .git/objects/4b/9458b3786228369c63936db65827de3cc06200 ../
+
+This is the right thing to do, although it's usually best to save it under
+it's full SHA1 name (you just dropped the "4b" from the result ;).
+
+Let's see what that tells us:
+
+> ib]$ git-fsck --full
+> broken link from tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
+> to blob 4b9458b3786228369c63936db65827de3cc06200
+> missing blob 4b9458b3786228369c63936db65827de3cc06200
+
+Ok, I removed the "dangling commit" messages, because they are just
+messages about the fact that you probably have rebased etc, so they're not
+at all interesting. But what remains is still very useful. In particular,
+we now know which tree points to it!
+
+Now you can do
+
+ git ls-tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
+
+which will show something like
+
+ 100644 blob 8d14531846b95bfa3564b58ccfb7913a034323b8 .gitignore
+ 100644 blob ebf9bf84da0aab5ed944264a5db2a65fe3a3e883 .mailmap
+ 100644 blob ca442d313d86dc67e0a2e5d584b465bd382cbf5c COPYING
+ 100644 blob ee909f2cc49e54f0799a4739d24c4cb9151ae453 CREDITS
+ 040000 tree 0f5f709c17ad89e72bdbbef6ea221c69807009f6 Documentation
+ 100644 blob 1570d248ad9237e4fa6e4d079336b9da62d9ba32 Kbuild
+ 100644 blob 1c7c229a092665b11cd46a25dbd40feeb31661d9 MAINTAINERS
+ ...
+
+and you should now have a line that looks like
+
+ 10064 blob 4b9458b3786228369c63936db65827de3cc06200 my-magic-file
+
+in the output. This already tells you a *lot* it tells you what file the
+corrupt blob came from!
+
+Now, it doesn't tell you quite enough, though: it doesn't tell what
+*version* of the file didn't get correctly written! You might be really
+lucky, and it may be the version that you already have checked out in your
+working tree, in which case fixing this problem is really simple, just do
+
+ git hash-object -w my-magic-file
+
+again, and if it outputs the missing SHA1 (4b945..) you're now all done!
+
+But that's the really lucky case, so let's assume that it was some older
+version that was broken. How do you tell which version it was?
+
+The easiest way to do it is to do
+
+ git log --raw --all --full-history -- subdirectory/my-magic-file
+
+and that will show you the whole log for that file (please realize that
+the tree you had may not be the top-level tree, so you need to figure out
+which subdirectory it was in on your own), and because you're asking for
+raw output, you'll now get something like
+
+ commit abc
+ Author:
+ Date:
+ ..
+ :100644 100644 4b9458b... newsha... M somedirectory/my-magic-file
+
+
+ commit xyz
+ Author:
+ Date:
+
+ ..
+ :100644 100644 oldsha... 4b9458b... M somedirectory/my-magic-file
+
+and this actually tells you what the *previous* and *subsequent* versions
+of that file were! So now you can look at those ("oldsha" and "newsha"
+respectively), and hopefully you have done commits often, and can
+re-create the missing my-magic-file version by looking at those older and
+newer versions!
+
+If you can do that, you can now recreate the missing object with
+
+ git hash-object -w <recreated-file>
+
+and your repository is good again!
+
+(Btw, you could have ignored the fsck, and started with doing a
+
+ git log --raw --all
+
+and just looked for the sha of the missing object (4b9458b..) in that
+whole thing. It's up to you - git does *have* a lot of information, it is
+just missing one particular blob version.
+
+Trying to recreate trees and especially commits is *much* harder. So you
+were lucky that it's a blob. It's quite possible that you can recreate the
+thing.
+
+ Linus
Johannes Schindelin
2007-11-09 17:30:30 UTC
Permalink
Hi,
Post by Nicolas Pitre
Extracted from a post by Linus on the mailing list.
Heh. I was hoping that somebody would be quicker than me...

Ciao,
Dscho
J. Bruce Fields
2007-11-26 02:12:19 UTC
Permalink
Post by Nicolas Pitre
Extracted from a post by Linus on the mailing list.
I rearranged this some more and added it to the manual, assuming that
makes sense to everyone.

I think there needs to be some discussion of pack objects and stuff too
some day. I added a few mail archive references to the "todo" section.

--b.

commit d6e199cb6ff911e8e3e39c8b7021512a14ea79a5
Author: J. Bruce Fields <***@citi.umich.edu>
Date: Sat Mar 3 22:53:37 2007 -0500

user-manual: recovering from corruption

Some instructions on dealing with corruption of the object database.

Most of this text is from an example by Linus, identified by Nicolas
Pitre <***@cam.org> with a little further editing by me.

Signed-off-by: "J. Bruce Fields" <***@citi.umich.edu>

diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt
index c027353..3166fb6 100644
--- a/Documentation/user-manual.txt
+++ b/Documentation/user-manual.txt
@@ -1554,6 +1554,11 @@ This may be time-consuming. Unlike most other git operations (including
git-gc when run without any options), it is not safe to prune while
other git operations are in progress in the same repository.

+If gitlink:git-fsck[1] complains about sha1 mismatches or missing
+objects, you may have a much more serious problem; your best option is
+probably restoring from backups. See
+<<recovering-from-repository-corruption>> for a detailed discussion.
+
[[recovering-lost-changes]]
Recovering lost changes
~~~~~~~~~~~~~~~~~~~~~~~
@@ -3172,6 +3177,127 @@ confusing and scary messages, but it won't actually do anything bad. In
contrast, running "git prune" while somebody is actively changing the
repository is a *BAD* idea).

+[[recovering-from-repository-corruption]]
+Recovering from repository corruption
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+By design, git treats data trusted to it with caution. However, even in
+the absence of bugs in git itself, it is still possible that hardware or
+operating system errors could corrupt data.
+
+The first defense against such problems is backups. You can back up a
+git directory using clone, or just using cp, tar, or any other backup
+mechanism.
+
+As a last resort, you can search for the corrupted objects and attempt
+to replace them by hand. Back up your repository before attempting this
+in case you corrupt things even more in the process.
+
+We'll assume that the problem is a single missing or corrupted blob,
+which is sometimes a solvable problem. (Recovering missing trees and
+especially commits is *much* harder).
+
+Before starting, verify that there is corruption, and figure out where
+it is with gitlink:git-fsck[1]; this may be time-consuming.
+
+Assume the output looks like this:
+
+------------------------------------------------
+$ git-fsck --full
+broken link from tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
+ to blob 4b9458b3786228369c63936db65827de3cc06200
+missing blob 4b9458b3786228369c63936db65827de3cc06200
+------------------------------------------------
+
+(Typically there will be some "dangling object" messages too, but they
+aren't interesting.)
+
+Now you know that blob 4b9458b3 is missing, and that the tree 2d9263c6
+points to it. If you could find just one copy of that missing blob
+object, possibly in some other repository, you could move it into
+.git/objects/4b/9458b3... and be done. Suppose you can't. You can
+still examine the tree that pointed to it with gitlink:git-ls-tree[1],
+which might output something like:
+
+------------------------------------------------
+$ git ls-tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
+100644 blob 8d14531846b95bfa3564b58ccfb7913a034323b8 .gitignore
+100644 blob ebf9bf84da0aab5ed944264a5db2a65fe3a3e883 .mailmap
+100644 blob ca442d313d86dc67e0a2e5d584b465bd382cbf5c COPYING
+...
+100644 blob 4b9458b3786228369c63936db65827de3cc06200 myfile
+...
+------------------------------------------------
+
+So now you know that the missing blob was the data for a file named
+"myfile". And chances are you can also identify the directory--let's
+say it's in "somedirectory". If you're lucky the missing copy might be
+the same as the copy you have checked out in your working tree at
+"somedirectory/myfile"; you can test whether that's right with
+gitlink:git-hash-object[1]:
+
+------------------------------------------------
+$ git hash-object -w somedirectory/myfile
+------------------------------------------------
+
+which will create and store a blob object with the contents of
+somedirectory/myfile, and output the sha1 of that object. If you're
+extremely lucky it might be 4b9458b3786228369c63936db65827de3cc06200, in
+which case you've guessed right, and the corruption is fixed!
+
+Otherwise, you need more information. How do you tell which version of
+the file has been lost?
+
+The easiest way to do this is with:
+
+------------------------------------------------
+$ git log --raw --all --full-history -- somedirectory/myfile
+------------------------------------------------
+
+Because you're asking for raw output, you'll now get something like
+
+------------------------------------------------
+commit abc
+Author:
+Date:
+...
+:100644 100644 4b9458b... newsha... M somedirectory/myfile
+
+
+commit xyz
+Author:
+Date:
+
+...
+:100644 100644 oldsha... 4b9458b... M somedirectory/myfile
+------------------------------------------------
+
+This tells you that the immediately following version of the file was
+"newsha", and that the immediately preceding version was "oldsha".
+You also know the commit messages that went with the change from oldsha
+to 4b9458b and with the change from 4b9458b to newsha.
+
+If you've been committing small enough changes, you may now have a good
+shot at reconstructing the contents of the in-between state 4b9458b.
+
+If you can do that, you can now recreate the missing object with
+
+------------------------------------------------
+$ git hash-object -w <recreated-file>
+------------------------------------------------
+
+and your repository is good again!
+
+(Btw, you could have ignored the fsck, and started with doing a
+
+------------------------------------------------
+$ git log --raw --all
+------------------------------------------------
+
+and just looked for the sha of the missing object (4b9458b..) in that
+whole thing. It's up to you - git does *have* a lot of information, it is
+just missing one particular blob version.)
+
[[the-index]]
The index
-----------
@@ -4381,4 +4507,7 @@ Write a chapter on using plumbing and writing scripts.

Alternates, clone -reference, etc.

-git unpack-objects -r for recovery
+More on recovery from repository corruption. See:
+ http://marc.theaimsgroup.com/?l=git&m=117263864820799&w=2
+ http://marc.theaimsgroup.com/?l=git&m=117147855503798&w=2
+ http://marc.theaimsgroup.com/?l=git&m=117147855503798&w=2
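
The shortcut at the end of the new section (searching "git log --raw --all" for the missing id) can be sketched as a small shell helper. This is only an illustration: find_blob_refs is a hypothetical name, not a git command, and it assumes you pass the full 40-character id, since --no-abbrev makes the raw lines print full ids.

```shell
# find_blob_refs <full-blob-id>   (hypothetical helper, not part of git)
# Prints "<commit> <raw diff line>" for every change that mentions the blob.
find_blob_refs() {
    git log --raw --all --no-abbrev --format='commit %H' |
    awk -v id="$1" '
        /^commit / { commit = $2; next }      # remember the current commit
        index($0, id) { print commit, $0 }    # raw line naming the blob
    '
}

# Usage (inside the damaged repository):
#   find_blob_refs 4b9458b3786228369c63936db65827de3cc06200
```

Merge commits print no raw lines by default, so a blob that was introduced only by a merge would not show up here.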

Yossi Leybovich
2007-11-09 17:53:51 UTC
Permalink
Post by Linus Torvalds
and you should now have a line that looks like
100644 blob 4b9458b3786228369c63936db65827de3cc06200 my-magic-file
That works and now I know the file
Post by Linus Torvalds
The easiest way to do it is to do
git log --raw --all --full-history -- subdirectory/my-magic-file
and that will show you the whole log for that file (please realize that
the tree you had may not be the top-level tree, so you need to figure out
which subdirectory it was in on your own), and because you're asking for
raw output, you'll now get something like
commit abc
..
:100644 100644 4b9458b... newsha... M somedirectory/my-magic-file
commit xyz
..
:100644 100644 oldsha... 4b9458b... M somedirectory/my-magic-file
and this actually tells you what the *previous* and *subsequent* versions
of that file were! So now you can look at those ("oldsha" and "newsha"
respectively), and hopefully you have done commits often, and can
re-create the missing my-magic-file version by looking at those older and
newer versions!
If you can do that, you can now recreate the missing object with
OK, I tried that, and unfortunately the SHA1 appears only once:

[***@mellanox-compile ib]$ git log --raw --all --full-history -- SymmK/St.c | grep 4b9
:100755 100755 308806c... 4b9458b3786228369c63936db65827de3cc06200 M SymmK/St.c

git log --raw --all --full-history -- SymmK/St.c

...
...
commit 597e70e7dc8e06a7cdbe4d9e9727411c964bd023
Author: sleybo <***@mellanox.co.il>
Date: Fri Oct 5 10:41:43 2007 -0400

1. increase QPs parameters - QP is bigger than 4k
2. lock buffers use the dma key
3. add prints

:100755 100755 308806c... 4b9458b3786228369c63936db65827de3cc06200 M SymmK/St.c


What is interesting is that the SHA1 I was looking for appears only once
(only as the new SHA1).

So I checked out the version of the file that produces the old SHA1 308806c...:

[***@mellanox-compile ib-tmp]$ git checkout mlx4-start -- SymmK/St.c
[***@mellanox-compile ib-tmp]$ git hash-object -w SymmK/St.c
308806cf3a864656a49d00edc35b9505abd627a2

Then I did:
[***@mellanox-compile ib-tmp]$ git diff-tree --stdin -p --pretty 597e70e7dc8e06a7cdbe4d9e9727411c964bd023 > commit-597e70e

(which is the commit SHA1)

[***@mellanox-compile ib-tmp]$ git apply commit-597e70e
Adds trailing whitespace.
../ib/commit-597e70e:1622:
Adds trailing whitespace.
../ib/commit-597e70e:1646: (int)devif->lock_dma +
lockid*sizeof(u64),
warning: 2 lines add whitespace errors.
[***@mellanox-compile ib-tmp]$ git hash-object -w SymmK/St.c
e69de29bb2d1d6434b8b29ae775ad8c2e48c5391


So the same commit actually leads to the wrong SHA1.
(I tried this flow on a different file and it works.)

I think I am close but still not there. Any suggestions?

Thanks
Yossi
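
The id printed by the last hash-object call above, e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, is the id git gives to an empty blob, which suggests SymmK/St.c was empty after "git apply". A minimal sketch that reproduces that id without git, assuming a POSIX shell and the coreutils sha1sum tool; blob_id is a hypothetical helper name:

```shell
# blob_id <file>   (hypothetical helper)
# Git names a blob by the SHA-1 of the header "blob <size>\0" followed by
# the file contents; this recomputes that id by hand.
blob_id() {
    size=$(wc -c < "$1" | tr -d ' ')
    { printf 'blob %s\000' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}

# An empty input yields the id of the empty blob.
blob_id /dev/null   # prints e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

This ignores any content filters (for example line-ending conversion) that "git hash-object" may apply to a path inside a repository.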
Linus Torvalds
2007-11-09 18:02:43 UTC
Permalink
Post by Yossi Leybovich
OK, I tried that, and unfortunately the SHA1 appears only once:
SymmK/St.c | grep 4b9
:100755 100755 308806c... 4b9458b3786228369c63936db65827de3cc06200 M SymmK/St.c
Actually, that's not at all "unfortunately", because that implies that
it's the very *latest* version of that "SymmK/St.c" file. I really think
you already had it checked out, but didn't try my first suggestion of just
doing "git hash-object -w SymmK/St.c" which likely would have fixed it
already (unless you had changed it in your working tree, of course!)

Linus
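
A minimal sketch of the check described above: hash the working-tree copy without writing anything, and store it only if the id matches the missing blob. restore_if_match is a hypothetical helper name, and the sketch assumes the expected id is given in full (40 hex digits).

```shell
# restore_if_match <expected-full-id> <file>   (hypothetical helper)
# Without -w, "git hash-object" only computes the id and writes nothing.
restore_if_match() {
    expected=$1
    file=$2
    actual=$(git hash-object -- "$file") || return 2
    if [ "$actual" = "$expected" ]; then
        # Ids match: now store the object in the repository.
        git hash-object -w -- "$file" >/dev/null &&
            echo "restored $expected from $file"
    else
        echo "no match: $file hashes to $actual" >&2
        return 1
    fi
}

# Usage (inside the damaged repository):
#   restore_if_match 4b9458b3786228369c63936db65827de3cc06200 SymmK/St.c
```

Checking first avoids adding an unrelated blob to the object database when the working copy has been modified.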
Yossi Leybovich
2007-11-09 18:37:35 UTC
Permalink
Post by Linus Torvalds
Post by Yossi Leybovich
OK, I tried that, and unfortunately the SHA1 appears only once:
SymmK/St.c | grep 4b9
:100755 100755 308806c... 4b9458b3786228369c63936db65827de3cc06200 M SymmK/St.c
Actually, that's not at all "unfortunately", because that implies that
it's the very *latest* version of that "SymmK/St.c" file. I really think
you already had it checked out, but didn't try my first suggestion of just
doing "git hash-object -w SymmK/St.c" which likely would have fixed it
already (unless you had changed it in your working tree, of course!)
It's a very old version of the file.
What is interesting is the second part of the experiment:
I tried to apply the same commit to this file and it led to a different SHA1.
Post by Linus Torvalds
Linus
Linus Torvalds
2007-11-09 18:55:03 UTC
Permalink
Post by Yossi Leybovich
What is interesting is the second part of the experiment:
I tried to apply the same commit to this file and it led to a different SHA1.
Eh. That commit was basically corrupt, because the blob had gotten
removed. I don't even understand how git diff-tree gave a diff with that
file at all (side note: I'd also suggest you just use "git show <commit>"
instead of that complex and _really_ old git-diff-tree incantation).

So no, you didn't "apply the same commit".

But if you have the diff somewhere (perhaps email archive? you sent it to
somebody?) or you can re-create it exactly, then..

Linus
Mike Hommey
2007-11-09 19:07:07 UTC
Permalink
Post by Linus Torvalds
Post by Yossi Leybovich
What is interesting is the second part of the experiment:
I tried to apply the same commit to this file and it led to a different SHA1.
Eh. That commit was basically corrupt, because the blob had gotten
removed. I don't even understand how git diff-tree gave a diff with that
file at all (side note: I'd also suggest you just use "git show <commit>"
instead of that complex and _really_ old git-diff-tree incantation).
So no, you didn't "apply the same commit".
But if you have the diff somewhere (perhaps email archive? you sent it to
somebody?) or you can re-create it exactly, then..
Or maybe just from memory, by looking at the diff between the previous version
and the next version of the file.

Mike
Yossi Leybovich
2007-11-09 19:41:05 UTC
Permalink
What I do notice is that this commit involves a few files. For most of
the files the commit generates the right next SHA1;
only for one file does it generate a broken SHA1.
From git show <commit> I can see that the file which ends up
corrupted is actually being totally removed:

diff --git a/SymmK/St.c b/SymmK/St.c
index 308806c..4b9458b 100755
--- a/SymmK/St.c
+++ b/SymmK/St.c
@@ -1,1535 +0,0 @@
-MODULE_ALIAS(m_st);
-
-#include <errno.h>
-#include <string.h>
-#include <stdarg.h>
-#include <sys/types.h>
-#include <sys/time.h>
-#include "ib_global_init.h"
....
.....
....


But when I tried to delete the whole file, I did not get the right SHA1.
Does this sound familiar to anyone?
Maybe it's related to some kind of whitespace character I can't see.

Yossi
Mike Hommey
2007-11-09 19:52:37 UTC
Permalink
Post by Yossi Leybovich
What I do notice is that this commit involves a few files. For most of
the files the commit generates the right next SHA1;
only for one file does it generate a broken SHA1.
From git show <commit> I can see that the file which ends up
corrupted is actually being totally removed:
diff --git a/SymmK/St.c b/SymmK/St.c
index 308806c..4b9458b 100755
--- a/SymmK/St.c
+++ b/SymmK/St.c
@@ -1,1535 +0,0 @@
-MODULE_ALIAS(m_st);
-
-#include <errno.h>
-#include <string.h>
-#include <stdarg.h>
-#include <sys/types.h>
-#include <sys/time.h>
-#include "ib_global_init.h"
....
.....
....
But when I tried to delete the whole file, I did not get the right SHA1.
Does this sound familiar to anyone?
Maybe it's related to some kind of whitespace character I can't see.
Because the blob is corrupted, git show can't display the correct diff.
You have to work it out yourself! The best you can do is look at the
diff for this file between its previous version and the one just after
the corrupted version.

Mike
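
One way to look at the versions on either side of the lost blob, as suggested above, is to dump both and diff them. This is only a sketch: dump_neighbours is a hypothetical helper name, and the two ids are whatever "oldsha" and "newsha" turn out to be in the raw log for the file.

```shell
# dump_neighbours <older-blob-id> <newer-blob-id> <output-prefix>
# (hypothetical helper) Saves both neighbouring versions to files for
# hand editing, then prints the diff between the two blobs.
dump_neighbours() {
    git cat-file -p "$1" > "$3.older" &&
    git cat-file -p "$2" > "$3.newer" &&
    git diff "$1" "$2"
}

# Usage (ids are placeholders for the real neighbouring blob ids):
#   dump_neighbours <oldsha> <newsha> St.c
```

A candidate reconstruction built from those two files can then be checked with "git hash-object <candidate>" before it is written with -w.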