Discussion:
symbolic link management in git-archive
Sergio Callegari
2008-03-27 11:29:28 UTC
Hi,

I guess the answer is "no" or "not yet", but is there a way to tell the zip
backend of git-archive to follow symbolic links rather than to store them?

Sergio
Miklos Vajna
2008-03-27 11:40:24 UTC
Post by Sergio Callegari
I guess the answer is "no" or "not yet", but is there a way to tell the zip
backend of git-archive to follow symbolic links rather than to store them?
how would that handle a '. -> foo' symlink? following such a recursion
would lead to an infinite loop, i guess.
Johannes Schindelin
2008-03-27 11:44:57 UTC
Hi,
Post by Miklos Vajna
Post by Sergio Callegari
I guess the answer is "no" or "not yet", but is there a way to tell the zip
backend of git-archive to follow symbolic links rather than to store them?
how would that handle a '. -> foo' symlink? following such a recursion
would lead to an infinite loop, i guess.
Don't forget '/ -> foo'.

Ciao,
Dscho
Sergio Callegari
2008-03-27 12:11:48 UTC
Post by Johannes Schindelin
Hi,
On Thu, Mar 27, 2008 at 11:29:28AM +0000, Sergio Callegari <sergio.callegari
Post by Sergio Callegari
I guess the answer is "no" or "not yet", but is there a way to tell the zip
backend of git-archive to follow symbolic links rather than to store them?
how would that handle a '. -> foo' symlink? following such a recursion
would lead to an infinite loop, i guess.
Don't forget '/ -> foo'.
Ciao,
Dscho
My question was inspired by the fact that the Unix version of the zip program
has a switch precisely to decide whether to follow or to store links.
I believe the reason why this switch exists is clear: for some mysterious
reason the world is populated by OSs that do not understand symlinks, and until
someone finds out why, it is sensible to have workarounds. Obviously, zip archives
with stored links are completely useless on these OSs.
BTW, note that tar also has a similar switch.

I believe that cases like the foo -> . or the foo -> / (or even foo -> ..) can
be solved by either:

- limiting symlink dereferencing to symlinks that point to plain files

or

- leaving it to the user not to ask for link following when he knows that he has
such types of links

Personally I prefer the second. In the end:
- If these types of link exist, it is clear that the git-managed-stuff is not
made for certain OSs, so the user should know that asking for link following
makes no sense
- The user should know that there are some commands that might be dangerous
(think rm -fr ~), just warn them in the manual.
- The maximum risk here is to have the command never stop and fill the disk.
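
A minimal sketch of the first option above, written in Python purely for illustration. Nothing like this exists in git-archive, and the name should_dereference is hypothetical. A link is resolved only when it ends at a regular file, so links such as foo -> . or foo -> / would stay stored as links.

```python
import os


def should_dereference(path):
    """True only for a symlink whose final target is a regular file."""
    # os.path.islink looks at the link itself; os.path.isfile follows it,
    # so directory targets and dangling links both yield False.
    return os.path.islink(path) and os.path.isfile(path)
```

Anything for which this returns False would be handled exactly as today, so the loop-producing cases never get followed.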

Sergio
Junio C Hamano
2008-03-27 16:31:23 UTC
Post by Sergio Callegari
I guess the answer is "no" or "not yet", but is there a way to tell the zip
backend of git-archive to follow symbolic links rather than to store them?
I am not sure what you mean. Are you tracking a symbolic link X that
points at Y in your revision and expecting git-archive to include whatever
happens to be at Y (which may or may not even exist) when you run the
command?

If that is the case, the answer is "no" and "will never happen". If you
are tracking a symbolic link X that points at Y, the information git
tracks is the fact that there is a symbolic link X that points at Y, and
not what Y happens to look like at a random moment. Change to Y is not
tracked by git so why should you get different output from git-archive of
the same revision before and after you modify Y which is not part of the
revision to begin with?

If that is not what you are asking, please restate the question.
Sergio Callegari
2008-03-27 18:34:30 UTC
Post by Junio C Hamano
Post by Sergio Callegari
I guess the answer is "no" or "not yet", but is there a way to tell the zip
backend of git-archive to follow symbolic links rather than to store them?
I am not sure what you mean. Are you tracking a symbolic link X that
points at Y in your revision and expecting git-archive to include whatever
happens to be at Y (which may or may not even exist) when you run the
command?
Yes, this is the case.
Post by Junio C Hamano
If that is the case, the answer is "no" and "will never happen". If you
are tracking a symbolic link X that points at Y, the information git
tracks is the fact that there is a symbolic link X that points at Y,
... I obviously agree with this!
Post by Junio C Hamano
and
not what Y happens to look like at a random moment.
Well, certainly if the zip backend of git-archive, in addition to the -1 ... -9
switches, also supported a --dereference switch, I would not run git-archive
--format=zip --dereference at "random moments". I would run it when I am sure
that the symbolic links in the tracked project actually point at sensible
things.
Post by Junio C Hamano
Change to Y is not
tracked by git
Unless Y is also in the tracked project...
Post by Junio C Hamano
so why should you get different output from git-archive of
the same revision before and after you modify Y which is not part of the
revision to begin with?
Basically, because if I make a zip archive instead of a git bundle, I want to
store (and probably give to someone else) a copy of what _I see_ at a certain
instant in time which might not exactly coincide with the tracked state.

Note that if Y is outside of the tracked project and I make an archive, and then
I give the archive to my friend X, Mr. X will see the same symbolic link, but
still a completely and randomly different content than I do, depending on where
he is unpacking the archive.

But to get to a more practical case, my situation is this:

I have a project where I need to have the same content in multiple places,
otherwise development tools are not happy.
Since I do most of the development on Linux, I use symbolic links. This is very
good, not just because I save space, but particularly because I am sure that the
content cannot lose coherency (which would be very bad) at the different
places.

Every now and then, to distribute snapshots of the project, I run git archive to
make a zip file, that I give to other people. Unfortunately, some of them use
Windows, where the symbolic links appear as one-liner text files. And obviously
they cannot compile anything and they complain.

So I would like git archive to be able to make zip archives with the symbolic
links /resolved/.

What I am asking is not to change the standard behaviour of git archive to
follow links, but rather to have a switch to activate the dereferencing behaviour
only when needed. In the end git archive is a nice shorthand for a checkout and
a successive run of zip or tar and both zip and tar have a switch to control
this dereferencing behaviour (BTW, zip on my distro dereferences by default, the
switch is to store symbolic links).
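
The "checkout, then zip" route can be sketched as follows. This is only an illustration in Python, assuming the snapshot has already been exported to a directory (for example with git archive piped into tar); zip_dereferenced and its helper _add are hypothetical names, not part of git. It stores the content behind each link and refuses to descend into a directory link that loops back to a directory it is already inside.

```python
import os
import zipfile


def _add(zf, path, arcname, ancestors):
    # os.path.isdir and os.path.isfile both follow symlinks.
    if os.path.isdir(path):
        real = os.path.realpath(path)
        if real in ancestors:
            return  # e.g. 'foo -> .': loops back to a directory we are inside
        for name in sorted(os.listdir(path)):
            child = f"{arcname}/{name}" if arcname else name
            _add(zf, os.path.join(path, name), child, ancestors | {real})
    elif os.path.isfile(path):
        zf.write(path, arcname)  # stores the target's content, not the link
    # Anything else (such as a dangling link) is silently left out.


def zip_dereferenced(src_dir, zip_path):
    """Zip src_dir, replacing every symlink by what it points at."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        _add(zf, src_dir, "", frozenset())
```

The guard only stops loops. A link that points at a huge tree, such as foo -> /, would still be followed and archived, so zip_path should live outside src_dir and the user still has to know what the links point at.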

Sergio
Junio C Hamano
2008-03-27 19:05:25 UTC
Post by Sergio Callegari
Unless Y is also in the tracked project...
...
Note that if Y is outside of the tracked project and I make an archive,
and then I give the archive to my friend X, Mr. X will see the same
symbolic link, but still a completely and randomly different content
than I do, depending on where he is unpacking the archive.
If you _do_ keep track of Y in a separate repository, I think two archives
(the one that has a pointer to Y, and the other that is taken from the
repository Y _at the revision you are using_), would solve that more
naturally. Then the version markers recorded in the archives would still
be valid.

Side note. If we ever teach git-archive to create a recursive
tarball that contains a submodule, we should be doing something
like that, not necessarily as two separate tarballs but possibly
with a single tarball that has two comments that describe the
revision of the toplevel and the submodule.
Post by Sergio Callegari
... In the end git archive is a nice shorthand for a checkout and a
successive run of zip or tar and both zip and tar have a switch to
control this dereferencing behaviour (BTW, zip on my distro dereferences
by default, the switch is to store symbolic links).
Under such an option, at least the comment in the archive (both for zip
and tar) that notes which revision the tarball was taken from should be
omitted. As long as that is done, I think it is Ok to have such an
optional behaviour.
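
For zip output, git-archive records the commit ID as the archive-level comment when it is given a commit or tag. A small Python sketch for checking whether a given zip still carries such a marker; recorded_revision is a hypothetical helper name, and the sketch assumes the comment is plain ASCII.

```python
import zipfile


def recorded_revision(zip_path):
    """Return the archive comment as text, or None if there is none."""
    with zipfile.ZipFile(zip_path) as zf:
        comment = zf.comment  # bytes; empty when no comment was stored
    return comment.decode("ascii", "replace") or None
```

An archive produced with links resolved would, following the point above, be expected to return None here.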
Sergio Callegari
2008-03-27 19:20:16 UTC
Post by Junio C Hamano
Under such an option, at least the comment in the archive (both for zip
and tar) that notes which revision the tarball was taken from should be
omitted. As long as that is done, I think it is Ok to have such an
optional behaviour.
Yes, I overlooked this, but it makes perfect sense: if it is not an exact copy
of the tracked state, it should not pretend to be so by indicating a revision.
René Scharfe
2008-03-31 20:44:58 UTC
Post by Sergio Callegari
I have a project where I need to have the same content in multiple places,
otherwise development tools are not happy.
Since I do most of the development on Linux, I use symbolic links. This is very
good, not just because I save space, but particularly because I am sure that the
content cannot lose coherency (which would be very bad) at the different
places.

Every now and then, to distribute snapshots of the project, I run git archive to
make a zip file, that I give to other people. Unfortunately, some of them use
Windows, where the symbolic links appear as one-liner text files. And obviously
they cannot compile anything and they complain.

So I would like git archive to be able to make zip archives with the symbolic
links /resolved/.
Windows 2000 and up has support for symbolic links; it's just strangely
restricted, e.g. on Windows 2000 you can only link to directories and
you have to use tools that aren't shipped with the OS to create them.
Microsoft even calls them differently; here's a good starting point for
more information: http://en.wikipedia.org/wiki/NTFS_junction_point

Arguably, your unzip program should create junction points for symlinks
in zip files. I wouldn't be surprised if none of the existing ones
support that, though; junction points have been left undocumented for a
long time. It's also possible that they'd understand a different zip
dialect for symlinks than the Info-Zip one produced by git-archive.

Would it be practical for you to distribute a junction point creation
tool like Mark Russinovich's Junction (except that Junction's EULA
forbids redistribution under most circumstances; see here:
http://www.microsoft.com/technet/sysinternals/FileAndDisk/Junction.mspx)
and a script that creates these symlinks for your audience?


It's harder for git-archive to support following symlinks than for e.g.
GNU tar. The reason is that the former operates on git objects, not
files, directories or symlinks. In order to follow a symlink it would
need to evaluate the symlink, follow it and then add actual files and
directories to the archive.

For your purposes, perhaps a slightly different implementation might be
sufficient: namely to follow only relative symlinks that point to
tracked objects. That way you still get a repeatable result and (most
importantly) git-archive doesn't need to look at files and directories,
it can stay safely in git land. Would such a way of operation be useful
to you?
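
To make the "stay in git land" idea concrete, here is a rough Python sketch. It is not how git-archive works internally: the tracked tree is modelled as a plain dict from path to ("blob", content) or ("link", target), and resolve_in_tree and MAX_HOPS are hypothetical names. It handles only links whose own path contains no other link as a directory component.

```python
import posixpath

MAX_HOPS = 8  # arbitrary guard against link chains that loop


def resolve_in_tree(entries, path):
    """Return the tracked blob path that `path` ends up at, or None."""
    for _ in range(MAX_HOPS):
        kind, payload = entries.get(path, (None, None))
        if kind == "blob":
            return path
        if kind != "link" or posixpath.isabs(payload):
            return None  # untracked, or absolute target: keep it as a link
        # Interpret the target relative to the directory holding the link.
        target = posixpath.normpath(
            posixpath.join(posixpath.dirname(path), payload))
        if target == ".." or target.startswith("../"):
            return None  # points outside the tracked tree
        path = target
    return None  # too many hops, most likely a loop
```

Because only tracked entries are consulted, the answer depends on the revision alone and not on whatever happens to be in the working directory.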

Thanks,
René
Sergio Callegari
2008-03-31 22:12:09 UTC
Post by René Scharfe
Windows 2000 and up has support for symbolic links; it's just strangely
restricted, [...] here's a good starting point for
more information: http://en.wikipedia.org/wiki/NTFS_junction_point
Arguably, your unzip program should create junction points for symlinks
in zip files. I wouldn't be surprised if none of the existing ones
support that, though;
Would it be practical for you to distribute a junction point creation
tool like Mark Russinovich's Junction (except that Junction's EULA
forbids redistribution under most circumstances; see here:
http://www.microsoft.com/technet/sysinternals/FileAndDisk/Junction.mspx)
and a script that creates these symlinks for your audience?
Thank you very much for the detailed explanation. Unfortunately, I do
not think that Junction can be an option. My colleagues using Windows
tend to be a bit "conservative" about the tools they use. If they
navigate filesystem contents with the graphical tool, and they
look-at/expand the content of a zip file from explorer they expect that
it should be immediately right, otherwise there must be something wrong
in the way /I/ create zip files. They would not appreciate having to
unzip the file and then run an additional program on it to fix the
unzipped stuff.
Post by René Scharfe
It's harder for git-archive to support following symlinks than for e.g.
GNU tar. The reason is that the former operates on git objects, not
files, directories or symlinks. In order to follow a symlink it would
need to evaluate the symlink, follow it and then add actual files and
directories to the archive.
For your purposes, perhaps a slightly different implementation might be
sufficient: namely to follow only relative symlinks that point to
tracked objects. That way you still get a repeatable result and (most
importantly) git-archive doesn't need to look at files and directories,
it can stay safely in git land. Would such a way of operation be useful
to you?
Absolutely, yes. This would already be a great improvement,
fulfilling 99.9% of needs.

Sergio
