Discussion:
[PATCH 1/4] checkout: do not fail if target is an empty directory
Max Kirillov
2014-10-12 05:13:08 UTC
Permalink
Non-recursive checkout creates empty directpries in place of submodules.
If then I try to "checkout --to" submodules there, it refuses to do so,
because directory already exists.

Fix by allowing checking out to empty directory. Add test and modify the
existing one so that it uses non-empty directory.

Signed-off-by: Max Kirillov <***@max630.net>
---
builtin/checkout.c | 2 +-
t/t2025-checkout-to.sh | 7 ++++++-
2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 01d0f2f..74eabe7 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -865,7 +865,7 @@ static int prepare_linked_checkout(const struct checkout_opts *opts,

if (!new->commit)
die(_("no branch specified"));
- if (file_exists(path))
+ if (file_exists(path) && !is_empty_dir(path))
die(_("'%s' already exists"), path);

len = strlen(path);
diff --git a/t/t2025-checkout-to.sh b/t/t2025-checkout-to.sh
index eddd325..915b506 100755
--- a/t/t2025-checkout-to.sh
+++ b/t/t2025-checkout-to.sh
@@ -13,10 +13,15 @@ test_expect_success 'checkout --to not updating paths' '
'

test_expect_success 'checkout --to an existing worktree' '
- mkdir existing &&
+ mkdir -p existing/subtree &&
test_must_fail git checkout --detach --to existing master
'

+test_expect_success 'checkout --to an existing empty worktree' '
+ mkdir existing_empty &&
+ git checkout --detach --to existing_empty master
+'
+
test_expect_success 'checkout --to refuses to checkout locked branch' '
test_must_fail git checkout --to zere master &&
! test -d zere &&
--
2.0.1.1697.g73c6810
Max Kirillov
2014-10-12 05:13:09 UTC
Permalink
Signed-off-by: Max Kirillov <***@max630.net>
---
submodule.c | 28 ++++++++++------------------
1 file changed, 10 insertions(+), 18 deletions(-)

diff --git a/submodule.c b/submodule.c
index 34094f5..4aad3d4 100644
--- a/submodule.c
+++ b/submodule.c
@@ -122,43 +122,35 @@ void stage_updated_gitmodules(void)

static int add_submodule_odb(const char *path)
{
- struct strbuf objects_directory = STRBUF_INIT;
struct alternate_object_database *alt_odb;
+ const char* objects_directory;
int ret = 0;
- const char *git_dir;

- strbuf_addf(&objects_directory, "%s/.git", path);
- git_dir = read_gitfile(objects_directory.buf);
- if (git_dir) {
- strbuf_reset(&objects_directory);
- strbuf_addstr(&objects_directory, git_dir);
- }
- strbuf_addstr(&objects_directory, "/objects/");
- if (!is_directory(objects_directory.buf)) {
+ objects_directory = git_path_submodule(path, "objects/");
+ if (!is_directory(objects_directory)) {
ret = -1;
goto done;
}
+
/* avoid adding it twice */
for (alt_odb = alt_odb_list; alt_odb; alt_odb = alt_odb->next)
- if (alt_odb->name - alt_odb->base == objects_directory.len &&
- !strncmp(alt_odb->base, objects_directory.buf,
- objects_directory.len))
+ if (alt_odb->name - alt_odb->base == strlen(objects_directory) &&
+ !strcmp(alt_odb->base, objects_directory))
goto done;

- alt_odb = xmalloc(objects_directory.len + 42 + sizeof(*alt_odb));
+ alt_odb = xmalloc(strlen(objects_directory) + 42 + sizeof(*alt_odb));
alt_odb->next = alt_odb_list;
- strcpy(alt_odb->base, objects_directory.buf);
- alt_odb->name = alt_odb->base + objects_directory.len;
+ strcpy(alt_odb->base, objects_directory);
+ alt_odb->name = alt_odb->base + strlen(objects_directory);
alt_odb->name[2] = '/';
alt_odb->name[40] = '\0';
alt_odb->name[41] = '\0';
alt_odb_list = alt_odb;

/* add possible alternates from the submodule */
- read_info_alternates(objects_directory.buf, 0);
+ read_info_alternates(objects_directory, 0);
prepare_alt_odb();
done:
- strbuf_release(&objects_directory);
return ret;
}
--
2.0.1.1697.g73c6810
Max Kirillov
2014-10-12 05:13:10 UTC
Permalink
Each working directory of main repository has its own working directory
of submodule, and in most cases they should be checked out to different
revisions. So they should be separated.

It looks logical to make submodule instances in different working
directories to reuse the submodule directory in the common dir of
the main repository, and probably this is how "checkout --to" should
initialize them called on the main repository, but they also should work
fine being completely separated clones.

Testfile t7410-submodule-checkout-to.sh demostrates the behavior.

Signed-off-by: Max Kirillov <***@max630.net>
---
Documentation/gitrepository-layout.txt | 4 +--
path.c | 2 +-
t/t7410-submodule-checkout-to.sh | 50 ++++++++++++++++++++++++++++++++++
3 files changed, 52 insertions(+), 4 deletions(-)
create mode 100755 t/t7410-submodule-checkout-to.sh

diff --git a/Documentation/gitrepository-layout.txt b/Documentation/gitrepository-layout.txt
index 2b30a92..7173b38 100644
--- a/Documentation/gitrepository-layout.txt
+++ b/Documentation/gitrepository-layout.txt
@@ -248,9 +248,7 @@ commondir::
incomplete without the repository pointed by "commondir".

modules::
- Contains the git-repositories of the submodules. This
- directory is ignored if $GIT_COMMON_DIR is set and
- "$GIT_COMMON_DIR/modules" will be used instead.
+ Contains the git-repositories of the submodules.

worktrees::
Contains worktree specific information of linked
diff --git a/path.c b/path.c
index 35d498e..a5c51a3 100644
--- a/path.c
+++ b/path.c
@@ -92,7 +92,7 @@ static void replace_dir(struct strbuf *buf, int len, const char *newdir)
}

static const char *common_list[] = {
- "/branches", "/hooks", "/info", "!/logs", "/lost-found", "/modules",
+ "/branches", "/hooks", "/info", "!/logs", "/lost-found",
"/objects", "/refs", "/remotes", "/worktrees", "/rr-cache", "/svn",
"config", "!gc.pid", "packed-refs", "shallow",
NULL
diff --git a/t/t7410-submodule-checkout-to.sh b/t/t7410-submodule-checkout-to.sh
new file mode 100755
index 0000000..8f30aed
--- /dev/null
+++ b/t/t7410-submodule-checkout-to.sh
@@ -0,0 +1,50 @@
+#!/bin/sh
+
+test_description='Combination of submodules and multiple workdirs'
+
+. ./test-lib.sh
+
+base_path=$(pwd -P)
+
+test_expect_success 'setup: make origin' \
+ 'mkdir -p origin/sub && ( cd origin/sub && git init &&
+ echo file1 >file1 &&
+ git add file1 &&
+ git commit -m file1 ) &&
+ mkdir -p origin/main && ( cd origin/main && git init &&
+ git submodule add ../sub &&
+ git commit -m "add sub" ) &&
+ ( cd origin/sub &&
+ echo file1updated >file1 &&
+ git add file1 &&
+ git commit -m "file1 updated" ) &&
+ ( cd origin/main/sub && git pull ) &&
+ ( cd origin/main &&
+ git add sub &&
+ git commit -m "sub updated" )'
+
+test_expect_success 'setup: clone' \
+ 'mkdir clone && ( cd clone &&
+ git clone --recursive "$base_path/origin/main")'
+
+rev1_hash_main=$(git --git-dir=origin/main/.git show --pretty=format:%h -q "HEAD~1")
+rev1_hash_sub=$(git --git-dir=origin/sub/.git show --pretty=format:%h -q "HEAD~1")
+
+test_expect_success 'checkout main' \
+ 'mkdir default_checkout &&
+ (cd clone/main &&
+ git checkout --to "$base_path/default_checkout/main" "$rev1_hash_main")'
+
+test_expect_failure 'can see submodule diffs just after checkout' \
+ '(cd default_checkout/main && git diff --submodule master"^!" | grep "file1 updated")'
+
+test_expect_success 'checkout main and initialize independed clones' \
+ 'mkdir fully_cloned_submodule &&
+ (cd clone/main &&
+ git checkout --to "$base_path/fully_cloned_submodule/main" "$rev1_hash_main") &&
+ (cd fully_cloned_submodule/main && git submodule update)'
+
+test_expect_success 'can see submodule diffs after independed cloning' \
+ '(cd fully_cloned_submodule/main && git diff --submodule master"^!" | grep "file1 updated")'
+
+test_done
--
2.0.1.1697.g73c6810
Max Kirillov
2014-10-12 05:13:11 UTC
Permalink
Currently git_path_submodule() does not handle submodules being a linked
checkout. The visible result is that "git diff --submodule" fails to
report changes in the submodule.

Perform the same resolution as with parent repository, but ignore the
GIT_COMMON_DIR environment variable, because it would mean common
directory for the parent repository and does not make sense for
submodule.

Also add test for functionality which uses this call.

Signed-off-by: Max Kirillov <***@max630.net>
---
cache.h | 1 +
path.c | 24 ++++++++++++++++++++----
setup.c | 17 ++++++++++++-----
t/t7410-submodule-checkout-to.sh | 12 ++++++++++++
4 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/cache.h b/cache.h
index ab3adb6..09821df 100644
--- a/cache.h
+++ b/cache.h
@@ -437,6 +437,7 @@ extern char *get_object_directory(void);
extern char *get_index_file(void);
extern char *get_graft_file(void);
extern int set_git_dir(const char *path);
+extern int get_common_dir_noenv(struct strbuf *sb, const char *gitdir);
extern int get_common_dir(struct strbuf *sb, const char *gitdir);
extern const char *get_git_namespace(void);
extern const char *strip_namespace(const char *namespaced_ref);
diff --git a/path.c b/path.c
index a5c51a3..78f718f 100644
--- a/path.c
+++ b/path.c
@@ -98,7 +98,7 @@ static const char *common_list[] = {
NULL
};

-static void update_common_dir(struct strbuf *buf, int git_dir_len)
+static void update_common_dir(struct strbuf *buf, int git_dir_len, const char* common_dir)
{
char *base = buf->buf + git_dir_len;
const char **p;
@@ -115,12 +115,17 @@ static void update_common_dir(struct strbuf *buf, int git_dir_len)
path++;
is_dir = 1;
}
+
+ if (!common_dir) {
+ common_dir = get_git_common_dir();
+ }
+
if (is_dir && dir_prefix(base, path)) {
- replace_dir(buf, git_dir_len, get_git_common_dir());
+ replace_dir(buf, git_dir_len, common_dir);
return;
}
if (!is_dir && !strcmp(base, path)) {
- replace_dir(buf, git_dir_len, get_git_common_dir());
+ replace_dir(buf, git_dir_len, common_dir);
return;
}
}
@@ -160,7 +165,7 @@ static void adjust_git_path(struct strbuf *buf, int git_dir_len)
else if (git_db_env && dir_prefix(base, "objects"))
replace_dir(buf, git_dir_len + 7, get_object_directory());
else if (git_common_dir_env)
- update_common_dir(buf, git_dir_len);
+ update_common_dir(buf, git_dir_len, NULL);
}

static void do_git_path(struct strbuf *buf, const char *fmt, va_list args)
@@ -256,6 +261,8 @@ const char *git_path_submodule(const char *path, const char *fmt, ...)
{
struct strbuf *buf = get_pathname();
const char *git_dir;
+ struct strbuf git_submodule_common_dir = STRBUF_INIT;
+ struct strbuf git_submodule_dir = STRBUF_INIT;
va_list args;

strbuf_addstr(buf, path);
@@ -269,11 +276,20 @@ const char *git_path_submodule(const char *path, const char *fmt, ...)
strbuf_addstr(buf, git_dir);
}
strbuf_addch(buf, '/');
+ strbuf_addstr(&git_submodule_dir, buf->buf);

va_start(args, fmt);
strbuf_vaddf(buf, fmt, args);
va_end(args);
+
+ if (get_common_dir_noenv(&git_submodule_common_dir, git_submodule_dir.buf)) {
+ update_common_dir(buf, git_submodule_dir.len, git_submodule_common_dir.buf);
+ }
+
strbuf_cleanup_path(buf);
+
+ strbuf_release(&git_submodule_dir);
+ strbuf_release(&git_submodule_common_dir);
return buf->buf;
}

diff --git a/setup.c b/setup.c
index fb61860..ffda622 100644
--- a/setup.c
+++ b/setup.c
@@ -226,14 +226,21 @@ void verify_non_filename(const char *prefix, const char *arg)

int get_common_dir(struct strbuf *sb, const char *gitdir)
{
+ const char *git_env_common_dir = getenv(GIT_COMMON_DIR_ENVIRONMENT);
+ if (git_env_common_dir) {
+ strbuf_addstr(sb, git_env_common_dir);
+ return 1;
+ } else {
+ return get_common_dir_noenv(sb, gitdir);
+ }
+}
+
+int get_common_dir_noenv(struct strbuf *sb, const char *gitdir)
+{
struct strbuf data = STRBUF_INIT;
struct strbuf path = STRBUF_INIT;
- const char *git_common_dir = getenv(GIT_COMMON_DIR_ENVIRONMENT);
int ret = 0;
- if (git_common_dir) {
- strbuf_addstr(sb, git_common_dir);
- return 1;
- }
+
strbuf_addf(&path, "%s/commondir", gitdir);
if (file_exists(path.buf)) {
if (strbuf_read_file(&data, path.buf, 0) <= 0)
diff --git a/t/t7410-submodule-checkout-to.sh b/t/t7410-submodule-checkout-to.sh
index 8f30aed..153f5c5 100755
--- a/t/t7410-submodule-checkout-to.sh
+++ b/t/t7410-submodule-checkout-to.sh
@@ -47,4 +47,16 @@ test_expect_success 'checkout main and initialize independed clones' \
test_expect_success 'can see submodule diffs after independed cloning' \
'(cd fully_cloned_submodule/main && git diff --submodule master"^!" | grep "file1 updated")'

+test_expect_success 'checkout sub manually' \
+ 'mkdir linked_submodule &&
+ (cd clone/main &&
+ git checkout --to "$base_path/linked_submodule/main" "$rev1_hash_main") &&
+ (cd clone/main/sub &&
+ git checkout --to "$base_path/linked_submodule/main/sub" "$rev1_hash_sub") &&
+ mkdir clone/main/.git/worktrees/main/modules &&
+ echo "gitdir: ../../modules/sub/worktrees/sub" > clone/main/.git/worktrees/main/modules/sub'
+
+test_expect_success 'can see submodule diffs after manual checkout of linked submodule' \
+ '(cd linked_submodule/main && git diff --submodule master"^!" | grep "file1 updated")'
+
test_done
--
2.0.1.1697.g73c6810
Duy Nguyen
2014-10-14 12:17:54 UTC
Permalink
These are fixes of issues with submodules with use of multiple working
trees.
I think the patches look fine from the nd/multiple-work-trees writer's
perspective. I know too little about submodules to judge if this is
the right way and not that way..
--
Duy
Jens Lehmann
2014-10-14 17:09:45 UTC
Permalink
Post by Duy Nguyen
These are fixes of issues with submodules with use of multiple working
trees.
I think the patches look fine from the nd/multiple-work-trees writer's
perspective. I know too little about submodules to judge if this is
the right way and not that way..
Sorry, I was too busy to review this work until now, but here we go:

If I understand multiple work trees correctly, everything except work
tree local stuff is redirected into GIT_COMMON_DIR. This is a very cool
feature I'd love to see on our CI server to reduce disk footprint and
clone times, especially for submodules!

But I can't see how that can work by just sharing the modules directory
tree, as that contains work tree related files - e.g. the index - for
each submodule. AFAICS sharing them between work trees will work only
if the content of the modules directory is partly present in GIT_DIR -
for work tree related files - and only the common stuff is taken from
GIT_COMMON_DIR (Or did I just miss the magic that already does that?).
And I didn't try to wrap my head around recursive submodules yet ...

Until that problem is solved it looks wrong to pass GIT_COMMON_DIR into
submodule recursion, I believe GIT_COMMON_DIR should be added to the
local_repo_env array (and even if it is passed on later, we might have
to append "/modules/<submodule_name>" to make it point to the correct
location).

But maybe I'm missing something?
Junio C Hamano
2014-10-14 17:26:42 UTC
Permalink
Post by Jens Lehmann
But I can't see how that can work by just sharing the modules directory
tree, as that contains work tree related files - e.g. the index - for
each submodule. AFAICS sharing them between work trees will work only
if the content of the modules directory is partly present in GIT_DIR -
for work tree related files - and only the common stuff is taken from
GIT_COMMON_DIR (Or did I just miss the magic that already does that?).
The first time I saw the patch 3/4 in this series, my reaction was
"Huh, why should the repository data and branch tips be separated
out into multiple independent copies for the same module? Do we
force users to synchronise between these copies? It does not make
any sense at all."

But that was until I read your message ;-) You are right that the
index and HEAD are dependent to a particular working tree that is
checked out. There may be other things that logically are per-
working tree.

And multiple-worktree _is_ about keeping the same repository and
history data (i.e. object database, refs, rerere database, reflogs
for refs/*) only once, while allowing multiple working trees attached
to that single copy.

So it appears to me that to create a checkout-to copy of a
superproject with a submodule, a checkout-to copy of the
superproject would have a submodule, which is a checkout-to copy of
the submodule in the superproject.
Post by Jens Lehmann
And I didn't try to wrap my head around recursive submodules yet ...
Until that problem is solved it looks wrong to pass GIT_COMMON_DIR into
submodule recursion, I believe GIT_COMMON_DIR should be added to the
local_repo_env array (and even if it is passed on later, we might have
to append "/modules/<submodule_name>" to make it point to the correct
location).
Max Kirillov
2014-10-14 18:34:31 UTC
Permalink
Post by Junio C Hamano
And multiple-worktree _is_ about keeping the same repository and
history data (i.e. object database, refs, rerere database, reflogs for
refs/*) only once, while allowing multiple working trees attached to
that single copy.
So it appears to me that to create a checkout-to copy of a
superproject with a submodule, a checkout-to copy of the superproject
would have a submodule, which is a checkout-to copy of the submodule
in the superproject.
That's right, this linking should be more implicit.

But here are a lot of nuances. For example, it makes sense to have a
superproject checkout without submodules being initialized (so that they
don't waste space and machine time for working tree, which often is more
than repository data). And it may happen so that this checkout is the
master repository for superproject checkouts. But this should not
prevent users from using initialized submodules in other checkouts.

Then, a checkout copy of a submodule can be standalone (for example, git
and git-html-docs are submodules of msysgit). Or, it can even belong to
some other superproject. And in that cases they still should be able to
be linked.

Considering all above, and also the thing that I am quite new to
submodules (but have to use them currently), I did not intend to create
any new UI, only to make backend handle the already existing linked
checkouts, which can be made manually.
--
Max
Jens Lehmann
2014-10-14 19:51:22 UTC
Permalink
Post by Max Kirillov
Post by Junio C Hamano
And multiple-worktree _is_ about keeping the same repository and
history data (i.e. object database, refs, rerere database, reflogs for
refs/*) only once, while allowing multiple working trees attached to
that single copy.
So it appears to me that to create a checkout-to copy of a
superproject with a submodule, a checkout-to copy of the superproject
would have a submodule, which is a checkout-to copy of the submodule
in the superproject.
That's right, this linking should be more implicit.
Yep. And for the submodule of a submodule too ... ;-)
Post by Max Kirillov
But here are a lot of nuances. For example, it makes sense to have a
superproject checkout without submodules being initialized (so that they
don't waste space and machine time for working tree, which often is more
than repository data).
Hmm, I'm not sure if this is a problem. If the GIT_COMMON_DIR does have
the submodule repo but it isn't initialized locally, we shouldn't have a
problem (except for wasting some disk space if not a single checkout-to
superproject initializes this submodule). And if GIT_COMMON_DIR does not
have the submodule repo yet, wouldn't it be cloned the moment we init
the submodule in the checkout-to? Or would that need extra functionality?
Post by Max Kirillov
And it may happen so that this checkout is the
master repository for superproject checkouts. But this should not
prevent users from using initialized submodules in other checkouts.
I could live with the restriction that submodule's GIT_COMMON_DIRs always
live in their checkout-to superproject's GIT_COMMON_DIR. This would still
be an improvement for CI servers that have multiple clones of a super-
project, as they would all share their submodule common dirs at least
per superproject.
Post by Max Kirillov
Then, a checkout copy of a submodule can be standalone (for example, git
and git-html-docs are submodules of msysgit). Or, it can even belong to
some other superproject. And in that cases they still should be able to
be linked.
Maybe such configurations would have to be handled manually to achieve
maximum savings. At least I could live with that.
Post by Max Kirillov
Considering all above, and also the thing that I am quite new to
submodules (but have to use them currently), I did not intend to create
any new UI, only to make backend handle the already existing linked
checkouts, which can be made manually.
Maybe the way to go is to restrict GIT_COMMON_DIR to superprojects and
have a distinct GIT_COMMON_MODULES_DIR? We could even set that to the
same location for all submodules of different superprojects to achieve
minimum disk footprint (assuming they all have different names across
all superprojects and recursion levels).

Hmm, so I tend towards adding GIT_COMMON_DIR to local_repo_env until
we figured out how to handle this. Without that I fear bad things will
happen, at least for a superproject with multiple checkout-to work trees
where the same submodule is initialized more than once ...
Max Kirillov
2014-10-14 22:15:09 UTC
Permalink
Post by Jens Lehmann
Post by Max Kirillov
But here are a lot of nuances. For example, it makes
sense to have a superproject checkout without submodules
being initialized (so that they don't waste space and
machine time for working tree, which often is more than
repository data).
Hmm, I'm not sure if this is a problem. If the
GIT_COMMON_DIR does have the submodule repo but it isn't
initialized locally, we shouldn't have a problem (except
for wasting some disk space if not a single checkout-to
superproject initializes this submodule).
If initially a repository is clone without submodules, it
will not have anything in the GIT_COMMON_DIR.
Post by Jens Lehmann
And if GIT_COMMON_DIR does not have the submodule repo
yet, wouldn't it be cloned the moment we init the
submodule in the checkout-to? Or would that need extra
functionality?
I cannot say I like this. Network operations should be
caused only by clone and submodules.

I think the logic can be simple: it a submodule is not
checked-out in the repository "checkout --to" is called
from, then it is not checked-out to the new one also. If it
is, then checkout calls itself recursively in the submodule
and works like being run in standalone repository.
Post by Jens Lehmann
Post by Max Kirillov
Then, a checkout copy of a submodule can be standalone
(for example, git and git-html-docs are submodules of
msysgit). Or, it can even belong to some other
superproject. And in that cases they still should be able
to be linked.
Maybe such configurations would have to be handled
manually to achieve maximum savings. At least I could live
with that.
To make manual handling of the cases, and to skip
checking-out a module.

I would think about the following interface:

$ git checkout --to ... - does not checkout submodules,
creates empty directory.

$ git checkout --recursive --to ... - if a submodule is
checked-out in source repository, recursed there and run
"checkout --recursive" again. If a submodule is not
checked-out, does not checkout it, creates an empty
directory.

By the way, I have found your branch
recursive_submodule_checkout. Would you like to revive it?
Then it can be done with the same option.
Post by Jens Lehmann
Hmm, so I tend towards adding GIT_COMMON_DIR to
local_repo_env until we figured out how to handle this.
Without that I fear bad things will happen, at least for a
superproject with multiple checkout-to work trees where
the same submodule is initialized more than once ...
I learned about local_repo_env and agree it should include
GIT_COMMON_DIR. Unless it is removed at all...
Duy Nguyen
2014-10-15 14:14:10 UTC
Permalink
Post by Max Kirillov
Post by Jens Lehmann
Hmm, so I tend towards adding GIT_COMMON_DIR to
local_repo_env until we figured out how to handle this.
Without that I fear bad things will happen, at least for a
superproject with multiple checkout-to work trees where
the same submodule is initialized more than once ...
I learned about local_repo_env and agree it should include
GIT_COMMON_DIR. Unless it is removed at all...
It's in the same class as GIT_DIR and GIT_WORK_TREE so yeah it should
be in local_repo_env. Removing it would break t0060-path-utils.sh at
least. Unless we have a very good reason to remove it, I think we
should keep it as is.
--
Duy
Jens Lehmann
2014-10-15 18:57:20 UTC
Permalink
Post by Max Kirillov
Post by Jens Lehmann
Post by Max Kirillov
But here are a lot of nuances. For example, it makes
sense to have a superproject checkout without submodules
being initialized (so that they don't waste space and
machine time for working tree, which often is more than
repository data).
Hmm, I'm not sure if this is a problem. If the
GIT_COMMON_DIR does have the submodule repo but it isn't
initialized locally, we shouldn't have a problem (except
for wasting some disk space if not a single checkout-to
superproject initializes this submodule).
If initially a repository is clone without submodules, it
will not have anything in the GIT_COMMON_DIR.
Ok.
Post by Max Kirillov
Post by Jens Lehmann
And if GIT_COMMON_DIR does not have the submodule repo
yet, wouldn't it be cloned the moment we init the
submodule in the checkout-to? Or would that need extra
functionality?
I cannot say I like this. Network operations should be
caused only by clone and submodules.
Sure (and please add fetch to the list ;-). Maybe I confused
you by saying "init" when I meant the "submodule update" run
after initializing the submodule?
Post by Max Kirillov
I think the logic can be simple: it a submodule is not
checked-out in the repository "checkout --to" is called
from, then it is not checked-out to the new one also. If it
is, then checkout calls itself recursively in the submodule
and works like being run in standalone repository.
But when I later decide to populate the submodule in a
"checkout --to" work tree, should it automagically also
use the central storage, creating the modules/<name>
directory there if it doesn't exist yet? I think that'd
make sense to avoid having the work tree layout depend
on the order commands were ran in. And imagine new
submodules, they should not be handled differently from
those already present.
Post by Max Kirillov
Post by Jens Lehmann
Post by Max Kirillov
Then, a checkout copy of a submodule can be standalone
(for example, git and git-html-docs are submodules of
msysgit). Or, it can even belong to some other
superproject. And in that cases they still should be able
to be linked.
Maybe such configurations would have to be handled
manually to achieve maximum savings. At least I could live
with that.
To make manual handling of the cases, and to skip
checking-out a module.
$ git checkout --to ... - does not checkout submodules,
creates empty directory.
This is what checkout should always do (at least until it
learns --recurse-submodules, then it would populate the
submodule directories).
Post by Max Kirillov
$ git checkout --recursive --to ... - if a submodule is
checked-out in source repository, recursed there and run
"checkout --recursive" again. If a submodule is not
checked-out, does not checkout it, creates an empty
directory.
Hmm, I believe that when the user requests recursion
explicitly it should always be checked out, no matter in
what state the GIT_COMMON_DIR is in. Otherwise we'll see
problems when a new submodule shows up and is populated
depending on the point in time the "checkout --to" was
done ... not good.
Post by Max Kirillov
By the way, I have found your branch
recursive_submodule_checkout. Would you like to revive it?
Then it can be done with the same option.
No need to revive it, I'm currently working on that branch
whenever I find some time (but I'll still need some time
before I can post the next iteration).
Max Kirillov
2014-10-16 20:54:53 UTC
Permalink
Post by Jens Lehmann
Post by Max Kirillov
I think the logic can be simple: it a submodule is not
checked-out in the repository "checkout --to" is called
from, then it is not checked-out to the new one also. If it
is, then checkout calls itself recursively in the submodule
and works like being run in standalone repository.
But when I later decide to populate the submodule in a
"checkout --to" work tree, should it automagically also
use the central storage, creating the modules/<name>
directory there if it doesn't exist yet? I think that'd
make sense to avoid having the work tree layout depend
on the order commands were ran in. And imagine new
submodules, they should not be handled differently from
those already present.
Like place the common directory to
$MAIN_REPO/.git/modules/$SUB/ and worktree-specific part to
$MAIN_REPO/.git/worktrees/$WORKTREE/modules/$SUB, rather
than placing all into the socond one? It would make sense to
make, but then it would be imposible to checkout a diferent
repository into the same submodule in different superproject
checkouts. However stupid is sounds, there could be cases
if, for example, at some moment submodule is being replaced
by another one, and older worktrees should work with older
submodule, while newer uses the newer submodule.

Maybe, there could be some options to tell the command which
populates submodules (which commands that are? "submodule update"
and other submodule subcommands? or there is something
else?) to use the curent checkout space or the main one. But
I would still leave it depend on what user explicitly calls
and where the initial submodule update is executed.

Also, could you clarify the usage of the /modules/
directory. I did not notice it to affect anything after the
submofule is placed there. Submodule operations use the
submodule repositories directly (through the git link, which
can point anywhere), or in .gitmodules file, or maybe in
.git/config. So there is actually no need to have that
gitdir there. Is it correct?
--
Max
Jens Lehmann
2014-10-19 19:30:15 UTC
Permalink
Post by Max Kirillov
Post by Jens Lehmann
Post by Max Kirillov
I think the logic can be simple: it a submodule is not
checked-out in the repository "checkout --to" is called
from, then it is not checked-out to the new one also. If it
is, then checkout calls itself recursively in the submodule
and works like being run in standalone repository.
But when I later decide to populate the submodule in a
"checkout --to" work tree, should it automagically also
use the central storage, creating the modules/<name>
directory there if it doesn't exist yet? I think that'd
make sense to avoid having the work tree layout depend
on the order commands were ran in. And imagine new
submodules, they should not be handled differently from
those already present.
Like place the common directory to
$MAIN_REPO/.git/modules/$SUB/ and worktree-specific part to
$MAIN_REPO/.git/worktrees/$WORKTREE/modules/$SUB, rather
than placing all into the socond one? It would make sense to
make, but then it would be imposible to checkout a diferent
repository into the same submodule in different superproject
checkouts. However stupid is sounds, there could be cases
if, for example, at some moment submodule is being replaced
by another one, and older worktrees should work with older
submodule, while newer uses the newer submodule.
Yes, but I believe that the user must be careful to not
reuse the same submodule name for a different repo anyways,
no matter if shared or not. Currently you'll get a warning
about that when trying to add a submodule whose name is
already found in .git/modules to avoid such confusion.
Post by Max Kirillov
Maybe, there could be some options to tell the command which
populates submodules (which commands that are? "submodule update"
and other submodule subcommands? or there is something
else?) to use the curent checkout space or the main one. But
I would still leave it depend on what user explicitly calls
and where the initial submodule update is executed.
Currently only "submodule update" populates submodules, but
I'm currently working hard on teaching commands like checkout
(and lots of others) to do the same. I agree that the user
should be able to choose and for our CI server I would also
like to see a solution that could share submodules across
different superprojects. So having another environment
variable to decide where to put the work tree independent
parts of the .git/modules directory might make sense here.
Post by Max Kirillov
Also, could you clarify the usage of the /modules/
directory. I did not notice it to affect anything after the
submofule is placed there. Submodule operations use the
submodule repositories directly (through the git link, which
can point anywhere), or in .gitmodules file, or maybe in
.git/config. So there is actually no need to have that
gitdir there. Is it correct?
Nope. When submodules are cloned their git directory is
placed under .git/modules/<submodule name>, the .git file
in the work tree points there and the core.worktree setting
points back from there to the work tree.
Max Kirillov
2014-10-20 04:11:09 UTC
Permalink
Post by Jens Lehmann
Post by Max Kirillov
Post by Jens Lehmann
Post by Max Kirillov
I think the logic can be simple: it a submodule is not
checked-out in the repository "checkout --to" is called
from, then it is not checked-out to the new one also. If it
is, then checkout calls itself recursively in the submodule
and works like being run in standalone repository.
But when I later decide to populate the submodule in a
"checkout --to" work tree, should it automagically also
use the central storage, creating the modules/<name>
directory there if it doesn't exist yet? I think that'd
make sense to avoid having the work tree layout depend
on the order commands were ran in. And imagine new
submodules, they should not be handled differently from
those already present.
Like place the common directory to
$MAIN_REPO/.git/modules/$SUB/ and worktree-specific part to
$MAIN_REPO/.git/worktrees/$WORKTREE/modules/$SUB, rather
than placing all into the socond one? It would make sense to
make, but then it would be imposible to checkout a diferent
repository into the same submodule in different superproject
checkouts. However stupid is sounds, there could be cases
if, for example, at some moment submodule is being replaced
by another one, and older worktrees should work with older
submodule, while newer uses the newer submodule.
Yes, but I believe that the user must be careful to not
reuse the same submodule name for a different repo anyways,
no matter if shared or not. Currently you'll get a warning
about that when trying to add a submodule whose name is
already found in .git/modules to avoid such confusion.
Yes, while trying to write tests for this case I discovered
that there are warnings and the recommended way is to use
different names for different repositories even if they are
pointing to the same path. Then always placing common
directory into the .git/modules/<module> could be a good
idea, and in very special cases users could manually create
repositories with custom placement.
Post by Jens Lehmann
Post by Max Kirillov
Also, could you clarify the usage of the /modules/
directory. I did not notice it to affect anything after the
submofule is placed there. Submodule operations use the
submodule repositories directly (through the git link, which
can point anywhere), or in .gitmodules file, or maybe in
.git/config. So there is actually no need to have that
gitdir there. Is it correct?
Nope. When submodules are cloned their git directory is
placed under .git/modules/<submodule name>, the .git file
in the work tree points there and the core.worktree setting
points back from there to the work tree.
I meant is the fact that gitdir is placed in modules, rather
than in any other place, is used anywhere. There are 2
places to put the gitdir of submodule in linked copy:
1. $MAIN_REPO/.git/worktrees/$WORKTREE/modules/$SUB
2. $MAIN_REPO/.git/modules/$SUB/worktrees/$SUB_WTNAME
First one is suggested by submodule way of placing gitdirs,
and the second one by worrktree way. There are reasons to
have the second one - garbage collection and check that 2
branch is not checked out twice. Are there resons to have
the 1st one? The one is to prevent use of different
repositories with the same name, anything else?
--
Max
Max Kirillov
2014-10-14 20:31:14 UTC
Permalink
Post by Jens Lehmann
Until that problem is solved it looks wrong to pass
GIT_COMMON_DIR into submodule recursion, I believe
GIT_COMMON_DIR should be added to the local_repo_env array
(and even if it is passed on later, we might have to
append "/modules/<submodule_name>" to make it point to the
correct location).
Actually, why there should be an _environment_ variable
GIT_COMMON_DIR at all? I mean, gitdir is resolved to some
directory (through link or environment), and it contains the
shared data directly or referes to it with the commondir
link. In which case anyone would want to override that
location?

I searched though tests but they don't cover this.
Duy Nguyen
2014-10-15 13:08:33 UTC
Permalink
Post by Max Kirillov
Post by Jens Lehmann
Until that problem is solved it looks wrong to pass
GIT_COMMON_DIR into submodule recursion, I believe
GIT_COMMON_DIR should be added to the local_repo_env array
(and even if it is passed on later, we might have to
append "/modules/<submodule_name>" to make it point to the
correct location).
Actually, why there should be an _environment_ variable
GIT_COMMON_DIR at all? I mean, gitdir is resolved to some
directory (through link or environment), and it contains the
shared data directly or referes to it with the commondir
link. In which case anyone would want to override that
location?
It's how the implementation was built up. First I split the repo in
two and I need something to point the new/shared part.
$GIT_DIR/worktrees comes later to glue them up. Keeping it an
environment variable gives us a possibility to glue things up in a
different way than using $GIT_DIR/worktrees. Of course the odds of
anybody doing that are probably small or even near zero.

Still consuming the rest of this thread. Thanks all for working on the
submodule support for this.
--
Duy
Junio C Hamano
2014-10-15 17:09:59 UTC
Permalink
Post by Duy Nguyen
Post by Max Kirillov
Post by Jens Lehmann
Until that problem is solved it looks wrong to pass
GIT_COMMON_DIR into submodule recursion, I believe
GIT_COMMON_DIR should be added to the local_repo_env array
(and even if it is passed on later, we might have to
append "/modules/<submodule_name>" to make it point to the
correct location).
Actually, why there should be an _environment_ variable
GIT_COMMON_DIR at all? I mean, gitdir is resolved to some
directory (through link or environment), and it contains the
shared data directly or referes to it with the commondir
link. In which case anyone would want to override that
location?
It's how the implementation was built up. First I split the repo in
two and I need something to point the new/shared part.
$GIT_DIR/worktrees comes later to glue them up. Keeping it an
environment variable gives us a possibility to glue things up in a
different way than using $GIT_DIR/worktrees. Of course the odds of
anybody doing that are probably small or even near zero.
Still consuming the rest of this thread. Thanks all for working on the
submodule support for this.
Hmph. I was hoping that the multiple-work-trees topic was ready for
'next' by now, but we may want to wait to see how the interaction
with submodule plays out to have another chance of a clean reroll
before it happens. This is a topic with large impact and is quite
intrusive code-wise, even though the intrusive parts are cleanly
done. So we'd want to take more time to unleash it to the general
public than more usual small scale topics, anyway.

Thanks.
Duy Nguyen
2014-10-17 09:14:19 UTC
Permalink
Post by Junio C Hamano
Hmph. I was hoping that the multiple-work-trees topic was ready for
'next' by now, but we may want to wait to see how the interaction
with submodule plays out to have another chance of a clean reroll
before it happens. This is a topic with large impact and is quite
intrusive code-wise, even though the intrusive parts are cleanly
done. So we'd want to take more time to unleash it to the general
public than more usual small scale topics, anyway.
Originally I wanted to get it merged without submodule support, but I
failed to spot the local_repo_env problem and could have caused a
regression for submodules. So yeah delaying the series does not sound
bad. Not sure about the reroll (i.e. rewriting current patches). I
think putting patches on top with explanation is better. But we can
keep it in 'pu' and see if we really need to reroll.
--
Duy
Jens Lehmann
2014-10-19 19:34:33 UTC
Permalink
Post by Duy Nguyen
Post by Junio C Hamano
Hmph. I was hoping that the multiple-work-trees topic was ready for
'next' by now, but we may want to wait to see how the interaction
with submodule plays out to have another chance of a clean reroll
before it happens. This is a topic with large impact and is quite
intrusive code-wise, even though the intrusive parts are cleanly
done. So we'd want to take more time to unleash it to the general
public than more usual small scale topics, anyway.
Originally I wanted to get it merged without submodule support, but I
failed to spot the local_repo_env problem and could have caused a
regression for submodules. So yeah delaying the series does not sound
bad. Not sure about the reroll (i.e. rewriting current patches). I
think putting patches on top with explanation is better. But we can
keep it in 'pu' and see if we really need to reroll.
I didn't look into your series in detail, but it looks to my like
excepting the .git/modules directory from sharing (by putting it
into local_repo_env array) and adding a test for that (just to be
sure that no modules directory shows up where it shouldn't) should
be sufficient to get your stuff merged without submodule support.
It might be better to handle submodule support in a follow up series.

Does that make sense?
Continue reading on narkive:
Search results for '[PATCH 1/4] checkout: do not fail if target is an empty directory' (Questions and Answers)
9
replies
How should I advertise my new children's birthday party and play center?
started 2008-01-06 19:41:58 UTC
advertising & marketing
Loading...