Elijah Newren
2010-09-05 00:13:52 UTC
This patch series implements some basics for sparse clones, which I
define as a clone where not all blob, tree, or commit objects are
downloaded. The idea is to include sparseness both relative to span
of files/directories and depth of history, though currently I've only
put effort into span of paths.
This series is built on pu, because it requires
en/object-list-with-pathspec.
What works:
* all operations on non-sparse clones (full testsuite passes)
* clone
* read-tree
* ls-files
* cat-file
* ls-tree
* checkout
* diff
* status
* log
* add (except for not giving errors for paths outside the sparse limits)
* commit
What doesn't work, yet:
* Probably everything not tested in the new t572*.sh tests :-)
Notable examples of things missing from t572*.sh tests:
* fetch
* push
* merge
* rebase
* thin packs (need to modify pack-objects to only delta against
objects within the sparse limits)
* densify command (to make a sparse repository non-sparse)
* "missing" commits (see README file in PATCH1)
Cursory comparison with Nguyễn Thái Ngọc Duy's subtree clone (he's
probably made progress since his last submission, so this may be
outdated):
* His series supports fetch, mine doesn't (yet).
* His series supports push, mine doesn't (yet).
* His series supports merge, mine doesn't (yet).
* His handling of the subtree request over clone/fetch via capabilities
is probably the right way; I'm pretty sure my passing of sparse
limits as extra arguments to upload-pack would break backward
compatibility and be bad.
* He supports just one selected subtree (though he mentioned he's
working on extending that); I support an arbitrary number of subtrees
or subfiles.
* He modifies index format (bumping to header version 4); I don't.
Perhaps it's necessary for merge handling as I haven't implemented
that, but at an early glance I don't think it's necessary.
* While there are some similarities in the low-level details of how
we've modified git to avoid missing objects, there are many
differences as well. I'm hoping to provoke some good discussion.
Elijah Newren (15):
P1- README-sparse-clone: Add a basic writeup of my ideas for sparse clones
Just a big old write-up. Not everything in it is implemented yet, but it
gives you the high-level picture.
P2- Add tests for client handling in a sparse repository
Tests! Yaay!
P3- Read sparse limiting args from $GIT_DIR/sparse-limit
When a sparse clone is created, limiting paths will be stored.
P4- When unpacking in a sparse repository, avoid traversing missing
trees/blobs
P5- read_tree_recursive: Avoid missing blobs and trees in a sparse
repository
P6- Automatically reuse sparse limiting arguments in revision walking
P7- cache_tree_update(): Capability to handle tree entries missing from
index
P8- cache_tree_update(): Require relevant tree to be passed
Avoiding missing trees/blobs.
P9- Add tests for communication dealing with sparse repositories
Tests for clone/fetch/push/etc. Just clone so far.
P10- sparse-repo: Provide a function to record sparse limiting arguments
Can't just read from $GIT_DIR/sparse-limit; gotta write to it too.
P11- builtin-clone: Accept paths for sparse clone
P12- Pass extra (rev-list) args on, at least in some cases
P13- upload-pack: Handle extra rev-list arguments being passed
P14- EVIL COMMIT: Include all commits
P15- clone: Ensure sparse limiting arguments are used in subsequent
operations
I like the changes to how clone accepts additional rev-list arguments
to limit what is downloaded, but I'm not too happy with how these
patches pass those rev-list arguments on to upload-pack. So don't
bother looking too closely at these.
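
To make the $GIT_DIR/sparse-limit idea from P3 and P10 concrete, here is
a minimal sketch in Python. It is not the code in these patches. It
assumes, purely for illustration, that the file holds one limiting path
per line; the real series records rev-list arguments and its on-disk
format may differ. The function names are hypothetical.

```python
import os

def write_sparse_limits(git_dir, paths):
    """Record limiting paths, one per line (assumed format, not git's)."""
    with open(os.path.join(git_dir, "sparse-limit"), "w") as f:
        for p in paths:
            # Store directories without a trailing slash.
            f.write(p.rstrip("/") + "\n")

def read_sparse_limits(git_dir):
    """Return the recorded limits, or [] if the repository is not sparse."""
    try:
        with open(os.path.join(git_dir, "sparse-limit")) as f:
            return [line.rstrip("\n") for line in f if line.strip()]
    except FileNotFoundError:
        return []

def within_limits(path, limits):
    """True if path lies inside the sparse limits (or there are none)."""
    if not limits:
        return True
    return any(path == limit or path.startswith(limit + "/")
               for limit in limits)
```

A tree walker in a sparse repository would also have to descend into
trees that lead to a limit (for example "t" when the limit is
"t/README"); the sketch leaves that case out.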
 Makefile                                   |   2 +
 README-sparse-clone                        | 284 ++++++++++++++++++++++++++++
 builtin/archive.c                          |   2 +-
 builtin/checkout.c                         |   2 +-
 builtin/clone.c                            |  39 +++-
 builtin/commit.c                           |  15 +-
 builtin/fetch-pack.c                       |   3 +-
 builtin/merge.c                            |  19 +-
 builtin/revert.c                           |   7 +-
 builtin/send-pack.c                        |   3 +-
 builtin/write-tree.c                       |   6 +-
 cache-tree.c                               |  92 +++++++++-
 cache-tree.h                               |   4 +-
 cache.h                                    |   5 +-
 connect.c                                  |   9 +-
 diff.h                                     |   1 -
 environment.c                              |   2 +
 merge-recursive.c                          |   6 +-
 merge-recursive.h                          |   2 +-
 revision.c                                 |  21 ++-
 revision.h                                 |   3 +-
 setup.c                                    |   2 +
 sparse-repo.c                              |  84 ++++++++
 sparse-repo.h                              |   4 +
 t/sparse-lib.sh                            |  38 ++++
 t/t5601-clone.sh                           |  14 --
 t/t5720-sparse-repository-basics.sh        | 130 +++++++++++++
 t/t5721-sparse-repository-communication.sh | 106 +++++++++++
 test-dump-cache-tree.c                     |   3 +-
 transport-helper.c                         |   5 +-
 transport.c                                |  13 +-
 transport.h                                |   9 +-
 tree-diff.c                                |   4 +-
 tree-walk.c                                |  48 ++++-
 tree-walk.h                                |   3 +
 tree.c                                     |   5 +
 upload-pack.c                              |  45 +++--
 37 files changed, 952 insertions(+), 88 deletions(-)
 create mode 100644 README-sparse-clone
 create mode 100644 sparse-repo.c
 create mode 100644 sparse-repo.h
 create mode 100644 t/sparse-lib.sh
 create mode 100755 t/t5720-sparse-repository-basics.sh
 create mode 100755 t/t5721-sparse-repository-communication.sh
-- 
1.7.2.3.541.g94cc33