Discussion:
Trying to sync two svn repositories with git-svn (repost)
Josef Wolf
2009-04-27 20:12:51 UTC
Hello,

I have two subversion repositories which I would like to synchronize via
git-svn. For this, I have set up a git repository and configured two
branches to track the subversion repositories via git-svn:

mkdir test-sync
cd test-sync
git svn init --stdlayout file://$REPOSDIR/svn-first

for repos in svn-first svn-second; do
git config svn-remote.$repos.url file://$REPOSDIR/$repos
git config svn-remote.$repos.fetch trunk:refs/remotes/$repos/trunk
git config svn-remote.$repos.branches branches/*:refs/remotes/$repos/*
git config svn-remote.$repos.tags tags/*:refs/remotes/$repos/tags/*
git svn fetch -R $repos
git checkout -b $repos $repos/trunk
done
git gc

This gives me two remote and two local branches:

master
svn-first
* svn-second
svn-first/trunk
svn-second/trunk

As a first step, I tried to "mirror" the manual "merges" that were done
between the subversion repositories in the past:

git checkout svn-first
git cherry-pick svn-second-sha1 .... # repeat as needed

git checkout svn-second
git cherry-pick svn-first-sha1 .... # repeat as needed

So I've spent almost 4 weeks cherry-picking and resolving all the conflicts.
Looks good so far, since

git diff svn-first svn-second
git diff svn-first/trunk svn-first
git diff svn-second/trunk svn-second

give me the desired outputs. Now I do

git checkout svn-first
git merge -s ours svn-second
git checkout svn-second
git merge -s ours svn-first

to tell git that the branches are in sync.
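
A minimal sketch of what such a merge records, using plain git in a throwaway repository. The branch names "first" and "second" and the file contents are made up for illustration and are not the real repositories; no git-svn is involved here.

```shell
#!/bin/sh
# Sketch only: throwaway repository, hypothetical branch names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"
git symbolic-ref HEAD refs/heads/first

echo base > file
git add file
git commit -q -m "base"

# Let the two branches diverge.
git checkout -q -b second
echo "from second" > file
git commit -q -a -m "change on second"

git checkout -q first
echo "from first" > file
git commit -q -a -m "change on first"

# Record "second" as merged without taking any of its content.
tree_before=$(git rev-parse 'HEAD^{tree}')
git merge -q -s ours -m "mark second as merged" second
tree_after=$(git rev-parse 'HEAD^{tree}')

# The tree is unchanged, yet "second" is now an ancestor of "first".
if [ "$tree_before" = "$tree_after" ]; then same_tree=yes; else same_tree=no; fi
if git merge-base --is-ancestor second first; then merged=yes; else merged=no; fi
echo "same_tree=$same_tree merged=$merged"
```

The point of the sketch is only that "-s ours" adds a merge commit whose second parent is the other branch while leaving the files alone; how git-svn treats such a commit is a separate question.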

But now, when I try to

git checkout svn-second
git svn rebase

I get lots of conflicts. When I inspect the .git/rebase-apply directory
and the conflicts, it looks like "git svn rebase" tries to re-apply all
the commits from svn-first. When I omit the "git merge -s ours svn-first"
command, it does not re-apply those commits. So it looks like the
"git merge -s ours" wipes some information that git-svn needs to know
what was already merged.

What am I missing? I thought the "ours" strategy is meant to tell git
that everything from that branch was merged, either manually or by
cherry-pick.

Any hints how to track this down?

BTW: this is git version 1.6.0.2
Josef Wolf
2009-04-28 20:30:40 UTC
On Mon, Apr 27, 2009 at 10:12:51PM +0200, Josef Wolf wrote:

[ ... ]
Post by Josef Wolf
give me the desired outputs. Now I do
git checkout svn-first
git merge -s ours svn-second
git checkout svn-second
git merge -s ours svn-first
to tell git that the branches are in sync.
But now, when I try to
git checkout svn-second
git svn rebase
I get lots of conflicts. When I inspect the .git/rebase-apply directory
and the conflicts, it looks like "git svn rebase" tries to re-apply all
the commits from svn-first. When I omit the "git merge -s ours svn-first"
command, it does not re-apply those commits. So it looks like the
"git merge -s ours" wipes some information that git-svn needs to know
what was already merged.
What am I missing? I thought the "ours" strategy is meant to tell git
that everything from that branch was merged, either manually or by
cherry-pick.
After lots of RTFM, I get the impression, that cherry-pick is the only
operation I can do to sync a git-svn branch with other (git or git-svn)
branches. merge/pull should be avoided.

But then, how do I mark cherry-picked commits as "already synched"?
Avery Pennarun
2009-04-28 20:53:52 UTC
I have two subversion repositories which I would like to synchronize via
git-svn. [...]
What you're attempting is rather complicated. I guess I'd suggest we
back up a step: why do you want to do this? In what way does a "pure
svn" tool like svnsync
(http://svnbook.red-bean.com/en/1.5/svn.ref.svnsync.html) not do what
you want?

Are you making changes to *both* svn repositories and then want to
synchronize their histories? This is basically impossible, since svn
only has a linear history. If you add commit A to one repo, and
commit B to the other, you will never make the histories identical in
both repos; one will necessarily have A and then B, and the other will
have B and then A. That's not really desirable, since one of those
two histories is a lie. If commit B breaks A, but it looks like A was
committed *after* B, then the person who wrote A will be blamed for
the error.

So, what is it you're *really* trying to do?

Have fun,

Avery
Josef Wolf
2009-04-28 22:37:28 UTC
Thanks for your answer, Avery!
Post by Avery Pennarun
I have two subversion repositories which I would like to synchronize via
git-svn. [...]

What you're attempting is rather complicated. I guess I'd suggest we
back up a step: why do you want to do this? In what way does a "pure
svn" tool like svnsync
(http://svnbook.red-bean.com/en/1.5/svn.ref.svnsync.html) not do what
you want?
The explanation for that gets somewhat longish, sorry for that.

The project is about configuring networks. If you think about cfengine,
you get pretty close to what it is about. It started out as a single
repository. But since configuring networks requires lots of
"localization", it was soon split into two parts: One part are the
"mechanics": generic libraries/scripts. The other part is the
"policy": the configuration that specifies how the mechanics should
do their task. The reason for the split was to allow multiple
policies (for independent administrations) to share one common set of
mechanics (as svn:externals). For security reasons, the mechanics were
imported back into the (multiple) policies later on.

Currently, there exist multiple independent repositories (for security
reasons). In the past, the repositories were "synchronized" manually.
So technically, the repositories have no common history (at least not
in svn's metadata). But the contents are actually rather "similar",
since they were synchronized multiple times in the past.

In the long term, I'd like to move everything completely to git. That
would make it much easier to move changes from one repos to the other
while keeping the (intended) differences in the policy.

So my first goal is to bring the contents into sync. The next step would
be to create a "reference" (the official) git repository, which can be
cloned by the administrations to create their localized repositories.

In the meantime, I need a way to synchronize the contents from time to
time. I guess it will take some time to create the official repos and
get used to the work flow.
Post by Avery Pennarun
Are you making changes to *both* svn repositories
Yes.
Post by Avery Pennarun
and then want to synchronize their histories?
Since synchronization was done manually in the past, I do not (yet) care
about the history very much. My first goal was to get the contents into
a sane state: I've done lots of criss-cross cherry-picking. Now that the
contents are in sync (within git branches), I'd like to do two things:

- Set "markers" to indicate which commits are already synchronized, so
the next synchronization will be easier. I thought "git-merge -s ours"
would be the correct way to do that. But as I already wrote: this
causes the next "git-svn rebase" to apply all the commits of the other
branch (which I already cherry-picked)

- Feed the cherry-picked commits back to the svn repositories. I've not
tried that yet, since I think I should "git-svn rebase" first.
Post by Avery Pennarun
This is basically impossible, since svn only has a linear history.
If you add commit A to one repo, and
commit B to the other, you will never make the histories identical in
both repos;
Yeah, I see. But I don't really care about that, as long as the relative
order of the commits is kept when they are moved to the other repos.
Since the svn repositories will die once the migration is done, this
is not a big deal.
Post by Avery Pennarun
one will necessarily have A and then B, and the other will
have B and then A. That's not really desirable, since one of those
two histories is a lie.
The current situation is an even bigger lie.
Post by Avery Pennarun
So, what is it you're *really* trying to do?
I hope my explanation was not too boring...
Post by Avery Pennarun
Have fun,
Yeah, I am having _lots_ of fun =8)
Avery Pennarun
2009-04-29 03:19:51 UTC
Currently, there exist multiple independent repositories (for security
reasons). In the past, the repositories were "synchronized" manually.
So technically, the repositories have no common history (at least not
in svn's metadata). But the contents are actually rather "similar",
since they were synchronized multiple times in the past.
In the long term, I'd like to move everything completely to git. That
would make it much easier to move changes from one repos to the other
while keeping the (intended) differences in the policy.
So my first goal is to bring the contents into sync. The next step would
be to create a "reference" (the official) git repository, which can be
cloned by the administrations to create their localized repositories.
In the meantime, I need a way to synchronize the contents from time to
time. I guess it will take some time to create the official repos and
get used to the work flow.
Okay, I think I'm following you. And I think the difficulty of your
solution will depend on how important it is to cherry-pick each
individual commit from each repo vs. just merging everything as a
batch.

At Versabanq, we're using git for a bunch of stuff including our
autobuilder (http://github.com/apenwarr/gitbuilder) and my own
branching/merging. However, for historical reasons, everything needs
to also go into an svn repository, which some people use.

Yes, it is possible to rebase everything from git onto an svn branch,
and then git svn dcommit it. However, in my experience, this is
fairly nasty (and it also tries to linearize non-linear history, which
is just messy). What we've been doing lately is just merging all
changes from git into the svn branch as a single commit:

git checkout git-svn
git merge --no-ff mybranch # --no-ff prevents git-svn from crazily linearizing things
git svn dcommit

git checkout mybranch
git merge git-svn

As long as you "git config merge.summary true" (to make the merge
commit list all the commits it's merging) and you merge frequently
enough, this is reasonably painless. You end up with a lot of merge
commits, but the git history is recording everything fully, so if you
want to throw away svn someday, you can just go ahead.
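
A rough sketch of the merge half of this workflow, in plain git only. The repository, the branch name "trunk-mirror" (standing in for the git-svn branch) and the file names are hypothetical, and the 'git svn dcommit' step is left out because it needs a real svn repository. As far as I know, later git versions document the setting as merge.log and treat merge.summary as an older synonym; the sketch assumes that.

```shell
#!/bin/sh
# Sketch only: throwaway repository, hypothetical branch and file names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"
git symbolic-ref HEAD refs/heads/trunk-mirror

echo base > file
git add file
git commit -q -m "base"

# Two commits on a side branch that could simply be fast-forwarded.
git checkout -q -b mybranch
echo one > one.txt
git add one.txt
git commit -q -m "add one"
echo two > two.txt
git add two.txt
git commit -q -m "add two"

git checkout -q trunk-mirror
git config merge.log true   # assumed newer spelling of merge.summary
git merge -q --no-ff --no-edit mybranch

# Count the commit id plus its parents: 3 words means a two-parent merge.
parents=$(git rev-list --parents -n 1 HEAD | wc -w | tr -d ' ')
message=$(git log -1 --format=%B)
echo "$message"
```

With --no-ff the two commits arrive as one merge commit, and the generated message lists their subjects, which is the part that would end up in the single svn commit.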

Now, your problem is a little more complex, because it sounds like
people are checking in two types of things on both sides: private
things and public things. So if you want *only* the private things,
you're going to have to cherry-pick, and cherry-picking is going to
confuse your merging.

If you could convince the people using svn to use two branches: one
for private stuff and one for public stuff, then your life would be
easier. You could just merge the public stuff in git, and ignore the
private stuff.

If that's not an option, you *can* combine cherry-pick with -s ours as
you suggest, though it's kind of nasty. The trick is to merge -s ours
in *both* directions at the right time, so you can avoid conflicts.

git checkout git-svn
git merge mybranch

git checkout mybranch
[git cherry-pick or git merge *everything* you're missing from git-svn...]
git merge -s ours --no-ff git-svn
# future merges from git-svn will ignore everything in mybranch up to now

git checkout git-svn
# we know git-svn is already up to date, because of the first merge above
git merge -s ours --no-ff mybranch
# future merges from mybranch will ignore everything in git-svn up to now
git svn dcommit

After these steps (WARNING: I didn't actually run them, so I might
have made a mistake), you should have both branches in sync, and you
*should* be able to merge in both directions whenever you want (make
sure you use --no-ff), until the next time someone commits something
private and screws you over again.

If you have more than one svn server, the above method should be
extensible; just use another svn branch in place of 'mybranch' or keep
cross-merging across all the branches.

Good luck :)

Avery
Josef Wolf
2009-04-29 16:01:30 UTC
Thanks for your answer, Avery!
Post by Avery Pennarun
Currently, there exist multiple independent repositories (for security
reasons). In the past, the repositories were "synchronized" manually.
So technically, the repositories have no common history (at least not
in svn's metadata). But the contents are actually rather "similar",
since they were synchronized multiple times in the past.
In the long term, I'd like to move everything completely to git. That
would make it much easier to move changes from one repos to the other
while keeping the (intended) differences in the policy.
So my first goal is to bring the contents into sync. The next step would
be to create a "reference" (the official) git repository, which can be
cloned by the administrations to create their localized repositories.
In the meantime, I need a way to synchronize the contents from time to
time. I guess it will take some time to create the official repos and
get used to the work flow.

Okay, I think I'm following you. And I think the difficulty of your
solution will depend on how important it is to cherry-pick each
individual commit from each repo vs. just merging everything as a
batch.
I've already done the cherry-picking. Basically, I've done this:

# first, move patches from second-svn to first-svn
git checkout first-svn
git svn rebase
git cherry-pick sha1 # repeat as needed
git merge -s ours second-svn

# Now, the other way around
git checkout second-svn
git svn rebase
git cherry-pick sha1 # repeat as needed
git merge -s ours first-svn

The first git-svn-rebase after the merge causes all the (already picked)
commits from the other branch to be pulled into the current branch.
Adding the --no-ff option does not help. Omitting the cherry-picking
does not help, either.

To be honest, I do not understand this behavior at all. I thought
"-s ours" should mark the other branch as "already merged". IMHO, this
should prevent future merges from pulling those commits again. But
this seems not to be true: git-svn-rebase tries to apply them _again_,
causing almost everything to conflict.
Post by Avery Pennarun
At Versabanq, we're using git for a bunch of stuff including our
autobuilder (http://github.com/apenwarr/gitbuilder) and my own
Interesting project. One question: the README mentions that the
gitbuilder is capable of updating itself. But I cannot actually see
this functionality in the scripts. Is that just a typo or am I missing
something?
Post by Avery Pennarun
Yes, it is possible to rebase everything from git onto an svn branch,
and then git svn dcommit it.
AFAICS, this is the preferred work flow of git-svn.
Post by Avery Pennarun
However, in my experience, this is
fairly nasty (and it also tries to linearize non-linear history, which
is just messy). What we've been doing lately is just merging all
changes from git into the svn branch as a single commit:

git checkout git-svn
git merge --no-ff mybranch # --no-ff prevents git-svn from crazily linearizing things
git svn dcommit

git checkout mybranch
git merge git-svn
How would this be changed if the commits are coming from the other svn
repo instead of directly from git?

So should I replace all my "git cherry-pick sha1" by corresponding
"git merge -s ??? --no-ff sha1" commands?
Post by Avery Pennarun
As long as you "git config merge.summary true" (to make the merge
commit list all the commits it's merging)
How does this option influence the merge operation? Or is this meant
to provide additional information to the person who does the next merge?
Post by Avery Pennarun
and you merge frequently
enough, this is reasonably painless. You end up with a lot of merge
commits, but the git history is recording everything fully, so if you
want to throw away svn someday, you can just go ahead.
Sounds good, but I still don't get it :-) Can you provide a more
verbose example of the workflow?
Post by Avery Pennarun
Now, your problem is a little more complex, because it sounds like
people are checking in two types of things on both sides: private
things and public things. So if you want *only* the private things,
I want both. The difference is that I (usually) want to pull the public
things unmodified, while I want to generalize/localize the private things.
So when merging the private part, I would not want to pick the specific
entries. But I still want to pick the _structure_ (possibly removing or
modifying the localized entries).
Post by Avery Pennarun
you're going to have to cherry-pick, and cherry-picking is going to
confuse your merging.

If you could convince the people using svn to use two branches: one
for private stuff and one for public stuff, then your life would be
easier. You could just merge the public stuff in git, and ignore the
private stuff.

If that's not an option, you *can* combine cherry-pick with -s ours as
you suggest, though it's kind of nasty. The trick is to merge -s ours
in *both* directions at the right time, so you can avoid conflicts.

git checkout git-svn
git merge mybranch

git checkout mybranch
[git cherry-pick or git merge *everything* you're missing from git-svn...]
git merge -s ours --no-ff git-svn
# future merges from git-svn will ignore everything in mybranch up to now

git checkout git-svn
# we know git-svn is already up to date, because of the first merge above
git merge -s ours --no-ff mybranch
# future merges from mybranch will ignore everything in git-svn up to now
git svn dcommit
Does that mean I should do a normal merge _before_ I go cherry-pick?
Post by Avery Pennarun
After these steps (WARNING: I didn't actually run them, so I might
have made a mistake), you should have both branches in sync, and you
*should* be able to merge in both directions whenever you want (make
sure you use --no-ff), until the next time someone commits something
private and screws you over again.

If you have more than one svn server, the above method should be
extensible; just use another svn branch in place of 'mybranch' or keep
cross-merging across all the branches.
Maybe I got confused about what I should do on which branch. Currently,
I have five branches:

first-svn/trunk # svn-remote branch of my repos
second-svn/trunk # svn-remote branch of their repos
first-svn # created by "git checkout -b first-svn first-svn/trunk"
second-svn # created by "git checkout -b second-svn second-svn/trunk"
master # I don't use this one yet

For this, I have the following configuration:

[svn-remote "first-svn"]
url = file:///var/tmp/builds/git-sync/svn/first-svn
fetch = trunk:refs/remotes/first-svn/trunk
branches = branches/*:refs/remotes/first-svn/*
tags = tags/*:refs/remotes/first-svn/tags/*
[svn-remote "second-svn"]
url = file:///var/tmp/builds/git-sync/svn/second-svn
fetch = trunk:refs/remotes/second-svn/trunk
branches = branches/*:refs/remotes/second-svn/*
tags = tags/*:refs/remotes/second-svn/tags/*
Avery Pennarun
2009-04-29 18:13:29 UTC
Post by Josef Wolf
Okay, I think I'm following you. And I think the difficulty of your
solution will depend on how important it is to cherry-pick each
individual commit from each repo vs. just merging everything as a
batch.
I've already done the cherry-picking.
So you're saying that from now on, *all* changes from *both* branches
need to be integrated in both directions? If so, you're done with
cherry-picking. If not, you're not.
Post by Josef Wolf
# first, move patches from second-svn to first-svn
git checkout first-svn
git svn rebase
git cherry-pick sha1 # repeat as needed
git merge -s ours second-svn
# Now, the other way around
git checkout second-svn
git svn rebase
git cherry-pick sha1 # repeat as needed
git merge -s ours first-svn
The first git-svn-rebase after the merge causes all the (already picked)
commits from the other branch to be pulled into the current branch.
Adding the --no-ff option does not help. Omitting the cherry-picking
does not help, either.
Hmm, I don't see any 'git svn dcommit' in there. The steps I listed
referred to dcommit, but explicitly left out calls to 'git svn
rebase'.

I think it's likely that your problems stem from this. The git svn
documentation refers to the 'git svn rebase' operation a lot, but it's
only really useful for one thing: linearizing history to make it look
(to svn) like git was never involved. This is handy for people who
want to use git at work without their boss knowing about it, but it
*loses information* and will mess up future merges.

In general, 'git svn rebase' should be avoided for all the same
reasons that 'git rebase' should be avoided. They're both great when
used carefully, but they shouldn't be your main day-to-day activity.
Unfortunately git svn encourages you to use rebase in your day-to-day
activity... but the workflow I'm talking about actually avoids this
problem completely. What you want most of the time is really just
'git svn fetch' and 'git svn dcommit'.

I think I was also a bit too offhand in my previous email when
expanding my suggestion to work with multiple svn hosts. The clearest
way to do this is with three branches:

- 1 remote branch: git-svn-1
- 1 remote branch: git-svn-2
- 1 local branch: master

So the steps are something like this. (Again, WARNING: I'm not
running these as I type them, so I could be screwing up just about
anything.)

Getting started:

git checkout master
... Use 'git svn fetch' to update git-svn-1 and git-svn-2 ...
... git merge/cherry-pick what you want from git-svn-1 and
git-svn-2. ALWAYS use --no-ff if using git merge
git merge --no-ff -s ours git-svn-1
git merge --no-ff -s ours git-svn-2
# now master has everything from both svn repositories

From now on:

# Update git-svn-1 with the latest master
git checkout git-svn-1
# since git-svn-1 is a remote branch, you now have a detached HEAD
git merge --no-ff master
git svn dcommit

# Update git-svn-2 with the latest master
git checkout git-svn-2
# since git-svn-2 is a remote branch, you now have a *different* detached HEAD
git merge --no-ff master
git svn dcommit

# Update master with the latest svn
git checkout master
# HEAD is now attached to master
git merge --no-ff git-svn-1
git merge --no-ff git-svn-2
# no need for '-s ours' in the above merge, as no rebasing means no merge history was lost
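
The "detached HEAD" remarks in the steps above can be illustrated with plain git. This is a sketch under assumptions: refs/remotes/git-svn-1 is created by hand here as a stand-in for the ref that git-svn would maintain, and no svn repository or dcommit is involved.

```shell
#!/bin/sh
# Sketch only: throwaway repository; the remote ref is faked with update-ref.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"
git symbolic-ref HEAD refs/heads/master

echo base > file
git add file
git commit -q -m "base"

# Stand-in for the remote ref that git-svn would normally keep up to date.
git update-ref refs/remotes/git-svn-1 HEAD
remote=$(git rev-parse refs/remotes/git-svn-1)

echo more >> file
git commit -q -a -m "work on master"

# Checking out a remote ref leaves HEAD detached.
git checkout -q refs/remotes/git-svn-1
if git symbolic-ref -q HEAD >/dev/null; then detached=no; else detached=yes; fi

# The merge commit is created on the detached HEAD; the remote ref itself
# does not move (in the real workflow that is left to 'git svn dcommit').
git merge -q --no-ff --no-edit master
first_parent=$(git rev-parse 'HEAD^1')
still_at=$(git rev-parse refs/remotes/git-svn-1)
echo "detached=$detached"
```

Until something moves a ref to it, the merge commit is reachable only through HEAD, so the dcommit step (or a temporary branch) is what keeps it from being left behind.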
Post by Josef Wolf
At Versabanq, we're using git for a bunch of stuff including our
autobuilder (http://github.com/apenwarr/gitbuilder) and my own
Interesting project. One question: the README mentions that the
gitbuilder is capable of updating itself. But I cannot actually see
this functionality in the scripts. Is that just a typo or am I missing
something?
I guess you're reading the line that says, "Now that your gitbuilder
is working, you probably want to have it continue
to update itself automatically." This is actually talking about
*running* itself automatically, as in "updating the build results to
the latest copy of your project." I can see how it's a very unclear
word to use there. Thanks for the feedback.
Post by Josef Wolf
As long as you "git config merge.summary true" (to make the merge
commit list all the commits it's merging)
How does this option influence the merge operation? Or is this meant
to provide additional information to the person who does the next merge?

When you *merge* (as opposed to rebase or cherry-pick) into an svn
branch, you only create a *single* svn commit that contains *all* the
changes. The above config setting just makes the merge commit contain
a list of all the commits it contains.
Post by Josef Wolf
Now, your problem is a little more complex, because it sounds like
people are checking in two types of things on both sides: private
things and public things. So if you want *only* the private things,
I want both. The difference is that I (usually) want to pull the public
things unmodified, while I want to generalize/localize the private things.
So when merging the private part, I would not want to pick the specific
entries. But I still want to pick the _structure_ (possibly removing or
modifying the localized entries).
If you're going to be mangling things so thoroughly, then you might
just have to resort to cherry-picking everything one by one from one
branch to the other. It doesn't sound very fun, but if other people
are being so uncooperative by mixing public and private stuff in their
repositories, there's no way I can see to automate it anyhow.

If you're using cherry-pick for everything, there's no reason to use
tricks like 'merge -s ours'. Just leave out the merging entirely and
don't pretend that what you're doing is merging; it isn't. (You still
don't need 'git svn rebase' for anything. Just checkout the branch
you want to change, cherry-pick stuff into it, and 'git svn dcommit'
if appropriate.)

If the situation ever changes, you can always do one last 'merge -s
ours' and mark the histories as combined. Then future merges will
bring in any future changes.

Good luck.

Avery
Josef Wolf
2009-04-29 22:37:47 UTC
Thanks for your great explanations, Avery!
Post by Avery Pennarun
Post by Josef Wolf
Okay, I think I'm following you. And I think the difficulty of your
solution will depend on how important it is to cherry-pick each
individual commit from each repo vs. just merging everything as a
batch.
I've already done the cherry-picking.

So you're saying that from now on, *all* changes from *both* branches
need to be integrated in both directions?
Exactly. Those three commands:

git diff first-svn second-svn # this should be the "private" diff
git diff first-svn/trunk first-svn # what my cherry-picking has changed
# (and waits for push) in first-svn
git diff second-svn/trunk second-svn # what my cherry-picking has changed
# (and waits for push) in second-svn

show me _exactly_ what I want them to be. The manual synchronizations
which were done in the past are resolved now. But I can't find a way
to put the result of this cherry-picking back into the svn repositories.
Post by Avery Pennarun
If so, you're done with cherry-picking. If not, you're not.
Yeah, I think I'm done with it. That's why I tried to "git merge -s ours"
to tell git about the good news ;-)
Post by Avery Pennarun
Post by Josef Wolf
# first, move patches from second-svn to first-svn
git checkout first-svn
git svn rebase
git cherry-pick sha1 # repeat as needed
git merge -s ours second-svn
# Now, the other way around
git checkout second-svn
git svn rebase
git cherry-pick sha1 # repeat as needed
git merge -s ours first-svn
The first git-svn-rebase after the merge causes all the (already picked)
commits from the other branch to be pulled into the current branch.
Adding the --no-ff option does not help. Omitting the cherry-picking
does not help, either.

Hmm, I don't see any 'git svn dcommit' in there. The steps I listed
referred to dcommit, but explicitly left out calls to 'git svn
rebase'.
Ah! I thought I _have_ to "git svn rebase" before I dcommit, like I need
to "svn update" before I can do "svn commit".
Post by Avery Pennarun
I think it's likely that your problems stem from this. The git svn
documentation refers to the 'git svn rebase' operation a lot, but it's
only really useful for one thing: linearizing history to make it look
(to svn) like git was never involved. This is handy for people who
want to use git at work without their boss knowing about it, but it
*loses information* and will mess up future merges.
OK, I'll try without rebase.
Post by Avery Pennarun
In general, 'git svn rebase' should be avoided for all the same
reasons that 'git rebase' should be avoided. They're both great when
used carefully, but they shouldn't be your main day-to-day activity.
Unfortunately, all the howto's I could find recommend exactly that:
git-svn-rebase for getting commits from svn and dcommit for sending
commits to svn.
Post by Avery Pennarun
Unfortunately git svn encourages you to use rebase in your day-to-day
activity... but the workflow I'm talking about actually avoids this
problem completely. What you want most of the time is really just
'git svn fetch' and 'git svn dcommit'.

I think I was also a bit too offhand in my previous email when
expanding my suggestion to work with multiple svn hosts. The clearest
way to do this is with three branches:

- 1 remote branch: git-svn-1
- 1 remote branch: git-svn-2
- 1 local branch: master
I will try this one. But this will take a while, since my
cherry-picking was done criss-cross. Thus, I need to "rebase"
the cherries now to get them onto a single branch. Is there
a simple way to do that or do I have to redo the cherry-picking from
scratch?
Post by Avery Pennarun
So the steps are something like this. (Again, WARNING: I'm not
running these as I type them, so I could be screwing up just about
anything.)

Getting started:

git checkout master
... Use 'git svn fetch' to update git-svn-1 and git-svn-2 ...
... git merge/cherry-pick what you want from git-svn-1 and
git-svn-2. ALWAYS use --no-ff if using git merge
git merge --no-ff -s ours git-svn-1
git merge --no-ff -s ours git-svn-2
# now master has everything from both svn repositories
I guess I need

git checkout git-svn-1; git svn dcommit
git checkout git-svn-2; git svn dcommit

at this point to push the synchronization work to the svn repositories?
Post by Avery Pennarun
From now on:

# Update git-svn-1 with the latest master
git checkout git-svn-1
# since git-svn-1 is a remote branch, you now have a detached HEAD
git merge --no-ff master
git svn dcommit

# Update git-svn-2 with the latest master
git checkout git-svn-2
# since git-svn-2 is a remote branch, you now have a *different* detached HEAD
git merge --no-ff master
git svn dcommit

# Update master with the latest svn
git checkout master
# HEAD is now attached to master
git merge --no-ff git-svn-1
git merge --no-ff git-svn-2
# no need for '-s ours' in the above merge, as no rebasing means no merge history was lost
Looks reasonable.
Post by Avery Pennarun
Post by Josef Wolf
At Versabanq, we're using git for a bunch of stuff including our
autobuilder (http://github.com/apenwarr/gitbuilder) and my own
Interesting project. One question: the README mentions that the
gitbuilder is capable of updating itself. But I cannot actually see
this functionality in the scripts. Is that just a typo or am I missing
something?

I guess you're reading the line that says, "Now that your gitbuilder
is working, you probably want to have it continue
to update itself automatically."
Exactly.
Post by Avery Pennarun
This is actually talking about
*running* itself automatically, as in "updating the build results to
the latest copy of your project." I can see how it's a very unclear
word to use there. Thanks for the feedback.
OK..
Post by Avery Pennarun
Post by Josef Wolf
As long as you "git config merge.summary true" (to make the merge
commit list all the commits it's merging)
How does this option influence the merge operation? Or is this meant
to provide additional information to the person who does the next merge?
Post by Avery Pennarun
When you *merge* (as opposed to rebase or cherry-pick) into an svn
branch, you only create a *single* svn commit that contains *all* the
changes. The above config setting just makes the merge commit contain
a list of all the commits it contains.
But git will not use this information in any way, AFAIK. So this information
is only for the person who will do the next merge?
Post by Avery Pennarun
Post by Josef Wolf
Now, your problem is a little more complex, because it sounds like
people are checking in two types of things on both sides: private
things and public things. So if you want *only* the private things,
I want both. The difference is that I (usually) want to pull the public
things unmodified, while I want to generalize/localize the private things.
So when merging the private part, I would not want to pick the specific
entries. But I still want to pick the _structure_ (possibly removing or
modifying the localized entries).

If you're going to be mangling things so thoroughly, then you might
just have to resort to cherry-picking everything one by one from one
branch to the other. It doesn't sound very fun, but if other people
are being so uncooperative by mixing public and private stuff in their
repositories, there's no way I can see to automate it anyhow.
The people are not uncooperative. It is just that there's no way to
completely separate the public and private content. For example, the
private part of my apache config looks like this (somewhat simplified):

&set_conf (apache => {
    servername => "my.host.org",
    davlock    => "/m/a/dav/lock",
    confdir    => "/m/a/etc/apache",
    httpdir    => "/m/b/httpd",
    docroot    => "/m/b/httpd/htdocs",
    svndir     => "/m/b/repos/svn",
    gitdir     => "/m/b/repos/git",
    vhosts => {
        "*:80" => {
            downloads => {
                "my-debian-repos" => ["/m/b/lib/my-debian",
            },
        },
        "*:443" => {
            docroot => "/m/b/httpd/htdocs",
            downloads => {
                "kdb"      => ["/m/b/lib/kdb/SRC", "Kdb download"],
                "kdbdemo"  => ["/m/b/lib/kdbdemo", "Kdbdemo download"],
                "pictures" => ["/m/b/Pictures", "Pictures"],
            },
            cgis => {
                "svn"  => ["/m/l/svn/cgi", "Svn repos administration"],
                "misc" => ["/m/a/cgi", "Misc cgi scripts"],
            },
            svn => { # FIXME: svn, repos
                "ab"    => ["Ab repository"],
                "misc"  => ["Subversion Repository"],
                "pmisc" => ["Private Subversion Repository"],
            },
            git => {
                "test" => ["Git test repos"],
            },
            revproxies => {
                "/test/"   => ["http://localhost:3000/railstest/"],
                "/foobar/" => ["http://foo.bar.org:8001/"],
            }
        },
    },
});

From this information, the public part knows how to generate the apache
config. Other people are not interested in which locations are defined
here, but they _are_ interested in what the layout of this (Perl)
structure looks like. So I need to strip my localizations and provide the
structure as a template for other people to fill in. Of course, this
leads to more conflicts in the future every time the layout is changed.
Post by Avery Pennarun
If you're using cherry-pick for everything, there's no reason to use
tricks like 'merge -s ours'. Just leave out the merging entirely and
don't pretend that what you're doing is merging; it isn't. (You still
don't need 'git svn rebase' for anything. Just checkout the branch
you want to change, cherry-pick stuff into it, and 'git svn dcommit'
if appropriate.)
But then I have to do the book-keeping (what was already picked in which
direction) by myself?
Post by Avery Pennarun
If the situation ever changes, you can always do one last 'merge -s
ours' and mark the histories as combined. Then future merges will
bring in any future changes.
OK
Avery Pennarun
2009-04-30 02:07:31 UTC
Permalink
So you're saying that from now on, *all* changes from *both* branches
need to be integrated in both directions?
 git diff first-svn        second-svn  # this should be the "private" diff
 git diff first-svn/trunk  first-svn   # what my cherry-picking has changed
                                       # (and waits for push) in first-svn
 git diff second-svn/trunk second-svn  # what my cherry-picking has changed
                                       # (and waits for push) in second-svn
show me _exactly_ what I want them to be. The manual synchronizations
which were done in the past are resolved now. But I can't find the way
how to put the result of this cherry-picking back into the svn repositories

Okay, I think perhaps you're missing something that took me a long
time to figure out about git-svn, but once I understood it, my life
went a lot more smoothly.

Basically, 'git svn fetch' updates just a *remote* branch. (Remote
branches are in .git/refs/remotes/* and you can't change them
yourself, because you can't attach your HEAD to them. If you try to
check them out, you get the latest revision in that branch, but your
HEAD becomes detached.)

I'm not actually sure which of the above branches you're referring to
is remote and which is local. Let's guess that first-svn/trunk is
remote, and first-svn is a local copy of it, onto which you
cherry-picked some extra patches.

Some people like to think of just making a copy of the git-svn remote
branch, doing stuff to it, and then doing git-svn dcommit and/or
rebase on that branch. This sounds good, but it makes a mess *if*
you're doing any kind of git merging (which many git-svn users never
do, but which it seems you'd like to do extensively). Remember: git
merge is 100% totally incompatible with rebasing.

What I'm suggesting is that you think of your local branch (first-svn
in this case, I guess) as *not* an svn branch at all. You never do
any git-svn operations directly on this branch. In fact, rename it to
master so you aren't tempted to get yourself into trouble.

Now, you've merged from first-svn/trunk and cherry-picked some extra
stuff onto this branch, right? Good. Now you want to *merge* this
branch into first-svn/trunk, producing just *one* new commit, and
dcommit that into svn.

git checkout first-svn/trunk
# detaches the HEAD
git merge master
# produces a merge commit on the detached HEAD
git svn dcommit
# produces a *different* commit object on the first-svn/trunk branch
# ...and moves HEAD to it.

The newly-produced commit tells git that first-svn/trunk is now
up-to-date with master. Note that the interim merge commit (produced
by 'git merge') is never shared with *anyone*, so it's perfectly okay
that we replace it with the next command.

What you probably thought you should do, given that the existing
git-svn documentation says to do it, is more like this:

# WRONG
git checkout first-svn
git cherry-pick some stuff
git merge [perhaps -s ours] second-svn/trunk
git svn dcommit

But the above will *change* every single commit you put on first-svn,
because dcommit needs to *regenerate* all the commits after putting
them into svn and getting them back again. This is essentially a
rebase, and disrupts any merges you might have made from this branch
to another one. Next time you merge, you'll get a zillion conflicts.
Ah! I thought I _have_ to "git svn rebase" before I dcommit, like I need
to "svn update" before I can do "svn commit".
This is true and yet not true. The reason I don't have to ever use
'git svn rebase' is that 'git svn fetch' updates my first-svn/trunk
branch, and then I quickly do a merge-then-dcommit on that branch. If
I was to do a 'git svn rebase' first, nothing would happen, because
svn doesn't change.

This is important, since 'git svn dcommit' actually *does* do a 'git
svn rebase' for you automatically, trying to be helpful.
In general, 'git svn rebase' should be avoided for all the same
reasons that 'git rebase' should be avoided. They're both great when
used carefully, but they shouldn't be your main day-to-day activity.
git-svn-rebase for getting commits from svn and dcommit for sending
commits to svn.
Yeah, they're trying to keep things simple, at the cost of preventing
you from doing anything complicated. I'm not smart enough to do both,
so I'm just making things complicated for you here ;)

As it happens, I wrote the git-svn chapter for the
very-nearly-available new O'Reilly book "Version Control with Git." I
gave the complicated solution there too. I'm eagerly awaiting the
giant flames from people who actually wrote git-svn (and its
documentation) and therefore are highly qualified to disagree with me.
  - 1 remote branch: git-svn-1
  - 1 remote branch: git-svn-2
  - 1 local branch: master
I will try this one. But this will take a while, since my
cherry-picking was done criss-cross. Thus, I need to "rebase"
the cherries now to get them onto a single branch. Is there
a simple way to do that or do I have to redo the cherry-picking from
scratch?
No no! Stop rebasing!

You have a branch that looks the way you want, right? That means
you're 99% of the way there. You just have to convince git that this
branch and the svn branch are related to each other in the way they
actually are.

To do that, all you need to do is make a single merge commit on your
svn remote branch that looks the way you want and merges from your
existing branch, then do a single 'git svn dcommit'. Here's one way
(assuming you want to make svn look like your new local branch):

git checkout my-local-branch
git merge -s ours svn-branch
git checkout svn-branch
git merge --no-ff my-local-branch
git svn dcommit

(If the occasionally-suggested '-s theirs' merge strategy existed, you
could just do the last three steps using git merge -s theirs.)
Post by Avery Pennarun
As long as you "git config merge.summary true" (to make the merge
commit list all the commits it's merging)
When you *merge* (as opposed to rebase or cherry-pick) into an svn
branch, you only create a *single* svn commit that contains *all* the
changes. The above config setting just makes the merge commit contain
a list of all the commits it contains.
But git will not use this information in any way, AFAIK. So this information
is only for the person who will do the next merge?
In fact, it *only* affects the svn log. Otherwise svn log ends up
with a useless commit that says "Merged from commit (giant hex
string)", and you can't actually do anything with the giant hex string
because svn doesn't know what it is.

If nobody looks at your svn changelog, it's irrelevant.
The people are not uncooperative. It is just that there's no way to
completely separate the public and private content.
There is, if you're willing to do it. The usual way is to have two
branches: public and private.

Whenever you make a change that you want to be public, you commit it
on the public branch, then merge (git merge or svn merge, it doesn't
matter) from public to private. If you want to make a private change,
you just commit it directly to private.

This way, you will always have the two sets of changes isolated, you
never have to cherry-pick anything, and "git diff public private" is
always a sensible thing to do.
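
The public/private arrangement described above can be tried out in a throwaway repository. This is only a sketch under assumptions: the branch names `public` and `private` come from the text, but the file names, their contents, and the temporary directory are made up for illustration, and plain git branches stand in for the svn side.

```shell
#!/bin/sh
# Sketch only: a scratch repository with hypothetical files app.conf and local.conf.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/public      # name the first branch 'public'
git config user.name "Example"
git config user.email "example@example.invalid"

# A public change is committed on 'public'.
echo "template" > app.conf
git add app.conf
git commit -q -m "public: add template"

# A private change is committed directly on 'private'.
git checkout -q -b private
echo "host=my.host.org" > local.conf
git add local.conf
git commit -q -m "private: local settings"

# A later public change goes on 'public' first, then is merged into 'private'.
git checkout -q public
echo "template v2" > app.conf
git commit -q -a -m "public: update template"
git checkout -q private
git merge -q --no-edit public

# Only the private material differs between the two branches.
private_only=$(git diff --name-only public private)
echo "$private_only"
```

Because public changes only ever flow one way, the final diff lists just the files that exist for private reasons.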

(In fact, when I do this, I often don't share the private branch with
anyone at all, which means it's safe to rebase. That means I can keep
a clean set of patches against the public branch, and sort and
rearrange or share them whenever I feel like it. This is useful in
some cases. Rebasing isn't *always* bad :))
If you're using cherry-pick for everything, there's no reason to use
tricks like 'merge -s ours'. Just leave out the merging entirely and
don't pretend that what you're doing is merging; it isn't. (You still
don't need 'git svn rebase' for anything. Just checkout the branch
you want to change, cherry-pick stuff into it, and 'git svn dcommit'
if appropriate.)
But then I have to do the book-keeping (what was already picked in which
direction) by myself?
On branch b, 'git merge x' will always merge all the changes from the
most recent merge of x into b (which might be a "-s ours" merge if you
want), up to the tip of x. So if you don't commit any *new* private
stuff to x, you can use merge. If you're intermixing the changes,
you'll need to use cherry-pick. git won't attempt to track the
cherry-picks for you (like eg. svnmerge will).

Have fun,

Avery
Josef Wolf
2009-04-30 22:28:09 UTC
Permalink
Post by Avery Pennarun
So you're saying that from now on, *all* changes from *both* branches
need to be integrated in both directions?
 git diff first-svn        second-svn  # this should be the "private" diff
 git diff first-svn/trunk  first-svn   # what my cherry-picking has changed
                                       # (and waits for push) in first-svn
 git diff second-svn/trunk second-svn  # what my cherry-picking has changed
                                       # (and waits for push) in second-svn
show me _exactly_ what I want them to be. The manual synchronizations
which were done in the past are resolved now. But I can't find the way
how to put the result of this cherry-picking back into the svn repositories

Okay, I think perhaps you're missing something that took me a long
time to figure out about git-svn, but once I understood it, my life
went a lot more smoothly.

Basically, 'git svn fetch' updates just a *remote* branch. (Remote
branches are in .git/refs/remotes/* and you can't change them
yourself, because you can't attach your HEAD to them. If you try to
check them out, you get the latest revision in that branch, but your
HEAD becomes detached.)

I'm not actually sure which of the above branches you're referring to
is remote and which is local. Let's guess that first-svn/trunk is
remote, and first-svn is a local copy of it, onto which you
cherry-picked some extra patches.
Yeah, your guess is correct:

first-svn/trunk   remote branch of the first svn repos
second-svn/trunk  remote branch of the second svn repos
first-svn         local branch where first-svn/trunk is to be tracked
second-svn        local branch where second-svn/trunk is to be tracked
Post by Avery Pennarun
What I'm suggesting is that you think of your local branch (first-svn
in this case, I guess) as *not* an svn branch at all. You never do
any git-svn operations directly on this branch. In fact, rename it to
master so you aren't tempted to get yourself into trouble.
I see. But currently, I have _two_ local branches and not only one, as
you suggested in your last post.
Post by Avery Pennarun
Now, you've merged from first-svn/trunk and cherry-picked some extra
stuff onto this branch, right? Good. Now you want to *merge* this
branch into first-svn/trunk, producing just *one* new commit, and
dcommit that into svn.

git checkout first-svn/trunk
# detaches the HEAD
git merge master
# produces a merge commit on the detached HEAD
git svn dcommit
# produces a *different* commit object on the first-svn/trunk branch
# ...and moves HEAD to it.

The newly-produced commit tells git that first-svn/trunk is now
up-to-date with master. Note that the interim merge commit (produced
by 'git merge') is never shared with *anyone*, so it's perfectly okay
that we replace it with the next command.
Finally, this seems to work. Luckily, I've done the cherry-picking
not directly, but instead wrote a perl-script with instructions which
commits to pick and how to resolve the conflicts for specific commits.
I use copies of the repositories for this, so I can re-play the scenario
until the result fits the expectations.

With your explanation, I finally arrived at the _first_ work flow
that is able to push the results of the cherry-picking back to the
svn repositories. Having the branches described above, it goes
like this:

# first, retrieve what is available
#
git svn fetch first-svn
git svn fetch second-svn

# cherry-pick from second-svn to first-svn
#
git checkout first-svn
git cherry-pick sha1-from-second-svn # repeat as needed
git checkout first-svn/trunk
git merge --no-ff first-svn
git diff first-svn/trunk first-svn >changes.diff
git svn dcommit

# now do the same the other way around
#
git checkout second-svn
git cherry-pick sha1-from-first-svn # repeat as needed
git checkout second-svn/trunk
git merge --no-ff second-svn
git diff second-svn/trunk second-svn >changes.diff
git svn dcommit

But I am still somewhat confused:

git log -1 first-svn/trunk

says "Merge branch first-svn into HEAD". But this does not reflect
what I've actually done: I've picked _from_ second-svn and committed
that _to_ first-svn.
Post by Avery Pennarun
What you probably thought you should do, given that the existing

# WRONG
git checkout first-svn
git cherry-pick some stuff
git merge [perhaps -s ours] second-svn/trunk
git svn dcommit
Almost... In addition, I was trying to "git svn rebase" before the
dcommit
Post by Avery Pennarun
But the above will *change* every single commit you put on first-svn,
because dcommit needs to *regenerate* all the commits after putting
them into svn and getting them back again. This is essentially a
rebase, and disrupts any merges you might have made from this branch
to another one. Next time you merge, you'll get a zillion conflicts.

Ah! I thought I _have_ to "git svn rebase" before I dcommit, like I need
to "svn update" before I can do "svn commit".

This is true and yet not true. The reason I don't have to ever use
'git svn rebase' is that 'git svn fetch' updates my first-svn/trunk
branch, and then I quickly do a merge-then-dcommit on that branch. If
I was to do a 'git svn rebase' first, nothing would happen, because
svn doesn't change.

This is important, since 'git svn dcommit' actually *does* do a 'git
svn rebase' for you automatically, trying to be helpful.
What would happen if somebody else creates a new commit just after I
"git svn fetch" but before I dcommit? Guess, svn will not accept this
commit, because it is based on an outdated revision. How would I
get out from this situation?
Post by Avery Pennarun
In general, 'git svn rebase' should be avoided for all the same
reasons that 'git rebase' should be avoided. They're both great when
used carefully, but they shouldn't be your main day-to-day activity.
git-svn-rebase for getting commits from svn and dcommit for sending
commits to svn.

Yeah, they're trying to keep things simple, at the cost of preventing
you from doing anything complicated. I'm not smart enough to do both,
so I'm just making things complicated for you here ;)
At least I could finally submit the cherries to the svn repositories.
That's a big step forward, although I still don't fully understand all
the details.
Post by Avery Pennarun
As it happens, I wrote the git-svn chapter for the
very-nearly-available new O'Reilly book "Version Control with Git." I
gave the complicated solution there too.
Interesting. Do you have any information when it will be available?
Post by Avery Pennarun
I'm eagerly awaiting the
giant flames from people who actually wrote git-svn (and its
documentation) and therefore are highly qualified to disagree with me.

  - 1 remote branch: git-svn-1
  - 1 remote branch: git-svn-2
  - 1 local branch: master
I will try this one. But this will take a while, since my
cherry-picking was done criss-cross. Thus, I need to "rebase"
the cherries now to get them onto a single branch. Is there
a simple way to do that or do I have to redo the cherry-picking from
scratch?

No no! Stop rebasing!

You have a branch that looks the way you want, right?
Ummm, no.. I have _two_ branches:

first-svn:  contains the cherries that I picked from second-svn. This
            branch looks the way first-svn/trunk should be
second-svn: contains the cherries that I picked from first-svn. This
            looks the way second-svn/trunk should be

Don't I need to rebase at least one of them if I want to "merge" those
two branches into a single one?

I have a hard time adapting my mental model to the one-branch method for
some reason. OTOH, I can easily understand the multiple-branch method:
for every remote branch, I have a local branch on which I collect the
commits that should go to this remote.
Post by Avery Pennarun
That means
you're 99% of the way there. You just have to convince git that this
branch and the svn branch are related to each other in the way they
actually are.

To do that, all you need to do is make a single merge commit on your
svn remote branch that looks the way you want and merges from your
existing branch, then do a single 'git svn dcommit'. Here's one way

git checkout my-local-branch
git merge -s ours svn-branch
git checkout svn-branch
git merge --no-ff my-local-branch
git svn dcommit

(If the occasionally-suggested '-s theirs' merge strategy existed, you
could just do the last three steps using git merge -s theirs.)

As long as you "git config merge.summary true" (to make the merge
commit list all the commits it's merging)
When you *merge* (as opposed to rebase or cherry-pick) into an svn
branch, you only create a *single* svn commit that contains *all* the
changes. The above config setting just makes the merge commit contain
a list of all the commits it contains.
But git will not use this information in any way, AFAIK. So this information
is only for the person who will do the next merge?

In fact, it *only* affects the svn log. Otherwise svn log ends up
with a useless commit that says "Merged from commit (giant hex
string)", and you can't actually do anything with the giant hex string
because svn doesn't know what it is.
OK. I just noticed the list is limited to 22 entries. Can this be
configured somehow to contain the complete list?
Post by Avery Pennarun
The people are not uncooperative. It is just that there's no way to
completely separate the public and private content.

There is, if you're willing to do it. The usual way is to have two
branches: public and private.
Well, my plan was to have one (generic) public repository that contains
templates instead of the localized information. Separating the
repositories is a security measure here.

Whether separate repositories or only different branches, conflicts
_are_ to be expected in this area.
Post by Avery Pennarun
Whenever you make a change that you want to be public, you commit it
on the public branch, then merge (git merge or svn merge, it doesn't
matter) from public to private. If you want to make a private change,
you just commit it directly to private.

This way, you will always have the two sets of changes isolated, you
never have to cherry-pick anything, and "git diff public private" is
always a sensible thing to do.
Yeah, I see. OTOH, I'd rather avoid doing two steps in one go.
Currently, I have a hard time to get _one_ branch in sync. I'll come
back to the multiple-branch-thing as soon as I have mastered the
one-branch-thing ;-)
Post by Avery Pennarun
(In fact, when I do this, I often don't share the private branch with
anyone at all, which means it's safe to rebase. That means I can keep
a clean set of patches against the public branch, and sort and
rearrange or share them whenever I feel like it. This is useful in
some cases. Rebasing isn't *always* bad :))

If you're using cherry-pick for everything, there's no reason to use
tricks like 'merge -s ours'. Just leave out the merging entirely and
don't pretend that what you're doing is merging; it isn't. (You still
don't need 'git svn rebase' for anything. Just checkout the branch
you want to change, cherry-pick stuff into it, and 'git svn dcommit'
if appropriate.)
But then I have to do the book-keeping (what was already picked in which
direction) by myself?

On branch b, 'git merge x' will always merge all the changes from the
most recent merge of x into b (which might be a "-s ours" merge if you
want), up to the tip of x. So if you don't commit any *new* private
stuff to x, you can use merge. If you're intermixing the changes,
you'll need to use cherry-pick. git won't attempt to track the
cherry-picks for you (like eg. svnmerge will).
Avery Pennarun
2009-04-30 22:59:50 UTC
Permalink
 # cherry-pick from second-svn to first-svn
 #
 git checkout first-svn
 git cherry-pick sha1-from-second-svn # repeat as needed
 git checkout first-svn/trunk
 git merge --no-ff first-svn
 git diff first-svn/trunk first-svn >changes.diff
 git svn dcommit
[...]
 git log -1 first-svn/trunk
says "Merge branch first-svn into HEAD". But this does not reflect
what I've actually done: I've picked _from_ second-svn and committed
that _to_ first-svn.
The most recent commit to first-svn/trunk was "git merge --no-ff
first-svn", which creates the merge commit you're seeing here. (HEAD
== first-svn/trunk). So this sounds right to me.

"git log -1 first-svn" would give you the most recent cherry-pick. But
remember, it's a completely different branch.
Post by Avery Pennarun
What you probably thought you should do, given that the existing
   # WRONG
   git checkout first-svn
   git cherry-pick some stuff
   git merge [perhaps -s ours] second-svn/trunk
   git svn dcommit
Almost... In addition, I was trying to "git svn rebase" before the
dcommit
'git svn dcommit' implies 'git svn rebase' first anyway, so it's the same.
What would happen if somebody else creates a new commit just after I
"git svn fetch" but before I dcommit? Guess, svn will not accept this
commit, because it is based on an outdated revision. How would I
get out from this situation?
AFAIK, it will attempt to do "git svn rebase" first, and if that
succeeds, it will do the commit.

In such a case, the rebase should be okay, because it's only changing
commits (in fact, just one commit: the merge commit) that don't exist
on any other branch. Thus it won't mangle any other merges.
Post by Avery Pennarun
As it happens, I wrote the git-svn chapter for the
very-nearly-available new O'Reilly book "Version Control with Git." I
gave the complicated solution there too.
Interesting. Do you have any information when it will be available?

Mid-May, as I understand it.
 first-svn:  contains the cherries that I picked from second-svn. This
             branch looks the way first-svn/trunk should be
 second-svn: contains the cherries that I picked from first-svn. This
             looks the way second-svn/trunk should be
Okay, if you want to end up with two different remote branches, it
makes sense to have two different local branches.
Don't I need to rebase at least one of them if I want to "merge" those
two branches into a single one?
I don't think so. If you merge them together, what do you *want* it
to look like? And what do you want to do with that branch afterwards?
It's hard for me to guess, but it seems unlikely that rebasing things
will get you there.

If what you want is "one central branch that currently looks like
first-svn/trunk or second-svn/trunk or maybe something else, but we'll
be merging future changes to first-svn and second-svn into it in the
future", then you would do:

git checkout -b one-true-branch
...make it look however you want...
# now mark it as up-to-date with svn, but don't change anything
git merge -s ours first-svn/trunk
git merge -s ours second-svn/trunk

And then in the future, whenever first-svn/trunk or second-svn/trunk
change, you would do:

git merge first-svn/trunk
git merge second-svn/trunk

etc.
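
What '-s ours' does in the recipe above can be checked in a scratch repository: it records the other branch as merged without touching the files. The following is only a sketch under assumptions. There is no svn involved, the local branch `trunk-standin` is a hypothetical stand-in for first-svn/trunk, and the file name and contents are made up.

```shell
#!/bin/sh
# Sketch only: 'trunk-standin' plays the role of first-svn/trunk.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk-standin
git config user.name "Example"
git config user.email "example@example.invalid"

echo base > f
git add f
git commit -q -m "base"

# Make the central branch look however you want.
git checkout -q -b one-true-branch
echo local > f
git commit -q -a -m "make it look however you want"

# Meanwhile the stand-in for the svn branch has moved on.
git checkout -q trunk-standin
echo upstream > f
git commit -q -a -m "svn-side change"

# Mark the central branch as up-to-date without changing its content.
git checkout -q one-true-branch
before=$(git rev-parse 'HEAD^{tree}')
git merge -q -s ours --no-edit trunk-standin
after=$(git rev-parse 'HEAD^{tree}')

echo "tree before: $before"
echo "tree after:  $after"
```

The tree is identical before and after, yet the stand-in branch is now an ancestor of one-true-branch, so a later plain 'git merge' only brings in changes made after this point.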
I have a hard time adapting my mental model to the one-branch method for
some reason. OTOH, I can easily understand the multiple-branch method:
for every remote branch, I have a local branch on which I collect the
commits that should go to this remote.
It's indeed pretty complicated. I had much better luck once I finally
separated the two concepts in my mind. In general, you want to name
local branches after what they *do*; you almost never have a 1:1
mapping between local and remote branches, particularly when using
git-svn (at least, when using git-svn with merges instead of rebases).

Maybe think of it like this: you're not "collecting commits." You're
simply merging changes from one place to another, and testing them
out. Once you're happy with what's on your local branch, you want to
merge it (--no-ff) into the remote branch and git svn dcommit it. svn
will never see the individual commits; it only sees the merge commits.
Post by Avery Pennarun
As long as you "git config merge.summary true" (to make the merge
commit list all the commits it's merging)
[...]
Post by Avery Pennarun
In fact, it *only* affects the svn log. Otherwise svn log ends up
with a useless commit that says "Merged from commit (giant hex
string)", and you can't actually do anything with the giant hex string
because svn doesn't know what it is.
OK. I just noticed the list is limited to 22 entries. Can this be
configured somehow to contain the complete list?
Ah, that. I don't think there's an obvious way. I forgot I did it
until now, but my copy of git is patched to "fix" this. My patch is
attached below.
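
A side note for readers on a newer Git than the one discussed in this thread: as far as I know, later releases let the cap be raised without patching, either per merge with 'git merge --log=<n>' or by giving 'merge.log' a number instead of 'true'. The sketch below assumes such a release; the scratch repository, the branch name `topic`, and the commit subjects are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes a Git release where 'git merge --log=<n>' is available.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.invalid"

git commit -q --allow-empty -m "base"

# 25 commits on a side branch, more than the default limit of 20.
git checkout -q -b topic
i=1
while [ "$i" -le 25 ]; do
    git commit -q --allow-empty -m "topic change $i"
    i=$((i + 1))
done

# Merge with a generous cap so every subject line is listed.
git checkout -q master
git merge -q --no-ff --log=100 --no-edit topic

# Count how many of the merged subjects made it into the merge message.
listed=$(git log -1 --format=%B | grep -c '^  topic change ')
echo "$listed"
```

Setting 'git config merge.log 100' should have the same effect for every merge in the repository.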
Post by Avery Pennarun
The people are not uncooperative. It is just that there's no way to
completely separate the public and private content.
There is, if you're willing to do it. The usual way is to have two
branches: public and private.
Well, my plan was to have one (generic) public repository that contains
templates instead of the localized information. Separating the
repositories is a security measure here.
Whether separate repositories or only different branches, conflicts
_are_ to be expected in this area.
Conflicts are normal. But you can simply resolve them, using normal
git mechanisms, when merging from public-svn/trunk to
second-svn/trunk, or whatever. At least you shouldn't have to
cherry-pick or rebase anything.

Have fun,

Avery


diff --git a/builtin-fmt-merge-msg.c b/builtin-fmt-merge-msg.c
index df18f40..96c42ff 100644
--- a/builtin-fmt-merge-msg.c
+++ b/builtin-fmt-merge-msg.c
@@ -255,7 +255,7 @@ static void shortlog(const char *name, unsigned char *sha1,
 }
 
 int fmt_merge_msg(int merge_summary, struct strbuf *in, struct strbuf *out) {
-	int limit = 20, i = 0, pos = 0;
+	int limit = 2000, i = 0, pos = 0;
 	char line[1024];
 	char *p = line, *sep = "";
 	unsigned char head_sha1[20];
Josef Wolf
2009-05-01 14:28:11 UTC
Permalink
Post by Avery Pennarun
 # cherry-pick from second-svn to first-svn
 #
 git checkout first-svn
 git cherry-pick sha1-from-second-svn # repeat as needed
 git checkout first-svn/trunk
 git merge --no-ff first-svn
 git diff first-svn/trunk first-svn >changes.diff
 git svn dcommit
[...]
 git log -1 first-svn/trunk
says "Merge branch first-svn into HEAD". But this does not reflect
what I've actually done: I've picked _from_ second-svn and committed
that _to_ first-svn.

The most recent commit to first-svn/trunk was "git merge --no-ff
first-svn", which creates the merge commit you're seeing here. (HEAD
== first-svn/trunk). So this sounds right to me.

"git log -1 first-svn" would give you the most recent cherry-pick. But
remember, it's a completely different branch.
I can see why this happens, but I still find it confusing. Maybe I
should help with the -m option?
Post by Avery Pennarun
What you probably thought you should do, given that the existing
   # WRONG
   git checkout first-svn
   git cherry-pick some stuff
   git merge [perhaps -s ours] second-svn/trunk
   git svn dcommit
Almost... In addition, I was trying to "git svn rebase" before the
dcommit

'git svn dcommit' implies 'git svn rebase' first anyway, so it's the same.

What would happen if somebody else creates a new commit just after I
"git svn fetch" but before I dcommit? Guess, svn will not accept this
commit, because it is based on an outdated revision. How would I
get out from this situation?

AFAIK, it will attempt to do "git svn rebase" first, and if that
succeeds, it will do the commit.

In such a case, the rebase should be okay, because it's only changing
commits (in fact, just one commit: the merge commit) that don't exist
on any other branch. Thus it won't mangle any other merges.
Yeah, that's the simple case. But what if the rebase doesn't succeed?
Post by Avery Pennarun
 first-svn:  contains the cherries that I picked from second-svn. This
             branch looks the way first-svn/trunk should be
 second-svn: contains the cherries that I picked from first-svn. This
             looks the way second-svn/trunk should be

Okay, if you want to end up with two different remote branches, it
makes sense to have two different local branches.
Well, I _have_ two different remotes because I have two svn repositories.
Post by Avery Pennarun
Don't I need to rebase at least one of them if I want to "merge" those
two branches into a single one?

I don't think so. If you merge them together, what do you *want* it
to look like? And what do you want to do with that branch afterwards?
It's hard for me to guess, but it seems unlikely that rebasing things
will get you there.
Then I have to keep both local branches. But I still wonder why you
suggested to go with _one_ local branch.
Post by Avery Pennarun
If what you want is "one central branch that currently looks like
first-svn/trunk or second-svn/trunk or maybe something else, but we'll
be merging future changes to first-svn and second-svn into it in the

git checkout -b one-true-branch
...make it look however you want...
# now mark it as up-to-date with svn, but don't change anything
git merge -s ours first-svn/trunk
git merge -s ours second-svn/trunk

And then in the future, whenever first-svn/trunk or second-svn/trunk

git merge first-svn/trunk
git merge second-svn/trunk

etc.
I guess that might be interesting for the generic branch, where all the
localizations are replaced by templates.
Avery Pennarun
2009-05-01 19:17:14 UTC
Permalink
"git log -1 first-svn" would give you the first cherry-pick. But
remember, it's a completely different branch.
I can see why this happens, but I still find it confusing. Maybe I
should help with the -m option?
I don't know what -m does. Maybe try looking at the graph with gitk;
that might give some clues.
AFAIK, it will attempt to do "git svn rebase" first, and if that
succeeds, it will do the commit.
In such a case, the rebase should be okay, because it's only changing
commits (in fact, just one commit: the merge commit) that don't exist
on any other branch. Thus it won't mangle any other merges.
Yeah, that's the simple case. But what if the rebase doesn't succeed?

Then you'll get a conflict, and you'll have to fix it first before you
can dcommit.
Okay, if you want to end up with two different remote branches, it
makes sense to have two different local branches.
Well, I _have_ two different remotes because I have two svn repositories.

Right. I was just wondering whether you wanted the two branches'
contents to be *different* or identical. I guess different.
Then I have to keep both local branches. But I still wonder why you
suggested to go with _one_ local branch.
For my own purposes, I try not to create a 1:1 mapping between local
branches and remote branches; this just ends up being confusing,
because I can have commits in my local branch that aren't in the
remote one, and vice versa. So it's not very useful to create a local
branch *just* because I have a corresponding remote branch.

In your case, you might want to have just a single local branch for
your "public" stuff. You would then merge changes from the two svn
remote branches into your local branch, and you'd also merge from your
local branch into your remote branches (using a detached HEAD and
svn dcommit).

Have fun,

Avery
Josef Wolf
2009-05-02 21:58:52 UTC
Permalink
Post by Avery Pennarun
"git log -1 first-svn" would give you the first cherry-pick. But
remember, it's a completely different branch.
I can see why this happens, but I still find it confusing. Maybe I
should help with the -m option?

I don't know what -m does. Maybe try looking at the graph with gitk;
that might give some clues.
Option -m lets me set the log message explicitly :)
Post by Avery Pennarun
Okay, if you want to end up with two different remote branches, it
makes sense to have two different local branches.
Well, I _have_ two different remotes because I have two svn repositories.

Right. I was just wondering whether you wanted the two branches'
contents to be *different* or identical. I guess different.
They have to stay different, because they are localized.
Post by Avery Pennarun
Then I have to keep both local branches. But I still wonder why you
suggested to go with _one_ local branch.

For my own purposes, I try not to create a 1:1 mapping between local
branches and remote branches; this just ends up being confusing,
because I can have commits in my local branch that aren't in the
remote one, and vice versa. So it's not very useful to create a local
branch *just* because I have a corresponding remote branch.

In your case, you might want to have just a single local branch for
your "public" stuff. You would then merge changes from the two svn
remote branches into your local branch, and you'd also merge from your
local branch into your remote branches (using a detached HEAD and
svn dcommit).
But I am working not only on the "public" stuff. Additionally, I am
working on _multiple_ localized variants. Thus, I have multiple remote
repositories.

Somehow, I still can't get it to work. This is what I do:

# create the repos
#
git svn init --stdlayout file:///var/tmp/builds/git-sync/svn/svn-1
git config merge.stat true

# add configuration for svn-1 repos
#
git config svn-remote.svn-1.url file:///var/tmp/builds/git-sync/svn/svn-1
git config svn-remote.svn-1.fetch trunk:refs/remotes/svn-1/trunk
git config svn-remote.svn-1.branches branches/*:refs/remotes/svn-1/*
git config svn-remote.svn-1.tags tags/*:refs/remotes/svn-1/tags/*

# add configuration for svn-2 repos
#
git config svn-remote.svn-2.url file:///var/tmp/builds/git-sync/svn/svn-2
git config svn-remote.svn-2.fetch trunk:refs/remotes/svn-2/trunk
git config svn-remote.svn-2.branches branches/*:refs/remotes/svn-2/*
git config svn-remote.svn-2.tags tags/*:refs/remotes/svn-2/tags/*

# fetch the commits from svn repositories
#
git svn fetch -R svn-1
git svn fetch -R svn-2

# create local tracking branches
#
git checkout -b svn-1 svn-1/trunk
git checkout -b svn-2 svn-2/trunk

# just to see what we've done
#
git tag svn-1-orig svn-1
git tag svn-2-orig svn-2

# move stuff from svn-2 to svn-1
#
git svn fetch svn-2
git checkout svn-1
git cherry-pick 05b964
[ continue cherry-picking ]
git merge --no-ff -s ours svn-1

# check what I have done
#
git diff svn-1-orig svn-1/trunk # shows what I expect

# move the result to svn-1
#
git checkout svn-1/trunk
git merge --no-ff svn-1
git svn dcommit

# move stuff from svn-1 to svn-2
#
git svn fetch svn-1
git checkout svn-2
git cherry-pick -n c9dae
[ continue cherry-picking ]
git merge --no-ff -s ours svn-2

# check what I have done
#
git diff svn-2-orig svn-2/trunk # shows what I expect

# move the result to svn-2
#
git checkout svn-2/trunk
git merge --no-ff svn-2
git svn dcommit

At this point, we should be synchronized.

git checkout svn-2/trunk
git svn fetch svn-1
git merge --no-ff svn-1

BOOM. Although no new commits were fetched, we get a lot of conflicts
here. So git is not fully aware of the fact that we are synchronized.
Avery Pennarun
2009-05-04 15:58:20 UTC
Permalink
[...]
 # move stuff from svn-2 to svn-1
 #
 git svn fetch svn-2
 git checkout svn-1
 git cherry-pick 05b964
 [ continue cherry-picking ]
 git merge --no-ff -s ours svn-1
Note that you probably should be merging '-s ours svn-2' here, not
svn-1. svn-1 already contains svn-1 (of course) so that merge didn't
do anything. It most especially doesn't mark svn-1 as being
up-to-date with svn-2, and that's probably going to make trouble
later.
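
Editorial note: the difference between the two merges can be tried out with plain git, without git-svn. The sketch below is a simplification under stated assumptions: the two branches share a root commit (the real svn imports do not), and the repository, file name and commit messages are hypothetical. It only shows that "-s ours" with the branch itself records nothing, while "-s ours" with the other branch records it as merged and leaves the tree untouched.

```shell
# Sketch: 'merge -s ours' with the wrong vs. the right branch (plain git only).
# Branch names mirror the thread; everything else is made up for illustration.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/svn-1
echo one >file.txt
git add file.txt
git commit -q -m 'svn-1 history'

git checkout -q -b svn-2
echo two >file.txt
git commit -q -am 'svn-2 history'

git checkout -q svn-1
echo one-b >>file.txt
git commit -q -am 'more svn-1 history'

# Merging svn-1 into itself: git reports "already up to date", no commit is made.
before=$(git rev-parse HEAD)
git merge -q --no-ff -s ours --no-edit svn-1
after_self=$(git rev-parse HEAD)

# Merging svn-2 with -s ours: a merge commit is made, the tree stays as it was.
tree_before=$(git rev-parse 'HEAD^{tree}')
git merge -q --no-ff -s ours --no-edit svn-2
tree_after=$(git rev-parse 'HEAD^{tree}')

if git merge-base --is-ancestor svn-2 HEAD; then
    svn2_merged=yes
else
    svn2_merged=no
fi
echo "svn-2 recorded as merged into svn-1: $svn2_merged"
```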
 # check what I have done
 #
 git diff svn-1-orig svn-1/trunk # shows what I expect
This is unsurprising, since you haven't changed either branch during the above.
 # move the result to svn-1
 #
 git checkout svn-1/trunk
 git merge --no-ff svn-1
 git svn dcommit
This looks ok.
 # move stuff from svn-1 to svn-2
 #
 git svn fetch svn-1
 git checkout svn-2
 git cherry-pick -n c9dae
 [ continue cherry-picking ]
 git merge --no-ff -s ours svn-2
Again, you seem to have merged in the wrong branch here. This one
should be svn-1.
 # check what I have done
 #
 git diff svn-2-orig svn-2/trunk # shows what I expect
Again, these branches haven't changed, so no surprise here either.
 # move the result to svn-2
 #
 git checkout svn-2/trunk
 git merge --no-ff svn-2
 git svn dcommit
This seems ok.
At this point, we should be synchronized.
Yes, although there are no merges between svn-1 and svn-2, so the next
attempt at merging will merge *everything*, causing conflicts.
 git checkout svn-2/trunk
 git svn fetch svn-1
 git merge --no-ff svn-1
BOOM. Although no new commits were fetched, we get a lot of conflicts
here. So git is not fully aware of the fact that we are synchronized.

You seem to almost have it. Fix the -s ours merges above and I think
you'll be in business.

Have fun,

Avery
Josef Wolf
2009-05-04 21:14:23 UTC
Permalink
Post by Avery Pennarun
[...]
 # move stuff from svn-2 to svn-1
 #
 git svn fetch svn-2
 git checkout svn-1
 git cherry-pick 05b964
 [ continue cherry-picking ]
 git merge --no-ff -s ours svn-1

Note that you probably should be merging '-s ours svn-2' here, not
svn-1. svn-1 already contains svn-1 (of course) so that merge didn't
do anything. It most especially doesn't mark svn-1 as being
up-to-date with svn-2, and that's probably going to make trouble
later.
Yeah, you're right. That was a typo.

Fixing this, the "getting started" seems to work now: the cherry-picked
commits end up in the svn repositories. But the synchronization after
the "getting started" does not seem to work yet. Here's what I've done:

git tag svn-1-orig svn-1
git tag svn-2-orig svn-2

# move cherries to svn-2
#
git svn fetch svn-1
git checkout svn-2
git cherry-pick c9da
[ ... ]
git merge --no-ff -s ours svn-1
git checkout svn-2/trunk
git merge --no-ff svn-2
git svn dcommit
git diff svn-2-orig svn-2/trunk # check what I've done

# move cherries to svn-1
#
git svn fetch svn-2
git checkout svn-1
git cherry-pick 05b9
[ ... ]
git merge --no-ff -s ours svn-2
git checkout svn-1/trunk
git merge --no-ff svn-1
git svn dcommit
git diff svn-1-orig svn-1/trunk # check what I've done

git diff svn-1/trunk svn-2/trunk # shows the diffs I want to keep

# now try a synchronization
#
git checkout svn-2/trunk
git svn fetch svn-1 # nothing new was checked in yet
git merge --no-ff svn-1
git svn dcommit

Since no new commits were made in svn, those four commands should be
almost a no-op.

But instead of merging only the changes that were done after the last
synchronization, the last dcommit makes svn-2/trunk identical to svn-1.
This effectively wipes all the differences which I would like to keep.
Josef Wolf
2009-05-06 18:52:24 UTC
Permalink
On Mon, May 04, 2009 at 11:14:23PM +0200, Josef Wolf wrote:

I am still trying to understand what is going on here, so I tried to
draw the history-graph. It turns out that after the "getting-started"
Post by Josef Wolf
git tag svn-1-orig svn-1
git tag svn-2-orig svn-2
# move cherries to svn-2
#
git svn fetch svn-1
git checkout svn-2
git cherry-pick c9da
[ ... ]
git merge --no-ff -s ours svn-1
git checkout svn-2/trunk
git merge --no-ff svn-2
git svn dcommit
# move cherries to svn-1
#
git svn fetch svn-2
git checkout svn-1
git cherry-pick 05b9
[ ... ]
git merge --no-ff -s ours svn-2
git checkout svn-1/trunk
git merge --no-ff svn-1
git svn dcommit
Here's what I have at this point:

            ------------------S1TRUNK
           /                 /
  --hs1--O1--c2...c2-------S1
           \              /
            `+++++++.    /
                     \  /
  --hs2--O2--c1...c1--S2
           \            \
            -------------S2TRUNK

  hs1, hs2:         history imported from svn-1 and svn-2, respectively
  O1, O2:           the svn-1-orig and svn-2-orig tags
  c1, c2:           cherries picked from hs1 and hs2, respectively
  S1, S2:           svn-1 and svn-2, the local tracking branches
  S1TRUNK, S2TRUNK: the remotes/svn-X/trunk branches

I would have expected a symmetrical diagram. But it turns out that the
connection marked with plusses is still at O1 instead of S1. So it is
no wonder that the c2 cherries get re-applied to the s2 branch on the
next merge.

Is this understanding somewhat plausible? Any ideas how to get
this fixed?
Avery Pennarun
2009-05-06 19:23:40 UTC
Permalink
            ------------------S1TRUNK
           /                 /
  --hs1--O1--c2...c2-------S1
           \              /
            `+++++++.    /
                     \  /
  --hs2--O2--c1...c1--S2
           \            \
            -------------S2TRUNK
  hs1, hs2:         history imported from svn-1 and svn-2, respectively
  O1, O2:           the svn-1-orig and svn-2-orig tags
  c1, c2:           cherries picked from hs1 and hs2, respectively
  S1, S2:           svn-1 and svn-2, the local tracking branches
  S1TRUNK, S2TRUNK: the remotes/svn-X/trunk branches
I would have expected a symmetrical diagram. But it turns out that the
connection marked with plusses is still at O1 instead of S1. So it is
no wonder that the c2 cherries get re-applied to the s2 branch on the
next merge.
That's a well-drawn diagram, but unfortunately I'm still confused.
What is the "connection marked with plusses" and does it have a name?
It *looks* to me like both S1TRUNK and S2TRUNK should be okay, but
it's hard to tell what has actually happened here.

If you could post a screenshot of 'gitk --all' it might help.

Avery
Josef Wolf
2009-05-06 22:50:42 UTC
Permalink
Post by Avery Pennarun
            ------------------S1TRUNK
           /                 /
  --hs1--O1--c2...c2-------S1
           \              /
            `+++++++.    /
                     \  /
  --hs2--O2--c1...c1--S2
           \            \
            -------------S2TRUNK
  hs1, hs2:         history imported from svn-1 and svn-2, respectively
  O1, O2:           the svn-1-orig and svn-2-orig tags
  c1, c2:           cherries picked from hs1 and hs2, respectively
  S1, S2:           svn-1 and svn-2, the local tracking branches
  S1TRUNK, S2TRUNK: the remotes/svn-X/trunk branches
I would have expected a symmetrical diagram. But it turns out that the
connection marked with plusses is still at O1 instead of S1. So it is
no wonder that the c2 cherries get re-applied to the s2 branch on the
next merge.

That's a well-drawn diagram, but unfortunately I'm still confused.
What is the "connection marked with plusses" and does it have a name?
Well, the whole history (including this connection) was created by the
commands I posted. The only exception are hs1 and hs2, which were
imported from the svn repositories (but they are linear).

AFAICS, this connection was created by the very first merge after the
cherry-picking from hs1 (the cherries marked as c1..c1). After importing
the svn repositories, I've done this:

git checkout svn-2               # S2 in the diagram above, but used
                                 # to be identical to O2 at that time
[ cherry-picking c1...c1 ]       # S2 now moved to its place in diagram
git merge --no-ff -s ours svn-1  # S1 in the diagram above, but used
                                 # to be identical to O1 at the
                                 # time of that merge. This merge
                                 # creates the mystical
                                 # "connection marked with plusses"

Then we've done the "detached head" merge, that created the S2TRUNK

git checkout svn-2/trunk  # S2TRUNK was at O2 at that time
git merge --no-ff svn-2
git svn dcommit           # moves S2TRUNK to the place in the diagram

Now take care of the other direction:

git checkout svn-1               # S1 in the diagram above, but still
                                 # identical to O1 at that time
[ cherry-picking c2...c2 ]       # S1 now moved to its place in diagram
git merge --no-ff -s ours svn-2  # S2 in the diagram above. Unlike S1,
                                 # S2 already _is_ at the place where
                                 # it is drawn in the diagram. So
                                 # this merge creates the connection
                                 # S2->S1

Now we do the "detached head" merge, that creates the S1TRUNK

git checkout svn-1/trunk  # S1TRUNK was at O1 at that time
git merge --no-ff svn-1
git svn dcommit           # moves S1TRUNK to the place in the diagram

So the "connection marked with plusses" is basically the counterpart of
the "S2->S1" connection. But while "S2->S1" got its proper position at
the time it was created, the plus-connection was created before the c2
cherries. And it was never adjusted. AFAICS, those two connections
should be symmetrical: "S1->S2" and "S2->S1".
Post by Avery Pennarun
It *looks* to me like both S1TRUNK and S2TRUNK should be okay, but
it's hard to tell what has actually happened here.
Yes, the trunks (and the svn repositories) look pretty good at _that_
point in time. But the next merge on S2TRUNK moves all the modifications
done by the c2 cherries down to S2TRUNK.
Post by Avery Pennarun
If you could post a screenshot of 'gitk --all' it might help.
IMHO, screenshots are not of much help here. That's why I posted

http://www.spinics.net/lists/git/msg102609.html

The svn histories are about 1250 commits each. The cherry-pickings are
about 350 commits each. This gives histories running in parallel for
long distances. Add to this gitk's tendency to change lanes at every
occasion: There's no chance to get multiple screen shots (the interesting
branch/merge-points, as I described in the thread referenced above) in
sync. There are many opportunities to get confused. At least for me, as a
newbie to git.
Avery Pennarun
2009-05-08 20:44:39 UTC
Permalink
The svn histories are about 1250 commits each. The cherry-pickings are
about 350 commits each. This gives histories running in parallel for
long distances. Add to this gitk's tendency to change lanes at every
occasion: There's no chance to get multiple screen shots (the interesting
branch/merge-points, as I described in the thread referenced above) in
sync. There are many opportunities to get confused. At least for me, as a
newbie to git.
I agree that gitk's lane-changing can be a bit confusing. Could you
try making a slightly modified version of your script, where you only
cherry-pick one or two commits in each direction? That should be
functionally identical, but a much simpler diagram.

Avery
Josef Wolf
2009-05-08 23:58:22 UTC
Permalink
Post by Avery Pennarun
I agree that gitk's lane-changing can be a bit confusing. Could you
try making a slightly modified version of your script, where you only
cherry-pick one or two commits in each direction? That should be
functionally identical, but a much simpler diagram.
Somehow I can't keep git-svn from fetching all the svn revisions, thus
it's still split over a long distance.

But in the meantime, I've hacked a quick-n-dirty script to show only
the 'interesting' commits with an optional context. This helped me a
lot to get a better understanding of what's going on. I have appended the
result at the end of this mail. Please convert it with "dot -Tps" to
postscript to view the result.

So here I go again with my attempt to analyze what happens. I attach
the sha1 of created commits as a comment to the command that creates the
commit.

# Create tags so we can see later what we have done
#
git tag svn-1-orig svn-1
git tag svn-2-orig svn-2

# move cherries from svn-1 to svn-2
#
git svn fetch svn-1
git checkout svn-2
[ cherry-picking, creates 67446..0a742 ]
git merge --no-ff -s ours svn-1 -m 'merge ours svn-1 to svn-2' # 5d9a0
git checkout svn-2/trunk
git merge --no-ff svn-2 -m 'merge svn-1 to svn-2' # f80d2
git svn dcommit

# check the results
#
git diff svn-2-orig svn-2/trunk
git diff svn-2-orig svn-2

# move cherries from svn-2 to svn-1
#
git svn fetch svn-2
git checkout svn-1
[ cherry-picking, creates a5cf3..c3ff2 ]
git merge --no-ff -s ours svn-2 -m 'merge ours svn-2 to svn-1' # 2379d
git checkout -q svn-1/trunk
git merge --no-ff svn-1 -m 'merge svn-2 to svn-1' # 693fa
git svn dcommit --no-rebase

# again, check the results
#
git diff svn-1-orig svn-1/trunk
git diff svn-1-orig svn-1

At this time, I made the graph attached below. Two things are
interesting in this graph:
- svn-1 has all the imported commits and all the cherries as ancestors;
svn-2 does _not_ have a5cf3..c3ff2 as ancestors
- same thing happens for remotes/svn-1/trunk vs. remotes/svn-2/trunk


Now, when I do

git checkout svn-2 # same thing happens when svn-2/trunk is used
git merge --no-ff svn-1

And here I'm completely baffled. The tree is set _identical_ to the tree
in 2379d. All the differences which should be kept are lost here.
I can easily see (although I don't know how to avoid it) why a5cf3..c3ff2
might be applied to svn-2 and svn-2/trunk though it should not be applied.

But I completely fail to see why the tree is set identical to 2379d.
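
Editorial note: one reading of the .dot graph posted with this message is that 5d9a0 (svn-2) is itself a parent of 2379d (svn-1). If so, svn-2 is an ancestor of svn-1, the merge base of the two is svn-2, and a merge of svn-1 into svn-2 can only take svn-1's tree as a whole. The sketch below reproduces that shape with plain git; it assumes a shared root commit (unlike the real svn imports), and the file names and messages are hypothetical.

```shell
# Sketch: when "ours" is an ancestor of "theirs", a merge yields their tree.
# Plain git only; branch names mirror the thread, the rest is illustrative.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/svn-1
echo shared >common.txt
git add common.txt
git commit -q -m 'shared root (a simplification)'

git checkout -q -b svn-2
echo 'site two' >local.txt
git add local.txt
git commit -q -m 'svn-2 localization'

git checkout -q svn-1
echo 'site one' >local.txt
git add local.txt
git commit -q -m 'svn-1 localization'

# First direction: svn-2 records svn-1 as merged and keeps its tree (like 5d9a0).
git checkout -q svn-2
git merge -q --no-ff -s ours --no-edit svn-1

# Second direction: svn-1 gains a cherry, then records svn-2 (like 2379d).
git checkout -q svn-1
echo cherry >cherry.txt
git add cherry.txt
git commit -q -m 'cherry picked from svn-2'
git merge -q --no-ff -s ours --no-edit svn-2

# svn-2 is now an ancestor of svn-1, so the merge base is svn-2 itself.
git checkout -q svn-2
base=$(git merge-base svn-2 svn-1)
svn2_tip=$(git rev-parse svn-2)

git merge -q --no-ff --no-edit svn-1
merged_tree=$(git rev-parse 'HEAD^{tree}')
svn1_tree=$(git rev-parse 'svn-1^{tree}')
local_now=$(cat local.txt)
echo "local.txt on svn-2 now reads: $local_now"
```

In this sketch the svn-2 localization is lost in the last merge, which matches the symptom described above.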


Here is an overview of the created commits:

# b8bf1, 8536f..09393, d0f29 imported from svn-1 repository
# 7b397, 17156..e0772, 05eb1 imported from svn-2 repository
# a5cf3..c3ff2 cherries from svn-2 to svn-1
# 67446..0a742 cherries from svn-1 to svn-2
# 5d9a0 merge ours svn-1 to svn-2
# 693fa merge svn-2 to svn-1
# 2379d merge ours svn-2 to svn-1
# f80d2 merge svn-1 to svn-2

And here's the .dot graph. Please pipe it through "dot -Tps" to create
a postscript file of the graph.

strict digraph G {
size = "7,10"
"8536f" [label="8536f"] ;
"b8bf1"->"8536f" ;
"693fa" [label="693fa\nremotes/svn-1/trunk"] ;
"d0f29"->"693fa" ;
"2379d"->"693fa" ;
"5d9a0" [label="5d9a0\nsvn-2"] ;
"d0f29"->"5d9a0" ;
"0a742"->"5d9a0" ;
"f80d2" [label="f80d2\nremotes/svn-2/trunk"] ;
"05eb1"->"f80d2" ;
"5d9a0"->"f80d2" ;
"b8bf1" [label="b8bf1"] ;
"09393" [label="09393"] ;
"8536f"->"09393" [style="dotted"] ;
"05eb1" [label="05eb1\nsvn-2-orig"] ;
"e0772"->"05eb1" ;
"67446" [label="67446"] ;
"05eb1"->"67446" ;
"d0f29" [label="d0f29\nmaster\nsvn-1-orig"] ;
"09393"->"d0f29" ;
"a5cf3" [label="a5cf3"] ;
"d0f29"->"a5cf3" ;
"17156" [label="17156"] ;
"7b397"->"17156" ;
"2379d" [label="2379d\nsvn-1"] ;
"c3ff2"->"2379d" ;
"5d9a0"->"2379d" ;
"0a742" [label="0a742"] ;
"67446"->"0a742" [style="dotted"] ;
"c3ff2" [label="c3ff2"] ;
"a5cf3"->"c3ff2" [style="dotted"] ;
"7b397" [label="7b397"] ;
"e0772" [label="e0772"] ;
"17156"->"e0772" [style="dotted"] ;
}
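
Editorial note: the script that produced this graph was not posted. A much simpler stand-in can be sketched with "git log"; the function name git_to_dot and the scratch repository are made up here, and unlike the graph above it emits every commit instead of collapsing uninteresting runs into dotted edges.

```shell
# Sketch: dump the commit graph of a repository in DOT format.
# git_to_dot is a hypothetical helper, not the script used in the thread.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git_to_dot() {
    echo 'strict digraph G {'
    # %h = abbreviated commit id, %p = abbreviated parent ids
    git log --all --format='%h %p' | while read -r commit parents; do
        echo "  \"$commit\" [label=\"$commit\"] ;"
        for parent in $parents; do
            echo "  \"$parent\"->\"$commit\" ;"
        done
    done
    echo '}'
}

# Tiny scratch history: root, one commit on each side, one merge.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/base
git commit -q --allow-empty -m 'root'
git checkout -q -b side
git commit -q --allow-empty -m 'side work'
git checkout -q base
git commit -q --allow-empty -m 'base work'
git merge -q --no-ff --no-edit side

dot_output=$(git_to_dot)
printf '%s\n' "$dot_output"
edges=$(printf '%s\n' "$dot_output" | grep -c -e '->')
nodes=$(printf '%s\n' "$dot_output" | grep -c 'label=')
```

The output can be piped through "dot -Tps" in the same way as the graph above.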
Josef Wolf
2009-05-13 12:09:22 UTC
Permalink
On Sat, May 09, 2009 at 01:58:22AM +0200, Josef Wolf wrote:

After lots of trial-and-error, I guess I've located the reason of
Post by Josef Wolf
# Create tags so we can see later what we have done
#
git tag svn-1-orig svn-1
git tag svn-2-orig svn-2
# move cherries from svn-1 to svn-2
#
git svn fetch svn-1
git checkout svn-2
[ cherry-picking, creates 67446..0a742 ]
git merge --no-ff -s ours svn-1 -m 'merge ours svn-1 to svn-2' # 5d9a0
git checkout svn-2/trunk
git merge --no-ff svn-2 -m 'merge svn-1 to svn-2' # f80d2
git svn dcommit
# check the results
#
git diff svn-2-orig svn-2/trunk
git diff svn-2-orig svn-2
# move cherries from svn-2 to svn-1
#
git svn fetch svn-2
git checkout svn-1
[ cherry-picking, creates a5cf3..c3ff2 ]
git merge --no-ff -s ours svn-2 -m 'merge ours svn-2 to svn-1' # 2379d
git checkout -q svn-1/trunk
git merge --no-ff svn-1 -m 'merge svn-2 to svn-1' # 693fa
git svn dcommit --no-rebase
# again, check the results
#
git diff svn-1-orig svn-1/trunk
git diff svn-1-orig svn-1
At this time, I made the graph attached below. Two things are
- svn-1 has all the imported commits and all the cherries as ancestors;
svn-2 does _not_ have a5cf3..c3ff2 as ancestors
- same thing happens for remotes/svn-1/trunk vs. remotes/svn-2/trunk
So, in order to mark a5cf3..c3ff2 as ancestors of svn-2, I've inserted
the following step at this place:

git checkout svn-2
git merge --no-ff -s ours svn-1

This works fine, a5cf3..c3ff2 are now recorded as ancestors of svn-2 and
will no longer be picked on future merges.

git checkout svn-2/trunk
git merge --no-ff -s ours svn-1
git svn dcommit

Now here's the problem: This last dcommit simply does a reset, because
nothing has changed since the last dcommit. So a5cf3..c3ff2 are _not_
marked as ancestors of svn-2/trunk, causing those cherries to be rebased
at the next dcommit with real changes.

Unfortunately, dcommit doesn't seem to have an option to force rebase
instead of resetting.

Any ideas how to mark those commits as ancestors of svn-2/trunk?
Post by Josef Wolf
# b8bf1, 8536f..09393, d0f29 imported from svn-1 repository
# 7b397, 17156..e0772, 05eb1 imported from svn-2 repository
# a5cf3..c3ff2 cherries from svn-2 to svn-1
# 67446..0a742 cherries from svn-1 to svn-2
# 5d9a0 merge ours svn-1 to svn-2
# 693fa merge svn-2 to svn-1
# 2379d merge ours svn-2 to svn-1
# f80d2 merge svn-1 to svn-2
Avery Pennarun
2009-05-13 17:28:04 UTC
Permalink
Now here's the problem: This last dcommit does simply a reset, because
nothing has changed since the last dcommit. So a5cf3..c3ff2 are _not_
marked as ancestors of svn-2/trunk, causing those cherries to be rebased
at the next dcommit with real changes.
I find this a *bit* curious, since each dcommit should be adding the
cherry-picked changes you just now picked from the opposite branch,
right? If you weren't going to change anything, then you wouldn't
have needed to do the cherry picks at all; you could have just done a
merge -s ours in both directions in the first place.

Anyway, regardless of the above, AFAIK there's no way to force svn to
make an empty commit, which is a problem in this case. You can make a
nonempty commit, though; I've done this in the past by just adding a
newline to the end of some arbitrary file. Basically:

git merge -s ours whatever
echo >>Makefile
git add Makefile
git commit --amend
git svn dcommit

Silly, but it works.
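
Editorial note: the git side of this workaround can be checked without svn. The sketch below uses a scratch repository with made-up branch names ("trunk", "other") and adds --no-edit to the amend so that no editor opens; it shows only that the amended merge keeps both parents while no longer being identical to its first parent. Whether dcommit then sends it is as reported above, not something the sketch verifies.

```shell
# Sketch: turn an empty '-s ours' merge into a non-empty one by amending it.
# Plain git only; repository layout and names are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk
printf 'all:\n' >Makefile
git add Makefile
git commit -q -m 'initial'

git checkout -q -b other
git commit -q --allow-empty -m 'change on the other side'

git checkout -q trunk
git merge -q --no-ff -s ours --no-edit other

# Right after '-s ours' the merge has the same tree as its first parent.
if git diff --quiet HEAD^1 HEAD; then empty_before=yes; else empty_before=no; fi

# The workaround: touch a file and fold the change into the merge commit.
echo >>Makefile
git add Makefile
git commit -q --amend --no-edit

if git diff --quiet HEAD^1 HEAD; then empty_after=yes; else empty_after=no; fi

# 'rev-list --parents' prints the commit id followed by its parent ids.
parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
echo "merge is empty before: $empty_before, after: $empty_after, parents: $parents"
```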
Unfortunately, dcommit doesn't seem to have an option to force rebase
instead of resetting.
Well, in fact it *is* rebasing, which throws away the extra commit
because it thinks that commit didn't do anything. I've experienced
this problem a few times in the past, but I knew what was happening
and I figured my case was too rare to matter. Perhaps not.

This could be considered a bug in git-svn, so I cc:'d Eric Wong, who I
think is the main git-svn developer. Anyway, try the workaround
above.

Good luck,

Avery
Josef Wolf
2009-05-13 22:22:43 UTC
Permalink
Thanks for your patience, Avery! I would be completely lost here without
your help.
Post by Avery Pennarun
Now here's the problem: This last dcommit does simply a reset, because
nothing has changed since the last dcommit. So a5cf3..c3ff2 are _not_
marked as ancestors of svn-2/trunk, causing those cherries to be rebased
at the next dcommit with real changes.

I find this a *bit* curious, since each dcommit should be adding the
cherry-picked changes you just now picked from the opposite branch,
right?
They _are_ already added. I try to outline what happens based on the
graph I posted in my previous mail:

# move cherries from svn-1 to svn-2
#
git svn fetch svn-1
git checkout svn-2
[ cherry-picking, creates 67446..0a742 ]
git merge --no-ff -s ours svn-1 -m 'merge ours svn-1 to svn-2' # 5d9a0
git checkout svn-2/trunk
git merge --no-ff svn-2 -m 'merge svn-1 to svn-2' # f80d2
git svn dcommit

This sequence moves cherries 67446..0a742 to svn-2/trunk and creates
5d9a0+f80d2
67446..0a742 are now ancestors of svn-2/trunk
a5cf3..c3ff2 are _not_ ancestors of svn-2/trunk (they don't even exist yet)

# move cherries from svn-2 to svn-1
#
git svn fetch svn-2
git checkout svn-1
[ cherry-picking, creates a5cf3..c3ff2 ]
git merge --no-ff -s ours svn-2 -m 'merge ours svn-2 to svn-1' # 2379d
git checkout -q svn-1/trunk
git merge --no-ff svn-1 -m 'merge svn-2 to svn-1' # 693fa
git svn dcommit --no-rebase

This sequence moves cherries a5cf3..c3ff2 to svn-1/trunk and creates
2379d+693fa
a5cf3..c3ff2 are now ancestors of svn-1/trunk
67446..0a742 are also ancestors of svn-1/trunk since 5d9a0 pulls them in.

Please notice the asymmetry here. If I try to merge another change to
svn-2 at this point, dcommit tries to pull the cherries a5cf3..c3ff2.
Since those cherries came initially from 17156..e0772, they are already
included literally in the current tree and I get a lot of conflicts,
To tell git about the fact that those cherries are already available in
svn-2/trunk, I try yet another merge set of commands:

git checkout svn-2
git merge --no-ff -s ours svn-1

This works fine, a5cf3..c3ff2 are now recorded as ancestors of svn-2 and
will no longer be picked on future merges.

git checkout svn-2/trunk
git merge --no-ff -s ours svn-1
git svn dcommit

And _this_ is the dcommit I was talking about in the paragraph you cited
above. This dcommit notices that the resulting tree is identical to the
tree already in svn, since it was merged with "-s ours" and no real change
was done in the mean time. So dcommit just resets and a5cf3..c3ff2 are
still not marked as ancestors of svn-2/trunk and would be pulled again
at the next merge attempt, resulting in conflicts.
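
A minimal sketch of why there is nothing to send in this situation. It uses a scratch git repository with no svn involved; the branch and file names are made up for illustration. After a "-s ours" merge the new commit has two parents, but its tree is byte-for-byte the tree of the first parent, so a tool that compares trees sees no change:

```shell
set -e
repo=$(mktemp -d)                  # scratch repository, hypothetical names throughout
cd "$repo"
git init -q
git config user.name demo
git config user.email demo@example.invalid

echo base > file; git add file; git commit -q -m base
git tag base
git checkout -q -b svn-1 base
echo one > only-in-1; git add only-in-1; git commit -q -m 'svn-1 change'
git checkout -q -b svn-2 base
echo two > only-in-2; git add only-in-2; git commit -q -m 'svn-2 change'

# Record svn-1 as merged into svn-2 without taking any of its content.
before=$(git rev-parse 'HEAD^{tree}')
git merge -q --no-ff -s ours -m 'merge ours svn-1 to svn-2' svn-1
after=$(git rev-parse 'HEAD^{tree}')

# One word for the commit itself plus one word per parent.
words=$(git rev-list --parents -n 1 HEAD | wc -w)

echo "tree before merge: $before"
echo "tree after merge:  $after"
```

The two tree ids printed at the end are the same, which matches the "No changes between current HEAD and ..." behaviour described above.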
Post by Avery Pennarun
If you weren't going to change anything, then you wouldn't
have needed to do the cherry picks at all; you could have just done a
merge -s ours in both directions in the first place.
The cherries _are_ moved in both directions. But the ancestry is not yet
adopted because at the time of the move from svn-1 to svn-2 the cherries
that were picked from svn-2 did not exist. Therefore dcommit wants to
move them to svn-2 (where the textual contents of those cherries already
are).
Post by Avery Pennarun
Anyway, regardless of the above, AFAIK there's no way to force svn to
make an empty commit, which is a problem in this case. You can make a
nonempty commit, though; I've done this in the past by just adding a

git merge -s ours whatever
echo >>Makefile
git add Makefile
git commit --amend
git svn dcommit
Ah, I see :-) You can do a _lot_ with git if you know _how_ to do it ;-)

Yes, that helps a little bit: all the cherries are now ancestors of both
remote branches and both local branches. But after this, all dcommits
complain about outdated transactions although there were no commits
to the svn repositories in the meantime:

$ git merge --no-ff -s ours svn-1
Merge made by ours.
$ echo >>Makefile
$ git add Makefile
$ git commit --amend -m 'Force merge ours svn-1 to svn-2/trunk'
Created commit ae455ca: Force merge ours svn-1 to svn-2/trunk
$ git svn dcommit
Committing to file:///var/tmp/builds/git-sync/svn/svn-2/trunk ...
M Makefile
Committed r1260
M Makefile
r1260 = 372579ff221a151f026eef42213e52e1b9bb9d47 (svn-2/trunk)
No changes between current HEAD and refs/remotes/svn-2/trunk
Resetting to the latest refs/remotes/svn-2/trunk
$ git checkout svn-2
Previous HEAD position was 372579f... Force merge ours svn-1 to svn-2/trunk
Switched to branch "svn-2"
$ git svn fetch svn-1
$ git merge --no-ff svn-1
Already up-to-date.
$ git svn dcommit
Committing to file:///var/tmp/builds/git-sync/svn/svn-2/trunk ...
Transaction is out of date: File '/trunk/policy.pl' is out of date at /usr/lib64/git/git-svn line 469

Gitk shows that svn-2 is no longer an ancestor of svn-2/trunk. Might this
be the reason for the "transaction out of date"? How do I recover from
that?
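
The ancestry question can also be checked from the command line instead of reading it off gitk. The following is only a sketch in a scratch repository: local-branch and tracking-branch are hypothetical stand-ins for svn-2 and svn-2/trunk, and `git merge-base --is-ancestor` is assumed to be available (it was added to git some years after this thread). It does not explain the "out of date" error; it only shows how to confirm that the two branches have diverged and by how many commits:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name demo
git config user.email demo@example.invalid

echo 1 > f; git add f; git commit -q -m one
git tag base

# Stand-in for svn-2/trunk: holds the commit as rewritten by dcommit.
git checkout -q -b tracking-branch base
echo t > t; git add t; git commit -q -m 'commit as rewritten by dcommit'

# Stand-in for svn-2: still holds the original, un-rewritten commit.
git checkout -q -b local-branch base
echo l > l; git add l; git commit -q -m 'original local commit'

is_ancestor() {
    # Prints yes when $1 is reachable from $2, no otherwise.
    if git merge-base --is-ancestor "$1" "$2"; then echo yes; else echo no; fi
}

local_in_tracking=$(is_ancestor local-branch tracking-branch)
only_local=$(git rev-list --count tracking-branch..local-branch)

echo "local-branch is an ancestor of tracking-branch: $local_in_tracking"
echo "commits only on local-branch: $only_local"
```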
Post by Avery Pennarun
Unfortunately, dcommit doesn't seem to have an option to force rebase
instead of resetting.

Well, in fact it *is* rebasing, which throws away the extra commit
because it thinks that commit didn't do anything. I've experienced
this problem a few times in the past, but I knew what was happening
and I figured my case was too rare to matter. Perhaps not.

This could be considered a bug in git-svn, so I cc:'d Eric Wong, who I
think is the main git-svn developer. Anyway, try the workaround
above.
I am not sure this is a bug. I have still the feeling that I am doing
something wrong. Maybe I should not try to throw two svn remotes onto
a single git repository? Maybe I should create a separate repository
for every direction?
Avery Pennarun
2009-05-14 06:35:18 UTC
Permalink
Thanks for your patience, Avery! I would be completely lost here without
your help..
Okay, well, I think I've been making things worse instead of better :(

Fundamentally, my claim that merging symmetrically between two svn
branches "ought to be easy" was incorrect. The problem is that when
git does a merge from branch A to B and then back to A, it really
*really* wants the two branches to end up identical. All of git's
merge machinery works on the assumption that this is what you want.
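
A small illustration of that assumption, as a sketch in a scratch repository with made-up branch names site-a and site-b. After merging site-a into site-b and then site-b back into site-a, the second merge is a fast-forward and both branches point at the same commit, so any site-specific difference is gone:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name demo
git config user.email demo@example.invalid

echo common > common; git add common; git commit -q -m common
git tag base
git checkout -q -b site-a base
echo a > a; git add a; git commit -q -m 'change on site-a'
git checkout -q -b site-b base
echo b > b; git add b; git commit -q -m 'change on site-b'

# Merge A into B, then B back into A.
git merge -q -m 'merge site-a into site-b' site-a
git checkout -q site-a
git merge -q site-b                # fast-forwards site-a to site-b

head_a=$(git rev-parse site-a)
head_b=$(git rev-parse site-b)
echo "site-a: $head_a"
echo "site-b: $head_b"
```

Both lines print the same commit id. Keeping b out of site-a, or a out of site-b, is exactly what a plain round-trip merge will not do.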

Now, you can still bypass this by using various clever tricks. The
solution we were working with turned out to be *almost* right, and I
did get it working in my test environment, but it got so convoluted
that I couldn't even explain to myself why it was correct. That's
usually a bad sign. So I threw that one away.

By far the sanest thing you could possibly do is to create a central
"public" branch that contains all the common commits, then merge from
that public branch to the site-specific branches, but never merge in
the opposite direction. In case you happen to make some changes on
the site-specific branches that you want to share, you can just
cherry-pick them; the resulting conflicts when merging back are likely
to be fairly minor. This would be entirely consistent with git's
normal operations, and would be easy:

git checkout public
git cherry-pick stuff # as rarely as possible; do the work directly on public if you can

git checkout svn-1
git merge --no-ff public
git svn dcommit

git checkout svn-2
git merge --no-ff public
git svn dcommit

No criss-cross merges, no insanity, no question about whether it's correct.
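
The one-directional flow above can be tried out without svn. This is a sketch in a scratch repository: the file names are hypothetical and the `git svn dcommit` steps are left out because there is no svn remote here. Each site branch keeps its own file, receives the shared change from public, and nothing site-specific leaks back into public:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name demo
git config user.email demo@example.invalid

echo common > common.txt; git add common.txt; git commit -q -m 'common base'
git branch public

# Two site-specific branches, each with one local-only file.
for n in 1 2; do
    git checkout -q -b svn-$n public
    echo "site $n" > local-$n.txt
    git add local-$n.txt
    git commit -q -m "site-specific change for svn-$n"
done

# Shared work happens on public only.
git checkout -q public
echo shared > shared.txt; git add shared.txt; git commit -q -m 'shared change'

# Merge public into each site branch, never the other way round.
for n in 1 2; do
    git checkout -q svn-$n
    git merge -q --no-ff -m "merge public into svn-$n" public
done

public_file_count=$(git ls-tree --name-only public | wc -l)
git ls-tree --name-only svn-1
```

The final listing for svn-1 shows common.txt, local-1.txt and shared.txt, while public still contains only the two shared files.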

More as an academic exercise than anything, I did find a way that will
let you do criss-cross merging of all changes on A and B. I still
don't *really* recommend you use it, because it's extremely error
prone, and there are lots of places where you could get merge
conflicts and then end up in trouble. (The above simple method, in
contrast, might get conflicts sometimes, but you can just fix them as
you encounter them and be done with it.)

The script below demonstrates how to take branches remote-ab and
remote-ac, and auto-pick their changes (as they happen) into a new
(automatically managed) branch public. Then it merges public back
into each branch, while avoiding conflicts. The magic itself happens
in extract() and crossmerge().

If nothing else, this method makes the gitk output far more sane than
the original method. This is because it doesn't include the history
of 'public' in the site-specific branches. That was the fundamental
flaw in the method I had identified originally. You can trick that
original method into working too, but it's stunningly complex. This
is much more sane, albeit still not really sane.

Enjoy!

Have fun,

Avery

P.S. Sorry for the mess. I suppose I should have broken down and
written (or asked for :)) a minimal test case earlier, as it quickly
revealed the problem.


#!/bin/bash -x
set -e
rm -rf tt
mkdir tt
cd tt
git init

count=100
newfile()
{
    count=$(($count + 1))
    echo $count >$1
    git add $1
    git commit -m "$1"
}

newfile .gitignore
git checkout -b public
newfile a
git checkout -b remote-ab public
newfile b
git checkout -b remote-ac public
newfile c

# We've simulated two remote branches (perhaps svn repositories), remote-ab
# and remote-ac. They contain one identical file (a) and one different file
# (b vs. c). We've arranged for the common part to end up in 'public'.

git tag -f remote-ab-lastmerge remote-ab
git tag -f remote-ac-lastmerge remote-ac

extract()
{
    last_public="$(git merge-base $1-lastmerge public)"
    if false; then
        # use this if you want each patch separately
        git branch -f $1-public $1
        git checkout $1-public
        git rebase --onto "$last_public" $1-lastmerge
    else
        # use this if you want changes squashed into one patch
        git branch -f $1-public "$last_public"
        git checkout $1-public
        git diff --binary $1-lastmerge $1 -- | git apply --index
        (
            echo "merged from $1"
            echo
            git log $1-lastmerge..$1
        ) | git commit -F -
    fi

    git checkout $1
    git merge -s ours -m 'no-op' $1-public
}

crossmerge()
{
    branches="remote-ab remote-ac"

    for b in $branches; do
        # extract the most recent changes from $b-lastmerge..$b
        # The changes can be found as public..$b-public
        extract $b

        # Merge those completed changes into public
        git checkout public
        git merge $b-public
        git branch -d $b-public
        # to reduce problems if this script dies halfway through
        git tag -f $b-lastmerge $b
    done

    # merge changes from public back into each branch.
    # changes that originated in each branch won't be re-merged, because
    # we already merged back $b-public into each $b.
    for b in $branches; do
        git checkout $b
        git merge --no-ff public
        git tag -f $b-lastmerge $b
    done
}

iterate()
{
    # Some changes have arrived in the remote repos:
    #git svn fetch ab
    #git svn fetch ac
    git checkout remote-ab
    newfile x$1
    git checkout remote-ac
    newfile y$1

    crossmerge
}

iterate 1
git tag remote-ab-1 remote-ab
git tag remote-ac-1 remote-ac

iterate 2
git tag remote-ab-2 remote-ab
git tag remote-ac-2 remote-ac

iterate 3

gitk --all
Josef Wolf
2009-05-14 21:41:20 UTC
Permalink
Post by Avery Pennarun
By far the sanest thing you could possibly do is to create a central
"public" branch that contains all the common commits, then merge from
that public branch to the site-specific branches, but never merge in
the opposite direction. In case you happen to make some changes on
the site-specific branches that you want to share, you can just
cherry-pick them; the resulting conflicts when merging back are likely
to be fairly minor. This would be entirely consistent with git's
normal operations, and would be easy:
git checkout public
git cherry-pick stuff # as rarely as possible; do the work directly on public if you can
git checkout svn-1
git merge --no-ff public
git svn dcommit
git checkout svn-2
git merge --no-ff public
git svn dcommit
No criss-cross merges, no insanity, no question about whether it's correct.
Indeed, this looks pretty simple. But AFAICS, this works only when
starting out with a virgin repository. In my situation, public is
currently empty and has to be constructed from scratch by picking
from the privates.

So it seems I have to sync the privates in a first step and build the
public from that in a second step.

So here's my second plan:
1. instead of doing the cherry-picking in a single repository, it might
be helpful to do it in separate repositories: one repository for each
direction. While there are still two remote svn repositories in each
svn repository, there is no need for criss-cross anymore. The flow
of the data is in one direction and it seems (at least at first glance)
I can use git-svn-rebase to get a linear history.
2. After the synchronization is done, I would merge the two repositories
into a third one to create the public repository. Since this will be
a pure git environment, I hope that the problems that are caused by svn's
lack of merge support will vanish.
3. Once the public repository exists, create the privates based on that
public.

Here's my first attempt for the first step:

# setup a repository template for the synchronization and configure the
# svn remotes
mkdir -p svn-sync.templ
(
cd svn-sync.templ
git svn init --stdlayout file:///svn/svn-1
git config merge.stat true
for remote in svn-1 svn-2; do
git config svn-remote.$remote.url file:///svn/$remote
git config svn-remote.$remote.fetch trunk:refs/remotes/$remote/trunk
git config svn-remote.$remote.branches branches/*:refs/remotes/$remote/*
git config svn-remote.$remote.tags tags/*:refs/remotes/$remote/tags/*
git svn fetch -R $remote
git checkout -b $remote $remote/trunk
git tag $remote-orig $remote
done
git gc
)

# now copy the template to create the repositories where the actual
# synchronization will be done
cp -a svn-sync.templ to-svn-1
cp -a svn-sync.templ to-svn-2

# move cherries from svn-1 to svn-2 in the to-svn-2 repository
(
cd to-svn-2
git svn fetch svn-1
git checkout svn-2
[ pick cherries ]
git svn dcommit
git tag -f svn-1-lastmerge svn-1
)

# move cherries from svn-2 to svn-1 in the to-svn-1 repository
(
cd to-svn-1
git svn fetch svn-2
git checkout svn-1
[ pick cherries ]
git svn dcommit
git tag -f svn-2-lastmerge svn-2
)

# time passes

# Move new commits from svn-1 to svn-2
(
cd to-svn-2
git checkout svn-1
git svn rebase
git checkout svn-2
git svn rebase svn-1
[ more cherries ]
git svn dcommit
git tag -f svn-1-lastmerge svn-1
)

# Move new commits from svn-2 to svn-1
(
cd to-svn-1
git checkout svn-2
git svn rebase
git checkout svn-1
git svn rebase svn-2
[ more cherries ]
git svn dcommit
git tag -f svn-2-lastmerge svn-2
)

At first glance, this seems to work. But there's the drawback that I
have to keep track of what has been merged manually. So there's
certainly room for improvement :)
Post by Avery Pennarun
More as an academic exercise than anything, I did find a way that will
let you do criss-cross merging of all changes on A and B. I still
don't *really* recommend you use it, because it's extremely error
prone, and there are lots of places where you could get merge
conflicts and then end up in trouble. (The above simple method, in
contrast, might get conflicts sometimes, but you can just fix them as
you encounter them and be done with it.)
The script below demonstrates how to take branches remote-ab and
remote-ac, and auto-pick their changes (as they happen) into a new
(automatically managed) branch public. Then it merges public back
into each branch, while avoiding conflicts. The magic itself happens
in extract() and crossmerge().
If nothing else, this method makes the gitk output far more sane than
the original method. This is because it doesn't include the history
of 'public' in the site-specific branches. That was the fundamental
flaw in the method I had identified originally. You can trick that
original method into working too, but it's stunningly complex. This
is much more sane, albeit still not really sane.
I will have to play a little bit with this script to get a better
understanding of how it works. But from the description, I got the
impression that it matches my (current) work flow pretty well:
Currently, initial changes are done in some private repository and
propagated to the other repositories from there. The only exception
is that currently, there's no such thing as a "public" repository.
Post by Avery Pennarun
P.S. Sorry for the mess. I suppose I should have broken down and
written (or asked for :)) a minimal test case earlier, as it quickly
revealed the problem.
Oh, I have learned a lot in this thread. And BTW: I _have_ tried to
write a minimal test case several times. But I simply was not able to
reproduce the problems there. The problems showed up only on the real
repositories.

Thank you very much Avery!
Avery Pennarun
2009-05-14 21:57:00 UTC
Permalink
No criss-cross merges, no insanity, no question about whether it's correct.
Indeed, this looks pretty simple. But AFAICS, this works only when
starting out with a virgin repository. In my situation, public is
currently empty and has to be constructed from scratch by picking
from the privates.
Not exactly; you simply produce a "public" repository however you want
to produce it. One easy way would be to copy one of the existing svn
branches, cherry pick and revert whatever commits you want, and call
it public. The prior history of that branch doesn't matter to the
algorithm.
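
A sketch of that approach in a scratch repository, with hypothetical file names. The public branch starts as a copy of svn-1 and the site-specific commit is reverted on it. The extra "-s ours" merge back into svn-1 is an assumption added for this illustration (it plays the same role as the 'no-op' merge in extract() above): without it, a later merge of public into svn-1 would apply the revert there too and delete the site-specific file.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name demo
git config user.email demo@example.invalid

echo common > common.txt; git add common.txt; git commit -q -m 'common 1'
git checkout -q -b svn-1
echo site > site-1.conf; git add site-1.conf; git commit -q -m 'site-specific'
site_commit=$(git rev-parse HEAD)
echo more >> common.txt; git commit -q -a -m 'common 2'

# public = copy of svn-1 minus the site-specific commit
git branch public svn-1
git checkout -q public
git revert --no-edit "$site_commit" >/dev/null

# Tell svn-1 that public is already merged, keeping svn-1's own tree.
git checkout -q svn-1
git merge -q -s ours -m 'record public as merged, keep site changes' public

# Later: a shared change lands on public and is merged into svn-1.
git checkout -q public
echo shared > shared.txt; git add shared.txt; git commit -q -m 'shared change'
git checkout -q svn-1
git merge -q --no-ff -m 'merge public into svn-1' public

public_file_count=$(git ls-tree --name-only public | wc -l)
git ls-tree --name-only svn-1
```

The listing for svn-1 contains common.txt, shared.txt and site-1.conf, while public holds only the two shared files.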
1. instead of doing the cherry-picking in a single repository, it might
   be helpful to do it in separate repositories: one repository for each
   direction. While there are still two remote svn repositories in each
   svn repository, there is no need for criss-cross anymore. The flow
   of the data is in one direction and it seems (at least at first glance)
   I can use git-svn-rebase to get a linear history.
it's still criss-crossing, it's just less obvious that way. One
repository is exactly the same as two repositories in git; all that
matters is the branch histories. So if you think this will fix it,
you're probably missing something :)
2. After the synchronization is done, I would merge the two repositories
   into a third one to create the public repository. Since this will be
   a pure git environment, I hope that the problems that are caused svn's
   lack of merge support will vanish.
I'd say that basically none of your problems have anything to do with
svn's lack of merge support, and everything to do with the fact that
you aren't doing all your changes first on a 'public' branch and then
merging from there into the private branches. (That's really not so
hard to do in svn either, and would save a ton of confusion.)
At first glance, this seems to work. But there's the drawback that I
have to keep track of what have been merged manually. So there's
certainly room for improvement :)
Right, the crossmerge() function in the script I sent is designed to avoid that.
I will have to play a little bit with this script to get a better
understanding how it works. But from the description, I got the
Currently, initial changes are done in some private repository and
propagated to the other repositories from there. The only exception
is that currently, there's no such thing as a "public" repository.
The public repository is nothing special and doesn't require any work
from you; it's simply maintained automatically from the private
branches. (Of course, if you start doing all your changes in the
public repository, life gets a little simpler.)

Have fun,

Avery
Josef Wolf
2009-05-15 17:52:03 UTC
Permalink
Post by Avery Pennarun
1. instead of doing the cherry-picking in a single repository, it might
   be helpful to do it in separate repositories: one repository for each
   direction. While there are still two remote svn repositories in each
   svn repository, there is no need for criss-cross anymore. The flow
   of the data is in one direction and it seems (at least at first glance)
   I can use git-svn-rebase to get a linear history.

it's still criss-crossing, it's just less obvious that way. One
repository is exactly the same as two repositories in git; all that
matters is the branch histories.
Yeah, I see... But this step is here _only_ to get the existing svn
repositories in sync again. After cherry-picking and dcommitting, those
cherry-pick repositories would be wiped. They have no real history. The
steps I outlined in my previous mail wouldn't even create any files in
the .git/refs subdirectory.

Once that is done, I can declare one of the existing repositories as
public and pull it via git-svn into a freshly created repos. The other
repos can then be recreated by cloning and applying patches. No svn
involved anymore here.
Post by Avery Pennarun
2. After the synchronization is done, I would merge the two repositories
   into a third one to create the public repository. Since this will be
   a pure git environment, I hope that the problems that are caused svn's
   lack of merge support will vanish.

I'd say that basically none of your problems have anything to do with
svn's lack of merge support, and everything to do with the fact that
you aren't doing all your changes first on a 'public' branch and then
merging from there into the private branches. (That's really not so
hard to do in svn either, and would save a ton of confusion.)
The problem here is that it does not match the work flow. IMHO, my work
flow is very similar to the work flow of the kernel, so I fail to see why
it can not work. See the analogies:

kernel: Submodule maintainers are committing into private repositories
me: People are committing into private repositories

kernel: Those commits are forwarded to Linus's repository
me: Those commits are forwarded to the public repository

kernel: Maintainers receive commits for other submodules from linus
me: Commits are distributed from public to private repositories

I can't believe all changes spring into life in linus's repository.

The only differences I can see are:
- size of the project (obviously)
- convert from multiple svn repos instead of bitkeeper
- private repositories have to keep local patches (but I guess maintainers
do that also)
Avery Pennarun
2009-05-15 19:05:14 UTC
Permalink
1. instead of doing the cherry-picking in a single repository, it might
   be helpful to do it in separate repositories: one repository for each
   direction. While there are still two remote svn repositories in each
   svn repository, there is no need for criss-cross anymore. The flow
   of the data is in one direction and it seems (at least at first glance)
   I can use git-svn-rebase to get a linear history.
it's still criss-crossing, it's just less obvious that way. One
repository is exactly the same as two repositories in git; all that
matters is the branch histories.
Yeah, I see... But this step is here _only_ to get the existing svn
repositories in sync again. After cherry-picking and dcommitting, those
cherry-pick repositories would be wiped. They have no real history. The
steps I outlined in my previous mail wouldn't even create any files in
the .git/refs subdirectory.
Hmm, getting them in sync the first time seems to be "easy"
(relatively), in that you've already done it, right? So it's a
one-time thing, doesn't need automation, and you already figured that
part out. So it seems like a non-issue one way or the other.
Once that is done, I can declare one of the existing repositories as
public and pull it via git-svn into a freshly created repos. The other
repos can then be recreated by cloning and applying patches. No svn
involved anymore here.
Yes, that works fine. Nothing stopping you from declaring one or the
other svn repos to be identical to "public."
I'd say that basically none of your problems have anything to do with
svn's lack of merge support, and everything to do with the fact that
you aren't doing all your changes first on a 'public' branch and then
merging from there into the private branches. (That's really not so
hard to do in svn either, and would save a ton of confusion.)
The problem here is that it does not match the work flow. IMHO, my work
flow is very similar to the work flow of the kernel, so I fail to see why
kernel: Submodule maintainers are committing into private repositories
me:     People are committing into private repositories
kernel: Those commits are forwarded to Linus's repository
me:     Those commits are forwarded to the public repository
kernel: Maintainers receive commits for other submodules from linus
me:     Commits are distributed from public to private repositories

There is one critical difference here: if someone merges from Linus
and then Linus merges back from them, then the two resulting
repositories will be *identical* (at least, the trees will be; if the
second merge uses --no-ff, the histories will be very slightly
different, but not importantly so).

If someone has patches that they don't want to send back to Linus, and
those patches are intermixed with ones they *do* want to send back,
then they either have to cherry pick them over to a separate branch
(which Linus can then pull), or equivalently they email individual
patches to Linus, or they need to rebase a lot, or they need to just
put their "finished" patches onto a separate branch and keep the
unfinished ones somewhere else that Linus won't pull.

Rebasing is (I think) actually the most common solution to this
problem, but it doesn't help if you're using svn. svn has no concept
of rebasing. (git svn rebase uses git rebase, but it's not really for
the same purpose.)
- size of the project (obviously)
- convert from multiple svn repos instead of bitkeeper
- private repositories have to keep local patches (but I guess maintainers
  do that also)
That last one is the source of all your problems. That said, the
script I provided *does* let you do this, if you're brave.

Have fun,

Avery
Josef Wolf
2009-05-17 11:24:49 UTC
Permalink
Post by Avery Pennarun
1. instead of doing the cherry-picking in a single repository, it might
   be helpful to do it in separate repositories: one repository for each
   direction. While there are still two remote svn repositories in each
   svn repository, there is no need for criss-cross anymore. The flow
   of the data is in one direction and it seems (at least at first glance)
   I can use git-svn-rebase to get a linear history.
it's still criss-crossing, it's just less obvious that way. One
repository is exactly the same as two repositories in git; all that
matters is the branch histories.
Yeah, I see... But this step is here _only_ to get the existing svn
repositories in sync again. After cherry-picking and dcommitting, those
cherry-pick repositories would be wiped. They have no real history. The
steps I outlined in my previous mail wouldn't even create any files in
the .git/refs subdirectory.

Hmm, getting them in sync the first time seems to be "easy"
(relatively), in that you've already done it, right?
No. I have a perl script to do the cherry-picking and to resolve
conflicts. Because of this, I can try out different methods to do the
sync. And of course, I'd like to do it in a way that produces the least
problems in the future, since this is a one-shot thing.
Post by Avery Pennarun
I'd say that basically none of your problems have anything to do with
svn's lack of merge support, and everything to do with the fact that
you aren't doing all your changes first on a 'public' branch and then
merging from there into the private branches. (That's really not so
hard to do in svn either, and would save a ton of confusion.)
The problem here is that it does not match the work flow. IMHO, my work
flow is very similar to the work flow of the kernel, so I fail to see why
kernel: Submodule maintainers are committing into private repositories
me:     People are committing into private repositories
kernel: Those commits are forwarded to Linus's repository
me:     Those commits are forwarded to the public repository
kernel: Maintainers receive commits for other submodules from linus
me:     Commits are distributed from public to private repositories

There is one critical difference here: if someone merges from Linus
and then Linus merges back from them, then the two resulting
repositories will be *identical* (at least, the trees will be; if the
second merge uses --no-ff, the histories will be very slightly
different, but not importantly so).

If someone has patches that they don't want to send back to Linus, and
those patches are intermixed with ones they *do* want to send back,
then they either have to cherry pick them over to a separate branch
(which Linus can then pull), or equivalently they email individual
patches to Linus, or they need to rebase a lot, or they need to just
put their "finished" patches onto a separate branch and keep the
unfinished ones somewhere else that Linus won't pull.
Does any description exist how this process works in detail?
Post by Avery Pennarun
Rebasing is (I think) actually the most common solution to this
problem, but it doesn't help if you're using svn. svn has no concept
of rebasing. (git svn rebase uses git rebase, but it's not really for
the same purpose.)
In the long term, I am willing to get rid of svn. But I have to create
a migration path, so I need to keep git+svn in parallel for a couple of
months.
Josef Wolf
2009-05-20 16:40:14 UTC
Permalink
Post by Avery Pennarun
I'd say that basically none of your problems have anything to do with
svn's lack of merge support, and everything to do with the fact that
you aren't doing all your changes first on a 'public' branch and then
merging from there into the private branches. (That's really not so
hard to do in svn either, and would save a ton of confusion.)
The problem here is that it does not match the work flow. IMHO, my work
flow is very similar to the work flow of the kernel, so I fail to see why
kernel: Submodule maintainers are committing into private repositories
me:     People are committing into private repositories
kernel: Those commits are forwarded to Linus's repository
me:     Those commits are forwarded to the public repository
kernel: Maintainers receive commits for other submodules from linus
me:     Commits are distributed from public to private repositories

There is one critical difference here: if someone merges from Linus
and then Linus merges back from them, then the two resulting
repositories will be *identical* (at least, the trees will be; if the
second merge uses --no-ff, the histories will be very slightly
different, but not importantly so).
Hmm, maybe submodules could be of some help here? If I put the generic
content into a submodule, and the localized content into (multiple)
superprojects, the kernel work flow should be easy to adopt, or am I
missing something?
