Discussion:
[RFC] Configuring (future) committags support in gitweb
(too old to reply)
Jakub Narebski
2008-11-08 19:07:53 UTC
Permalink
Francis Galiegue <***@one2team.net> writes
in "Need help for migration from CVS to git in one go..."
* third: also Bonsai-related; Bonsai can link to Bugzilla by
matching (wild guess) /\b(?:#?)(\d+)\b/ and transforming this into
http://your.bugzilla.fqdn.here/show_bug.cgi?id=$1. Does gitweb have
this built-in? (haven't looked yet) Is this planned, or has it been
discussed and been considered not worth the hassle?
Here below there is proposal how the committags support could look like
for gitweb _user_, which means how to configure gitweb to use (or do not
use) committags, how to configure committags, and how to define new
committags.


Committags are "tags" in commit messages, expanded when rendering commit
message, like gitweb now does for (shortened) SHA-1, converting them to
'object' view link. It should be done in a way to make it easy
configurable, preferably having to configure only variable part, and not
having to write whole replacement rule.

Possible committags include: _BUG(n)_, bug _#n_, _FEATURE(n),
Message-Id, plain text URL e.g. _http://repo.or.cz_, spam protecting
of email addresses, "rich text formatting" like *bold* and _underline_,
syntax highlighting of signoff lines.


I think it would be good idea to use repository config file for
setting-up repository-specific committags, and use whatever Perl
structure for global configuration. The config language can be
borrowed from "drivers" in gitattributes (`diff' and `merge' drivers).

So the example configuration could look like this:

[gitweb]
committags = sha1 signoff bugzilla

[committag "bugzilla"]
match = "\\b(?:#?)(\\d+)\\b"
link = "http://your.bugzilla.fqdn.here/show_bug.cgi?id=$1"

where 'sha1' and 'signoff' are built-in committags, committags are
applied in the order they are put in gitweb.committags; possible actions
for committag driver include:
* link: replace $match by '_<a href="$link">_$match_</a>_'
* html: replace $match by '_$html_'
* text: replace $match by '$text'
where '_a_' means that 'a' is treated as HTML, and is not expanded
further, and 'b' means that it can be further expanded by later
committags, and finally is HTML-escaped (esc_html).


What do you think about this?
--
Jakub Narebski
Poland
Francis Galiegue
2008-11-08 20:02:44 UTC
Permalink
Le Saturday 08 November 2008 20:07:53 Jakub Narebski, vous avez =E9crit=
Post by Jakub Narebski
in "Need help for migration from CVS to git in one go..."
* third: also Bonsai-related; Bonsai can link to Bugzilla by
matching (wild guess) /\b(?:#?)(\d+)\b/ and transforming this into
http://your.bugzilla.fqdn.here/show_bug.cgi?id=3D$1. Does gitweb ha=
ve
Post by Jakub Narebski
this built-in? (haven't looked yet) Is this planned, or has it been
discussed and been considered not worth the hassle?
Here below there is proposal how the committags support could look li=
ke
Post by Jakub Narebski
for gitweb _user_, which means how to configure gitweb to use (or do =
not
Post by Jakub Narebski
use) committags, how to configure committags, and how to define new
committags.
Your proposal goes much further than my initial question, but I thought=
I'd=20
jump in anyway :p
Post by Jakub Narebski
Committags are "tags" in commit messages, expanded when rendering com=
mit
Post by Jakub Narebski
message, like gitweb now does for (shortened) SHA-1, converting them =
to
Post by Jakub Narebski
'object' view link. It should be done in a way to make it easy
configurable, preferably having to configure only variable part, and =
not
Post by Jakub Narebski
having to write whole replacement rule.
Possible committags include: _BUG(n)_, bug _#n_, _FEATURE(n),
Message-Id, plain text URL e.g. _http://repo.or.cz_, spam protecting
of email addresses, "rich text formatting" like *bold* and _underline=
_,
Post by Jakub Narebski
syntax highlighting of signoff lines.
What do you mean with "not having to write whole replacement rule"?
Post by Jakub Narebski
I think it would be good idea to use repository config file for
setting-up repository-specific committags, and use whatever Perl
structure for global configuration. The config language can be
borrowed from "drivers" in gitattributes (`diff' and `merge' drivers)=
=2E
Post by Jakub Narebski
[gitweb]
committags =3D sha1 signoff bugzilla
[committag "bugzilla"]
match =3D "\\b(?:#?)(\\d+)\\b"
link =3D "http://your.bugzilla.fqdn.here/show_bug.cgi?id=3D$1"
where 'sha1' and 'signoff' are built-in committags, committags are
applied in the order they are put in gitweb.committags;
I don't understand what the "signoff" builtin is : is that a link to se=
e only=20
commits "Signed-off-by:" a particular person?

If so, might I suggest that an "alt" tells "Only show commits signed of=
f by=20
this person"?

And also, what about the sha1 builtin? AFAIK, a SHA1 can point to a com=
mit, a=20
tree, and others... In fact, it points to any of these right now, but h=
ow=20
would you tell apart these different SHA1s in a commit message? The onl=
y=20
obvious use I see for it is the builtin "Revert ..." commit message, th=
at the=20
commiter _can_ override...

Or would that be:

my $sha1_re =3D qr/[a-z[0-9]{40}/;

/(?:(?i:commit\s+))?\b($sha1_re)\b/ =3D> [link to commit $1]
/(?:(?i:tree\s+))\b($sha1_re)\b)/ =3D> [link to tree $1]
/(?:(?:tag\s+))\b($sha1_re)\b)/ =3D> [link to tag $1]

=46inally, is there any reason to think that a sha1 or signoff committa=
g will=20
ever need to be overriden in some way?
Post by Jakub Narebski
possible actions=20
* link: replace $match by '_<a href=3D"$link">_$match_</a>_'
* html: replace $match by '_$html_'
* text: replace $match by '$text'
where '_a_' means that 'a' is treated as HTML, and is not expanded
further, and 'b' means that it can be further expanded by later
committags, and finally is HTML-escaped (esc_html).
What use do you see for the html match? Just asking...

And I don't see what you '_a_' and '_b_' are about...

--=20
fge
Jakub Narebski
2008-11-08 22:35:48 UTC
Permalink
Le Saturday 08 November 2008 20:07:53 Jakub Narebski, vous avez =C3=A9=
Post by Jakub Narebski
in "Need help for migration from CVS to git in one go..."
* third: also Bonsai-related; Bonsai can link to Bugzilla by
matching (wild guess) /\b(?:#?)(\d+)\b/ and transforming this into
http://your.bugzilla.fqdn.here/show_bug.cgi?id=3D$1. Does gitweb ha=
ve
Post by Jakub Narebski
this built-in? (haven't looked yet) Is this planned, or has it been
discussed and been considered not worth the hassle?
[...]
Post by Jakub Narebski
Committags are "tags" in commit messages, expanded when rendering co=
mmit
Post by Jakub Narebski
message, like gitweb now does for (shortened) SHA-1, converting them=
to
Post by Jakub Narebski
'object' view link. It should be done in a way to make it easy
configurable, preferably having to configure only variable part, and=
not
Post by Jakub Narebski
having to write whole replacement rule.
Possible committags include: _BUG(n)_, bug _#n_, _FEATURE(n),
Message-Id, plain text URL e.g. _http://repo.or.cz_, spam protecting
of email addresses, "rich text formatting" like *bold* and _underlin=
e_,
Post by Jakub Narebski
syntax highlighting of signoff lines.
=20
What do you mean with "not having to write whole replacement rule"?
Like in example with 'link' rule, not having to write whole
<a href=3D"http://example.com/bugzilla.php?id=3D$1">$&</a>
(or something like that).
Post by Jakub Narebski
I think it would be good idea to use repository config file for
setting-up repository-specific committags, and use whatever Perl
structure for global configuration. The config language can be
borrowed from "drivers" in gitattributes (`diff' and `merge' drivers=
).
Post by Jakub Narebski
[gitweb]
committags =3D sha1 signoff bugzilla
[committag "bugzilla"]
match =3D "\\b(?:#?)(\\d+)\\b"
link =3D "http://your.bugzilla.fqdn.here/show_bug.cgi?id=3D$1"
where 'sha1' and 'signoff' are built-in committags, committags are
applied in the order they are put in gitweb.committags;
=20
I don't understand what the "signoff" builtin is : is that a link to =
see only=20
commits "Signed-off-by:" a particular person?
Committags doesn't need to be replaced by links. In this case I meant
here using 'signoff' class for Signed-off-by: (and the like) lines, by
wrapping it in '<span class=3D"signoff">' ... '</a>'.
And also, what about the sha1 builtin? AFAIK, a SHA1 can point to a c=
ommit, a=20
tree, and others... In fact, it points to any of these right now, but=
how=20
would you tell apart these different SHA1s in a commit message? The o=
nly=20
obvious use I see for it is the builtin "Revert ..." commit message, =
that the=20
commiter _can_ override...
SHA1 (or shortened SHA1 from 8 charasters to 40 characters, or to be
even more exact something that looks like SHA1) is replaced by link
to 'object' view, which in turn finds type of object and _redirect_
to proper view, be it 'commit' (most frequent), 'tag', 'blob' or 'tree'=
=2E

We could have used instead gitweb link with 'h' (hash) parameter, but
without 'a' (action) parameter, which currently finds type of object
and _uses_ correct view...
=20
Finally, is there any reason to think that a sha1 or signoff committa=
g will=20
ever need to be overriden in some way?
One might not want to link SHA1, for example if there are lots of false
positives because of commit message conventions or something, or refine
'signoff' committag to use different styles for different types of
signoff: Signed-off-by, Acked-by, Tested-by, other. Having explicit
'signoff' committag allows us also to put some committags _after_ it,
for example SPAM-protection of emails, or add some committag before
'sha1' to filter out some SHA1 match false positives.
=20
Post by Jakub Narebski
possible actions=20
* link: replace $match by '_<a href=3D"$link">_$match_</a>_'
* html: replace $match by '_$html_'
* text: replace $match by '$text'
where '_a_' means that 'a' is treated as HTML, and is not expanded
further, and 'b' means that it can be further expanded by later
committags, and finally is HTML-escaped (esc_html).
=20
What use do you see for the html match? Just asking...
=46or example 'signoff' committag... well, it is not exactly pure "html=
"
but rather something like template.

[committag "signoff"]
match =3D "(?i)^ *(signed[ \\-]off[ \\-]by[ :]|acked[ \\-]by[ :]|cc[=
:])"
templ =3D "{<span class=3D\"signoff\">}$1{</span>}"

Or simpler

[committag "signoff"]
match =3D "(?i)^ *(signed[ \\-]off[ \\-]by[ :]|acked[ \\-]by[ :]|cc[=
:])"
class =3D signoff
And I don't see what you '_a_' and '_b_' are about...
=46or example in link match, the text of the link can be further refine=
d
by committags later in sequence.

--=20
Jakub Narebski
Poland
Francis Galiegue
2008-11-08 23:27:14 UTC
Permalink
Le Saturday 08 November 2008 23:35:48 Jakub Narebski, vous avez =C3=A9c=
Post by Jakub Narebski
Post by Francis Galiegue
What do you mean with "not having to write whole replacement rule"?
Like in example with 'link' rule, not having to write whole
<a href=3D"http://example.com/bugzilla.php?id=3D$1">$&</a>
(or something like that).
OK, good one.
Post by Jakub Narebski
Post by Francis Galiegue
Post by Jakub Narebski
I think it would be good idea to use repository config file for
setting-up repository-specific committags, and use whatever Perl
structure for global configuration. The config language can be
borrowed from "drivers" in gitattributes (`diff' and `merge' drive=
rs).
Post by Jakub Narebski
Post by Francis Galiegue
Post by Jakub Narebski
[gitweb]
committags =3D sha1 signoff bugzilla
[committag "bugzilla"]
match =3D "\\b(?:#?)(\\d+)\\b"
link =3D "http://your.bugzilla.fqdn.here/show_bug.cgi?id=3D$1"
where 'sha1' and 'signoff' are built-in committags, committags are
applied in the order they are put in gitweb.committags;
I don't understand what the "signoff" builtin is : is that a link t=
o see
Post by Jakub Narebski
Post by Francis Galiegue
only commits "Signed-off-by:" a particular person?
Committags doesn't need to be replaced by links. In this case I meant
here using 'signoff' class for Signed-off-by: (and the like) lines, b=
y
Post by Jakub Narebski
wrapping it in '<span class=3D"signoff">' ... '</a>'.
Well, this would also mean to update gitweb.css, wouldn't it?
Post by Jakub Narebski
Post by Francis Galiegue
And also, what about the sha1 builtin? AFAIK, a SHA1 can point to a
commit, a tree, and others... In fact, it points to any of these ri=
ght
Post by Jakub Narebski
Post by Francis Galiegue
now, but how would you tell apart these different SHA1s in a commit
message? The only obvious use I see for it is the builtin "Revert .=
=2E."
Post by Jakub Narebski
Post by Francis Galiegue
commit message, that the commiter _can_ override...
SHA1 (or shortened SHA1 from 8 charasters to 40 characters, or to be
even more exact something that looks like SHA1) is replaced by link
to 'object' view, which in turn finds type of object and _redirect_
to proper view, be it 'commit' (most frequent), 'tag', 'blob' or 'tre=
e'.
Post by Jakub Narebski
We could have used instead gitweb link with 'h' (hash) parameter, but
without 'a' (action) parameter, which currently finds type of object
and _uses_ correct view...
OK, you lost me somewhat.

What I understand is that right now, the SHA1 links are "pre-processed"=
by=20
gitweb so that the 'a' parameter is correct, right?

Out of curiosity, I just went to the kernel git repository (I don't kno=
w the=20
git version that git.kernel.org uses offhand) and altered the 'a' param=
eter=20
to something which is not even an 'a' command at all: 500...

However, if I try a VALID 'a' command with an "irrelevant" 'h' paramete=
r, it=20
acts quite funny: it just looks like it wants to try the closest match,=
but=20
takes some time figuring it out... Sometimes to something relevant, som=
etimes=20
to nothing really relevant. See for instance [1], in which 'a' was=20
originally "commit".

Ow.

[1]
http://git.kernel.org/?p=3Dlinux/kernel/git/stable/linux-2.6.27.y.git;a=
=3Dtag;h=3D788a5f3f70e2a9c46020bdd3a195f2a866441c5d
Post by Jakub Narebski
Post by Francis Galiegue
Finally, is there any reason to think that a sha1 or signoff commit=
tag
Post by Jakub Narebski
Post by Francis Galiegue
will ever need to be overriden in some way?
One might not want to link SHA1, for example if there are lots of fal=
se
Post by Jakub Narebski
positives because of commit message conventions or something, or refi=
ne
Post by Jakub Narebski
'signoff' committag to use different styles for different types of
signoff: Signed-off-by, Acked-by, Tested-by, other. Having explicit
'signoff' committag allows us also to put some committags _after_ it,
for example SPAM-protection of emails, or add some committag before
'sha1' to filter out some SHA1 match false positives.
Hmmm, so this means you'd want to make styles customizable somewhat (si=
gnoff).=20
In fact, what you really want is span for CSS! Then why not, just, maki=
ng a=20
document to say "This is what you can do with CSS for gitweb", and say =
"these=20
are the available CSS tags", and then be done with it?

I mean, when comes the day that someone will WANT other spans to be def=
ined,=20
badly, it's not like it will be unheard of, won't it?=20
Post by Jakub Narebski
Post by Francis Galiegue
And I don't see what you '_a_' and '_b_' are about...
For example in link match, the text of the link can be further refine=
d
Post by Jakub Narebski
by committags later in sequence.
I still don't get it. Can you give an example?

[personal thoughts: it would be really, really nice if, somewhat, gitwe=
b.perl=20
were splitted somewhat into different modules, and ideally use more=20
of "what's out there on CPAN". I'm convinced that some CPAN modules wou=
ld be=20
of GREAT help to gitweb, as well as I'm convinced that not many people =
out=20
there use Windows to run gitweb anyway :p]

--=20
fge
Jakub Narebski
2008-11-09 00:25:11 UTC
Permalink
Le Saturday 08 November 2008 23:35:48 Jakub Narebski, vous avez =C3=A9=
Post by Francis Galiegue
I don't understand what the "signoff" builtin is : is that a link t=
o see
Post by Francis Galiegue
only commits "Signed-off-by:" a particular person?
Committags doesn't need to be replaced by links. In this case I mean=
t
here using 'signoff' class for Signed-off-by: (and the like) lines, =
by
wrapping it in '<span class=3D"signoff">' ... '</a>'.
=20
Well, this would also mean to update gitweb.css, wouldn't it?
Not necessary. Please remember that you can configure gitweb to either
use _alternate_ stylesheet (instead of provided gitweb.css), or use
_additional_ stylesheet (for example gitweb-commit.css in addition to
gitweb.css)
=20
Post by Francis Galiegue
And also, what about the sha1 builtin? AFAIK, a SHA1 can point to a
commit, a tree, and others... In fact, it points to any of these ri=
ght
Post by Francis Galiegue
now, but how would you tell apart these different SHA1s in a commit
message? The only obvious use I see for it is the builtin "Revert .=
=2E."
Post by Francis Galiegue
commit message, that the commiter _can_ override...
SHA1 (or shortened SHA1 from 8 charasters to 40 characters, or to be
even more exact something that looks like SHA1) is replaced by link
to 'object' view, which in turn finds type of object and _redirect_
to proper view, be it 'commit' (most frequent), 'tag', 'blob' or 'tr=
ee'.
We could have used instead gitweb link with 'h' (hash) parameter, bu=
t
without 'a' (action) parameter, which currently finds type of object
and _uses_ correct view...
=20
OK, you lost me somewhat.
I'm sorry about that. Perhaps I should use only one mechanism.
=20
What I understand is that right now, the SHA1 links are "pre-processe=
d" by=20
gitweb so that the 'a' parameter is correct, right?
No, they are 'post-processed': finding correct action is left to
_after_ you click on the link (it is more natural, and helps
performance).

If you don't know action for given SHA-1 you can use either

http://example.com/gitweb.cgi?a=3Dobject;h=3Ddeadbeef

which finds correct type (for example 'commit') and redirects using
HTTP 302 Found redirection to

http://example.com/gitweb.cgi?a=3Dcommit;h=3Ddeadbeef

This way is used bu SHA-1 committag in commit messages.


Alternate solution (meant for bugtrackers), done by another author,
is to simply skip action ('a') parameter:

http://example.com/gitweb.cgi?h=3Ddeadbeef

and then gitweb finds type of object and act accordingly (without
redirect to correct view).

[...]
Post by Francis Galiegue
Finally, is there any reason to think that a sha1 or signoff commit=
tag
Post by Francis Galiegue
will ever need to be overriden in some way?
One might not want to link SHA1, for example if there are lots of fa=
lse
positives because of commit message conventions or something, or ref=
ine
'signoff' committag to use different styles for different types of
signoff: Signed-off-by, Acked-by, Tested-by, other. Having explicit
'signoff' committag allows us also to put some committags _after_ it=
,
for example SPAM-protection of emails, or add some committag before
'sha1' to filter out some SHA1 match false positives.
=20
Hmmm, so this means you'd want to make styles customizable somewhat (=
signoff).=20
In fact, what you really want is span for CSS! Then why not, just, ma=
king a=20
document to say "This is what you can do with CSS for gitweb", and sa=
y "these=20
are the available CSS tags", and then be done with it?
=20
I mean, when comes the day that someone will WANT other spans to be d=
efined,=20
badly, it's not like it will be unheard of, won't it?=20
Errr... I don't understand.

The examples perhaps are not the best. One would be for example to use
different class (different style) for Signed-off-by signoff (which
denotes signing Certificate of Origin), and the rest of (informative)
signoff:

[gitweb]
committags =3D sha1 signoff_signed signoff

Another example would be to add SPAM-protection of emails, for example
replacing '***@example.com' by 'user AT example DOT com', or something
more advanced. This have to be used _after_ signoff, because otherwise
regexp could have difficulties matching mangled email

[gitweb]
committags =3D sha1 signoff mailto
Post by Francis Galiegue
And I don't see what you '_a_' and '_b_' are about...
For example in link match, the text of the link can be further refin=
ed
by committags later in sequence.
=20
I still don't get it. Can you give an example?
=46or example signoff line

Signed-off-by: A U Thor <***@example.com>

would be replaced by
{</a>}
=20
where parts inside {...} is HTML code, and should be not expanded
further, and parts outside it could be expanded further by following
lower priority committags (like anti-SPAM for emails), and have to be
finally HTML escaped (like '<' and '>' in email in signoff).


=2E....................................................................
[personal thoughts: it would be really, really nice if, somewhat, git=
web.perl=20
were splitted somewhat into different modules, and ideally use more=20
of "what's out there on CPAN". I'm convinced that some CPAN modules w=
ould be=20
of GREAT help to gitweb, as well as I'm convinced that not many peopl=
e out=20
there use Windows to run gitweb anyway :p]
=46irst, having gitweb in (almost) one piece makes for easier installat=
ion.
But there are plans to have gitweb use Git.pm or future Got::Repo and
friends. I'm not sure about further splitting...

Second, we cannot in good faith use CPAN modules which cannot be found
in standard Perl distributions, or at least in some trusted extras
package (application) repositories, as gitweb is sometimes run on
machines with tight security (think kernel.org for example) where you
cannot simply ask admin to install some third-party alpha-version CPAN
module.

--=20
Jakub Narebski
Poland
Marcel M. Cary
2009-02-17 15:32:01 UTC
Permalink
I'm interested in cross-linking bug references in commit messages to a
bug tracking system. I started tinkering a couple weeks ago and am
finally understanding that committags encompass this functionality.
(From the subject line I first understood "tags" to mean git tags rathe=
r
than commit message munging.)

Is the committags idea still under active development?

I'd be happy to share what I have, which is for bug linking only, but
very little code would apply to the more general concept of committags.
Here are some ideas that might apply...


Two regexes would make it easier to configure a driver without needing
look-ahead and look-behind assertions. For example, if you want to
match non-negative integers but only in the context of a Resolves-bug
header:

Resolves-bug: 1234, 1235

With two regexes you can write:

/^Resolves-bug: \d+(, \d+)*/
/\d+/

But with only one, you'd have to write:

/(?<=3D^Resolves-bug: (\d+, )*)(\d+)/

The need for a lookbehind assertion means I need to stop at perlreref t=
o
lookup syntax. Hrm... and with testing I see that it's worse than that=
:

$ perl -wpe 's/(?<=3D^Resolves-bug: (\d+, )*)(\d+)/[$2]/g'
Variable length lookbehind not implemented in regex; marked by <--
HERE in m/(?<=3D^Resolves-bug: (\d+, )*)(\d+) <-- HERE / at -e line=
1.

I guess it can't be done even with the look-behind assertion.


I got the two-regex idea from a spec I ran across while evaluating
Subversion:

http://guest:@tortoisesvn.tigris.org/svn/tortoisesvn/trunk/doc/issuetra=
ckers.txt

If there is interest in BTS integration beyond gitweb, for example in
git-gui, gitk, or the Windows UIs, then perhaps this spec would be wort=
h
considering. It covers more than just hyperlinking. It also considers
issues like how to draw the form field for a bug ID as part of a commit
message form, how to validate that form field, and then how to munge th=
e
log message to include the entered ID. Some details, like using a
newline to separate the two regexes, might be more awkward for Git than
Subversion.


I like the idea of allowing a regex writer -- a gitweb admin or a
repository owner -- to ignore issues regarding HTML escaping. For
example, I'd rather not have &nbsp; in the regex. And I don't want the
replacement to have to escape "&" in a query string. That's a strength
of not having to write the whole link replacement rule. And I think
hyperlinking will be one of the most common uses of this committag
feature, so it's worth special support.

In the case of false positives, it might also be helpful to have a titl=
e
attribute that explains the committag's interpretation of the text.

I also like the idea of giving the admin full control to specify a Perl
function of some sort, which might go as far as looking up bug summarie=
s
for the "title" attribute or adding JS to fetch it via AJAX on
mouse-over. But I doubt I would bother with that myself.


Appealing as it is, the use of '$1' in my replacements didn't work for =
me:

$ perl -wpe '$reg =3D "(\\d+)"; $rep =3D ".\$1."; s/$reg/$rep/g'
123
.$1.

I think usage of capturing parenthesis is important, even with two
regexes, because it makes it easier to specify link text that's broader
than the data that goes in the URL. Specifically, I wanted to be able
to produce HTML like this, with the hash mark hyperlinked but not used
in the URL:

<a href=3D"...bug=3D123">#123</a>

I guess that's just my aesthetic. To support that, my code calls
sprintf with $&, $1, $2, ... $9, and that particualr replacement URL
uses %2$s.


I'm concerned about the composition of these committag drivers. In
other words, will it be hard for the configurer to manage interactions
between committag drivers? To choose a sane order, will I have to
understand the implementation details of each committag driver?

Perhaps a simpler alternative would be to let at most one driver proces=
s
a given snippet of text, forbidding nesting of replacements. (If I
understand Junio's suggestion to use a list of strings and refs,
non-nesting overlaps are already not supported.) If all replacements
were hyperlinks -- and I expect that to be the common case -- they
wouldn't be nestable anyway. I wouldn't see it as a huge loss for the
nesting examples I can think of: Separate rules for span around S-o-b
and linking or obfuscation of email could be combined into one... A
rule to shade text quoted email-style with leading angle brackets could
just clobber any further processing of that text. And it might simplif=
y
the code and testing of it quite a bit.

If committags do turn out to support nesting, perhaps it would make
sense to stratify the ordering so that it's clear whether a particular
driver takes as input HTML vs. text and outputs HTML vs. text. (For
example, weak email obfuscation might be text -> text.) I guess to
strictly honor the input and output types of a driver, the text -> html
drivers still have to be expanded in a single pass.


A few ideas for drivers that I don't think have been mentioned yet:

* Wiki page names, like to [[Feature Documentation]]. These are notabl=
e
because they tend to contain punctuation that get HTML-escaped, like
quotes and ampersands.

* Links to gitweb itself, such as 123abc:file.txt and HEAD:file.txt. I
guess the current hash linking sort of does the first case except that
you have to get the hash of the blob instead of using the commit hash,
and the current hash linking wouldn't reveal the filename until after
you click, nor when viewing textual log messages. I'm not sure whether
special support for linking to multi-commit diffs or other object types
would be as helpful.

Marcel
Post by Jakub Narebski
Le Saturday 08 November 2008 20:07:53 Jakub Narebski, vous avez =E9c=
Post by Jakub Narebski
in "Need help for migration from CVS to git in one go..."
* third: also Bonsai-related; Bonsai can link to Bugzilla by
matching (wild guess) /\b(?:#?)(\d+)\b/ and transforming this into
http://your.bugzilla.fqdn.here/show_bug.cgi?id=3D$1. Does gitweb h=
ave
Post by Jakub Narebski
Post by Jakub Narebski
this built-in? (haven't looked yet) Is this planned, or has it bee=
n
Post by Jakub Narebski
Post by Jakub Narebski
discussed and been considered not worth the hassle?
[...]
Post by Jakub Narebski
Committags are "tags" in commit messages, expanded when rendering c=
ommit
Post by Jakub Narebski
Post by Jakub Narebski
message, like gitweb now does for (shortened) SHA-1, converting the=
m to
Post by Jakub Narebski
Post by Jakub Narebski
'object' view link. It should be done in a way to make it easy
configurable, preferably having to configure only variable part, an=
d not
Post by Jakub Narebski
Post by Jakub Narebski
having to write whole replacement rule.
Possible committags include: _BUG(n)_, bug _#n_, _FEATURE(n),
Message-Id, plain text URL e.g. _http://repo.or.cz_, spam protectin=
g
Post by Jakub Narebski
Post by Jakub Narebski
of email addresses, "rich text formatting" like *bold* and _underli=
ne_,
Post by Jakub Narebski
Post by Jakub Narebski
syntax highlighting of signoff lines.
What do you mean with "not having to write whole replacement rule"?
Like in example with 'link' rule, not having to write whole
<a href=3D"http://example.com/bugzilla.php?id=3D$1">$&</a>
(or something like that).
Post by Jakub Narebski
I think it would be good idea to use repository config file for
setting-up repository-specific committags, and use whatever Perl
structure for global configuration. The config language can be
borrowed from "drivers" in gitattributes (`diff' and `merge' driver=
s).
Post by Jakub Narebski
Post by Jakub Narebski
[gitweb]
committags =3D sha1 signoff bugzilla
[committag "bugzilla"]
match =3D "\\b(?:#?)(\\d+)\\b"
link =3D "http://your.bugzilla.fqdn.here/show_bug.cgi?id=3D$1=
"
Post by Jakub Narebski
Post by Jakub Narebski
where 'sha1' and 'signoff' are built-in committags, committags are
applied in the order they are put in gitweb.committags;
I don't understand what the "signoff" builtin is : is that a link to=
see only
Post by Jakub Narebski
commits "Signed-off-by:" a particular person?
Committags doesn't need to be replaced by links. In this case I meant
here using 'signoff' class for Signed-off-by: (and the like) lines, b=
y
Post by Jakub Narebski
wrapping it in '<span class=3D"signoff">' ... '</a>'.
And also, what about the sha1 builtin? AFAIK, a SHA1 can point to a =
commit, a
Post by Jakub Narebski
tree, and others... In fact, it points to any of these right now, bu=
t how
Post by Jakub Narebski
would you tell apart these different SHA1s in a commit message? The =
only
Post by Jakub Narebski
obvious use I see for it is the builtin "Revert ..." commit message,=
that the
Post by Jakub Narebski
commiter _can_ override...
SHA1 (or shortened SHA1 from 8 charasters to 40 characters, or to be
even more exact something that looks like SHA1) is replaced by link
to 'object' view, which in turn finds type of object and _redirect_
to proper view, be it 'commit' (most frequent), 'tag', 'blob' or 'tre=
e'.
Post by Jakub Narebski
We could have used instead gitweb link with 'h' (hash) parameter, but
without 'a' (action) parameter, which currently finds type of object
and _uses_ correct view...
Finally, is there any reason to think that a sha1 or signoff committ=
ag will
Post by Jakub Narebski
ever need to be overriden in some way?
One might not want to link SHA1, for example if there are lots of fal=
se
Post by Jakub Narebski
positives because of commit message conventions or something, or refi=
ne
Post by Jakub Narebski
'signoff' committag to use different styles for different types of
signoff: Signed-off-by, Acked-by, Tested-by, other. Having explicit
'signoff' committag allows us also to put some committags _after_ it,
for example SPAM-protection of emails, or add some committag before
'sha1' to filter out some SHA1 match false positives.
Post by Jakub Narebski
possible actions
* link: replace $match by '_<a href=3D"$link">_$match_</a>_'
* html: replace $match by '_$html_'
* text: replace $match by '$text'
where '_a_' means that 'a' is treated as HTML, and is not expanded
further, and 'b' means that it can be further expanded by later
committags, and finally is HTML-escaped (esc_html).
What use do you see for the html match? Just asking...
For example 'signoff' committag... well, it is not exactly pure "html=
"
Post by Jakub Narebski
but rather something like template.
[committag "signoff"]
match =3D "(?i)^ *(signed[ \\-]off[ \\-]by[ :]|acked[ \\-]by[=
:]|cc[ :])"
Post by Jakub Narebski
templ =3D "{<span class=3D\"signoff\">}$1{</span>}"
Or simpler
[committag "signoff"]
match =3D "(?i)^ *(signed[ \\-]off[ \\-]by[ :]|acked[ \\-]by[=
:]|cc[ :])"
Post by Jakub Narebski
class =3D signoff
And I don't see what you '_a_' and '_b_' are about...
For example in link match, the text of the link can be further refine=
d
Post by Jakub Narebski
by committags later in sequence.
--
Jakub Narebski
Poland
--
To unsubscribe from this list: send the line "unsubscribe git" in
More majordomo info at http://vger.kernel.org/majordomo-info.html
Marcel M. Cary
2009-02-18 03:00:42 UTC
Permalink
When a feature like "blame" is permitted to be overridden in the
repository configuration but it is not actually set in the
repository, a warning is emitted due to the undefined value
of the repository configuration, even though it's a perfectly
normal condition.

The warning is grounds for test failure in the gitweb test script,
so it causes some new feature tests of mine to fail.

This patch prevents warning and adds a test case to exercise it.

Signed-off-by: Marcel M. Cary <***@oak.homeunix.org>
---

Here's a small patch I put together while tinkering with bug hyperlinking.
Does this look reasonable?

Marcel


gitweb/gitweb.perl | 8 +++++---
t/t9500-gitweb-standalone-no-errors.sh | 5 +++++
2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 7c48181..653f0be 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -402,13 +402,13 @@ sub feature_bool {
my $key = shift;
my ($val) = git_get_project_config($key, '--bool');

- if ($val eq 'true') {
+ if (!defined $val) {
+ return ($_[0]);
+ } elsif ($val eq 'true') {
return (1);
} elsif ($val eq 'false') {
return (0);
}
-
- return ($_[0]);
}

sub feature_snapshot {
@@ -1978,6 +1978,8 @@ sub git_get_project_config {
$config_file = "$git_dir/config";
}

+ return undef if (!defined $config{"gitweb.$key"});
+
# ensure given type
if (!defined $type) {
return $config{"gitweb.$key"};
diff --git a/t/t9500-gitweb-standalone-no-errors.sh b/t/t9500-gitweb-standalone-no-errors.sh
index 7c6f70b..559045e 100755
--- a/t/t9500-gitweb-standalone-no-errors.sh
+++ b/t/t9500-gitweb-standalone-no-errors.sh
@@ -662,6 +662,11 @@ cat >>gitweb_config.perl <<EOF
EOF

test_expect_success \
+ 'config override: tree view, features not overridden in repo config' \
+ 'gitweb_run "p=.git;a=tree"'
+test_debug 'cat gitweb.log'
+
+test_expect_success \
'config override: tree view, features disabled in repo config' \
'git config gitweb.blame no &&
git config gitweb.snapshot none &&
--
1.6.1
Marcel M. Cary
2009-02-18 03:00:43 UTC
Permalink
The current implementation only hyperlinks the first hash on
a given line of the commit message. It seems sensible to
highlight all of them if there are multiple, and it seems
plausible that there would be multiple even with a tidy line
length limit, because they can be abbreviated as short as 8
characters.

Benchmark:

I wanted to make sure that using the 'e' switch to the Perl regex
wasn't going to kill performance, since this is called once per commit
message line displayed.

In all three A/B scenarios I tried, the A and B yielded the same
results within 2%, where A is the version of code before this patch
and B is the version after.

1: View a commit message containing the last 1000 commit hashes
2: View a commit message containing 1000 lines of 40 dots to avoid
hyperlinking at the same message length
3: View a short merge commit message with a few lines of text and
no hashes

All were run in CGI mode on my sub-production hardware on a recent
clone of git.git. Numbers are the average of 10 reqests per second
with the first request discarded, since I expect this change to affect
primarily CPU usage. Measured with ApacheBench.

Note that the web page rendered was the same; while the new code
supports multiple hashes per line, there was at most one per line.

The primary purpose of scenarios 2 and 3 were to verify that the
addition of 1000 commit messages had an impact on how much of the time
was spent rendering commit messages. They were all within 2% of 0.80
requests per second (much faster).

So I think the patch has no noticeable effect on performance.

Signed-off-by: Marcel M. Cary <***@oak.homeunix.org>
---

And here's another.

Marcel


gitweb/gitweb.perl | 12 +++++-------
1 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 653f0be..51b7f56 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1384,13 +1384,11 @@ sub format_log_line_html {
my $line = shift;

$line = esc_html($line, -nbsp=>1);
- if ($line =~ m/\b([0-9a-fA-F]{8,40})\b/) {
- my $hash_text = $1;
- my $link =
- $cgi->a({-href => href(action=>"object", hash=>$hash_text),
- -class => "text"}, $hash_text);
- $line =~ s/$hash_text/$link/;
- }
+ $line =~ s{\b([0-9a-fA-F]{8,40})\b}{
+ return $cgi->a({-href => href(action=>"object", hash=>$1),
+ -class => "text"}, $1);
+ }eg;
+
return $line;
}
--
1.6.1
Jakub Narebski
2009-02-18 21:55:11 UTC
Permalink
Post by Marcel M. Cary
The current implementation only hyperlinks the first hash on
a given line of the commit message. It seems sensible to
highlight all of them if there are multiple, and it seems
plausible that there would be multiple even with a tidy line
length limit, because they can be abbreviated as short as 8
characters.
That is a good catch. Code simply was not modified since we required
fill-length 40-characters SHA-1 id.
Post by Marcel M. Cary
I wanted to make sure that using the 'e' switch to the Perl regex
wasn't going to kill performance, since this is called once per commit
message line displayed.
In all three A/B scenarios I tried, the A and B yielded the same
results within 2%, where A is the version of code before this patch
and B is the version after.
1: View a commit message containing the last 1000 commit hashes
2: View a commit message containing 1000 lines of 40 dots to avoid
hyperlinking at the same message length
3: View a short merge commit message with a few lines of text and
no hashes
I don't think we should worry about that; after all esc_path and
unescape subroutines also use 'e' switch to Perl regexp.

So the benchmark is nice addition, but I don't think it is really
necessary, especially that the change results in shorter and easier
(I think) to maintain code.

[...]
Post by Marcel M. Cary
So I think the patch has no noticeable effect on performance.
---
And here's another.
Marcel
Do I understand correctly that those patches are not related at all
semantically or textually, only in that you have them one after other
(and blob sha-1 in the index line reflects state after former), isn't
it?
Post by Marcel M. Cary
gitweb/gitweb.perl | 12 +++++-------
1 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 653f0be..51b7f56 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1384,13 +1384,11 @@ sub format_log_line_html {
my $line = shift;
$line = esc_html($line, -nbsp=>1);
- if ($line =~ m/\b([0-9a-fA-F]{8,40})\b/) {
- my $hash_text = $1;
- my $link =
- $cgi->a({-href => href(action=>"object", hash=>$hash_text),
- -class => "text"}, $hash_text);
- $line =~ s/$hash_text/$link/;
- }
+ $line =~ s{\b([0-9a-fA-F]{8,40})\b}{
+ return $cgi->a({-href => href(action=>"object", hash=>$1),
+ -class => "text"}, $1);
+ }eg;
+
Almost correct... but for this unnecessary 'return' statement.
Without it: ACK.
Post by Marcel M. Cary
return $line;
}
--
1.6.1
P.S. Why bare emails (without user names), e.g. "***@suse.cz"
and not "Petr Baudis <***@suse.cz>"? Just curious...
--
Jakub Narebski
Poland
Junio C Hamano
2009-02-20 08:35:41 UTC
Permalink
Post by Jakub Narebski
Post by Marcel M. Cary
+ $line =~ s{\b([0-9a-fA-F]{8,40})\b}{
+ return $cgi->a({-href => href(action=>"object", hash=>$1),
+ -class => "text"}, $1);
+ }eg;
+
Almost correct... but for this unnecessary 'return' statement.
Without it: ACK.
I've applied this directly on 'master' without the return from inside s///e
with your Ack. Please check the result.

Thanks.
Jakub Narebski
2009-02-20 11:46:57 UTC
Permalink
Post by Junio C Hamano
Post by Jakub Narebski
Post by Marcel M. Cary
+ $line =~ s{\b([0-9a-fA-F]{8,40})\b}{
+ return $cgi->a({-href => href(action=>"object", hash=>$1),
+ -class => "text"}, $1);
+ }eg;
+
Almost correct... but for this unnecessary 'return' statement.
Without it: ACK.
I've applied this directly on 'master' without the return from inside s///e
with your Ack. Please check the result.
I did quick test by installing newest gitweb (with above commit applied),
doing in gitweb searching commit message for '[a-f0-9]{8,40}' with regexp
search; everything looks all right. But I didn't do extensive tests.
--
Jakub Narebski
Poland
Marcel M. Cary
2009-02-24 15:38:44 UTC
Permalink
Thanks for the two patch tweaks.
I've been using "git send-email" for patches, and have Thunderbird as my
MUA otherwise. (I'd use (al)pine if I could make it work with
Exchange/NTLM at work, but that's another story...) I've been
transfering recipients (--to and --cc) from Thunderbird to the
commandline with copy/paste.

In Thurderbird, copying an email address from a message only gets you
the ***@domain part, not the "Full Name" <***@domain>. To get the
"Full Name" <***@domain> I would have to View Message Source and
pickout the CC line, which is marginally harder.

And on the commandline, instead of just pasting an email as a shell
word, I'd have to add single quotes (I think) to keep the whole "Full
Name" <***@domain> as one word and quote the shell meta characters.

Neither piece is all that onerous, I guess. Sounds like you would see
some value in the full names. Maybe I'll try including them on my next
patch. Looks like --cc-cmd or sendemail.aliasesfile might make it
easier, but I'd have to set them up.

Marcel
Jakub Narebski
2009-02-24 15:58:51 UTC
Permalink
Post by Marcel M. Cary
Thanks for the two patch tweaks.
I've been using "git send-email" for patches, and have Thunderbird as my
MUA otherwise. (I'd use (al)pine if I could make it work with
Exchange/NTLM at work, but that's another story...) I've been
transfering recipients (--to and --cc) from Thunderbird to the
commandline with copy/paste.
[...]

Well, 'technical reasons with copy'n'paste' would be enough for me.

P.S. But '"***@suse.cz" <***@suse.cz>' looks very silly...
--
Jakub Narebski
Poland
Marcel M. Cary
2009-02-24 16:33:37 UTC
Permalink
Post by Jakub Narebski
Do I understand correctly that those patches are not related at all
semantically or textually, only in that you have them one after other
(and blob sha-1 in the index line reflects state after former), isn't
it?
Yes, I think they were independent. They are related only in that they
are small tweaks to gitweb and products of my bug hyperlinking patch
(not submitted). I submitted them as a series mainly to group them.
Are you suggesting that I reserve threading patches together for when
one builds upon the previous one?

Marcel
Giuseppe Bilotta
2009-02-18 07:41:54 UTC
Permalink
Post by Marcel M. Cary
When a feature like "blame" is permitted to be overridden in the
repository configuration but it is not actually set in the
repository, a warning is emitted due to the undefined value
of the repository configuration, even though it's a perfectly
normal condition.
The warning is grounds for test failure in the gitweb test script,
so it causes some new feature tests of mine to fail.
This patch prevents warning and adds a test case to exercise it.
---
Here's a small patch I put together while tinkering with bug hyperlinking.
Does this look reasonable?
@@ -1978,6 +1978,8 @@ sub git_get_project_config {
$config_file = "$git_dir/config";
}
+ return undef if (!defined $config{"gitweb.$key"});
+
I'm no Perl expert, so I have no idea: how do non-bool config checks
(which expect arrays) cope with an undef? Also, you may want to add a
non-bool override test in the test suite.
--
Giuseppe "Oblomov" Bilotta
Junio C Hamano
2009-02-18 08:40:27 UTC
Permalink
Post by Marcel M. Cary
When a feature like "blame" is permitted to be overridden in the
repository configuration but it is not actually set in the
repository, a warning is emitted due to the undefined value
of the repository configuration, even though it's a perfectly
normal condition.
The warning is grounds for test failure in the gitweb test script,
so it causes some new feature tests of mine to fail.
This patch prevents warning and adds a test case to exercise it.
---
Here's a small patch I put together while tinkering with bug hyperlinking.
Does this look reasonable?
To my cursory look, it doesn't, and it is not entirely your fault, but if
we applied this patch, it would not improve things very much. It just
would shift the same problem around.
Post by Marcel M. Cary
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 7c48181..653f0be 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -402,13 +402,13 @@ sub feature_bool {
my $key = shift;
my ($val) = git_get_project_config($key, '--bool');
- if ($val eq 'true') {
+ if (!defined $val) {
+ return ($_[0]);
+ } elsif ($val eq 'true') {
return (1);
} elsif ($val eq 'false') {
return (0);
}
I think the warning you are talking about is to compare $val with 'true'
with 'eq' operator when $val could be undef. The check to see if $val is
undefined ato avoid that 'eq' comparison is fine, and the intent to return
false is also good, but I think feature_bool is meant to say "yes" or
"no", and existing code for 'false' is returning (0). I'd rather see your
new codepath normalize incoming undef the same way string 'false' is
normalized to return (0). Granted, the caller should be prepared to take
the answer as boolean and treat the usual Perl false values (numeric zero,
a string with single "0", or an undef) without barfing, so returning (undef)
from here ought to be safe (otherwise the callers are broken), but I'd
rather see this function play safe.

But it certainly is not my main complaint.
Post by Marcel M. Cary
sub feature_snapshot {
@@ -1978,6 +1978,8 @@ sub git_get_project_config {
$config_file = "$git_dir/config";
}
+ return undef if (!defined $config{"gitweb.$key"});
+
I think this change is missing a lot of necessary fixes associated with
it. Have you actually audited all the callers of this function you are
modifying? For example, feature_bool does this:

sub feature_bool {
my $key = shift;
my ($val) = git_get_project_config($key, '--bool');

if ($val eq 'true') {
return (1);
} elsif ($val eq 'false') {
...

With your above change, I think a missing configuration variable will
stuff undef in $val, and trigger the same "$val eq 'true'" comparison
warning here.

Granted, without your change the existing code triggers the same error in
another way, by calling config_to_bool sub with undef here:

# ensure given type
if (!defined $type) {
return $config{"gitweb.$key"};
} elsif ($type eq 'bool') {
# backward compatibility: 'git config --bool' returns true/false
return config_to_bool($config{"gitweb.$key"}) ? 'true' : 'false';

and config_to_bool sub is written in the same carelessness like so:

sub config_to_bool {
my $val = shift;

# strip leading and trailing whitespace
$val =~ s/^\s+//;

and triggers the same error here in the s/// operation. I think the right
fix for this part would look like this:

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 7c48181..2b140cc 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1920,6 +1920,8 @@ sub git_parse_project_config {
sub config_to_bool {
my $val = shift;

+ return 1 if (!defined $val);
+
# strip leading and trailing whitespace
$val =~ s/^\s+//;
$val =~ s/\s+$//;

Because

[gitweb]
variable

parsed by git_parse_project_config('gitweb') should return a hash that
maps "gitweb.variable" to undef it must be fed as undef to
config_to_bool. This variable should be reported as "true".

There are tons of undef unsafeness in this file from a very cursory look.

Unrelated to any of these, I think the following is wrong:

sub feature_patches {
my @val = (git_get_project_config('patches', '--int'));

if (@val) {
return @val;
}

return ($_[0]);
}

As git_get_project_config() always returns something, hence "if (@val)"
can never be false.
Jakub Narebski
2009-02-18 13:09:41 UTC
Permalink
Fixed up patch at the bottom.
Post by Junio C Hamano
Post by Marcel M. Cary
When a feature like "blame" is permitted to be overridden in the
repository configuration but it is not actually set in the
repository, a warning is emitted due to the undefined value
of the repository configuration, even though it's a perfectly
normal condition.
The warning is grounds for test failure in the gitweb test script,
so it causes some new feature tests of mine to fail.
This patch prevents warning and adds a test case to exercise it.
---
Here's a small patch I put together while tinkering with bug hyperlinking.
Does this look reasonable?
Somewhat.

Corrected patch at the bottom.
Post by Junio C Hamano
To my cursory look, it doesn't, and it is not entirely your fault, but if
we applied this patch, it would not improve things very much. It just
would shift the same problem around.
Post by Marcel M. Cary
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 7c48181..653f0be 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -402,13 +402,13 @@ sub feature_bool {
my $key = shift;
my ($val) = git_get_project_config($key, '--bool');
- if ($val eq 'true') {
+ if (!defined $val) {
+ return ($_[0]);
+ } elsif ($val eq 'true') {
return (1);
} elsif ($val eq 'false') {
return (0);
}
I think the warning you are talking about is to compare $val with 'true'
with 'eq' operator when $val could be undef. The check to see if $val is
undefined to avoid that 'eq' comparison is fine, and the intent to return
false is also good, but I think feature_bool is meant to say "yes" or
"no", and existing code for 'false' is returning (0). I'd rather see your
new codepath normalize incoming undef the same way string 'false' is
normalized to return (0).
Actually git_get_project_config($key, '--bool') can return only three
values:
* 'true' if gitweb.$key config variable is 'true', 'yes', 1, or (after
fixes in the fixed up patch at the bottom) is undefined, i.e.
[gitweb]
blame
case
* 'false' if gitweb.$key exists and has other value (that includes
empty string value: "[section] val" is git-config --bool true, while
"[section] val = " is --bool false).
* undef if gitweb.$key does not exist in the config;
earlier version which used "git config --bool <variable>" returned
empty string ('') here.

We want to fall back to %feature value (i.e. do not override) if
variable is not set in config. Variable not set was '', and now is undef,
therefore need for this (correct) change.
Post by Junio C Hamano
Granted, the caller should be prepared to take
the answer as boolean and treat the usual Perl false values (numeric zero,
a string with single "0", or an undef) without barfing, so returning (undef)
from here ought to be safe (otherwise the callers are broken), but I'd
rather see this function play safe.
But it certainly is not my main complaint.
Well, I think we can now get rid of backwards compatibility (which is
not complete anyway: '' for not existent variable for old version, undef
for new version) with the old version which ran git-config once for each
variable, and do not go through 'true'/'false' to imitate calling
"git config --bool", which has to be converted back to Perl boolean
anyway.
Post by Junio C Hamano
Post by Marcel M. Cary
sub feature_snapshot {
@@ -1978,6 +1978,8 @@ sub git_get_project_config {
$config_file = "$git_dir/config";
}
+ return undef if (!defined $config{"gitweb.$key"});
+
It should be !exists, and not !defined here, see my fixed up patch
below.
Post by Junio C Hamano
I think this change is missing a lot of necessary fixes associated with
it. Have you actually audited all the callers of this function you are
sub feature_bool {
my $key = shift;
my ($val) = git_get_project_config($key, '--bool');
if ($val eq 'true') {
return (1);
} elsif ($val eq 'false') {
...
With your above change, I think a missing configuration variable will
stuff undef in $val, and trigger the same "$val eq 'true'" comparison
warning here.
Post by Marcel M. Cary
@@ -402,13 +402,13 @@ sub feature_bool {
my $key = shift;
my ($val) = git_get_project_config($key, '--bool');
- if ($val eq 'true') {
+ if (!defined $val) {
+ return ($_[0]);
+ } elsif ($val eq 'true') {
return (1);
} elsif ($val eq 'false') {
return (0);
}
Granted, without your change the existing code triggers the same error in
# ensure given type
if (!defined $type) {
return $config{"gitweb.$key"};
} elsif ($type eq 'bool') {
# backward compatibility: 'git config --bool' returns true/false
return config_to_bool($config{"gitweb.$key"}) ? 'true' : 'false';
sub config_to_bool {
my $val = shift;
# strip leading and trailing whitespace
$val =~ s/^\s+//;
and triggers the same error here in the s/// operation. I think the right
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 7c48181..2b140cc 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1920,6 +1920,8 @@ sub git_parse_project_config {
sub config_to_bool {
my $val = shift;
+ return 1 if (!defined $val);
+
# strip leading and trailing whitespace
$val =~ s/^\s+//;
$val =~ s/\s+$//;
Because
[gitweb]
variable
parsed by git_parse_project_config('gitweb') should return a hash that
maps "gitweb.variable" to undef it must be fed as undef to
config_to_bool. This variable should be reported as "true".
Right (but not complete).

We do check !defined $val, but too late: _after_ trying to strip leading
and trailing whitespace. When going through various versions of
config_to_bool I have somehow forgot about this issue...
Post by Junio C Hamano
There are tons of undef unsafeness in this file from a very cursory look.
sub feature_patches {
}
return ($_[0]);
}
can never be false.
Actually after the patch below I think that git_get_project_config
returns empty list () in the list (array) context now if variable
does not exist in the config thanks to "return ;" magic. And empty
list evaluates to false in scalar context: "if (@val)" would be false
if variable does not exist in the config... in which case we would not
override 'default' value in %feature.

Alternate solution would be to use

sub feature_patches {
my $val = git_get_project_config('patches', '--int');

if (defined $val) {
return ($val);
}

return ($_[0]);
}



-- >8 --
From: Marcel M. Cary <***@oak.homeunix.org>
Subject: [PATCH] gitweb: Fix warnings with override permitted but no repo override

When a feature like "blame" is permitted to be overridden in the
repository configuration but it is not actually set in the repository,
a warning is emitted due to the undefined value of the repository
configuration, even though it's a perfectly normal condition.
Emitting warning is grounds for test failure in the gitweb test
script.

This error was caused by rewrite of git_get_project_config from using
"git config [<type>] <name>" for each individual configuration
variable checked to parsing "git config --list --null" output in
commit b201927 (gitweb: Read repo config using 'git config -z -l').
Earlier version of git_get_project_config was returning empty string
if variable do not exist in config; newer version is meant to return
undef in this case, therefore change in feature_bool was needed.

Additionally config_to_* subroutines were meant to be invoked only if
configuration variable exists; therefore we added early return to
git_get_project_config: it now returns no value if variable does not
exists in config. Otherwise config_to_* subroutines (config_to_bool
in paryicular) wouldn't be able to distinguish between the case where
variable does not exist and the case where variable doesn't have value
(the "[section] noval" case, which evaluates to true for boolean).

While at it fix bug in config_to_bool, where checking if $val is
defined (if config variable has value) was done _after_ stripping
leading and trailing whitespace, which lead to 'Use of uninitialized
value' warning.

Add test case for features overridable but not overriden in repo
config, and case for no value boolean configuration variable.

Signed-off-by: Marcel M. Cary <***@oak.homeunix.org>
Signed-off-by: Jakub Narebski <***@gmail.com>
---
gitweb/gitweb.perl | 16 ++++++++++------
t/t9500-gitweb-standalone-no-errors.sh | 18 +++++++++++++++++-
2 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 7c48181..83858fb 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -402,13 +402,13 @@ sub feature_bool {
my $key = shift;
my ($val) = git_get_project_config($key, '--bool');

- if ($val eq 'true') {
+ if (!defined $val) {
+ return ($_[0]);
+ } elsif ($val eq 'true') {
return (1);
} elsif ($val eq 'false') {
return (0);
}
-
- return ($_[0]);
}

sub feature_snapshot {
@@ -1914,18 +1914,19 @@ sub git_parse_project_config {
return %config;
}

-# convert config value to boolean, 'true' or 'false'
+# convert config value to boolean: 'true' or 'false'
# no value, number > 0, 'true' and 'yes' values are true
# rest of values are treated as false (never as error)
sub config_to_bool {
my $val = shift;

+ return 1 if !defined $val; # section.key
+
# strip leading and trailing whitespace
$val =~ s/^\s+//;
$val =~ s/\s+$//;

- return (!defined $val || # section.key
- ($val =~ /^\d+$/ && $val) || # section.key = 1
+ return (($val =~ /^\d+$/ && $val) || # section.key = 1
($val =~ /^(?:true|yes)$/i)); # section.key = true
}

@@ -1978,6 +1979,9 @@ sub git_get_project_config {
$config_file = "$git_dir/config";
}

+ # check if config variable (key) exists
+ return unless exists $config{"gitweb.$key"};
+
# ensure given type
if (!defined $type) {
return $config{"gitweb.$key"};
diff --git a/t/t9500-gitweb-standalone-no-errors.sh b/t/t9500-gitweb-standalone-no-errors.sh
index 7c6f70b..6ed10d0 100755
--- a/t/t9500-gitweb-standalone-no-errors.sh
+++ b/t/t9500-gitweb-standalone-no-errors.sh
@@ -662,6 +662,11 @@ cat >>gitweb_config.perl <<EOF
EOF

test_expect_success \
+ 'config override: tree view, features not overridden in repo config' \
+ 'gitweb_run "p=.git;a=tree"'
+test_debug 'cat gitweb.log'
+
+test_expect_success \
'config override: tree view, features disabled in repo config' \
'git config gitweb.blame no &&
git config gitweb.snapshot none &&
@@ -669,12 +674,23 @@ test_expect_success \
test_debug 'cat gitweb.log'

test_expect_success \
- 'config override: tree view, features enabled in repo config' \
+ 'config override: tree view, features enabled in repo config (1)' \
'git config gitweb.blame yes &&
git config gitweb.snapshot "zip,tgz, tbz2" &&
gitweb_run "p=.git;a=tree"'
test_debug 'cat gitweb.log'

+cat >.git/config <<\EOF
+# testing noval and alternate separator
+[gitweb]
+ blame
+ snapshot = zip tgz
+EOF
+test_expect_success \
+ 'config override: tree view, features enabled in repo config (2)' \
+ 'gitweb_run "p=.git;a=tree"'
+test_debug 'cat gitweb.log'
+
# ----------------------------------------------------------------------
# non-ASCII in README.html
--
1.6.1
Junio C Hamano
2009-02-18 19:02:42 UTC
Permalink
Post by Jakub Narebski
Fixed up patch at the bottom.
Thanks.
Jakub Narebski
2009-02-18 03:38:53 UTC
Permalink
Post by Marcel M. Cary
I'm interested in cross-linking bug references in commit messages to a
bug tracking system. I started tinkering a couple weeks ago and am
finally understanding that committags encompass this functionality.
(From the subject line I first understood "tags" to mean git tags rather
than commit message munging.)
What would you name this feature, then?
Post by Marcel M. Cary
Is the committags idea still under active development?
Well, it is in my todo list, rather further on...

[...]
Post by Marcel M. Cary
Two regexes would make it easier to configure a driver without needing
look-ahead and look-behind assertions. For example, if you want to
match non-negative integers but only in the context of a Resolves-bug
Resolves-bug: 1234, 1235
[...]
Post by Marcel M. Cary
I got the two-regex idea from a spec I ran across while evaluating
You don't need multiple regexps for that, and in above example it is
used _single_ regexp; only with more than one catching group.
Post by Marcel M. Cary
I like the idea of allowing a regex writer -- a gitweb admin or a
repository owner -- to ignore issues regarding HTML escaping. For
example, I'd rather not have &nbsp; in the regex. And I don't want the
replacement to have to escape "&" in a query string. That's a strength
of not having to write the whole link replacement rule. And I think
hyperlinking will be one of the most common uses of this committag
feature, so it's worth special support.
[...]
Post by Marcel M. Cary
I'm concerned about the composition of these committag drivers. In
other words, will it be hard for the configurer to manage interactions
between committag drivers? To choose a sane order, will I have to
understand the implementation details of each committag driver?
In current proposal the order of running committags drivers is
specified in configuration...
Post by Marcel M. Cary
Perhaps a simpler alternative would be to let at most one driver process
a given snippet of text, forbidding nesting of replacements. (If I
understand Junio's suggestion to use a list of strings and refs,
non-nesting overlaps are already not supported.) If all replacements
were hyperlinks -- and I expect that to be the common case -- they
wouldn't be nestable anyway. I wouldn't see it as a huge loss for the
nesting examples I can think of: Separate rules for span around S-o-b
and linking or obfuscation of email could be combined into one... A
rule to shade text quoted email-style with leading angle brackets could
just clobber any further processing of that text. And it might simplify
the code and testing of it quite a bit.
... but I guess that at first attempt we could support non-overlapping
committags only, i.e. replacement is always as whole not passed to
later committags.

Still there is a problem how to specify which parts of replacement for
committags have to be HTML escaped, and which are HTML and should not
be (and which are attributes, and have to be escaped too).

[...]
Post by Marcel M. Cary
* Wiki page names, like to [[Feature Documentation]]. These are notable
because they tend to contain punctuation that get HTML-escaped, like
quotes and ampersands.
Well, I think if it would be supported, it would be a very special
case, so I don't think generic support for this is needed nor required.
Post by Marcel M. Cary
* Links to gitweb itself, such as 123abc:file.txt and HEAD:file.txt. I
guess the current hash linking sort of does the first case except that
you have to get the hash of the blob instead of using the commit hash,
and the current hash linking wouldn't reveal the filename until after
you click, nor when viewing textual log messages. I'm not sure whether
special support for linking to multi-commit diffs or other object types
would be as helpful.
Also 'v1.5.4' etc linking to tag; both would be a good idea. At this
point I think we have already list of all references (for ref markers)
so it wouldn't require additional call to git command.


P.S. I understand that this post is an exception (send after long, long
time), but please do not toppost in replies. It goes against natural
reading order.
--
Jakub Narebski
Poland
Marcel M. Cary
2009-02-19 17:08:05 UTC
Permalink
Post by Jakub Narebski
Post by Marcel M. Cary
I'm interested in cross-linking bug references in commit messages to
a bug tracking system. I started tinkering a couple weeks ago and am
finally understanding that committags encompass this functionality.
(From the subject line I first understood "tags" to mean git tags
rather than commit message munging.)
What would you name this feature, then?
Heh, I'm not sure. It's like a filter in the unix pipeline sense, but
"commit message filter" sounds to me like some messages might be
rejected. Most of the drivers markup static text with HTML tags, but
not all of them. Maybe "commit message embellishment".

Perhaps a more important question is: will people find the feature once
it's implemented? I think that won't be a problem provided that it's
listed in the gitweb docs like the other configs.
Post by Jakub Narebski
Post by Marcel M. Cary
Is the committags idea still under active development?
Well, it is in my todo list, rather further on...
Is any code for it published in a repository anywhere? I see a branch
jn/gitweb-committag merged into master that looks relevant, but it only
has the sha1 regex improvement.
Post by Jakub Narebski
Post by Marcel M. Cary
Two regexes would make it easier to configure a driver without
needing look-ahead and look-behind assertions. For example, if you
want to match non-negative integers but only in the context of a
Resolves-bug: 1234, 1235
[...]
Post by Marcel M. Cary
I got the two-regex idea from a spec I ran across while evaluating
You don't need multiple regexps for that, and in above example it is
used _single_ regexp; only with more than one catching group.
I'm not sure what exactly you propose. In the second example in the
bugtraq spec, there are two regexes. Maybe you mean something like
this, but it breaks with three bugs:

$ perl -MData::Dumper -wne '
m/^Resolves-bug: (\d+)(?:, (\d+))*/;
print Dumper([$1, $2, $3, $4]);
'
Resolves-bug: 123
$VAR1 = [
'123',
undef,
undef,
undef
];
Resolves-bug: 123, 124
$VAR1 = [
'123',
'124',
undef,
undef
];
Resolves-bug: 123, 124, 125
$VAR1 = [
'123',
'125',
undef,
undef
];

Maybe something like this? But it's limited to an arbitrary number of
bug matches. Maybe it's good enough for pratical purposes, but it's
prone to unexpected breakage when the user exceeds the threshold of, in
this case, four bugs.

/^Resolves-bug: (\d+)(?:, (\d+))?(?:, (\d+))?(?:, (\d+))?/


Marcel
Marcel M. Cary
2009-06-19 14:13:50 UTC
Permalink
I want gitweb to hyperlink commits to my bug tracking system so that
information regarding the current status of a commit can be easily
cross-referenced. For example, the QA and release status of a commit
cannot be inserted into the comment. Maybe someday a "git notes"
feature will help with this, but for now, my organization has a
separate bug tracking system. Other repository browsers such as
unfuddle and websvn support similar features.

Since the bug hyperlinking feature was previously discussed as part of
"committags," a more general mechanism to embellish commit messages,
implement the more general mechanism instead, including the following
capabilities:

* Hyperlinking mentions of bug IDs to Bugzilla
* Hyperlinking URLs
* Hyperlinking Message-Ids to a mailing list archive
* Hyperlinking commit hashes as before by default, now with a
configurable regex
* Defining new committags per gitweb installation

Since different repositories may use different bug tracking systems or
mailing list archives, the URL parameter may be configured
per-repository without reiterating the regexes. To accomodate
different conventions, regexes may also be configured per-project.

This patch is heavily based on discussions and code samples from the
Git list:

[RFC/PATCH] gitweb: Add committags support, Sep 2006
http://thread.gmane.org/gmane.comp.version-control.git/27504

[RFC] gitweb: Add committags support (take 2), Dec 2006
http://thread.gmane.org/gmane.comp.version-control.git/33150

[RFC] Configuring (future) committags support in gitweb, Nov 2008
http://thread.gmane.org/gmane.comp.version-control.git/100415

Some issues I considered but punted:

* Should this configuration try to follow the bugtraq spec?

As far as I know, only subversion implements it. Separation of
regexes by a newline would be a little awkward in the git config.
And it is broader than just hyperlinking bugs: it also encompasses
GUI bug ID form fields. So gitweb would only implement a subset.
The gitweb configuration mechanism currently only reads
keys starting with "gitweb.", but these parameters would be more
broadly applicable, potentially to git-gui, for example.

However, it *would* be useful for Git tools to standardize on
config keys and interpretations of regexes and url formats. For
example, git-gui might be able to hyperlink the same text as gitweb,
and even show a separate bugID field when composing a commit
message.

* I would prefer the regex match against the whole commit message.

This would allow the regex to insist that a bug reference occur
on the first line or non-first line of the commit message. However,
even if we concatenated the log lines for the first committag,
subsequent committags would see the text broken up.

Also, it would allow the regex to match a phrase split across a
line boundary, as dicussed at some length in the first thread,
but again, only if no prior committags had interfered.

This could happen in a later patch.

* I would prefer the site admin have a way to let a repository
owner define new committags, which means having a way to specify
the 'sub' key from the repo config or having a flexible default.

The bugtraq and some of the regex questions must be decided now to
avoid breaking gitweb configs later.

Signed-off-by: Marcel M. Cary <***@oak.homeunix.org>
---
gitweb/INSTALL | 4 +
gitweb/gitweb.perl | 221 +++++++++++++++++++++++++++++++-
t/t9500-gitweb-standalone-no-errors.sh | 150 +++++++++++++++++++++-
3 files changed, 367 insertions(+), 8 deletions(-)

diff --git a/gitweb/INSTALL b/gitweb/INSTALL
index 18c9ce3..223e39e 100644
--- a/gitweb/INSTALL
+++ b/gitweb/INSTALL
@@ -123,6 +123,10 @@ GITWEB_CONFIG file:
$feature{'snapshot'}{'default'} = ['zip', 'tgz'];
$feature{'snapshot'}{'override'} = 1;

+ $feature{'committags'}{'default'} = ['sha1', 'url', 'bugzilla'];
+ $feature{'committags'}{'override'} = 1;
+
+

Gitweb repositories
-------------------
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 1e7e2d8..c66fdf3 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -195,6 +195,81 @@ our %known_snapshot_format_aliases = (
'x-zip' => undef, '' => undef,
);

+# Could call these something else besides committags... embellishments,
+# patterns, rewrite rules, ?
+#
+# In general, the site admin can enable/disable per-project configuration
+# of each committag. Only the 'options' part of the committag is configurable
+# per-project.
+#
+# The site admin can of course add new tags to this hash or override the
+# 'sub' key if necessary. But such changes may be fragile; this is not
+# designed as a full-blown plugin architecture.
+our %committags = (
+ # Link Git-style hashes to this gitweb
+ 'sha1' => {
+ 'options' => {
+ 'pattern' => qr/\b([0-9a-fA-F]{8,40})\b/,
+ },
+ 'override' => 0,
+ 'sub' => sub {
+ my ($opts, @match) = @_;
+ \$cgi->a({-href => href(action=>"object", hash=>$match[1]),
+ -class => "text"}, esc_html($match[0], -nbsp=>1));
+ },
+ },
+ # Link bug/features to Mantis bug tracker using Mantis-style contextual cues
+ 'mantis' => {
+ 'options' => {
+ 'pattern' => qr/(?:BUG|FEATURE)\((\d+)\)/,
+ 'url' => 'http://bugs.xmms2.xmms.se/view.php?id=',
+ },
+ 'override' => 0,
+ 'sub' => \&hyperlink_committag,
+ },
+ # Link mentions of bug IDs to bugzilla
+ 'bugzilla' => {
+ 'options' => {
+ 'pattern' => qr/bug\s+(\d+)/,
+ 'url' => 'http://bugzilla.kernel.org/show_bug.cgi?id=',
+ },
+ 'override' => 0,
+ 'sub' => \&hyperlink_committag,
+ },
+ # Link URLs
+ 'url' => {
+ 'options' => {
+ # Avoid matching punctuation that might immediately follow
+ # a url, is not part of the url, and is allowed in urls,
+ # like a full-stop ('.').
+ 'pattern' => qr!(http|ftp)s?://[-_a-zA-Z0-9\@/&=+~#<>;%:.?]+
+ [-_a-zA-Z0-9\@/&=+~#<>]!x,
+ },
+ 'override' => 0,
+ 'sub' => sub {
+ my ($opts, @match) = @_;
+ return
+ \$cgi->a({-href => $match[0],
+ -class => "text"},
+ esc_html($match[0], -nbsp=>1));
+ },
+ },
+ # Link Message-Id to mailing list archive
+ 'messageid' => {
+ 'options' => {
+ # The original pattern, which I don't really understand
+ #'pattern' => qr!(?:message|msg)-id:?\s+<([^>]+)>;!i,
+ 'pattern' => qr!(?:message|msg)-?id:?\s+(<[^>]+>)!i,
+ 'url' => 'http://news.gmane.org/find-root.php?message_id=',
+ },
+ 'override' => 0,
+ # The original version didn't include the "msg-id" text in the
+ # link text, but this does. In general, I think a little more
+ # context makes for better link text.
+ 'sub' => \&hyperlink_committag,
+ },
+);
+
# You define site-wide feature defaults here; override them with
# $GITWEB_CONFIG as necessary.
our %feature = (
@@ -365,6 +440,21 @@ our %feature = (
'sub' => \&feature_patches,
'override' => 0,
'default' => [16]},
+
+ # The selection and ordering of committags that are enabled.
+ # Committag transformations will be applied to commit log messages
+ # in this order if listed here.
+
+ # To disable system wide have in $GITWEB_CONFIG
+ # $feature{'committags'}{'default'} = [];
+ # To have project specific config enable override in $GITWEB_CONFIG
+ # $feature{'committags'}{'override'} = 1;
+ # and in project config gitweb.committags = sha1, url, bugzilla
+ # to enable those three committags for that project
+ 'committags' => {
+ 'sub' => \&feature_committags,
+ 'override' => 0,
+ 'default' => ['sha1']},
);

sub gitweb_get_feature {
@@ -433,6 +523,18 @@ sub feature_patches {
return ($_[0]);
}

+sub feature_committags {
+ my (@defaults) = @_;
+
+ my ($cfg) = git_get_project_config('committags');
+
+ if ($cfg) {
+ return ($cfg eq 'none' ? () : split(/\s*[,\s]\s*/, $cfg));
+ }
+
+ return @defaults;
+}
+
# checking HEAD file with -e is fragile if the repository was
# initialized long time ago (i.e. symlink HEAD) and was pack-ref'ed
# and then pruned.
@@ -814,6 +916,34 @@ $git_dir = "$projectroot/$project" if $project;
our @snapshot_fmts = gitweb_get_feature('snapshot');
@snapshot_fmts = filter_snapshot_fmts(@snapshot_fmts);

+# ordering of committags
+our @committags = gitweb_get_feature('committags');
+
+# Merge project configs with default committag definitions
+gitweb_load_project_committags();
+
+# Load committag configs from the repository config file and and
+# incorporate them into the gitweb defaults where permitted by the
+# site administrator.
+sub gitweb_load_project_committags {
+ return if (!$git_dir);
+ my %project_config = ();
+ my %raw_config = git_parse_project_config('gitweb\.committag');
+ foreach my $key (keys(%raw_config)) {
+ next if ($key !~ /gitweb\.committag\.[^.]+\.[^.]/);
+ my ($gitweb_prefix, $committag_prefix, $ctname, $option) =
+ split(/\./, $key, 4);
+ $project_config{$ctname}{$option} = $raw_config{$key};
+ }
+ foreach my $ctname (keys(%committags)) {
+ next if (!$committags{$ctname}{'override'});
+ foreach my $optname (keys %{$project_config{$ctname}}) {
+ $committags{$ctname}{'options'}{$optname} =
+ $project_config{$ctname}{$optname};
+ }
+ }
+}
+
# dispatch
if (!defined $action) {
if (defined $hash) {
@@ -1384,13 +1514,92 @@ sub file_type_long {
sub format_log_line_html {
my $line = shift;

- $line = esc_html($line, -nbsp=>1);
- $line =~ s{\b([0-9a-fA-F]{8,40})\b}{
- $cgi->a({-href => href(action=>"object", hash=>$1),
- -class => "text"}, $1);
- }eg;
+ # In this list of log message fragments, a string ref indicates HTML,
+ # and a string indicates plain text
+ my @list = ( $line );

- return $line;
+COMMITTAG:
+ foreach my $ctname (@committags) {
+ next COMMITTAG unless exists $committags{$ctname};
+ my $committag = $committags{$ctname};
+
+ next COMMITTAG unless exists $committag->{'options'};
+ my $opts = $committag->{'options'};
+
+ next COMMITTAG unless exists $opts->{'pattern'};
+ my $pattern = $opts->{'pattern'};
+
+ my @newlist = ();
+
+ PART:
+ foreach my $part (@list) {
+ next PART if $part eq "";
+ if (ref($part)) {
+ push @newlist, $part;
+ next PART;
+ }
+
+ my $oldpos = 0;
+
+ MATCH:
+ while ($part =~ m/$pattern/gc) {
+ my ($prepos, $postpos) = ($-[0], $+[0]);
+ my $repl = $committag->{'sub'}->($opts, $&, $1);
+ $repl = "" if (!defined $repl);
+
+ my $pre = substr($part, $oldpos, $prepos - $oldpos);
+ push_or_append(\@newlist, $pre);
+ push_or_append(\@newlist, $repl);
+
+ $oldpos = $postpos;
+ } # end while [regexp matches]
+
+ my $rest = substr($part, $oldpos);
+ push_or_append(\@newlist, $rest);
+
+ } # end foreach (@list)
+
+ @list = @newlist;
+ } # end foreach (@committags)
+
+ # Escape any remaining plain text and concatenate
+ my $html = '';
+ for my $part (@list) {
+ if (ref($part)) {
+ $html .= $$part;
+ } else {
+ $html .= esc_html($part, -nbsp=>1);
+ }
+ }
+
+ return $html;
+}
+
+# Returns a ref to an HTML snippet that links the second
+# parameter to a URL formed from the first and last parameters.
+# This is a helper function used in %committags.
+sub hyperlink_committag {
+ my ($opts, @match) = @_;
+ return
+ \$cgi->a({-href => $opts->{url} . CGI::escape($match[1]),
+ -class => "text"},
+ esc_html($match[0], -nbsp=>1));
+}
+
+
+sub push_or_append (\@@) {
+ my $list = shift;
+
+ if (ref $_[0] || ! @$list || ref $list->[-1]) {
+ push @$list, @_;
+ } else {
+ my $a = pop @$list;
+ my $b = shift @_;
+
+ push @$list, $a . $b, @_;
+ }
+ # imitate push
+ return scalar @$list;
}

# format marker of refs pointing to given object
diff --git a/t/t9500-gitweb-standalone-no-errors.sh b/t/t9500-gitweb-standalone-no-errors.sh
index d539619..37a127c 100755
--- a/t/t9500-gitweb-standalone-no-errors.sh
+++ b/t/t9500-gitweb-standalone-no-errors.sh
@@ -55,9 +55,9 @@ gitweb_run () {
# some of git commands write to STDERR on error, but this is not
# written to web server logs, so we are not interested in that:
# we are interested only in properly formatted errors/warnings
- rm -f gitweb.log &&
+ rm -f resp.http gitweb.log &&
perl -- "$SCRIPT_NAME" \
- >/dev/null 2>gitweb.log &&
+ > resp.http 2>gitweb.log &&
if grep "^[[]" gitweb.log >/dev/null 2>&1; then false; else true; fi

# gitweb.log is left for debugging
@@ -702,4 +702,150 @@ test_expect_success \
gitweb_run "p=.git;a=summary"'
test_debug 'cat gitweb.log'

+# ----------------------------------------------------------------------
+# sha1 linking
+#
+echo hi > file.txt
+git add file.txt
+git commit -q -F - file.txt <<END
+Summary
+
+See also commit 567890ab
+END
+test_expect_success 'sha1 link: enabled by default' '
+ h=$(git rev-parse --verify HEAD) &&
+ gitweb_run "p=.git;a=commit;h=$h" &&
+ grep -q \
+ "commit&nbsp;<a class=\"text\" href=\".*\">567890ab</a>" \
+ resp.http
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 567890ab resp.http'
+
+# ----------------------------------------------------------------------
+# bugzilla commit tag
+#
+
+echo foo > file.txt
+git add file.txt
+git commit -q -F - file.txt <<END
+Fix foo
+
+Fixes bug 1234 involving foo.
+END
+git config gitweb.committags 'sha1, bugzilla'
+test_expect_success 'bugzilla: enabled but not permitted' '
+ h=$(git rev-parse --verify HEAD) &&
+ gitweb_run "p=.git;a=commit;h=$h" &&
+ grep -F -q \
+ "Fixes&nbsp;bug&nbsp;1234&nbsp;involving" \
+ resp.http
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 1234 resp.http'
+
+echo '$feature{"committags"}{"override"} = 1;' >> gitweb_config.perl
+test_expect_success 'bugzilla: enabled' '
+ h=$(git rev-parse --verify HEAD) &&
+ gitweb_run "p=.git;a=commit;h=$h" &&
+ grep -F -q \
+ "Fixes&nbsp;<a class=\"text\" href=\"http://bugzilla.kernel.org/show_bug.cgi?id=1234\">bug&nbsp;1234</a>&nbsp;involving" \
+ resp.http
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 1234 resp.http'
+
+git config gitweb.committag.bugzilla.url 'http://bts.example.com?bug='
+test_expect_success 'bugzilla: url overridden but not permitted' '
+ h=$(git rev-parse --verify HEAD) &&
+ gitweb_run "p=.git;a=commit;h=$h" &&
+ grep -F -q \
+ "Fixes&nbsp;<a class=\"text\" href=\"http://bugzilla.kernel.org/show_bug.cgi?id=1234\">bug&nbsp;1234</a>&nbsp;involving" \
+ resp.http
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 1234 resp.http'
+
+echo '$committags{"bugzilla"}{"override"} = 1;' >> gitweb_config.perl
+test_expect_success 'bugzilla: url overridden' '
+ h=$(git rev-parse --verify HEAD) &&
+ gitweb_run "p=.git;a=commit;h=$h" &&
+ grep -F -q \
+ "Fixes&nbsp;<a class=\"text\" href=\"http://bts.example.com?bug=1234\">bug&nbsp;1234</a>&nbsp;involving" \
+ resp.http
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 1234 resp.http'
+
+git config gitweb.committag.bugzilla.pattern 'Fixes bug (\d+)'
+test_expect_success 'bugzilla: pattern overridden' '
+ h=$(git rev-parse --verify HEAD) &&
+ gitweb_run "p=.git;a=commit;h=$h" &&
+ grep -F -q \
+ "<a class=\"text\" href=\"http://bts.example.com?bug=1234\">Fixes&nbsp;bug&nbsp;1234</a>&nbsp;involving" \
+ resp.http
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 1234 resp.http'
+git config --unset gitweb.committag.bugzilla.pattern
+
+test_expect_success 'bugzilla: affects log view too' '
+ gitweb_run "p=.git;a=log" &&
+ grep -F -q \
+ "<a class=\"text\" href=\"http://bts.example.com?bug=1234\">bug&nbsp;1234</a>" \
+ resp.http
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 1234 resp.http'
+
+# ----------------------------------------------------------------------
+# url linking
+#
+echo url_test > file.txt
+git add file.txt
+url='http://***@pass:example.com/foo.html?u=v&x=y#z'
+url_esc="$(echo "$url" | sed 's/&/&amp;/g')"
+git commit -q -F - file.txt <<END
+Summary
+
+See also $url.
+END
+echo '$feature{"committags"}{"override"} = 1;' >> gitweb_config.perl
+git config gitweb.committags 'sha1, url'
+test_expect_success 'url link: links when enabled' '
+ h=$(git rev-parse --verify HEAD) &&
+ gitweb_run "p=.git;a=commit;h=$h" &&
+ grep -q -F \
+ "See&nbsp;also&nbsp;<a class=\"text\" href=\"$url_esc\">$url_esc</a>." \
+ resp.http
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep -F "$url" resp.http'
+
+# ----------------------------------------------------------------------
+# message id linking
+#
+echo msgid_test > file.txt
+git add file.txt
+url='http://news.gmane.org/find-root.php?message_id='
+msgid='<***@y.z>'
+msgid_esc="$(echo "$msgid" | sed 's/</\&lt;/g; s/>/\&gt;/g')"
+msgid_url="$url$(echo "$msgid" | sed 's/</%3C/g; s/@/%40/g; s/>/%3E/g')"
+git commit -q -F - file.txt <<END
+Summary
+
+See msg-id $msgid.
+END
+echo '$feature{"committags"}{"override"} = 1;' >> gitweb_config.perl
+git config gitweb.committags 'sha1, messageid'
+test_expect_success 'msgid link: linked when enabled' '
+ h=$(git rev-parse --verify HEAD) &&
+ gitweb_run "p=.git;a=commit;h=$h" &&
+ grep -q -F \
+ "See&nbsp;<a class=\"text\" href=\"$msgid_url\">msg-id&nbsp;$msgid_esc</a>." \
+ resp.http
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep -F "y.z" resp.http'
+
test_done
--
1.6.2
Jakub Narebski
2009-06-22 11:18:18 UTC
Permalink
On Fri, 19 June 2009, Marcel M. Cary wrote:

Thanks for diligently working on this issue. Good work!

I see that it is an RFC, and not final submission, but just in case
I'd like to remind you that some of information below should go into
commit message, but some of it should I think go to comments (between
"---" and diffstat).
Post by Marcel M. Cary
I want gitweb to hyperlink commits to my bug tracking system so that
information regarding the current status of a commit can be easily
cross-referenced. For example, the QA and release status of a commit
cannot be inserted into the comment. Maybe someday a "git notes"
feature will help with this, but for now, my organization has a
separate bug tracking system. Other repository browsers such as
unfuddle and websvn support similar features.
The paragraph above should, I think, be made more clear. You don't
need to mention what you don't do; the comment about "git notes" should
be not in commit message but in comments section.

What you want to have is to have some markers in commit message
(committags) hyperlinked; namely you want notifications about bug/issue
numbers in the commit message hyperlinked to appropriate bugtracker/issue
tracker URL. Do I understand this correctly?

I tried here to reword what you said, to come up with better commit
message for a future final submission.
Post by Marcel M. Cary
Since the bug hyperlinking feature was previously discussed as part of
"committags," a more general mechanism to embellish commit messages,
implement the more general mechanism instead, including the following
Well, I think that the fact that it would be not much harder to create
general mechanism for commit message transformation, than to add
suitably generic and well customizable support for bugtracker
'committags'.
Post by Marcel M. Cary
* Hyperlinking mentions of bug IDs to Bugzilla
* Hyperlinking URLs
* Hyperlinking Message-Ids to a mailing list archive
* Hyperlinking commit hashes as before by default, now with a
configurable regex
* Defining new committags per gitweb installation
Well, there is one _implicit_ (but important) commit message
transformation (filter) for display, which has to be always present[1],
namely HTML escaping. We make use of the fact that you can do
HTML escaping before doing the only currently supported committag,
namely hyperlinking (shortened) SHA-1 to 'object' gitweb URL, but for
other committags like mentioned "Message-Id to mail archive" committag
(filter) it would make them more difficult.

Also one might consider vertical whitespace simplification (removing
leading empty lines, compacting empty lines to single empty line
between paragraphs), and syntax highlighting signoff lines to be
a kind of commit message filter like mentioned above committags
(see git_print_log() subroutine).

Although probably vertical whitespace simplification should be not
made into commit message filter, as it is used not for all views.
Post by Marcel M. Cary
Since different repositories may use different bug tracking systems or
mailing list archives, the URL parameter may be configured
per-repository without reiterating the regexes. To accomodate
different conventions, regexes may also be configured per-project.
Also list of supported committags is separated from the list of
committags used[1]; just like it is done for snapshot formats.
This could be mentioned in final commit message.

[1] well, sequence rather than list in this case, as here
ordering does matter a bit
Post by Marcel M. Cary
This patch is heavily based on discussions and code samples from the
[RFC/PATCH] gitweb: Add committags support, Sep 2006
http://thread.gmane.org/gmane.comp.version-control.git/27504
[RFC] gitweb: Add committags support (take 2), Dec 2006
http://thread.gmane.org/gmane.comp.version-control.git/33150
[RFC] Configuring (future) committags support in gitweb, Nov 2008
http://thread.gmane.org/gmane.comp.version-control.git/100415
Hmmm... should this be put in final commit message, or only in comment
to the patch (should this be in commit history of git repository)?
Post by Marcel M. Cary
* Should this configuration try to follow the bugtraq spec?
As far as I know, only subversion implements it. Separation of
regexes by a newline would be a little awkward in the git config.
And it is broader than just hyperlinking bugs: it also encompasses
GUI bug ID form fields. So gitweb would only implement a subset.
I didn't even know that there is such spec. Were you talking about
http://tortoisesvn.net/issuetracker_integration or do you have different
URL in mind?
Post by Marcel M. Cary
The gitweb configuration mechanism currently only reads
keys starting with "gitweb.", but these parameters would be more
broadly applicable, potentially to git-gui, for example.
Actually the fact that gitweb reads only keys in the 'gitweb' section
from config is just a convention. There were (are) no config variables
in other places (other sections) which would be of interest to gitweb.
Post by Marcel M. Cary
However, it *would* be useful for Git tools to standardize on
config keys and interpretations of regexes and url formats. For
example, git-gui might be able to hyperlink the same text as gitweb,
and even show a separate bugID field when composing a commit
message.
This is I think a very good idea... but I think idea which
implementation can be left for later.
Post by Marcel M. Cary
* I would prefer the regex match against the whole commit message.
This would allow the regex to insist that a bug reference occur
on the first line or non-first line of the commit message. However,
even if we concatenated the log lines for the first committag,
subsequent committags would see the text broken up.
Also, it would allow the regex to match a phrase split across a
line boundary, as dicussed at some length in the first thread,
but again, only if no prior committags had interfered.
This could happen in a later patch.
Well, the change should be fairly easy: just concatenate lines before
passing them as single element list to commit message filters. OTOH
you would have to take care of end of line characters in committags
regexps.
Post by Marcel M. Cary
* I would prefer the site admin have a way to let a repository
owner define new committags, which means having a way to specify
the 'sub' key from the repo config or having a flexible default.
Perhaps, following your earlier suggestion to make committags supported
also by other tools, gitweb (and e.g. git-gui / gitk) use config
variable committag.<name>.<key> (where <key> can be 'pattern' or 'url';
although I wonder if we can allow 'pattern' as malicious user can do
a DoS attack against gitweb / server using badly behaved regexp).
Post by Marcel M. Cary
The bugtraq and some of the regex questions must be decided now to
avoid breaking gitweb configs later.
True.
Post by Marcel M. Cary
---
gitweb/INSTALL | 4 +
gitweb/gitweb.perl | 221 +++++++++++++++++++++++++++++++-
t/t9500-gitweb-standalone-no-errors.sh | 150 +++++++++++++++++++++-
3 files changed, 367 insertions(+), 8 deletions(-)
diff --git a/gitweb/INSTALL b/gitweb/INSTALL
index 18c9ce3..223e39e 100644
--- a/gitweb/INSTALL
+++ b/gitweb/INSTALL
$feature{'snapshot'}{'default'} = ['zip', 'tgz'];
$feature{'snapshot'}{'override'} = 1;
+ $feature{'committags'}{'default'} = ['sha1', 'url', 'bugzilla'];
+ $feature{'committags'}{'override'} = 1;
+
+
Gitweb repositories
-------------------
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 1e7e2d8..c66fdf3 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -195,6 +195,81 @@ our %known_snapshot_format_aliases = (
'x-zip' => undef, '' => undef,
);
I understand that comments such as one below would be not present in
a final submission, and they are here to provide running commentary for
code, isn't it?
Post by Marcel M. Cary
+# Could call these something else besides committags... embellishments,
+# patterns, rewrite rules, ?
They are "commit filters", or "commit message filters" (or 'formatters',
or 'processors'; they are not 'parsers').

The name 'committag' was first introduced as far as I remember in xmms2
fork of gitweb (in old times when gitweb was separate project, and not
part of git repository).
Post by Marcel M. Cary
+#
+# In general, the site admin can enable/disable per-project configuration
+# of each committag. Only the 'options' part of the committag is configurable
+# per-project.
See above caveat about allowing to customize 'regexp'/'pattern' part
in untrusted environment; you can construct regexp which has exponential
behavior.
Post by Marcel M. Cary
+#
+# The site admin can of course add new tags to this hash or override the
+# 'sub' key if necessary. But such changes may be fragile; this is not
+# designed as a full-blown plugin architecture.
+our %committags = (
You should put the comments here about supported keys, similar to the
one for %known_snapshot_formats and %feature hashes.
Post by Marcel M. Cary
+ # Link Git-style hashes to this gitweb
+ 'sha1' => {
+ 'options' => {
+ 'pattern' => qr/\b([0-9a-fA-F]{8,40})\b/,
+ },
+ 'override' => 0,
Shouldn't 'override' key be better last?
Post by Marcel M. Cary
+ 'sub' => sub {
+ \$cgi->a({-href => href(action=>"object", hash=>$match[1]),
+ -class => "text"}, esc_html($match[0], -nbsp=>1));
+ },
Style: although there is commonly used idiom to use 'sub { <expr>; }'
for a wrapper subroutines (e.g. 'sub { [] }' in Moose examples), one
should use explicit "return" statement instead of relying on Perl
behavior of returning last statement in a block.

See Perl::Critic::Policy::Subroutines::RequireFinalReturn policy in
Perl::Critic (perlcritic.com). "Perl Best Practices" says:

Subroutines without explicit 'return' statements at their ends can be
confusing. It can be challenging to deduce what the return value will be.
Post by Marcel M. Cary
+ },
+ # Link bug/features to Mantis bug tracker using Mantis-style contextual cues
+ 'mantis' => {
+ 'options' => {
+ 'pattern' => qr/(?:BUG|FEATURE)\((\d+)\)/,
+ 'url' => 'http://bugs.xmms2.xmms.se/view.php?id=',
I don't think we want to put such URL here. Please check if Mantis
documentation uses some specific links, or follow RFC conventions and
use 'example.com' as hostname (e.g. 'bugs.example.com').

By the way the bugtraq proposal you mentioned uses placeholder in URL
for putting issue number (%BUGID%). Perhaps gitweb should do the same
here.
Post by Marcel M. Cary
+ },
+ 'override' => 0,
+ 'sub' => \&hyperlink_committag,
+ },
+ # Link mentions of bug IDs to bugzilla
+ 'bugzilla' => {
+ 'options' => {
+ 'pattern' => qr/bug\s+(\d+)/,
+ 'url' => 'http://bugzilla.kernel.org/show_bug.cgi?id=',
The same comment as above.
Post by Marcel M. Cary
+ },
+ 'override' => 0,
+ 'sub' => \&hyperlink_committag,
+ },
+ # Link URLs
+ 'url' => {
+ 'options' => {
+ # Avoid matching punctuation that might immediately follow
+ # a url, is not part of the url, and is allowed in urls,
+ # like a full-stop ('.').
If you took this regexp from some place (like blog), it would be good
to mention URL here, to be able to check more detailed explanation of
construction of this URL-catching regexp.

Should we also support irc://, nntp:// (pseudo)protocols? What about
git:// ?
Post by Marcel M. Cary
+ },
+ 'override' => 0,
+ 'sub' => sub {
+ return
+ \$cgi->a({-href => $match[0],
+ -class => "text"},
+ esc_html($match[0], -nbsp=>1));
+ },
Here you use explicit return.
Post by Marcel M. Cary
+ },
+ # Link Message-Id to mailing list archive
+ 'messageid' => {
+ 'options' => {
+ # The original pattern, which I don't really understand
+ #'pattern' => qr!(?:message|msg)-id:?\s+<([^>]+)>;!i,
+ 'pattern' => qr!(?:message|msg)-?id:?\s+(<[^>]+>)!i,
Errr... how original patter is different from the one used? Also above
comment should be removed in final submission.
Post by Marcel M. Cary
+ 'url' => 'http://news.gmane.org/find-root.php?message_id=',
Same comment about generic URL... although on the other hand perhaps
having a few examples of mail archive sites which support finding
messages by Message-Id could be a good idea.

BTW. you can write 'http://mid.gmane.org/' instead...
Post by Marcel M. Cary
+ },
+ 'override' => 0,
+ # The original version didn't include the "msg-id" text in the
+ # link text, but this does. In general, I think a little more
+ # context makes for better link text.
I guess that is the result of using generic hyperlink_committag()
subroutine here. (This comment should be removed or reworded in final
submitted version, I think.)

BTW. it would be much easier with Perl6-ish (or Perl 5.10.x) named
captures (named groups):

'pattern' => qr!(?:message|msg)-?id:?\s+(?P<query><[^>]+>)!i,

or something like that.
Post by Marcel M. Cary
+ 'sub' => \&hyperlink_committag,
+ },
+);
+
# You define site-wide feature defaults here; override them with
# $GITWEB_CONFIG as necessary.
our %feature = (
@@ -365,6 +440,21 @@ our %feature = (
'sub' => \&feature_patches,
'override' => 0,
'default' => [16]},
+
+ # The selection and ordering of committags that are enabled.
+ # Committag transformations will be applied to commit log messages
+ # in this order if listed here.
/this/given/

You need to mention somewhere that committag subroutines return a list
of mixed scalar and reference to scalar elements, where using reference
to scalar removes value from the chain of filters (including implicit
final esc_html filter).
Post by Marcel M. Cary
+
+ # To disable system wide have in $GITWEB_CONFIG
+ # $feature{'committags'}{'default'} = [];
+ # To have project specific config enable override in $GITWEB_CONFIG
+ # $feature{'committags'}{'override'} = 1;
+ # and in project config gitweb.committags = sha1, url, bugzilla
+ # to enable those three committags for that project
Just a thought: perhaps we should provide support for 'default' in
config (which would currently be "sha1" or "sha1, url").

See also comment text for 'snapshot' feature, which says:

and in project config, a comma-separated list of [...] or
"none" to disable.
Post by Marcel M. Cary
+ 'committags' => {
+ 'sub' => \&feature_committags,
+ 'override' => 0,
+ 'default' => ['sha1']},
);
sub gitweb_get_feature {
@@ -433,6 +523,18 @@ sub feature_patches {
return ($_[0]);
}
+sub feature_committags {
+
+ my ($cfg) = git_get_project_config('committags');
+
+ if ($cfg) {
+ return ($cfg eq 'none' ? () : split(/\s*[,\s]\s*/, $cfg));
+ }
+
+}
As this would be second feature which uses comma-separated (or for
backward compatibility space separated) list of options, perhaps
we should factor out this part into common helper subroutine named
for example 'feature_list' or 'feature_multi' (like 'feature_bool').
Post by Marcel M. Cary
+
# checking HEAD file with -e is fragile if the repository was
# initialized long time ago (i.e. symlink HEAD) and was pack-ref'ed
# and then pruned.
@@ -814,6 +916,34 @@ $git_dir = "$projectroot/$project" if $project;
@snapshot_fmts = filter_snapshot_fmts(@snapshot_fmts);
+# ordering of committags
+
+# Merge project configs with default committag definitions
+gitweb_load_project_committags();
Good idea... although gitweb first defines and then uses subroutine,
see evaluate_path_info().
Post by Marcel M. Cary
+
+# Load committag configs from the repository config file and and
+# incorporate them into the gitweb defaults where permitted by the
+# site administrator.
+sub gitweb_load_project_committags {
+ return if (!$git_dir);
+ my %project_config = ();
+ my %raw_config = git_parse_project_config('gitweb\.committag');
Why not do lazy-loading of a whole config here? We use committag
info only for project-specific actions in gitweb.
Post by Marcel M. Cary
+ foreach my $key (keys(%raw_config)) {
+ next if ($key !~ /gitweb\.committag\.[^.]+\.[^.]/);
+ my ($gitweb_prefix, $committag_prefix, $ctname, $option) =
+ split(/\./, $key, 4);
+ $project_config{$ctname}{$option} = $raw_config{$key};
+ }
And use created subroutines to handle config?
Post by Marcel M. Cary
+ foreach my $ctname (keys(%committags)) {
+ next if (!$committags{$ctname}{'override'});
+ foreach my $optname (keys %{$project_config{$ctname}}) {
+ $committags{$ctname}{'options'}{$optname} =
+ $project_config{$ctname}{$optname};
+ }
+ }
+}
+
# dispatch
if (!defined $action) {
if (defined $hash) {
@@ -1384,13 +1514,92 @@ sub file_type_long {
sub format_log_line_html {
my $line = shift;
- $line = esc_html($line, -nbsp=>1);
- $line =~ s{\b([0-9a-fA-F]{8,40})\b}{
- $cgi->a({-href => href(action=>"object", hash=>$1),
- -class => "text"}, $1);
- }eg;
+ # In this list of log message fragments, a string ref indicates HTML,
+ # and a string indicates plain text
Well, to be more exact string ref means that the string referenced is
not to be processed by later filters, including final implicit esc_html.
Post by Marcel M. Cary
- return $line;
+ next COMMITTAG unless exists $committags{$ctname};
+ my $committag = $committags{$ctname};
+
+ next COMMITTAG unless exists $committag->{'options'};
+ my $opts = $committag->{'options'};
+
+ next COMMITTAG unless exists $opts->{'pattern'};
+ my $pattern = $opts->{'pattern'};
+
+
+ next PART if $part eq "";
+ if (ref($part)) {
+ next PART;
+ }
+
+ my $oldpos = 0;
+
+ while ($part =~ m/$pattern/gc) {
+ my ($prepos, $postpos) = ($-[0], $+[0]);
+ my $repl = $committag->{'sub'}->($opts, $&, $1);
+ $repl = "" if (!defined $repl);
+
+ my $pre = substr($part, $oldpos, $prepos - $oldpos);
+
+ $oldpos = $postpos;
+ } # end while [regexp matches]
+
+ my $rest = substr($part, $oldpos);
+
+
+
+ # Escape any remaining plain text and concatenate
+ my $html = '';
+ if (ref($part)) {
+ $html .= $$part;
+ } else {
+ $html .= esc_html($part, -nbsp=>1);
+ }
+ }
+
+ return $html;
+}
Nice.
Post by Marcel M. Cary
+
+# Returns a ref to an HTML snippet that links the second
+# parameter to a URL formed from the first and last parameters.
+# This is a helper function used in %committags.
+sub hyperlink_committag {
+ return
+ \$cgi->a({-href => $opts->{url} . CGI::escape($match[1]),
$opts->{'url'} not $opts->{url}

'$cgi->escapeHTML' I think, not 'CGI::escape' (but I am not sure here).
besides, we can always import 'escape'.
Post by Marcel M. Cary
+ -class => "text"},
+ esc_html($match[0], -nbsp=>1));
+}
+
+
Hmmm... this would be first use of Perl subroutine prototypes in gitweb.
But this is made to imitate 'push' built-in, so I think it is O.K.
Post by Marcel M. Cary
+ my $list = shift;
+
+ } else {
+
+ }
+ # imitate push
}
# format marker of refs pointing to given object
diff --git a/t/t9500-gitweb-standalone-no-errors.sh b/t/t9500-gitweb-standalone-no-errors.sh
index d539619..37a127c 100755
--- a/t/t9500-gitweb-standalone-no-errors.sh
+++ b/t/t9500-gitweb-standalone-no-errors.sh
@@ -55,9 +55,9 @@ gitweb_run () {
# some of git commands write to STDERR on error, but this is not
# we are interested only in properly formatted errors/warnings
- rm -f gitweb.log &&
+ rm -f resp.http gitweb.log &&
perl -- "$SCRIPT_NAME" \
- >/dev/null 2>gitweb.log &&
+ > resp.http 2>gitweb.log &&
if grep "^[[]" gitweb.log >/dev/null 2>&1; then false; else true; fi
Well, if you begin to check _output_ of gitweb, then it should be put
in separate test, not t/t9500-gitweb-standalone-no-errors.sh which is
only about no-errors... or change name of gitweb test.
Post by Marcel M. Cary
# gitweb.log is left for debugging
@@ -702,4 +702,150 @@ test_expect_success \
gitweb_run "p=.git;a=summary"'
test_debug 'cat gitweb.log'
+# ----------------------------------------------------------------------
+# sha1 linking
+#
+echo hi > file.txt
+git add file.txt
+git commit -q -F - file.txt <<END
+Summary
+
+See also commit 567890ab
+END
+test_expect_success 'sha1 link: enabled by default' '
+ h=$(git rev-parse --verify HEAD) &&
+ gitweb_run "p=.git;a=commit;h=$h" &&
Actually you can just use "h=HEAD" or use query without 'h' parameter
(which defaults to "HEAD") here.
Post by Marcel M. Cary
+ grep -q \
+ "commit&nbsp;<a class=\"text\" href=\".*\">567890ab</a>" \
+ resp.http
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 567890ab resp.http'
I'd rather use Test::* (e.g. Test::WWW::Mechanize::CGI) for that...
but having some output test for gitweb, even in such simple form would
certainly be nice.
Post by Marcel M. Cary
+
+# ----------------------------------------------------------------------
+# bugzilla commit tag
+#
+
+echo foo > file.txt
+git add file.txt
+git commit -q -F - file.txt <<END
+Fix foo
+
+Fixes bug 1234 involving foo.
+END
+git config gitweb.committags 'sha1, bugzilla'
+test_expect_success 'bugzilla: enabled but not permitted' '
+ h=$(git rev-parse --verify HEAD) &&
+ gitweb_run "p=.git;a=commit;h=$h" &&
+ grep -F -q \
+ "Fixes&nbsp;bug&nbsp;1234&nbsp;involving" \
+ resp.http
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 1234 resp.http'
+
+echo '$feature{"committags"}{"override"} = 1;' >> gitweb_config.perl
+test_expect_success 'bugzilla: enabled' '
+ h=$(git rev-parse --verify HEAD) &&
+ gitweb_run "p=.git;a=commit;h=$h" &&
+ grep -F -q \
+ "Fixes&nbsp;<a class=\"text\" href=\"http://bugzilla.kernel.org/show_bug.cgi?id=1234\">bug&nbsp;1234</a>&nbsp;involving" \
+ resp.http
+'
Hmmm...
--
Jakub Narebski
Poland
Marcel M. Cary
2009-11-18 06:22:29 UTC
Permalink
Committags are limited to the functionality configured by the site
administrator.

Provide two more general purpose committag subroutines that replace
text by feeding the capturing groups of a pattern to a sprintf format.
One additionally escapes the parameters of the capturing groups for
producing HTML snippets, the other does not.

Then, if permitted by the site administrator, allow the 'sub' key to
be overridden in an existing committag and allow a new committag to be
defined completely from within the repository configuration.

Signed-off-by: Marcel M. Cary <***@oak.homeunix.org>
---
gitweb/gitweb.perl | 135 ++++++++++++++++++++++++++++--------------
t/t9502-gitweb-committags.sh | 50 +++++++++++++++
2 files changed, 141 insertions(+), 44 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 7f7d3a3..d413f22 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -214,8 +214,7 @@ our %avatar_size = (
);

# In general, the site admin can enable/disable per-project
-# configuration of each committag. Only the 'options' part of the
-# committag is configurable per-project.
+# configuration of each committag.
#
# The site admin can of course add new tags to this hash or override
# the 'sub' key if necessary. But such changes may be fragile; this
@@ -241,12 +240,18 @@ our %avatar_size = (
# on the implementation of the 'sub' key. The hyperlink_committag
# value appends the first captured group to the 'url' option, for example.
#
+# The project configuration can define new committags. Although the
+# project configuration cannot supply code defining a new 'sub' key,
+# the project configuration can choose from a list of pre-approved
+# subroutines named in the 'allowed_committag_subs' feature. Both a
+# 'sub' key and 'pattern' key must be defined for the committag to be
+# used. If the 'allowed_committag_subs' feature is empty, no new
+# committags can be defined in the project config.
+#
our %committags = (
# Link Git-style hashes to this gitweb
'sha1' => {
- 'options' => {
- 'pattern' => qr/\b([0-9a-fA-F]{8,40})\b/,
- },
+ 'pattern' => qr/\b([0-9a-fA-F]{8,40})\b/,
'override' => 0,
'sub' => sub {
my ($opts, @match) = @_;
@@ -258,10 +263,8 @@ our %committags = (
# Facilitate styling of common header/footer lines, suppressing
# any trailing newlines
'signoff' => {
- 'options' => {
- 'pattern' =>
- qr/^( *(?:signed[ \-]off[ \-]by|acked[ \-]by|cc)[ :].*)\n*$/mi,
- },
+ 'pattern' =>
+ qr/^( *(?:signed[ \-]off[ \-]by|acked[ \-]by|cc)[ :].*)\n*$/mi,
'override' => 0,
'sub' => sub {
my ($opts, @match) = @_;
@@ -273,23 +276,19 @@ our %committags = (
# Link bug/features to Mantis bug tracker using Mantis-style
# contextual cues
'mantis' => {
- 'options' => {
- 'pattern' => qr/(?:BUG|FEATURE)\((\d+)\)/,
- 'url' => 'http://www.example.com/mantisbt/view.php?id=',
- },
+ 'pattern' => qr/(?:BUG|FEATURE)\((\d+)\)/,
+ 'url' => 'http://www.example.com/mantisbt/view.php?id=',
'override' => 0,
'sub' => \&hyperlink_committag,
},
# Link mentions of bugs to bugzilla, allowing for separate outer
# and inner regexes (see unit test for example)
'bugzilla' => {
- 'options' => {
- 'pattern' => qr/(?i:bugs?):?\s+
- [#]?\d+(?:(?:,\s*|,?\sand\s|,?\sn?or\s|\s+)
- [#]?\d+\b)*/x,
- 'innerpattern' => qr/#?(\d+)/,
- 'url' => 'http://bugzilla.example.com/show_bug.cgi?id=',
- },
+ 'pattern' => qr/(?i:bugs?):?\s+
+ [#]?\d+(?:(?:,\s*|,?\sand\s|,?\sn?or\s|\s+)
+ [#]?\d+\b)*/x,
+ 'innerpattern' => qr/#?(\d+)/,
+ 'url' => 'http://bugzilla.example.com/show_bug.cgi?id=',
'override' => 0,
'sub' => sub {
my ($opts, @match) = @_;
@@ -308,15 +307,13 @@ our %committags = (
},
# Link URLs
'url' => {
- 'options' => {
- # Avoid matching punctuation that might immediately follow
- # a url, is not part of the url, and is allowed in urls,
- # like a full-stop ('.').
- 'pattern' => qr!(https?|ftps?|git|ssh|ssh+git|sftp|smb|webdavs?|
- nfs|irc|nntp|rsync)
- ://[-_a-zA-Z0-9\@/&=+~#<>;%:.?]+
- [-_a-zA-Z0-9\@/&=+~#<>]!x,
- },
+ # Avoid matching punctuation that might immediately follow
+ # a url, is not part of the url, and is allowed in urls,
+ # like a full-stop ('.').
+ 'pattern' => qr!(https?|ftps?|git|ssh|ssh+git|sftp|smb|webdavs?|
+ nfs|irc|nntp|rsync)
+ ://[-_a-zA-Z0-9\@/&=+~#<>;%:.?]+
+ [-_a-zA-Z0-9\@/&=+~#<>]!x,
'override' => 0,
'sub' => sub {
my ($opts, @match) = @_;
@@ -327,10 +324,8 @@ our %committags = (
},
# Link Message-Id to mailing list archive
'messageid' => {
- 'options' => {
- 'pattern' => qr!(?:message|msg)-?id:?\s+(<[^>]+>)!i,
- 'url' => 'http://mid.gmane.org/',
- },
+ 'pattern' => qr!(?:message|msg)-?id:?\s+(<[^>]+>)!i,
+ 'url' => 'http://mid.gmane.org/',
'override' => 0,
# Includes the "msg-id" text in the link text.
# Since we don't support linking multiple msg-ids in one match, we
@@ -558,6 +553,22 @@ our %feature = (
'sub' => sub { feature_list('committags', @_) },
'override' => 0,
'default' => ['signoff', 'sha1']},
+
+ # The list of committag callbacks that are permitted to be used
+ # from within a repository configuration. These are interpretted
+ # as perl subrouting names. If none are listed, no new committags
+ # can be defined in the project config, which is the default.
+
+ # To enable system wide have in $GITWEB_CONFIG
+ # $feature{'allowed_committag_subs'}{'default'} = [
+ # 'hyperlink_committag',
+ # 'markup_committag',
+ # 'transform_committag',
+ # ];
+ # It would not make sense to allow per-project overrides of this.
+ 'allowed_committag_subs' => {
+ 'override' => 0,
+ 'default' => []},
);

sub gitweb_get_feature {
@@ -1030,6 +1041,9 @@ if ($git_avatar eq 'gravatar') {
# ordering of committags
our @committags = gitweb_get_feature('committags');

+# ordering of committags
+our @allowed_committag_subs = gitweb_get_feature('allowed_committag_subs');
+
# whether we've loaded committags for the project yet
our $loaded_project_committags = 0;

@@ -1046,9 +1060,18 @@ sub gitweb_load_project_committags {
split(/\./, $key, 4);
$project_config{$ctname}{$option} = $raw_config{$key};
}
- foreach my $ctname (keys(%committags)) {
- my $override = $committags{$ctname}{'override'};
+
+ my %allowed_subs = ();
+ foreach my $sub (@allowed_committag_subs) {
+ $allowed_subs{$sub} = 1;
+ }
+
+ foreach my $ctname (keys(%project_config)) {
+ my $override = $committags{$ctname}
+ ? $committags{$ctname}{'override'}
+ : 1;
next if (!$override);
+
my $override_keys = undef;
if (ref($override) eq "ARRAY") {
$override_keys = {};
@@ -1056,12 +1079,19 @@ sub gitweb_load_project_committags {
$override_keys->{$optname} = 1;
}
}
+
foreach my $optname (keys %{$project_config{$ctname}}) {
next if ($override_keys && !$override_keys->{$optname});
- $committags{$ctname}{'options'}{$optname} =
- $project_config{$ctname}{$optname};
+ my $value = $project_config{$ctname}{$optname};
+ if ($optname eq 'sub') {
+ if (!$allowed_subs{$value}) {
+ next;
+ }
+ }
+ $committags{$ctname}{$optname} = $value;
}
}
+
$loaded_project_committags = 1;
}

@@ -1654,11 +1684,8 @@ COMMITTAG:
next COMMITTAG unless exists $committag->{'sub'};
my $sub = $committag->{'sub'};

- next COMMITTAG unless exists $committag->{'options'};
- my $opts = $committag->{'options'};
-
- next COMMITTAG unless exists $opts->{'pattern'};
- my $pattern = $opts->{'pattern'};
+ next COMMITTAG unless exists $committag->{'pattern'};
+ my $pattern = $committag->{'pattern'};

my @new_message_fragments = ();

@@ -1672,7 +1699,8 @@ COMMITTAG:

push_or_append_replacements(\@new_message_fragments,
$pattern, $fragment, sub {
- $sub->($opts, @_);
+ no strict "refs"; # for custome committags
+ $sub->($committag, @_);
});

} # end foreach (@message_fragments)
@@ -1705,6 +1733,25 @@ sub hyperlink_committag {
esc_html($match[0], -nbsp=>1));
}

+# Returns a ref to an HTML snippet formed from the 'replacement'
+# option and match data. The match data is HTML-escaped, and the
+# 'replacement' option is used as a sprintf format. This is a helper
+# function used in %committags.
+sub markup_committag {
+ my ($opts, @match) = @_;
+ return \sprintf($opts->{'replacement'},
+ map { esc_html($_, -nbsp=>1) if defined } @match);
+}
+
+# Returns a text snippet formed from the 'replacement' option and
+# match data. The 'replacement' option is used as a sprintf format.
+# Because the result is text, it can be re-processed by subsequent
+# committags. This is a helper function used in %committags.
+sub transform_committag {
+ my ($opts, @match) = @_;
+ return sprintf($opts->{'replacement'}, @match);
+}
+
# Find $pattern in string $fragment, and push_or_append the parts
# between matches and the result of calling $sub with matched text to
# $new_fragments.
@@ -1717,7 +1764,7 @@ MATCH:
while ($fragment =~ m/$pattern/gc) {
my ($prepos, $postpos) = ($-[0], $+[0]);

- my @repl = $sub->($&, $1);
+ my @repl = $sub->($&, $1, $2);

my $pre = substr($fragment, $oldpos, $prepos - $oldpos);
push_or_append($new_fragments, $pre);
diff --git a/t/t9502-gitweb-committags.sh b/t/t9502-gitweb-committags.sh
index 0753630..cbe607b 100755
--- a/t/t9502-gitweb-committags.sh
+++ b/t/t9502-gitweb-committags.sh
@@ -226,5 +226,55 @@ test_expect_success 'msgid link: linked when enabled' '
test_debug 'cat gitweb.log'
test_debug 'grep -F "y.z" gitweb.output'

+# ----------------------------------------------------------------------
+# custom committags
+#
+echo custom_test > file.txt
+git add file.txt
+git commit -q -F - file.txt <<END
+Something for <foo&***@bar.com>
+END
+echo '$feature{"allowed_committag_subs"}{"default"} = [
+ "hyperlink_committag",
+ "markup_committag",
+ "transform_committag",
+ ];' >> gitweb_config.perl
+git config gitweb.committags 'sha1, obfuscate'
+git config gitweb.committag.obfuscate.pattern '([a-z&]+@)[a-z]+(.com)'
+git config gitweb.committag.obfuscate.sub 'transform_committag'
+git config gitweb.committag.obfuscate.replacement '%2$sXXX%3$s'
+test_expect_success 'custom committags: transform_committag' '
+ gitweb_run "p=.git;a=commit;h=HEAD" &&
+ grep -q -F \
+ "foo&amp;***@XXX.com" \
+ gitweb.output
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep -F "foo" gitweb.output'
+
+git config gitweb.committags 'sha1, linkemail'
+git config gitweb.committag.linkemail.pattern '<([a-z&]+@[a-z]+.com)>'
+git config gitweb.committag.linkemail.sub 'markup_committag'
+git config gitweb.committag.linkemail.replacement '<a href="mailto:%2$s">%1$s</a>'
+test_expect_success 'custom committags: markup_committag' '
+ gitweb_run "p=.git;a=commit;h=HEAD" &&
+ grep -q -F \
+ "<a href=\"mailto:foo&amp;***@bar.com\">&lt;foo&amp;***@bar.com&gt;</a>" \
+ gitweb.output
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep -F "foo" gitweb.output'
+
+echo '$feature{"allowed_committag_subs"}{"default"} = [
+ ];' >> gitweb_config.perl
+test_expect_success 'custom committags: ignored when disabled' '
+ gitweb_run "p=.git;a=commit;h=HEAD" &&
+ grep -q -F \
+ "Something&nbsp;for&nbsp;&lt;foo&amp;***@bar.com&gt;" \
+ gitweb.output
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep -F "foo" gitweb.output'
+

test_done
--
1.6.4.4
Marcel M. Cary
2009-11-18 06:22:30 UTC
Permalink
If the site admin configures the list of committags, there's no way
for a project to get the defaults back short of enumerating them
explicitly. Worse yet, when the distribution upgrades the default
list, perhaps to push more pre-existing functionality into committags,
the project would have to discover this and upgrade its configuration
to match the new defaults.

Add a special _defaults_ list entry which, in the project config,
expands to the build-time default list configured for that variable.
A project may use this to append or prepend to the default
configuration, even as the default configuration changes with new
releases.
---
gitweb/INSTALL | 5 +++++
gitweb/gitweb.perl | 18 +++++++++++++-----
t/t9502-gitweb-committags.sh | 29 +++++++++++++++++++++++++++++
3 files changed, 47 insertions(+), 5 deletions(-)

diff --git a/gitweb/INSTALL b/gitweb/INSTALL
index 15c0128..83e6a5e 100644
--- a/gitweb/INSTALL
+++ b/gitweb/INSTALL
@@ -143,6 +143,11 @@ And then let each project configure its bug tracker URL:

git config gitweb.committag.bugzilla.url 'http://bts.example.com?bug='

+In a project config, the build-time list of committags can be accessed
+with the special _defaults_ entry.
+
+ git config gitweb.committags '_defaults_, bugzilla'
+

Gitweb repositories
-------------------
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index d413f22..707e76e 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -334,6 +334,13 @@ our %committags = (
},
);

+sub make_list_feature {
+ my ($name, $hash) = @_;
+ $hash->{'build_default'} = [@{$hash->{'default'}}];
+ $hash->{'sub'} = sub { feature_list($name, @_) };
+ return @_;
+}
+
# You define site-wide feature defaults here; override them with
# $GITWEB_CONFIG as necessary.
our %feature = (
@@ -378,8 +385,7 @@ our %feature = (
# $feature{'snapshot'}{'override'} = 1;
# and in project config, a comma-separated list of formats or "none"
# to disable. Example: gitweb.snapshot = tbz2,zip;
- 'snapshot' => {
- 'sub' => sub { feature_list('snapshot', @_) },
+ make_list_feature 'snapshot' => {
'override' => 0,
'default' => ['tgz']},

@@ -549,8 +555,7 @@ our %feature = (
# $feature{'committags'}{'override'} = 1;
# and in project config gitweb.committags = sha1, url, bugzilla
# to enable those three committags for that project
- 'committags' => {
- 'sub' => sub { feature_list('committags', @_) },
+ make_list_feature 'committags' => {
'override' => 0,
'default' => ['signoff', 'sha1']},

@@ -621,7 +626,10 @@ sub feature_list {
my ($cfg) = git_get_project_config($key);

if ($cfg) {
- return ($cfg eq 'none' ? () : split(/\s*[,\s]\s*/, $cfg));
+ return () if $cfg eq 'none';
+ return map {
+ $_ eq '_defaults_' ? @{$feature{$key}{'build_default'}} : $_
+ } split(/\s*[,\s]\s*/, $cfg);
}

return @defaults;
diff --git a/t/t9502-gitweb-committags.sh b/t/t9502-gitweb-committags.sh
index cbe607b..7d16329 100755
--- a/t/t9502-gitweb-committags.sh
+++ b/t/t9502-gitweb-committags.sh
@@ -276,5 +276,34 @@ test_expect_success 'custom committags: ignored when disabled' '
test_debug 'cat gitweb.log'
test_debug 'grep -F "foo" gitweb.output'

+# ----------------------------------------------------------------------
+# default keyword
+#
+echo default_test > file.txt
+git add file.txt
+git commit -q -F - file.txt <<END
+Lets see what's enabled...
+
+Bug 1234
+567890ab
+See msg-id <***@y.z>
+
+Signed-off-by: A U Thor <***@example.com>
+END
+echo '
+$feature{"committags"}{"default"} = ["sha1", "messageid"];
+$feature{"committags"}{"override"} = 1;
+' >> gitweb_config.perl
+git config gitweb.committags '_defaults_, bugzilla'
+# All these committags should be in effect except messageid
+test_expect_success '_defaults_ keyword: restores build-time default' '
+ gitweb_run "p=.git;a=commit;h=HEAD" &&
+ grep -q "Bug&nbsp;<a[^>]*>1234</a>" gitweb.output &&
+ grep -q "<a[^>]*>567890ab</a>" gitweb.output &&
+ grep -q "See&nbsp;msg-id&nbsp;&lt;***@y.z&gt;" gitweb.output &&
+ grep -q "<span[^>]*>Signed-off-by:" gitweb.output
+'
+test_debug 'cat gitweb.log'
+test_debug 'for i in Bug 5678 msg-id Signed-off; do grep $i gitweb.output; done'

test_done
--
1.6.4.4
Marcel M. Cary
2009-11-18 06:22:27 UTC
Permalink
Currently, a site administrator must choose between allowing all or
none of a committag's options to be overridden in the project config.
However, a site admin may wish to permit specifying a bugzilla URL
without risking a maliciously resource hungry regular expression.

Allow the site admin to specify which committag parameters may be
overridden. Preserve the behavior of the original 0 and 1 override
specifications.

Signed-off-by: Marcel M. Cary <***@oak.homeunix.org>
---
gitweb/INSTALL | 8 +++++++-
gitweb/gitweb.perl | 24 ++++++++++++++++++------
t/t9502-gitweb-committags.sh | 13 +++++++++++++
3 files changed, 38 insertions(+), 7 deletions(-)

diff --git a/gitweb/INSTALL b/gitweb/INSTALL
index 9081ed8..15c0128 100644
--- a/gitweb/INSTALL
+++ b/gitweb/INSTALL
@@ -133,9 +133,15 @@ adding the following lines to your $GITWEB_CONFIG:
$known_snapshot_formats{'tgz'}{'compressor'} = ['gzip','-6'];

To add a committag to the default list of commit tags, for example to
-enable hyperlinking of bug numbers to a bug tracker for all projects:
+enable hyperlinking of bug numbers to a bug tracker for all projects, while
+allowing each project to choose only the base URL for its bug tracker:

push @{$feature{'committags'}{'default'}}, 'bugzilla';
+ $committags{"bugzilla"}{"override"} = ["url"];
+
+And then let each project configure its bug tracker URL:
+
+ git config gitweb.committag.bugzilla.url 'http://bts.example.com?bug='


Gitweb repositories
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 032b1c5..8f4480e 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -225,11 +225,13 @@ our %avatar_size = (
# will not be processed further.
#
# For any committag, set the 'override' key to 1 to allow individual
-# projects to override entries in the 'options' hash for that tag.
-# For example, to match only commit hashes given in lowercase in one
-# project, add this to the $GITWEB_CONFIG:
+# projects to override any entry in the 'options' hash for that tag.
+# Leave 'override' as 0 to disallow all overriding of all entries.
+# Set 'override' to an array of 'option' key names to allow overriding
+# specific keys. For example, to match only commit hashes given in
+# lowercase in one project, add this to the $GITWEB_CONFIG:
#
-# $committags{'sha1'}{'override'} = 1;
+# $committags{'sha1'}{'override'} = 1; # or ["pattern"]
#
# And in the project's config:
#
@@ -237,7 +239,8 @@ our %avatar_size = (
#
# Some committags have additional options whose interpretation depends
# on the implementation of the 'sub' key. The hyperlink_committag
-# value appends the first captured group to the 'url' option.
+# value appends the first captured group to the 'url' option, for example.
+#
our %committags = (
# Link Git-style hashes to this gitweb
'sha1' => {
@@ -1029,8 +1032,17 @@ sub gitweb_load_project_committags {
$project_config{$ctname}{$option} = $raw_config{$key};
}
foreach my $ctname (keys(%committags)) {
- next if (!$committags{$ctname}{'override'});
+ my $override = $committags{$ctname}{'override'};
+ next if (!$override);
+ my $override_keys = undef;
+ if (ref($override) eq "ARRAY") {
+ $override_keys = {};
+ foreach my $optname (@$override) {
+ $override_keys->{$optname} = 1;
+ }
+ }
foreach my $optname (keys %{$project_config{$ctname}}) {
+ next if ($override_keys && !$override_keys->{$optname});
$committags{$ctname}{'options'}{$optname} =
$project_config{$ctname}{$optname};
}
diff --git a/t/t9502-gitweb-committags.sh b/t/t9502-gitweb-committags.sh
index 718e763..e13ac47 100755
--- a/t/t9502-gitweb-committags.sh
+++ b/t/t9502-gitweb-committags.sh
@@ -68,6 +68,19 @@ test_expect_success 'bugzilla: url overridden but not permitted' '
test_debug 'cat gitweb.log'
test_debug 'grep 1234 gitweb.output'

+echo '$committags{"bugzilla"}{"override"} = ["url"];' >> gitweb_config.perl
+git config gitweb.committag.bugzilla.url 'http://bts.example.com?bug='
+git config gitweb.committag.bugzilla.pattern 'slow DoS regex'
+test_expect_success 'bugzilla: url overridden but regex not permitted' '
+ gitweb_run "p=.git;a=commit;h=HEAD" &&
+ grep -F -q \
+ "Fixes&nbsp;bug&nbsp;<a class=\"text\" href=\"http://bts.example.com?bug=1234\">1234</a>&nbsp;involving" \
+ gitweb.output
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 1234 gitweb.output'
+git config --unset gitweb.committag.bugzilla.pattern
+
echo '$committags{"bugzilla"}{"override"} = 1;' >> gitweb_config.perl
test_expect_success 'bugzilla: url overridden' '
gitweb_run "p=.git;a=commit;h=HEAD" &&
--
1.6.4.4
Marcel M. Cary
2009-11-18 06:22:28 UTC
Permalink
Committags cannot currently span multiple lines. Since some committag
patterns match multiple words. If some of those words wrap to the
next line, the committag would miss an opportunity to match.

Eliminate the for-loop over @log and pull the signoff transformation
from that loop into a committag.

The message will still get cut into pieces as committags are applied,
but at least newlines no longer force a cut.

Signed-off-by: Marcel M. Cary <***@oak.homeunix.org>
---
gitweb/gitweb.perl | 67 +++++++++++++++++++-----------------------
t/t9502-gitweb-committags.sh | 8 +++++
2 files changed, 38 insertions(+), 37 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 8f4480e..7f7d3a3 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -255,6 +255,21 @@ our %committags = (
esc_html($match[0], -nbsp=>1));
},
},
+ # Facilitate styling of common header/footer lines, suppressing
+ # any trailing newlines
+ 'signoff' => {
+ 'options' => {
+ 'pattern' =>
+ qr/^( *(?:signed[ \-]off[ \-]by|acked[ \-]by|cc)[ :].*)\n*$/mi,
+ },
+ 'override' => 0,
+ 'sub' => sub {
+ my ($opts, @match) = @_;
+ return (\$cgi->span({'class' => 'signoff'},
+ esc_html($match[1], -nbsp=>1)),
+ "\n");
+ },
+ },
# Link bug/features to Mantis bug tracker using Mantis-style
# contextual cues
'mantis' => {
@@ -542,7 +557,7 @@ our %feature = (
'committags' => {
'sub' => sub { feature_list('committags', @_) },
'override' => 0,
- 'default' => ['sha1']},
+ 'default' => ['signoff', 'sha1']},
);

sub gitweb_get_feature {
@@ -1619,8 +1634,8 @@ sub file_type_long {
## which don't belong to other sections

# format line of commit message.
-sub format_log_line_html {
- my $line = shift;
+sub format_log_html {
+ my $text = shift;

# Merge project configs with site default committag definitions if
# it hasn't been done yet
@@ -1629,7 +1644,7 @@ sub format_log_line_html {
# In this list of log message fragments, a string ref indicates
# HTML, and a string indicates plain text. The string refs are
# also currently not processed by subsequent committags.
- my @message_fragments = ( $line );
+ my @message_fragments = ( $text );

COMMITTAG:
foreach my $ctname (@committags) {
@@ -1671,7 +1686,9 @@ COMMITTAG:
if (ref($fragment)) {
$html .= $$fragment;
} else {
- $html .= esc_html($fragment, -nbsp=>1);
+ # Don't let esc_html turn "\n" into "\\n"
+ $html .= join("<br/>\n", map { esc_html($_, -nbsp=>1) }
+ split("\n", $fragment, -1));
}
}

@@ -3776,40 +3793,16 @@ sub git_print_log {
shift @$log;
}

- # print log
- my $signoff = 0;
- my $empty = 0;
- foreach my $line (@$log) {
- if ($line =~ m/^ *(signed[ \-]off[ \-]by[ :]|acked[ \-]by[ :]|cc[ :])/i) {
- $signoff = 1;
- $empty = 0;
- if (! $opts{'-remove_signoff'}) {
- print "<span class=\"signoff\">" . esc_html($line) . "</span><br/>\n";
- next;
- } else {
- # remove signoff lines
- next;
- }
- } else {
- $signoff = 0;
- }
-
- # print only one empty line
- # do not print empty line after signoff
- if ($line eq "") {
- next if ($empty || $signoff);
- $empty = 1;
- } else {
- $empty = 0;
- }
-
- print format_log_line_html($line) . "<br/>\n";
- }
-
if ($opts{'-final_empty_line'}) {
- # end with single empty line
- print "<br/>\n" unless $empty;
+ # If we already have a trailing newline, this will be
+ # coalesced with it later.
+ push @$log, "";
}
+
+ # print log
+ my $text = join("\n", @$log) . "\n";
+ $text =~ s{\n\n+}{\n\n}g;
+ print format_log_html($text);
}

# return link target (what link points to)
diff --git a/t/t9502-gitweb-committags.sh b/t/t9502-gitweb-committags.sh
index e13ac47..0753630 100755
--- a/t/t9502-gitweb-committags.sh
+++ b/t/t9502-gitweb-committags.sh
@@ -138,6 +138,10 @@ Fix memory leak in confabulator from bug 123.
Based on history from bugs 223, 224, and 225,
fix bug 323 or 324.

+Bugs:
+1234,
+1235
+
Bug: 423,424,425,426,427,428,429,430,431,432,435
Resolves-bugs: #523 #524
END
@@ -167,6 +171,10 @@ test_expect_success 'bugzilla: fancy defaults: space-separated with hash' '
"-bugs:&nbsp;<a[^>]*>#523</a>&nbsp;<a[^>]*>#524</a>" \
gitweb.output
'
+test_expect_success 'bugzilla: fancy defaults: spanning newlines' '
+ grep -q -e "<a[^>]*>1234</a>,<br" gitweb.output &&
+ grep -q -e "<a[^>]*>1235</a><br" gitweb.output
+'
test_debug 'cat gitweb.log'
test_debug 'grep 23 gitweb.output'
--
1.6.4.4
Marcel M. Cary
2009-11-18 06:22:26 UTC
Permalink
Currently it's not easy to capture an unbounded number of items
in a committag phrase to hyperlink individually.

For example, I would like to match and hyperlinking each bug ID
individually in these situations:

[#1234, #1235]
Resolves-bug: 1234, 1235
bugs 1234, 1235, and 1236

Match Bugzilla bug IDs with two regexes instead of one. The first is
a pre-filter that allows easy matching of multiple bug IDs and
contextual queues like "bugs ___ and ___" in a phrase, and the second
easily picks out the individual big IDs for hyperlinking.

Signed-off-by: Marcel M. Cary <***@oak.homeunix.org>
---
gitweb/gitweb.perl | 68 +++++++++++++++++++++++++++++------------
t/t9502-gitweb-committags.sh | 69 ++++++++++++++++++++++++++++++++++++++---
2 files changed, 112 insertions(+), 25 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 2d72202..032b1c5 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -262,14 +262,31 @@ our %committags = (
'override' => 0,
'sub' => \&hyperlink_committag,
},
- # Link mentions of bug IDs to bugzilla
+ # Link mentions of bugs to bugzilla, allowing for separate outer
+ # and inner regexes (see unit test for example)
'bugzilla' => {
'options' => {
- 'pattern' => qr/bug\s+(\d+)/,
+ 'pattern' => qr/(?i:bugs?):?\s+
+ [#]?\d+(?:(?:,\s*|,?\sand\s|,?\sn?or\s|\s+)
+ [#]?\d+\b)*/x,
+ 'innerpattern' => qr/#?(\d+)/,
'url' => 'http://bugzilla.example.com/show_bug.cgi?id=',
},
'override' => 0,
- 'sub' => \&hyperlink_committag,
+ 'sub' => sub {
+ my ($opts, @match) = @_;
+ if ($opts->{'innerpattern'}) {
+ my @message_fragments = ();
+ push_or_append_replacements(\@message_fragments,
+ $opts->{'innerpattern'},
+ $match[0], sub {
+ return hyperlink_committag($opts, @_);
+ });
+ return @message_fragments;
+ } else {
+ return hyperlink_committag(@_);
+ }
+ },
},
# Link URLs
'url' => {
@@ -1626,23 +1643,10 @@ COMMITTAG:
next PART;
}

- my $oldpos = 0;
-
- MATCH:
- while ($fragment =~ m/$pattern/gc) {
- my ($prepos, $postpos) = ($-[0], $+[0]);
- my $repl = $sub->($opts, $&, $1);
- $repl = "" if (!defined $repl);
-
- my $pre = substr($fragment, $oldpos, $prepos - $oldpos);
- push_or_append(\@new_message_fragments, $pre);
- push_or_append(\@new_message_fragments, $repl);
-
- $oldpos = $postpos;
- } # end while [regexp matches]
-
- my $rest = substr($fragment, $oldpos);
- push_or_append(\@new_message_fragments, $rest);
+ push_or_append_replacements(\@new_message_fragments,
+ $pattern, $fragment, sub {
+ $sub->($opts, @_);
+ });

} # end foreach (@message_fragments)

@@ -1672,6 +1676,30 @@ sub hyperlink_committag {
esc_html($match[0], -nbsp=>1));
}

+# Find $pattern in string $fragment, and push_or_append the parts
+# between matches and the result of calling $sub with matched text to
+# $new_fragments.
+sub push_or_append_replacements {
+ my ($new_fragments, $pattern, $fragment, $sub) = @_;
+
+ my $oldpos = 0;
+
+MATCH:
+ while ($fragment =~ m/$pattern/gc) {
+ my ($prepos, $postpos) = ($-[0], $+[0]);
+
+ my @repl = $sub->($&, $1);
+
+ my $pre = substr($fragment, $oldpos, $prepos - $oldpos);
+ push_or_append($new_fragments, $pre);
+ push_or_append($new_fragments, @repl);
+
+ $oldpos = $postpos;
+ } # end while [regexp matches]
+
+ my $rest = substr($fragment, $oldpos);
+ push_or_append($new_fragments, $rest);
+}

sub push_or_append (\@@) {
my $fragments = shift;
diff --git a/t/t9502-gitweb-committags.sh b/t/t9502-gitweb-committags.sh
index f86cb3d..718e763 100755
--- a/t/t9502-gitweb-committags.sh
+++ b/t/t9502-gitweb-committags.sh
@@ -52,7 +52,7 @@ echo '$feature{"committags"}{"override"} = 1;' >> gitweb_config.perl
test_expect_success 'bugzilla: enabled' '
gitweb_run "p=.git;a=commit;h=HEAD" &&
grep -F -q \
- "Fixes&nbsp;<a class=\"text\" href=\"http://bugzilla.example.com/show_bug.cgi?id=1234\">bug&nbsp;1234</a>&nbsp;involving" \
+ "Fixes&nbsp;bug&nbsp;<a class=\"text\" href=\"http://bugzilla.example.com/show_bug.cgi?id=1234\">1234</a>&nbsp;involving" \
gitweb.output
'
test_debug 'cat gitweb.log'
@@ -62,7 +62,7 @@ git config gitweb.committag.bugzilla.url 'http://bts.example.com?bug='
test_expect_success 'bugzilla: url overridden but not permitted' '
gitweb_run "p=.git;a=commit;h=HEAD" &&
grep -F -q \
- "Fixes&nbsp;<a class=\"text\" href=\"http://bugzilla.example.com/show_bug.cgi?id=1234\">bug&nbsp;1234</a>&nbsp;involving" \
+ "Fixes&nbsp;bug&nbsp;<a class=\"text\" href=\"http://bugzilla.example.com/show_bug.cgi?id=1234\">1234</a>&nbsp;involving" \
gitweb.output
'
test_debug 'cat gitweb.log'
@@ -72,12 +72,13 @@ echo '$committags{"bugzilla"}{"override"} = 1;' >> gitweb_config.perl
test_expect_success 'bugzilla: url overridden' '
gitweb_run "p=.git;a=commit;h=HEAD" &&
grep -F -q \
- "Fixes&nbsp;<a class=\"text\" href=\"http://bts.example.com?bug=1234\">bug&nbsp;1234</a>&nbsp;involving" \
+ "Fixes&nbsp;bug&nbsp;<a class=\"text\" href=\"http://bts.example.com?bug=1234\">1234</a>&nbsp;involving" \
gitweb.output
'
test_debug 'cat gitweb.log'
test_debug 'grep 1234 gitweb.output'

+git config gitweb.committag.bugzilla.innerpattern ''
git config gitweb.committag.bugzilla.pattern 'Fixes bug (\d+)'
test_expect_success 'bugzilla: pattern overridden' '
gitweb_run "p=.git;a=commit;h=HEAD" &&
@@ -87,17 +88,75 @@ test_expect_success 'bugzilla: pattern overridden' '
'
test_debug 'cat gitweb.log'
test_debug 'grep 1234 gitweb.output'
-git config --unset gitweb.committag.bugzilla.pattern

+git config --unset gitweb.committag.bugzilla.innerpattern
+git config --unset gitweb.committag.bugzilla.pattern
test_expect_success 'bugzilla: affects log view too' '
gitweb_run "p=.git;a=log" &&
grep -F -q \
- "<a class=\"text\" href=\"http://bts.example.com?bug=1234\">bug&nbsp;1234</a>" \
+ "<a class=\"text\" href=\"http://bts.example.com?bug=1234\">1234</a>" \
gitweb.output
'
test_debug 'cat gitweb.log'
test_debug 'grep 1234 gitweb.output'

+echo more_bugzilla > file.txt
+git add file.txt
+git commit -q -F - file.txt <<END
+[#123,#45] This commit fixes two bugs involving bar and baz.
+END
+git config gitweb.committag.bugzilla.pattern '^\[#\d+(, ?#\d+)\]'
+git config gitweb.committag.bugzilla.innerpattern '#(\d+)'
+git config gitweb.committag.bugzilla.url 'http://bugs/'
+test_expect_success 'bugzilla: override everything, use fancier url format' '
+ gitweb_run "p=.git;a=commit;h=HEAD" &&
+ grep -F -q \
+ "[<a class=\"text\" href=\"http://bugs/123\">#123</a>,<a class=\"text\" href=\"http://bugs/45\">#45</a>]" \
+ gitweb.output
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 123 gitweb.output'
+
+echo even_more_bugzilla > file.txt
+git add file.txt
+git commit -q -F - file.txt <<END
+Fix memory leak in confabulator from bug 123.
+
+Based on history from bugs 223, 224, and 225,
+fix bug 323 or 324.
+
+Bug: 423,424,425,426,427,428,429,430,431,432,435
+Resolves-bugs: #523 #524
+END
+git config --unset gitweb.committag.bugzilla.pattern
+git config --unset gitweb.committag.bugzilla.innerpattern
+git config --unset gitweb.committag.bugzilla.url
+gitweb_run "p=.git;a=commit;h=HEAD"
+test_expect_success 'bugzilla: fancy defaults: match one bug' '
+ grep -q "from&nbsp;bug&nbsp;<a[^>]*>123</a>." gitweb.output
+'
+test_expect_success 'bugzilla: fancy defaults: comma-separated list' '
+ grep -q \
+ "bugs&nbsp;<a[^>]*>223</a>,&nbsp;<a[^>]*>224</a>,&nbsp;and&nbsp;<a[^>]*>225</a>," \
+ gitweb.output
+'
+test_expect_success 'bugzilla: fancy defaults: or-pair' '
+ grep -q "bug&nbsp;<a[^>]*>323</a>&nbsp;or&nbsp;<a[^>]*>324</a>." \
+ gitweb.output
+'
+test_expect_success 'bugzilla: fancy defaults: comma-separated, caps, >10' '
+ grep -q \
+ "Bug:&nbsp;<a[^>]*>423</a>,<a[^>]*>424</a>,.*,<a[^>]*>435</a>" \
+ gitweb.output
+'
+test_expect_success 'bugzilla: fancy defaults: space-separated with hash' '
+ grep -q -e \
+ "-bugs:&nbsp;<a[^>]*>#523</a>&nbsp;<a[^>]*>#524</a>" \
+ gitweb.output
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 23 gitweb.output'
+
# ----------------------------------------------------------------------
# url linking
#
--
1.6.4.4
Marcel M. Cary
2009-11-18 06:22:24 UTC
Permalink
Thanks for the feedback. I've added four more patches to the end of
the series and updated the first two. My replies are below.
Post by Jakub Narebski
Thanks for diligently working on this issue. Good work!
I see that it is an RFC, and not final submission, but just in case
I'd like to remind you that some of information below should go into
commit message, but some of it should I think go to comments (between
"---" and diffstat).
Post by Marcel M. Cary
I want gitweb to hyperlink commits to my bug tracking system so that
information regarding the current status of a commit can be easily
cross-referenced. For example, the QA and release status of a commit
cannot be inserted into the comment. Maybe someday a "git notes"
feature will help with this, but for now, my organization has a
separate bug tracking system. Other repository browsers such as
unfuddle and websvn support similar features.
The paragraph above should, I think, be made more clear. You don't
need to mention what you don't do; the comment about "git notes" should
be not in commit message but in comments section.
What you want to have is to have some markers in commit message
(committags) hyperlinked; namely you want notifications about bug/issue
numbers in the commit message hyperlinked to appropriate bugtracker/issue
tracker URL. Do I understand this correctly?
I'm not sure I would call the context of the bug numbers
"notifications". A good example of what I'm looking for is to
hyperlink "Resolves-bug: 1234" to
http://bugzilla.example.com/show_bug.cgi?bug_id=1234. I suppose this
could be seen as a notification of bug 1234's resolution, except that
I expect the more complex bugs (bugs, issues, and features) to require
several commits to resolve, each of which I'd like tagged with that
bug number, and so the bug would not really be resolved after the
first such commit.
Post by Jakub Narebski
I tried here to reword what you said, to come up with better commit
message for a future final submission.
I've rewritten the commit message paragraph as you suggested. Is it
now more inline with what you'd like to see?
Post by Jakub Narebski
Post by Marcel M. Cary
Since the bug hyperlinking feature was previously discussed as part of
"committags," a more general mechanism to embellish commit messages,
implement the more general mechanism instead, including the following
Well, I think that the fact that it would be not much harder to create
general mechanism for commit message transformation, than to add
suitably generic and well customizable support for bugtracker
'committags'.
So far, the biggest difficulty in generalizing the bugzilla hyperlink
feature has been the interaction between two or more commit tag
transformations -- how to structure them so they are composable but
decoupled.
Post by Jakub Narebski
Post by Marcel M. Cary
* Hyperlinking mentions of bug IDs to Bugzilla
* Hyperlinking URLs
* Hyperlinking Message-Ids to a mailing list archive
* Hyperlinking commit hashes as before by default, now with a
configurable regex
* Defining new committags per gitweb installation
Well, there is one _implicit_ (but important) commit message
transformation (filter) for display, which has to be always present[1],
namely HTML escaping. We make use of the fact that you can do
HTML escaping before doing the only currently supported committag,
namely hyperlinking (shortened) SHA-1 to 'object' gitweb URL, but for
other committags like mentioned "Message-Id to mail archive" committag
(filter) it would make them more difficult.
The original patch still did the HTML escaping. I don't see much
value in implementing this transformation as a committag since it
wouldn't be useful to configure it, and I don't really see it making
the code more clear or concise. Did you have any particular reasons?
Post by Jakub Narebski
Also one might consider vertical whitespace simplification (removing
leading empty lines, compacting empty lines to single empty line
between paragraphs), and syntax highlighting signoff lines to be
a kind of commit message filter like mentioned above committags
(see git_print_log() subroutine).
Although probably vertical whitespace simplification should be not
made into commit message filter, as it is used not for all views.
Post by Marcel M. Cary
Since different repositories may use different bug tracking systems or
mailing list archives, the URL parameter may be configured
per-repository without reiterating the regexes. To accomodate
different conventions, regexes may also be configured per-project.
Also list of supported committags is separated from the list of
committags used[1]; just like it is done for snapshot formats.
This could be mentioned in final commit message.
[1] well, sequence rather than list in this case, as here
ordering does matter a bit
Post by Marcel M. Cary
This patch is heavily based on discussions and code samples from the
[RFC/PATCH] gitweb: Add committags support, Sep 2006
http://thread.gmane.org/gmane.comp.version-control.git/27504
[RFC] gitweb: Add committags support (take 2), Dec 2006
http://thread.gmane.org/gmane.comp.version-control.git/33150
[RFC] Configuring (future) committags support in gitweb, Nov 2008
http://thread.gmane.org/gmane.comp.version-control.git/100415
Hmmm... should this be put in final commit message, or only in comment
to the patch (should this be in commit history of git repository)?
I'll exclude the mailing list references from the commit message since
they describe how I arrived at this set of changes rather than
describing the changes themselves.
Post by Jakub Narebski
Post by Marcel M. Cary
* Should this configuration try to follow the bugtraq spec?
As far as I know, only subversion implements it. Separation of
regexes by a newline would be a little awkward in the git config.
And it is broader than just hyperlinking bugs: it also encompasses
GUI bug ID form fields. So gitweb would only implement a subset.
I didn't even know that there is such spec. Were you talking about
http://tortoisesvn.net/issuetracker_integration or do you have different
URL in mind?
That is the Bugtraq spec, although I had been looking at a text-only
version of it at the time. I can't seem to find the text-only
document, but I think that one explains it just as well.
Post by Jakub Narebski
Post by Marcel M. Cary
The gitweb configuration mechanism currently only reads
keys starting with "gitweb.", but these parameters would be more
broadly applicable, potentially to git-gui, for example.
Actually the fact that gitweb reads only keys in the 'gitweb' section
from config is just a convention. There were (are) no config variables
in other places (other sections) which would be of interest to gitweb.
Post by Marcel M. Cary
However, it *would* be useful for Git tools to standardize on
config keys and interpretations of regexes and url formats. For
example, git-gui might be able to hyperlink the same text as gitweb,
and even show a separate bugID field when composing a commit
message.
This is I think a very good idea... but I think idea which
implementation can be left for later.
Very well then, I'll leave everything related to the other git tools
for later.
Post by Jakub Narebski
Post by Marcel M. Cary
* I would prefer the regex match against the whole commit message.
This would allow the regex to insist that a bug reference occur
on the first line or non-first line of the commit message. However,
even if we concatenated the log lines for the first committag,
subsequent committags would see the text broken up.
Also, it would allow the regex to match a phrase split across a
line boundary, as dicussed at some length in the first thread,
but again, only if no prior committags had interfered.
This could happen in a later patch.
Well, the change should be fairly easy: just concatenate lines before
passing them as single element list to commit message filters. OTOH
you would have to take care of end of line characters in committags
regexps.
Yes, it seems manageable to process the whole commit message at once
rather than line by line. I've made this change to support patch 4 of
this series.
Post by Jakub Narebski
Post by Marcel M. Cary
* I would prefer the site admin have a way to let a repository
owner define new committags, which means having a way to specify
the 'sub' key from the repo config or having a flexible default.
Perhaps, following your earlier suggestion to make committags supported
also by other tools, gitweb (and e.g. git-gui / gitk) use config
variable committag.<name>.<key> (where <key> can be 'pattern' or 'url';
This is a great example of the kind of decision I'd like to make
up-front to avoid breaking existing configs later:
gitweb.committag.<name>.<key> vs. committag.<name>.<key>. I'm not
very interested in adding the git-gui / gitk feature myself. Is
anyone actually interested in this feature? If not, perhaps it should
stay under gitweb.committag.<name>.<key>. And even if there is
interest in the git-gui / gitk features, perhaps it would make sense
to start with the gitweb-specific version of the config variable names
and once there is cross-tool support, keep those configs as overrides
for gitweb only?
Post by Jakub Narebski
although I wonder if we can allow 'pattern' as malicious user can do
a DoS attack against gitweb / server using badly behaved regexp).
Yes, I imagine there is a risk of abuse in allowing patterns to be
configurable. That's one reason why the site config can disallow
per-project configuration of regexes. Are you suggesting there be a
separate flag to allow override of regexes than to allow override of
other parameters? For example, with a separate flag for regexes,
repo.or.cz could allow you to configure the bug tracker URL but not
the bug regex. I've added very fine grain support for allowing only
some options to be overridden in patch 3 of this series.
Post by Jakub Narebski
Post by Marcel M. Cary
The bugtraq and some of the regex questions must be decided now to
avoid breaking gitweb configs later.
True.
Post by Marcel M. Cary
---
gitweb/INSTALL | 4 +
gitweb/gitweb.perl | 221 +++++++++++++++++++++++++++++++-
t/t9500-gitweb-standalone-no-errors.sh | 150 +++++++++++++++++++++-
3 files changed, 367 insertions(+), 8 deletions(-)
diff --git a/gitweb/INSTALL b/gitweb/INSTALL
index 18c9ce3..223e39e 100644
--- a/gitweb/INSTALL
+++ b/gitweb/INSTALL
$feature{'snapshot'}{'default'} = ['zip', 'tgz'];
$feature{'snapshot'}{'override'} = 1;
+ $feature{'committags'}{'default'} = ['sha1', 'url', 'bugzilla'];
+ $feature{'committags'}{'override'} = 1;
+
+
Gitweb repositories
-------------------
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 1e7e2d8..c66fdf3 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -195,6 +195,81 @@ our %known_snapshot_format_aliases = (
'x-zip' => undef, '' => undef,
);
I understand that comments such as one below would be not present in
a final submission, and they are here to provide running commentary for
code, isn't it?
Yes. I've removed the running commentary, since it seems to be in the
way.
Post by Jakub Narebski
Post by Marcel M. Cary
+# Could call these something else besides committags... embellishments,
+# patterns, rewrite rules, ?
They are "commit filters", or "commit message filters" (or 'formatters',
or 'processors'; they are not 'parsers').
The name 'committag' was first introduced as far as I remember in xmms2
fork of gitweb (in old times when gitweb was separate project, and not
part of git repository).
Sounds as though there is no interest in changing the name to
something that more clearly describes the feature (asside from my own,
but I'd like a little more validation...). Let's stick with
committag.
Post by Jakub Narebski
Post by Marcel M. Cary
+#
+# In general, the site admin can enable/disable per-project configuration
+# of each committag. Only the 'options' part of the committag is configurable
+# per-project.
See above caveat about allowing to customize 'regexp'/'pattern' part
in untrusted environment; you can construct regexp which has exponential
behavior.
Post by Marcel M. Cary
+#
+# The site admin can of course add new tags to this hash or override the
+# 'sub' key if necessary. But such changes may be fragile; this is not
+# designed as a full-blown plugin architecture.
+our %committags = (
You should put the comments here about supported keys, similar to the
one for %known_snapshot_formats and %feature hashes.
I believe I've addressed your request by adding a comment that
applies to all tags. Because of the similarity of the tags, I don't
think it would be very useful to do this for each tag. Does the 'url'
option used on two of the tags need further explanation?
Post by Jakub Narebski
Post by Marcel M. Cary
+ # Link Git-style hashes to this gitweb
+ 'sha1' => {
+ 'options' => {
+ 'pattern' => qr/\b([0-9a-fA-F]{8,40})\b/,
+ },
+ 'override' => 0,
Shouldn't 'override' key be better last?
I liked 'override' close to the 'options' hash because that is what it
controls. The 'sub' key was not overridable per-project no matter how
you configure gitweb. Now it is, so maybe now 'override' could go
last. Or first. But really, I don't think the order matters much.
Post by Jakub Narebski
Post by Marcel M. Cary
+ 'sub' => sub {
+ \$cgi->a({-href => href(action=>"object", hash=>$match[1]),
+ -class => "text"}, esc_html($match[0], -nbsp=>1));
+ },
Style: although there is commonly used idiom to use 'sub { <expr>; }'
for a wrapper subroutines (e.g. 'sub { [] }' in Moose examples), one
should use explicit "return" statement instead of relying on Perl
behavior of returning last statement in a block.
See Perl::Critic::Policy::Subroutines::RequireFinalReturn policy in
Subroutines without explicit 'return' statements at their ends can be
confusing. It can be challenging to deduce what the return value will be.
Post by Marcel M. Cary
+ },
+ # Link bug/features to Mantis bug tracker using Mantis-style contextual cues
+ 'mantis' => {
+ 'options' => {
+ 'pattern' => qr/(?:BUG|FEATURE)\((\d+)\)/,
+ 'url' => 'http://bugs.xmms2.xmms.se/view.php?id=',
I don't think we want to put such URL here. Please check if Mantis
documentation uses some specific links, or follow RFC conventions and
use 'example.com' as hostname (e.g. 'bugs.example.com').
Ok, I'll use a neutral URL for the mantis default.
Post by Jakub Narebski
By the way the bugtraq proposal you mentioned uses placeholder in URL
for putting issue number (%BUGID%). Perhaps gitweb should do the same
here.
I'd rather use a regex-style or sprintf-style syntax that could be
used by all committags. I started with just an opaque string to
prepend to the bug ID because it was the simplest way to fulfill my
requirements, but it's certainly plausible that a BTS would want urls
like "example.com/bug/1234/detail" in which case the prepending
strategy isn't sufficient. Again, ideally we'd decide now whether to
use '$1' or '%s' so we don't break configs later. If the bugtraq
feature is implemented at some point, those config parameters can
still use the %BUGID% placeholder.
Post by Jakub Narebski
Post by Marcel M. Cary
+ },
+ 'override' => 0,
+ 'sub' => \&hyperlink_committag,
+ },
+ # Link mentions of bug IDs to bugzilla
+ 'bugzilla' => {
+ 'options' => {
+ 'pattern' => qr/bug\s+(\d+)/,
+ 'url' => 'http://bugzilla.kernel.org/show_bug.cgi?id=',
The same comment as above.
Post by Marcel M. Cary
+ },
+ 'override' => 0,
+ 'sub' => \&hyperlink_committag,
+ },
+ # Link URLs
+ 'url' => {
+ 'options' => {
+ # Avoid matching punctuation that might immediately follow
+ # a url, is not part of the url, and is allowed in urls,
+ # like a full-stop ('.').
If you took this regexp from some place (like blog), it would be good
to mention URL here, to be able to check more detailed explanation of
construction of this URL-catching regexp.
Actually, I think I started with a URL regex mentioned on-list and
then tweaked it until I thought it worked well enough.

Most of the regexes I found on the web had issues. They tended to be
too long because they were trying to precisely validate the URL or
because they were trying to parse all the different parts of the url,
or they did not deal with the trailing punctuation problem: I don't
want the period (".") to be highlighted in this context: "See
http://example.com/foo." I think it's a valid URL even with the
period, but typically the period will not be part of the URL. On the
other hand, "http://example.com/foo.html" should be completely
highlighted in spite of the period.
Post by Jakub Narebski
Should we also support irc://, nntp:// (pseudo)protocols? What about
git:// ?
I'va added more schemes, but it's also possible to leave those as a
customization. In my organization, it would be very uncommon for
someone to follow a URL with any of those schemes. I'd be willing to
allow any scheme that matches "[a-z]+://". I don't care about stuff
like "data:literal+text", and "mailto:" is also less interesting to
me. (I'd rather add a committag to match the email address without a
"mailto:".) And then there's those pesky windows file sharing
thingies like '\\server\directory' that work like URLs in Internet
Explorer. I'd rather not include them in the URL committag... but a
site admin could certainly configure a committag for it. Here's a
list schemes for which KDE allegedly supports retreival of metadata:

bluetooth, fish, ftp, imap(s), invitation, iso, ldap(s), mac, mdns,
nfs, nntp(s), nxfish, obex, pop3(s), print, printdb, sdp, service,
sftp, slp, smb, smtp(s), webdav(s)

I wouldn't want to enumerate all of those, but, in addition to the
three you suggest, these look most useful to me: sftp, smb, webdav(s),
nfs. So really I think it comes down to whether we want to enumerate
them or just match any scheme, and risk matching something that was
not intended as a URL. And if we enumerate, how many do we want to
list.

What do you think of the new list of schemes in patch 1? I've
included several more than before.
Post by Jakub Narebski
Post by Marcel M. Cary
+ },
+ 'override' => 0,
+ 'sub' => sub {
+ return
+ \$cgi->a({-href => $match[0],
+ -class => "text"},
+ esc_html($match[0], -nbsp=>1));
+ },
Here you use explicit return.
Post by Marcel M. Cary
+ },
+ # Link Message-Id to mailing list archive
+ 'messageid' => {
+ 'options' => {
+ # The original pattern, which I don't really understand
+ #'pattern' => qr!(?:message|msg)-id:?\s+<([^>]+)>;!i,
+ 'pattern' => qr!(?:message|msg)-?id:?\s+(<[^>]+>)!i,
Errr... how original patter is different from the one used? Also above
comment should be removed in final submission.
The most important change is that the semicolon is removed. I
also made the dash optional. It wasn't clear to me whether this was
intended to match a header-style reference:

Message-Id: <***@example.com>

Or a casual mention:

In msgid <***@example.com>, John Doe says...

But I like the latter, and would like the former to still be
supported.
Post by Jakub Narebski
Post by Marcel M. Cary
+ 'url' => 'http://news.gmane.org/find-root.php?message_id=',
Same comment about generic URL... although on the other hand perhaps
having a few examples of mail archive sites which support finding
messages by Message-Id could be a good idea.
BTW. you can write 'http://mid.gmane.org/' instead...
Thanks for the tip. Is there another common mail archive site
that allows looking up emails by message-id? If so, I'd love to
mention the url in the comment for that committag. I think we need to
strike a balance between having things work with minimal configuration
and avoiding the promotion of specific web sites that won't make sense
for most sites using a particular committag. So, unless there is a
whole slew of web sites that provide this feature, I'd suggest leaving
the specific URL in.
Post by Jakub Narebski
Post by Marcel M. Cary
+ },
+ 'override' => 0,
+ # The original version didn't include the "msg-id" text in the
+ # link text, but this does. In general, I think a little more
+ # context makes for better link text.
I guess that is the result of using generic hyperlink_committag()
subroutine here. (This comment should be removed or reworded in final
submitted version, I think.)
Yes, although if we decided it was sub-optimal link text, I'd be
happy to use a less generic mechanism.
Post by Jakub Narebski
BTW. it would be much easier with Perl6-ish (or Perl 5.10.x) named
'pattern' => qr!(?:message|msg)-?id:?\s+(?P<query><[^>]+>)!i,
or something like that.
I like the notion of names rather than numberic indices, but I'm
hesitant to require site admins to use unfamiliar Perl regex syntax to
configure committags. Of two admins I asked, neither could make sense
of the sample regex. If wouldn't want a site admin to *have* to learn
new syntax to configure regexes.
Post by Jakub Narebski
Post by Marcel M. Cary
+ 'sub' => \&hyperlink_committag,
+ },
+);
+
# You define site-wide feature defaults here; override them with
# $GITWEB_CONFIG as necessary.
our %feature = (
@@ -365,6 +440,21 @@ our %feature = (
'sub' => \&feature_patches,
'override' => 0,
'default' => [16]},
+
+ # The selection and ordering of committags that are enabled.
+ # Committag transformations will be applied to commit log messages
+ # in this order if listed here.
/this/given/
You need to mention somewhere that committag subroutines return a list
of mixed scalar and reference to scalar elements, where using reference
to scalar removes value from the chain of filters (including implicit
final esc_html filter).
I chose to add that explanation in the section that describes the
configuration of committags rather than in this section that defines
the list of active committags. Sound ok?
Post by Jakub Narebski
Post by Marcel M. Cary
+
+ # To disable system wide have in $GITWEB_CONFIG
+ # $feature{'committags'}{'default'} = [];
+ # To have project specific config enable override in $GITWEB_CONFIG
+ # $feature{'committags'}{'override'} = 1;
+ # and in project config gitweb.committags = sha1, url, bugzilla
+ # to enable those three committags for that project
Just a thought: perhaps we should provide support for 'default' in
config (which would currently be "sha1" or "sha1, url").
I didn't understand how this would be very useful until I added
the signoff committag. The use case I see is that it allows the
gitweb distribution to adjust the base functionality, in this case by
pushing some functionality into a committag, without project owners
needing to reconfigure their repositories. Site admins don't need
this because they can use "push" or "unshift" to preserve the
distributed committags, right? The project owner should get the
distribution default, not the site default, when they write "default",
right? Are there other use cases you had in mind? A potential
implementation is in patch 6.
Post by Jakub Narebski
and in project config, a comma-separated list of [...] or
"none" to disable.
Post by Marcel M. Cary
+ 'committags' => {
+ 'sub' => \&feature_committags,
+ 'override' => 0,
+ 'default' => ['sha1']},
);
sub gitweb_get_feature {
@@ -433,6 +523,18 @@ sub feature_patches {
return ($_[0]);
}
+sub feature_committags {
+
+ my ($cfg) = git_get_project_config('committags');
+
+ if ($cfg) {
+ return ($cfg eq 'none' ? () : split(/\s*[,\s]\s*/, $cfg));
+ }
+
+}
As this would be second feature which uses comma-separated (or for
backward compatibility space separated) list of options, perhaps
we should factor out this part into common helper subroutine named
for example 'feature_list' or 'feature_multi' (like 'feature_bool').
Thanks for pointing out the opportunity to fold the feature list
functions together.
Post by Jakub Narebski
Post by Marcel M. Cary
+
# checking HEAD file with -e is fragile if the repository was
# initialized long time ago (i.e. symlink HEAD) and was pack-ref'ed
# and then pruned.
@@ -814,6 +916,34 @@ $git_dir = "$projectroot/$project" if $project;
@snapshot_fmts = filter_snapshot_fmts(@snapshot_fmts);
+# ordering of committags
+
+# Merge project configs with default committag definitions
+gitweb_load_project_committags();
Good idea... although gitweb first defines and then uses subroutine,
see evaluate_path_info().
Sounds like you're asking me to move the function call after the
function definition. I've made the change, but I'm also curious to
know why you have that preference. I sometimes find it less readable,
as in this case.
Post by Jakub Narebski
Post by Marcel M. Cary
+
+# Load committag configs from the repository config file and and
+# incorporate them into the gitweb defaults where permitted by the
+# site administrator.
+sub gitweb_load_project_committags {
+ return if (!$git_dir);
+ my %project_config = ();
+ my %raw_config = git_parse_project_config('gitweb\.committag');
Why not do lazy-loading of a whole config here? We use committag
info only for project-specific actions in gitweb.
I thought the check for $git_dir would prevent loading it for
non-project-specific pages. Even so, I suppose we might load the
config on a page like the shortlog page that doesn't need it.
Post by Jakub Narebski
Post by Marcel M. Cary
+ foreach my $key (keys(%raw_config)) {
+ next if ($key !~ /gitweb\.committag\.[^.]+\.[^.]/);
+ my ($gitweb_prefix, $committag_prefix, $ctname, $option) =
+ split(/\./, $key, 4);
+ $project_config{$ctname}{$option} = $raw_config{$key};
+ }
And use created subroutines to handle config?
Are you saying I should be using other subroutines that already
exist in gitweb rather than implementing my own? Are you thinking of
git_get_project_config? If I understand correctly, it requires me to
enumerate all the config keys I want. I'd rather not require the
committags to have a predetermined set of possible config keys. I see
now that I could still call git_get_project_config to load data into
%config and access that directly... I didn't do it before because it
seems to violate some kind of abstraction.
Post by Jakub Narebski
Post by Marcel M. Cary
+ foreach my $ctname (keys(%committags)) {
+ next if (!$committags{$ctname}{'override'});
+ foreach my $optname (keys %{$project_config{$ctname}}) {
+ $committags{$ctname}{'options'}{$optname} =
+ $project_config{$ctname}{$optname};
+ }
+ }
+}
+
# dispatch
if (!defined $action) {
if (defined $hash) {
@@ -1384,13 +1514,92 @@ sub file_type_long {
sub format_log_line_html {
my $line = shift;
- $line = esc_html($line, -nbsp=>1);
- $line =~ s{\b([0-9a-fA-F]{8,40})\b}{
- $cgi->a({-href => href(action=>"object", hash=>$1),
- -class => "text"}, $1);
- }eg;
+ # In this list of log message fragments, a string ref indicates HTML,
+ # and a string indicates plain text
Well, to be more exact string ref means that the string referenced is
not to be processed by later filters, including final implicit esc_html.
Well... it means both not processing and HTML escaping. The string
refs will also not be escaped in the final processing step.
Post by Jakub Narebski
Post by Marcel M. Cary
- return $line;
+ next COMMITTAG unless exists $committags{$ctname};
+ my $committag = $committags{$ctname};
+
+ next COMMITTAG unless exists $committag->{'options'};
+ my $opts = $committag->{'options'};
+
+ next COMMITTAG unless exists $opts->{'pattern'};
+ my $pattern = $opts->{'pattern'};
+
+
+ next PART if $part eq "";
+ if (ref($part)) {
+ next PART;
+ }
+
+ my $oldpos = 0;
+
+ while ($part =~ m/$pattern/gc) {
+ my ($prepos, $postpos) = ($-[0], $+[0]);
+ my $repl = $committag->{'sub'}->($opts, $&, $1);
+ $repl = "" if (!defined $repl);
+
+ my $pre = substr($part, $oldpos, $prepos - $oldpos);
+
+ $oldpos = $postpos;
+ } # end while [regexp matches]
+
+ my $rest = substr($part, $oldpos);
+
+
+
+ # Escape any remaining plain text and concatenate
+ my $html = '';
+ if (ref($part)) {
+ $html .= $$part;
+ } else {
+ $html .= esc_html($part, -nbsp=>1);
+ }
+ }
+
+ return $html;
+}
Nice.
Hehe, that bit of code is mostly yours, which you posted on this list
ages ago. (: I suppose we should try to get your signoff into the
first patch of the series?
Post by Jakub Narebski
Post by Marcel M. Cary
+
+# Returns a ref to an HTML snippet that links the second
+# parameter to a URL formed from the first and last parameters.
+# This is a helper function used in %committags.
+sub hyperlink_committag {
+ return
+ \$cgi->a({-href => $opts->{url} . CGI::escape($match[1]),
$opts->{'url'} not $opts->{url}
'$cgi->escapeHTML' I think, not 'CGI::escape' (but I am not sure here).
besides, we can always import 'escape'.
Post by Marcel M. Cary
+ -class => "text"},
+ esc_html($match[0], -nbsp=>1));
+}
+
+
Hmmm... this would be first use of Perl subroutine prototypes in gitweb.
But this is made to imitate 'push' built-in, so I think it is O.K.
Ok, I guess I'll leave the subroutine prototype in then. I don't
understand all the implications, so if you think it'll work well
without the prototype, I'm happy to remove it. This is code
that I think you posted to the list.
Post by Jakub Narebski
Post by Marcel M. Cary
+ my $list = shift;
+
+ } else {
+
+ }
+ # imitate push
}
# format marker of refs pointing to given object
diff --git a/t/t9500-gitweb-standalone-no-errors.sh b/t/t9500-gitweb-standalone-no-errors.sh
index d539619..37a127c 100755
--- a/t/t9500-gitweb-standalone-no-errors.sh
+++ b/t/t9500-gitweb-standalone-no-errors.sh
@@ -55,9 +55,9 @@ gitweb_run () {
# some of git commands write to STDERR on error, but this is not
# we are interested only in properly formatted errors/warnings
- rm -f gitweb.log &&
+ rm -f resp.http gitweb.log &&
perl -- "$SCRIPT_NAME" \
- >/dev/null 2>gitweb.log &&
+ > resp.http 2>gitweb.log &&
if grep "^[[]" gitweb.log >/dev/null 2>&1; then false; else true; fi
Well, if you begin to check _output_ of gitweb, then it should be put
in separate test, not t/t9500-gitweb-standalone-no-errors.sh which is
only about no-errors... or change name of gitweb test.
I see there is some precedent for adding lib-foo.sh files. Perhaps it
would be appropriate to move parts of the no-errors test into a
lib-gitweb.sh? Like gitweb_init, most of gitweb_run, and maybe the
perl checks (which are that way for lib-git-svn.sh)? I'm all for
separating the tests. I don't like having to keep telling the test to
skip the first 85 cases when all I want to run is the committag stuff.

Actually, looks like Mark Rada already made a gitweb-lib.sh which was
exactly the same as the one I'd made except the name of the lib (vs.
lib-gitweb.sh) and the name of the file holding gitweb output. I've
rebased onto those changes.
Post by Jakub Narebski
I think I'm gonna leave this and the other feedback
Post by Marcel M. Cary
# gitweb.log is left for debugging
@@ -702,4 +702,150 @@ test_expect_success \
gitweb_run "p=.git;a=summary"'
test_debug 'cat gitweb.log'
+# ----------------------------------------------------------------------
+# sha1 linking
+#
+echo hi > file.txt
+git add file.txt
+git commit -q -F - file.txt <<END
+Summary
+
+See also commit 567890ab
+ h=$(git rev-parse --verify HEAD) &&
+ gitweb_run "p=.git;a=commit;h=$h" &&
Actually you can just use "h=HEAD" or use query without 'h' parameter
(which defaults to "HEAD") here.
Thanks for the tip, I'll use h=HEAD.
Post by Jakub Narebski
Post by Marcel M. Cary
+ grep -q \
+ "commit&nbsp;<a class=\"text\" href=\".*\">567890ab</a>" \
+ resp.http
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 567890ab resp.http'
I'd rather use Test::* (e.g. Test::WWW::Mechanize::CGI) for that...
but having some output test for gitweb, even in such simple form would
certainly be nice.
Would this mean people need to install some Mechanize stuff before
they can run the tests? I think I'd rather avoid the dependency as
long as grep seems to do the trick.
Post by Jakub Narebski
Post by Marcel M. Cary
+
+# ----------------------------------------------------------------------
+# bugzilla commit tag
+#
+
+echo foo > file.txt
+git add file.txt
+git commit -q -F - file.txt <<END
+Fix foo
+
+Fixes bug 1234 involving foo.
+END
+git config gitweb.committags 'sha1, bugzilla'
+test_expect_success 'bugzilla: enabled but not permitted' '
+ h=$(git rev-parse --verify HEAD) &&
+ gitweb_run "p=.git;a=commit;h=$h" &&
+ grep -F -q \
+ "Fixes&nbsp;bug&nbsp;1234&nbsp;involving" \
+ resp.http
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 1234 resp.http'
+
+echo '$feature{"committags"}{"override"} = 1;' >> gitweb_config.perl
+test_expect_success 'bugzilla: enabled' '
+ h=$(git rev-parse --verify HEAD) &&
+ gitweb_run "p=.git;a=commit;h=$h" &&
+ grep -F -q \
+ "Fixes&nbsp;<a class=\"text\" href=\"http://bugzilla.kernel.org/show_bug.cgi?id=1234\">bug&nbsp;1234</a>&nbsp;involving" \
+ resp.http
+'
Hmmm...
?



Marcel M. Cary (6):
gitweb: Hyperlink various committags in commit message with regex
gitweb: Add second-stage matching of bug IDs in bugzilla committag
gitweb: Allow finer-grained override controls for committags
gitweb: Allow committag pattern matches to span multiple lines
gitweb: Allow per-repository definition of new committags
gitweb: Add _defaults_ keyword for feature lists in project config

gitweb/INSTALL | 16 ++
gitweb/gitweb.perl | 404 +++++++++++++++++++++++++++++++++++++-----
t/t9502-gitweb-committags.sh | 309 ++++++++++++++++++++++++++++++++
3 files changed, 681 insertions(+), 48 deletions(-)
create mode 100755 t/t9502-gitweb-committags.sh
Marcel M. Cary
2009-11-18 06:22:25 UTC
Permalink
I want gitweb to hyperlink commits to my bug tracking system so that
information regarding the current status of a commit can be easily
cross-referenced. The QA and release status of a commit cannot be
directly inserted into the commit message because they change over
time. But if the commit message mentions a bug number, gitweb could
detect the bug reference in the message and hyperlink it to the bug
tracking system. Other repository browsers such as unfuddle and
websvn support similar features.

Currently only commit hashes are hyperlinked in this manner.

Since the bug hyperlinking feature was previously discussed as part of
"committags," a more general mechanism to embellish commit messages,
implement the more general mechanism instead, including the following
capabilities:

* Hyperlinking mentions of bug IDs to Bugzilla
* Hyperlinking URLs
* Hyperlinking Message-Ids to a mailing list archive
* Hyperlinking commit hashes, as before by default, now with a
configurable regex
* Defining new committags per gitweb installation

Since different repositories may use different bug tracking systems or
mailing list archives, the URL parameter may be configured
per-repository without reiterating the regexes. To accomodate
different conventions, regexes may also be configured per-project.

The order in which gitweb applies committags may be configured
per-project as well, because one committag may affect subsequent ones.
Inclusion in this sequence determines whether a committag is enabled
or not.

Signed-off-by: Marcel M. Cary <***@oak.homeunix.org>
---

One additional thing that occured to me is that maybe the hyperlinks
added by committags should have 'rel="nofollow"' by default? And if
so, maybe that needs to be configurable? On the other hand, I'm not
sure how useful it is to hide real URLs in the commit messages from
search engines... ?


gitweb/INSTALL | 5 +
gitweb/gitweb.perl | 247 +++++++++++++++++++++++++++++++++++++++---
t/t9502-gitweb-committags.sh | 150 +++++++++++++++++++++++++
3 files changed, 389 insertions(+), 13 deletions(-)
create mode 100755 t/t9502-gitweb-committags.sh

diff --git a/gitweb/INSTALL b/gitweb/INSTALL
index b76a0cf..9081ed8 100644
--- a/gitweb/INSTALL
+++ b/gitweb/INSTALL
@@ -132,6 +132,11 @@ adding the following lines to your $GITWEB_CONFIG:
$known_snapshot_formats{'zip'}{'disabled'} = 1;
$known_snapshot_formats{'tgz'}{'compressor'} = ['gzip','-6'];

+To add a committag to the default list of commit tags, for example to
+enable hyperlinking of bug numbers to a bug tracker for all projects:
+
+ push @{$feature{'committags'}{'default'}}, 'bugzilla';
+

Gitweb repositories
-------------------
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index e4cbfc3..2d72202 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -213,6 +213,97 @@ our %avatar_size = (
'double' => 32
);

+# In general, the site admin can enable/disable per-project
+# configuration of each committag. Only the 'options' part of the
+# committag is configurable per-project.
+#
+# The site admin can of course add new tags to this hash or override
+# the 'sub' key if necessary. But such changes may be fragile; this
+# is not designed as a full-blown plugin architecture. The 'sub' must
+# return a list of strings or string refs. The strings must contain
+# plain text and the string refs must contain HTML. The string refs
+# will not be processed further.
+#
+# For any committag, set the 'override' key to 1 to allow individual
+# projects to override entries in the 'options' hash for that tag.
+# For example, to match only commit hashes given in lowercase in one
+# project, add this to the $GITWEB_CONFIG:
+#
+# $committags{'sha1'}{'override'} = 1;
+#
+# And in the project's config:
+#
+# gitweb.committags.sha1.pattern = \\b([0-9a-f]{8,40})\\b
+#
+# Some committags have additional options whose interpretation depends
+# on the implementation of the 'sub' key. The hyperlink_committag
+# value appends the first captured group to the 'url' option.
+our %committags = (
+ # Link Git-style hashes to this gitweb
+ 'sha1' => {
+ 'options' => {
+ 'pattern' => qr/\b([0-9a-fA-F]{8,40})\b/,
+ },
+ 'override' => 0,
+ 'sub' => sub {
+ my ($opts, @match) = @_;
+ return \$cgi->a({-href => href(action=>"object", hash=>$match[1]),
+ -class => "text"},
+ esc_html($match[0], -nbsp=>1));
+ },
+ },
+ # Link bug/features to Mantis bug tracker using Mantis-style
+ # contextual cues
+ 'mantis' => {
+ 'options' => {
+ 'pattern' => qr/(?:BUG|FEATURE)\((\d+)\)/,
+ 'url' => 'http://www.example.com/mantisbt/view.php?id=',
+ },
+ 'override' => 0,
+ 'sub' => \&hyperlink_committag,
+ },
+ # Link mentions of bug IDs to bugzilla
+ 'bugzilla' => {
+ 'options' => {
+ 'pattern' => qr/bug\s+(\d+)/,
+ 'url' => 'http://bugzilla.example.com/show_bug.cgi?id=',
+ },
+ 'override' => 0,
+ 'sub' => \&hyperlink_committag,
+ },
+ # Link URLs
+ 'url' => {
+ 'options' => {
+ # Avoid matching punctuation that might immediately follow
+ # a url, is not part of the url, and is allowed in urls,
+ # like a full-stop ('.').
+ 'pattern' => qr!(https?|ftps?|git|ssh|ssh+git|sftp|smb|webdavs?|
+ nfs|irc|nntp|rsync)
+ ://[-_a-zA-Z0-9\@/&=+~#<>;%:.?]+
+ [-_a-zA-Z0-9\@/&=+~#<>]!x,
+ },
+ 'override' => 0,
+ 'sub' => sub {
+ my ($opts, @match) = @_;
+ return \$cgi->a({-href => $match[0],
+ -class => "text"},
+ esc_html($match[0], -nbsp=>1));
+ },
+ },
+ # Link Message-Id to mailing list archive
+ 'messageid' => {
+ 'options' => {
+ 'pattern' => qr!(?:message|msg)-?id:?\s+(<[^>]+>)!i,
+ 'url' => 'http://mid.gmane.org/',
+ },
+ 'override' => 0,
+ # Includes the "msg-id" text in the link text.
+ # Since we don't support linking multiple msg-ids in one match, we
+ # can include the "msg-id" in the link text for better context.
+ 'sub' => \&hyperlink_committag,
+ },
+);
+
# You define site-wide feature defaults here; override them with
# $GITWEB_CONFIG as necessary.
our %feature = (
@@ -258,7 +349,7 @@ our %feature = (
# and in project config, a comma-separated list of formats or "none"
# to disable. Example: gitweb.snapshot = tbz2,zip;
'snapshot' => {
- 'sub' => \&feature_snapshot,
+ 'sub' => sub { feature_list('snapshot', @_) },
'override' => 0,
'default' => ['tgz']},

@@ -417,6 +508,21 @@ our %feature = (
'sub' => \&feature_avatar,
'override' => 0,
'default' => ['']},
+
+ # The selection and ordering of committags that are enabled.
+ # Committag transformations will be applied to commit log messages
+ # in the order listed here if listed here.
+
+ # To disable system wide have in $GITWEB_CONFIG
+ # $feature{'committags'}{'default'} = [];
+ # To have project specific config enable override in $GITWEB_CONFIG
+ # $feature{'committags'}{'override'} = 1;
+ # and in project config gitweb.committags = sha1, url, bugzilla
+ # to enable those three committags for that project
+ 'committags' => {
+ 'sub' => sub { feature_list('committags', @_) },
+ 'override' => 0,
+ 'default' => ['sha1']},
);

sub gitweb_get_feature {
@@ -463,16 +569,16 @@ sub feature_bool {
}
}

-sub feature_snapshot {
- my (@fmts) = @_;
+sub feature_list {
+ my ($key, @defaults) = @_;

- my ($val) = git_get_project_config('snapshot');
+ my ($cfg) = git_get_project_config($key);

- if ($val) {
- @fmts = ($val eq 'none' ? () : split /\s*[,\s]\s*/, $val);
+ if ($cfg) {
+ return ($cfg eq 'none' ? () : split(/\s*[,\s]\s*/, $cfg));
}

- return @fmts;
+ return @defaults;
}

sub feature_patches {
@@ -886,6 +992,35 @@ if ($git_avatar eq 'gravatar') {
$git_avatar = '';
}

+# ordering of committags
+our @committags = gitweb_get_feature('committags');
+
+# whether we've loaded committags for the project yet
+our $loaded_project_committags = 0;
+
+# Load committag configs from the repository config file and and
+# incorporate them into the gitweb defaults where permitted by the
+# site administrator.
+sub gitweb_load_project_committags {
+ return if (!$git_dir || $loaded_project_committags);
+ my %project_config = ();
+ my %raw_config = git_parse_project_config('gitweb\.committag');
+ foreach my $key (keys(%raw_config)) {
+ next if ($key !~ /gitweb\.committag\.[^.]+\.[^.]/);
+ my ($gitweb_prefix, $committag_prefix, $ctname, $option) =
+ split(/\./, $key, 4);
+ $project_config{$ctname}{$option} = $raw_config{$key};
+ }
+ foreach my $ctname (keys(%committags)) {
+ next if (!$committags{$ctname}{'override'});
+ foreach my $optname (keys %{$project_config{$ctname}}) {
+ $committags{$ctname}{'options'}{$optname} =
+ $project_config{$ctname}{$optname};
+ }
+ }
+ $loaded_project_committags = 1;
+}
+
# dispatch
if (!defined $action) {
if (defined $hash) {
@@ -1458,13 +1593,99 @@ sub file_type_long {
sub format_log_line_html {
my $line = shift;

- $line = esc_html($line, -nbsp=>1);
- $line =~ s{\b([0-9a-fA-F]{8,40})\b}{
- $cgi->a({-href => href(action=>"object", hash=>$1),
- -class => "text"}, $1);
- }eg;
+ # Merge project configs with site default committag definitions if
+ # it hasn't been done yet
+ gitweb_load_project_committags();

- return $line;
+ # In this list of log message fragments, a string ref indicates
+ # HTML, and a string indicates plain text. The string refs are
+ # also currently not processed by subsequent committags.
+ my @message_fragments = ( $line );
+
+COMMITTAG:
+ foreach my $ctname (@committags) {
+ next COMMITTAG unless exists $committags{$ctname};
+ my $committag = $committags{$ctname};
+
+ next COMMITTAG unless exists $committag->{'sub'};
+ my $sub = $committag->{'sub'};
+
+ next COMMITTAG unless exists $committag->{'options'};
+ my $opts = $committag->{'options'};
+
+ next COMMITTAG unless exists $opts->{'pattern'};
+ my $pattern = $opts->{'pattern'};
+
+ my @new_message_fragments = ();
+
+ PART:
+ foreach my $fragment (@message_fragments) {
+ next PART if $fragment eq "";
+ if (ref($fragment)) {
+ push @new_message_fragments, $fragment;
+ next PART;
+ }
+
+ my $oldpos = 0;
+
+ MATCH:
+ while ($fragment =~ m/$pattern/gc) {
+ my ($prepos, $postpos) = ($-[0], $+[0]);
+ my $repl = $sub->($opts, $&, $1);
+ $repl = "" if (!defined $repl);
+
+ my $pre = substr($fragment, $oldpos, $prepos - $oldpos);
+ push_or_append(\@new_message_fragments, $pre);
+ push_or_append(\@new_message_fragments, $repl);
+
+ $oldpos = $postpos;
+ } # end while [regexp matches]
+
+ my $rest = substr($fragment, $oldpos);
+ push_or_append(\@new_message_fragments, $rest);
+
+ } # end foreach (@message_fragments)
+
+ @message_fragments = @new_message_fragments;
+ } # end foreach (@committags)
+
+ # Escape any remaining plain text and concatenate
+ my $html = '';
+ for my $fragment (@message_fragments) {
+ if (ref($fragment)) {
+ $html .= $$fragment;
+ } else {
+ $html .= esc_html($fragment, -nbsp=>1);
+ }
+ }
+
+ return $html;
+}
+
+# Returns a ref to an HTML snippet that links the whole match to a URL
+# formed from the 'url' option and the first captured subgroup. This
+# is a helper function used in %committags.
+sub hyperlink_committag {
+ my ($opts, @match) = @_;
+ return \$cgi->a({-href => $opts->{'url'} . $cgi->escape($match[1]),
+ -class => "text"},
+ esc_html($match[0], -nbsp=>1));
+}
+
+
+sub push_or_append (\@@) {
+ my $fragments = shift;
+
+ if (ref $_[0] || ! @$fragments || ref $fragments->[-1]) {
+ push @$fragments, @_;
+ } else {
+ my $a = pop @$fragments;
+ my $b = shift @_;
+
+ push @$fragments, $a . $b, @_;
+ }
+ # imitate push
+ return scalar @$fragments;
}

# format marker of refs pointing to given object
diff --git a/t/t9502-gitweb-committags.sh b/t/t9502-gitweb-committags.sh
new file mode 100755
index 0000000..f86cb3d
--- /dev/null
+++ b/t/t9502-gitweb-committags.sh
@@ -0,0 +1,150 @@
+#!/bin/sh
+
+test_description='gitweb committag tests.
+
+This test runs gitweb (git web interface) as CGI script from
+commandline, and checks that committags perform the expected
+transformations on log messages.'
+
+. ./gitweb-lib.sh
+
+# ----------------------------------------------------------------------
+# sha1 linking
+#
+echo sha1_test > file.txt
+git add file.txt
+git commit -q -F - file.txt <<END
+Summary
+
+See also commit 567890ab
+END
+test_expect_success 'sha1 link: enabled by default' '
+ gitweb_run "p=.git;a=commit;h=HEAD" &&
+ grep -q \
+ "commit&nbsp;<a class=\"text\" href=\".*\">567890ab</a>" \
+ gitweb.output
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 567890ab gitweb.output'
+
+# ----------------------------------------------------------------------
+# bugzilla commit tag
+#
+
+echo bugzilla_test > file.txt
+git add file.txt
+git commit -q -F - file.txt <<END
+Fix foo
+
+Fixes bug 1234 involving foo.
+END
+git config gitweb.committags 'sha1, bugzilla'
+test_expect_success 'bugzilla: enabled but not permitted' '
+ gitweb_run "p=.git;a=commit;h=HEAD" &&
+ grep -F -q \
+ "Fixes&nbsp;bug&nbsp;1234&nbsp;involving" \
+ gitweb.output
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 1234 gitweb.output'
+
+echo '$feature{"committags"}{"override"} = 1;' >> gitweb_config.perl
+test_expect_success 'bugzilla: enabled' '
+ gitweb_run "p=.git;a=commit;h=HEAD" &&
+ grep -F -q \
+ "Fixes&nbsp;<a class=\"text\" href=\"http://bugzilla.example.com/show_bug.cgi?id=1234\">bug&nbsp;1234</a>&nbsp;involving" \
+ gitweb.output
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 1234 gitweb.output'
+
+git config gitweb.committag.bugzilla.url 'http://bts.example.com?bug='
+test_expect_success 'bugzilla: url overridden but not permitted' '
+ gitweb_run "p=.git;a=commit;h=HEAD" &&
+ grep -F -q \
+ "Fixes&nbsp;<a class=\"text\" href=\"http://bugzilla.example.com/show_bug.cgi?id=1234\">bug&nbsp;1234</a>&nbsp;involving" \
+ gitweb.output
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 1234 gitweb.output'
+
+echo '$committags{"bugzilla"}{"override"} = 1;' >> gitweb_config.perl
+test_expect_success 'bugzilla: url overridden' '
+ gitweb_run "p=.git;a=commit;h=HEAD" &&
+ grep -F -q \
+ "Fixes&nbsp;<a class=\"text\" href=\"http://bts.example.com?bug=1234\">bug&nbsp;1234</a>&nbsp;involving" \
+ gitweb.output
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 1234 gitweb.output'
+
+git config gitweb.committag.bugzilla.pattern 'Fixes bug (\d+)'
+test_expect_success 'bugzilla: pattern overridden' '
+ gitweb_run "p=.git;a=commit;h=HEAD" &&
+ grep -F -q \
+ "<a class=\"text\" href=\"http://bts.example.com?bug=1234\">Fixes&nbsp;bug&nbsp;1234</a>&nbsp;involving" \
+ gitweb.output
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 1234 gitweb.output'
+git config --unset gitweb.committag.bugzilla.pattern
+
+test_expect_success 'bugzilla: affects log view too' '
+ gitweb_run "p=.git;a=log" &&
+ grep -F -q \
+ "<a class=\"text\" href=\"http://bts.example.com?bug=1234\">bug&nbsp;1234</a>" \
+ gitweb.output
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep 1234 gitweb.output'
+
+# ----------------------------------------------------------------------
+# url linking
+#
+echo url_test > file.txt
+git add file.txt
+url='http://***@pass:example.com/foo.html?u=v&x=y#z'
+url_esc="$(echo "$url" | sed 's/&/&amp;/g')"
+git commit -q -F - file.txt <<END
+Summary
+
+See also $url.
+END
+echo '$feature{"committags"}{"override"} = 1;' >> gitweb_config.perl
+git config gitweb.committags 'sha1, url'
+test_expect_success 'url link: links when enabled' '
+ gitweb_run "p=.git;a=commit;h=HEAD" &&
+ grep -q -F \
+ "See&nbsp;also&nbsp;<a class=\"text\" href=\"$url_esc\">$url_esc</a>." \
+ gitweb.output
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep -F "$url" gitweb.output'
+
+# ----------------------------------------------------------------------
+# message id linking
+#
+echo msgid_test > file.txt
+git add file.txt
+url='http://mid.gmane.org/'
+msgid='<***@y.z>'
+msgid_esc="$(echo "$msgid" | sed 's/</\&lt;/g; s/>/\&gt;/g')"
+msgid_url="$url$(echo "$msgid" | sed 's/</%3C/g; s/@/%40/g; s/>/%3E/g')"
+git commit -q -F - file.txt <<END
+Summary
+
+See msg-id $msgid.
+END
+echo '$feature{"committags"}{"override"} = 1;' >> gitweb_config.perl
+git config gitweb.committags 'sha1, messageid'
+test_expect_success 'msgid link: linked when enabled' '
+ gitweb_run "p=.git;a=commit;h=HEAD" &&
+ grep -q -F \
+ "See&nbsp;<a class=\"text\" href=\"$msgid_url\">msg-id&nbsp;$msgid_esc</a>." \
+ gitweb.output
+'
+test_debug 'cat gitweb.log'
+test_debug 'grep -F "y.z" gitweb.output'
+
+
+test_done
--
1.6.4.4
Petr Baudis
2009-11-18 08:20:24 UTC
Permalink
Post by Marcel M. Cary
One additional thing that occured to me is that maybe the hyperlinks
added by committags should have 'rel="nofollow"' by default? And if
so, maybe that needs to be configurable? On the other hand, I'm not
sure how useful it is to hide real URLs in the commit messages from
search engines... ?
I don't think rel="nofollow" is necessary.

BTW, wouldn't it be useful to do the matching in blob bodies as well?
And is it sensible to call these "committags" at all then? I already
made the mistake of calling content tags "ctags" and I regret it; I
think calling yet another thing tags after git tags and ctags is almost
unbearable.
Post by Marcel M. Cary
diff --git a/gitweb/INSTALL b/gitweb/INSTALL
index b76a0cf..9081ed8 100644
--- a/gitweb/INSTALL
+++ b/gitweb/INSTALL
$known_snapshot_formats{'zip'}{'disabled'} = 1;
$known_snapshot_formats{'tgz'}{'compressor'} = ['gzip','-6'];
+To add a committag to the default list of commit tags, for example to
+
+
Gitweb repositories
-------------------
I think this is not useful at all, since:

(i) The user will also *always* need to override the URL.
(ii) More importantly, the user has no idea what on earth commit
tags are.

Could you please prepend this paragraph with a short committags
description, e.g.:

"Gitweb can rewrite certain snippets of text in commit messages
to hyperlinks, e.g. URL addresses or bug tracker references - we
call these snippets 'committags'."

And you should also add

$committags{'buzilla'}{'options'}{'url'} = ...

to the explanation, together with a reference to the appropriate part of
gitweb.perl for more details.
Post by Marcel M. Cary
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index e4cbfc3..2d72202 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -213,6 +213,97 @@ our %avatar_size = (
'double' => 32
);
+# In general, the site admin can enable/disable per-project
+# configuration of each committag. Only the 'options' part of the
+# committag is configurable per-project.
The exact same problem here - it is not explained what committag
actually is and where it applies.
Post by Marcel M. Cary
+# The site admin can of course add new tags to this hash or override
+# the 'sub' key if necessary. But such changes may be fragile; this
+# is not designed as a full-blown plugin architecture. The 'sub' must
+# return a list of strings or string refs. The strings must contain
+# plain text and the string refs must contain HTML. The string refs
+# will not be processed further.
+#
+# For any committag, set the 'override' key to 1 to allow individual
+# projects to override entries in the 'options' hash for that tag.
+# For example, to match only commit hashes given in lowercase in one
+#
+# $committags{'sha1'}{'override'} = 1;
+#
+#
+# gitweb.committags.sha1.pattern = \\b([0-9a-f]{8,40})\\b
+#
+# Some committags have additional options whose interpretation depends
+# on the implementation of the 'sub' key. The hyperlink_committag
+# value appends the first captured group to the 'url' option.
+our %committags = (
+ # Link Git-style hashes to this gitweb
+ 'sha1' => {
+ 'options' => {
+ 'pattern' => qr/\b([0-9a-fA-F]{8,40})\b/,
+ },
+ 'override' => 0,
+ 'sub' => sub {
+ return \$cgi->a({-href => href(action=>"object", hash=>$match[1]),
+ -class => "text"},
+ esc_html($match[0], -nbsp=>1));
+ },
+ },
Ideally, a link should be made only in case the object exists, but this
is not trivial to implement without overhead of 1 exec per object, so I
guess it's fine to leave this for later (after all this feature was
already present). In that case, I think it would be useful to start
matching ids from 5 characters up - I use these quite frequently ;) -
but until then it would probably make for too much false positives.
Post by Marcel M. Cary
@@ -417,6 +508,21 @@ our %feature = (
'sub' => \&feature_avatar,
'override' => 0,
'default' => ['']},
+
+ # The selection and ordering of committags that are enabled.
+ # Committag transformations will be applied to commit log messages
+ # in the order listed here if listed here.
You should add something like "See %committags definition above for
explanation of committags and pre-defined committag classes."
Post by Marcel M. Cary
+ # To disable system wide have in $GITWEB_CONFIG
+ # $feature{'committags'}{'default'} = [];
+ # To have project specific config enable override in $GITWEB_CONFIG
+ # $feature{'committags'}{'override'} = 1;
+ # and in project config gitweb.committags = sha1, url, bugzilla
+ # to enable those three committags for that project
+ 'committags' => {
+ 'override' => 0,
+ 'default' => ['sha1']},
);
Would people consider it harmful to add 'url' to the default set?
Post by Marcel M. Cary
@@ -463,16 +569,16 @@ sub feature_bool {
}
}
-sub feature_snapshot {
+sub feature_list {
- my ($val) = git_get_project_config('snapshot');
+ my ($cfg) = git_get_project_config($key);
- if ($val) {
+ if ($cfg) {
+ return ($cfg eq 'none' ? () : split(/\s*[,\s]\s*/, $cfg));
}
}
sub feature_patches {
@@ -886,6 +992,35 @@ if ($git_avatar eq 'gravatar') {
$git_avatar = '';
}
+# ordering of committags
+
+# whether we've loaded committags for the project yet
+our $loaded_project_committags = 0;
+
+# Load committag configs from the repository config file and and
+# incorporate them into the gitweb defaults where permitted by the
+# site administrator.
+sub gitweb_load_project_committags {
+ return if (!$git_dir || $loaded_project_committags);
When can it happen that this is called and !$git_dir? In case it could
ever happen, why not allow the configuration at least in global gitweb
file?
Post by Marcel M. Cary
+ my %project_config = ();
+ my %raw_config = git_parse_project_config('gitweb\.committag');
+ foreach my $key (keys(%raw_config)) {
+ next if ($key !~ /gitweb\.committag\.[^.]+\.[^.]/);
+ my ($gitweb_prefix, $committag_prefix, $ctname, $option) =
+ split(/\./, $key, 4);
+ $project_config{$ctname}{$option} = $raw_config{$key};
+ }
+ foreach my $ctname (keys(%committags)) {
+ next if (!$committags{$ctname}{'override'});
+ foreach my $optname (keys %{$project_config{$ctname}}) {
+ $committags{$ctname}{'options'}{$optname} =
+ $project_config{$ctname}{$optname};
+ }
+ }
+ $loaded_project_committags = 1;
+}
For the next-conditions, I'd prefer unless formulation, but I guess
that's purely a matter of taste.
Post by Marcel M. Cary
+ my $fragments = shift;
+
+ } else {
+
+ }
+ # imitate push
This looks *quite* cryptic, a comment would be rather helpful.
--
Petr "Pasky" Baudis
A lot of people have my books on their bookshelves.
That's the problem, they need to read them. -- Don Knuth
Petr Baudis
2009-11-18 08:26:36 UTC
Permalink
Post by Marcel M. Cary
+ # Avoid matching punctuation that might immediately follow
+ # a url, is not part of the url, and is allowed in urls,
+ # like a full-stop ('.').
+ 'pattern' => qr!(https?|ftps?|git|ssh|ssh+git|sftp|smb|webdavs?|
+ nfs|irc|nntp|rsync)
You meant ssh\+git here. ;-)

Petr "Pasky" Baudis
Jakub Narebski
2009-11-20 23:24:28 UTC
Permalink
Thanks for the feedback. =A0I've added four more patches to the end o=
f
the series and updated the first two. =A0My replies are below.
=20
Thanks for working on this. I'll try to reply soon.

--=20
Jakub Narebski
Poland

Loading...