Discussion:
[RFC/PATCHv2 2/4 (resent)] gitweb: Add option to force version match
(too old to reply)
Jakub Narebski
2010-01-03 16:07:28 UTC
Permalink
From: John 'Warthog9' Hawley <***@kernel.org>

This adds $git_versions_must_match variable, which is set to true
value checks that we are running on the same version of git that we
shipped with, and if not throw '500 Internal Server Error' error.
What is checked is the version of gitweb (embedded in building
gitweb.cgi), against version of runtime git binary used.

Gitweb can usually run with a mismatched git install. This is more
here to give an obvious warning as to whats going on vs. silently
failing.

By default this feature is turned off.

Signed-off-by: John 'Warthog9' Hawley <***@kernel.org>
Signed-off-by: Jakub Narebski <***@gmail.com>
---
I don't quite see the reason behind such option, and I think that
error (instead of for example warning) on version mismatch is too much.

In short: this change is meant to avoid situation where version
mismatch results in silent error, but it would work only if it would
be default on; on the other hand having it default on is I think too
inconvenient, so I changed it to default off, which perhaps defeats
the stated purpose of this patch.

This is an RFC because formatting of error page is a bit rough, and
(ab)uses exist CSS classes instead of creating new classnames for
semantic markup.

Differences from original version, by J.H.:
* Changed name and flipped meaning of config variable, from
$missmatch_git to $git_versions_must_match
* $git_versions_must_match is boolean flag - do not compare with an
empty string.
* Changed error message a bit, fixed style, added entry in README

gitweb/README | 3 +++
gitweb/gitweb.perl | 33 +++++++++++++++++++++++++++++++++
2 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/gitweb/README b/gitweb/README
index 6c2c8e1..608b0f8 100644
--- a/gitweb/README
+++ b/gitweb/README
@@ -233,6 +233,9 @@ not include variables usually directly set during build):
If server load exceed this value then return "503 Service Unavailable" error.
Server load is taken to be 0 if gitweb cannot determine its value. Set it to
undefined value to turn it off. The default is 300.
+ * $git_versions_must_match
+ If set, gitweb fails with 500 Internal Server Error if the version of gitweb
+ doesn't match version of git binary. The default is false.


Projects list file format
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 3222131..b9bd865 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -221,6 +221,9 @@ our %avatar_size = (
'double' => 32
);

+# If it is true, exit if gitweb version and git binary version don't match
+our $git_versions_must_match = 0;
+
# Used to set the maximum load that we will still respond to gitweb queries.
# If server load exceed this value then return "503 server busy" error.
# If gitweb cannot determined server load, it is taken to be 0.
@@ -581,6 +584,36 @@ if (defined $maxload && get_loadavg() > $maxload) {
our $git_version = qx("$GIT" --version) =~ m/git version (.*)$/ ? $1 : "unknown";
$number_of_git_cmds++;

+# Throw an error if git versions does not match, if $git_versions_must_match is true.
+if ($git_versions_must_match &&
+ $git_version ne $version) {
+ git_header_html('500 - Internal Server Error');
+ my $admin_contact =
+ defined $ENV{'SERVER_ADMIN'} ? ", $ENV{'SERVER_ADMIN'}," : '';
+ print <<"EOT";
+<div class="page_body">
+<br /><br />
+500 - Internal Server Error
+<br />
+</div>
+<hr />
+<div class="readme">
+<h1 align="center">*** Warning ***</h1>
+<p>
+This version of gitweb was compiled for <b>@{[esc_html($version)]}</b>,
+however git version <b>@{[esc_html($git_version)]}</b> was found on server,
+and administrator requested strict version checking.
+</p>
+<p>
+Please contact the server administrator${admin_contact} to either configure
+gitweb to allow mismatched versions, or update git or gitweb installation.
+</p>
+</div>
+EOT
+ git_footer_html();
+ exit;
+}
+
$projects_list ||= $projectroot;

# ======================================================================
--
1.6.5.3
Jakub Narebski
2010-01-03 16:07:30 UTC
Permalink
From: John 'Warthog9' Hawley <***@kernel.org>

This commit adjust the main Makefile so you can simply run

make gitweb

which in turn calls gitweb/Makefile. This means that in order to
generate gitweb, you can simply run 'make' from gitweb subdirectory:

cd gitweb
make

Targets gitweb/gitweb.cgi and (dependent on JSMIN being defined)
gitweb/gitweb.min.js in main Makefile are preserved for backward
compatibility.

Signed-off-by: John 'Warthog9' Hawley <***@kernel.org>
Signed-off-by: Jakub Narebski <***@gmail.com>
---
This implements separate Makefile for gitweb, with main Makefile calling
it, like for gitk-git/, git-gui/, Documentation/, t/, and templates/
directories. The original patch by J.H. did it the opposite way, with
gitweb/Makefile calling main Makefile.

It is marked as RFC because I don't feel that my make-fu is strong enough
to be sure that there are no errors / mistakes. Very slightly tested.

Makefile | 65 +++++----------------------
gitweb/Makefile | 129 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 141 insertions(+), 53 deletions(-)
create mode 100644 gitweb/Makefile

diff --git a/Makefile b/Makefile
index c11719c..9c6d2b2 100644
--- a/Makefile
+++ b/Makefile
@@ -282,29 +282,6 @@ pathsep = :
# JavaScript minifier invocation that can function as filter
JSMIN =

-# default configuration for gitweb
-GITWEB_CONFIG = gitweb_config.perl
-GITWEB_CONFIG_SYSTEM = /etc/gitweb.conf
-GITWEB_HOME_LINK_STR = projects
-GITWEB_SITENAME =
-GITWEB_PROJECTROOT = /pub/git
-GITWEB_PROJECT_MAXDEPTH = 2007
-GITWEB_EXPORT_OK =
-GITWEB_STRICT_EXPORT =
-GITWEB_BASE_URL =
-GITWEB_LIST =
-GITWEB_HOMETEXT = indextext.html
-GITWEB_CSS = gitweb.css
-GITWEB_LOGO = git-logo.png
-GITWEB_FAVICON = git-favicon.png
-ifdef JSMIN
-GITWEB_JS = gitweb.min.js
-else
-GITWEB_JS = gitweb.js
-endif
-GITWEB_SITE_HEADER =
-GITWEB_SITE_FOOTER =
-
export prefix bindir sharedir sysconfdir

CC = gcc
@@ -1526,6 +1503,11 @@ $(patsubst %.perl,%,$(SCRIPT_PERL)): % : %.perl
chmod +x $@+ && \
mv $@+ $@

+
+.PHONY: gitweb
+gitweb:
+ $(QUIET_SUBDIR0)gitweb $(QUIET_SUBDIR1) all
+
ifdef JSMIN
OTHER_PROGRAMS += gitweb/gitweb.cgi gitweb/gitweb.min.js
gitweb/gitweb.cgi: gitweb/gitweb.perl gitweb/gitweb.min.js
@@ -1533,30 +1515,13 @@ else
OTHER_PROGRAMS += gitweb/gitweb.cgi
gitweb/gitweb.cgi: gitweb/gitweb.perl
endif
- $(QUIET_GEN)$(RM) $@ $@+ && \
- sed -e '1s|#!.*perl|#!$(PERL_PATH_SQ)|' \
- -e 's|++GIT_VERSION++|$(GIT_VERSION)|g' \
- -e 's|++GIT_BINDIR++|$(bindir)|g' \
- -e 's|++GITWEB_CONFIG++|$(GITWEB_CONFIG)|g' \
- -e 's|++GITWEB_CONFIG_SYSTEM++|$(GITWEB_CONFIG_SYSTEM)|g' \
- -e 's|++GITWEB_HOME_LINK_STR++|$(GITWEB_HOME_LINK_STR)|g' \
- -e 's|++GITWEB_SITENAME++|$(GITWEB_SITENAME)|g' \
- -e 's|++GITWEB_PROJECTROOT++|$(GITWEB_PROJECTROOT)|g' \
- -e 's|"++GITWEB_PROJECT_MAXDEPTH++"|$(GITWEB_PROJECT_MAXDEPTH)|g' \
- -e 's|++GITWEB_EXPORT_OK++|$(GITWEB_EXPORT_OK)|g' \
- -e 's|++GITWEB_STRICT_EXPORT++|$(GITWEB_STRICT_EXPORT)|g' \
- -e 's|++GITWEB_BASE_URL++|$(GITWEB_BASE_URL)|g' \
- -e 's|++GITWEB_LIST++|$(GITWEB_LIST)|g' \
- -e 's|++GITWEB_HOMETEXT++|$(GITWEB_HOMETEXT)|g' \
- -e 's|++GITWEB_CSS++|$(GITWEB_CSS)|g' \
- -e 's|++GITWEB_LOGO++|$(GITWEB_LOGO)|g' \
- -e 's|++GITWEB_FAVICON++|$(GITWEB_FAVICON)|g' \
- -e 's|++GITWEB_JS++|$(GITWEB_JS)|g' \
- -e 's|++GITWEB_SITE_HEADER++|$(GITWEB_SITE_HEADER)|g' \
- -e 's|++GITWEB_SITE_FOOTER++|$(GITWEB_SITE_FOOTER)|g' \
- $< >$@+ && \
- chmod +x $@+ && \
- mv $@+ $@
+ $(QUIET_SUBDIR0)gitweb $(QUIET_SUBDIR1) $(patsubst gitweb/%,%,$@)
+
+ifdef JSMIN
+gitweb/gitweb.min.js: gitweb/gitweb.js
+ $(QUIET_SUBDIR0)gitweb $(QUIET_SUBDIR1) $(patsubst gitweb/%,%,$@)
+endif # JSMIN
+

git-instaweb: git-instaweb.sh gitweb/gitweb.cgi gitweb/gitweb.css gitweb/gitweb.js
$(QUIET_GEN)$(RM) $@ $@+ && \
@@ -1583,12 +1548,6 @@ $(patsubst %.perl,%,$(SCRIPT_PERL)) git-instaweb: % : unimplemented.sh
mv $@+ $@
endif # NO_PERL

-
-ifdef JSMIN
-gitweb/gitweb.min.js: gitweb/gitweb.js
- $(QUIET_GEN)$(JSMIN) <$< >$@
-endif # JSMIN
-
ifndef NO_PYTHON
$(patsubst %.py,%,$(SCRIPT_PYTHON)): GIT-CFLAGS
$(patsubst %.py,%,$(SCRIPT_PYTHON)): % : %.py
diff --git a/gitweb/Makefile b/gitweb/Makefile
new file mode 100644
index 0000000..c9eb1ee
--- /dev/null
+++ b/gitweb/Makefile
@@ -0,0 +1,129 @@
+# The default target of this Makefile is...
+all::
+
+# Define V=1 to have a more verbose compile.
+#
+# Define JSMIN to point to JavaScript minifier that functions as
+# a filter to have gitweb.js minified.
+#
+
+prefix ?= $(HOME)
+bindir ?= $(prefix)/bin
+RM ?= rm -f
+
+# JavaScript minifier invocation that can function as filter
+JSMIN ?=
+
+# default configuration for gitweb
+GITWEB_CONFIG = gitweb_config.perl
+GITWEB_CONFIG_SYSTEM = /etc/gitweb.conf
+GITWEB_HOME_LINK_STR = projects
+GITWEB_SITENAME =
+GITWEB_PROJECTROOT = /pub/git
+GITWEB_PROJECT_MAXDEPTH = 2007
+GITWEB_EXPORT_OK =
+GITWEB_STRICT_EXPORT =
+GITWEB_BASE_URL =
+GITWEB_LIST =
+GITWEB_HOMETEXT = indextext.html
+GITWEB_CSS = gitweb.css
+GITWEB_LOGO = git-logo.png
+GITWEB_FAVICON = git-favicon.png
+ifdef JSMIN
+GITWEB_JS = gitweb.min.js
+else
+GITWEB_JS = gitweb.js
+endif
+GITWEB_SITE_HEADER =
+GITWEB_SITE_FOOTER =
+
+# include user config
+-include ../config.mak.autogen
+-include ../config.mak
+
+# determine version
+../GIT-VERSION-FILE: .FORCE-GIT-VERSION-FILE
+ $(QUIET_SUBDIR0)../ $(QUIET_SUBDIR1) GIT-VERSION-FILE
+
+-include ../GIT-VERSION-FILE
+
+### Build rules
+
+SHELL_PATH ?= $(SHELL)
+PERL_PATH ?= /usr/bin/perl
+
+# Shell quote;
+bindir_SQ = $(subst ','\'',$(bindir)) #'
+SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH)) #'
+PERL_PATH_SQ = $(subst ','\'',$(PERL_PATH)) #'
+
+# Quiet generation (unless V=1)
+QUIET_SUBDIR0 = +$(MAKE) -C # space to separate -C and subdir
+QUIET_SUBDIR1 =
+
+ifneq ($(findstring $(MAKEFLAGS),w),w)
+PRINT_DIR = --no-print-directory
+else # "make -w"
+NO_SUBDIR = :
+endif
+
+ifneq ($(findstring $(MAKEFLAGS),s),s)
+ifndef V
+ QUIET = @
+ QUIET_GEN = $(QUIET)echo ' ' GEN $@;
+ QUIET_SUBDIR0 = +@subdir=
+ QUIET_SUBDIR1 = ;$(NO_SUBDIR) echo ' ' SUBDIR $$subdir; \
+ $(MAKE) $(PRINT_DIR) -C $$subdir
+ export V
+ export QUIET
+ export QUIET_GEN
+ export QUIET_SUBDIR0
+ export QUIET_SUBDIR1
+endif
+endif
+
+all:: gitweb.cgi
+
+ifdef JSMIN
+FILES=gitweb.cgi gitweb.min.js
+gitweb.cgi: gitweb.perl gitweb.min.js
+else # !JSMIN
+FILES=gitweb.cgi
+gitweb.cgi: gitweb.perl
+endif # JSMIN
+
+gitweb.cgi:
+ $(QUIET_GEN)$(RM) $@ $@+ && \
+ sed -e '1s|#!.*perl|#!$(PERL_PATH_SQ)|' \
+ -e 's|++GIT_VERSION++|$(GIT_VERSION)|g' \
+ -e 's|++GIT_BINDIR++|$(bindir)|g' \
+ -e 's|++GITWEB_CONFIG++|$(GITWEB_CONFIG)|g' \
+ -e 's|++GITWEB_CONFIG_SYSTEM++|$(GITWEB_CONFIG_SYSTEM)|g' \
+ -e 's|++GITWEB_HOME_LINK_STR++|$(GITWEB_HOME_LINK_STR)|g' \
+ -e 's|++GITWEB_SITENAME++|$(GITWEB_SITENAME)|g' \
+ -e 's|++GITWEB_PROJECTROOT++|$(GITWEB_PROJECTROOT)|g' \
+ -e 's|"++GITWEB_PROJECT_MAXDEPTH++"|$(GITWEB_PROJECT_MAXDEPTH)|g' \
+ -e 's|++GITWEB_EXPORT_OK++|$(GITWEB_EXPORT_OK)|g' \
+ -e 's|++GITWEB_STRICT_EXPORT++|$(GITWEB_STRICT_EXPORT)|g' \
+ -e 's|++GITWEB_BASE_URL++|$(GITWEB_BASE_URL)|g' \
+ -e 's|++GITWEB_LIST++|$(GITWEB_LIST)|g' \
+ -e 's|++GITWEB_HOMETEXT++|$(GITWEB_HOMETEXT)|g' \
+ -e 's|++GITWEB_CSS++|$(GITWEB_CSS)|g' \
+ -e 's|++GITWEB_LOGO++|$(GITWEB_LOGO)|g' \
+ -e 's|++GITWEB_FAVICON++|$(GITWEB_FAVICON)|g' \
+ -e 's|++GITWEB_JS++|$(GITWEB_JS)|g' \
+ -e 's|++GITWEB_SITE_HEADER++|$(GITWEB_SITE_HEADER)|g' \
+ -e 's|++GITWEB_SITE_FOOTER++|$(GITWEB_SITE_FOOTER)|g' \
+ $< >$@+ && \
+ chmod +x $@+ && \
+ mv $@+ $@
+
+ifdef JSMIN
+gitweb.min.js: gitweb.js
+ $(QUIET_GEN)$(JSMIN) <$< >$@
+endif # JSMIN
+
+clean:
+ $(RM) $(FILES)
+
+.PHONY: all clean .FORCE-GIT-VERSION-FILE
--
1.6.5.3
Jakub Narebski
2010-01-03 16:07:29 UTC
Permalink
From: John 'Warthog9' Hawley <***@kernel.org>

This adds a "git" link for each project in the project list page,
should a common $gitlinkurl_base be defined and not empty. The full
URL of each link is composed of $gitlinkurl_base and project name.
It is intended for git:// links, and in fact GITWEB_BASE_URL build
variable is used as its default value only if it starts with git://

This does make the assumption that the git repositories share a common
path. Nothing to date is known to actually make use of introduced
link.

Created "git" link follows rel=vcs-* microformat specification:
http://kitenet.net/~joey/rfc/rel-vcs/

Signed-off-by: John 'Warthog9' Hawley <***@kernel.org>
Signed-off-by: Jakub Narebski <***@gmail.com>
---
I think it might be good idea... but for the fact "Nothing to date is
known to actually make use of introduced link". What's its intended
use?

Differences to original version by John 'Warthog9' Hawley (J.H.):
* It doesn't cause syntax error ;-)
* Escaping of attribute value is left to CGI.pm (avoid double escaping)
* $gitlinkurl got renamed to $gitlinkurl_base, now includes git://
prefix, and defaults to GITWEB_BASE_URL if it begins with git://
* Added description of $gitlinkurl_base to gitweb/README
* Uses rel=vcs-* microformat by Joey Hess

I assume that nobody sane would define $gitlinkurl_base to "0";
the code assumes that is enough to check that $gitlinkurl_base
is true-ish.

gitweb/README | 4 ++++
gitweb/gitweb.perl | 8 ++++++++
2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/gitweb/README b/gitweb/README
index 608b0f8..36fb059 100644
--- a/gitweb/README
+++ b/gitweb/README
@@ -71,6 +71,7 @@ You can specify the following configuration variables when building GIT:
* GITWEB_BASE_URL
Git base URLs used for URL to where fetch project from, i.e. full
URL is "$git_base_url/$project". Shown on projects summary page.
+ If it begins with "git://" it is also used for $gitlinkurl_base, see below.
Repository URL for project can be also configured per repository; this
takes precedence over URLs composed from base URL and a project name.
Note that you can setup multiple base URLs (for example one for
@@ -204,6 +205,9 @@ not include variables usually directly set during build):
access, and one for http:// "dumb" protocol access). Note that per
repository configuration in 'cloneurl' file, or as values of gitweb.url
project config.
+ * $gitlinkurl_base
+ Git base URL used (if it is defined and not empty) for "git" link in
+ projects list, for each project. Full URL is "$gitlinkurl_base/$project".
* $default_blob_plain_mimetype
Default mimetype for blob_plain (raw) view, if mimetype checking
doesn't result in some other type; by default 'text/plain'.
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index b9bd865..efb6471 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -224,6 +224,10 @@ our %avatar_size = (
# If it is true, exit if gitweb version and git binary version don't match
our $git_versions_must_match = 0;

+# If this variable is set and not empty, add an extra link called "git"
+# for each project in project list. Full URL is "$gitlinkurl_base/$project".
+our $gitlinkurl_base = ("++GITWEB_BASE_URL++" =~ m!^(git://.*)$!) ? $1 : '';
+
# Used to set the maximum load that we will still respond to gitweb queries.
# If server load exceed this value then return "503 server busy" error.
# If gitweb cannot determined server load, it is taken to be 0.
@@ -4472,6 +4476,10 @@ sub git_project_list_body {
$cgi->a({-href => href(project=>$pr->{'path'}, action=>"log")}, "log") . " | " .
$cgi->a({-href => href(project=>$pr->{'path'}, action=>"tree")}, "tree") .
($pr->{'forks'} ? " | " . $cgi->a({-href => href(project=>$pr->{'path'}, action=>"forks")}, "forks") : '') .
+ ($gitlinkurl_base ?
+ " | " . $cgi->a({-href=>"$gitlinkurl_base/$pr->{'path'}",
+ -rel=>"vcs-git"}, "git")
+ : '') .
"</td>\n" .
"</tr>\n";
}
--
1.6.5.3
Jakub Narebski
2010-01-09 11:20:19 UTC
Permalink
Post by Jakub Narebski
This adds a "git" link for each project in the project list page,
should a common $gitlinkurl_base be defined and not empty. The full
URL of each link is composed of $gitlinkurl_base and project name.
It is intended for git:// links, and in fact GITWEB_BASE_URL build
variable is used as its default value only if it starts with git://
This does make the assumption that the git repositories share a common
path. Nothing to date is known to actually make use of introduced
link.
http://kitenet.net/~joey/rfc/rel-vcs/
---
I think it might be good idea... but for the fact "Nothing to date is
known to actually make use of introduced link". What's its intended
use?
* It doesn't cause syntax error ;-)
* Escaping of attribute value is left to CGI.pm (avoid double escaping)
* $gitlinkurl got renamed to $gitlinkurl_base, now includes git://
prefix, and defaults to GITWEB_BASE_URL if it begins with git://
* Added description of $gitlinkurl_base to gitweb/README
* Uses rel=vcs-* microformat by Joey Hess
gitweb/README | 4 ++++
gitweb/gitweb.perl | 8 ++++++++
2 files changed, 12 insertions(+), 0 deletions(-)
A reminder - this patch series consists of the following patches:
[PATCHv2 1/4 (resent)] gitweb: Load checking
[RFC/PATCHv2 2/4 (resent)] gitweb: Add option to force version match
[PATCHv3 3/4 (resent)] gitweb: Optionally add "git" links in project list page
[PATCHv2/RFC 4/4 (resent)] gitweb: Makefile improvements
Ok I've been debating this as I've been going through the patches, I've
got small modifications on top of your patches Jakub for 1 and 2,
haven't pushed them yet but they are relatively trivial. The changes to
the first patch sets things up for additional load checkers to be added
later on.
Good idea, although I think that such addition can be left for a separate
patch.

By the way, are you doing if-elsif fallback chain, trying different
mechanisms (like '/proc/loadavg', BSD::getloadavg, etc.), or did
you made get_loadavg() into code reference, i.e. run it with
$get_loadavg->(), which has the advantage that the gitweb admin can
override it in gitweb config file (including such thing like simply
using load average over last 5 minutes, and not over last minute)?
The second changes the error message to use/abuse die_error()
vs. doing it's own thing (though I still think this should be on by
default).
True, the error message could use improvement (and not only using
its own class instead of abusing 'readme' class, or renaming 'readme'
class to something more generic). The problem with error message for
this is who is the target of this message: is it gitweb administrator
(who can change gitweb configuration), or is it gitweb user (who need
to contact web admin).

The problem with this patch is that for it to be useful for protecting
against silent errors it should be on by default, but OTOH having it on
by default is quite inconvenient.

Best solution would be to treat core of this issue, namely eliminate
silent errors and always provide some message in case of error.
Patch 4 I don't have anything to add or change at this point.
This patch has me pondering and I'm unsure of what I'd suggest, mainly
because of the addition of the smart http support meaning that git://
and http:// are legitimate and useful links for supporting full git
transactions.
I may withdraw the patch entirely since the link on kernel.org has been
around for years, and I'm unsure if anything actually uses it (though I
can see it being useful still). If it stays I think there's got to be a
way to specifically mark a url as being the one to link to vs.
defaulting to git:// (or allow for a marking to override the git://) and
I need to ponder that.
Also, it has to be _fast_, I think, i.e. no reading cloneurl and repo
config (for gitweb.url) for each repository.

You can always remove the check for "git://" prefix, and/or take first
I have given some initial thought to converting the $output options I'm
currently using to a print <FH> that Jakub is suggesting & exploring.
It's 'print {$fh}', i.e. use indirect filehandle, not global filehandle.
I think all told it's going to be a similarly sized patch, since all
output still has to get adjusted (including the things that directly
output but don't print).
print -> print {$fh} can be separate patch, and it can be checked that
it produces the same results. Well print -> $output .= could also be
separate patch...
I'm unsure if there's a real advantage to
either way, other than design preference. My patch (forcing the output
to get passed around) moves towards more of a modal style design
separating data & layout vs. it's combined nature now, well it's a step
in that direction anyway.
Errr... what? Forcing buffering (you need to read whole output into
memory, including for snapshots (uncompressed in case of .tar.gz))
is IMVHO orthogonal to the issue of separating data & layout.
BTW. Modern web server interfaces like Rack, PSGI/Plack etc. explicitly
include streaming support.

The advantage of doing 'print {$fh}' is that $fh can be \*STDOUT, can
be \$buffer, but can be filehandle to (temporary) file on disk, and
you can even "tee" it, i.e. both write to memory/file, and to STDOUT.
The number of possible choices / possible improvements is much larger.

And what is also important it means that people who do not use caching
do not suffer latency penalty and memory pressure from caching
infrastructure they do not use.


P.S.
Subject: [Virus] Exploit.PDF-9669
X-Virus-Scanned: ClamAV 0.88.7/10275/Fri Jan 8 17:06:46 2010 on shards.monkeyblade.net
X-Virus-Status: Infected with Exploit.PDF-9669
X-Original-Subject: Re: [PATCHv3 3/4 (resent)] gitweb: Optionally add "git"
links in project list page

A message sent from <***@eaglescrag.net> to
<***@gmail.com>
<***@vger.kernel.org>
<***@kernel.org>
contained Exploit.PDF-9669 and has not been delivered.
--
Jakub Narebski
Poland
J.H.
2010-01-12 00:39:39 UTC
Permalink
Post by Jakub Narebski
Post by Jakub Narebski
This adds a "git" link for each project in the project list page,
should a common $gitlinkurl_base be defined and not empty. The full
URL of each link is composed of $gitlinkurl_base and project name.
It is intended for git:// links, and in fact GITWEB_BASE_URL build
variable is used as its default value only if it starts with git://
This does make the assumption that the git repositories share a common
path. Nothing to date is known to actually make use of introduced
link.
http://kitenet.net/~joey/rfc/rel-vcs/
---
I think it might be good idea... but for the fact "Nothing to date is
known to actually make use of introduced link". What's its intended
use?
* It doesn't cause syntax error ;-)
* Escaping of attribute value is left to CGI.pm (avoid double escaping)
* $gitlinkurl got renamed to $gitlinkurl_base, now includes git://
prefix, and defaults to GITWEB_BASE_URL if it begins with git://
* Added description of $gitlinkurl_base to gitweb/README
* Uses rel=vcs-* microformat by Joey Hess
gitweb/README | 4 ++++
gitweb/gitweb.perl | 8 ++++++++
2 files changed, 12 insertions(+), 0 deletions(-)
[PATCHv2 1/4 (resent)] gitweb: Load checking
[RFC/PATCHv2 2/4 (resent)] gitweb: Add option to force version match
[PATCHv3 3/4 (resent)] gitweb: Optionally add "git" links in project list page
[PATCHv2/RFC 4/4 (resent)] gitweb: Makefile improvements
Ok I've been debating this as I've been going through the patches, I've
got small modifications on top of your patches Jakub for 1 and 2,
haven't pushed them yet but they are relatively trivial. The changes to
the first patch sets things up for additional load checkers to be added
later on.
Good idea, although I think that such addition can be left for a separate
patch.
It only set things up for additional checkers, adding a framework for
them to be added. I would agree that other load checkers should be in
additional patches, just trying to make it easy for other later on.
Post by Jakub Narebski
By the way, are you doing if-elsif fallback chain, trying different
mechanisms (like '/proc/loadavg', BSD::getloadavg, etc.), or did
you made get_loadavg() into code reference, i.e. run it with
$get_loadavg->(), which has the advantage that the gitweb admin can
override it in gitweb config file (including such thing like simply
using load average over last 5 minutes, and not over last minute)?
I had set it up using the if-elsif fallback chain. Not sure that an
admin would be overriding it for the most part, but it probably wouldn't
be hard to do both. Decent defaults for the general case, and if anyone
wants they can override it afterwards for special cases.
Post by Jakub Narebski
The second changes the error message to use/abuse die_error()
vs. doing it's own thing (though I still think this should be on by
default).
True, the error message could use improvement (and not only using
its own class instead of abusing 'readme' class, or renaming 'readme'
class to something more generic). The problem with error message for
this is who is the target of this message: is it gitweb administrator
(who can change gitweb configuration), or is it gitweb user (who need
to contact web admin).
Admin for sure, this is intended more to be a speed bump that forces the
admin to stop and consider what's happening. I originally added it
because I ended up in a situation where gitweb didn't match the
installed git and gitweb was dying silently. I wanted to just give the
admin a better guess at what was going on mainly.
Post by Jakub Narebski
The problem with this patch is that for it to be useful for protecting
against silent errors it should be on by default, but OTOH having it on
by default is quite inconvenient.
Upside is once it's been disabled it should stay disabled. I'll admit
it's not ideal but it's at least a stop gap in the interim till the
other silent errors get filled in.
Post by Jakub Narebski
Best solution would be to treat core of this issue, namely eliminate
silent errors and always provide some message in case of error.
Agreed, this is really just a stop-gap measure. I would imagine *most*
instalations of gitweb are installed from the distro's package
management so that would likely be upgraded at the same time as git itself.
Post by Jakub Narebski
Patch 4 I don't have anything to add or change at this point.
This patch has me pondering and I'm unsure of what I'd suggest, mainly
because of the addition of the smart http support meaning that git://
and http:// are legitimate and useful links for supporting full git
transactions.
I may withdraw the patch entirely since the link on kernel.org has been
around for years, and I'm unsure if anything actually uses it (though I
can see it being useful still). If it stays I think there's got to be a
way to specifically mark a url as being the one to link to vs.
defaulting to git:// (or allow for a marking to override the git://) and
I need to ponder that.
Also, it has to be _fast_, I think, i.e. no reading cloneurl and repo
config (for gitweb.url) for each repository.
You can always remove the check for "git://" prefix, and/or take first
Taking the first one by default may make more sense - I'll try that.
Post by Jakub Narebski
I have given some initial thought to converting the $output options I'm
currently using to a print <FH> that Jakub is suggesting & exploring.
It's 'print {$fh}', i.e. use indirect filehandle, not global filehandle.
I think all told it's going to be a similarly sized patch, since all
output still has to get adjusted (including the things that directly
output but don't print).
print -> print {$fh} can be separate patch, and it can be checked that
it produces the same results. Well print -> $output .= could also be
separate patch...
I'm unsure if there's a real advantage to
either way, other than design preference. My patch (forcing the output
to get passed around) moves towards more of a modal style design
separating data & layout vs. it's combined nature now, well it's a step
in that direction anyway.
Errr... what? Forcing buffering (you need to read whole output into
memory, including for snapshots (uncompressed in case of .tar.gz))
is IMVHO orthogonal to the issue of separating data & layout.
BTW. Modern web server interfaces like Rack, PSGI/Plack etc. explicitly
include streaming support.
The inbuilt streaming support does change things, and I don't think it
ultimately changes my caching engine really anyway - I should have that
change done shortly.
Post by Jakub Narebski
The advantage of doing 'print {$fh}' is that $fh can be \*STDOUT, can
be \$buffer, but can be filehandle to (temporary) file on disk, and
you can even "tee" it, i.e. both write to memory/file, and to STDOUT.
The number of possible choices / possible improvements is much larger.
And what is also important it means that people who do not use caching
do not suffer latency penalty and memory pressure from caching
infrastructure they do not use.
P.S.
Subject: [Virus] Exploit.PDF-9669
X-Virus-Scanned: ClamAV 0.88.7/10275/Fri Jan 8 17:06:46 2010 on shards.monkeyblade.net
X-Virus-Status: Infected with Exploit.PDF-9669
X-Original-Subject: Re: [PATCHv3 3/4 (resent)] gitweb: Optionally add "git"
links in project list page
contained Exploit.PDF-9669 and has not been delivered.
Clamav on my personal server wigging out and deciding that all kinds of
stuff was PDF viruses, *NO* idea why. I've had a different problem but
still involving Clamav on kernel.org. Both had been working fine for
quite some time and the only thing I can think of is that either an
update from clamav went awry, or a date dependent thing triggered and is
causing havoc.

Should have a patch series out in an hour or so.

- John 'Warthog9' Hawley
Jakub Narebski
2010-01-12 13:05:22 UTC
Permalink
Post by J.H.
Post by Jakub Narebski
I have given some initial thought to converting the $output options I'm
currently using to a print <FH> that Jakub is suggesting & exploring.
It's 'print {$fh}', i.e. use indirect filehandle, not global filehandle.
I think all told it's going to be a similarly sized patch, since all
output still has to get adjusted (including the things that directly
output but don't print).
print -> print {$fh} can be separate patch, and it can be checked that
it produces the same results. Well print -> $output .= could also be
separate patch...
I'm unsure if there's a real advantage to
either way, other than design preference. My patch (forcing the output
to get passed around) moves towards more of a modal style design
separating data & layout vs. it's combined nature now, well it's a step
in that direction anyway.
Errr... what? Forcing buffering (you need to read whole output into
memory, including for snapshots (uncompressed in case of .tar.gz))
is IMVHO orthogonal to the issue of separating data & layout.
BTW. Modern web server interfaces like Rack, PSGI/Plack etc. explicitly
include streaming support.
The inbuilt streaming support does change things, and I don't think it
ultimately changes my caching engine really anyway - I should have that
change done shortly.
It doesn't change caching engine much, especially if you encapsulate this
detail in the caching engine. With 'print {$fh}' (and in a few places
'printf {$fh}' (!)) you can just do something like:

$fh = $cache_fh;
$actions{$action}->();
show_cached($fh);


About 'should have that change done shortly': I am working now, time
permitting, on splitting your caching patch in smaller parts, wrapping
it a bit differently (and hopefully more clear).
Post by J.H.
Post by Jakub Narebski
The advantage of doing 'print {$fh}' is that $fh can be \*STDOUT, can
be \$buffer, but can be filehandle to (temporary) file on disk, and
you can even "tee" it, i.e. both write to memory/file, and to STDOUT.
The number of possible choices / possible improvements is much larger.
And what is also important it means that people who do not use caching
do not suffer latency penalty and memory pressure from caching
infrastructure they do not use.
--
Jakub Narebski
Poland
Jakub Narebski
2010-01-03 16:07:27 UTC
Permalink
From: John 'Warthog9' Hawley <***@kernel.org>

This changes slightly the behavior of gitweb, so that it verifies
that the box isn't inundated with before attempting to serve gitweb.
If the box is overloaded, it basically returns a 503 Server Unavailable
until the load falls below the defined threshold. This helps dramatically
if you have a box that's I/O bound, reaches a certain load and you
don't want gitweb, the I/O hog that it is, increasing the pain the
server is already undergoing.

This behavior is controlled by $maxload configuration variable.
Default is a load of 300, which for most cases should never be hit.
Unset it (set it to undefined value, i.e. undef) to turn off checking.

Currently it requires that '/proc/loadavg' file exists, otherwise the
load check is bypassed (load is taken to be 0). So platforms that do
not implement '/proc/loadavg' currently cannot use this feature.

Signed-off-by: John 'Warthog9' Hawley <***@kernel.org>
Signed-off-by: Jakub Narebski <***@gmail.com>
---
Originally appeared in
http://permalink.gmane.org/gmane.comp.version-control.git/135418

Differences to original version by John 'Warthog9' Hawley (J.H.):
* Slightly improved wording in commit message and in comments
* $maxload is described in "Gitweb config file variables section
in gitweb/README
* You can use '$maxload = undef;' to turn off load checking
* Error page for too high load is generated using die_error, which had
to be extended to handle 503 Service Unavailable HTTP error code

One complaints that didn't get addressed was describing alternate
approaches to just reading '/proc/loadavg', like using BSD::getloadavg
module from CPAN, either in comments in gitweb.perl or in commit
message. But I have changed my mind since then about this issue,
and now I think it is not strictly necessary. I think that
limitations of current approach are described in sufficient detail,
and we have fall back for when used method cannot work.

gitweb/README | 7 ++++++-
gitweb/gitweb.perl | 39 +++++++++++++++++++++++++++++++++++----
2 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/gitweb/README b/gitweb/README
index e34ee79..6c2c8e1 100644
--- a/gitweb/README
+++ b/gitweb/README
@@ -174,7 +174,7 @@ not include variables usually directly set during build):
Base URL for relative URLs in pages generated by gitweb,
(e.g. $logo, $favicon, @stylesheets if they are relative URLs),
needed and used only for URLs with nonempty PATH_INFO via
- <base href="$base_url>. Usually gitweb sets its value correctly,
+ <base href="$base_url">. Usually gitweb sets its value correctly,
and there is no need to set this variable, e.g. to $my_uri or "/".
* $home_link
Target of the home link on top of all pages (the first part of view
@@ -228,6 +228,11 @@ not include variables usually directly set during build):
repositories from launching cross-site scripting (XSS) attacks. Set this
to true if you don't trust the content of your repositories. The default
is false.
+ * $maxload
+ Used to set the maximum load that we will still respond to gitweb queries.
+ If server load exceed this value then return "503 Service Unavailable" error.
+ Server load is taken to be 0 if gitweb cannot determine its value. Set it to
+ undefined value to turn it off. The default is 300.


Projects list file format
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 7e477af..3222131 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -221,6 +221,12 @@ our %avatar_size = (
'double' => 32
);

+# Used to set the maximum load that we will still respond to gitweb queries.
+# If server load exceed this value then return "503 server busy" error.
+# If gitweb cannot determined server load, it is taken to be 0.
+# Leave it undefined (or set to 'undef') to turn off load checking.
+our $maxload = 300;
+
# You define site-wide feature defaults here; override them with
# $GITWEB_CONFIG as necessary.
our %feature = (
@@ -551,6 +557,26 @@ if (-e $GITWEB_CONFIG) {
do $GITWEB_CONFIG_SYSTEM if -e $GITWEB_CONFIG_SYSTEM;
}

+# Get loadavg of system, to compare against $maxload.
+# Currently it requires '/proc/loadavg' present to get loadavg;
+# if it is not present it returns 0, which means no load checking.
+sub get_loadavg {
+ open my $fd, '<', '/proc/loadavg'
+ or return 0;
+ my @load = split(/\s+/, scalar <$fd>);
+ close $fd;
+
+ # The first three columns measure CPU and IO utilization of the last one,
+ # five, and 10 minute periods. The fourth column shows the number of
+ # currently running processes and the total number of processes in the m/n
+ # format. The last column displays the last process ID used.
+ return $load[0] || 0;
+}
+
+if (defined $maxload && get_loadavg() > $maxload) {
+ die_error(503, "The load average on the server is too high");
+}
+
# version of the core git binary
our $git_version = qx("$GIT" --version) =~ m/git version (.*)$/ ? $1 : "unknown";
$number_of_git_cmds++;
@@ -3354,14 +3380,19 @@ sub git_footer_html {
# 500: The server isn't configured properly, or
# an internal error occurred (e.g. failed assertions caused by bugs), or
# an unknown error occurred (e.g. the git binary died unexpectedly).
+# 503: The server is currently unavailable (because it is overloaded,
+# or down for maintenance). Generally, this is a temporary state.
sub die_error {
my $status = shift || 500;
my $error = shift || "Internal server error";

- my %http_responses = (400 => '400 Bad Request',
- 403 => '403 Forbidden',
- 404 => '404 Not Found',
- 500 => '500 Internal Server Error');
+ my %http_responses = (
+ 400 => '400 Bad Request',
+ 403 => '403 Forbidden',
+ 404 => '404 Not Found',
+ 500 => '500 Internal Server Error',
+ 503 => '503 Service Unavailable',
+ );
git_header_html($http_responses{$status});
print <<EOF;
<div class="page_body">
--
1.6.5.3
J.H.
2010-01-06 22:28:50 UTC
Permalink
Just a heads up I've been on vacation for two weeks and hadn't gotten
around to dealing with these e-mails. I'm going to get them imported
into my tree here and take a look over them.

- John 'Warthog9' Hawley
This is resend of early part of "[PATCH 0/6] Gitweb caching changes v2"
thread by John 'Warthog9' Hawley (J.H.),
http://thread.gmane.org/gmane.comp.version-control.git/135052
or alternatively
git://git.kernel.org/pub/scm/git/warthog9/gitweb.git gitweb-ml-v2
with a few modifications of my own. Those patches were send originally as
responses in the mentioned thread, for further comments from original
author. As the discussion didn't pick up (because of kernel.org upgrade, and
perhaps due to end-of-year stuff), I am resending those patches in a
separate thread for a better visibility; only comments are changed.
This series of patches is rebased on top of commit 37bae10
(Merge branch 'maint', 2009-12-31) in 'master' branch.
* moving from "GITWEB - " to "gitweb: " as subsystem prefix
and of course add my own signoff.
I have included reply to neither "GITWEB - File based caching layer"
nor "GITWEB - Separate defaults from main file" in this thread/series.
I haven't included the main point of the whole series, namely adding
response caching layer in the form that is used in git.kernel.org, because
I think this patch should be split into smaller parts, and unit-tested.
As it is now it is a bit of mess. I have done patch which makes gitweb
always use explicit filehandle when printing (simplifying a bit it
replaces 'print <something>' by 'print {$out} <something>', with $out set
to \*STDOUT), as a patch that prepares for (optional) gitweb caching, while
not affecting throughput, latency and memory consumption when caching is
disabled, as opposed to original solution by J.H. of always storing whole
response in scalar and writing it at the end.
I haven't included splitting of gitweb_defaults.perl off gitweb.perl, as it
was after large and invasive gitweb caching patch, it would require
substantial changes to gitweb tests upfront (by testing built gitweb.cgi and
not source gitweb.perl), and needs fixing of Makefile to actually work
reliably (we could have to process both gitweb.perl and
gitweb_defaults.perl, while provided Makefile process only the file which
triggered the rule... I think).
I am not sure if 'gitweb: Add option to force version match' is a good
solution to the problem it tires to address, i.e. if it is worth having,
and I am not sure if I did 'gitweb: Makefile improvements' correctly.
gitweb: Load checking
gitweb: Add option to force version match
gitweb: Optionally add "git" links in project list page
gitweb: Makefile improvements
Makefile | 65 +++++---------------------
gitweb/Makefile | 129 ++++++++++++++++++++++++++++++++++++++++++++++++++++
gitweb/README | 14 +++++-
gitweb/gitweb.perl | 80 ++++++++++++++++++++++++++++++--
4 files changed, 230 insertions(+), 58 deletions(-)
create mode 100644 gitweb/Makefile
John, any comments?
Jakub Narebski
2010-01-06 23:22:29 UTC
Permalink
Post by J.H.
Just a heads up I've been on vacation for two weeks and hadn't gotten
around to dealing with these e-mails. =C2=A0I'm going to get them imp=
orted
Post by J.H.
into my tree here and take a look over them.
They are also available from 'gitweb/cache-kernel' branch of my=20
git/jnareb-git.git repository at repo.or.cz... and are available from
'pu' branch in main git repository, as fbd43a8^2 (Merge branch=20
'jh/gitweb-cached' into pu, 2010-01-04).

It might be easier to import (fetch) them from that repository, and jus=
t=20
drop top two commits / merge jnareb-git/gitweb/cache-kernel~2


git://repo.or.cz/git/jnareb-git.git gitweb/cache-kernel

John 'Warthog9' Hawley (4):
gitweb: Load checking
gitweb: Add option to force version match
gitweb: Optionally add "git" links in project list page
gitweb: Makefile improvements

Jakub Narebski (2):
gitweb: Print to explicit filehandle (preparing for caching)
gitweb: href(..., -path_info =3D> 0|1)

The two top commits on that branch are work in progress in preparation
for response caching for gitweb from J.H. (the one used on=20
git.kernel.org), and are not ready yet, I think; at least there would=20
be one more commit in "preparing for gitweb caching" (sub)series.

--=20
Jakub Narebski
Poland
Loading...