[PATCH] gitweb: Use config file or file for repository owner's name.
Bruno Ribas
2008-01-30 05:28:17 UTC
Allow the configuration variable gitweb.owner or the $GIT_DIR/owner file to
set the repository owner. gitweb checks $GIT_DIR/owner first, then falls back
to gitweb.owner; if neither exists, it uses the filesystem owner of the
repository directory.

Useful when we don't want to maintain a project list file, and all
repository directories have to have the same owner (for example when the
same SSH account is shared for all projects, using ssh_acl to control
access instead).

Signed-off-by: Bruno Ribas <***@c3sl.ufpr.br>
---
gitweb/gitweb.perl | 18 ++++++++++++++++++
1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 6256641..e29ad0a 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1754,6 +1754,20 @@ sub git_get_project_list_from_file {
}
}

+sub gitweb_get_project_owner {
+ my $path = shift;
+
+ $git_dir = "$projectroot/$path";
+ open my $fd, "$projectroot/$path/owner"
+ or return git_get_project_config('owner');
+ my $owner = <$fd>;
+ close $fd;
+ if (defined $owner) {
+ chomp $owner;
+ }
+ return $owner;
+}
+
sub git_get_project_owner {
my $project = shift;
my $owner;
@@ -1767,6 +1781,10 @@ sub git_get_project_owner {
if (exists $gitweb_project_owner->{$project}) {
$owner = $gitweb_project_owner->{$project};
}
+
+ if (!defined $owner) {
+ $owner = gitweb_get_project_owner($project);
+ }
if (!defined $owner) {
$owner = get_file_owner("$projectroot/$project");
}
--
1.5.3.8
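
To illustrate the lookup order this patch introduces, here is a minimal sketch written in Python rather than gitweb's Perl. The function names are hypothetical, the already-parsed config is passed in as a plain dict, and the filesystem fallback returns the login name as a stand-in for whatever get_file_owner really displays. Unlike the patch, the sketch treats an empty owner file as absent.

```python
import os
import pwd


def read_owner_file(git_dir):
    """Return the first line of <git_dir>/owner, or None if it cannot be read."""
    try:
        with open(os.path.join(git_dir, "owner"), encoding="utf-8") as fd:
            line = fd.readline()
    except OSError:
        return None
    # Assumption of this sketch: an empty file counts as "no owner given".
    return line.rstrip("\n") or None


def filesystem_owner(git_dir):
    """Last resort: the login name of whoever owns the directory."""
    uid = os.stat(git_dir).st_uid
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)  # uid without a passwd entry


def resolve_owner(git_dir, project_list_owner=None, config=None):
    """Try each source in the order of the patched git_get_project_owner."""
    if project_list_owner is not None:      # entry from the project list file
        return project_list_owner
    owner = read_owner_file(git_dir)        # $GIT_DIR/owner
    if owner is None and config:
        owner = config.get("gitweb.owner")  # repository configuration
    if owner is None:
        owner = filesystem_owner(git_dir)   # stat + getpwuid
    return owner
```

Each step runs only when the previous one produced nothing, so the extra file open is paid only by repositories that are not in the project list.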
Bruno Ribas
2008-01-30 05:28:18 UTC
Signed-off-by: Bruno Ribas <***@c3sl.ufpr.br>
---
gitweb/README | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/gitweb/README b/gitweb/README
index b28f59f..d90969f 100644
--- a/gitweb/README
+++ b/gitweb/README
@@ -220,6 +220,10 @@ You can use the following files in repository:
Displayed in the project summary page. You can use multiple-valued
gitweb.url repository configuration variable for that, but the file
takes precendence.
+ * owner (or gitweb.owner)
+ File with the owner of the repository, on a single line. By default set
+ to the filesystem directory's owner. You can use the gitweb.owner repo
+ configuration variable, but the file takes precedence.
* various gitweb.* config variables (in config)
Read description of %feature hash for detailed list, and some
descriptions.
--
1.5.3.8
Junio C Hamano
2008-01-30 06:16:16 UTC
Post by Bruno Ribas
Allow to use configuration variable gitweb.owner or $GIT_DIR/owner file to
set the repository owner, it checks $GIT_DIR/owner first, then falls back to
the gitweb.owner, if none exist uses filesystem directory's owner.
Useful when we don't want to maintain project list file, and all
repository directories have to have the same owner (for example when the
same SSH account is shared for all projects, using ssh_acl to control
access instead).
+sub gitweb_get_project_owner {
+ my $path = shift;
+
+ $git_dir = "$projectroot/$path";
+ open my $fd, "$projectroot/$path/owner"
+ or return git_get_project_config('owner');
+ my $owner = <$fd>;
+ close $fd;
+ if (defined $owner) {
+ chomp $owner;
+ }
+ return $owner;
+}
+
sub git_get_project_owner {
my $project = shift;
my $owner;
@@ -1767,6 +1781,10 @@ sub git_get_project_owner {
if (exists $gitweb_project_owner->{$project}) {
$owner = $gitweb_project_owner->{$project};
}
+
+ if (!defined $owner) {
+ $owner = gitweb_get_project_owner($project);
+ }
if (!defined $owner) {
$owner = get_file_owner("$projectroot/$project");
}
I am not sure about the effect of this change on a large scale
site. If you do not have the project list file, originally we
just needed a stat per project, but now you open an extra file
(either "owner" or "config") and read it, once per every
project.

The project list page does that for every project, and it
actually is worse because it also needs to open yet another file
"description" from the directory. It almost makes me wonder if
we would be much better off having a single file per project to read all
the necessary information from, instead of having to open many
little files (currently it is only two: owner and description.
But who knows what other little pieces of information you would
want to add next week).
Bruno Cesar Ribas
2008-01-31 02:36:29 UTC
<snip>
I am not sure about the effect of this change on a large scale
site. If you do not have the project list file, originally we
just needed a stat per project, but now you open an extra file
(either "owner" or "config") and read it, once per every
project.
Opening the extra file has the same problem as the description file. And since
gitweb already lets us create "description" and "cloneurl" files, I see no
problem in opening one more file instead of asking the filesystem who owns
the directory.
The project list page does that for every project, and it
actually is worse because it also needs to open yet another file
"description" from the directory. It almost makes me wonder if
are much better of to have a single file per project to read all
the necessary information off of, instead of having to open many
little files (currently it is only two---owner and description.
But who knows what other little pieces of information you would
want to add next week).
Wow, am I the only one adding more little pieces of information?! 8-P (I have
my own git.acl, which is the "simple" way for my SSH scripts to control ACLs
via public key, but that's another story).

A single file already exists: the config file. But parsing it may be bad too
(or maybe not, in the common case). We could have a "gitweb.config" that
contains all the necessary information, such as owner, description, cloneurl
...

Another "option" is to cache the project list page =) From time to time
(on each push?!) gitweb would regenerate the project list page and write a
new project_list file, for easy access when entering the summary page; or we
could cache the summary page too.
--
Bruno Ribas - ***@c3sl.ufpr.br
http://web.inf.ufpr.br/ribas
C3SL: http://www.c3sl.ufpr.br
Junio C Hamano
2008-01-31 02:48:43 UTC
Post by Bruno Cesar Ribas
<snip>
I am not sure about the effect of this change on a large scale
site. If you do not have the project list file, originally we
just needed a stat per project, but now you open an extra file
(either "owner" or "config") and read it, once per every
project.
Opening the extra file has same problem as the description file. And, as
gitweb allow us to create "description" and "cloneurl" file there is no
problem having another file to open instead finding out who is the owner of
the directory asking to the filesystem.
We heard the same argument when cloneurl was added, and a
newcomer who does not know that may rightly use the same
argument. But I think we should work towards _reducing_ the
number of such ad-hoc one-line-per-information files, not using
existing ones as an excuse to add _more_ of them.
Bruno Cesar Ribas
2008-01-31 03:02:50 UTC
Post by Junio C Hamano
Post by Bruno Cesar Ribas
<snip>
I am not sure about the effect of this change on a large scale
site. If you do not have the project list file, originally we
just needed a stat per project, but now you open an extra file
(either "owner" or "config") and read it, once per every
project.
Opening the extra file has same problem as the description file. And, as
gitweb allow us to create "description" and "cloneurl" file there is no
problem having another file to open instead finding out who is the owner of
the directory asking to the filesystem.
We heard the same argument when cloneurl was added, and a
newcomer who does not know that may rightly use the same
Well, it worked for cloneurl.
Post by Junio C Hamano
argument. But I think we should work towards _reducing_ the
number of such ad-hoc one-line-per-information files, not using
existing ones as an excuse to add _more_ of them.
Okay, I agree with you. We need to centralize the information.
One idea is to use only $GIT_DIR/config, shared with all the other
information. But I don't like the idea of parsing $GIT_DIR/config every time,
unless the gitweb-only information is cached in one line.

Another idea is to use $GIT_DIR/gitweb.conf holding that information [like the
cache above], but generated by the gitweb admin.

Another is to keep all those files, but create them under gitweb.d/*.

That's all I can think of at 1:02am =(

Good night
--
Bruno Ribas - ***@c3sl.ufpr.br
http://web.inf.ufpr.br/ribas
C3SL: http://www.c3sl.ufpr.br
Junio C Hamano
2008-01-31 03:06:44 UTC
Post by Junio C Hamano
...
Post by Bruno Cesar Ribas
Opening the extra file has same problem as the description file. And, as
gitweb allow us to create "description" and "cloneurl" file there is no
problem having another file to open instead finding out who is the owner of
the directory asking to the filesystem.
We heard the same argument when cloneurl was added, and a
newcomer who does not know that may rightly use the same
argument. But I think we should work towards _reducing_ the
number of such ad-hoc one-line-per-information files, not using
existing ones as an excuse to add _more_ of them.
Rephrasing to be constructive (but remember, this is all post
1.5.4).

* we would need for historical reasons to keep supporting
description and cloneurl for some time. There may be some
others, but the goal should be to deprecate and remove these
ad-hoc one-file-per-piece-of-information files.

* we also need for historical reasons to keep supporting some
other stuff found in $git_dir/config of the project.

If the config reading interface in gitweb is reasonably fast and
cheap, we can move the existing description/cloneurl to gitweb
config when deprecating them. New ones such as "owner" would
naturally fit there.

If the config reading interface is too slow (somebody has to
bench it on a large set of repositories), maybe we would need to
optimize _THAT_. If that turns out to be unreasonable (e.g. we
may really want to keep the implementation that spawns "git
config" to do the work, rather than writing and having to
maintain a pure Perl version of config parser inside gitweb,
which is a reasonable position to take in the longer run, but
spawning a process per repository may be too expensive), an
alternative could be to separate out the pieces of information
that are needed even when drawing the top-level project-list
page, and come up with a _new_ single file that is easily
parsable without spawning "git config" for gitweb to read them
(e.g. "description", "owner", perhaps the toplevel project-list
might want to list "cloneurl" as well in the future).
Jakub Narebski
2008-01-31 03:36:31 UTC
Post by Junio C Hamano
Post by Junio C Hamano
...
Post by Bruno Cesar Ribas
Opening the extra file has same problem as the description file. And, as
gitweb allow us to create "description" and "cloneurl" file there is no
problem having another file to open instead finding out who is the owner of
the directory asking to the filesystem.
We heard the same argument when cloneurl was added, and a
newcomer who does not know that may rightly use the same
argument. But I think we should work towards _reducing_ the
number of such ad-hoc one-line-per-information files, not using
existing ones as an excuse to add _more_ of them.
Rephrasing to be constructive (but remember, this is all post
1.5.4).
* we would need for historical reasons to keep supporting
description and cloneurl for some time. There may be some
others, but the goal should be to deprecate and remove these
ad-hoc one-file-per-piece-of-information files.
* we also need for historical reasons to keep supporting some
other stuff found in $git_dir/config of the project.
If the config reading interface in gitweb is reasonably fast and
cheap, we can move the existing description/cloneurl to gitweb
config when deprecating them. New ones such as "owner" would
naturally fit there.
Currently gitweb parses repo config file _once_, using one call to
git-config -z -l.

We could simply add the description to the projects_list file, but that
would be a somewhat backwards-incompatible change.

We have to call at least one git-for-each-ref per repo to get the last
update date, by the way.
Post by Junio C Hamano
If the config reading interface is too slow (somebody has to
bench it on a large set of repositories), maybe we would need to
optimize _THAT_. If it turns out to be unreasonable (e.g. we
may really want to keep the implementation that spawns "git
config" to do the work, rather than writing and having to
maintain a pure Perl version of config parser inside gitweb,
which is a reasonable position to take in the longer run, but
spawning a process per repository may be too expensive).
IIRC cvsimport or cvsserver has its own config parser in Perl, though
it accepts only a limited, sensible subset of the configuration syntax
(and IIRC uses a separate config file).
--
Jakub Narebski
Poland
ShadeHawk on #git
Johannes Schindelin
2008-01-31 11:12:31 UTC
Hi,
Post by Jakub Narebski
...
Post by Bruno Cesar Ribas
Opening the extra file has same problem as the description file.
And, as gitweb allow us to create "description" and "cloneurl" file
there is no problem having another file to open instead finding out
who is the owner of the directory asking to the filesystem.
We heard the same argument when cloneurl was added, and a newcomer
who does not know that may rightly use the same argument. But I
think we should work towards _reducing_ the number of such ad-hoc
one-line-per-information files, not using existing ones as an excuse
to add _more_ of them.
Rephrasing to be constructive (but remember, this is all post 1.5.4).
* we would need for historical reasons to keep supporting
description and cloneurl for some time. There may be some
others, but the goal should be to deprecate and remove these
ad-hoc one-file-per-piece-of-information files.
* we also need for historical reasons to keep supporting some
other stuff found in $git_dir/config of the project.
If the config reading interface in gitweb is reasonably fast and
cheap, we can move the existing description/cloneurl to gitweb config
when deprecating them. New ones such as "owner" would naturally fit
there.
Currently gitweb parses repo config file _once_, using one call to
git-config -z -l.
We could simply add description to the projects_list file, but it will
be a bit backwards incompatibile change.
Not if you say "the config overrides the description/cloneurl file", i.e.
when there is a description or a cloneurl from the config, don't even
bother to stat the single-line files.

That would help transition, and still be backwards compatible. (BTW this
resembles what we did for the .git/remotes/* -> .git/config transition.)

Ciao,
Dscho
Jakub Narebski
2008-02-01 00:17:07 UTC
Post by Johannes Schindelin
Post by Jakub Narebski
Rephrasing to be constructive (but remember, this is all post 1.5.4).
* we would need for historical reasons to keep supporting
description and cloneurl for some time. There may be some
others, but the goal should be to deprecate and remove these
ad-hoc one-file-per-piece-of-information files.
* we also need for historical reasons to keep supporting some
other stuff found in $git_dir/config of the project.
If the config reading interface in gitweb is reasonably fast and
cheap, we can move the existing description/cloneurl to gitweb config
when deprecating them. New ones such as "owner" would naturally fit
there.
Currently gitweb parses repo config file _once_, using one call to
git-config -z -l.
We could simply add description to the projects_list file, but it will
be a bit backwards incompatibile change.
Not if you say "the config overrides the description/cloneurl file", i.e.
when there is a description or a cloneurl from the config, don't even
bother to stat the single-line files.
Errr... what I wanted to say is that, instead of the current format of the
'projects_list' file, which is
  <URI-encoded project path> SPC <URI-encoded owner> LF
we could also add the project description to that file, so the format would be
  <URI-encoded project path> SPC <URI-encoded owner> SPC
  <one-line project description> LF
(the project description doesn't need to be URI-encoded). This would
avoid reading $git_dir/description (and in rare cases also avoid reading
gitweb.description from $git_dir/config).

This is of course a bit backwards incompatible.
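
As an illustration of the extended line format proposed here, the following is a small Python sketch (not gitweb's Perl) of a parser that accepts both the current two-field lines and the proposed three-field lines. The function name is hypothetical, and treating '+' as an encoded space is an assumption about the URI encoding.

```python
from urllib.parse import unquote_plus


def parse_projects_list_line(line):
    """Split '<path> SPC <owner> [SPC <description>]' into three parts.

    Missing trailing fields come back as None, so lines in the current
    two-field format are still accepted.
    """
    # Split on at most two spaces: the description may itself contain spaces.
    fields = line.rstrip("\n").split(" ", 2)
    path = unquote_plus(fields[0])
    owner = unquote_plus(fields[1]) if len(fields) > 1 else None
    # The description is taken verbatim; it is not URI-encoded.
    description = fields[2] if len(fields) > 2 else None
    return path, owner, description
```

Because the description is the last field and is limited to one line, it needs no escaping of its own.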
Post by Johannes Schindelin
That would help transition, and still be backwards compatible. (BTW this
resembles what we did for the .git/remotes/* -> .git/config transition.)
Note that some of the info is needed for the 'projects_list' view, and some
only for the 'summary' view. For the 'projects_list' page we would want to
avoid, I think, calling "git config -z -l" per repository (or opening the
$git_dir/config file and doing [limited] parsing of it inside gitweb in Perl,
like git-cvsserver does). For the 'summary' view we usually want to read the
repo config file for features anyway, and that happens only once per
web page, so there is no need to avoid it there.

Currently for the 'projects_list' view we have, when $projects_list is
a directory (this includes the situation when it is undef and falls back
to $projectroot):
1. Call git-for-each-ref to get the last modification time
2. Read the $git_dir/description file for the description (it is generated
   by the default template, so it is usually present, if in a useless form);
   fall back to git-config / reading $git_dir/config, gitweb.description
3. Check the owner of $git_dir (stat + getpwuid)

With the addition of $git_dir/owner and gitweb.owner we would have
3'. Read the $git_dir/owner file, usually not present;
    fall back to gitweb.owner (which means reading and parsing
    the repo config!);
    fall back to the $git_dir owner (stat + getpwuid)
so after consideration I think that adding gitweb.owner is a bit of
a stupid idea from a performance point of view, at least until we have
'projects_list' caching. Only $git_dir/owner would be better.

BTW. what about filesystems where file / directory does not have
an owner?


Another solution would be to use $projectroot/.gitconfig, with a simplified
syntax that is easily parseable in Perl, with keys gitweb.<repo path>.<config>,
where <config> is limited to 'description', 'owner' and 'url', plus
gitweb.description for a fallback description, gitweb.owner for a fallback
owner (or the owner of a whole set of repositories), and gitweb.baseurl for
base URLs (gitweb.<repo>.url = gitweb.baseurl/<repo>).

This would limit repo paths to ones without embedded newlines, but
I don't think this is a serious limitation :-)
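
The fallback rules of this shared-file idea can be sketched as follows, in Python rather than Perl. The sketch assumes the file has already been parsed into a flat dict of "gitweb...." keys. The function name is hypothetical, and the precedence (per-repository key first, then the shared fallback) is one reading of the proposal.

```python
def shared_lookup(shared, repo, name):
    """Return gitweb.<repo>.<name>, falling back to the shared defaults.

    `shared` is a dict of already-parsed keys from the proposed
    $projectroot/.gitconfig; `name` is 'description', 'owner' or 'url'.
    """
    value = shared.get(f"gitweb.{repo}.{name}")
    if value is not None:
        return value
    if name == "url":
        # No per-repository URL: derive it from gitweb.baseurl, if present.
        base = shared.get("gitweb.baseurl")
        return f"{base}/{repo}" if base is not None else None
    # Fallback description or owner shared by the whole set of repositories.
    return shared.get(f"gitweb.{name}")
```

With such a file, the 'projects_list' view would read one file for all repositories instead of one or more files per repository.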
--
Jakub Narebski
Poland
Bruno Cesar Ribas
2008-02-04 13:35:42 UTC
<snip>
Note that some of info is needed for 'projects_list' view, and some only
for the 'summary' view. For the 'projects_view' page we would want to
avoid, I think, calling "git config -z -l" per repository (or opening
$git_dir/config file and [limited] parsing it inside gitweb in Perl,
like git-cvsserver does). For 'summary' view we want usually to read
repo config file for features nevertheless, and is only once per
web-page, so we don't avoid it then.
Currently for 'projects_list' view we have, when $projects_list is
a directory (this includes situation when it is undef, and fallbacks
1. Call git-for-each-ref to get last modification time
2. Read $git_dir/description file for description (which is generated
by default template, so is usualy present, if in useless form),
fallback to git-config / reading $git_dir/config, gitweb.description
3. Check owner of $git_dir (stat + getpwuid)
With the addition of $git_dir/owner and gitweb.owner we would have
3'. Read $git_dir/owner file, usually not present,
fallback to gitweb.owner (which means reading and parsing
repo config!),
fallback to $git_dir owner (stat + getpwuid)
so after consideration I think that adding gitweb.owner is a bit of
a stupid idea from performance point of view, at least till we have
'projects_list' caching. Only $git_dir/owner would be better.
Unless we parse the config only once for each project. We can create a small
cache with all the gitweb configuration. Then each time we call
git_get_project_config('bla') we check whether we have already parsed it and,
if so, get the value from a small hash table.
We could even check whether we are generating the project list, and then
store only description and owner. (This sounds ugly.)
BTW. what about filesystems where file / directory does not have
an owner?
Does Git run on a system like this?! I only remember FAT having such a
"problem". 8^)
Another solution would be using $projectroot/.gitconfig, with simplified
syntax easy parseable by Perl, with gitweb.<repo path>.<config>, where
<config> is limited to 'description', 'owner' and 'url', and
gitweb.description for fallback description, gitweb.owner for fallback
owner and owner for set of repositories, gitweb.baseurl for base URLs
(gitweb.<repo>.url = gitweb.baseurl/<repo>).
This sounds good. Having this small, simple file would make things better.
But then we would have yet another file inside the repository; having
everything in config would be cleaner [I guess]. Parsing the config file only
once per project might be good enough.
This would limit repo paths to not have embedded newlines in them, but
this is not I think serious limitation :-)
--
Jakub Narebski
Poland
--
Bruno Ribas - ***@c3sl.ufpr.br
http://web.inf.ufpr.br/ribas
C3SL: http://www.c3sl.ufpr.br
Jakub Narebski
2008-02-04 14:00:48 UTC
Post by Bruno Cesar Ribas
<snip>
Note that some of info is needed for 'projects_list' view, and some only
for the 'summary' view.
Currently for 'projects_list' view we have, when $projects_list is
a directory (this includes situation when it is undef, and fallbacks
1. Call git-for-each-ref to get last modification time
2. Read $git_dir/description file for description (which is generated
by default template, so is usualy present, if in useless form),
fallback to git-config / reading $git_dir/config, gitweb.description
3. Check owner of $git_dir (stat + getpwuid)
With the addition of $git_dir/owner and gitweb.owner we would have
3'. Read $git_dir/owner file, usually not present,
fallback to gitweb.owner (which means reading and parsing
repo config!),
fallback to $git_dir owner (stat + getpwuid)
so after consideration I think that adding gitweb.owner is a bit of
a stupid idea from performance point of view, at least till we have
'projects_list' caching. Only $git_dir/owner would be better.
Unless we parse config only once for each project. We can create a small
cache with all gitweb conf. Then each time we ask
git_get_project_config('bla') we check if we alread had parsed it, if parsed
get in a small hash table.
If you had read the current gitweb.perl code more carefully, or browsed the
"git log -- gitweb" output, you would have noticed that since b201927a

gitweb: Read repo config using 'git config -z -l'

gitweb does just that. It lazily reads the whole repo config into a hash
using "git config -z -l" (this additionally includes support for user and
system git configuration files; also, the git config file format has some
hairy corners), then uses this hash.

But this is once per repo, which for 'projects_list' might be too much,
especially on operating systems where fork is slow. I'd rather go
in the opposite direction and add support for providing the description in
the 'projects_list' file.
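
The "read once, lazily, into a hash" approach described here can be sketched like this. It is a Python illustration, not gitweb's Perl code; the class and function names are hypothetical. It relies on the documented "git config -z -l" output format: each entry is terminated by a NUL byte, with a newline between key and value.

```python
import subprocess


def parse_config_z(raw):
    """Parse the bytes printed by `git config -z -l` into {key: [values]}."""
    config = {}
    for entry in raw.split(b"\0"):
        if not entry:
            continue  # nothing follows the final NUL terminator
        key, sep, value = entry.partition(b"\n")
        # A key printed without a newline has no value at all.
        config.setdefault(key.decode(), []).append(value.decode() if sep else None)
    return config


class RepoConfig:
    """Spawn `git config` at most once per repository, on first use."""

    def __init__(self, git_dir):
        self.git_dir = git_dir
        self._config = None

    def get(self, key):
        if self._config is None:
            raw = subprocess.run(
                ["git", f"--git-dir={self.git_dir}", "config", "-z", "-l"],
                check=True, capture_output=True,
            ).stdout
            self._config = parse_config_z(raw)
        values = self._config.get(key)
        return values[-1] if values else None  # the last value wins
```

Even with this cache, the 'projects_list' view still pays one process spawn per repository, which is the cost being weighed above.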
Post by Bruno Cesar Ribas
We could even check if we are generating project list then we can store only
description and owner. (this sounds ugly)
This doesn't buy us much, because description and owner might be in the
last lines of the repo config, so we have to read the whole config file anyway.

Note that, if I understand this correctly, variables, including %config and
$config_file, are initialized anew for each request.
Besides, as it is now, viewing some project view would flush out
the projects_list %config, etc.
Post by Bruno Cesar Ribas
Another solution would be using $projectroot/.gitconfig, with simplified
syntax easy parseable by Perl, with gitweb.<repo path>.<config>, where
<config> is limited to 'description', 'owner' and 'url', and
gitweb.description for fallback description, gitweb.owner for fallback
owner and owner for set of repositories, gitweb.baseurl for base URLs
(gitweb.<repo>.url = gitweb.baseurl/<repo>).
This sounds good. Having this small, simple file would make things better.
But we will have another file inside repository, having all in config would
be cleaner [i guess]. If we parse config file only once per project might be
good.
I think that it would be better to separate gitweb configuration
(in gitweb_config.perl) from [cached] repositories configuration
(in gitconfig or .gitconfig).
--
Jakub Narebski
Poland
Bruno Cesar Ribas
2008-02-05 04:41:21 UTC
Post by Junio C Hamano
If the config reading interface is too slow (somebody has to
bench it on a large set of repositories), maybe we would need to
I made a simple bench as follows.

dd= 'dd if=/dev/zero of=/home/bruno/dds/$i bs=1M count=400000'
Machine: 4*opteron 2.8GHz
32GB ram
14*750GB RAID0 sata2 /home

Generated 1000 projects [too many?! git.debian.org has 668].
Before each test, 'echo 2 > /proc/sys/vm/drop_caches' was run.

command: time gitweb.cgi > /dev/null

-----------------  ---------------  ------------  ------------------
With Project List  NO Project List  LoadAvg       description way
-----------------  ---------------  ------------  ------------------
->0*dd
1m0.851s           1m18.651s        0.78   0.70   description file
1m1.511s           0m55.051s        0.83   0.35   gitweb.description

->2*dd
21m0.899s          17m19.706s       8.21   6.48   description file
16m29.455s         13m36.602s       5.90   5.95   gitweb.description

->4*dd
23m6.781s          26m51.544s       10.81  12     description file
20m57.249s         26m32.704s       11.50  12.55  gitweb.description


My test was simple =) But we can draw some conclusions from it.
Running git-config -z -l for each git repository is not a problem, as we
get the same speed in the tests (only losing under very high IO) [maybe I
should run with 8*dd or 80...].

After that, having gitweb.owner should not be a problem, since we parse the
whole config file once.

Running "git-for-each-ref" is the big killer [waaaw =P]. We could store the
timestamp in gitweb.lastchange, updated by a hook?! Or store it in
some other way [I will benchmark with it stored].
--
Bruno Ribas - ***@c3sl.ufpr.br
http://web.inf.ufpr.br/ribas
C3SL: http://www.c3sl.ufpr.br
Jakub Narebski
2008-02-05 10:04:51 UTC
Post by Bruno Cesar Ribas
Post by Junio C Hamano
If the config reading interface is too slow (somebody has to
bench it on a large set of repositories), maybe we would need to
I made a simple bench as follows.
dd= 'dd if=/dev/zero of=/home/bruno/dds/$i bs=1M count=400000'
This was to provide load, and to check how it works under load, wasn't it?
Post by Bruno Cesar Ribas
Machine: 4*opteron 2.8GHz
32GB ram
14*750GB RAID0 sata2 /home
Generated a 1000 projects [ too much?! git.debian.org has 668]
For each test a 'echo 2 > /proc/sys/vm/drop_caches' was done before running
it.
command: time gitweb.cgi > /dev/null
Didn't you mean

time GATEWAY_INTERFACE="CGI/1.1" HTTP_ACCEPT="*/*" \
REQUEST_METHOD="GET" QUERY_STRING="" gitweb.cgi

here (or some wrapper thereof)?

I wonder what would ApacheBench show...

Note also that there are operating systems (MS Windows, MacOS X) where
fork is much slower than on Linux.
--
Jakub Narebski
Poland
ShadeHawk on #git
Bruno Cesar Ribas
2008-02-05 14:28:10 UTC
Post by Jakub Narebski
Post by Bruno Cesar Ribas
Post by Junio C Hamano
If the config reading interface is too slow (somebody has to
bench it on a large set of repositories), maybe we would need to
I made a simple bench as follows.
dd= 'dd if=/dev/zero of=/home/bruno/dds/$i bs=1M count=400000'
This was to provide load, and check how it works under load, isn't it?
Yes, it was to provide high disk IO. Running the test on a 100% idle system
is not a satisfactory test, because the times there are all acceptable anyway.
Post by Jakub Narebski
Post by Bruno Cesar Ribas
Machine: 4*opteron 2.8GHz
32GB ram
14*750GB RAID0 sata2 /home
Generated a 1000 projects [ too much?! git.debian.org has 668]
For each test a 'echo 2 > /proc/sys/vm/drop_caches' was done before running
it.
command: time gitweb.cgi > /dev/null
Didn't you mean
time GATEWAY_INTERFACE="CGI/1.1" HTTP_ACCEPT="*/*" \
REQUEST_METHOD="GET" QUERY_STRING="" gitweb.cgi
here (or some wrapper thereof)?
I thought so, but running it without any arguments generated a clean project
list page.
Post by Jakub Narebski
I wonder what would ApacheBench show...
Note also that there are operating systems (MS Windows, MacOS X) where
fork is much slower than on Linux.
The problem is running it on MS Windows and MacOS (happily, I removed the
latter from my MacBook).

An ApacheBench run should be done under high disk IO AND with multiple
requests at once.

It could be done under Linux, but what concerns me more is the generation of
ONE page without Apache, and that using gitweb.description (and probably
gitweb.owner) is better than using isolated single-line files.
Post by Jakub Narebski
--
Jakub Narebski
Poland
ShadeHawk on #git
--
Bruno Ribas - ***@c3sl.ufpr.br
http://web.inf.ufpr.br/ribas
C3SL: http://www.c3sl.ufpr.br
Bruno Cesar Ribas
2008-02-07 04:12:46 UTC
<alll snip>
Hello again,

I made another benchmark, this time measuring the time difference between a
gitweb.cgi that does not read gitweb.owner AND one that reads it.

I got these times with 1000 projects while running 2 dd processes to generate
disk IO. Here are the results:

NO projects_list  projects_list
16m30s69          15m10s74       default gitweb, using FS's owner
16m07s40          15m24s34       patched to get gitweb.owner
16m37s76          15m59s32       same as above, but without gitweb.owner

Now the results for 1000 projects on an idle machine:

NO projects_list  projects_list
1m19s08           1m09s55        default gitweb, using FS's owner
1m17s58           1m09s55        patched to get gitweb.owner
1m18s49           1m08s96        same as above, but without gitweb.owner

*For the "projects_list" column, gitweb got the owner from the project_list file.

Small fluctuations occur, but the speed is essentially the same.

My guess is that adding gitweb.owner causes no problem, and it helps people
set the owner when there is no need to maintain a project_list file, or when
maintaining that file is not wanted.
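
To put a number on the fluctuation in the tables above, the timings can be converted to seconds and compared with the default gitweb run. This is a small Python sketch with hypothetical helper names. It assumes a timing such as 16m30s69 means 16 minutes and 30.69 seconds.

```python
import re


def to_seconds(stamp):
    """Convert a timing like '16m30s69' to seconds.

    Assumption: the trailing digits are hundredths of a second.
    """
    minutes, seconds, hundredths = re.fullmatch(r"(\d+)m(\d+)s(\d+)", stamp).groups()
    return int(minutes) * 60 + int(seconds) + int(hundredths) / 100


def relative_change(baseline, other):
    """Percentage by which `other` is slower (+) or faster (-) than `baseline`."""
    base = to_seconds(baseline)
    return 100 * (to_seconds(other) - base) / base


# Loaded machine, no projects_list: default gitweb vs. patched with gitweb.owner.
print(f"{relative_change('16m30s69', '16m07s40'):+.2f}%")
```

A single run per configuration cannot separate such differences from noise, so the percentages are only a rough guide.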

Bruno
--
Bruno Ribas - ***@c3sl.ufpr.br
http://web.inf.ufpr.br/ribas
C3SL: http://www.c3sl.ufpr.br