Discussion:
What's cooking in git.git (Jan 2009, #04; Mon, 19)
(too old to reply)
Junio C Hamano
2009-01-19 09:13:30 UTC
Permalink
Here are the topics that have been cooking. Commits prefixed with '-' are
only in 'pu' while commits prefixed with '+' are in 'next'. The ones
marked with '.' do not appear in any of the branches, but I am still
holding onto them.

The topics list the commits in reverse chronological order. The topics
meant to be merged to the maintenance series have "maint-" in their names.

----------------------------------------------------------------
[New Topics]

* jk/color-parse (Sat Jan 17 10:38:46 2009 -0500) 2 commits
+ expand --pretty=format color options
+ color: make it easier for non-config to parse color specs

* sb/hook-cleanup (Sat Jan 17 04:02:55 2009 +0100) 5 commits
+ run_hook(): allow more than 9 hook arguments
+ run_hook(): check the executability of the hook before filling
argv
+ api-run-command.txt: talk about run_hook()
+ Move run_hook() from builtin-commit.c into run-command.c (libgit)
+ checkout: don't crash on file checkout before running post-
checkout hook

* js/maint-all-implies-HEAD (Sat Jan 17 22:27:08 2009 -0800) 2 commits
- bundle: allow the same ref to be given more than once
- revision walker: include a detached HEAD in --all

* tr/previous-branch (Sat Jan 17 19:08:12 2009 +0100) 6 commits
- Fix parsing of @{-1}@{1}
- interpret_nth_last_branch(): avoid traversing the reflog twice
- checkout: implement "-" abbreviation, add docs and tests
- sha1_name: support @{-N} syntax in get_sha1()
- sha1_name: tweak @{-N} lookup
- checkout: implement "@{-N}" shortcut name for N-th last branch

* rs/ctype (Sat Jan 17 16:50:37 2009 +0100) 4 commits
+ Add is_regex_special()
+ Change NUL char handling of isspecial()
+ Reformat ctype.c
+ Add ctype test

* mh/unify-color (Sun Jan 18 21:39:12 2009 +0100) 2 commits
- move the color variables to color.c
- handle color.ui at a central place

* jf/am-failure-report (Sun Jan 18 19:34:31 2009 -0800) 2 commits
+ git-am: re-fix the diag message printing
+ git-am: Make it easier to see which patch failed

* cb/add-pathspec (Wed Jan 14 15:54:35 2009 +0100) 2 commits
- remove pathspec_match, use match_pathspec instead
- clean up pathspec matching

* sg/maint-gitdir-in-subdir (Fri Jan 16 16:37:33 2009 +0100) 1 commit
+ Fix gitdir detection when in subdir of gitdir

This has my "don't do the fullpath if you are directly inside .git"
squashed in, so it should be much safer.

* am/maint-push-doc (Sun Jan 18 15:36:58 2009 +0100) 4 commits
+ Documentation: avoid using undefined parameters
+ Documentation: mention branches rather than heads
+ Documentation: remove a redundant elaboration
+ Documentation: git push repository can also be a remote

* sp/runtime-prefix (Sun Jan 18 13:00:15 2009 +0100) 5 commits
- Windows: Revert to default paths and convert them by
RUNTIME_PREFIX
- Modify setup_path() to only add git_exec_path() to PATH
- Add calls to git_extract_argv0_path() in programs that call
git_config_*
- git_extract_argv0_path(): Move check for valid argv0 from caller
to callee
- Move computation of absolute paths from Makefile to runtime (in
preparation for RUNTIME_PREFIX)

----------------------------------------------------------------
[Stalled and may need help and prodding to go forward]

* jc/blame (Wed Jun 4 22:58:40 2008 -0700) 2 commits
+ blame: show "previous" information in --porcelain/--incremental
format
+ git-blame: refactor code to emit "porcelain format" output

This gives Porcelains (like gitweb) the information on the commit _before_
the one that the final blame is laid on, which should save them one
rev-parse to dig further. The line number in the "previous" information
may need refining, and sanity checking code for reference counting may
need to be resurrected before this can move forward.

* db/foreign-scm (Sun Jan 11 15:12:10 2009 -0500) 3 commits
- Support fetching from foreign VCSes
- Add specification of git-vcs helpers
- Add "vcs" config option in remotes

The "spec" did not seem quite well cooked yet, but in the longer term I
think something like this to allow interoperating with other SCMs as if
the other end is a native git repository is a very worthy goal.

----------------------------------------------------------------
[Actively cooking]

* kb/lstat-cache (Sun Jan 18 16:14:54 2009 +0100) 5 commits
+ lstat_cache(): introduce clear_lstat_cache() function
+ lstat_cache(): introduce invalidate_lstat_cache() function
+ lstat_cache(): introduce has_dirs_only_path() function
+ lstat_cache(): introduce has_symlink_or_noent_leading_path()
function
+ lstat_cache(): more cache effective symlink/directory detection

This is the tenth round, now in 'next'.

* lh/submodule-tree-traversal (Mon Jan 12 00:45:55 2009 +0100) 3 commits
- builtin-ls-tree: enable traversal of submodules
- archive.c: enable traversal of submodules
- tree.c: add support for traversal of submodules

Still getting active reviews.

* lt/maint-wrap-zlib (Wed Jan 7 19:54:47 2009 -0800) 1 commit
+ Wrap inflate and other zlib routines for better error reporting

Needs the "free our memory upon seeing Z_MEM_ERROR and try again" bits
extracted from Shawn's patch on top of this one.

* jk/signal-cleanup (Sun Jan 11 06:36:49 2009 -0500) 3 commits
- pager: do wait_for_pager on signal death
- refactor signal handling for cleanup functions
- chain kill signals for cleanup functions

Sorry, I lost track. What is the status of this one?

* js/diff-color-words (Sat Jan 17 17:29:48 2009 +0100) 7 commits
- color-words: make regex configurable via attributes
- color-words: expand docs with precise semantics
- color-words: enable REG_NEWLINE to help user
- color-words: take an optional regular expression describing words
- color-words: change algorithm to allow for 0-character word
boundaries
- color-words: refactor word splitting and use ALLOC_GROW()
- Add color_fwrite_lines(), a function coloring each line
individually

Dscho's series that was done in response to Thomas's original; two agreed
to work together on this codebase.

* ks/maint-mailinfo-folded (Tue Jan 13 01:21:04 2009 +0300) 5 commits
- mailinfo: tests for RFC2047 examples
- mailinfo: add explicit test for mails like '<***@example.com>
(A U Thor)'
- mailinfo: more smarter removal of rfc822 comments from 'From'
+ mailinfo: 'From:' header should be unfold as well
+ mailinfo: correctly handle multiline 'Subject:' header

I think "more smarter" one is too aggressive for our purpose. Perhaps not
removing comments at all would be what we want.

* js/patience-diff (Thu Jan 1 17:39:37 2009 +0100) 3 commits
+ bash completions: Add the --patience option
+ Introduce the diff option '--patience'
+ Implement the patience diff algorithm

* js/notes (Tue Jan 13 20:57:16 2009 +0100) 6 commits
+ git-notes: fix printing of multi-line notes
+ notes: fix core.notesRef documentation
+ Add an expensive test for git-notes
+ Speed up git notes lookup
+ Add a script to edit/inspect notes
+ Introduce commit notes

* sc/gitweb-category (Fri Dec 12 00:45:12 2008 +0100) 3 commits
- gitweb: Optional grouping of projects by category
- gitweb: Split git_project_list_body in two functions
- gitweb: Modularized git_get_project_description to be more generic

----------------------------------------------------------------
[Graduated to "master"]

* ds/uintmax-config (Mon Nov 3 09:14:28 2008 -0900) 1 commit
+ autoconf: Enable threaded delta search when pthreads are supported

See if anybody screams.

* gb/gitweb-opml (Fri Jan 2 13:49:30 2009 +0100) 2 commits
+ gitweb: suggest name for OPML view
+ gitweb: don't use pathinfo for global actions

* mv/apply-parse-opt (Fri Jan 9 22:21:36 2009 -0800) 2 commits
+ Resurrect "git apply --flags -" to read from the standard input
+ parse-opt: migrate builtin-apply.

* tr/rebase-root (Fri Jan 2 23:28:29 2009 +0100) 4 commits
+ rebase: update documentation for --root
+ rebase -i: learn to rebase root commit
+ rebase: learn to rebase root commit
+ rebase -i: execute hook only after argument checking

Looked reasonable.

* mh/maint-commit-color-status (Thu Jan 8 19:53:05 2009 +0100) 2 commits
+ git-status -v: color diff output when color.ui is set
+ git-commit: color status output when color.ui is set

* rs/maint-shortlog-foldline (Tue Jan 6 21:41:06 2009 +0100) 1 commit
+ shortlog: handle multi-line subjects like log --pretty=oneline et.
al. do

* rs/fgrep (Sat Jan 10 00:18:34 2009 +0100) 2 commits
+ grep: don't call regexec() for fixed strings
+ grep -w: forward to next possible position after rejected match

* as/autocorrect-alias (Sun Jan 4 18:16:01 2009 +0100) 1 commit
+ git.c: make autocorrected aliases work

* tr/maint-no-index-fixes (Wed Jan 7 12:15:30 2009 +0100) 3 commits
+ diff --no-index -q: fix endless loop
+ diff --no-index: test for pager after option parsing
+ diff: accept -- when using --no-index

* jc/maint-format-patch (Sat Jan 10 12:41:33 2009 -0800) 1 commit
+ format-patch: show patch text for the root commit

* ap/clone-into-empty (Sun Jan 11 15:19:12 2009 +0300) 2 commits
+ Allow cloning to an existing empty directory
+ add is_dot_or_dotdot inline function

* gb/gitweb-patch (Thu Dec 18 08:13:19 2008 +0100) 4 commits
+ gitweb: link to patch(es) view in commit(diff) and (short)log view
+ gitweb: add patches view
+ gitweb: change call pattern for git_commitdiff
+ gitweb: add patch view

----------------------------------------------------------------
[Will merge to "master" soon]

* kb/am-directory (Wed Jan 14 16:29:59 2009 -0800) 2 commits
+ git-am: fix shell quoting
+ git-am: add --directory=<dir> option

This is "third-time-lucky, perhaps?" resurrection. I do not think I'd be
using this very often, but it originated from a real user request.

* jc/maint-format-patch-o-relative (Mon Jan 12 15:18:02 2009 -0800) 1 commit
+ Teach format-patch to handle output directory relative to cwd

----------------------------------------------------------------
[On Hold]

* jk/renamelimit (Sat May 3 13:58:42 2008 -0700) 1 commit
. diff: enable "too large a rename" warning when -M/-C is explicitly
asked for

* jc/stripspace (Sun Mar 9 00:30:35 2008 -0800) 6 commits
. git-am --forge: add Signed-off-by: line for the author
. git-am: clean-up Signed-off-by: lines
. stripspace: add --log-clean option to clean up signed-off-by:
lines
. stripspace: use parse_options()
. Add "git am -s" test
. git-am: refactor code to add signed-off-by line for the committer

* jc/post-simplify (Fri Aug 15 01:34:51 2008 -0700) 2 commits
. revision --simplify-merges: incremental simplification
. revision --simplify-merges: prepare for incremental simplification

* jk/valgrind (Thu Oct 23 04:30:45 2008 +0000) 2 commits
. valgrind: ignore ldso errors
. add valgrind support in test scripts

* wp/add-patch-find (Thu Nov 27 04:08:03 2008 +0000) 3 commits
. In add --patch, Handle K,k,J,j slightly more gracefully.
. Add / command in add --patch
. git-add -i/-p: Change prompt separater from slash to comma

* jc/grafts (Wed Jul 2 17:14:12 2008 -0700) 1 commit
. [BROKEN wrt shallow clones] Ignore graft during object transfer

* jc/replace (Fri Oct 31 09:21:39 2008 -0700) 1 commit
. WIP
Kjetil Barvik
2009-01-19 11:54:32 UTC
Permalink
Junio C Hamano <***@pobox.com> writes:
<snipp>
Post by Junio C Hamano
----------------------------------------------------------------
[Actively cooking]
* kb/lstat-cache (Sun Jan 18 16:14:54 2009 +0100) 5 commits
+ lstat_cache(): introduce clear_lstat_cache() function
+ lstat_cache(): introduce invalidate_lstat_cache() function
+ lstat_cache(): introduce has_dirs_only_path() function
+ lstat_cache(): introduce has_symlink_or_noent_leading_path()
function
+ lstat_cache(): more cache effective symlink/directory detection
This is the tenth round, now in 'next'.
Thanks!! Nice to see that the patch is going forward. And I have to
admit that it was very fun to make that patch.

How long is the 'merge window' in Linux Kernel terms for this round
(to the next release of GIT)?

I have a second idea to an improvement, which also looks quite good
for the moment, and I am sort of wondering how fast I must work. :-)

-- kjetil
Johannes Schindelin
2009-01-19 13:09:48 UTC
Permalink
Hi,
Post by Kjetil Barvik
How long is the 'merge window' in Linux Kernel terms for this round
(to the next release of GIT)?
There is no "merge window", but rather an "-rc" period for 1.x versions.
Which has been largely ignored :-)

Ciao,
Dscho
Johannes Schindelin
2009-01-19 13:08:48 UTC
Permalink
Hi,
Post by Junio C Hamano
* js/diff-color-words (Sat Jan 17 17:29:48 2009 +0100) 7 commits
- color-words: make regex configurable via attributes
- color-words: expand docs with precise semantics
- color-words: enable REG_NEWLINE to help user
- color-words: take an optional regular expression describing words
- color-words: change algorithm to allow for 0-character word
boundaries
- color-words: refactor word splitting and use ALLOC_GROW()
- Add color_fwrite_lines(), a function coloring each line
individually
Dscho's series that was done in response to Thomas's original; two agreed
to work together on this codebase.
I am actually pretty comfortable with this series now.
Post by Junio C Hamano
* jk/valgrind (Thu Oct 23 04:30:45 2008 +0000) 2 commits
. valgrind: ignore ldso errors
. add valgrind support in test scripts
Could you put this in pu, at least, please?

Thanks,
Dscho
Jeff King
2009-01-20 04:44:47 UTC
Permalink
Post by Johannes Schindelin
Post by Junio C Hamano
* jk/valgrind (Thu Oct 23 04:30:45 2008 +0000) 2 commits
. valgrind: ignore ldso errors
. add valgrind support in test scripts
Could you put this in pu, at least, please?
I don't think I've really touched this since it was posted. One of the
things I didn't like about it was that the valgrind wrapper directory
was created in the Makefile. I think creating it inside the trash
directory for each test run that wants to use valgrind makes more sense
(probably as .git/valgrind, which is unlikely to hurt anything but will
stay out of the way of most of the tests).

I doubt I will have the chance to look at it anytime soon, so please
feel free to pick up the topic if you are interested.

-Peff
Johannes Schindelin
2009-01-20 13:51:49 UTC
Permalink
Hi,
One of the things I didn't like about it was that the valgrind wrapper
directory was created in the Makefile.
I agree.
I think creating it inside the trash directory for each test run that
wants to use valgrind makes more sense (probably as .git/valgrind, which
is unlikely to hurt anything but will stay out of the way of most of the
tests).
Here I disagree. But I think that test-lib.sh should create it on-demand,
and it should traverse all executables in all paths listed in $PATH,
replacing the ones that start with "git-" ("git" itself should be the
first one) that are no scripts by symlinks to the valgrind script (which
should therefore live in t/), and those that _are_ scripts by symlinks to
$GIT_ROOT/$NAME.

I'll work on it.

Ciao,
Dscho
Jeff King
2009-01-20 14:19:32 UTC
Permalink
Post by Johannes Schindelin
I think creating it inside the trash directory for each test run that
wants to use valgrind makes more sense (probably as .git/valgrind, which
is unlikely to hurt anything but will stay out of the way of most of the
tests).
Here I disagree. But I think that test-lib.sh should create it on-demand,
and it should traverse all executables in all paths listed in $PATH,
replacing the ones that start with "git-" ("git" itself should be the
first one) that are no scripts by symlinks to the valgrind script (which
should therefore live in t/), and those that _are_ scripts by symlinks to
$GIT_ROOT/$NAME.
How will you deal with race conditions between two simultaneously
running scripts? I.e., where are you going to put it?
Post by Johannes Schindelin
I'll work on it.
Great.

-Peff
Johannes Schindelin
2009-01-20 14:50:28 UTC
Permalink
Hi,
Post by Jeff King
Post by Johannes Schindelin
I think creating it inside the trash directory for each test run
that wants to use valgrind makes more sense (probably as
.git/valgrind, which is unlikely to hurt anything but will stay out
of the way of most of the tests).
Here I disagree. But I think that test-lib.sh should create it
on-demand, and it should traverse all executables in all paths listed
in $PATH, replacing the ones that start with "git-" ("git" itself
should be the first one) that are no scripts by symlinks to the
valgrind script (which should therefore live in t/), and those that
_are_ scripts by symlinks to $GIT_ROOT/$NAME.
How will you deal with race conditions between two simultaneously
running scripts? I.e., where are you going to put it?
There are no race conditions, as for every git executable, a symbolic link
is created, pointing to the valgrind.sh script [*1*].

Besides, what with valgrind being a memory hog, you'd be nuts to call
valgrinded scripts simultaneously.

Ciao,
Dscho

[*1*] Before anybody complains about symbolic links not being available on
Windows, or $GIT_SHELL not being heeded by the valgrind.sh script: get
valgrind to compile on those platforms, and _then_ we'll talk again.
Johannes Schindelin
2009-01-20 15:04:28 UTC
Permalink
This patch adds the ability to use valgrind's memcheck tool to
diagnose memory problems in git while running the test scripts. It
works by placing a "fake" git in the front of the test script's PATH;
this fake git runs the real git under valgrind. It also points the
exec-path such that any stand-alone dashed git programs are run using
the same script. In this way we avoid having to modify the actual git
code in any way.

To be certain that every call to any git executable is intercepted,
the PATH is searched for executables beginning with "git-"; Scripts
are excluded however.

Valgrind can be used by specifying "GIT_TEST_OPTS=--valgrind" in the
make invocation. Any invocation of git that finds any errors under
valgrind will exit with failure code 126. Any valgrind output will go
to the usual stderr channel for tests (i.e., /dev/null, unless -v has
been specified).

If you need to pass options to valgrind -- you might want to run
another tool than memcheck, for example -- you can set the environment
variable GIT_VALGRIND_OPTIONS.

A few default suppressions are included, since libz seems to
trigger quite a few false positives. We'll assume that libz
works and that we can ignore any errors which are reported
there.

Initial patch and all the hard work by Jeff King.

Signed-off-by: Johannes Schindelin <***@gmx.de>
---

AFAIR libz reserves only aligned memory, and does all operations
in an aligned manner, but it is safe (even if it accesses
uninitialized memory, it does not use the results anyway).

t/test-lib.sh | 39 +++++++++++++++++++++++++++++++++++++--
t/valgrind/.gitignore | 2 ++
t/valgrind/default.supp | 21 +++++++++++++++++++++
t/valgrind/valgrind.sh | 12 ++++++++++++
4 files changed, 72 insertions(+), 2 deletions(-)
create mode 100644 t/valgrind/.gitignore
create mode 100644 t/valgrind/default.supp
create mode 100755 t/valgrind/valgrind.sh

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 41d5a59..1daae9b 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -94,6 +94,8 @@ do
--no-python)
# noop now...
shift ;;
+ --va|--val|--valg|--valgr|--valgri|--valgrin|--valgrind)
+ valgrind=t; shift ;;
*)
break ;;
esac
@@ -467,8 +469,41 @@ test_done () {
# Test the binaries we have just built. The tests are kept in
# t/ subdirectory and are run in 'trash directory' subdirectory.
TEST_DIRECTORY=$(pwd)
-PATH=$TEST_DIRECTORY/..:$PATH
-GIT_EXEC_PATH=$(pwd)/..
+if test -z "$valgrind"
+then
+ PATH=$TEST_DIRECTORY/..:$PATH
+ GIT_EXEC_PATH=$TEST_DIRECTORY/..
+else
+ # override all git executables in PATH and TEST_DIRECTORY/..
+ GIT_VALGRIND=$TEST_DIRECTORY/valgrind
+ mkdir -p "$GIT_VALGRIND"
+ OLDIFS=$IFS
+ IFS=:
+ for path in $PATH:$TEST_DIRECTORY/..
+ do
+ ls "$TEST_DIRECTORY"/../git "$path"/git-* 2> /dev/null |
+ while read file
+ do
+ # handle only executables
+ test -x "$file" || continue
+
+ base=$(basename "$file")
+ test ! -h "$GIT_VALGRIND"/"$base" || continue
+
+ if test "#!" = "$(head -c 2 < "$file")"
+ then
+ # do not override scripts
+ ln -s ../../"$base" "$GIT_VALGRIND"/"$base"
+ else
+ ln -s valgrind.sh "$GIT_VALGRIND"/"$base"
+ fi
+ done
+ done
+ IFS=$OLDIFS
+ PATH=$GIT_VALGRIND:$PATH
+ GIT_EXEC_PATH=$GIT_VALGRIND
+ export GIT_VALGRIND
+fi
GIT_TEMPLATE_DIR=$(pwd)/../templates/blt
unset GIT_CONFIG
GIT_CONFIG_NOSYSTEM=1
diff --git a/t/valgrind/.gitignore b/t/valgrind/.gitignore
new file mode 100644
index 0000000..d781a63
--- /dev/null
+++ b/t/valgrind/.gitignore
@@ -0,0 +1,2 @@
+/git
+/git-*
diff --git a/t/valgrind/default.supp b/t/valgrind/default.supp
new file mode 100644
index 0000000..2482b3b
--- /dev/null
+++ b/t/valgrind/default.supp
@@ -0,0 +1,21 @@
+{
+ ignore-zlib-errors-cond
+ Memcheck:Cond
+ obj:*libz.so*
+}
+
+{
+ ignore-zlib-errors-value4
+ Memcheck:Value4
+ obj:*libz.so*
+}
+
+{
+ writing-data-from-zlib-triggers-errors
+ Memcheck:Param
+ write(buf)
+ obj:/lib/ld-*.so
+ fun:write_in_full
+ fun:write_buffer
+ fun:write_loose_object
+}
diff --git a/t/valgrind/valgrind.sh b/t/valgrind/valgrind.sh
new file mode 100755
index 0000000..24f3a4e
--- /dev/null
+++ b/t/valgrind/valgrind.sh
@@ -0,0 +1,12 @@
+#!/bin/sh
+
+base=$(basename "$0")
+
+exec valgrind -q --error-exitcode=126 \
+ --leak-check=no \
+ --suppressions="$GIT_VALGRIND/default.supp" \
+ --gen-suppressions=all \
+ --log-fd=4 \
+ --input-fd=4 \
+ $GIT_VALGRIND_OPTIONS \
+ "$GIT_VALGRIND"/../../"$base" "$@"
--
1.6.1.243.g6c8bb35
Johannes Schindelin
2009-01-20 15:05:03 UTC
Permalink
From: Jeff King <***@peff.net>

On some Linux systems, we get a host of Cond and Addr errors
from calls to dlopen that are caused by nss modules. We
should be able to safely ignore anything happening in
ld-*.so as "not our problem."

Signed-off-by: Jeff King <***@peff.net>
Signed-off-by: Junio C Hamano <***@pobox.com>
---

This is Peff's patch, unchanged.

t/valgrind/default.supp | 12 ++++++++++++
1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/t/valgrind/default.supp b/t/valgrind/default.supp
index 2482b3b..1013847 100644
--- a/t/valgrind/default.supp
+++ b/t/valgrind/default.supp
@@ -11,6 +11,18 @@
}

{
+ ignore-ldso-cond
+ Memcheck:Cond
+ obj:*ld-*.so
+}
+
+{
+ ignore-ldso-addr8
+ Memcheck:Addr8
+ obj:*ld-*.so
+}
+
+{
writing-data-from-zlib-triggers-errors
Memcheck:Param
write(buf)
--
1.6.1.243.g6c8bb35
Jeff King
2009-01-21 00:12:19 UTC
Permalink
Post by Johannes Schindelin
+else
+ # override all git executables in PATH and TEST_DIRECTORY/..
+ GIT_VALGRIND=$TEST_DIRECTORY/valgrind
+ mkdir -p "$GIT_VALGRIND"
Isn't this mkdir unnecessary, since it is actually part of the
repository (i.e., there is a gitignore there already).

However, I think it makes more sense to put the symlink cruft into
"$GIT_VALGRIND/bin". That way you can clean up the cruft very easily. In
which case you do need to "mkdir" that directory.
Post by Johannes Schindelin
+ OLDIFS=$IFS
+ for path in $PATH:$TEST_DIRECTORY/..
+ do
+ ls "$TEST_DIRECTORY"/../git "$path"/git-* 2> /dev/null |
Why aren't these both "$path"/ ?

But more importantly, do we really need to bother overriding the whole
$PATH? In theory, we aren't calling anything git-* that isn't in
"$TEST_DIRECTORY/..". And while it might be nice to catch it if we do,
it seems like detecting that is totally orthogonal to running valgrind,
and we get different behavior from valgrind versus not. And I think the
two should be as similar as possible (with the obvious except of
actually, you know, running valgrind).
Post by Johannes Schindelin
+ base=$(basename "$file")
+ test ! -h "$GIT_VALGRIND"/"$base" || continue
+
+ if test "#!" = "$(head -c 2 < "$file")"
+ then
+ # do not override scripts
+ ln -s ../../"$base" "$GIT_VALGRIND"/"$base"
+ else
+ ln -s valgrind.sh "$GIT_VALGRIND"/"$base"
+ fi
It would be nice to actually detect errors. But you have to
differentiate between EEXIST and other errors, which is a pain. And you
can't use "ln -sf" because it isn't atomic.

Copying would solve that (provided you copied to a tempfile and did
an atomic rename). Or writing this snippet as a C helper.
Post by Johannes Schindelin
--- /dev/null
+++ b/t/valgrind/valgrind.sh
@@ -0,0 +1,12 @@
+#!/bin/sh
+
+base=$(basename "$0")
+
+exec valgrind -q --error-exitcode=126 \
+ --leak-check=no \
+ --suppressions="$GIT_VALGRIND/default.supp" \
+ --gen-suppressions=all \
+ --log-fd=4 \
+ --input-fd=4 \
+ $GIT_VALGRIND_OPTIONS \
Hm. My version had to do some magic with the GIT_EXEC_PATH, but I think
that is because I didn't set GIT_EXEC_PATH in the first place. If yours
works (and I haven't really tested it -- I remember it being a real pain
in the butt to make sure valgrind was getting called from every code
path), then I like your approach much better.

-Peff
Johannes Schindelin
2009-01-21 00:41:13 UTC
Permalink
Hi,
Post by Jeff King
Post by Johannes Schindelin
+else
+ # override all git executables in PATH and TEST_DIRECTORY/..
+ GIT_VALGRIND=$TEST_DIRECTORY/valgrind
+ mkdir -p "$GIT_VALGRIND"
Isn't this mkdir unnecessary, since it is actually part of the
repository (i.e., there is a gitignore there already).
However, I think it makes more sense to put the symlink cruft into
"$GIT_VALGRIND/bin". That way you can clean up the cruft very easily. In
which case you do need to "mkdir" that directory.
Hmm. I actually liked the hierarchy to be shallow, but I could be
convinced...
Post by Jeff King
Post by Johannes Schindelin
+ OLDIFS=$IFS
+ for path in $PATH:$TEST_DIRECTORY/..
+ do
+ ls "$TEST_DIRECTORY"/../git "$path"/git-* 2> /dev/null |
Why aren't these both "$path"/ ?
Yeah. Makes it more readable, doesn't it?
Post by Jeff King
But more importantly, do we really need to bother overriding the whole
$PATH? In theory, we aren't calling anything git-* that isn't in
"$TEST_DIRECTORY/..". And while it might be nice to catch it if we do,
it seems like detecting that is totally orthogonal to running valgrind,
and we get different behavior from valgrind versus not. And I think the
two should be as similar as possible (with the obvious except of
actually, you know, running valgrind).
Actually, the two _are_ orthogonal from the technical viewpoint.

But with the infrastructure we have in place, it was already very easy to
make sure that calls to a Git program we no longer ship are caught.

I vividly remember such a bug costing me 3 hours of my life, and a few
hairs.

So I think "as it's already _that_ easy, we should catch them bugs, too".

Needs some documentation though, I agree.
Post by Jeff King
Post by Johannes Schindelin
+ base=$(basename "$file")
+ test ! -h "$GIT_VALGRIND"/"$base" || continue
+
+ if test "#!" = "$(head -c 2 < "$file")"
+ then
+ # do not override scripts
+ ln -s ../../"$base" "$GIT_VALGRIND"/"$base"
+ else
+ ln -s valgrind.sh "$GIT_VALGRIND"/"$base"
+ fi
It would be nice to actually detect errors. But you have to
differentiate between EEXIST and other errors, which is a pain. And you
can't use "ln -sf" because it isn't atomic.
I really would not care all that much about that.
'GIT_TEST_OPTS==--valgrind make test' should be run by experts. And even
if it is a dummy driving the test, the next "make" call should take care
of that.
Post by Jeff King
Copying would solve that (provided you copied to a tempfile and did
an atomic rename). Or writing this snippet as a C helper.
Nah, that is really too much work for such a rare thing. Think about it.
The symlinks are set up once. And even if you do that with -j50, there is
hardly a chance that two processes conflict with each other, and even if
they do, they do the same thing.

No, what I really want to fix is a script being replaced by a binary.
Post by Jeff King
Post by Johannes Schindelin
--- /dev/null
+++ b/t/valgrind/valgrind.sh
@@ -0,0 +1,12 @@
+#!/bin/sh
+
+base=$(basename "$0")
+
+exec valgrind -q --error-exitcode=126 \
+ --leak-check=no \
+ --suppressions="$GIT_VALGRIND/default.supp" \
+ --gen-suppressions=all \
+ --log-fd=4 \
+ --input-fd=4 \
+ $GIT_VALGRIND_OPTIONS \
Hm. My version had to do some magic with the GIT_EXEC_PATH, but I think
that is because I didn't set GIT_EXEC_PATH in the first place. If yours
works (and I haven't really tested it -- I remember it being a real pain
in the butt to make sure valgrind was getting called from every code
path), then I like your approach much better.
I set GIT_EXEC_PATH... to $GIT_VALGRIND.

Ciao,
Dscho
Johannes Schindelin
2009-01-21 01:10:17 UTC
Permalink
This patch adds the ability to use valgrind's memcheck tool to
diagnose memory problems in git while running the test scripts. It
works by placing a "fake" git in the front of the test script's PATH;
this fake git runs the real git under valgrind. It also points the
exec-path such that any stand-alone dashed git programs are run using
the same script. In this way we avoid having to modify the actual git
code in any way.

To be certain that every call to any git executable is intercepted,
the PATH is searched for executables beginning with "git-"; Scripts
are excluded however.

Valgrind can be used by specifying "GIT_TEST_OPTS=--valgrind" in the
make invocation. Any invocation of git that finds any errors under
valgrind will exit with failure code 126. Any valgrind output will go
to the usual stderr channel for tests (i.e., /dev/null, unless -v has
been specified).

If you need to pass options to valgrind -- you might want to run
another tool than memcheck, for example -- you can set the environment
variable GIT_VALGRIND_OPTIONS.

A few default suppressions are included, since libz seems to
trigger quite a few false positives. We'll assume that libz
works and that we can ignore any errors which are reported
there.

Initial patch and all the hard work by Jeff King.

Signed-off-by: Johannes Schindelin <***@gmx.de>
---

Changes vs v1:

- the symlinks will be created in t/valgrind/bin/ again, as it is
easier to remove the whole directory than weed out the unwanted
files from t/valgrind/.

- symbolic links are inspected for correct targets now, and if they
point somewhere else than expected, they are removed (this can
error out if the file could not be removed) and recreated.

- if the executable ends in ".sh" or ".perl", the target will be
set to a non-existing file (to catch invocations, erroring out).

- the Git binaries from the root are actually found now (IFS is
only interpreted after the file is parsed, it seems).

- added rudimentary documentation to t/README.

Interdiff to follow.

t/README | 6 ++++-
t/test-lib.sh | 49 +++++++++++++++++++++++++++++++++++++++++++++-
t/valgrind/.gitignore | 1 +
t/valgrind/default.supp | 21 ++++++++++++++++++++
t/valgrind/valgrind.sh | 12 +++++++++++
5 files changed, 86 insertions(+), 3 deletions(-)
create mode 100644 t/valgrind/.gitignore
create mode 100644 t/valgrind/default.supp
create mode 100755 t/valgrind/valgrind.sh

diff --git a/t/README b/t/README
index 8f12d48..0cee429 100644
--- a/t/README
+++ b/t/README
@@ -39,7 +39,8 @@ this:
* passed all 3 test(s)

You can pass --verbose (or -v), --debug (or -d), and --immediate
-(or -i) command line argument to the test.
+(or -i) command line argument to the test, or by setting GIT_TEST_OPTS
+appropriately before running "make".

--verbose::
This makes the test more verbose. Specifically, the
@@ -58,6 +59,9 @@ You can pass --verbose (or -v), --debug (or -d), and --immediate
This causes additional long-running tests to be run (where
available), for more exhaustive testing.

+--valgrind::
+ Execute all Git binaries with valgrind and stop on errors (the
+ exit code will be 126).

Skipping Tests
--------------
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 79f69de..f031905 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -94,6 +94,8 @@ do
--no-python)
# noop now...
shift ;;
+ --va|--val|--valg|--valgr|--valgri|--valgrin|--valgrind)
+ valgrind=t; shift ;;
*)
break ;;
esac
@@ -480,8 +482,51 @@ test_done () {
# Test the binaries we have just built. The tests are kept in
# t/ subdirectory and are run in 'trash directory' subdirectory.
TEST_DIRECTORY=$(pwd)
-PATH=$TEST_DIRECTORY/..:$PATH
-GIT_EXEC_PATH=$(pwd)/..
+if test -z "$valgrind"
+then
+ PATH=$TEST_DIRECTORY/..:$PATH
+ GIT_EXEC_PATH=$TEST_DIRECTORY/..
+else
+ # override all git executables in PATH and TEST_DIRECTORY/..
+ GIT_VALGRIND=$TEST_DIRECTORY/valgrind/bin
+ mkdir -p "$GIT_VALGRIND"
+ OLDIFS=$IFS
+ IFS=:
+ for path in $PATH $TEST_DIRECTORY/..
+ do
+ ls "$path"/git "$path"/git-* 2> /dev/null |
+ while read file
+ do
+ # handle only executables
+ test -x "$file" && test ! -d "$file" || continue
+
+ base=$(basename "$file")
+ symlink_target=$TEST_DIRECTORY/../$base
+ # do not override scripts
+ if test -x "$symlink_target" &&
+ test "#!" != "$(head -c 2 < "$symlink_target")"
+ then
+ symlink_target=../valgrind.sh
+ fi
+ case "$base" in
+ *.sh|*.perl)
+ symlink_target=../unprocessed-script
+ esac
+ # create the link, or replace it if it is out of date
+ if test ! -h "$GIT_VALGRIND"/"$base" ||
+ test "$symlink_target" != \
+ "$(readlink "$GIT_VALGRIND"/"$base")"
+ then
+ rm -f "$GIT_VALGRIND"/"$base" || exit
+ ln -s "$symlink_target" "$GIT_VALGRIND"/"$base"
+ fi
+ done
+ done
+ IFS=$OLDIFS
+ PATH=$GIT_VALGRIND:$PATH
+ GIT_EXEC_PATH=$GIT_VALGRIND
+ export GIT_VALGRIND
+fi
GIT_TEMPLATE_DIR=$(pwd)/../templates/blt
unset GIT_CONFIG
GIT_CONFIG_NOSYSTEM=1
diff --git a/t/valgrind/.gitignore b/t/valgrind/.gitignore
new file mode 100644
index 0000000..ae3c172
--- /dev/null
+++ b/t/valgrind/.gitignore
@@ -0,0 +1 @@
+/bin/
diff --git a/t/valgrind/default.supp b/t/valgrind/default.supp
new file mode 100644
index 0000000..2482b3b
--- /dev/null
+++ b/t/valgrind/default.supp
@@ -0,0 +1,21 @@
+{
+ ignore-zlib-errors-cond
+ Memcheck:Cond
+ obj:*libz.so*
+}
+
+{
+ ignore-zlib-errors-value4
+ Memcheck:Value4
+ obj:*libz.so*
+}
+
+{
+ writing-data-from-zlib-triggers-errors
+ Memcheck:Param
+ write(buf)
+ obj:/lib/ld-*.so
+ fun:write_in_full
+ fun:write_buffer
+ fun:write_loose_object
+}
diff --git a/t/valgrind/valgrind.sh b/t/valgrind/valgrind.sh
new file mode 100755
index 0000000..2c4b54b
--- /dev/null
+++ b/t/valgrind/valgrind.sh
@@ -0,0 +1,12 @@
+#!/bin/sh
+
+base=$(basename "$0")
+
+exec valgrind -q --error-exitcode=126 \
+ --leak-check=no \
+ --suppressions="$GIT_VALGRIND/../default.supp" \
+ --gen-suppressions=all \
+ --log-fd=4 \
+ --input-fd=4 \
+ $GIT_VALGRIND_OPTIONS \
+ "$GIT_VALGRIND"/../../../"$base" "$@"
--
1.6.1.243.g6c8bb35
Johannes Schindelin
2009-01-21 01:11:37 UTC
Permalink
t/README | 6 +++++-
t/test-lib.sh | 32 +++++++++++++++++++++-----------
t/valgrind/.gitignore | 3 +--
t/valgrind/valgrind.sh | 4 ++--
4 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/t/README b/t/README
index 8f12d48..0cee429 100644
--- a/t/README
+++ b/t/README
@@ -39,7 +39,8 @@ this:
* passed all 3 test(s)

You can pass --verbose (or -v), --debug (or -d), and --immediate
-(or -i) command line argument to the test.
+(or -i) command line argument to the test, or by setting GIT_TEST_OPTS
+appropriately before running "make".

--verbose::
This makes the test more verbose. Specifically, the
@@ -58,6 +59,9 @@ You can pass --verbose (or -v), --debug (or -d), and --immediate
This causes additional long-running tests to be run (where
available), for more exhaustive testing.

+--valgrind::
+ Execute all Git binaries with valgrind and stop on errors (the
+ exit code will be 126).

Skipping Tests
--------------
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 6bd893d..f031905 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -488,27 +488,37 @@ then
GIT_EXEC_PATH=$TEST_DIRECTORY/..
else
# override all git executables in PATH and TEST_DIRECTORY/..
- GIT_VALGRIND=$TEST_DIRECTORY/valgrind
+ GIT_VALGRIND=$TEST_DIRECTORY/valgrind/bin
mkdir -p "$GIT_VALGRIND"
OLDIFS=$IFS
IFS=:
- for path in $PATH:$TEST_DIRECTORY/..
+ for path in $PATH $TEST_DIRECTORY/..
do
- ls "$TEST_DIRECTORY"/../git "$path"/git-* 2> /dev/null |
+ ls "$path"/git "$path"/git-* 2> /dev/null |
while read file
do
# handle only executables
- test -x "$file" || continue
+ test -x "$file" && test ! -d "$file" || continue

base=$(basename "$file")
- test ! -h "$GIT_VALGRIND"/"$base" || continue
-
- if test "#!" = "$(head -c 2 < "$file")"
+ symlink_target=$TEST_DIRECTORY/../$base
+ # do not override scripts
+ if test -x "$symlink_target" &&
+ test "#!" != "$(head -c 2 < "$symlink_target")"
+ then
+ symlink_target=../valgrind.sh
+ fi
+ case "$base" in
+ *.sh|*.perl)
+ symlink_target=../unprocessed-script
+ esac
+ # create the link, or replace it if it is out of date
+ if test ! -h "$GIT_VALGRIND"/"$base" ||
+ test "$symlink_target" != \
+ "$(readlink "$GIT_VALGRIND"/"$base")"
then
- # do not override scripts
- ln -s ../../"$base" "$GIT_VALGRIND"/"$base"
- else
- ln -s valgrind.sh "$GIT_VALGRIND"/"$base"
+ rm -f "$GIT_VALGRIND"/"$base" || exit
+ ln -s "$symlink_target" "$GIT_VALGRIND"/"$base"
fi
done
done
diff --git a/t/valgrind/.gitignore b/t/valgrind/.gitignore
index d781a63..ae3c172 100644
--- a/t/valgrind/.gitignore
+++ b/t/valgrind/.gitignore
@@ -1,2 +1 @@
-/git
-/git-*
+/bin/
diff --git a/t/valgrind/valgrind.sh b/t/valgrind/valgrind.sh
index 24f3a4e..2c4b54b 100755
--- a/t/valgrind/valgrind.sh
+++ b/t/valgrind/valgrind.sh
@@ -4,9 +4,9 @@ base=$(basename "$0")

exec valgrind -q --error-exitcode=126 \
--leak-check=no \
- --suppressions="$GIT_VALGRIND/default.supp" \
+ --suppressions="$GIT_VALGRIND/../default.supp" \
--gen-suppressions=all \
--log-fd=4 \
--input-fd=4 \
$GIT_VALGRIND_OPTIONS \
- "$GIT_VALGRIND"/../../"$base" "$@"
+ "$GIT_VALGRIND"/../../../"$base" "$@"
Junio C Hamano
2009-01-21 08:48:39 UTC
Permalink
Post by Johannes Schindelin
This patch adds the ability to use valgrind's memcheck tool to
diagnose memory problems in git while running the test scripts....
Hmmm, why do I haf to suffer with these new warnings from the tests?

$ sh t2012-checkout-last.sh --valgrind -v -i
warning: templates not found /git/git.git/t/valgrind/bin/templates/blt/
Initialized empty Git repository in /git/git.git/t/trash directory.t2012-checkout-last/.git/
mv: cannot stat `.git/hooks': No such file or directory
* expecting success:
echo hello >world &&

Am I using the patch incorrectly somehow?
Johannes Schindelin
2009-01-21 12:21:07 UTC
Permalink
Hi,
Post by Junio C Hamano
Post by Johannes Schindelin
This patch adds the ability to use valgrind's memcheck tool to
diagnose memory problems in git while running the test scripts....
Hmmm, why do I haf to suffer with these new warnings from the tests?
$ sh t2012-checkout-last.sh --valgrind -v -i
warning: templates not found /git/git.git/t/valgrind/bin/templates/blt/
Initialized empty Git repository in /git/git.git/t/trash directory.t2012-checkout-last/.git/
mv: cannot stat `.git/hooks': No such file or directory
echo hello >world &&
Am I using the patch incorrectly somehow?
Nope, I overlooked that GIT_EXEC_PATH was used by test-lib also to
determine the location of the templates. Will squash this in (which
makes a function out of the symlink business, and also fixes the error
that git-gui/ was tested if it is a script; "head" complained that it is
not a file):

-- snipsnap --
t/test-lib.sh | 22 ++++++++++++++--------
1 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index f031905..6acc6e0 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -487,6 +487,14 @@ then
PATH=$TEST_DIRECTORY/..:$PATH
GIT_EXEC_PATH=$TEST_DIRECTORY/..
else
+ make_symlink () {
+ test -h "$2" &&
+ test "$1" = "$(readlink "$2")" || {
+ rm -f "$2" &&
+ ln -s "$1" "$2"
+ }
+ }
+
# override all git executables in PATH and TEST_DIRECTORY/..
GIT_VALGRIND=$TEST_DIRECTORY/valgrind/bin
mkdir -p "$GIT_VALGRIND"
@@ -498,12 +506,13 @@ else
while read file
do
# handle only executables
- test -x "$file" && test ! -d "$file" || continue
+ test -x "$file" || continue

base=$(basename "$file")
symlink_target=$TEST_DIRECTORY/../$base
# do not override scripts
if test -x "$symlink_target" &&
+ test ! -d "$symlink_target" &&
test "#!" != "$(head -c 2 < "$symlink_target")"
then
symlink_target=../valgrind.sh
@@ -513,19 +522,16 @@ else
symlink_target=../unprocessed-script
esac
# create the link, or replace it if it is out of date
- if test ! -h "$GIT_VALGRIND"/"$base" ||
- test "$symlink_target" != \
- "$(readlink "$GIT_VALGRIND"/"$base")"
- then
- rm -f "$GIT_VALGRIND"/"$base" || exit
- ln -s "$symlink_target" "$GIT_VALGRIND"/"$base"
- fi
+ make_symlink "$symlink_target" "$GIT_VALGRIND/$base" ||
+ exit
done
done
IFS=$OLDIFS
PATH=$GIT_VALGRIND:$PATH
GIT_EXEC_PATH=$GIT_VALGRIND
export GIT_VALGRIND
+
+ make_symlink ../../../templates "$GIT_VALGRIND"/templates || exit
fi
GIT_TEMPLATE_DIR=$(pwd)/../templates/blt
unset GIT_CONFIG
--
1.6.1.442.g38a50
Jeff King
2009-01-21 19:02:01 UTC
Permalink
Post by Johannes Schindelin
- symbolic links are inspected for correct targets now, and if they
point somewhere else than expected, they are removed (this can
error out if the file could not be removed) and recreated.
Now you _do_ have a race on this, and triggering it will cause you to
run a random version of git from your PATH, not using valgrind (instead
of running the version from the repo using valgrind). Something like:

A: execvp("git-foo")
B: oops, "git-foo" is out of date
B: rm $GIT_VALGRIND/git-foo
A: look for $GIT_VALGRIND/git-foo; not there
A: look for $PATH[1]/git-foo; ok, there it is
B: ln -s ../../git-valgrind $GIT_VALGRIND/git-foo
Post by Johannes Schindelin
+ Execute all Git binaries with valgrind and stop on errors (the
+ exit code will be 126).
It doesn't necessarily stop: it just causes the command to fail, which
causes the test to fail. Which _will_ stop if you have "-i".

Also, you might want to mention that valgrind errors go to stderr, so
using "-v" is helpful.
Post by Johannes Schindelin
+ # override all git executables in PATH and TEST_DIRECTORY/..
+ GIT_VALGRIND=$TEST_DIRECTORY/valgrind/bin
I think you should leave GIT_VALGRIND pointing to the main valgrind
directory. That way it is more convenient for people using
GIT_VALGRIND_OPTIONS to make use of GIT_VALGRIND without having to ".."
everything (for example, they may want to pick and choose suppressions
to load for their platform).
Post by Johannes Schindelin
+ case "$base" in
+ *.sh|*.perl)
+ symlink_target=../unprocessed-script
+ esac
AFAIK, this triggers an error if I try to call "git-foo.perl" directly.
What does this have to do with valgrind? Why does this error checking
happen when I run --valgrind, but _not_ otherwise?

And yes, I know the answer is "because it's easy to do here, since
--valgrind is munging the PATH anyway". But my point is that that is an
_implementation_ detail, and the external behavior to a user is
nonsensical.

The fact that there are other uses for munging the PATH than valgrind
implies to me that we should _always_ be munging the PATH like this to
catch these sorts of errors. And then "--valgrind" can just change the
way we munge.
Post by Johannes Schindelin
+ # create the link, or replace it if it is out of date
+ if test ! -h "$GIT_VALGRIND"/"$base" ||
+ test "$symlink_target" != \
+ "$(readlink "$GIT_VALGRIND"/"$base")"
+ then
readlink is not portable; it's part of GNU coreutils. Right now valgrind
basically only runs on Linux, which I think generally means that
readlink will be available (though I have no idea if there are
distributions that vary in this). However, there is an experimental
valgrind port to FreeBSD and NetBSD, which are unlikely to have
readlink.

-Peff
Johannes Schindelin
2009-01-21 20:49:14 UTC
Permalink
Hi,
Post by Jeff King
Post by Johannes Schindelin
- symbolic links are inspected for correct targets now, and if they
point somewhere else than expected, they are removed (this can
error out if the file could not be removed) and recreated.
Now you _do_ have a race on this, and triggering it will cause you to
run a random version of git from your PATH, not using valgrind (instead
A: execvp("git-foo")
B: oops, "git-foo" is out of date
B: rm $GIT_VALGRIND/git-foo
A: look for $GIT_VALGRIND/git-foo; not there
A: look for $PATH[1]/git-foo; ok, there it is
B: ln -s ../../git-valgrind $GIT_VALGRIND/git-foo
Except that A had to check the link first, and it was out-of-date already
-- except if you changed a script into a builtin _and_ run make while a
valgrinded test is called _and_ you're unlucky.
Post by Jeff King
Post by Johannes Schindelin
+ Execute all Git binaries with valgrind and stop on errors (the
+ exit code will be 126).
It doesn't necessarily stop: it just causes the command to fail, which
causes the test to fail. Which _will_ stop if you have "-i".
Also, you might want to mention that valgrind errors go to stderr, so
using "-v" is helpful.
Okay.
Post by Jeff King
Post by Johannes Schindelin
+ # override all git executables in PATH and TEST_DIRECTORY/..
+ GIT_VALGRIND=$TEST_DIRECTORY/valgrind/bin
I think you should leave GIT_VALGRIND pointing to the main valgrind
directory. That way it is more convenient for people using
GIT_VALGRIND_OPTIONS to make use of GIT_VALGRIND without having to ".."
everything (for example, they may want to pick and choose suppressions
to load for their platform).
Okay.
Post by Jeff King
Post by Johannes Schindelin
+ case "$base" in
+ *.sh|*.perl)
+ symlink_target=../unprocessed-script
+ esac
AFAIK, this triggers an error if I try to call "git-foo.perl" directly.
Yep.
Post by Jeff King
What does this have to do with valgrind?
Nothing, except that the infrastructure is there now.
Post by Jeff King
Why does this error checking happen when I run --valgrind, but _not_
otherwise?
Because we can only check for that kind of mistake in our scripts (which
the author would not realize is a mistake when running on a system where
GIT_SHELL=/bin/sh) when we redirect GIT_EXEC_PATH.

So basically, it would take a tremendous effort otherwise, but here, it is
just easy.
Post by Jeff King
And yes, I know the answer is "because it's easy to do here, since
--valgrind is munging the PATH anyway". But my point is that that is an
_implementation_ detail, and the external behavior to a user is
nonsensical.
The fact that there are other uses for munging the PATH than valgrind
implies to me that we should _always_ be munging the PATH like this to
catch these sorts of errors. And then "--valgrind" can just change the
way we munge.
Hmm. Maybe.
Post by Jeff King
Post by Johannes Schindelin
+ # create the link, or replace it if it is out of date
+ if test ! -h "$GIT_VALGRIND"/"$base" ||
+ test "$symlink_target" != \
+ "$(readlink "$GIT_VALGRIND"/"$base")"
+ then
readlink is not portable; it's part of GNU coreutils. Right now valgrind
basically only runs on Linux, which I think generally means that
readlink will be available (though I have no idea if there are
distributions that vary in this). However, there is an experimental
valgrind port to FreeBSD and NetBSD, which are unlikely to have
readlink.
As I mentioned earlier: let's bridge this bridge when we face it
(probably it involves making a test-readlink).

Or are you insisting that the patch should be reworked _now_ so that
GIT_EXEC_PATH _always_ points somewhere else?

I hope not, because then you break Windows.

Ciao,
Dscho
Jeff King
2009-01-21 21:53:18 UTC
Permalink
Post by Johannes Schindelin
Post by Jeff King
A: execvp("git-foo")
B: oops, "git-foo" is out of date
B: rm $GIT_VALGRIND/git-foo
A: look for $GIT_VALGRIND/git-foo; not there
A: look for $PATH[1]/git-foo; ok, there it is
B: ln -s ../../git-valgrind $GIT_VALGRIND/git-foo
Except that A had to check the link first, and it was out-of-date already
-- except if you changed a script into a builtin _and_ run make while a
valgrinded test is called _and_ you're unlucky.
Hrm, true. I consider running "make" in the middle of tests and
expecting them to work properly to be a bit crazy, so I guess this is
not a problem in practice.

I'll stop bugging you about race conditions for now, then. :)
Post by Johannes Schindelin
Post by Jeff King
readlink is not portable; it's part of GNU coreutils. Right now valgrind
basically only runs on Linux, which I think generally means that
readlink will be available (though I have no idea if there are
distributions that vary in this). However, there is an experimental
valgrind port to FreeBSD and NetBSD, which are unlikely to have
readlink.
As I mentioned earlier: let's bridge this bridge when we face it
(probably it involves making a test-readlink).
Actually, I am wrong. There is a stripped-down readlink that has
shipped with FreeBSD (since 4.10) and NetBSD (since 1.6). So while
readlink isn't portable, I think it should generally work on platforms
supported by valgrind.
Post by Johannes Schindelin
Or are you insisting that the patch should be reworked _now_ so that
GIT_EXEC_PATH _always_ points somewhere else?
No, I'm not insisting. It was merely a suggestion that the patch be
split into two parts so non-valgrind invocations can benefit from this
type of bug checking (and by this type I mean general PATH issues -- I
think we had some problems in the past with invoking dashed forms of
commands which were supposed to be available only via exec-path).
Post by Johannes Schindelin
I hope not, because then you break Windows.
Only if you use the same symlink technique.

-Peff
Johannes Schindelin
2009-01-21 22:38:13 UTC
Permalink
Hi,
Actually, I am wrong. There is a stripped-down readlink that has shipped
with FreeBSD (since 4.10) and NetBSD (since 1.6). So while readlink
isn't portable, I think it should generally work on platforms supported
by valgrind.
A pity. I was already working on this patch:

-- snipsnap --
[PATCH] valgrind tests: provide a "readlink" function for systems which lack it

Signed-off-by: Johannes Schindelin <***@gmx.de>
---
t/test-lib.sh | 17 +++++++++++++++++
1 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 07e657e..c2199e7 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -487,7 +487,24 @@ then
PATH=$TEST_DIRECTORY/..:$PATH
GIT_EXEC_PATH=$TEST_DIRECTORY/..
else
+ readlink -h 2> /dev/null
+ if test $? = 127
+ then
+ readlink () {
+ ls -l "$1" |
+ sed -e "s/-> \(.*\)$/\1/g"
+ # cannot use s/.* -> //, because of
+ # ln -s "a -> b" "c -> d"
+ }
+ fi
+
make_symlink () {
+ case "$1" in
+ *" -> "*)
+ die "You must be kidding me ($1)."
+ ;;
+ esac
+
test -h "$2" &&
test "$1" = "$(readlink "$2")" || {
# be super paranoid
--
1.6.1.442.g112f5
Johannes Schindelin
2009-01-25 23:18:10 UTC
Permalink
I finally decided to give in on both the lock (let's see how many races
we encounter in reality...) and the searching the PATH and handling .sh
and .perl scripts, too. The latter issue is handled by 3/3, which is up
for discussion.

Oh, and BTW, this is vs 'next', and according to my tests, valgrind finds
at least one issue.

Jeff King (1):
valgrind: ignore ldso and more libz errors

Johannes Schindelin (2):
Add valgrind support in test scripts
Valgrind support: check for more than just programming errors

t/README | 8 +++++-
t/test-lib.sh | 66 +++++++++++++++++++++++++++++++++++++++++++++-
t/valgrind/.gitignore | 1 +
t/valgrind/default.supp | 45 ++++++++++++++++++++++++++++++++
t/valgrind/valgrind.sh | 12 ++++++++
5 files changed, 129 insertions(+), 3 deletions(-)
create mode 100644 t/valgrind/.gitignore
create mode 100644 t/valgrind/default.supp
create mode 100755 t/valgrind/valgrind.sh
Johannes Schindelin
2009-01-25 23:18:50 UTC
Permalink
This patch adds the ability to use valgrind's memcheck tool to
diagnose memory problems in Git while running the test scripts.

It works by creating symlinks to a valgrind script, which have the same
name as our Git binaries, and then putting that directory in front of
the test script's PATH as well as set GIT_EXEC_PATH to that directory.

Git scripts are symlinked from that directory directly. That way, Git
binaries called by Git scripts are valgrinded, too.

Valgrind can be used by specifying "GIT_TEST_OPTS=--valgrind" in the
make invocation. Any invocation of git that finds any errors under
valgrind will exit with failure code 126. Any valgrind output will go
to the usual stderr channel for tests (i.e., /dev/null, unless -v has
been specified).

If you need to pass options to valgrind -- you might want to run
another tool than memcheck, for example -- you can set the environment
variable GIT_VALGRIND_OPTIONS.

A few default suppressions are included, since libz seems to trigger
quite a few false positives. We'll assume that libz works and that we
can ignore any errors which are reported there.

Note: it is safe to run the valgrind tests in parallel, as the links in
t/valgrind/bin/ are created using proper locking.

Initial patch and all the hard work by Jeff King.

Signed-off-by: Johannes Schindelin <***@gmx.de>
---
t/README | 8 ++++++-
t/test-lib.sh | 54 +++++++++++++++++++++++++++++++++++++++++++++-
t/valgrind/.gitignore | 1 +
t/valgrind/default.supp | 21 ++++++++++++++++++
t/valgrind/valgrind.sh | 12 ++++++++++
5 files changed, 93 insertions(+), 3 deletions(-)
create mode 100644 t/valgrind/.gitignore
create mode 100644 t/valgrind/default.supp
create mode 100755 t/valgrind/valgrind.sh

diff --git a/t/README b/t/README
index 8f12d48..811bc0d 100644
--- a/t/README
+++ b/t/README
@@ -39,7 +39,8 @@ this:
* passed all 3 test(s)

You can pass --verbose (or -v), --debug (or -d), and --immediate
-(or -i) command line argument to the test.
+(or -i) command line argument to the test, or by setting GIT_TEST_OPTS
+appropriately before running "make".

--verbose::
This makes the test more verbose. Specifically, the
@@ -58,6 +59,11 @@ You can pass --verbose (or -v), --debug (or -d), and --immediate
This causes additional long-running tests to be run (where
available), for more exhaustive testing.

+--valgrind::
+ Execute all Git binaries with valgrind and exit with status
+ 126 on errors (just like regular tests, this will only stop
+ the test script when running under -i). Valgrind errors
+ go to stderr, so you might want to pass the -v option, too.

Skipping Tests
--------------
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 41d5a59..67d7883 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -94,6 +94,8 @@ do
--no-python)
# noop now...
shift ;;
+ --va|--val|--valg|--valgr|--valgri|--valgrin|--valgrind)
+ valgrind=t; shift ;;
*)
break ;;
esac
@@ -467,8 +469,56 @@ test_done () {
# Test the binaries we have just built. The tests are kept in
# t/ subdirectory and are run in 'trash directory' subdirectory.
TEST_DIRECTORY=$(pwd)
-PATH=$TEST_DIRECTORY/..:$PATH
-GIT_EXEC_PATH=$(pwd)/..
+if test -z "$valgrind"
+then
+ PATH=$TEST_DIRECTORY/..:$PATH
+ GIT_EXEC_PATH=$TEST_DIRECTORY/..
+else
+ make_symlink () {
+ test -h "$2" &&
+ test "$1" = "$(readlink "$2")" || {
+ # be super paranoid
+ if mkdir "$2".lock
+ then
+ rm -f "$2" &&
+ ln -s "$1" "$2" &&
+ rm -r "$2".lock
+ else
+ while test -d "$2".lock
+ do
+ say "Waiting for lock on $2."
+ sleep 1
+ done
+ fi
+ }
+ }
+
+ # override all git executables in TEST_DIRECTORY/..
+ GIT_VALGRIND=$TEST_DIRECTORY/valgrind
+ mkdir -p "$GIT_VALGRIND"/bin
+ ls $TEST_DIRECTORY/../git* 2> /dev/null |
+ while read symlink_target
+ do
+ # handle only executables
+ test -x "$symlink_target" || continue
+
+ base=$(basename "$symlink_target")
+ # do not override scripts
+ if test ! -d "$symlink_target" &&
+ test "#!" != "$(head -c 2 < "$symlink_target")"
+ then
+ symlink_target=../valgrind.sh
+ fi
+ # create the link, or replace it if it is out of date
+ make_symlink "$symlink_target" \
+ "$GIT_VALGRIND/bin/$base" || exit
+ done
+ PATH=$GIT_VALGRIND/bin:$PATH
+ GIT_EXEC_PATH=$GIT_VALGRIND/bin
+ export GIT_VALGRIND
+
+ make_symlink ../../templates "$GIT_VALGRIND"/templates || exit
+fi
GIT_TEMPLATE_DIR=$(pwd)/../templates/blt
unset GIT_CONFIG
GIT_CONFIG_NOSYSTEM=1
diff --git a/t/valgrind/.gitignore b/t/valgrind/.gitignore
new file mode 100644
index 0000000..ae3c172
--- /dev/null
+++ b/t/valgrind/.gitignore
@@ -0,0 +1 @@
+/bin/
diff --git a/t/valgrind/default.supp b/t/valgrind/default.supp
new file mode 100644
index 0000000..2482b3b
--- /dev/null
+++ b/t/valgrind/default.supp
@@ -0,0 +1,21 @@
+{
+ ignore-zlib-errors-cond
+ Memcheck:Cond
+ obj:*libz.so*
+}
+
+{
+ ignore-zlib-errors-value4
+ Memcheck:Value4
+ obj:*libz.so*
+}
+
+{
+ writing-data-from-zlib-triggers-errors
+ Memcheck:Param
+ write(buf)
+ obj:/lib/ld-*.so
+ fun:write_in_full
+ fun:write_buffer
+ fun:write_loose_object
+}
diff --git a/t/valgrind/valgrind.sh b/t/valgrind/valgrind.sh
new file mode 100755
index 0000000..24f3a4e
--- /dev/null
+++ b/t/valgrind/valgrind.sh
@@ -0,0 +1,12 @@
+#!/bin/sh
+
+base=$(basename "$0")
+
+exec valgrind -q --error-exitcode=126 \
+ --leak-check=no \
+ --suppressions="$GIT_VALGRIND/default.supp" \
+ --gen-suppressions=all \
+ --log-fd=4 \
+ --input-fd=4 \
+ $GIT_VALGRIND_OPTIONS \
+ "$GIT_VALGRIND"/../../"$base" "$@"
--
1.6.1.482.g7d54be
Jeff King
2009-01-25 23:29:55 UTC
Permalink
Post by Johannes Schindelin
Note: it is safe to run the valgrind tests in parallel, as the links in
t/valgrind/bin/ are created using proper locking.
I actually kind of liked the original atomic version over the one with
locking. But I find this one acceptable.
Post by Johannes Schindelin
Initial patch and all the hard work by Jeff King.
I don't know that there is much of my work left in here, but feel free
to add:

Signed-off-by: Jeff King <***@peff.net>

-Peff
Johannes Schindelin
2009-01-25 23:35:31 UTC
Permalink
Hi,
Post by Jeff King
Post by Johannes Schindelin
Note: it is safe to run the valgrind tests in parallel, as the links
in t/valgrind/bin/ are created using proper locking.
I actually kind of liked the original atomic version over the one with
locking. But I find this one acceptable.
The locking is only in there because of you...
Post by Jeff King
Post by Johannes Schindelin
Initial patch and all the hard work by Jeff King.
I don't know that there is much of my work left in here, but feel free
Will do!

Ciao,
Dscho
Jeff King
2009-01-25 23:42:04 UTC
Permalink
Post by Johannes Schindelin
Post by Jeff King
I actually kind of liked the original atomic version over the one with
locking. But I find this one acceptable.
The locking is only in there because of you...
I know it came out of our discussion, but I thought it was going a bit
far. That is, what should ideally be a little chunk of code to make some
links keeps getting more and more complex. And as your locking patch
came after my "OK, I guess this is fine" comments, I thought you
realized I was accepting it as-is.

So sorry to make you to go to extra work (and please don't go to extra
work ripping it out on my account -- I just wanted to make clear that I
decided your analysis was sane, and that I am OK with any of the
iterations you posted).

-Peff
Johannes Schindelin
2009-01-25 23:19:12 UTC
Permalink
On some Linux systems, we get a host of Cond and Addr errors
from calls to dlopen that are caused by nss modules. We
should be able to safely ignore anything happening in
ld-*.so as "not our problem."

[Johannes: I added some more...]

Signed-off-by: Jeff King <***@peff.net>
Signed-off-by: Johannes Schindelin <***@gmx.de>
---
t/valgrind/default.supp | 24 ++++++++++++++++++++++++
1 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/t/valgrind/default.supp b/t/valgrind/default.supp
index 2482b3b..b2da4fd 100644
--- a/t/valgrind/default.supp
+++ b/t/valgrind/default.supp
@@ -5,12 +5,36 @@
}

{
+ ignore-zlib-errors-value8
+ Memcheck:Value8
+ obj:*libz.so*
+}
+
+{
ignore-zlib-errors-value4
Memcheck:Value4
obj:*libz.so*
}

{
+ ignore-ldso-cond
+ Memcheck:Cond
+ obj:*ld-*.so
+}
+
+{
+ ignore-ldso-addr8
+ Memcheck:Addr8
+ obj:*ld-*.so
+}
+
+{
+ ignore-ldso-addr4
+ Memcheck:Addr4
+ obj:*ld-*.so
+}
+
+{
writing-data-from-zlib-triggers-errors
Memcheck:Param
write(buf)
--
1.6.1.482.g7d54be
Jeff King
2009-01-25 23:32:43 UTC
Permalink
Post by Johannes Schindelin
On some Linux systems, we get a host of Cond and Addr errors
from calls to dlopen that are caused by nss modules. We
should be able to safely ignore anything happening in
ld-*.so as "not our problem."
[Johannes: I added some more...]
Your 0/3 cover letter lists this me as the author of this patch, but
there is no "From:" line at the top of this email. I don't particularly
care one way or the other for this patch, but I wanted to point it out
as a potential issue with your patch-sending workflow.

-Peff
Johannes Schindelin
2009-01-26 00:02:24 UTC
Permalink
Hi,
Post by Jeff King
On some Linux systems, we get a host of Cond and Addr errors from
calls to dlopen that are caused by nss modules. We should be able to
safely ignore anything happening in ld-*.so as "not our problem."
[Johannes: I added some more...]
Your 0/3 cover letter lists this me as the author of this patch, but
there is no "From:" line at the top of this email. I don't particularly
care one way or the other for this patch, but I wanted to point it out
as a potential issue with your patch-sending workflow.
Yep, sorry. I would not touch send-email with lead-protected gloves, so
what I do is to edit all patches I send. And in this case, I missed the
fact that there was another "From:". I am sorry.

Ciao,
Dscho "who is burning midnight oil again"
Jeff King
2009-01-26 00:14:52 UTC
Permalink
Post by Johannes Schindelin
Post by Jeff King
Your 0/3 cover letter lists this me as the author of this patch, but
there is no "From:" line at the top of this email. I don't particularly
care one way or the other for this patch, but I wanted to point it out
as a potential issue with your patch-sending workflow.
Yep, sorry. I would not touch send-email with lead-protected gloves, so
what I do is to edit all patches I send. And in this case, I missed the
fact that there was another "From:". I am sorry.
Heh. I certainly can't blame you for that; I don't use send-email
myself.

It might be convenient for format-patch to have a mode where it uses the
committer as the rfc822 "From:" and then adds a "From:" for the author
in the body if it is not the same as the committer.

It certainly shouldn't be the default, since that would confuse things
like rebase. But it makes sense if you are just going to throw away the
From header anyway when you import into your MUA.

-Peff
Johannes Schindelin
2009-01-25 23:20:21 UTC
Permalink
This patch makes --valgrind try to override _all_ Git binaries in the
PATH, and it will make calling *.sh and *.perl scripts directly an
error.

While it is not strictly necessary to look through the whole PATH to
find git binaries to override, it is in line with running an expensive
test (which valgrind is) to make extra sure that no binary is tested
that actually comes from the git.git checkout.

In the same spirit, we can test that neither our test suite nor our
scripts try to run the *.sh or *.perl scripts directly.

It's more like a "because we can" than a "this is tightly connected
to valgrind", but in the author's opinion "because we can" is "so we
should" in this case.

Signed-off-by: Johannes Schindelin <***@gmx.de>
---

As I said, I vividly remember chasing a bug which turned out to be
a Git program that was installed, but no longer in git.git, yet
the test suite used it.

This would catch it.

t/test-lib.sh | 42 +++++++++++++++++++++++++++---------------
1 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 67d7883..bdfb30f 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -496,23 +496,35 @@ else
# override all git executables in TEST_DIRECTORY/..
GIT_VALGRIND=$TEST_DIRECTORY/valgrind
mkdir -p "$GIT_VALGRIND"/bin
- ls $TEST_DIRECTORY/../git* 2> /dev/null |
- while read symlink_target
+ OLDIFS=$IFS
+ IFS=:
+ for path in $PATH $TEST_DIRECTORY/..
do
- # handle only executables
- test -x "$symlink_target" || continue
-
- base=$(basename "$symlink_target")
- # do not override scripts
- if test ! -d "$symlink_target" &&
- test "#!" != "$(head -c 2 < "$symlink_target")"
- then
- symlink_target=../valgrind.sh
- fi
- # create the link, or replace it if it is out of date
- make_symlink "$symlink_target" \
- "$GIT_VALGRIND/bin/$base" || exit
+ ls "$path"/git "$path"/git-* 2> /dev/null |
+ while read file
+ do
+ # handle only executables
+ test -x "$file" || continue
+
+ base=$(basename "$file")
+ symlink_target=$TEST_DIRECTORY/../$base
+ # do not override scripts
+ if test -x "$symlink_target" &&
+ test ! -d "$symlink_target" &&
+ test "#!" != "$(head -c 2 < "$symlink_target")"
+ then
+ symlink_target=../valgrind.sh
+ fi
+ case "$base" in
+ *.sh|*.perl)
+ symlink_target=../unprocessed-script
+ esac
+ # create the link, or replace it if it is out of date
+ make_symlink "$symlink_target" \
+ "$GIT_VALGRIND/bin/$base" || exit
+ done
done
+ IFS=$OLDIFS
PATH=$GIT_VALGRIND/bin:$PATH
GIT_EXEC_PATH=$GIT_VALGRIND/bin
export GIT_VALGRIND
--
1.6.1.482.g7d54be
Jeff King
2009-01-25 23:42:49 UTC
Permalink
Post by Johannes Schindelin
While it is not strictly necessary to look through the whole PATH to
find git binaries to override, it is in line with running an expensive
test (which valgrind is) to make extra sure that no binary is tested
that actually comes from the git.git checkout.
Should this be "...no binary is tested that _doesn't_ actually come from
the git.git checkout"?

-Peff
Johannes Schindelin
2009-01-26 00:43:13 UTC
Permalink
Hi,
Post by Jeff King
Post by Johannes Schindelin
While it is not strictly necessary to look through the whole PATH to
find git binaries to override, it is in line with running an expensive
test (which valgrind is) to make extra sure that no binary is tested
that actually comes from the git.git checkout.
Should this be "...no binary is tested that _doesn't_ actually come from
the git.git checkout"?
Yep, that was half the change to "that only binaries are tested...".

Thanks,
Dscho
Johannes Schindelin
2009-01-21 22:31:58 UTC
Permalink
Even if there is only a faint, almost neglible chance that two parallel
tests create the symlinks needed for the valgrind test at the same time,
Peff wrote more than just a couple mails about the issue.

To get rid of that threat^Wthread, use a locking mechanism to make
sure a symlink is only created by one test invocation, and the other
has to wait.

Peff, do you see how much I like you?

Signed-off-by: Johannes Schindelin <***@gmx.de>
---
t/test-lib.sh | 15 +++++++++++++--
1 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 6acc6e0..07e657e 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -490,8 +490,19 @@ else
make_symlink () {
test -h "$2" &&
test "$1" = "$(readlink "$2")" || {
- rm -f "$2" &&
- ln -s "$1" "$2"
+ # be super paranoid
+ if mkdir "$2".lock
+ then
+ rm -f "$2" &&
+ ln -s "$1" "$2" &&
+ rm -r "$2".lock
+ else
+ while test -d "$2".lock
+ do
+ say "Waiting for lock on $2."
+ sleep 1
+ done
+ fi
}
}
--
1.6.1.442.g112f5
Jeff King
2009-01-20 23:24:39 UTC
Permalink
Post by Johannes Schindelin
Post by Jeff King
How will you deal with race conditions between two simultaneously
running scripts? I.e., where are you going to put it?
There are no race conditions, as for every git executable, a symbolic link
is created, pointing to the valgrind.sh script [*1*].
Hmm. I suppose that would work, since every test run is trying to create
the same state.
Post by Johannes Schindelin
Besides, what with valgrind being a memory hog, you'd be nuts to call
valgrinded scripts simultaneously.
I have to disagree there. I think there are two obvious usage patterns:

- run script $X specifically under valgrind to track down a bug

- run the whole test suite under valgrind occasionally to find
latent bugs that wouldn't otherwise show up

In the latter, you want a pretty beefy box. When I did the original
patches, I ran through the whole test suite under valgrind. It took
several hours on a 6GB quad-core box, using "-j4". I would hate for it
to have taken an entire day. :)

-Peff
Johannes Schindelin
2009-01-21 00:10:22 UTC
Permalink
Hi,
Post by Jeff King
Post by Johannes Schindelin
Post by Jeff King
How will you deal with race conditions between two simultaneously
running scripts? I.e., where are you going to put it?
There are no race conditions, as for every git executable, a symbolic
link is created, pointing to the valgrind.sh script [*1*].
Hmm. I suppose that would work, since every test run is trying to create
the same state.
Yep, that's what I meant with "no race".
Post by Jeff King
Post by Johannes Schindelin
Besides, what with valgrind being a memory hog, you'd be nuts to call
valgrinded scripts simultaneously.
- run script $X specifically under valgrind to track down a bug
- run the whole test suite under valgrind occasionally to find
latent bugs that wouldn't otherwise show up
In the latter, you want a pretty beefy box. When I did the original
patches, I ran through the whole test suite under valgrind. It took
several hours on a 6GB quad-core box, using "-j4". I would hate for it
to have taken an entire day. :)
Heh. Okay. I was convinced that your valgrind patch predated my -j<n>
patch...

In any case, I already found a bug in the nth_last series, thanks to your
work, which I'll send in a minute.

Ciao,
Dscho
Jeff King
2009-01-21 00:15:51 UTC
Permalink
Post by Johannes Schindelin
Post by Jeff King
Hmm. I suppose that would work, since every test run is trying to create
the same state.
Yep, that's what I meant with "no race".
Right, but it is still possible to screw it up, if your creation process
does a delete-create. But it looks like you did it correctly in your
patch (try to create, and if you fail because it's there, assume it's
right).
Post by Johannes Schindelin
Heh. Okay. I was convinced that your valgrind patch predated my -j<n>
patch...
I think I did an early version that did predate it, but then another
round afterwards.
Post by Johannes Schindelin
In any case, I already found a bug in the nth_last series, thanks to your
work, which I'll send in a minute.
Yay! It's nice when infrastructure work like this actually pays off.

Thanks for picking up this topic...I can drop the size of my
ever-growing git todo list by one. :)

-Peff
Johannes Schindelin
2009-01-21 00:28:01 UTC
Permalink
Hi,
Post by Jeff King
Post by Johannes Schindelin
Post by Jeff King
Hmm. I suppose that would work, since every test run is trying to create
the same state.
Yep, that's what I meant with "no race".
Right, but it is still possible to screw it up, if your creation process
does a delete-create. But it looks like you did it correctly in your
patch (try to create, and if you fail because it's there, assume it's
right).
Actually, I test first if it is there, and only if it is not, try to
create the symlink.

Now, there is still a very minor chance for a race, namely if two
processes happen to test the existence of the missing symlink at exactly
the same time, and both do not find it, so both processes will try to
create it.

However, the symlink creation is not checked for success, so the processes
will still both run just fine.

There is a very subtle problem, though. If you screw with your
configuration, replacing a link in t/valgrind/ by a script, my code will
not try to undo it. However, I think that's really asking for trouble,
and you can get out of the mess by "rm -r t/valgrind/git*".

Another problem which is potentially much more troublesome is this:
when there was a script by a certain name, my code would symlink it
to $GIT_DIR/$BASENAME (actually a relative path, but you get the
idea). If that script is turned into a builtin -- this list has certainly
known a certain person to push for that kind of conversion :-) -- that
fact is not picked up.

But I think I have an easy solution for that.
Post by Jeff King
Post by Johannes Schindelin
In any case, I already found a bug in the nth_last series, thanks to
your work, which I'll send in a minute.
Yay! It's nice when infrastructure work like this actually pays off.
Yep! Thanks!
Post by Jeff King
Thanks for picking up this topic...I can drop the size of my
ever-growing git todo list by one. :)
Actually, don't remind me... of my TODO list.

Ciao,
Dscho
Jeff King
2009-01-21 00:37:39 UTC
Permalink
Post by Johannes Schindelin
Actually, I test first if it is there, and only if it is not, try to
create the symlink.
Now, there is still a very minor chance for a race, namely if two
processes happen to test the existence of the missing symlink at exactly
the same time, and both do not find it, so both processes will try to
create it.
Yep. Though I find "minor chance" when it comes to races to really mean
"annoying to debug". But...
Post by Johannes Schindelin
However, the symlink creation is not checked for success, so the processes
will still both run just fine.
Yes, so there is no race in what is there currently. It's just sad that
we can't detect any actual errors.
Post by Johannes Schindelin
There is a very subtle problem, though. If you screw with your
configuration, replacing a link in t/valgrind/ by a script, my code will
not try to undo it. However, I think that's really asking for trouble,
and you can get out of the mess by "rm -r t/valgrind/git*".
I think we can safely ignore such mucking about in the valgrind
directory as craziness.
Post by Johannes Schindelin
when there was a script by a certain name, my code would symlink it
to $GIT_DIR/$BASENAME (actually a relative path, but you get the
idea). If that script is turned into a builtin -- this list has certainly
known a certain person to push for that kind of conversion :-) -- that
fact is not picked up.
Yes. One way around this is to generate a "want" and a "have" list, and
then just operate on the differences. Something like (totally untested):

(cd $GIT_VALGRIND && ls) | sort >have
(cd $TEST_DIRECTORY/.. && ls git git-*) | sort >want
comm -23 have want | xargs -r rm -v
comm -13 have want | while read f; do ln -s ../../$f $GIT_VALGRIND/$f; done

and then you are also cleaning every time you create.

-Peff
Johannes Schindelin
2009-01-21 01:26:56 UTC
Permalink
Hi,
Post by Jeff King
Post by Johannes Schindelin
Actually, I test first if it is there, and only if it is not, try to
create the symlink.
Now, there is still a very minor chance for a race, namely if two
processes happen to test the existence of the missing symlink at exactly
the same time, and both do not find it, so both processes will try to
create it.
Yep. Though I find "minor chance" when it comes to races to really mean
"annoying to debug". But...
Well, in this case, you will find that the "bug" is _at most_ some
binaries not being found.

And really, the chance is so small as to be forgotten in the clutter:
after the valgrind setup, there are so many other things which are done
that by the time we actually use the Git binaries, everything should be
okay.

And keep in mind, this _only_ matters if you do make -j _and_ you haven't
run --valgrind _ever_. Once the symlinks are there, they are there.

(Actually, with my new patch, the may be replaced, but _only_ if
necessary, and the same thing would apply as I said earlier: the binary
would not be found, or a binary from the PATH would be run without
valgrind; but the next runs will not have the problem.)
Post by Jeff King
Post by Johannes Schindelin
However, the symlink creation is not checked for success, so the
processes will still both run just fine.
Yes, so there is no race in what is there currently. It's just sad that
we can't detect any actual errors.
Now we can. I actually check for the correct link target now (which means
I also check for a link), and if it is incorrect, the link is recreated
(and the deletion is checked for errors).
Post by Jeff King
Post by Johannes Schindelin
There is a very subtle problem, though. If you screw with your
configuration, replacing a link in t/valgrind/ by a script, my code
will not try to undo it. However, I think that's really asking for
trouble, and you can get out of the mess by "rm -r t/valgrind/git*".
I think we can safely ignore such mucking about in the valgrind
directory as craziness.
You'll find that v2 copes with that, too.
Post by Jeff King
Post by Johannes Schindelin
when there was a script by a certain name, my code would symlink it to
$GIT_DIR/$BASENAME (actually a relative path, but you get the idea).
If that script is turned into a builtin -- this list has certainly
known a certain person to push for that kind of conversion :-) -- that
fact is not picked up.
Yes. One way around this is to generate a "want" and a "have" list, and
(cd $GIT_VALGRIND && ls) | sort >have
(cd $TEST_DIRECTORY/.. && ls git git-*) | sort >want
comm -23 have want | xargs -r rm -v
comm -13 have want | while read f; do ln -s ../../$f $GIT_VALGRIND/$f; done
and then you are also cleaning every time you create.
The script will now pick up on those changes, and recreate the symlink
correctly.

We don't need cleaning, as we only link to $TEST_DIRECTORY/.. (at least
via valgrind.sh), and if the binary does not exist there, well, it does
not exist there, and the script will error out, saying so.

Ciao,
Dscho
Johannes Schindelin
2009-01-21 01:36:40 UTC
Permalink
On some Linux systems, we get a host of Cond and Addr errors
from calls to dlopen that are caused by nss modules. We
should be able to safely ignore anything happening in
ld-*.so as "not our problem."

Signed-off-by: Jeff King <***@peff.net>
Signed-off-by: Junio C Hamano <***@pobox.com>
Signed-off-by: Johannes Schindelin <***@gmx.de>
---

Only change vs v1: adds Addr4 suppression, so that ld.so "errors"
are ignored on 32-bit, too.

t/valgrind/default.supp | 18 ++++++++++++++++++
1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/t/valgrind/default.supp b/t/valgrind/default.supp
index 2482b3b..6061283 100644
--- a/t/valgrind/default.supp
+++ b/t/valgrind/default.supp
@@ -11,6 +11,24 @@
}

{
+ ignore-ldso-cond
+ Memcheck:Cond
+ obj:*ld-*.so
+}
+
+{
+ ignore-ldso-addr8
+ Memcheck:Addr8
+ obj:*ld-*.so
+}
+
+{
+ ignore-ldso-addr4
+ Memcheck:Addr4
+ obj:*ld-*.so
+}
+
+{
writing-data-from-zlib-triggers-errors
Memcheck:Param
write(buf)
--
1.6.1.243.g6c8bb35
Jeff King
2009-01-21 19:09:21 UTC
Permalink
Post by Johannes Schindelin
Only change vs v1: adds Addr4 suppression, so that ld.so "errors"
are ignored on 32-bit, too.
I don't think it is wrong to add the extra suppression, but out of
curiosity, did you actually trigger it? I tested the original on both
32- and 64-bit platforms, and that was what made me create the original
(i.e., for some reason my 32-bit platform did not need the same ld.so
suppression).

-Peff
Johannes Schindelin
2009-01-21 20:51:05 UTC
Permalink
Hi,
Post by Jeff King
Post by Johannes Schindelin
Only change vs v1: adds Addr4 suppression, so that ld.so "errors"
are ignored on 32-bit, too.
I don't think it is wrong to add the extra suppression, but out of
curiosity, did you actually trigger it?
Yes. I wouldn't have touched the file if I hadn't triggered it.

Ciao,
Dscho
Jeff King
2009-01-21 19:07:57 UTC
Permalink
Post by Johannes Schindelin
Well, in this case, you will find that the "bug" is _at most_ some
binaries not being found.
[...]
(Actually, with my new patch, the may be replaced, but _only_ if
necessary, and the same thing would apply as I said earlier: the binary
would not be found, or a binary from the PATH would be run without
valgrind; but the next runs will not have the problem.)
You can run a random binary from the PATH. So I have asked git to test
the version in the repository using valgrind, and to report success only
if both the git command itself succeeds and valgrind reports zero
errors. But it might run some other random version of git, not using
valgrind, and if _that_ succeeds, report success. And you don't think
that is a bug?

I'll grant it is an unlikely race to lose. I guess I just prefer my
races to be non-existent.

-Peff
Johannes Schindelin
2009-01-21 22:17:35 UTC
Permalink
Hi,
Post by Jeff King
Post by Johannes Schindelin
Well, in this case, you will find that the "bug" is _at most_ some
binaries not being found.
[...]
(Actually, with my new patch, the may be replaced, but _only_ if
necessary, and the same thing would apply as I said earlier: the binary
would not be found, or a binary from the PATH would be run without
valgrind; but the next runs will not have the problem.)
You can run a random binary from the PATH.
No. You seem to assume that a test script can run all kinds of Git
commands while another, is replacing the symlinks in $GIT_VALGRIND/bin/ at
the same time.

Fact is: every test script will check $GIT_VALGRIND/bin/ for
up-to-dateness first. Before running any Git command.

During that time, races are possible, but non-fatal, because they all try
to do the same thing.

Except, of course, if you replace a script by a builtin _while your test
is running the up-to-date check of $GIT_VALGRIND/bin/_! But I would have
no word of consolation for you in that case.

So, can we agree that every test script tries to keep $GIT_VALGRIND/bin/
up-to-date before the first Git command is called?

Now, you might assume that it is possible that one test-script symlinked
the Git command while another removed it.

But the script that removed the symlink will recreate it right away.

Granted, during that time, the other script could have gone off to call a
Git command in that very brief time span, but keep in mind: it does not
take a long time from rm to ln -s, _and_ the other script would have to go
on to call a Git command _right through that time_.

And you know which command that might be?

Exactly. git init. Which takes a long, long, long time, and where I
really could not care less if it is called from the PATH or not.

Note: this would be only possible if both scripts checked the same name at
the very same time, coming to the very same result that the name needs
symlinking. Unlikely.

Note, too: such a replacing/creating could only take place the very first
time you run valgrind, or when a script was replaced by a builtin. IOW
very, very rarely to begin with.

Now the big question: is this highly, highly unlikely issue relevant?

And I say: no. Because even in that highly, highly, highly unlikely
event, all that will happen is that a git init (which is tested later,
anyway) is not valgrinded.

Besides, if that race would happen _and_ you would see any issues, you'd
run the test again, without parallelization, because you would not be able
to discern what messages belong together from the output of "make -j50
test" anyway.

And the whole issue goes away, because that call will again try to
make GIT_VALGRIND/bin up-to-date, and there will be no chance for a race
this time.

Phew. A lot of time, a lot of braincycles, and a lot of keystrokes wasted
on that subject, don't you think?

Ciao,
Dscho
Jeff King
2009-01-21 23:57:57 UTC
Permalink
Post by Johannes Schindelin
Phew. A lot of time, a lot of braincycles, and a lot of keystrokes wasted
on that subject, don't you think?
Yes, especially considering my other email that said I had dropped the
subject. ;P

But thank you for discussing it. There is still some part of me that
says "if you have no races, you don't have to worry about analyzing
them." But I think your analysis is correct, and I am willing to let it
go in the name of practicality.

As for braincycles, I don't think they were necessarily wasted. The
point of review is to double-check, and the discussion is how we resolve
(even if we resolve that it is OK as-is). Of course there is such a
thing as useless, annoying pedantry, but I hope this didn't count... :)

-Peff
Junio C Hamano
2009-01-22 00:42:22 UTC
Permalink
Post by Johannes Schindelin
Fact is: every test script will check $GIT_VALGRIND/bin/ for
up-to-dateness first. Before running any Git command.
Hmm, is that a good thing in general? Can't makefile rules be arranged in
such a way that one "valgrind-prep" target runs before all the potentially
parallel executions of actual tests begin?

Independent from the above, I suspect that some of the existing tests
cannot run in parallel; I haven't really looked at any of them, but a
server-ish tests to open a local port and test interaction with client
obviously need to either use different ports or serialize. Perhaps we
need a way to mark some tests that cannot be run in parallel even under
"make -j"?
Jeff King
2009-01-22 00:59:01 UTC
Permalink
Post by Junio C Hamano
Post by Johannes Schindelin
Fact is: every test script will check $GIT_VALGRIND/bin/ for
up-to-dateness first. Before running any Git command.
Hmm, is that a good thing in general? Can't makefile rules be arranged in
such a way that one "valgrind-prep" target runs before all the potentially
parallel executions of actual tests begin?
You have to choose either "everybody does this setup, whether they want
--valgrind or not" which is what my original patch did, or doing it
inside test-lib.sh. Because we don't know we want --valgrind until we
get into the individual scripts.

I suppose one could try parsing GIT_TEST_OPTS in the Makefile, but that
seems a bit hack-ish.

But I like putting it into test-lib.sh; yes, it is a little more CPU
time for each script, but it is negligible compared to running the
actual tests (especially since you only pay when running with
--valgrind, which makes the actual tests very expensive). But it is much
easier to be sure it is _correct_ when you run the test, especially if
you tend to run the test script directly.
Post by Junio C Hamano
Independent from the above, I suspect that some of the existing tests
cannot run in parallel; I haven't really looked at any of them, but a
server-ish tests to open a local port and test interaction with client
obviously need to either use different ports or serialize. Perhaps we
need a way to mark some tests that cannot be run in parallel even under
"make -j"?
I think the only culprits are http-push and a few SVN tests. The
http-push test starts a server on a specific port, but because it is the
only script which uses that port, it is fine. It looks like a few
different SVN tests start an httpd server (9115, 9118, and 9120), which
could potentially interact badly. I've never had a problem running with
"-j4", but I don't have svn installed, so I always end up skipping those
tests.

It looks like both the http-push and svn tests are set up to take an
arbitrary port as input. Perhaps the simplest thing would be for each of
the svn tests to pick a different port so that they can be run
simultaneously.

-Peff
Johannes Schindelin
2009-01-22 05:02:51 UTC
Permalink
Hi,
Post by Jeff King
Post by Junio C Hamano
Independent from the above, I suspect that some of the existing tests
cannot run in parallel; I haven't really looked at any of them, but a
server-ish tests to open a local port and test interaction with client
obviously need to either use different ports or serialize. Perhaps we
need a way to mark some tests that cannot be run in parallel even
under "make -j"?
I think the only culprits are http-push and a few SVN tests. The
http-push test starts a server on a specific port, but because it is the
only script which uses that port, it is fine. It looks like a few
different SVN tests start an httpd server (9115, 9118, and 9120), which
could potentially interact badly. I've never had a problem running with
"-j4", but I don't have svn installed, so I always end up skipping those
tests.
It looks like both the http-push and svn tests are set up to take an
arbitrary port as input. Perhaps the simplest thing would be for each of
the svn tests to pick a different port so that they can be run
simultaneously.
I _suspect_ that the svn tests already use different ports (or can work
with the same httpd), as I have subversion installed and run with -j50
regularly.

Ciao,
Dscho
Jeff King
2009-01-22 05:39:35 UTC
Permalink
Post by Johannes Schindelin
I _suspect_ that the svn tests already use different ports (or can work
with the same httpd), as I have subversion installed and run with -j50
regularly.
I think you're just not running them; it looks like they bail if
SVN_HTTPD_PORT isn't set by the user.

-Peff
Johannes Schindelin
2009-01-27 02:50:48 UTC
Permalink
Hi,

it is real late now, so I am uncomfortable sending off a new patch series
(I _know_ that I'll just introduce a stupid bug or forget to write a
commit message or whatever). In case you are interested in the current
progress, you know where my branches are.

The changes I made:

- added t/valgrind/templates to t/.gitignore, too,

- split out the valgrind-unrelated parts that Peff complained about,

- added some more suppressions I needed,

- added a mode whereby the tests' results are written to test-results/,

- provided a Makefile target for further convenience,

- added a script to coalesce the valgrind results by backtrace,

- split out a patch that lets --valgrind imply --verbose, and

- ran the scripts several times, which is a PITA because one run takes 5.5
hours (and the first time I forgot to redirect stderr, ouch, thus the
test-results/ patch).

I have an output from a previous full run, albeit it was done with an
earlier version of the valgrind patch series I was not comfortable with,
so I will not send it here. Besides, it is 300K (bzip2 -9 reduces that to
20K), and I am sure you don't want to have it.

Just that much, most of the backtraces are pretty repetitive. In fact, I
think most if not all of them touch xwrite.c (I got other errors from my
patches, as I expected).

==valgrind== Syscall param write(buf) points to uninitialised byte(s)
==valgrind== at 0x5609E40: __write_nocancel (in /lib/libpthread-2.6.1.so)
==valgrind== by 0x4D0380: xwrite (wrapper.c:129)
==valgrind== by 0x4D046E: write_in_full (wrapper.c:159)
==valgrind== by 0x4C0697: write_buffer (sha1_file.c:2275)
==valgrind== by 0x4C0B1C: write_loose_object (sha1_file.c:2387)
==valgrind== by 0x4C0C4F: write_sha1_file (sha1_file.c:2418)
==valgrind== by 0x46DBB8: update_one (cache-tree.c:348)
==valgrind== by 0x46D8CF: update_one (cache-tree.c:282)
==valgrind== by 0x46DCCA: cache_tree_update (cache-tree.c:373)
==valgrind== by 0x46E2B5: write_cache_as_tree (cache-tree.c:562)
==valgrind== by 0x4662D4: cmd_write_tree (builtin-write-tree.c:36)
==valgrind== by 0x404F37: run_command (git.c:243)
==valgrind== Address 0x713dc23 is 51 bytes inside a block of size 195 alloc'd
==valgrind== at 0x4C2273B: malloc (in /usr/local/lib/valgrind/amd64-linux/vgpreload_memcheck.so)
==valgrind== by 0x4CFFCC: xmalloc (wrapper.c:20)
==valgrind== by 0x4C0A33: write_loose_object (sha1_file.c:2362)
==valgrind== by 0x4C0C4F: write_sha1_file (sha1_file.c:2418)
==valgrind== by 0x46DBB8: update_one (cache-tree.c:348)
==valgrind== by 0x46D8CF: update_one (cache-tree.c:282)
==valgrind== by 0x46DCCA: cache_tree_update (cache-tree.c:373)
==valgrind== by 0x46E2B5: write_cache_as_tree (cache-tree.c:562)
==valgrind== by 0x4662D4: cmd_write_tree (builtin-write-tree.c:36)
==valgrind== by 0x404F37: run_command (git.c:243)
==valgrind== by 0x4050E4: handle_internal_command (git.c:387)
==valgrind== by 0x4051CA: run_argv (git.c:425)

which can be reproduced by running t0000-basic.out in valgrind mode.

Good night, Vietnam,
Dscho
Linus Torvalds
2009-01-27 03:38:56 UTC
Permalink
Post by Johannes Schindelin
Just that much, most of the backtraces are pretty repetitive. In fact, I
think most if not all of them touch xwrite.c (I got other errors from my
patches, as I expected).
==valgrind== Syscall param write(buf) points to uninitialised byte(s)
==valgrind== at 0x5609E40: __write_nocancel (in /lib/libpthread-2.6.1.so)
==valgrind== by 0x4D0380: xwrite (wrapper.c:129)
==valgrind== by 0x4D046E: write_in_full (wrapper.c:159)
==valgrind== by 0x4C0697: write_buffer (sha1_file.c:2275)
==valgrind== by 0x4C0B1C: write_loose_object (sha1_file.c:2387)
Looks entirely bogus.

I suspect that valgrind for some reason doesn't see the writes made by
zlib as being initialization, possibly due to some incorrect valgrind
annotations on deflate(). We've just totally initialized that whole
buffer with deflate().

It definitely does not look like a git bug, but a valgrind run issue.

Linus
Johannes Schindelin
2009-01-27 04:26:34 UTC
Permalink
Hi,
Post by Linus Torvalds
Post by Johannes Schindelin
Just that much, most of the backtraces are pretty repetitive. In
fact, I think most if not all of them touch xwrite.c (I got other
errors from my patches, as I expected).
==valgrind== Syscall param write(buf) points to uninitialised byte(s)
==valgrind== at 0x5609E40: __write_nocancel (in /lib/libpthread-2.6.1.so)
==valgrind== by 0x4D0380: xwrite (wrapper.c:129)
==valgrind== by 0x4D046E: write_in_full (wrapper.c:159)
==valgrind== by 0x4C0697: write_buffer (sha1_file.c:2275)
==valgrind== by 0x4C0B1C: write_loose_object (sha1_file.c:2387)
Looks entirely bogus.
And it gets worse.

I suspected that zlib does something "cute" with alignments, i.e. that it
writes a possibly odd number of bytes, but then rounds up the buffer to
the next multiple of two of four bytes.

Yet, the buffer in question is 195 bytes, stream.total_count (which
totally agrees with size - stream.avail_out) says it is 58 bytes, and
valgrind says that the byte with offset 51 is uninitialized.

So it is definitely a zlib error. And a strange one at that. Even
allowing for a header, if we have 51 valid bytes in the buffer (remember:
the 52nd byte is reported uninitialized by valgrind), even on a 64-bit
machine, it should not be rounded up to 58 bytes reported by zlib. And
the address of the buffer seems to be even 16-byte aligned (that's
probably valgrind's doing).

Just for bullocks, I let valgrind check if offset 51 is the only
uninitialized byte (who knows what zlib is thinking that it's doing?), and
here's the rub: offset 51 is indeed the _only_ one which valgrind thinks
is uninitialized!

Wasn't there some zlib wizard in the kernel community? We could throw
that thing at him, to see why it behaves so strangely...

Of course, it could also be a valgrind issue, as you suggested. Hmpf.

Ciao,
Dscho
Johannes Schindelin
2009-01-27 04:46:01 UTC
Permalink
Hi,
Post by Johannes Schindelin
Post by Linus Torvalds
Post by Johannes Schindelin
Just that much, most of the backtraces are pretty repetitive. In
fact, I think most if not all of them touch xwrite.c (I got other
errors from my patches, as I expected).
==valgrind== Syscall param write(buf) points to uninitialised byte(s)
==valgrind== at 0x5609E40: __write_nocancel (in /lib/libpthread-2.6.1.so)
==valgrind== by 0x4D0380: xwrite (wrapper.c:129)
==valgrind== by 0x4D046E: write_in_full (wrapper.c:159)
==valgrind== by 0x4C0697: write_buffer (sha1_file.c:2275)
==valgrind== by 0x4C0B1C: write_loose_object (sha1_file.c:2387)
Looks entirely bogus.
And it gets worse.
I suspected that zlib does something "cute" with alignments, i.e. that
it writes a possibly odd number of bytes, but then rounds up the buffer
to the next multiple of two of four bytes.
Yet, the buffer in question is 195 bytes, stream.total_count (which
totally agrees with size - stream.avail_out) says it is 58 bytes, and
valgrind says that the byte with offset 51 is uninitialized.
So it is definitely a zlib error. And a strange one at that. Even
allowing for a header, if we have 51 valid bytes in the buffer
(remember: the 52nd byte is reported uninitialized by valgrind), even on
a 64-bit machine, it should not be rounded up to 58 bytes reported by
zlib. And the address of the buffer seems to be even 16-byte aligned
(that's probably valgrind's doing).
Just for bullocks, I let valgrind check if offset 51 is the only
uninitialized byte (who knows what zlib is thinking that it's doing?),
and here's the rub: offset 51 is indeed the _only_ one which valgrind
thinks is uninitialized!
Wasn't there some zlib wizard in the kernel community? We could throw
that thing at him, to see why it behaves so strangely...
Of course, it could also be a valgrind issue, as you suggested. Hmpf.
FWIW this test was done with 3.4.0.SVN.

Just to be sure, I upgraded to 3.5.0.SVN, the very newest update (well, as
new as I could make my git svn mirror of valgrind and VEX deliver). Still
there.

Off to bed,
Dscho
Mark Brown
2009-01-27 13:14:04 UTC
Permalink
Post by Johannes Schindelin
I suspected that zlib does something "cute" with alignments, i.e. that it
writes a possibly odd number of bytes, but then rounds up the buffer to
the next multiple of two of four bytes.
I don't recall anything along those lines in zlib but it does generate
warnings with valgrind which require overrides - it has at least one
unrolled loop which roll on beyond initialised memory (but keep within
memory that zlib knows it has allocated). It rolls back the results of
the loop before producing output, but it's possible that some unused
bits in the stream may be derived from the results.
Johannes Schindelin
2009-01-27 16:54:56 UTC
Permalink
Hi,
Post by Mark Brown
Post by Johannes Schindelin
I suspected that zlib does something "cute" with alignments, i.e. that
it writes a possibly odd number of bytes, but then rounds up the
buffer to the next multiple of two of four bytes.
I don't recall anything along those lines in zlib but it does generate
warnings with valgrind which require overrides - it has at least one
unrolled loop which roll on beyond initialised memory (but keep within
memory that zlib knows it has allocated). It rolls back the results of
the loop before producing output, but it's possible that some unused
bits in the stream may be derived from the results.
That is what I suspected, but the data contradict this:

- accesses to all offsets between 0 and 50 and 52 and 58 (one _more_ than
indicated as valid by stream.total_count!) do not trigger any message in
valgrind.

- access to offset 51, which is well _within_ the boundaries, and even
well outside the range of a stray alignment issue, _does_ trigger a
valgrind message.

So either valgrind gets it wrong (which I find rather unlikely), or zlib
really does not write to that offset.

Or, and I think that makes most sense so far, valgrind has not really
ignored the initialization of byte number 52 in that buffer which partly
depended on an uninitialized value (but does not matter, maybe due to
Huffman cutoff or something similar).

Come to think of it, the word "suppression" is probably a good indicator
that valgrind never claimed it would mark the zlib buffer as properly
initialized.

Sorry for the noise, then,
Dscho
Linus Torvalds
2009-01-27 18:55:40 UTC
Permalink
Post by Johannes Schindelin
Come to think of it, the word "suppression" is probably a good indicator
that valgrind never claimed it would mark the zlib buffer as properly
initialized.
Hmm. The zlib faq has a note about zlib doing a conditional on
uninitialized memory that doesn't matter, and that is what the suppression
should be about (to avoid a warning about "Conditional jump or move
depends on uninitialised value").

But that one is documented to not matter for the actual output (zlib
FAQ#36).

It's possible that zlib really does leave padding bytes around that
literally don't matter, and that don't get initialized. That really would
be bad, because it means that the output of git wouldn't be repeatable.
But I doubt this is the case - original git used to actually do the SHA1
over the _compressed_ data, which was admittedly a totally and utterly
broken design (and we fixed it), but it did work. Maybe it worked by luck,
but I somehow doubt it.

Some googling did find this:

http://mailman.few.vu.nl/pipermail/sysprog/2008-October/000298.html

which looks very similar: an uninitialized byte in the middle of a
deflate() packet.

Anyway, I'm just going to Cc '***@gzip.org', since this definitely is
_not_ the same issue as in the FAQ, and we're not the only ones seeing it.
For the zlib people: the code is literally this:

/* Set it up */
memset(&stream, 0, sizeof(stream));
deflateInit(&stream, zlib_compression_level);
size = 8 + deflateBound(&stream, len+hdrlen);
compressed = xmalloc(size);

/* Compress it */
stream.next_out = compressed;
stream.avail_out = size;

/* First header.. */
stream.next_in = (unsigned char *)hdr;
stream.avail_in = hdrlen;
while (deflate(&stream, 0) == Z_OK)
/* nothing */;

/* Then the data itself.. */
stream.next_in = buf;
stream.avail_in = len;
ret = deflate(&stream, Z_FINISH);
if (ret != Z_STREAM_END)
die("unable to deflate new object %s (%d)", sha1_to_hex(sha1), ret);

ret = deflateEnd(&stream);
if (ret != Z_OK)
die("deflateEnd on object %s failed (%d)", sha1_to_hex(sha1), ret);

size = stream.total_out;

if (write_buffer(fd, compressed, size) < 0)
die("unable to write sha1 file");

and valgrind complains that the "write_buffer()" call will touch an
uninitialized byte (just one byte, and in the _middle_ of the buffer, no
Post by Johannes Schindelin
Yet, the buffer in question is 195 bytes, stream.total_count (which
totally agrees with size - stream.avail_out) says it is 58 bytes, and
valgrind says that the byte with offset 51 is uninitialized.
The thing to note here is that what we are passing in to "write_buffer()"
is _exactly_ what zlib deflated for us:

- 'compressed' is the allocation, and is what we used to initialize
'stream.next_out' with (at the top of the code sequence above)

- 'size' is gotten from 'stream.total_out' at the end of the compression.

Maybe the zlib people can tell us that we're idiots and the above is
buggy, but maybe there is a real bug in zlib. Maybe it's triggered by our
use of using two different input buffers to deflate() (ie we compress the
header first, and then the body of the actual data, and put it all in one
single output buffer), which may be unusual usage of zlib routines and may
be why there aren't tons of reports of this.

(Our use of just depending on deflate() returning Z_BUF_ERROR after
consuming all of the header data is probably also "unusual", but the
manual explicitly says that it's not fatal and that deflate can be called
again with more buffers).

Oh Gods of zlib, please hear our plea for clarification..

Linus
Johannes Schindelin
2009-01-27 21:52:39 UTC
Permalink
Hi,

[Cc'ed the valgrind-users list, maybe the valgrind Gods can see that our
case is pretty strange, and tell us what we do wrong.]

Note to valgrind experts: this is _not_ about the Conditional thing in
zlib, but about an uninitialized byte _in the middle_ of the zlib output
buffer.
Post by Linus Torvalds
Hmm. The zlib faq has a note about zlib doing a conditional on
uninitialized memory that doesn't matter, and that is what the
suppression should be about (to avoid a warning about "Conditional jump
or move depends on uninitialised value").
But that one is documented to not matter for the actual output (zlib
FAQ#36).
It's possible that zlib really does leave padding bytes around that
literally don't matter, and that don't get initialized. That really
would be bad, because it means that the output of git wouldn't be
repeatable. But I doubt this is the case - original git used to actually
do the SHA1 over the _compressed_ data, which was admittedly a totally
and utterly broken design (and we fixed it), but it did work. Maybe it
worked by luck, but I somehow doubt it.
http://mailman.few.vu.nl/pipermail/sysprog/2008-October/000298.html
which looks very similar: an uninitialized byte in the middle of a
deflate() packet.
_not_ the same issue as in the FAQ, and we're not the only ones seeing it.
[...]
Post by Johannes Schindelin
Yet, the buffer in question is 195 bytes, stream.total_count (which
totally agrees with size - stream.avail_out) says it is 58 bytes, and
valgrind says that the byte with offset 51 is uninitialized.
The thing to note here is that what we are passing in to "write_buffer()"
- 'compressed' is the allocation, and is what we used to initialize
'stream.next_out' with (at the top of the code sequence above)
- 'size' is gotten from 'stream.total_out' at the end of the compression.
Oh Gods of zlib, please hear our plea for clarification..
To help ye Gods, I put together this almost minimal C program:

-- snip --
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <zlib.h>

int main(int argc, char **argv)
{
const char hdr[] = {
0x74, 0x72, 0x65, 0x65, 0x20, 0x31, 0x36, 0x35,
0x00,
};
int hdrlen = sizeof(hdr);
const char buf[] = {
0x31, 0x30, 0x30, 0x36, 0x34, 0x34, 0x20, 0x66,
0x69, 0x6c, 0x65, 0x31, 0x00, 0x10, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x31, 0x30, 0x30, 0x36, 0x34, 0x34, 0x20,
0x66, 0x69, 0x6c, 0x65, 0x32, 0x00, 0x20, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x31, 0x30, 0x30, 0x36, 0x34, 0x34,
0x20, 0x66, 0x69, 0x6c, 0x65, 0x33, 0x00, 0x30,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x31, 0x30, 0x30, 0x36, 0x34,
0x34, 0x20, 0x66, 0x69, 0x6c, 0x65, 0x34, 0x00,
0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x31, 0x30, 0x30, 0x36,
0x34, 0x34, 0x20, 0x66, 0x69, 0x6c, 0x65, 0x35,
0x00, 0x50, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00,
};
int len = sizeof(buf);
z_stream stream;
unsigned char *compressed;
int size, ret, i;
FILE *out;

memset(&stream, 0, sizeof(stream));
deflateInit(&stream, Z_BEST_SPEED);
size = 8 + deflateBound(&stream, len+hdrlen);
compressed = malloc(size);
if (!compressed)
return 1;

stream.next_out = compressed;
stream.avail_out = size;

stream.next_in = (unsigned char *)hdr;
stream.avail_in = hdrlen;
while ((ret = deflate(&stream, 0)) == Z_OK)
/* nothing */;
/* deflate() returns Z_BUF_ERROR at this point */

stream.next_in = (unsigned char *)buf;
stream.avail_in = len;
ret = deflate(&stream, Z_FINISH);
if (ret != Z_STREAM_END)
return 1;

if (deflateEnd(&stream) != Z_OK)
return 1;

out = fopen("/dev/null", "w");
fwrite(compressed + 51, 51, 1, out);
fwrite(compressed + 51, 1, 1, stderr);
fflush(out);
fclose(out);

free(compressed);
return 0;
}
-- snap --

... which produces this output...

-- snip --
==6348== Memcheck, a memory error detector.
==6348== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al.
==6348== Using LibVEX rev exported, a library for dynamic binary translation.
==6348== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP.
==6348== Using valgrind-3.5.0.SVN, a dynamic binary instrumentation framework.
==6348== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al.
==6348== For more details, rerun with: -v
==6348==
==6348== Use of uninitialised value of size 8
==6348== at 0x4E2FC5B: (within /usr/lib/libz.so.1.2.3.3)
==6348== by 0x4E317B6: (within /usr/lib/libz.so.1.2.3.3)
==6348== by 0x4E2DF9C: (within /usr/lib/libz.so.1.2.3.3)
==6348== by 0x4E2E654: deflate (in /usr/lib/libz.so.1.2.3.3)
==6348== by 0x400957: main (valgrind-testcase.c:60)
==6348==
==6348== Syscall param write(buf) points to uninitialised byte(s)
==6348== at 0x5103D50: write (in /lib/libc-2.6.1.so)
==6348== by 0x50A9AE2: _IO_file_write (in /lib/libc-2.6.1.so)
==6348== by 0x50A9748: (within /lib/libc-2.6.1.so)
==6348== by 0x50A9A4B: _IO_file_xsputn (in /lib/libc-2.6.1.so)
==6348== by 0x509FDBA: fwrite (in /lib/libc-2.6.1.so)
==6348== by 0x4009D7: main (valgrind-testcase.c:69)
==6348== Address 0x53da87b is 51 bytes inside a block of size 195 alloc'd
==6348== at 0x4C222CB: malloc (in /usr/local/lib/valgrind/amd64-linux/vgpreload_memcheck.so)
==6348== by 0x4008D7: main (valgrind-testcase.c:45)
,==6348==
==6348== Syscall param write(buf) points to uninitialised byte(s)
==6348== at 0x5103D50: write (in /lib/libc-2.6.1.so)
==6348== by 0x50A9AE2: _IO_file_write (in /lib/libc-2.6.1.so)
==6348== by 0x50A9748: (within /lib/libc-2.6.1.so)
==6348== by 0x50A9A83: _IO_do_write (in /lib/libc-2.6.1.so)
==6348== by 0x50AA048: _IO_file_sync (in /lib/libc-2.6.1.so)
==6348== by 0x509EDB9: fflush (in /lib/libc-2.6.1.so)
==6348== by 0x4009E0: main (valgrind-testcase.c:70)
==6348== Address 0x4020000 is not stack'd, malloc'd or (recently) free'd
==6348==
==6348== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 15 from 4)
==6348== malloc/free: in use at exit: 0 bytes in 0 blocks.
==6348== malloc/free: 7 allocs, 7 frees, 268,835 bytes allocated.
==6348== For counts of detected errors, rerun with: -v
==6348== Use --track-origins=yes to see where uninitialised values come from
==6348== All heap blocks were freed -- no leaks are possible.
-- snap --

Note that the error only occurs when fwrite()ing to stderr, not
any other file.

This is with valgrind compiled from a git-svn mirror updated today, i.e.
valgrind-3.5.0.SVN.


Ciao,
Dscho
Linus Torvalds
2009-01-29 01:56:05 UTC
Permalink
This one is buggy.
Post by Johannes Schindelin
out = fopen("/dev/null", "w");
fwrite(compressed + 51, 51, 1, out);
fwrite(compressed + 51, 1, 1, stderr);
fflush(out);
fclose(out);
The problem is that the first argument to that first "fwrite()" is simply
wrong. It shouldn't be "compressed + 51", it should be just "compressed".
As it is, you're writing 51 bytes, starting at 51 bytes in, and that's
obviously not correct (you only got 58 bytes from deflate()).

So valgrind does complain about it, but for a perfectly valid reason.

So I think your minimal C program isn't actually showing what you wanted
to show, and isn't showing the behaviour you see in git.

Linus
Johannes Schindelin
2009-01-29 14:22:59 UTC
Permalink
Hi,
Post by Linus Torvalds
This one is buggy.
Not exactly buggy. Underexplained.
Post by Linus Torvalds
Post by Johannes Schindelin
out = fopen("/dev/null", "w");
fwrite(compressed + 51, 51, 1, out);
fwrite(compressed + 51, 1, 1, stderr);
fflush(out);
fclose(out);
The problem is that the first argument to that first "fwrite()" is simply
wrong. It shouldn't be "compressed + 51", it should be just "compressed".
Nope. It should be "compressed + 51" to narrow down the issue, as
valgrind does not complain about _any other_ offset.

Not even when that is _well_ after the 58 bytes deflate() says are
available.
Post by Linus Torvalds
As it is, you're writing 51 bytes, starting at 51 bytes in, and that's
obviously not correct (you only got 58 bytes from deflate()).
It is not, granted. But I left it in for a purpose: to show that valgrind
does not even bother to mention bytes we think should be invalid.

I thought that there might be a shortcut for /dev/null, so I changed the
outfile to a real file, and it _still_ does not complain.
Post by Linus Torvalds
So valgrind does complain about it, but for a perfectly valid reason.
Only it does not. It complains about the write of 1 byte, not the write
of 51.

But I know why: "out" is opened buffered, so it shows the error (well
delayed, I might add, and not in a helpful manner) when fflush() is
called.

The real issue, namely that an access of offset 51 triggers a valgrind
error, is demonstrated by my small test case.

Ciao,
Dscho
Mark Adler
2009-01-28 23:06:44 UTC
Permalink
Post by Linus Torvalds
and valgrind complains that the "write_buffer()" call will touch an
uninitialized byte (just one byte, and in the _middle_ of the
buffer, no
Linus,

That is definitely not deflate's intentional use of uninitialized
bytes that is noted in the zlib FAQ. This is something else.
Post by Linus Torvalds
Maybe the zlib people can tell us that we're idiots and the above is
buggy, but maybe there is a real bug in zlib.
I can't speak to the idiot part, but your usage of deflate is not
buggy. (At least assuming that NULL is all zeros for the compiler in
use.)

If this is all correct, it sounds like a serious bug in deflate. If
so, it would have to be a very sneaky bug to not have been discovered
over the last decade or so of deflate usage on who knows how many
zettabytes of data. The deflate code has remained largely unchanged
in that time, and there really isn't anything unusual about your usage.

I have some questions:

1. Is this problem reproducible on more than one machine?

2. Can someone send me the input and the 58 bytes of output from this
case?

3. Did you try decompressing the 58 bytes?

4. For the detection of an "uninitialized byte", if for example an
uninitialized byte is copied to another location, is that location
then also considered uninitialized? Or does uninitialized mean that
that location has really never been written to?

5. Would the access of uninitialized bytes by deflate have been
detected? Since I don't see a mention of uninitialized access before
the write_buffer(), does that mean that deflate never did such a thing
itself?

Mark
Johannes Schindelin
2009-01-28 23:27:31 UTC
Permalink
Hi,
Post by Mark Adler
2. Can someone send me the input and the 58 bytes of output from this
case?
I did better than that already...
http://article.gmane.org/gmane.comp.version-control.git/107391

Maybe it did not go through correctly.

Unfortunately, I was sick today and could not do any proper work, so I
could not even test the suggestions Julian gave me.

The easiest test, though, should be to set the byte at offset 51 to
something bogus and see if inflate() still groks it.

Ciao,
Dscho
Mark Adler
2009-01-29 00:15:44 UTC
Permalink
Post by Johannes Schindelin
Post by Mark Adler
2. Can someone send me the input and the 58 bytes of output from this
case?
I did better than that already...
http://article.gmane.org/gmane.comp.version-control.git/107391
Johannes,

Thanks for the input and code. When I run it, the byte in question at
offset 51 is 0x2c. The output decompresses fine and the result
matches the input. If I change the 0x2c to anything else,
decompression fails. The 58 bytes are below.

Can you also send me the 58 bytes of output that you get when you run
it? Thanks.

Mark



78 01 2b 29 4a 4d 55 30 34 33 65 30 34 30 30 33
31 51 48 cb cc 49 35 64 10 60 c0 04 48 0a 8c 18
14 30 e5 91 4d 30 66 30 c0 af c0 84 c1 01 bf 02
53 86 00 2c 0a 00 86 79 13 07
Johannes Schindelin
2009-01-29 14:14:17 UTC
Permalink
Hi,
Post by Mark Adler
Post by Johannes Schindelin
Post by Mark Adler
2. Can someone send me the input and the 58 bytes of output from this
case?
I did better than that already...
http://article.gmane.org/gmane.comp.version-control.git/107391
Johannes,
Thanks for the input and code. When I run it, the byte in question at
offset 51 is 0x2c. The output decompresses fine and the result matches
the input. If I change the 0x2c to anything else, decompression fails.
The 58 bytes are below.
Can you also send me the 58 bytes of output that you get when you run it?
I get exactly the same 58 bytes. Together with the fact that the 52nd
byte is actually required to be 0x2c, I think that maybe valgrind is
having problems to track that this byte was correctly initialized.

BTW did you have any chance to test the code with valgrind on your
machine? It might be related to this here platform (x86_64).

Ciao,
Dscho
Johannes Schindelin
2009-01-29 14:54:11 UTC
Permalink
Hi,
Post by Johannes Schindelin
Post by Mark Adler
Post by Johannes Schindelin
Post by Mark Adler
2. Can someone send me the input and the 58 bytes of output from this
case?
I did better than that already...
http://article.gmane.org/gmane.comp.version-control.git/107391
Johannes,
Thanks for the input and code. When I run it, the byte in question at
offset 51 is 0x2c. The output decompresses fine and the result matches
the input. If I change the 0x2c to anything else, decompression fails.
The 58 bytes are below.
Can you also send me the 58 bytes of output that you get when you run it?
I get exactly the same 58 bytes. Together with the fact that the 52nd
byte is actually required to be 0x2c, I think that maybe valgrind is
having problems to track that this byte was correctly initialized.
BTW did you have any chance to test the code with valgrind on your
machine? It might be related to this here platform (x86_64).
Now, things get interesting.

Of course, I made sure that I had the newest zlib installed before
mentioning publically that I found a strange valgrind issue.

But I did not build it from source myself; I installed what Ubuntu Gutsy
Gibbon had to offer me.

Now that I tried to investigate further by compiling zlib from source,
instrumenting it with various valgrind-specific code to find out what is
actually happening, I cannot reproduce anymore!

So I searched for the sources that Ubuntu provides, and I _still_ cannot
reproduce.

So I'll just go for the easy solution, install plain straightforward
zlib-1.2.3 (as opposed to zlib_1.2.3.3.dfsg-12ubuntu1), and apologise to
y'all for all the bruhaha.

Ciao,
Dscho

P.S.: Note that there is still something fishy going on, as Ubuntu's zlib
generates the deflated stream correctly. But that will have to be
investigated by someone with substantially more time on her hands than me.
Jeff King
2009-01-27 04:48:38 UTC
Permalink
Post by Linus Torvalds
Post by Johannes Schindelin
==valgrind== Syscall param write(buf) points to uninitialised byte(s)
==valgrind== at 0x5609E40: __write_nocancel (in /lib/libpthread-2.6.1.so)
==valgrind== by 0x4D0380: xwrite (wrapper.c:129)
==valgrind== by 0x4D046E: write_in_full (wrapper.c:159)
==valgrind== by 0x4C0697: write_buffer (sha1_file.c:2275)
==valgrind== by 0x4C0B1C: write_loose_object (sha1_file.c:2387)
Looks entirely bogus.
I suspect that valgrind for some reason doesn't see the writes made by
zlib as being initialization, possibly due to some incorrect valgrind
annotations on deflate(). We've just totally initialized that whole
buffer with deflate().
It definitely does not look like a git bug, but a valgrind run issue.
Yes, this is exactly the issue I ran into when doing the valgrind stuff
a few months ago. I spent several hours looking carefully at the code
and came to the same conclusion. Anything zlib touches needs to be
manually suppressed for uninitialized writes (which I _thought_ was
covered in the suppressions I sent out originally, but maybe they need
to be tweaked for Dscho's system).

-Peff
Johannes Schindelin
2009-01-27 09:31:05 UTC
Permalink
Hi,
Post by Jeff King
Post by Linus Torvalds
Post by Johannes Schindelin
==valgrind== Syscall param write(buf) points to uninitialised byte(s)
==valgrind== at 0x5609E40: __write_nocancel (in /lib/libpthread-2.6.1.so)
==valgrind== by 0x4D0380: xwrite (wrapper.c:129)
==valgrind== by 0x4D046E: write_in_full (wrapper.c:159)
==valgrind== by 0x4C0697: write_buffer (sha1_file.c:2275)
==valgrind== by 0x4C0B1C: write_loose_object (sha1_file.c:2387)
Looks entirely bogus.
I suspect that valgrind for some reason doesn't see the writes made by
zlib as being initialization, possibly due to some incorrect valgrind
annotations on deflate(). We've just totally initialized that whole
buffer with deflate().
It definitely does not look like a git bug, but a valgrind run issue.
Yes, this is exactly the issue I ran into when doing the valgrind stuff
a few months ago. I spent several hours looking carefully at the code
and came to the same conclusion. Anything zlib touches needs to be
manually suppressed for uninitialized writes (which I _thought_ was
covered in the suppressions I sent out originally, but maybe they need
to be tweaked for Dscho's system).
Indeed. I used the "..." wildcard to account for slight differences in
Git's code calling path.

Sorry for the noise,
Dscho
Jeff King
2009-01-20 04:30:30 UTC
Permalink
Post by Junio C Hamano
* jk/color-parse (Sat Jan 17 10:38:46 2009 -0500) 2 commits
+ expand --pretty=3Dformat color options
+ color: make it easier for non-config to parse color specs
I posted a revised version of 1/2 based on Ren=C3=A9's work, but it loo=
ks
like you have the original. So here it is on top of what's in next.

-- >8 --
=46rom: Ren=C3=A9 Scharfe <***@lsrfire.ath.cx>

optimize color_parse_mem

Commit 5ef8d77a implemented color_parse_mem, a function for
parsing colors from a non-NUL-terminated string, by simply
allocating a new NUL-terminated string and calling
color_parse. This had a small but measurable speed impact on
a user format that used the advanced color parsing. E.g.,

# uses quick parsing
$ time ./git log --pretty=3Dtformat:'%Credfoo%Creset' >/dev/null
real 0m0.673s
user 0m0.652s
sys 0m0.016s

# uses color_parse_mem
$ time ./git log --pretty=3Dtformat:'%C(red)foo%C(reset)' >/dev/null
real 0m0.692s
user 0m0.660s
sys 0m0.032s

This patch implements color_parse_mem as the primary
function, with color_parse as a wrapper for strings. This
gives comparable timings to the first case above.

Original patch by Ren=C3=A9. Commit message and debugging by Jeff
King.

Signed-off-by: Jeff King <***@peff.net>
---
color.c | 38 +++++++++++++++++++++-----------------
1 files changed, 21 insertions(+), 17 deletions(-)

diff --git a/color.c b/color.c
index 54a3da1..915d7a9 100644
--- a/color.c
+++ b/color.c
@@ -41,29 +41,40 @@ static int parse_attr(const char *name, int len)
=20
void color_parse(const char *value, const char *var, char *dst)
{
+ color_parse_mem(value, strlen(value), var, dst);
+}
+
+void color_parse_mem(const char *value, int value_len, const char *var=
,
+ char *dst)
+{
const char *ptr =3D value;
+ int len =3D value_len;
int attr =3D -1;
int fg =3D -2;
int bg =3D -2;
=20
- if (!strcasecmp(value, "reset")) {
+ if (!strncasecmp(value, "reset", len)) {
strcpy(dst, "\033[m");
return;
}
=20
/* [fg [bg]] [attr] */
- while (*ptr) {
+ while (len > 0) {
const char *word =3D ptr;
- int val, len =3D 0;
+ int val, wordlen =3D 0;
=20
- while (word[len] && !isspace(word[len]))
- len++;
+ while (len > 0 && !isspace(word[wordlen])) {
+ wordlen++;
+ len--;
+ }
=20
- ptr =3D word + len;
- while (*ptr && isspace(*ptr))
+ ptr =3D word + wordlen;
+ while (len > 0 && isspace(*ptr)) {
ptr++;
+ len--;
+ }
=20
- val =3D parse_color(word, len);
+ val =3D parse_color(word, wordlen);
if (val >=3D -1) {
if (fg =3D=3D -2) {
fg =3D val;
@@ -75,7 +86,7 @@ void color_parse(const char *value, const char *var, =
char *dst)
}
goto bad;
}
- val =3D parse_attr(word, len);
+ val =3D parse_attr(word, wordlen);
if (val < 0 || attr !=3D -1)
goto bad;
attr =3D val;
@@ -115,7 +126,7 @@ void color_parse(const char *value, const char *var=
, char *dst)
*dst =3D 0;
return;
bad:
- die("bad color value '%s' for variable '%s'", value, var);
+ die("bad color value '%.*s' for variable '%s'", value_len, value, var=
);
}
=20
int git_config_colorbool(const char *var, const char *value, int stdou=
t_is_tty)
@@ -191,10 +202,3 @@ int color_fprintf_ln(FILE *fp, const char *color, =
const char *fmt, ...)
va_end(args);
return r;
}
-
-void color_parse_mem(const char *value, int len, const char *var, char=
*dst)
-{
- char *tmp =3D xmemdupz(value, len);
- color_parse(tmp, var, dst);
- free(tmp);
-}
--=20
1.6.1.335.g0366b.dirty
Jeff King
2009-01-20 04:40:21 UTC
Permalink
Post by Junio C Hamano
* jk/signal-cleanup (Sun Jan 11 06:36:49 2009 -0500) 3 commits
- pager: do wait_for_pager on signal death
- refactor signal handling for cleanup functions
- chain kill signals for cleanup functions
Sorry, I lost track. What is the status of this one?
I need to clean up and re-send. The three improvements needed are:

- there is a related Windows cleanup from JSixt, which I will send
when I re-post

- the test needs a few tweaks to be portable to Windows

- Some of the signal handlers should be guarded from inserting
themselves multiple times. I don't think any are dangerous to run
twice (they generally traverse a list, cleaning up files, and then
remove the list elements), but I'm not sure that you can't get some
stupid behavior, like inserting one handler per diff'd file, which
will unnecessarily allocate memory.

This series fixes pager handling for interrupted git programs. There is
also a related fix that needs to be done for forked git programs. I
posted a "how about this" patch to use run_command for external git
programs, but it has some serious problems ("git bogus" no longer
reports an error!).

I have unfortunately not had very much git time lately, but I'll try to
come up with something for both cases this week.

-Peff
Junio C Hamano
2009-01-20 07:04:55 UTC
Permalink
Post by Junio C Hamano
* jk/signal-cleanup (Sun Jan 11 06:36:49 2009 -0500) 3 commits
- pager: do wait_for_pager on signal death
- refactor signal handling for cleanup functions
- chain kill signals for cleanup functions
Sorry, I lost track. What is the status of this one?
...
Thanks.
Johannes Sixt
2009-01-20 07:55:19 UTC
Permalink
Post by Jeff King
- the test needs a few tweaks to be portable to Windows
While this is true, the workaround I have in my tree is so ugly that its
discussion would hold back this series unnecessarily. So, please don't
wait for the fixup of the test.

[My intention is to send test suite fixups for Windows as a separate
series, which would include the fixup for this case, too.]

-- Hannes
Jeff King
2009-01-20 14:18:10 UTC
Permalink
Post by Johannes Sixt
Post by Jeff King
- the test needs a few tweaks to be portable to Windows
While this is true, the workaround I have in my tree is so ugly that its
discussion would hold back this series unnecessarily. So, please don't
wait for the fixup of the test.
My goal was to just accept multiple exit codes in the test. I'll cc you
when I send out the new one, and you can comment.

-Peff
Boyd Stephen Smith Jr.
2009-01-20 05:17:13 UTC
Permalink
Post by Junio C Hamano
Here are the topics that have been cooking. Commits prefixed with '-'
are only in 'pu' while commits prefixed with '+' are in 'next'. The
ones marked with '.' do not appear in any of the branches, but I am
still holding onto them.
Is there anywhere you are publishing these refs? Of course, I see the
commits in 'pu', but sometimes I would like to merge something you have
in 'next'/'pu' into a branch based on 'master' or one of my local
branches, and I have to go hunting for the commit SHA.

It's not a big deal: qgit, gitk, and 'git log'+grep all solve the issue
quickly enough, and I don't want to add to your workload. I was just
hoping they were already published and I could simply add a remote to my
config to get them.

Currently, I'm just using:
* remote origin
URL: git://git.kernel.org/pub/scm/git/git.git
Remote branch merged with 'git pull' while on branch master
master
Tracked remote branches
html maint man master next pu todo
and I get this:
$ git pull origin jk/color-parse
fatal: Couldn't find remote ref jk/color-parse
--
Boyd Stephen Smith Jr. ,= ,-_-. =.
***@iguanasuicide.net ((_/)o o(\_))
ICQ: 514984 YM/AIM: DaTwinkDaddy `-'(. .)`-'
http://iguanasuicide.net/ \_/
Thomas Rast
2009-01-20 08:57:50 UTC
Permalink
Post by Boyd Stephen Smith Jr.
Is there anywhere you are publishing these refs? Of course, I see the
commits in 'pu', but sometimes I would like to merge something you have
in 'next'/'pu' into a branch based on 'master' or one of my local
branches, and I have to go hunting for the commit SHA.
[...]
Post by Boyd Stephen Smith Jr.
$ git pull origin jk/color-parse
fatal: Couldn't find remote ref jk/color-parse
You could try the script I posted here:

http://article.gmane.org/gmane.comp.version-control.git/106129

Just 'git resurrect -m jk/color-parse' and you should be good to go.
--
Thomas Rast
trast@{inf,student}.ethz.ch
Continue reading on narkive:
Loading...