Discussion:
[PATCH 0/3] Support stdio access to lockfiles
Michael Haggerty
2014-10-01 11:14:46 UTC
Permalink
This series applies on top of the series "Lockfile correctness and
refactoring" (Junio's branch mh/lockfile).

There are already two callers that write to lockfiles using stdio. But
they currently need intimate knowledge of the lockfile implementation
to work correctly; for example, they have to call fclose() themselves
and set lk->fd to -1 to prevent the file from being closed again. This
is awkward and error-prone.

So provide official API support for stdio-based access to lockfiles.
Add a new function fdopen_lock_file(), which returns a (FILE *)
associated with an open lockfile, and teach close_lock_file() (and
therefore also commit_lock_file(), rollback_lock_file(), etc.) to use
fclose() instead of close() on lockfiles for which fdopen_lock_file()
has been called.

...except in the signal handler, where calling fclose() is not
permitted. In the signal handler call close() on any still-open
lockfiles regardless of whether they have been fdopen()ed. Since the
very next step is to delete the file, this should be OK.

The second and third patches rewrite the two callers who currently
fdopen() lockfiles to use the new function. I didn't look around for
other lockfile users that might be simplified and/or sped up by
converting them to use stdio; probably there are some.

This improvement was initially discussed when the second fdopen()
callsite was added [1] and also when discussing inconsistencies
between the documentation and real life in the context of the
mh/lockfile patch series [2].

Michael

[1] http://thread.gmane.org/gmane.comp.version-control.git/256729/focus=256734
[2] http://thread.gmane.org/gmane.comp.version-control.git/257504/focus=257553

Michael Haggerty (3):
fdopen_lock_file(): access a lockfile using stdio
dump_marks(): reimplement using fdopen_lock_file()
commit_packed_refs(): reimplement using fdopen_lock_file()

Documentation/technical/api-lockfile.txt | 34 +++++++++++++++--------
fast-import.c | 21 ++-------------
lockfile.c | 46 ++++++++++++++++++++++++++++----
lockfile.h | 4 +++
refs.c | 5 +---
5 files changed, 71 insertions(+), 39 deletions(-)
--
2.1.0
Michael Haggerty
2014-10-01 11:14:48 UTC
Permalink
Signed-off-by: Michael Haggerty <***@alum.mit.edu>
---
fast-import.c | 21 ++-------------------
1 file changed, 2 insertions(+), 19 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index deadc33..fee7906 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -1794,20 +1794,18 @@ static void dump_marks_helper(FILE *f,
static void dump_marks(void)
{
static struct lock_file mark_lock;
- int mark_fd;
FILE *f;

if (!export_marks_file)
return;

- mark_fd = hold_lock_file_for_update(&mark_lock, export_marks_file, 0);
- if (mark_fd < 0) {
+ if (hold_lock_file_for_update(&mark_lock, export_marks_file, 0) < 0) {
failure |= error("Unable to write marks file %s: %s",
export_marks_file, strerror(errno));
return;
}

- f = fdopen(mark_fd, "w");
+ f = fdopen_lock_file(&mark_lock, "w");
if (!f) {
int saved_errno = errno;
rollback_lock_file(&mark_lock);
@@ -1816,22 +1814,7 @@ static void dump_marks(void)
return;
}

- /*
- * Since the lock file was fdopen()'ed, it should not be close()'ed.
- * Assign -1 to the lock file descriptor so that commit_lock_file()
- * won't try to close() it.
- */
- mark_lock.fd = -1;
-
dump_marks_helper(f, 0, marks);
- if (ferror(f) || fclose(f)) {
- int saved_errno = errno;
- rollback_lock_file(&mark_lock);
- failure |= error("Unable to write marks file %s: %s",
- export_marks_file, strerror(saved_errno));
- return;
- }
-
if (commit_lock_file(&mark_lock)) {
failure |= error("Unable to commit marks file %s: %s",
export_marks_file, strerror(errno));
--
2.1.0
Michael Haggerty
2014-10-01 11:14:49 UTC
Permalink
Signed-off-by: Michael Haggerty <***@alum.mit.edu>
---
refs.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/refs.c b/refs.c
index 1d73f1d..a77458f 100644
--- a/refs.c
+++ b/refs.c
@@ -2309,16 +2309,13 @@ int commit_packed_refs(void)
if (!packed_ref_cache->lock)
die("internal error: packed-refs not locked");

- out = fdopen(packed_ref_cache->lock->fd, "w");
+ out = fdopen_lock_file(packed_ref_cache->lock, "w");
if (!out)
die_errno("unable to fdopen packed-refs descriptor");

fprintf_or_die(out, "%s", PACKED_REFS_HEADER);
do_for_each_entry_in_dir(get_packed_ref_dir(packed_ref_cache),
0, write_packed_entry_fn, out);
- if (fclose(out))
- die_errno("write error");
- packed_ref_cache->lock->fd = -1;

if (commit_lock_file(packed_ref_cache->lock)) {
save_errno = errno;
--
2.1.0
Michael Haggerty
2014-10-01 11:14:47 UTC
Permalink
Add a new function, fdopen_lock_file(), which returns a FILE pointer
open to the lockfile. If a stream is open on a lock_file object, it is
closed using fclose() on commit, rollback, or close_lock_file().

This change will allow callers to use stdio to write to a lockfile
without having to muck around in the internal representation of the
lock_file object (callers will be rewritten in upcoming commits).

Signed-off-by: Michael Haggerty <***@alum.mit.edu>
---

I thought about adding second stdio-oriented entrance points analogous
to hold_lock_file_for_update(), hold_lock_file_for_append(), and
reopen_lock_file(), but it seemed simpler to add just the one new
function instead of three or four. If using stdio on lockfiles becomes
more popular, we might want to add some helper functions to make it a
bit more convenient.

In close_lock_file(), if ferror() returns an error, then errno is not
necessarily still set in a way that reflects the original error. I
don't see a way to ensure that errno is set correctly in this
situation. But hopefully, callers are monitoring their calls to
fwrite()/fprintf() etc and will have noticed write errors on their own
already. If anybody can suggest an improvement here, please let me
know.

Documentation/technical/api-lockfile.txt | 34 +++++++++++++++--------
lockfile.c | 46 ++++++++++++++++++++++++++++----
lockfile.h | 4 +++
3 files changed, 68 insertions(+), 16 deletions(-)

diff --git a/Documentation/technical/api-lockfile.txt b/Documentation/technical/api-lockfile.txt
index d4484d1..93b5f23 100644
--- a/Documentation/technical/api-lockfile.txt
+++ b/Documentation/technical/api-lockfile.txt
@@ -42,9 +42,13 @@ The caller:
of the final destination (e.g. `$GIT_DIR/index`) to
`hold_lock_file_for_update` or `hold_lock_file_for_append`.

-* Writes new content for the destination file by writing to the file
- descriptor returned by those functions (also available via
- `lock->fd`).
+* Writes new content for the destination file by either:
+
+ * writing to the file descriptor returned by the `hold_lock_file_*`
+ functions (also available via `lock->fd`).
+
+ * calling `fdopen_lock_file` to get a `FILE` pointer for the open
+ file and writing to the file using stdio.

When finished writing, the caller can:

@@ -70,10 +74,10 @@ any uncommitted changes.

If you need to close the file descriptor you obtained from a
`hold_lock_file_*` function yourself, do so by calling
-`close_lock_file`. You should never call `close(2)` yourself!
-Otherwise the `struct lock_file` structure would still think that the
-file descriptor needs to be closed, and a commit or rollback would
-result in duplicate calls to `close(2)`. Worse yet, if you `close(2)`
+`close_lock_file`. You should never call `close(2)` or `fclose(3)`
+yourself! Otherwise the `struct lock_file` structure would still think
+that the file descriptor needs to be closed, and a commit or rollback
+would result in duplicate calls to `close(2)`. Worse yet, if you close
and then later open another file descriptor for a completely different
purpose, then a commit or rollback might close that unrelated file
descriptor.
@@ -143,6 +147,13 @@ hold_lock_file_for_append::
the existing contents of the file (if any) to the lockfile and
position its write pointer at the end of the file.

+fdopen_lock_file::
+
+ Associate a stdio stream with the lockfile. Return NULL
+ (*without* rolling back the lockfile) on error. The stream is
+ closed automatically when `close_lock_file` is called or when
+ the file is committed or rolled back.
+
get_locked_file_path::

Return the path of the file that is locked by the specified
@@ -179,10 +190,11 @@ close_lock_file::

Take a pointer to the `struct lock_file` initialized with an
earlier call to `hold_lock_file_for_update` or
- `hold_lock_file_for_append`, and close the file descriptor.
- Return 0 upon success. On failure to `close(2)`, return a
- negative value and roll back the lock file. Usually
- `commit_lock_file`, `commit_lock_file_to`, or
+ `hold_lock_file_for_append`. Close the file descriptor (and
+ the file pointer if it has been opened using
+ `fdopen_lock_file`). Return 0 upon success. On failure to
+ `close(2)`, return a negative value and roll back the lock
+ file. Usually `commit_lock_file`, `commit_lock_file_to`, or
`rollback_lock_file` should eventually be called if
`close_lock_file` succeeds.

diff --git a/lockfile.c b/lockfile.c
index d27e61c..e046027 100644
--- a/lockfile.c
+++ b/lockfile.c
@@ -7,20 +7,29 @@

static struct lock_file *volatile lock_file_list;

-static void remove_lock_files(void)
+static void remove_lock_files(int skip_fclose)
{
pid_t me = getpid();

while (lock_file_list) {
- if (lock_file_list->owner == me)
+ if (lock_file_list->owner == me) {
+ /* fclose() is not safe to call in a signal handler */
+ if (skip_fclose)
+ lock_file_list->fp = NULL;
rollback_lock_file(lock_file_list);
+ }
lock_file_list = lock_file_list->next;
}
}

+static void remove_lock_files_on_exit(void)
+{
+ remove_lock_files(0);
+}
+
static void remove_lock_files_on_signal(int signo)
{
- remove_lock_files();
+ remove_lock_files(1);
sigchain_pop(signo);
raise(signo);
}
@@ -97,7 +106,7 @@ static int lock_file(struct lock_file *lk, const char *path, int flags)
if (!lock_file_list) {
/* One-time initialization */
sigchain_push_common(remove_lock_files_on_signal);
- atexit(remove_lock_files);
+ atexit(remove_lock_files_on_exit);
}

if (lk->active)
@@ -106,6 +115,7 @@ static int lock_file(struct lock_file *lk, const char *path, int flags)
if (!lk->on_list) {
/* Initialize *lk and add it to lock_file_list: */
lk->fd = -1;
+ lk->fp = NULL;
lk->active = 0;
lk->owner = 0;
strbuf_init(&lk->filename, pathlen + LOCK_SUFFIX_LEN);
@@ -214,6 +224,17 @@ int hold_lock_file_for_append(struct lock_file *lk, const char *path, int flags)
return fd;
}

+FILE *fdopen_lock_file(struct lock_file *lk, const char *mode)
+{
+ if (!lk->active)
+ die("BUG: fdopen_lock_file() called for unlocked object");
+ if (lk->fp)
+ die("BUG: fdopen_lock_file() called twice for file '%s'", lk->filename.buf);
+
+ lk->fp = fdopen(lk->fd, mode);
+ return lk->fp;
+}
+
char *get_locked_file_path(struct lock_file *lk)
{
if (!lk->active)
@@ -226,17 +247,32 @@ char *get_locked_file_path(struct lock_file *lk)
int close_lock_file(struct lock_file *lk)
{
int fd = lk->fd;
+ FILE *fp = lk->fp;
+ int err;

if (fd < 0)
return 0;

lk->fd = -1;
- if (close(fd)) {
+ if (fp) {
+ lk->fp = NULL;
+
+ /*
+ * Note: no short-circuiting here; we want to fclose()
+ * in any case!
+ */
+ err = ferror(fp) | fclose(fp);
+ } else {
+ err = close(fd);
+ }
+
+ if (err) {
int save_errno = errno;
rollback_lock_file(lk);
errno = save_errno;
return -1;
}
+
return 0;
}

diff --git a/lockfile.h b/lockfile.h
index 9059e89..dc066d1 100644
--- a/lockfile.h
+++ b/lockfile.h
@@ -34,6 +34,8 @@
* - active is set
* - filename holds the filename of the lockfile
* - fd holds a file descriptor open for writing to the lockfile
+ * - fp holds a pointer to an open FILE object if and only if
+ * fdopen_lock_file() has been called on the object
* - owner holds the PID of the process that locked the file
*
* - Locked, lockfile closed (after successful close_lock_file()).
@@ -56,6 +58,7 @@ struct lock_file {
struct lock_file *volatile next;
volatile sig_atomic_t active;
volatile int fd;
+ FILE *volatile fp;
volatile pid_t owner;
char on_list;
struct strbuf filename;
@@ -74,6 +77,7 @@ extern void unable_to_lock_message(const char *path, int err,
extern NORETURN void unable_to_lock_die(const char *path, int err);
extern int hold_lock_file_for_update(struct lock_file *, const char *path, int);
extern int hold_lock_file_for_append(struct lock_file *, const char *path, int);
+extern FILE *fdopen_lock_file(struct lock_file *, const char *mode);
extern char *get_locked_file_path(struct lock_file *);
extern int commit_lock_file_to(struct lock_file *, const char *path);
extern int commit_lock_file(struct lock_file *);
--
2.1.0
Jeff King
2014-10-01 12:48:47 UTC
Permalink
Post by Michael Haggerty
I thought about adding second stdio-oriented entrance points analogous
to hold_lock_file_for_update(), hold_lock_file_for_append(), and
reopen_lock_file(), but it seemed simpler to add just the one new
function instead of three or four. If using stdio on lockfiles becomes
more popular, we might want to add some helper functions to make it a
bit more convenient.
I think it makes sense to wait until we see more potential callers crop
up.
Post by Michael Haggerty
In close_lock_file(), if ferror() returns an error, then errno is not
necessarily still set in a way that reflects the original error. I
don't see a way to ensure that errno is set correctly in this
situation. But hopefully, callers are monitoring their calls to
fwrite()/fprintf() etc and will have noticed write errors on their own
already. If anybody can suggest an improvement here, please let me
know.
I was careful in the packed-refs stdio caller to check all of my fprintf
calls (because I was using fclose myself). I wonder if it would be nicer
to back off from that and just depend on the ferror() call at
commit-time. The exact value of errno is not usually that important
after the open() has succeeded.
Post by Michael Haggerty
-static void remove_lock_files(void)
+static void remove_lock_files(int skip_fclose)
{
pid_t me = getpid();
while (lock_file_list) {
- if (lock_file_list->owner == me)
+ if (lock_file_list->owner == me) {
+ /* fclose() is not safe to call in a signal handler */
+ if (skip_fclose)
+ lock_file_list->fp = NULL;
I wondered when reading the commit message if it should mention this
signal-handling case (which you brought up in the cover letter). This
comment is probably enough, though.
Post by Michael Haggerty
+FILE *fdopen_lock_file(struct lock_file *lk, const char *mode)
+{
+ if (!lk->active)
+ die("BUG: fdopen_lock_file() called for unlocked object");
+ if (lk->fp)
+ die("BUG: fdopen_lock_file() called twice for file '%s'", lk->filename.buf);
I would have expected calling this twice to be a noop (i.e., make the
function more "give me the associated filehandle, and create one if
necessary"). But I don't think any current callers should need that, so
it probably makes sense to play it safe and die("BUG"), at least for
now.
Post by Michael Haggerty
+ if (fp) {
+ lk->fp = NULL;
+
+ /*
+ * Note: no short-circuiting here; we want to fclose()
+ * in any case!
+ */
+ err = ferror(fp) | fclose(fp);
Would this be more clear as:

err = ferror(fp);
err |= fclose(fp);

-Peff
Junio C Hamano
2014-10-01 21:20:14 UTC
Permalink
Post by Jeff King
... If using stdio on lockfiles becomes
more popular, we might want to add some helper functions to make it a
bit more convenient.
I think it makes sense to wait until we see more potential callers crop
up.
Yup.
Post by Jeff King
In close_lock_file(), if ferror() returns an error, then errno is not
necessarily still set in a way that reflects the original error. I
don't see a way to ensure that errno is set correctly in this
situation. But hopefully, callers are monitoring their calls to
fwrite()/fprintf() etc and will have noticed write errors on their own
already. If anybody can suggest an improvement here, please let me
know.
I was careful in the packed-refs stdio caller to check all of my fprintf
calls (because I was using fclose myself). I wonder if it would be nicer
to back off from that and just depend on the ferror() call at
commit-time.
That's a thought (I think the same can be said for "close-time").
Post by Jeff King
-static void remove_lock_files(void)
+static void remove_lock_files(int skip_fclose)
{
pid_t me = getpid();
while (lock_file_list) {
- if (lock_file_list->owner == me)
+ if (lock_file_list->owner == me) {
+ /* fclose() is not safe to call in a signal handler */
+ if (skip_fclose)
+ lock_file_list->fp = NULL;
I wondered when reading the commit message if it should mention this
signal-handling case (which you brought up in the cover letter). This
comment is probably enough, though.
No strong opinion either way.
Post by Jeff King
+FILE *fdopen_lock_file(struct lock_file *lk, const char *mode)
+{
+ if (!lk->active)
+ die("BUG: fdopen_lock_file() called for unlocked object");
+ if (lk->fp)
+ die("BUG: fdopen_lock_file() called twice for file '%s'", lk->filename.buf);
I would have expected calling this twice to be a noop (i.e., make the
function more "give me the associated filehandle, and create one if
necessary"). But I don't think any current callers should need that, so
it probably makes sense to play it safe and die("BUG"), at least for
now.
Interesting. One could imagine a sane call-chain whose top-level
creates a lockfile, associates a FILE * with it to write into it
itself, then calls set of helpers. You could pass only FILE * to
such helpers that does nothing other than writing to lk->fp to the
lockfile, but it would be cumbersome if a helper wants to have
access to the lockfile itself and FILE * (i.e. it writes and then
either commits or rolls back; name it foo_finish() or something).

Such a call-chain certainly would want a way to ask "I know this lk
is already associated with a FILE *; give me that". But that still
does not require "I do not know if this lk already has FILE * or
not, but I want a FILE * associated with it now. Peek or create."

So I think this is OK.
Post by Jeff King
+ if (fp) {
+ lk->fp = NULL;
+
+ /*
+ * Note: no short-circuiting here; we want to fclose()
+ * in any case!
+ */
+ err = ferror(fp) | fclose(fp);
err = ferror(fp);
err |= fclose(fp);
No strong opinion either way.

Thanks, both.
Torsten Bögershausen
2014-10-02 09:29:22 UTC
Permalink
On 01.10.14 13:14, Michael Haggerty wrote:
[]
Nice done, small comments inline
Post by Michael Haggerty
diff --git a/lockfile.c b/lockfile.c
index d27e61c..e046027 100644
--- a/lockfile.c
+++ b/lockfile.c
@@ -7,20 +7,29 @@
static struct lock_file *volatile lock_file_list;
-static void remove_lock_files(void)
+static void remove_lock_files(int skip_fclose)
Even if the motivation to skip is clear now and here,
I would consider to do it the other way around,
and actively order the fclose():

static void remove_lock_files(int call_fclose)
Post by Michael Haggerty
{
pid_t me = getpid();
while (lock_file_list) {
- if (lock_file_list->owner == me)
+ if (lock_file_list->owner == me) {
+ /* fclose() is not safe to call in a signal handler */
+ if (skip_fclose)
+ lock_file_list->fp = NULL;
rollback_lock_file(lock_file_list);
+ }
lock_file_list = lock_file_list->next;
}
}
+static void remove_lock_files_on_exit(void)
+{
+ remove_lock_files(0);
What does "0" mean ?

remove_lock_files(LK_DO_FCLOSE) ?
Post by Michael Haggerty
+}
+
static void remove_lock_files_on_signal(int signo)
{
- remove_lock_files();
+ remove_lock_files(1);
And what does this "1" mean ?

remove_lock_files(LK_SKIP_FCLOSE) ?

We can even have an emum, or use #define
Michael Haggerty
2014-10-12 06:17:19 UTC
Permalink
Sorry, I see I never responded to this email.
Post by Torsten Bögershausen
[]
Nice done, small comments inline
Post by Michael Haggerty
diff --git a/lockfile.c b/lockfile.c
index d27e61c..e046027 100644
--- a/lockfile.c
+++ b/lockfile.c
@@ -7,20 +7,29 @@
=20
static struct lock_file *volatile lock_file_list;
=20
-static void remove_lock_files(void)
+static void remove_lock_files(int skip_fclose)
Even if the motivation to skip is clear now and here,
I would consider to do it the other way around,
=20
static void remove_lock_files(int call_fclose)
I don't think inverting the logic will help the reader remember the
motivation for skipping the call to fclose(). I think this way was
clearer because skipping the call to fclose() is the "unusual" case; it
has to actively sabotage the fclose() that would otherwise take place i=
n
rollback_lock_file(). Also, "call_fclose" slightly implies that fclose(=
)
will be called for the lockfiles, whereas in fact it will only be calle=
d
for the lockfiles for which fdopen_lock_file() has been called.
Post by Torsten Bögershausen
Post by Michael Haggerty
{
pid_t me =3D getpid();
=20
while (lock_file_list) {
- if (lock_file_list->owner =3D=3D me)
+ if (lock_file_list->owner =3D=3D me) {
+ /* fclose() is not safe to call in a signal handler */
+ if (skip_fclose)
+ lock_file_list->fp =3D NULL;
rollback_lock_file(lock_file_list);
+ }
lock_file_list =3D lock_file_list->next;
}
}
=20
+static void remove_lock_files_on_exit(void)
+{
+ remove_lock_files(0);
What does "0" mean ?
=20
remove_lock_files(LK_DO_FCLOSE) ?
=20
Post by Michael Haggerty
+}
+
static void remove_lock_files_on_signal(int signo)
{
- remove_lock_files();
+ remove_lock_files(1);
And what does this "1" mean ?
=20
remove_lock_files(LK_SKIP_FCLOSE) ?
=20
We can even have an emum, or use #define
Meh. These are private functions, all defined within a few lines of eac=
h
other. I think that an enum would be overkill here when a "boolean"
suffices.

Michael

--=20
Michael Haggerty
***@alum.mit.edu

Jeff King
2014-10-01 12:52:15 UTC
Permalink
Post by Michael Haggerty
This series applies on top of the series "Lockfile correctness and
refactoring" (Junio's branch mh/lockfile).
There are already two callers that write to lockfiles using stdio. But
they currently need intimate knowledge of the lockfile implementation
to work correctly; for example, they have to call fclose() themselves
and set lk->fd to -1 to prevent the file from being closed again. This
is awkward and error-prone.
I think it's also wrong on systems where you cannot delete an open file.
A signal or atexit handler will not be able to close the file (since it
does not know the fd), and the unlink() will fail. IIRC, either cygwin
or msysgit (or both?) is such a platform.
Post by Michael Haggerty
fdopen_lock_file(): access a lockfile using stdio
dump_marks(): reimplement using fdopen_lock_file()
commit_packed_refs(): reimplement using fdopen_lock_file()
I had a few minor comments on the first patch, but I would also be OK
with it going in as-is. The other two looked fine to me. Thanks for
working on this.

-Peff
Loading...