There are two types of string_lists: those that own the
string memory, and those that don't. You can tell the
difference by the strdup_strings flag, and one should use
either STRING_LIST_INIT_DUP, or STRING_LIST_INIT_NODUP as an
initializer.
Historically, the normal all-zeros initialization has
corresponded to the NODUP case. Many sites use no
initializer at all, and that works as a shorthand for that
case. But for a reader of the code, it can be hard to
remember which is which. Let's be more explicit and actually
have each site declare which type it means to use.
This is a fairly mechanical conversion; I assumed each site
was correct as-is, and just switched them all to NODUP.
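For example (a minimal illustration, not taken from any particular call
site), the initializer now documents who owns the string memory:
    #include "string-list.h"
    static void string_list_ownership_example(void)
    {
        struct string_list owned = STRING_LIST_INIT_DUP;      /* list copies and frees its strings */
        struct string_list borrowed = STRING_LIST_INIT_NODUP; /* caller keeps ownership */
        char scratch[] = "temporary";
        string_list_append(&owned, scratch);      /* stores a copy; safe after scratch goes away */
        string_list_append(&borrowed, "literal"); /* stores the pointer as-is */
        string_list_clear(&owned, 0);             /* frees the copies it made */
        string_list_clear(&borrowed, 0);          /* frees only the list's own storage */
    }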
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
blame,shortlog: don't make local option variables static
There's no need for these option variables to be static,
except that they are referenced by the options array itself,
which is static. But having all of this static is simply
unnecessary and confusing (and inconsistent with most other
commands, which either use a static global option list or a
true function-local one).
Note that in some cases we may need to actually initialize
the variables (since we cannot rely on BSS to do so). This
is a net improvement to readability, though, as we can use
the more verbose initializers for our string_lists.
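For illustration, the resulting shape is roughly the following (the
command, option names and usage string here are made up):
    int cmd_foo(int argc, const char **argv, const char *prefix)
    {
        static const char * const foo_usage[] = {
            N_("git foo [<options>]"), NULL
        };
        int show_stats = 0;                                  /* explicit; no BSS zeroing here */
        struct string_list ignored = STRING_LIST_INIT_NODUP;
        struct option options[] = {
            OPT_BOOL(0, "show-stats", &show_stats, N_("show work cost statistics")),
            OPT_STRING_LIST(0, "ignore-rev", &ignored, N_("rev"), N_("ignore <rev>")),
            OPT_END()
        };
        argc = parse_options(argc, argv, prefix, options, foo_usage, 0);
        /* ... */
        return 0;
    }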
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse_opt_string_list: stop allocating new strings
The parse_opt_string_list callback is basically a thin
wrapper to string_list_append() any string options we get.
However, it calls:
    string_list_append(v, xstrdup(arg));
which duplicates the option value. This is wrong for two
reasons:
1. If the string list has strdup_strings set, then we are
making an extra copy, which is simply leaked.
2. If the string list does not have strdup_strings set,
then we pass memory ownership to the string list, but
it does not realize this. If we later call
string_list_clear(), which can happen if "--no-foo" is
passed, then we will leak all of the existing entries.
Instead, we should just pass the argument straight to
string_list_append, and it can decide whether to copy or not
based on its strdup_strings flag.
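After this change the callback body is roughly:
    int parse_opt_string_list(const struct option *opt, const char *arg, int unset)
    {
        struct string_list *v = opt->value;
        if (unset) {
            string_list_clear(v, 0);
            return 0;
        }
        if (!arg)
            return -1;
        /* no xstrdup(): the list's strdup_strings flag decides whether to copy */
        string_list_append(v, arg);
        return 0;
    }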
It's possible that some (buggy) caller could be relying on
this extra copy (e.g., because it parses some options from
an allocated argv array and then frees the array), but it's
not likely. For one, we generally only use parse_options on
the argv given to us in main(). And two, such a caller is
broken anyway, because other option types like OPT_STRING()
do not make such a copy. This patch brings us in line with
them.
Noticed-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 15ffb7cde48 (2011-06-13, submodule update: continue when a checkout
fails), we reasoned that it is OK to continue when the failure does not
put much of a mental burden on the user. If a recursive submodule fails
to clone because a .gitmodules file is broken, e.g.:
    fatal: No url found for submodule path 'foo/bar' in .gitmodules
    Failed to recurse into submodule path 'foo', signaled by exit code 128
this is one of the cases where the user is not expected to carry much of
a burden afterwards, so we can also continue in that case.
This means we only want to stop updating submodules in case of rebase,
merge or custom update command failures, which are all signaled with
exit code 2.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each submodule whose clone attempt fails will be retried once, after all
other submodules have been cloned. This helps to mitigate ephemeral
server failures and immensely increases the chances of a reliable clone
of a repository with hundreds of submodules.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's continuing work in this area, so clean up unneeded "extern"
keywords rather than schlepping them around. Also split up some overlong
lines and add parameter names in a couple of places.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
lock_ref_sha1_basic(): only handle REF_NODEREF mode
Now lock_ref_sha1_basic() is only called with flags==REF_NODEREF. So we
don't have to handle other cases anymore.
This enables several simplifications, the most interesting of which come
from the fact that ref_lock::orig_ref_name is now always the same as
ref_lock::ref_name:
* Remove ref_lock::orig_ref_name
* Remove local variable orig_refname from lock_ref_sha1_basic()
* ref_name can be initialized once and its value reused
* commit_ref_update() never has to write to the reflog for
lock->orig_ref_name
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
If a transaction includes a non-NODEREF update to a symbolic reference,
we don't have to look it up in lock_ref_for_update(). The reference will
be dereferenced anyway when the split-off update is processed.
This change requires that we store a backpointer from the split-off
update to its parent update, for two reasons:
* We still want to report the original reference name in error messages.
So if an error occurs when checking the split-off update's old_sha1,
walk the parent_update pointers back to find the original reference
name, and report that one.
* We still need to write the old_sha1 of the symref to its reflog. So
after we read the split-off update's reference value, walk the
parent_update pointers back and fill in their old_sha1 fields.
Aside from eliminating unnecessary reads, this change fixes a
subtle (though not very serious) race condition: in the old code, the
old_sha1 of the symref was resolved before the reference that it pointed
at was locked. So it was possible that the old_sha1 value logged to the
symref's reflog could be wrong if another process changed the downstream
reference before it was locked.
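As a rough sketch of the error-reporting walk described above (an
illustrative helper, not the actual code; field names follow the
description):
    static void report_lock_failure(const struct ref_update *update,
                                    struct strbuf *err)
    {
        /*
         * Walk back to the outermost update so the error names the
         * reference the caller originally asked about, not the
         * split-off one.
         */
        while (update->parent_update)
            update = update->parent_update;
        strbuf_addf(err, "cannot lock ref '%s': reference already locked",
                    update->refname);
    }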
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Before the previous patch, our first read of the reference happened
before the reference was locked, so we couldn't trust its value and had
to read it again. But now that our first read of the reference happens
after acquiring the lock, there is no need to read it a second time. So
move the read_ref_full() call into the (update->type & REF_ISSYMREF)
block.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Before committing ref updates, split symbolic ref updates into two
parts: an update to the underlying ref, and a log-only update to the
symbolic ref. This ensures that both references are locked correctly
during the transaction, including while their reflogs are updated.
Similarly, if the reference pointed to by HEAD is modified directly, add
a separate log-only update to HEAD, rather than leaving the job of
updating HEAD's reflog to commit_ref_update(). This change ensures that
HEAD is locked correctly while its reflog is being modified, as well as
being cheaper (HEAD only needs to be resolved once).
This makes use of a new function, lock_raw_ref(), which is analogous to
read_raw_ref(), but acquires a lock on the reference before reading it.
This change still has two problems:
* There are redundant read_ref_full() reference lookups.
* It is still possible to get incorrect reflogs for symbolic references
if there is a concurrent update by another process, since the old_oid
of a symref is determined before the lock on the pointed-to ref is
held.
Both problems will soon be fixed.
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
ref_transaction_update(): check refname_is_safe() at a minimum
If the user has asked that a new value be set for a reference, we use
check_refname_format() to verify that the reference name satisfies all
of the rules. But in other cases, at least check that refname_is_safe().
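The intended check is roughly the following (the functions and the
REFNAME_ALLOW_ONELEVEL flag exist in the refs code; the placement and
error wording here are illustrative):
    if ((new_sha1 && !is_null_sha1(new_sha1)) ?
        check_refname_format(refname, REFNAME_ALLOW_ONELEVEL) :
        !refname_is_safe(refname))
            return error("refusing to update ref with bad name '%s'",
                         refname);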
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Change add_update() to initialize all of the fields in the new
ref_update object. Rename the function to ref_transaction_add_update(),
and increase its visibility to all of the refs-related code.
All of this makes the function more useful for other future callers.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
When renaming refs, don't dereference either the origin or the destination
before renaming.
The origin does not need to be dereferenced because it is presently
forbidden to rename symbolic refs.
Not dereferencing the destination fixes a bug where renaming on top of
a broken symref would use the pointed-to ref name for the moved
reflog.
Add a test for the reflog bug.
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
The refs infrastructure learns about log-only ref updates, which only
update the reflog. Later, we will use this to separate symbolic
reference resolution from ref updating.
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
It is nonsensical (and a little bit dangerous) to use REF_ISPRUNING
without REF_NODEREF. Forbid it explicitly. Change the one REF_ISPRUNING
caller to pass REF_NODEREF too.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
lock_ref_sha1_basic(): remove unneeded local variable
resolve_ref_unsafe() can cope with being called with NULL passed to its
flags argument. So lock_ref_sha1_basic() can just hand its own type
parameter through.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
This will hopefully reduce confusion with the "flags" arguments that are
used in many functions in this module as an input parameter to choose
how the function should operate.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
ref_transaction_commit(): remove local variables n and updates
These microoptimizations don't make a significant difference in speed.
And they cause problems if somebody ever wants to modify the function to
add updates to a transaction as part of processing it, as will happen
shortly.
Make the same changes in initial_ref_transaction_commit().
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
http://lkml.kernel.org/g/20160610075043.GA13411@sigill.intra.peff.net
reports that a change to add a new "function" that shares a common
ending with the existing one at the end of the file is shown like this:
 def foo
   do_foo_stuff()
+  common_ending()
+end
+
+def bar
+  do_bar_stuff()
+
   common_ending()
 end
when the new heuristic is in use. In reality, the change is to add
the blank line before "def bar" and everything below, which is what
the code without the new heuristic shows.
Disable the heuristics by default, and resurrect the documentation
for the option and the configuration variables, while clearly
marking the feature as still experimental.
write_or_die: remove the unused write_or_whine() function
Now the last caller of this function is gone, and new ones are
unlikely to appear, because this function is doing very little that
a regular if() does not, besides obfuscating the error message (and
if we ever did want something like it, we would probably prefer the
function to come back with more "normal" return value semantics).
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When -W is given we search the lines between the end of the current
context and the next change for a function line. If there is none then
we merge those two hunks as they must be part of the same function.
However, if the next change is an appended chunk, we abort the search
early in get_func_line() because its line number is out of range. Fix
that by searching from the end of the pre-image in that case instead.
Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use "working tree" instead of "working directory" for git status
The term "working directory" can easily be confused with the current
directory. In one of my patches I already replaced "working directory"
with "working tree" in the man page, but I noticed that git status also
uses this incorrect term.
Signed-off-by: Lars Vogel <Lars.Vogel@vogella.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/commit.c: memoize git-path for COMMIT_EDITMSG
This is a follow-up to f932729c (memoize common git-path "constant"
files, 10-Aug-2015).
The many calls to git_path() are replaced by git_path_commit_editmsg(),
which eliminates the need to repeatedly compute the location of
"COMMIT_EDITMSG".
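Concretely, this uses the memoization helper introduced by f932729c,
roughly:
    /* in builtin/commit.c: compute the path once, cache it afterwards */
    static GIT_PATH_FUNC(git_path_commit_editmsg, "COMMIT_EDITMSG")
    /*
     * ... and call sites then use git_path_commit_editmsg() instead of
     * git_path("COMMIT_EDITMSG").
     */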
Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
send-pack: use buffered I/O to talk to pack-objects
We start a pack-objects process and then write all of the
positive and negative sha1s to it over a pipe. We do so by
formatting each item into a fixed-size buffer and then
writing each individually. This has two drawbacks:
1. There's some manual computation of the buffer size, whose
correctness is not immediately obvious (though it is correct).
2. We write() once per sha1, which means a lot more system
calls than are necessary.
We can solve both by wrapping the pipe descriptor in a stdio
handle; this is the same technique used by upload-pack when
serving fetches.
Note that we can also simplify and improve the error
handling here. The original detected a single write error
and broke out of the loop (presumably to avoid writing the
error message over and over), but never actually acted on
seeing an error; we just fed truncated input and took
whatever pack-objects returned.
In practice, this probably didn't matter, as the likely
errors would be caused by pack-objects dying (and we'd
probably just die with SIGPIPE anyway). But we can easily
make this simpler and more robust; the stdio handle keeps an
error flag, which we can check at the end.
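A sketch of the resulting shape (the helper name and parameters are
illustrative; the real loop also emits the "^" prefix for negative
sha1s, and relies on the usual cache.h includes):
    static void feed_pack_objects(int pack_objects_in_fd,
                                  unsigned char (*sha1)[20], int nr)
    {
        FILE *po_in = xfdopen(pack_objects_in_fd, "w");
        int i;
        /* buffered by stdio: far fewer write(2) calls than one per sha1 */
        for (i = 0; i < nr; i++)
            fprintf(po_in, "%s\n", sha1_to_hex(sha1[i]));
        fflush(po_in);
        if (ferror(po_in))
            die("error writing to pack-objects");
        fclose(po_in);
    }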
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Tom Russello <tom.russello@grenoble-inp.org>
Signed-off-by: Erwan Mathoniere <erwan.mathoniere@grenoble-inp.org>
Signed-off-by: Samuel Groot <samuel.groot@grenoble-inp.org>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
doc: more consistency in environment variables format
Wrap in backticks (monospaced font) those environment variables that
were either unwrapped or wrapped in single quotes (italic type) and are
followed by the word "environment". It was obtained with a scripted
replacement.
One of the main purposes is to stick to the CodingGuidelines as closely as
possible so that people writing new documentation by mimicking the existing
one are more likely to get it right (even if they didn't read the
CodingGuidelines).
Signed-off-by: Tom Russello <tom.russello@grenoble-inp.org>
Signed-off-by: Erwan Mathoniere <erwan.mathoniere@grenoble-inp.org>
Signed-off-by: Samuel Groot <samuel.groot@grenoble-inp.org>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This changes GIT_* variables that were in italic style to monospaced font
according to the guideline. It was obtained with
perl -pi -e "s/\'(GIT_.*?)\'/\`\1\`/g" *.txt
One of the main purposes is to stick to the CodingGuidelines as closely as
possible so that people writing new documentation by mimicking the existing
one are more likely to get it right (even if they didn't read the
CodingGuidelines).
Signed-off-by: Tom Russello <tom.russello@grenoble-inp.org>
Signed-off-by: Erwan Mathoniere <erwan.mathoniere@grenoble-inp.org>
Signed-off-by: Samuel Groot <samuel.groot@grenoble-inp.org>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the guideline text that we want for our documentation clearer.
Signed-off-by: Tom Russello <tom.russello@grenoble-inp.org>
Signed-off-by: Erwan Mathoniere <erwan.mathoniere@grenoble-inp.org>
Signed-off-by: Samuel Groot <samuel.groot@grenoble-inp.org>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 72441af (tree-diff: rework diff_tree() to generate
diffs for multiparent cases as well, 2014-04-07) introduced
the use of alloca so that the common cases of commits with 1
or 2 parents would not be adversely affected by going
through the multi-parent code.
However, our xalloca is not ideal when the number of parents
grows very large:
1. If the requested size is too large for our stack,
alloca() has no way to tell us, and we simply segfault
while trying to access the memory.
2. It does not use our usual memory_limit_check() logic.
I measured, and alloca is indeed buying us a very small
speedup over xmalloc()/free(). So we'd want to keep
something like it.
This patch simply puts a conditional in place at each
callsite: we use alloca for common known-small numbers of
parents, and otherwise use the heap. We are technically
still vulnerable to (1), but no more so than if we simply
put a few dozen bytes on the stack, which we must do all the
time anyway. And likewise, we technically miss the memory limit check
when the allocation is tiny, but applying the limit to such a small
allocation would be pointless.
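A rough sketch of such a callsite conditional (the threshold, the names,
and the xalloca_free() pairing are illustrative, not necessarily the
exact code):
    struct tree *tp;
    if (nparent <= 2)
        tp = xalloca(nparent * sizeof(*tp));
    else
        ALLOC_ARRAY(tp, nparent);
    /* ... walk the parent trees ... */
    if (nparent <= 2)
        xalloca_free(tp);   /* a no-op where real alloca is available */
    else
        free(tp);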
An alternative to this would be to implement something like:
    struct tree *tp, tp_fallback[2];
    if (nparent <= ARRAY_SIZE(tp_fallback))
        tp = tp_fallback;
    else
        ALLOC_ARRAY(tp, nparent);
    ...
    if (tp != tp_fallback)
        free(tp);
That would let us drop our xalloca() portability code
entirely. But in my measurements, this seemed to perform
slightly worse than the xalloca solution.
Note in the example above, and in the patch below, I've used
ALLOC_ARRAY() to replace the manual xmalloc(nr * sizeof(*x)).
Besides being shorter, this has the bonus that one cannot
accidentally overflow a size_t during that computation.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The executable bit will not be detected (and therefore will not be
set) for paths in a repository with `core.filemode` set to false,
though the users may still wish to add files as executable for
compatibility with other users who _do_ have `core.filemode`
functionality. For example, Windows users adding shell scripts may
wish to add them as executable for compatibility with users on
non-Windows.
Although this can be done with a plumbing command
(`git update-index --add --chmod=+x foo`), teaching the `git-add`
command allows users to set a file executable with a command that
they're already familiar with.
Signed-off-by: Edward Thomson <ethomson@edwardthomson.com>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since commit 56a1a3ab ("Silence GCC's 'cast of pointer to integer of a
different size' warning", 26-10-2015), sparse has been issuing a macro
redefinition warning for the SIZE_MAX macro. However, gcc did not issue
any such warning.
After commit 56a1a3ab, in terms of the order of #includes and #defines,
the code looked something like:
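    /* compat/regex/regex.c -- roughly reconstructed; exact lines not shown */
    ...
    #include "regex_internal.h"  /* no SIZE_MAX yet, so its fallback kicks in:
                                    # define SIZE_MAX ((size_t) -1) */
    ...
    #include "regcomp.c"         /* which now does #include <stdint.h>,
                                    redefining SIZE_MAX */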
However, if you compile that file with -Wsystem-headers, then it will
also issue a warning. Having set -Wsystem-headers in CFLAGS, using the
config.mak file, then (on cygwin):
  $ make compat/regex/regex.o
      CC compat/regex/regex.o
  In file included from /usr/lib/gcc/x86_64-pc-cygwin/4.9.3/include/stdint.h:9:0,
                   from compat/regex/regcomp.c:21,
                   from compat/regex/regex.c:77:
  /usr/include/stdint.h:362:0: warning: "SIZE_MAX" redefined
   #define SIZE_MAX (__SIZE_MAX__)
   ^
  In file included from compat/regex/regex.c:69:0:
  compat/regex/regex_internal.h:108:0: note: this is the location of the previous definition
   # define SIZE_MAX ((size_t) -1)
   ^
  $
The compilation of the compat/regex code is somewhat unusual in that the
regex.c file directly #includes the other c files (regcomp.c, regexec.c
and regex_internal.c). Commit 56a1a3ab added an #include of <stdint.h>
to the regcomp.c file, which results in the redefinition, since this is
included after the regex_internal.h header. This header file contains a
'fallback' definition for SIZE_MAX, in order to support systems which do
not have the <stdint.h> header (the HAVE_STDINT_H macro is not defined).
In order to suppress the warning, we move the #include of <stdint.h>
from regcomp.c to the start of the compilation unit, close to the top
of regex.c, prior to the #include of the regex_internal.h header.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
reflog: continue walking the reflog past root commits
If a repository contains more than one root commit, then its HEAD
reflog may contain multiple "creation events", i.e. entries whose
"from" value is the null sha1. Listing such a reflog currently stops
prematurely at the first such entry, even when the reflog still
contains older entries. This can scare users into thinking that their
reflog got truncated after 'git checkout --orphan'.
Continue walking the reflog past such creation events based on the
preceding reflog entry's "new" value.
The test 'symbolic-ref writes reflog entry' in t1401-symbolic-ref
implicitly relies on the current behavior of the reflog walker to stop
at a root commit and thus to list only the reflog entries that are
relevant for that test. Adjust the test to explicitly specify the
number of relevant reflog entries to be listed.
Reported-by: Patrik Gustafsson <pvn@textalk.se>
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A couple of bugs around core.autocrlf have been fixed.
* tb/core-eol-fix:
  convert.c: ident + core.autocrlf didn't work
  t0027: test cases for combined attributes
  convert: allow core.autocrlf=input and core.eol=crlf
  t0027: make commit_chk_wrnNNO() reliable
Merge branch 'ar/diff-args-osx-precompose' into maint
Many commands normalize command line arguments from NFD to NFC
variant of UTF-8 on OSX, but commands in the "diff" family did
not, causing "git diff $path" to complain that no such path is
known to Git. They have been taught to do the normalization.
* ar/diff-args-osx-precompose:
  diff: run arguments through precompose_argv
builtin/apply: remove misleading comment on lock_file field
Just like other pointer fields such as prefix, the piece of memory pointed
at by the lock_file field is not owned by the apply_state structure. It is
true that the caller needs to be careful about the lifetime rule for
lockfile instances, but that is none of this API's business.
git-prompt.sh: Don't error on null ${ZSH,BASH}_VERSION, $short_sha
When the shell is in "nounset" or "set -u" mode, referencing unset or
null variables results in an error. Protect $ZSH_VERSION and
$BASH_VERSION against that, and initialize $short_sha before use.
Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cherry-pick allows picking single commits onto an empty HEAD, but not
multiple commits.
Allow the multiple-commit case, too.
Reported-by: Fabrizio Cucci <fabrizio.cucci@gmail.com>
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Combined with "git format-patch --pretty=mboxrd", this should
allow us to round-trip commit messages with embedded mbox
"From " lines without corruption.
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This output format prevents format-patch output from breaking
readers if somebody copy+pasted an mbox into a commit message.
Unlike the traditional "mboxo" format, "mboxrd" is designed to
be fully-reversible. "mboxrd" also gracefully degrades to
showing extra ">" in existing "mboxo" readers.
This degradation is preferable to breaking message splitting
completely, a problem I've seen in "mboxcl" due to having
multiple, non-existent, or inaccurate Content-Length headers.
"mboxcl2" is a non-starter since it's inherits the problems
of "mboxcl" while being completely incompatible with existing
tooling based around mailsplit.
Redirect auto-gc output to the sideband such that it is visible to all
clients. As a side effect, all auto-gc error messages are now prefixed
with "remote: " before being printed to stderr on the client-side which
makes it easier to understand that those error messages originate from
the server.
Signed-off-by: Lukas Fleischer <lfleischer@lfos.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
CSS is widely used, which motivates including it as a built-in pattern.
It must be noted that the word_regex for CSS (i.e. the regex defining
what is a word in the language) does not consider '.' and '#' characters
(in CSS selectors) to be part of the word. This behavior is documented
by the test t/t4018/css-rule.
The logic behind this behavior is the following: identifiers in CSS
selectors are identifiers in an HTML/XML document. Therefore, the '.'/'#'
characters are not part of the identifier, but an indicator of the nature
of the identifier in HTML/XML (class or id). Diffing ".class1" and
".class2" must show that the class name is changed, but we still are
selecting a class.
Logic behind the "pattern" regex is:
1. reject lines ending with a colon/semicolon (properties)
2. if a line begins with a name in column 1, pick the whole line
Credits to Johannes Sixt (j6t@kdbg.org) for the pattern regex and most
of the tests.
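For illustration, the resulting userdiff.c entry could look roughly like
this (the exact built-in pattern and word_regex may differ):
    PATTERNS("css",
             /* 1. reject lines ending with a colon/semicolon (properties) */
             "![:;][[:space:]]*$\n"
             /* 2. a name starting in column 1 picks up the whole line */
             "^[_a-z0-9].*$",
             /* word_regex: '.' and '#' are deliberately not word characters */
             "-?[_a-zA-Z][-_a-zA-Z0-9]*|-?[0-9]+"),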
Signed-off-by: William Duclot <william.duclot@ensimag.grenoble-inp.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rebase -i", after it fails to auto-resolve the conflict, had
an unnecessary call to "git rerere" from its very early days, which
was spotted recently; the call has been removed.
* js/rebase-i-dedup-call-to-rerere:
  rebase -i: remove an unnecessary 'rerere' invocation
builtin/apply: add 'lock_file' pointer into 'struct apply_state'
We cannot have a 'struct lock_file' allocated on the stack, as lockfile.c
keeps a linked list of all created lock_file structures.
Also 'struct apply_state' users might later want the same 'struct lock_file'
instance to be reused by different series of calls to the apply api.
So let's add a 'struct lock_file *lock_file' pointer into 'struct apply_state'
and have the user of 'struct apply_state' allocate memory for the actual
'struct lock_file' instance.
Let's also add an argument to init_apply_state(), so that the caller can
easily supply a pointer to the allocated instance.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
rev-list: disable bitmaps when "-n" is used with listing objects
You can ask rev-list to use bitmaps to speed up an --objects
traversal, which should generally give you your answers much
faster.
Likewise, you can ask rev-list to limit such a traversal
with `-n`, in which case we'll show only a limited set of
commits (and only the tree and commit objects directly
reachable from those commits).
But if you do both together, the results are nonsensical. We
end up limiting any fallback traversal we do to _find_ the
bitmaps, but the actual set of objects we output will be
picked arbitrarily from the union of any bitmaps we do find,
and will involve the objects of many more commits.
It's possible that somebody might want this as a "show me
what you can, but limit the amount of work you do" flag.
But as with the prior commit clamping "--count", the results
are basically non-deterministic; you'll get the values from
some commits between `n` and the total number, and you can't
tell which.
And unlike the `--count` case, we can't easily generate the
"real" value from the bitmap values (you can't just walk
back `-n` commits and subtract out the reachable objects
from the boundary commits; the bitmaps for `X` record its
total reachability, so you don't know which objects are
directly from `X` itself, which from `X^`, and so on).
So let's just fall back to the non-bitmap code path in this
case, so we always give a sane answer.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
rev-list: "adjust" results of "--count --use-bitmap-index -n"
If you ask rev-list for:
git rev-list --count --use-bitmap-index HEAD
we optimize out the actual traversal and just give you the
number of bits set in the commit bitmap. This is faster,
which is good.
But if you ask to limit the size of the traversal, like:
git rev-list --count --use-bitmap-index -n 100 HEAD
we'll still output the full bitmapped number we found. On
the surface, that might even seem OK. You explicitly asked
to use the bitmap index, and it was cheap to compute the
real answer, so we gave it to you.
But there's something much more complicated going on under
the hood. If we don't have a bitmap directly for HEAD, then
we have to actually traverse backwards, looking for a
bitmapped commit. And _that_ traversal is bounded by our
`-n` count.
This is a good thing, because it bounds the work we have to
do, which is probably what the user wanted by asking for
`-n`. But now it makes the output quite confusing. You might
get many values:
- your `-n` value, if we walked back and never found a
bitmap (or fewer if there weren't that many commits)
- the actual full count, if we found a bitmap root for
every path of our traversal within the `-n` limit
- any number in between! We might have walked back and
found _some_ bitmaps, but then cut off the traversal
early with some commits not accounted for in the result.
So you cannot even see a value higher than your `-n` and say
"OK, bitmaps kicked in, this must be the real full count".
The only sane thing is for git to just clamp the value to a
maximum of the `-n` value, which means we should output the
exact same results whether bitmaps are in use or not.
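The clamping itself is trivial; roughly (names illustrative, with
max_count holding the -n value):
    static int clamp_bitmap_count(int count, int max_count)
    {
        /* never report more commits than the -n limit the user gave */
        if (max_count >= 0 && count > max_count)
            count = max_count;
        return count;
    }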
The test in t5310 demonstrates this by using `-n 1`.
Without this patch we fail in the full-bitmap case (where we
do not have to traverse at all) but _not_ in the
partial-bitmap case (where we have to walk down to find an
actual bitmap). With this patch, both cases just work.
I didn't implement the crazy in-between case, just because
it's complicated to set up, and is really a subset of the
full-count case, which we do cover.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
upload-pack: provide a hook for running pack-objects
When upload-pack serves a client request, it turns to
pack-objects to do the heavy lifting of creating a
packfile. There's no easy way to intercept the call to
pack-objects, but there are a few good reasons to want to do
so:
1. If you're debugging a client or server issue with
fetching, you may want to store a copy of the generated
packfile.
2. If you're gathering data from real-world fetches for
performance analysis or debugging, storing a copy of
the arguments and stdin lets you replay the pack
generation at your leisure.
3. You may want to insert a caching layer around
pack-objects; it is the most CPU- and memory-intensive
part of serving a fetch, and its output is a pure
function[1] of its input, making it an ideal place to
consolidate identical requests.
This patch adds a simple "hook" interface to intercept calls
to pack-objects. The new test demonstrates how it can be
used for debugging (using it for caching is a
straightforward extension; the tricky part is writing the
actual caching layer).
This hook is unlike the normal hook scripts found in the
"hooks/" directory of a repository. Because we promise that
upload-pack is safe to run in an untrusted repository, we
cannot execute arbitrary code or commands found in the
repository (neither in hooks/, nor in the config). So
instead, this hook is triggered from a config variable that
is explicitly ignored in the per-repo config.
The config variable holds the actual shell command to run as
the hook. Another approach would be to simply treat it as a
boolean: "should I respect the upload-pack hooks in this
repo?", and then run the script from "hooks/" as we usually
do. However, that isn't as flexible; there's no way to run a
hook approved by the site administrator (e.g., in
"/etc/gitconfig") on a repository whose contents are not
trusted. The approach taken by this patch is more
fine-grained, if a little less conventional for git hooks
(it behaves similarly to other configured commands like
diff.external, etc).
[1] Pack-objects isn't _actually_ a pure function. Its
output depends on the exact packing of the object
database, and if multi-threading is used for delta
compression, can even differ racily. But for the
purposes of caching, that's OK; of the many possible
outputs for a given input, it is sufficient only that we
output one of them.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t1308: do not get fooled by symbolic links to the source tree
When your $PWD does not match $(/bin/pwd), e.g. you have your copy
of the git source tree in one place, point a symbolic link at it,
and then "cd" to that symbolic link before running 'make test', one
of the tests in t1308 expects that the per-user configuration was
reported to have been read from the true path (i.e. relative to the
target of such a symbolic link), but the test-config program reports
a path relative to $PWD (i.e. the symbolic link).
Instead, expect a path relative to $HOME (aka $TRASH_DIRECTORY), as
per-user configuration is read from $HOME/.gitconfig and the test
framework sets these shell variables up in such a way to avoid this
problem.
pathspec: rename free_pathspec() to clear_pathspec()
The function takes a pointer to a pathspec structure, and releases
the resources held by it, but does not free() the structure itself.
Such a function should be called "clear", not "free".
t2300: run git-sh-setup in an environment that better mimics the real life
When we run scripted Porcelains, "git" potty has set up the $PATH by
prepending $GIT_EXEC_PATH, the path given by "git --exec-path=$there
$cmd", etc. already. Because of this, scripted Porcelains can
dot-source shell script library like git-sh-setup with simple dot
without specifying any path.
t2300 however dot-sources git-sh-setup without adjusting $PATH like
the real "git" potty does. This has not been a problem so far, but
once git-sh-setup wants to rely on the $PATH adjustment, just like
any scripted Porcelains already do, it would become one. It cannot
for example dot-source another shell library without specifying the
full path to it by prefixing $(git --exec-path).
In t5500::check_prot_host_port_path(), diagport is not a variable
used elsewhere and the function is not recursively called so this
can simply lose the "local", which may not be supported by shell
(besides, the function liberally clobbers other variables without
making them "local").
t7403::reset_submodule_urls() overrides the "root" variable used
in the test framework for no good reason; its use is not about
temporarily relocating where the test repositories are created.
This assignment can be made not to clobber the variable by moving
them into the subshells it already uses. Its value is always
$TRASH_DIRECTORY, so we could use it instead there, and this
function that is called only once and its two subshells may not be
necessary (instead, the caller can use "git -C $there config" and
set a value that is derived from $TRASH_DIRECTORY), but this is a
minimum fix that is needed to lose "local".
Helped-by: John Keeping <john@keeping.me.uk>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>