- Add Godeps/LICENSES.md
- Add verify-godep-licenses to verify that Godeps/LICENSES.md is up to date
- Trigger verify-godep-licenses in the pre-commit hook only if the Godeps dir has changed
- Exclude verify-godep-licenses from verify-all
- Add verify-godep-licenses to make verify (used by travis)
- Add verify-godep-licenses to shippable
- Update dev docs to mention update-godep-licenses
Some programs, like the boilerplate checker or the flag checker, will check
the whole repo if they aren't given a specific set of files to test. If you
use `git commit --amend` to change a commit message you will be calling
these programs with no args, and thus it takes a lot longer to commit no
changes than it does to actually commit changes!
It is a pretty slow test (it downloads all of kube's Godeps fresh), so only
run it in the pre-commit hook when needed.
This also means that random changes to other non-kube repositories could
cause travis/shippable to just randomly stop working for all PRs which touch
Godeps after that moment (even though no changes have been made to Godeps by
us). Examples would be things like other repos completely disappearing, or
even the directory we include disappearing in master in the remote
project (even though the directory may exist at the commit we care
about). This is a bug in godep, but it is a problem we have seen happen
with kube Godeps.
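The pre-commit trigger described above can be sketched as follows. This is a minimal illustration, not the hook's actual code: it assumes the hook already has the list of staged paths (for example from `git diff --cached --name-only`), and the function name `godeps_changed` is hypothetical.

```python
def godeps_changed(changed_files):
    """Return True if any staged path lives under the Godeps directory.

    Hypothetical helper: the slow license check would only be run
    when this returns True.
    """
    return any(
        path == "Godeps" or path.startswith("Godeps/")
        for path in changed_files
    )
```

With a check like this, a commit that only touches files outside Godeps never pays for the fresh download.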
Although the boilerplate checker was very fast, it can be faster. With
this change we can hand the boilerplate checker a list of files which need
to be checked or give it no files. If given no files it will check all
files in the repo. Before, you had to explicitly tell the boilerplate
checker the 'extension' of the files. Now we let the checker figure it
out and load the headers as needed.
Doing the whole repo takes about 0.4 seconds. Doing a single go file
takes < .04 seconds.
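As a rough sketch of "figure out the extension and load the headers as needed", something like the following would do it. The names `get_refs` and `load_header` are hypothetical and the real checker may be organized differently; the point is only that a reference header is loaded once per extension that actually appears in the file list.

```python
import os


def get_refs(filenames, load_header):
    """Load one reference header per extension present in filenames.

    load_header is an assumed callable that maps an extension such as
    "go" to the expected boilerplate text for that file type.
    """
    refs = {}
    for name in filenames:
        # "pkg/main.go" -> "go"; files without an extension are skipped
        ext = os.path.splitext(name)[1].lstrip(".")
        if ext and ext not in refs:
            refs[ext] = load_header(ext)
    return refs
```

Checking a single go file then loads a single header, which is consistent with the single-file timing being so much lower than the whole-repo timing.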
This works by defining two 'static' lists in hack. The first is the list
of all flags in the project which use a `-` or an `_` in their name. All
files being processed by verify-flags-underscore.py (or all files in the
repo if no filename arguments are given) will be searched for flag
declarations using a simple regex. It's not super smart. If a flag is
found which is not in the static list it will complain/reject the commit
until a human adds it to the list. If we do not keep a static list of
flags it takes >.2 seconds to find them 'all' at runtime. Since this is
run in pre-commit, every fraction of a second saved helps.
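A minimal sketch of the declaration search, assuming Go flag declarations of the form `fs.StringVar(&x, "name", ...)` or `flag.Int("name", ...)`. The regex here is illustrative and is not the one in verify-flags-underscore.py; the function names are hypothetical.

```python
import re

# Illustrative pattern only: matches calls such as
#   fs.StringVar(&s.Address, "bind-address", ...)
#   flag.Int("max_pods", ...)
# and captures the flag name.
FLAG_DECL = re.compile(
    r'\.(?:String|Int|Bool|Float64|Duration)(?:Var)?\(\s*(?:[^,"]+,\s*)?"([^"]+)"'
)


def find_flags(go_source):
    """Return declared flag names that contain a '-' or an '_'."""
    return {
        name
        for name in FLAG_DECL.findall(go_source)
        if "-" in name or "_" in name
    }


def unknown_flags(go_source, known_flags):
    """Return declared flags that are missing from the static list."""
    return sorted(find_flags(go_source) - set(known_flags))
```

A non-empty result from `unknown_flags` is the case where the commit would be rejected until a human adds the flag to the static list.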
After it finds all of the flags it scans all of the arguments (or all
files in the repo if no arguments are given) looking for usages of those
flags which include an `_`. There are lots of places where these are false
positives. For example we have a flag named oom-adj-score but the kernel
calls it oom_adj_score. To handle this we keep a second 'whitelist' of
lines which are allowed to use these flag names with an `_`.
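The usage scan with its whitelist might look roughly like this. It is a sketch under assumptions: the whitelist is modeled as a set of `(filename, stripped line)` pairs, and `underscore_usages` is a hypothetical name, neither of which is confirmed by the actual script.

```python
import re


def underscore_usages(filename, lines, flags, whitelist):
    """Return (filename, line, flag) for each non-whitelisted underscore use.

    flags:     flag names as declared, e.g. "oom-adj-score"
    whitelist: assumed to be a set of (filename, stripped_line) pairs
    """
    hits = []
    for flag in flags:
        underscored = flag.replace("-", "_")
        if "_" not in underscored:
            continue  # single-word flags cannot be misspelled this way
        # Match the underscore spelling as a whole word
        pattern = re.compile(r"(?<!\w)" + re.escape(underscored) + r"(?!\w)")
        for line in lines:
            stripped = line.strip()
            if pattern.search(line) and (filename, stripped) not in whitelist:
                hits.append((filename, stripped, flag))
    return hits
```

In the oom-adj-score case, a comment that mentions the kernel's spelling would sit in the whitelist, while `--oom_adj_score=10` on a command line would be reported.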
Running it over the entire git repo, looking for flags in every golang file
and looking in every single file for bad usage, takes about 8.75 seconds.
Running it in the pre-commit hook, where we only check things that changed,
takes about .06 seconds.
Right now some of the hack/* tools use `go run` and build almost every
time. There are some which expect you to have already run `go install`.
And the pre-commit hook, which runs a full build, wouldn't want to do
either, since it just built!
This creates a new hack/after-build/ directory which holds the scripts that
REQUIRE that the binary already be built. They don't test and complain.
They just fail miserably. Users should not be in this directory. Users
should just use hack/verify-* which will just do the build and then call
the "after-build" version. The pre-commit hook or anything which KNOWS
the binaries have been built can use the fast version.
We found that someone just copied/pasted the boilerplate language into
their code. But the boilerplate contains 2014, not 2015. We have 2 ways
to fix this.
1) Update the boilerplate to 2015 so people would get the right one.
2) Update the boilerplate so it doesn't make sense and then warn when
people use it.
This PR takes the second option. While option #1 seems easier, it will
be wrong again in 2016, 17, 18 and it's unlikely anyone will remember why
they need to update the boilerplate text and the regex rewrite. So just
make the humans do a tiny bit more work now.
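One way the second option could work is sketched below. The placeholder string `YEAR`, the accepted years, and the function name are assumptions made for illustration; the message above only says that the boilerplate is made nonsensical and that a regex rewrite is involved.

```python
import re

PLACEHOLDER = "YEAR"  # assumed stand-in for the year in the template
YEAR_RE = re.compile(r"\b(2014|2015)\b")  # years assumed to be acceptable


def header_matches(file_header, reference):
    """Compare a file's header against the template with the year masked.

    A header that still contains the placeholder was copied verbatim
    from the template, so it is rejected.
    """
    if PLACEHOLDER in file_header:
        return False
    return YEAR_RE.sub(PLACEHOLDER, file_header, count=1) == reference
```

Because the template itself never passes the check, copying and pasting it unchanged produces a failure instead of a silently wrong year.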
Clayton pointed out that if he created a file with no /* in it anywhere
the boilerplate logic would crash like:
    $ hack/verify-boilerplate.sh
    Traceback (most recent call last):
      File "hack/../hooks/boilerplate.py", line 87, in <module>
        sys.exit(main())
      File "hack/../hooks/boilerplate.py", line 83, in main
        if not file_passes(filename, extention, ref, p):
      File "hack/../hooks/boilerplate.py", line 38, in file_passes
        while data[0] != "/*\n":
    IndexError: list index out of range
That is because we were just stripping everything before the first line
that contained exactly "/*". If no such line existed it got to the end
and just kept going.
This does something smarter. We use a regex to look for one or more
lines which start with `// +build`, followed by a single newline, and
remove only those. This also found one place where the package name was
above the license and was being missed by both the old and the new checker.
It also fixes the Python traceback spew and just tells you that your file
fails.
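The regex approach can be sketched like this. The exact pattern in boilerplate.py may differ; this version simply follows the description above (one or more `// +build` lines at the top of the file, then a single newline) and the names are illustrative.

```python
import re

# One or more "// +build ..." lines at the very start of the file,
# followed by a single newline (the blank line after the build tags).
BUILD_TAGS = re.compile(r"\A(?://[ \t]*\+build[^\n]*\n)+\n")


def strip_build_tags(data):
    """Remove leading build-tag lines; leave every other file untouched."""
    return BUILD_TAGS.sub("", data, count=1)
```

Unlike the old loop, this cannot run off the end of the file: a file with no `/*` line, or an empty file, is returned unchanged and simply fails the later header comparison.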
It's just a little bit faster.....
BEFORE:
$ time hack/verify-boilerplate.sh
real 0m9.378s
user 0m3.405s
sys 0m13.906s
AFTER:
$ time hack/verify-boilerplate.sh
real 0m0.181s
user 0m0.114s
sys 0m0.068s