The extra "showlog" command in our OpenRC service script was more
trouble than it was worth: the only thing it did was call "less" on a
log file, and the service script is only guessing at the location of
the log file (only the fail2ban server knows its true location).
It's not like "/etc/init.d/fail2ban showlog" is that much easier to type
than "less /var/log/fail2ban.log" in the first place, so I think the
extra complexity (5 more lines in the service script) is not worth it.
If the "retry" variable is set in the service script, we don't have to
pass it to start-stop-daemon explicitly. While we can't immediately
eliminate any code with this change, it will be necessary later to
adopt the default OpenRC stop() function.
If our service is installed under some other name, then we don't want
the service script to say things like "Starting fail2ban..." because
the name "fail2ban" won't make any sense at that point. Instead, we
use the $RC_SVCNAME variable to ensure that the service name matches
what we tell the user. Typically, however, $RC_SVCNAME will still be
"fail2ban".
Our OpenRC service script performs two tasks before starting the service:
1. It removes any stake sockets (from e.g. a system crash).
2. It ensures that the PID file directory exists.
These have both been moved into the "start_pre" phase, which is
designed to do such things (and will allow us to simplify the "start"
phase in the future). The existing "mkdir -p" has also been converted
into a "checkpath -d" command which is built-in to OpenRC.
OpenRC has a special variable "pidfile" that should be used to store
the location of the daemon's PID file. This commit replaces two
instances of said location with one variable.
The FAIL2BAN variable in our OpenRC service script was a combination
of two standard OpenRC variables, "command" and "command_args". This
commit simply replaces the custom variable with the two standard
ones. This will aid future simplifications of the service script.
Our OpenRC service script contained a "need logger" dependency, which
meant that the life cycle of the fail2ban service was tied to that of
the system logger service. That isn't quite correct: fail2ban
functions fine even if the system logger is stopped:
1. fail2ban is capable of analyzing non-syslog log files.
2. Even if fail2ban is solely analyzing syslog files, we don't
want to stop the fail2ban service simply because syslog was
stopped -- fail2ban just won't see any new log lines until
syslog is started again.
This commit changes the "need net" dependency to "use net", which will
still attempt to start the system logger service, but which won't kill
fail2ban if the system logger is ever stopped.
The "need net" dependency in our OpenRC service script was incorrect:
the fail2ban service does not need a working WAN to function. This
issue is well-documented and is covered in the OpenRC Service Script
Guide, currently located at
https://github.com/OpenRC/openrc/blob/master/service-script-guide.md
Our OpenRC conf file already tells users how to find the available
options that can be placed in the FAIL2BAN_OPTIONS variable, so having
a specific example of,
FAIL2BAN_OPTIONS="-x"
doesn't provide much more information. In fact, it makes you wonder
why it's there in the first place: does the init script have some kind
of problem with stale sockets? It used to, but that problem has been
fixed. This commit removes the redundant example.
There were two paths mentioned in comments in the fail2ban OpenRC conf
file, but those paths aren't guaranteed to be correct (until/unless we
integrate the conf file with the build system).
The first comment referenced the physical location of the associated
init script, and in my opinion is not useful to an end user in the
first place. It has been removed: OpenRC users know what this file
is for, there's no reason to repeat it in a comment.
The second comment contained an absolute path to fail2ban-client, and
I've removed the leading path components because "fail2ban-client" is
generally run from your $PATH.
We ship a service script and configuration file for "gentoo" that are
actually more generally applicable: they work on any system where
OpenRC is used. This commit simply renames the files from "gentoo" to
"openrc" to reflect the fact that they are in no way Gentoo-specific.
add descriptions to stop syslog errors for extra_started_commands when running:
rc-service ipset describe
Oct 28 15:13:30 xxxx daemon.warn /etc/init.d/fail2ban[26446]: ^[[1m^[[36mreload^[[m: no description
Oct 28 15:13:30 xxxx daemon.warn /etc/init.d/fail2ban[26447]: ^[[1m^[[36mshowlog^[[m: no description
- setup process generates `build/fail2ban.service` from `files/fail2ban.service.in` using distribution related bin-path;
- bug-fixing by running setup with option `--dry-run` (note: specify option `--dry-run` before `install`, like `python setup.py --dry-run install`);
- test cases extended to cover dry-run.
The fail2ban server can take several seconds to shut down. This can
make Gentoo's start-stop-service time out and decide that stopping has
failed, even if it actually succeeds a few seconds later.
The default timeout for start-stop-service if --retry is not specified
appears to be 5 seconds. Increase that to 30 seconds to be sure that if
fail2ban-server is going to be able to stop, it has time to do so.
- starting service in normal mode (without forking)
- does not restart if service exited normally (exit-code 0, e.g. stopped via fail2ban-client)
- does not restart if service can not start (exit-code 255, e.g. wrong configuration, etc.)
- service can be additionally started/stopped with commands (fail2ban-client, fail2ban-server)
Currently, if fail2ban is killed (or crashes), its status will be
reported by '/etc/init.d/fail2ban status' as 'running' even though it
is not. Attempting to restart the service also fails, because Gentoo
unsuccessfully tries to stop the service.
By using start-stop-daemon and providing a pidfile, Gentoo will
instead report the status as 'crashed' and allow the service to be
restarted as normal.
These fixes are pretty pedantic, but they do simplify the script a
little.
* Checking the existence of a file/directory before creating/deleting
it adds complexity and raciness. There are better options.
* mkdir -p does the job of making sure a directory exists. (It only
fails if there's a filesystem error or something.)
* Likewise, rm -f doesn't fail if the file doesn't exist.
* rm -r isn't neccessary because the socket shouldn't be a directory.
(If it is for some reason, that should be an error.)
As a useful side effect, prevents "Unable to contact server. Is it
running?" mails from cron when fail2ban hasn't been (intentionally)
running nor thus logging anything either.
(a) use static-network-up, since it is more generic than the started networking event
(b) do not hook into network deconfiguration to speed up shutdown
(c) expect fork, per the use of the "-f" option
(d) use a variable for the run directory to make changing it simpler
(e) handle the situation of a left over socket file
(f) use the -f option to be able to track the PID
No longer directly exec the server, do not remove the PID file because it is unnecessary to do so. No longer respawns because Upstart can not track the process with the starter command.