DOC: additional pass over DEVELOP (just rephrasings, spaces, formatting)

pull/367/head
Yaroslav Halchenko 2013-09-25 22:12:36 -04:00
parent 3d6fa59b53
commit e9504122b8
1 changed files with 148 additions and 134 deletions

DEVELOP

@ -1,6 +1,6 @@
__ _ _ ___ _ __ _ _ ___ _
/ _|__ _(_) |_ ) |__ __ _ _ _ / _|__ _(_) |_ ) |__ __ _ _ _
| _/ _` | | |/ /| '_ \/ _` | ' \ | _/ _` | | |/ /| '_ \/ _` | ' \
|_| \__,_|_|_/___|_.__/\__,_|_||_| |_| \__,_|_|_/___|_.__/\__,_|_||_|
================================================================================ ================================================================================
@ -26,7 +26,7 @@ Pull Requests
When submitting pull requests on GitHub we ask you to: When submitting pull requests on GitHub we ask you to:
* Clearly describe the problem you're solving; * Clearly describe the problem you're solving;
* Don't introduce regressions that will make it hard for systems administrators * Don't introduce regressions that will make it hard for systems administrators
to update; to update;
* If adding a major feature rebase your changes on master and get to a single commit; * If adding a major feature rebase your changes on master and get to a single commit;
* Include test cases (see below); * Include test cases (see below);
@ -42,74 +42,79 @@ Filters are tricky. They need to:
* work with the range of logging configuration options available in the * work with the range of logging configuration options available in the
software; software;
* work with multiple operating systems; * work with multiple operating systems;
* not make assumptions about the log format in excess of the software (don't * not make assumptions about the log format in excess of the software
assume a username doesn't contain spaces and use \S+ unless you've checked (e.g. do not assume a username doesn't contain spaces and use \S+ unless
the source code); you've checked the source code);
* make assumptions as to how future versions of the software will log messages * account for how future versions of the software will log messages
(guess what would happen to the log message if different authentication (e.g. guess what would happen to the log message if different authentication
types are added); types are added);
* not be susceptible to DoS vulnerabilities (see Filter Security below); and * not be susceptible to DoS vulnerabilities (see Filter Security below); and
* match intended log lines only. * match intended log lines only.
Please follow the steps from Filter Test Cases to Developing Filter Regular Please follow the steps from Filter Test Cases to Developing Filter Regular
Expressions and submit a GitHub pull request afterwards. If you get stuck, Expressions and submit a GitHub pull request (PR) afterwards. If you get stuck,
create a GitHub issue with what you have done and we'll attempt to help. you can push your unfinished changes and still submit a PR -- describe
what you have done, what is the hurdle, and we'll attempt to help (PR
will be automagically updated with future commits you would push to
complete it).
Filter test cases Filter test cases
----------------- -----------------
Purpose: Purpose:
Start by finding the log messages that the application generates related to Start by finding the log messages that the application generates related to
some form of authentication failure. If you are adding to an existing filter some form of authentication failure. If you are adding to an existing filter
think about whether the log messages are of a similar importance and purpose think about whether the log messages are of a similar importance and purpose
to the existing filter. If you are a user of fail2ban, and did a package to the existing filter. If you were a user of Fail2Ban, and did a package
update of fail2ban that started matching the new log messages, would anything update of Fail2Ban that started matching new log messages, would anything
unexpected happen? Would the bantime/findtime for the jail be appropriate for unexpected happen? Would the bantime/findtime for the jail be appropriate for
the new log messages. If it doesn't perhaps it needs to be in a separate the new log messages? If it doesn't, perhaps it needs to be in a separate
filter definition, for example like exim is authentication failures and filter definition, for example like exim filter aims at authentication failures
exim-spam contains log messages related to spam. and exim-spam at log messages related to spam.
Even if it is a new filter you may consider separating the log messages into Even if it is a new filter you may consider separating the log messages into
different filters based on purpose. different filters based on purpose.
Cause: Cause:
Are some of the log lines a result of the same action? For example is a PAM Are some of the log lines a result of the same action? For example, is a PAM
failure log message, followed by an application specific failure message the failure log message, followed by an application specific failure message the
result of the same user/script action. The result is if you add regular result of the same user/script action? If you add regular expressions for
expressions for both you'll end up with two failures for a single action. both you would end up with two failures for a single action.
Select the most appropriate log message and document the other log message with Therefore, select the most appropriate log message and document the other log
a test case not to match it and a description as to why you chose one over message) with a test case not to match it and a description as to why you chose
another. one over another.
With the log lines selected consider what occurred to generate those log With the selected log lines consider what action has caused those log
messages and whether they could of been generated by accidental means. Could messages and whether they could have been generated by accident? Could
the log message occur always as this is the first step towards the application the log message be occurring due to the first step towards the application
asking for authentication? Could the log messages occur often? If some of asking for authentication? Could the log messages occur often? If some of
these are true make a note of this in the jail.conf example that you provide. these are true make a note of this in the jail.conf example that you provide.
Samples: Samples:
Its important to include log file samples so any future change in the regular It is important to include log file samples so any future change in the regular
expression will still work with the log lines you have identified. expression will still work with the log lines you have identified.
The sample log messages are provided in testcases/files/logs/ with same name The sample log messages are provided in a file under testcases/files/logs/
as the filter. Each log line should include a failJSON metadata (so the logs named identically as the corresponding filter (but without .conf extension).
Each log line should be preceded by a line with failJSON metadata (so the logs
lines are tested in the test suite) directly above the log line. If there is lines are tested in the test suite) directly above the log line. If there is
any specific information about the log message, such as version or an any specific information about the log message, such as version or an
application configuration option that is needed for the message to occur, application configuration option that is needed for the message to occur,
include this in a comment (line beginning with #) above the failJSON metadata. include this in a comment (line beginning with #) above the failJSON metadata.
Log samples should include only one, definitely not more than 3, examples of Log samples should include only one, definitely not more than 3, examples of
log messages of the same form. If log messages are different in different log messages of the same form. If log messages are different in different
versions of the application log messages that show this is encouraged. versions of the application log messages that show this are encouraged.
Also attempt inject an IP into the application so that fail2ban detects the IP Also attempt to inject an IP into the application (e.g. by specifying
it as a username) so that Fail2Ban possibly detects the IP
from user input rather than the true origin. See the Filter Security section from user input rather than the true origin. See the Filter Security section
and the top example in testcases/files/logs/apache-auth as to how to do this. and the top example in testcases/files/logs/apache-auth as to how to do this.
One you have discovered this correct the regex so it doesn't match and provide One you have discovered that this is possible, correct the regex so it doesn't
this as a test case with match: false (see failJSON below). match and provide this as a test case with "match": false (see failJSON below).
If the mechanism to create the log message isn't obvious provide a If the mechanism to create the log message isn't obvious provide a
configuration and/or sample scripts testcases/files/config/{filtername} and configuration and/or sample scripts testcases/files/config/{filtername} and
@ -120,24 +125,25 @@ FailJSON metadata:
A failJSON metadata is a comment immediately above the log message. It will A failJSON metadata is a comment immediately above the log message. It will
look like: look like:
# failJSON: { "time": "2013-06-10T10:10:59", "match": true , "host": "193.169.56.211" } # failJSON: { "time": "2013-06-10T10:10:59", "match": true , "host": "93.184.216.119" }
Time should match the time of the log message. It is in a specific format of Time should match the time of the log message. It is in a specific format of
Year-Month-Day'T'Hour:minute:Second. If your log message does not include a Year-Month-Day'T'Hour:minute:Second. If your log message does not include a
year, like the example below, the year will be 2005, if before Sun Aug 14 10am year, like the example below, the year should be listed as 2005, if before Sun
UTC, and 2004 if afterwards. Aug 14 10am UTC, and 2004 if afterwards. Here is an example failJSON
line preceding a sample log line:
# failJSON: { "time": "2005-03-24T15:25:51", "match": true , "host": "198.51.100.87" } # failJSON: { "time": "2005-03-24T15:25:51", "match": true , "host": "198.51.100.87" }
Mar 24 15:25:51 buffalo1 dropbear[4092]: bad password attempt for 'root' from 198.51.100.87:5543 Mar 24 15:25:51 buffalo1 dropbear[4092]: bad password attempt for 'root' from 198.51.100.87:5543
The host will contain the IP or domain that should be blocked. The "host" in failJSON should contain the IP or domain that should be blocked.
For long lines that you don't want matched, like log injection vulnerabilities For long lines that you do not want to be matched (e.g. from log injection
and log lines excluded (see "Cause" section above), a "match": false in the attacks) and any log lines to be excluded (see "Cause" section above), set
failJSON and the reason why in the comment above. "match": false in the failJSON and describe the reason in the comment above.
After developing the regexs, the following command will test all the failJSON After developing regexes, the following command will test all failJSON metadata
metadata against the log lines: against the log lines in all sample log files
./fail2ban-testcases testSampleRegex ./fail2ban-testcases testSampleRegex
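
As a rough illustration of the failJSON convention described in this hunk, the sketch below pairs each failJSON comment with the log line directly beneath it and checks the time format. It is not the code of the Fail2Ban test suite: parse_sample and check_metadata are hypothetical helper names, and the rule that a matching entry must carry a "host" is an assumption drawn from the text above.

```python
import json
from datetime import datetime

SAMPLE = """\
# failJSON: { "time": "2005-03-24T15:25:51", "match": true , "host": "198.51.100.87" }
Mar 24 15:25:51 buffalo1 dropbear[4092]: bad password attempt for 'root' from 198.51.100.87:5543
"""

FAILJSON_PREFIX = "# failJSON:"


def parse_sample(text):
    """Pair each failJSON comment with the log line directly below it."""
    pairs = []
    pending = None
    for line in text.splitlines():
        if line.startswith(FAILJSON_PREFIX):
            pending = json.loads(line[len(FAILJSON_PREFIX):])
        elif pending is not None and line.strip() and not line.startswith("#"):
            pairs.append((pending, line))
            pending = None
    return pairs


def check_metadata(meta):
    """Validate the fields described above; raises on a malformed entry."""
    # Year-Month-Day'T'Hour:minute:Second
    datetime.strptime(meta["time"], "%Y-%m-%dT%H:%M:%S")
    if not isinstance(meta["match"], bool):
        raise ValueError('"match" must be true or false')
    # Assumption: an entry expected to match must name the host to block.
    if meta["match"] and "host" not in meta:
        raise ValueError('a matching line needs a "host"')
    return True


pairs = parse_sample(SAMPLE)
for meta, log_line in pairs:
    check_metadata(meta)
    print(meta["host"], "<-", log_line)
```

The authoritative check remains ./fail2ban-testcases testSampleRegex; the sketch only shows the shape of the data.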
@ -146,28 +152,29 @@ Developing Filter Regular Expressions
Date/Time: Date/Time:
The first step in checking your log line can have a filter is to check that the At the moment, Fail2Ban depends on log lines to have time stamps. That is why
time format matches an existing regex. To test this copy the time component before starting to develop failregex, check if your log line format known to
from the log line and append an IP address. Then test it with: Fail2Ban. Copy the time component from the log line and append an IP address to
test with following command:
./fail2ban-regex "2013-09-19 02:46:12 1.2.3.4" "<HOST>" ./fail2ban-regex "2013-09-19 02:46:12 1.2.3.4" "<HOST>"
In the output from this should be something like: Output of such command should contain something like:
Date template hits: Date template hits:
|- [# of hits] date format |- [# of hits] date format
| [1] Year-Month-Day Hour:Minute:Second | [1] Year-Month-Day Hour:Minute:Second
Ensure that the template description matches of bits in the time format. If Ensure that the template description matches time/date elements in your log line
there isn't a matched a format and date regex can be added to time stamp. If there is no matched format then date template needs to be added
server/datedetector.py. Ensure this is added in an order that will match make to server/datedetector.py. Ensure that a new template is added in the order
more specific matches occur first and that their is no confusion as to which that more specific matches occur first and that there is no confusion between a
is the date or month. Day and a Month.
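
The ordering concern above (more specific templates first, no confusion between day and month) can be sketched as follows. This does not use the API of server/datedetector.py: the TEMPLATES list, its descriptions and first_template are hypothetical, and plain datetime.strptime stands in for the real date templates.

```python
from datetime import datetime

# Hypothetical template list; order matters because the first hit wins.
TEMPLATES = [
    ("Year-Month-Day Hour:Minute:Second", "%Y-%m-%d %H:%M:%S"),
    ("Day/Month/Year Hour:Minute:Second", "%d/%m/%Y %H:%M:%S"),
    ("Month/Day/Year Hour:Minute:Second", "%m/%d/%Y %H:%M:%S"),
]


def first_template(timestamp):
    """Return (description, parsed datetime) of the first matching template."""
    for name, fmt in TEMPLATES:
        try:
            return name, datetime.strptime(timestamp, fmt)
        except ValueError:
            continue
    return None


print(first_template("2013-09-19 02:46:12"))
# Ambiguous: valid as both 3 April and 4 March, so list order decides.
print(first_template("03/04/2013 10:00:00"))
# Unambiguous: 25 cannot be a month, so the Day/Month template is skipped.
print(first_template("12/25/2013 10:00:00"))
```

The second call shows why the position of a new template deserves care: a timestamp that is valid under two templates is silently read by whichever comes first.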
Filter file: Filter file:
The filter file is in config/filter.d/{filtername}.conf. The format of the The filter is specified in a config/filter.d/{filtername}.conf file. Filter file
filter file has two sections INCLUDES and Definition as follows: can have sections INCLUDES (optional) and Definition as follows:
[INCLUDES] [INCLUDES]
@ -181,30 +188,31 @@ failregex = ....
ignoreregex = .... ignoreregex = ....
This is also documented in the man pages as jail.conf (section 5). Other This is also documented in the man page jail.conf (section 5). Other definitions
definitions can be added to make failregex's more readable and maintainable. can be added to make failregex's more readable and maintainable to be used
through string Interpolations (see http://docs.python.org/2.7/library/configparser.html)
General rules: General rules:
Use "before" if you need to include a common set of rules, like syslog or if Use "before" if you need to include a common set of rules, like syslog or if
there's a common set of regexs for multiple filters. there is a common set of regexes for multiple filters.
Use "after" if you wish to allow the user to overwrite a set of customisation's Use "after" if you wish to allow the user to overwrite a set of customisations
of the current filter. This file doesn't need to exist. of the current filter. This file doesn't need to exist.
Try to avoid using ignoreregex mainly for performance reasons. The case when Try to avoid using ignoreregex mainly for performance reasons. The case when you
you would use it is if in trying to avoid using ignoreregex, you end up with would use it is if in trying to avoid using it, you end up with an unreadable
an unreadable failregex. failregex.
Syslog: Syslog:
If your application logs to syslog you can use the following to capture that If your application logs to syslog you can take advantage of log line prefix
part. So as a base use: definitions present in common.conf. So as a base use:
[INCLUDES] [INCLUDES]
before = commmon.conf before = common.conf
[Definition] [Definition]
@ -213,113 +221,119 @@ _daemon = app
failregex = ^%(__prefix_line)s failregex = ^%(__prefix_line)s
In this example common.conf defines __prefix_line which also contains the In this example common.conf defines __prefix_line which also contains the
_daemon name, (in syslog terms the service) you specified. _daemon can also be _daemon name (in syslog terms the service) you have just specified. _daemon
a regex. can also be a regex.
So the following uses a _daemon set to "dovecot" For example, to capture following line _daemon should be set to "dovecot"
Dec 12 11:19:11 dunnart dovecot: pop3-login: Aborted login (tried to use disabled plaintext auth): rip=190.210.136.21, lip=113.212.99.193 Dec 12 11:19:11 dunnart dovecot: pop3-login: Aborted login (tried to use disabled plaintext auth): rip=190.210.136.21, lip=113.212.99.193
So now ^%(__prefix_line)s matches "Dec 12 11:19:11 dunnart dovecot: ". Note it and then ^%(__prefix_line)s would match "Dec 12 11:19:11 dunnart dovecot:
matches the trailing space. Putting a space after ^%(__prefix_line)s in the ". Note it matches the trailing space(s) as well.
regex will probably not match.
Substitutions: Substitutions (AKA string interpolations):
Substation's are what the syslog uses. The regex bits of %(_name)s substitute We have used string interpolations in above examples. They are useful for
the _name definition into the regex. They are useful for making the regexes making the regexes more readable, reuse generic patterns in multiple failregex
more readable and also defining regex parts that occur in multiple log lines. lines, and also to refer definition of regex parts to specific filters or even
to the user. General principle is that value of a _name variable replaces
occurrences of %(_name)s within the same section or anywhere in the config file
if defined in [DEFAULT] section.
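
To see the substitution mechanism in isolation, here is a minimal sketch using Python 3's configparser (the document links the Python 2.7 ConfigParser docs; the %(name)s syntax is the same). The __prefix_line value is a simplified stand-in for the real definition in common.conf, and the failregex is illustrative only, not the shipped dovecot filter.

```python
import configparser

# Raw string so the regex backslashes reach configparser unchanged.
FILTER_TEXT = r"""
[Definition]
_daemon = dovecot
__prefix_line = \S+ %(_daemon)s:\s*
failregex = ^%(__prefix_line)spop3-login: Aborted login \(tried to use disabled plaintext auth\): rip=<HOST>, lip=\S+\s*$
"""

parser = configparser.ConfigParser()  # basic %(name)s interpolation is the default
parser.read_string(FILTER_TEXT)

# get() returns the value with every %(name)s reference expanded.
expanded = parser.get("Definition", "failregex")
print(expanded)

# raw=True shows the value exactly as written in the file.
print(parser.get("Definition", "failregex", raw=True))
```

Note that configparser strips trailing whitespace from values, which is why the stand-in prefix ends in \s* rather than a literal space.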
Regular Expressions: Regular Expressions:
The regular expression you will be writing will assume that the date/time has Regular expressions (failregex, ignoreregex) assume that the date/time has been
been removed from the log line because this is how fail2ban works internally. removed from the log line (this is just how fail2ban works internally ATM).
If the format is like '<date...> error 1.2.3.4 is evil' then you will need to If the format is like '<date...> error 1.2.3.4 is evil' then you need to match
match the < at the start so regex should be similar to '^<> <HOST> is evil$'. the < at the start so regex should be similar to '^<> <HOST> is evil$' using
<HOST> where the IP/domain name appears in the log line.
Use <HOST> where the IP/domain name appears in the log line.
The following general rules apply to regular expressions: The following general rules apply to regular expressions:
* Ensure regexs start with a ^ and are restrictive as possible. E.g. not .* if * ensure regexes start with a ^ and are as restrictive as possible. E.g. do not
\d+ is sufficient use .* if \d+ is sufficient;
* Use the functionality of regexs http://docs.python.org/2/library/re.html * use functionality of Python regexes defined in the standard Python re library
* Try to make the regular expression readable (as much as possible). E.g. http://docs.python.org/2/library/re.html;
(?:...) represents a non-capturing regex but (...) is more readable. * make regular expressions readable (as much as possible). E.g.
(?:...) represents a non-capturing regex but (...) is more readable, thus
preferred.
If you only have a basic knowledge of regular repressions read If you have only a basic knowledge of regular repressions we advise to read
http://docs.python.org/2/library/re.html first. Really. It doesn't take long http://docs.python.org/2/library/re.html first. It doesn't take long and would
and will remind you which bits you need to escape and which bits you don't. remind you e.g. which characters you need to escape and which you don't.
Developing/testing the regex: Developing/testing a regex:
You can develop the regex in the file or on the command line depending on your You can develop a regex in a file or using command line depending on your
preference. You can also use the samples you've created in the test cases or preference. You can also use samples you have already created in the test cases
test them one at a time. or test them one at a time.
The general tool is fail2ban-regex. To see how to use it run: The general tool for testing Fail2Ban regexes is fail2ban-regex. To see how to
use it run:
./fail2ban-regex --help ./fail2ban-regex --help
Take note of -l heavydebug / -l debug and -v as they will be most useful. Take note of -l heavydebug / -l debug and -v as they might be very useful.
TIP: Take a look at the source code of the application. You may see optional or TIP: Take a look at the source code of the application you are developing
extra log messages, or parts there of, that need to form part of your regex. failregex for. You may see optional or extra log messages, or parts there
It may also show how some parts are con trained and different formats of, that need to form part of your regex. It may also reveal how some
depending on configuration or less common usages. parts are constrained and different formats depending on configuration or
less common usages.
TIP: Some applications log spaces at the end. If you're not sure add \s*$ as the TIP: Some applications log spaces at the end. If you are not sure add \s*$ as
end part of the regex. the end part of the regex.
If your regex isn't matching take a look at http://www.debuggex.com/?flavor=python If your regex is not matching, http://www.debuggex.com/?flavor=python can help
to tune it:
Using the regex from the ./fail2ban-regex output (to ensure all substitutions * use regex from the ./fail2ban-regex output (to ensure all substitutions are
are done) and with <HOST> replaced with (?&.ipv4). Set the regex type to done) and replace <HOST> with (?&.ipv4). Make sure that regex type set to
Python. Python;
* for the test data put your log output with the time removed;
For the test data put your log output with the time removed. - when you have fixed the regex put it back into your filter file.
When you've fixed the regex put it back into your filter file.
Please spread the good word about debuggex - Serge Toarca is kindly continuing Please spread the good word about debuggex - Serge Toarca is kindly continuing
its free availability to Open Source developers. its free availability to Open Source developers.
Finishing up: Finishing up:
If you've created a new filter, add an entry in config/jail.conf. The theory If you've added a new filter, add a new entry in config/jail.conf. The theory
here is that a user will create a jail.conf with [filtername]\nenable=true. here is that a user will create a jail.local with [filtername]\nenable=true to
enable your jail.
So more specifically in the [filter] section in jail.conf: So more specifically in the [filter] section in jail.conf:
* Ensure that you have "enabled = false", we want people to enable as needed * ensure that you have "enabled = false" (users will enable as needed);
* use "filter =" set to your filter name. * use "filter =" set to your filter name;
* use a action to disable ports associated with the application * use a typical action to disable ports associated with the application;
* set "logpath" to a usual location for the log file for the application. * set "logpath" to the usual location of application log file;
* If the default findtime or bantime isn't appropriate to the filter set a value * if the default findtime or bantime isn't appropriate to the filter, specify
that is more appropriate. more appropriate choices (possibly with a brief comment line).
Send the fail2ban a git pull request (See "Pull Requests" above) containing Submit github pull request (See "Pull Requests" above) for
your great work. github.com/fail2ban/fail2ban containing your great work.
Filter Security Filter Security
--------------- ---------------
Poor filter regular expressions are susceptible to DoS attacks. Poor filter regular expressions are susceptible to DoS attacks.
When a remote user has the ability to introduce text that will match the When a remote user has the ability to introduce text that would match filter's
filter regex, such that the inserted text matches the <HOST> part, they have the failregex, while matching inserted text to the <HOST> part, they have the
ability to deny any host they choose. ability to deny any host they choose.
So the <HOST> part must be anchored on text generated by the application, and not So the <HOST> part must be anchored on text generated by the application, and
the user, to a sufficient extent that the user cannot insert the entire text. not the user, to a extent sufficient to prevent user inserting the entire text
matching this or any other failregex.
Ideally filter regex should anchor to the beginning and end of the log line Ideally filter regex should anchor at the beginning and at the end of log line.
however as more applications log at the beginning than the end, anchoring the However as more applications log at the beginning than the end, anchoring the
beginning is more important. If the log file used by the application is shared beginning is more important. If the log file used by the application is shared
with other applications, like system logs, ensure the other application that with other applications, like system logs, ensure the other application that use
use that log file do not log user generated text at the beginning of the line, that log file do not log user generated text at the beginning of the line, or,
or, if they do, ensure the regexs of the filter are sufficient to mitigate the if they do, ensure the regexes of the filter are sufficient to mitigate the risk
risk of insertion. of insertion.
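
The injection risk described in this section can be reproduced with a toy example. The log format, the log_line helper and both regexes are hypothetical, and HOST is a simplified IPv4-only stand-in for <HOST>; the point is only to contrast an unanchored expression with one anchored on application-generated text.

```python
import re

# Assumption: simplified IPv4-only stand-in for the <HOST> tag.
HOST = r"(?P<host>\d{1,3}(\.\d{1,3}){3})"


def log_line(user, client_ip):
    """Hypothetical application log format; the user name is attacker-controlled."""
    return "auth failure for user %s from %s" % (user, client_ip)


# Poor: not anchored, so the first "from <ip>" in the line wins.
poor = re.compile(r"from " + HOST)
# Better: anchored at both ends; the host must be the final, app-written field.
better = re.compile(r"^auth failure for user .+ from " + HOST + r"$")

attacker_ip = "198.51.100.87"
victim_ip = "203.0.113.9"
injected = log_line("x from " + victim_ip, attacker_ip)

print(injected)
print(poor.search(injected).group("host"))    # the victim's address
print(better.search(injected).group("host"))  # the true origin
```

Here the end anchor does the work because this made-up application writes the address last; for software that logs the address first, the beginning anchor plays the same role.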
Examples of poor filters Examples of poor filters
@ -714,11 +728,11 @@ ver. 0.8.12 (2013/XX/XXX) - wanna-be-released
----------- -----------
- Fixes: - Fixes:
- New Features: - New Features:
- Enhancements: - Enhancements:
and adjust common/version.py to carry .dev suffix to signal and adjust common/version.py to carry .dev suffix to signal
a version under development. a version under development.