From 2b9d4f86cd1155b548a92b4429f05ed36a7a7a79 Mon Sep 17 00:00:00 2001 From: Daniel Black Date: Sun, 29 Dec 2013 07:26:41 +0000 Subject: [PATCH] DOC: filter documentation was bigger than DEVELOP so separated it out. Hopefully it may get read more --- DEVELOP | 460 +---------------------------------------------------- FILTERS | 469 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ MANIFEST | 1 + 3 files changed, 471 insertions(+), 459 deletions(-) create mode 100644 FILTERS diff --git a/DEVELOP b/DEVELOP index 3939fe47..18d29ad4 100644 --- a/DEVELOP +++ b/DEVELOP @@ -34,465 +34,7 @@ When submitting pull requests on GitHub we ask you to: * Include a change to the relevant section of the ChangeLog; and * Include yourself in THANKS if not already there. -Filters -======= - -Filters are tricky. They need to: -* work with a variety of the versions of the software that generates the logs; -* work with the range of logging configuration options available in the - software; -* work with multiple operating systems; -* not make assumptions about the log format in excess of the software - (e.g. do not assume a username doesn't contain spaces and use \S+ unless - you've checked the source code); -* account for how future versions of the software will log messages - (e.g. guess what would happen to the log message if different authentication - types are added); -* not be susceptible to DoS vulnerabilities (see Filter Security below); and -* match intended log lines only. - -Please follow the steps from Filter Test Cases to Developing Filter Regular -Expressions and submit a GitHub pull request (PR) afterwards. If you get stuck, -you can push your unfinished changes and still submit a PR -- describe -what you have done, what is the hurdle, and we'll attempt to help (PR -will be automagically updated with future commits you would push to -complete it). 
- -Filter test cases ------------------ - -Purpose: - -Start by finding the log messages that the application generates related to -some form of authentication failure. If you are adding to an existing filter -think about whether the log messages are of a similar importance and purpose -to the existing filter. If you were a user of Fail2Ban, and did a package -update of Fail2Ban that started matching new log messages, would anything -unexpected happen? Would the bantime/findtime for the jail be appropriate for -the new log messages? If it doesn't, perhaps it needs to be in a separate -filter definition, for example like exim filter aims at authentication failures -and exim-spam at log messages related to spam. - -Even if it is a new filter you may consider separating the log messages into -different filters based on purpose. - -Cause: - -Are some of the log lines a result of the same action? For example, is a PAM -failure log message, followed by an application specific failure message the -result of the same user/script action? If you add regular expressions for -both you would end up with two failures for a single action. -Therefore, select the most appropriate log message and document the other log -message) with a test case not to match it and a description as to why you chose -one over another. - -With the selected log lines consider what action has caused those log -messages and whether they could have been generated by accident? Could -the log message be occurring due to the first step towards the application -asking for authentication? Could the log messages occur often? If some of -these are true make a note of this in the jail.conf example that you provide. - -Samples: - -It is important to include log file samples so any future change in the regular -expression will still work with the log lines you have identified. 
- -The sample log messages are provided in a file under testcases/files/logs/ -named identically as the corresponding filter (but without .conf extension). -Each log line should be preceded by a line with failJSON metadata (so the logs -lines are tested in the test suite) directly above the log line. If there is -any specific information about the log message, such as version or an -application configuration option that is needed for the message to occur, -include this in a comment (line beginning with #) above the failJSON metadata. - -Log samples should include only one, definitely not more than 3, examples of -log messages of the same form. If log messages are different in different -versions of the application log messages that show this are encouraged. - -Also attempt to inject an IP into the application (e.g. by specifying -it as a username) so that Fail2Ban possibly detects the IP -from user input rather than the true origin. See the Filter Security section -and the top example in testcases/files/logs/apache-auth as to how to do this. -One you have discovered that this is possible, correct the regex so it doesn't -match and provide this as a test case with "match": false (see failJSON below). - -If the mechanism to create the log message isn't obvious provide a -configuration and/or sample scripts testcases/files/config/{filtername} and -reference these in the comments above the log line. - -FailJSON metadata: - -A failJSON metadata is a comment immediately above the log message. It will -look like: - -# failJSON: { "time": "2013-06-10T10:10:59", "match": true , "host": "93.184.216.119" } - -Time should match the time of the log message. It is in a specific format of -Year-Month-Day'T'Hour:minute:Second. If your log message does not include a -year, like the example below, the year should be listed as 2005, if before Sun -Aug 14 10am UTC, and 2004 if afterwards. 
Here is an example failJSON -line preceding a sample log line: - -# failJSON: { "time": "2005-03-24T15:25:51", "match": true , "host": "198.51.100.87" } -Mar 24 15:25:51 buffalo1 dropbear[4092]: bad password attempt for 'root' from 198.51.100.87:5543 - -The "host" in failJSON should contain the IP or domain that should be blocked. - -For long lines that you do not want to be matched (e.g. from log injection -attacks) and any log lines to be excluded (see "Cause" section above), set -"match": false in the failJSON and describe the reason in the comment above. - -After developing regexes, the following command will test all failJSON metadata -against the log lines in all sample log files - -./fail2ban-testcases testSampleRegex - -Developing Filter Regular Expressions -------------------------------------- - -Date/Time: - -At the moment, Fail2Ban depends on log lines to have time stamps. That is why -before starting to develop failregex, check if your log line format known to -Fail2Ban. Copy the time component from the log line and append an IP address to -test with following command: - -./fail2ban-regex "2013-09-19 02:46:12 1.2.3.4" "<HOST>" - -Output of such command should contain something like: - -Date template hits: -|- [# of hits] date format -| [1] Year-Month-Day Hour:Minute:Second - -Ensure that the template description matches time/date elements in your log line -time stamp. If there is no matched format then date template needs to be added -to server/datedetector.py. Ensure that a new template is added in the order -that more specific matches occur first and that there is no confusion between a -Day and a Month. - -Filter file: - -The filter is specified in a config/filter.d/{filtername}.conf file. Filter file -can have sections INCLUDES (optional) and Definition as follows: - -[INCLUDES] - -before = common.conf - -after = filtername.local - -[Definition] - -failregex = .... - -ignoreregex = .... - -This is also documented in the man page jail.conf (section 5). 
Other definitions -can be added to make failregex's more readable and maintainable to be used -through string Interpolations (see http://docs.python.org/2.7/library/configparser.html) - - -General rules: - -Use "before" if you need to include a common set of rules, like syslog or if -there is a common set of regexes for multiple filters. - -Use "after" if you wish to allow the user to overwrite a set of customisations -of the current filter. This file doesn't need to exist. - -Try to avoid using ignoreregex mainly for performance reasons. The case when you -would use it is if in trying to avoid using it, you end up with an unreadable -failregex. - -Syslog: - -If your application logs to syslog you can take advantage of log line prefix -definitions present in common.conf. So as a base use: - -[INCLUDES] - -before = common.conf - -[Definition] - -_daemon = app - -failregex = ^%(__prefix_line)s - -In this example common.conf defines __prefix_line which also contains the -_daemon name (in syslog terms the service) you have just specified. _daemon -can also be a regex. - -For example, to capture following line _daemon should be set to "dovecot" - -Dec 12 11:19:11 dunnart dovecot: pop3-login: Aborted login (tried to use disabled plaintext auth): rip=190.210.136.21, lip=113.212.99.193 - -and then ^%(__prefix_line)s would match "Dec 12 11:19:11 dunnart dovecot: -". Note it matches the trailing space(s) as well. - -Substitutions (AKA string interpolations): - -We have used string interpolations in above examples. They are useful for -making the regexes more readable, reuse generic patterns in multiple failregex -lines, and also to refer definition of regex parts to specific filters or even -to the user. General principle is that value of a _name variable replaces -occurrences of %(_name)s within the same section or anywhere in the config file -if defined in [DEFAULT] section. 
- -Regular Expressions: - -Regular expressions (failregex, ignoreregex) assume that the date/time has been -removed from the log line (this is just how fail2ban works internally ATM). - -If the format is like '<date...> error 1.2.3.4 is evil' then you need to match -the < at the start so regex should be similar to '^<> <HOST> is evil$' using -<HOST> where the IP/domain name appears in the log line. - -The following general rules apply to regular expressions: - -* ensure regexes start with a ^ and are as restrictive as possible. E.g. do not - use .* if \d+ is sufficient; -* use functionality of Python regexes defined in the standard Python re library - http://docs.python.org/2/library/re.html; -* make regular expressions readable (as much as possible). E.g. - (?:...) represents a non-capturing regex but (...) is more readable, thus - preferred. - -If you have only a basic knowledge of regular repressions we advise to read -http://docs.python.org/2/library/re.html first. It doesn't take long and would -remind you e.g. which characters you need to escape and which you don't. - -Developing/testing a regex: - -You can develop a regex in a file or using command line depending on your -preference. You can also use samples you have already created in the test cases -or test them one at a time. - -The general tool for testing Fail2Ban regexes is fail2ban-regex. To see how to -use it run: - -./fail2ban-regex --help - -Take note of -l heavydebug / -l debug and -v as they might be very useful. - -TIP: Take a look at the source code of the application you are developing - failregex for. You may see optional or extra log messages, or parts there - of, that need to form part of your regex. It may also reveal how some - parts are constrained and different formats depending on configuration or - less common usages. - -TIP: For looking through source code - http://sourcecodebrowser.com/ . It has - call graphs and can browse different versions. - -TIP: Some applications log spaces at the end. 
If you are not sure add \s*$ as - the end part of the regex. - -If your regex is not matching, http://www.debuggex.com/?flavor=python can help -to tune it. fail2ban-regex -D ... will present Debuggex URLs for the regexs -and sample log files that you pass into it. - -In general use when using regex debuggers for generating fail2ban filters: -* use regex from the ./fail2ban-regex output (to ensure all substitutions are -done) -* replace <HOST> with (?&.ipv4) -* make sure that regex type set to Python -* for the test data put your log output with the date/time removed - -When you have fixed the regex put it back into your filter file. - -Please spread the good word about Debuggex - Serge Toarca is kindly continuing -its free availability to Open Source developers. - -Finishing up: - -If you've added a new filter, add a new entry in config/jail.conf. The theory -here is that a user will create a jail.local with [filtername]\nenable=true to -enable your jail. - -So more specifically in the [filter] section in jail.conf: -* ensure that you have "enabled = false" (users will enable as needed); -* use "filter =" set to your filter name; -* use a typical action to disable ports associated with the application; -* set "logpath" to the usual location of application log file; -* if the default findtime or bantime isn't appropriate to the filter, specify - more appropriate choices (possibly with a brief comment line). - -Submit github pull request (See "Pull Requests" above) for -github.com/fail2ban/fail2ban containing your great work. - -Filter Security ---------------- - -Poor filter regular expressions are susceptible to DoS attacks. - -When a remote user has the ability to introduce text that would match filter's -failregex, while matching inserted text to the <HOST> part, they have the -ability to deny any host they choose. 
- -So the <HOST> part must be anchored on text generated by the application, and -not the user, to an extent sufficient to prevent user inserting the entire text -matching this or any other failregex. - -Ideally filter regex should anchor at the beginning and at the end of log line. -However as more applications log at the beginning than the end, anchoring the -beginning is more important. If the log file used by the application is shared -with other applications, like system logs, ensure the other application that use -that log file do not log user generated text at the beginning of the line, or, -if they do, ensure the regexes of the filter are sufficient to mitigate the risk -of insertion. - - -Examples of poor filters ------------------------- - -1. Too restrictive - -We find a log message: - - Apr-07-13 07:08:36 Invalid command fial2ban from 1.2.3.4 - -We make a failregex - - ^Invalid command \S+ from <HOST> - -Now think evil. The user does the command 'blah from 1.2.3.44' - -The program diligently logs: - - Apr-07-13 07:08:36 Invalid command blah from 1.2.3.44 from 1.2.3.4 - -And fail2ban matches 1.2.3.44 as the IP that it ban. A DoS attack was successful. - -The fix here is that the command can be anything so .* is appropriate. - - ^Invalid command .* from <HOST> - -Here the .* will match until the end of the string. Then realise it has more to -match, i.e. "from <HOST>" and go back until it find this. Then it will ban -1.2.3.4 correctly. Since the <HOST> is always at the end, end the regex with a $. - - ^Invalid command .* from <HOST>$ - -Note if we'd just had the expression: - - ^Invalid command \S+ from <HOST>$ - -Then provided the user put a space in their command they would have never been -banned. - -2. Unanchored regex can match other user injected data - -From the Apache vulnerability CVE-2013-2178 -( original ref: https://vndh.net/note:fail2ban-089-denial-service ). 
- -An example bad regex for Apache: - - failregex = [[]client <HOST>[]] user .* not found - -Since the user can do a get request on: - - GET /[client%20192.168.0.1]%20user%20root%20not%20found HTTP/1.0 -Host: remote.site - -Now the log line will be: - - [Sat Jun 01 02:17:42 2013] [error] [client 192.168.33.1] File does not exist: /srv/http/site/[client 192.168.0.1] user root not found - -As this log line doesn't match other expressions hence it matches the above -regex and blocks 192.168.33.1 as a denial of service from the HTTP requester. - -3. Over greedy pattern matching - -From: https://github.com/fail2ban/fail2ban/pull/426 - -An example ssh log (simplified) - - Sep 29 17:15:02 spaceman sshd[12946]: Failed password for user from 127.0.0.1 port 20000 ssh1: ruser remoteuser - -As we assume username can include anything including spaces its prudent to put -.* here. The remote user can also exist as anything so lets not make assumptions again. - - failregex = ^%(__prefix_line)sFailed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$ - -So this works. The problem is if the .* after remote user is injected by the -user to be 'from 1.2.3.4'. The resultant log line is. - - Sep 29 17:15:02 spaceman sshd[12946]: Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4 - -Testing with: - - fail2ban-regex -v 'Sep 29 17:15:02 Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4' '^ Failed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$' - -TIP: I've removed the bit that matches __prefix_line from the regex and log. - -Shows: - - 1) [1] ^ Failed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$ - 1.2.3.4 Sun Sep 29 17:15:02 2013 - -It should of matched 127.0.0.1. So the first greedy part of the greedy regex -matched until the end of the string. The was no "from <HOST>" so the regex -engine worked backwards from the end of the string until this was matched. 
- -The result was that 1.2.3.4 was matched, injected by the user, and the wrong IP -was banned. - -The solution here is to make the first .* non-greedy with .*?. Here it matches -as little as required and the fail2ban-regex tool shows the output: - - fail2ban-regex -v 'Sep 29 17:15:02 Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4' '^ Failed \S+ for .*? from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$' - - 1) [1] ^ Failed \S+ for .*? from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$ - 127.0.0.1 Sun Sep 29 17:15:02 2013 - -So the general case here is a log line that contains: - - (fixed_data_1)(fixed_data_2)(user_injectable_data) - -Where the regex that matches fixed_data_1 is gready and matches the entire -string, before moving backwards and user_injectable_data can match the entire -string. - -Another case: - -ref: https://www.debuggex.com/r/CtAbeKMa2sDBEfA2/0 - -A webserver logs the following without URL escaping: - - [error] 2865#0: *66647 user "xyz" was not found in "/file", client: 1.2.3.1, server: www.host.com, request: "GET ", client: 3.2.1.1, server: fake.com, request: "GET exploited HTTP/3.3", host: "injected.host", host: "www.myhost.com" - -regex: - - failregex = ^ \[error\] \d+#\d+: \*\d+ user "\S+":? (?:password mismatch|was not found in ".*"), client: <HOST>, server: \S+, request: "\S+ .+ HTTP/\d+\.\d+", host: "\S+" - -The .* matches to the end of the string. Finds that it can't continue to match -", client ... so it moves from the back and find that the user injected web URL: - - ", client: 3.2.1.1, server: fake.com, request: "GET exploited HTTP/3.3", host: "injected.host - -In this case there is a fixed host: "www.myhost.com" at the end so the solution -is to anchor the regex at the end with a $. - -If this wasn't the case then first .* needed to be made so it didn't capture -beyond <HOST>. - -4. 
Application generates two identical log messages with different meanings - -If the application generates the following two messages under different -circumstances: - - client : authentication failed - client : authentication failed - - -Then it's obvious that a regex of "^client : authentication -failed$" will still cause problems if the user can trigger the second -log message with a of 123.1.1.1. - -Here there's nothing to do except request/change the application so it logs -messages differently. - +If you are developing filters see the FILTERS file for documentation. Code Testing ============ diff --git a/FILTERS b/FILTERS new file mode 100644 index 00000000..fd441e58 --- /dev/null +++ b/FILTERS @@ -0,0 +1,469 @@ + __ _ _ ___ _ + / _|__ _(_) |_ ) |__ __ _ _ _ + | _/ _` | | |/ /| '_ \/ _` | ' \ + |_| \__,_|_|_/___|_.__/\__,_|_||_| + +================================================================================ +Developing Filters +================================================================================ + +Filters +======= + +Filters are tricky. They need to: +* work with a variety of the versions of the software that generates the logs; +* work with the range of logging configuration options available in the + software; +* work with multiple operating systems; +* not make assumptions about the log format in excess of the software + (e.g. do not assume a username doesn't contain spaces and use \S+ unless + you've checked the source code); +* account for how future versions of the software will log messages + (e.g. guess what would happen to the log message if different authentication + types are added); +* not be susceptible to DoS vulnerabilities (see Filter Security below); and +* match intended log lines only. + +Please follow the steps from Filter Test Cases to Developing Filter Regular +Expressions and submit a GitHub pull request (PR) afterwards. 
If you get stuck, +you can push your unfinished changes and still submit a PR -- describe +what you have done, what is the hurdle, and we'll attempt to help (PR +will be automagically updated with future commits you would push to +complete it). + +Filter test cases +----------------- + +Purpose: + +Start by finding the log messages that the application generates related to +some form of authentication failure. If you are adding to an existing filter +think about whether the log messages are of a similar importance and purpose +to the existing filter. If you were a user of Fail2Ban, and did a package +update of Fail2Ban that started matching new log messages, would anything +unexpected happen? Would the bantime/findtime for the jail be appropriate for +the new log messages? If not, perhaps it needs to be in a separate +filter definition, for example like exim filter aims at authentication failures +and exim-spam at log messages related to spam. + +Even if it is a new filter you may consider separating the log messages into +different filters based on purpose. + +Cause: + +Are some of the log lines a result of the same action? For example, is a PAM +failure log message, followed by an application specific failure message the +result of the same user/script action? If you add regular expressions for +both you would end up with two failures for a single action. +Therefore, select the most appropriate log message and document the other log +message with a test case not to match it and a description as to why you chose +one over another. + +With the selected log lines consider what action has caused those log +messages and whether they could have been generated by accident. Could +the log message be occurring due to the first step towards the application +asking for authentication? Could the log messages occur often? If some of +these are true make a note of this in the jail.conf example that you provide. 
+ +Samples: + +It is important to include log file samples so any future change in the regular +expression will still work with the log lines you have identified. + +The sample log messages are provided in a file under testcases/files/logs/ +named identically as the corresponding filter (but without .conf extension). +Each log line should be preceded by a line with failJSON metadata (so the log +lines are tested in the test suite) directly above the log line. If there is +any specific information about the log message, such as version or an +application configuration option that is needed for the message to occur, +include this in a comment (line beginning with #) above the failJSON metadata. + +Log samples should include only one, definitely not more than 3, examples of +log messages of the same form. If log messages are different in different +versions of the application log messages that show this are encouraged. + +Also attempt to inject an IP into the application (e.g. by specifying +it as a username) so that Fail2Ban possibly detects the IP +from user input rather than the true origin. See the Filter Security section +and the top example in testcases/files/logs/apache-auth as to how to do this. +Once you have discovered that this is possible, correct the regex so it doesn't +match and provide this as a test case with "match": false (see failJSON below). + +If the mechanism to create the log message isn't obvious provide a +configuration and/or sample scripts in testcases/files/config/{filtername} and +reference these in the comments above the log line. + +FailJSON metadata: + +A failJSON metadata is a comment immediately above the log message. It will +look like: + +# failJSON: { "time": "2013-06-10T10:10:59", "match": true , "host": "93.184.216.119" } + +Time should match the time of the log message. It is in a specific format of +Year-Month-Day'T'Hour:minute:Second. 
If your log message does not include a +year, like the example below, the year should be listed as 2005, if before Sun +Aug 14 10am UTC, and 2004 if afterwards. Here is an example failJSON +line preceding a sample log line: + +# failJSON: { "time": "2005-03-24T15:25:51", "match": true , "host": "198.51.100.87" } +Mar 24 15:25:51 buffalo1 dropbear[4092]: bad password attempt for 'root' from 198.51.100.87:5543 + +The "host" in failJSON should contain the IP or domain that should be blocked. + +For long lines that you do not want to be matched (e.g. from log injection +attacks) and any log lines to be excluded (see "Cause" section above), set +"match": false in the failJSON and describe the reason in the comment above. + +After developing regexes, the following command will test all failJSON metadata +against the log lines in all sample log files + +./fail2ban-testcases testSampleRegex + +Developing Filter Regular Expressions +------------------------------------- + +Date/Time: + +At the moment, Fail2Ban depends on log lines to have time stamps. That is why +before starting to develop failregex, check if your log line format is known to +Fail2Ban. Copy the time component from the log line and append an IP address to +test with the following command: + +./fail2ban-regex "2013-09-19 02:46:12 1.2.3.4" "<HOST>" + +Output of such command should contain something like: + +Date template hits: +|- [# of hits] date format +| [1] Year-Month-Day Hour:Minute:Second + +Ensure that the template description matches time/date elements in your log line +time stamp. If there is no matched format then date template needs to be added +to server/datedetector.py. Ensure that a new template is added in the order +that more specific matches occur first and that there is no confusion between a +Day and a Month. + +Filter file: + +The filter is specified in a config/filter.d/{filtername}.conf file. 
Filter file +can have sections INCLUDES (optional) and Definition as follows: + +[INCLUDES] + +before = common.conf + +after = filtername.local + +[Definition] + +failregex = .... + +ignoreregex = .... + +This is also documented in the man page jail.conf (section 5). Other definitions +can be added to make failregex's more readable and maintainable to be used +through string Interpolations (see http://docs.python.org/2.7/library/configparser.html) + + +General rules: + +Use "before" if you need to include a common set of rules, like syslog or if +there is a common set of regexes for multiple filters. + +Use "after" if you wish to allow the user to overwrite a set of customisations +of the current filter. This file doesn't need to exist. + +Try to avoid using ignoreregex mainly for performance reasons. The case when you +would use it is if in trying to avoid using it, you end up with an unreadable +failregex. + +Syslog: + +If your application logs to syslog you can take advantage of log line prefix +definitions present in common.conf. So as a base use: + +[INCLUDES] + +before = common.conf + +[Definition] + +_daemon = app + +failregex = ^%(__prefix_line)s + +In this example common.conf defines __prefix_line which also contains the +_daemon name (in syslog terms the service) you have just specified. _daemon +can also be a regex. + +For example, to capture following line _daemon should be set to "dovecot" + +Dec 12 11:19:11 dunnart dovecot: pop3-login: Aborted login (tried to use disabled plaintext auth): rip=190.210.136.21, lip=113.212.99.193 + +and then ^%(__prefix_line)s would match "Dec 12 11:19:11 dunnart dovecot: +". Note it matches the trailing space(s) as well. + +Substitutions (AKA string interpolations): + +We have used string interpolations in above examples. They are useful for +making the regexes more readable, reuse generic patterns in multiple failregex +lines, and also to refer definition of regex parts to specific filters or even +to the user. 
General principle is that value of a _name variable replaces +occurrences of %(_name)s within the same section or anywhere in the config file +if defined in [DEFAULT] section. + +Regular Expressions: + +Regular expressions (failregex, ignoreregex) assume that the date/time has been +removed from the log line (this is just how fail2ban works internally ATM). + +If the format is like '<date...> error 1.2.3.4 is evil' then you need to match +the < at the start so regex should be similar to '^<> <HOST> is evil$' using +<HOST> where the IP/domain name appears in the log line. + +The following general rules apply to regular expressions: + +* ensure regexes start with a ^ and are as restrictive as possible. E.g. do not + use .* if \d+ is sufficient; +* use functionality of Python regexes defined in the standard Python re library + http://docs.python.org/2/library/re.html; +* make regular expressions readable (as much as possible). E.g. + (?:...) represents a non-capturing regex but (...) is more readable, thus + preferred. + +If you have only a basic knowledge of regular expressions we advise you to read +http://docs.python.org/2/library/re.html first. It doesn't take long and would +remind you e.g. which characters you need to escape and which you don't. + +Developing/testing a regex: + +You can develop a regex in a file or using command line depending on your +preference. You can also use samples you have already created in the test cases +or test them one at a time. + +The general tool for testing Fail2Ban regexes is fail2ban-regex. To see how to +use it run: + +./fail2ban-regex --help + +Take note of -l heavydebug / -l debug and -v as they might be very useful. + +TIP: Take a look at the source code of the application you are developing + failregex for. You may see optional or extra log messages, or parts there + of, that need to form part of your regex. It may also reveal how some + parts are constrained and different formats depending on configuration or + less common usages. 
+ +TIP: For looking through source code - http://sourcecodebrowser.com/ . It has + call graphs and can browse different versions. + +TIP: Some applications log spaces at the end. If you are not sure add \s*$ as + the end part of the regex. + +If your regex is not matching, http://www.debuggex.com/?flavor=python can help +to tune it. fail2ban-regex -D ... will present Debuggex URLs for the regexes +and sample log files that you pass into it. + +In general use when using regex debuggers for generating fail2ban filters: +* use regex from the ./fail2ban-regex output (to ensure all substitutions are +done) +* replace <HOST> with (?&.ipv4) +* make sure that regex type is set to Python +* for the test data put your log output with the date/time removed + +When you have fixed the regex put it back into your filter file. + +Please spread the good word about Debuggex - Serge Toarca is kindly continuing +its free availability to Open Source developers. + +Finishing up: + +If you've added a new filter, add a new entry in config/jail.conf. The theory +here is that a user will create a jail.local with [filtername]\nenabled=true to +enable your jail. + +So more specifically in the [filter] section in jail.conf: +* ensure that you have "enabled = false" (users will enable as needed); +* use "filter =" set to your filter name; +* use a typical action to disable ports associated with the application; +* set "logpath" to the usual location of application log file; +* if the default findtime or bantime isn't appropriate to the filter, specify + more appropriate choices (possibly with a brief comment line). + +Submit a GitHub pull request (See "Pull Requests" above) for +github.com/fail2ban/fail2ban containing your great work. + +Filter Security +--------------- + +Poor filter regular expressions are susceptible to DoS attacks. 
+
+When a remote user has the ability to introduce text that would match a
+filter's failregex, while matching the inserted text to the <HOST> part, they
+have the ability to deny any host they choose.
+
+So the <HOST> part must be anchored on text generated by the application, and
+not the user, to an extent sufficient to prevent the user inserting the entire
+text matching this or any other failregex.
+
+Ideally a filter regex should anchor at the beginning and at the end of the log
+line. However as more applications log at the beginning than the end, anchoring
+the beginning is more important. If the log file used by the application is
+shared with other applications, like system logs, ensure the other applications
+that use that log file do not log user generated text at the beginning of the
+line, or, if they do, ensure the regexes of the filter are sufficient to
+mitigate the risk of insertion.
+
+
+Examples of poor filters
+------------------------
+
+1. Too restrictive
+
+We find a log message:
+
+    Apr-07-13 07:08:36 Invalid command fial2ban from 1.2.3.4
+
+We make a failregex:
+
+    ^Invalid command \S+ from <HOST>
+
+Now think evil. The user does the command 'blah from 1.2.3.44'
+
+The program diligently logs:
+
+    Apr-07-13 07:08:36 Invalid command blah from 1.2.3.44 from 1.2.3.4
+
+And fail2ban matches 1.2.3.44 as the IP that it bans. A DoS attack was
+successful.
+
+The fix here is that the command can be anything so .* is appropriate.
+
+    ^Invalid command .* from <HOST>
+
+Here the .* will match until the end of the string, then realise it has more to
+match, i.e. "from <HOST>", and go back until it finds this. Then it will ban
+1.2.3.4 correctly. Since the <HOST> is always at the end, end the regex with
+a $.
+
+    ^Invalid command .* from <HOST>$
+
+Note if we'd just had the expression:
+
+    ^Invalid command \S+ from <HOST>$
+
+Then provided the user put a space in their command they would never have been
+banned.
+
+2. Unanchored regex can match other user injected data
+
+From the Apache vulnerability CVE-2013-2178
+( original ref: https://vndh.net/note:fail2ban-089-denial-service ).
+
+An example bad regex for Apache:
+
+    failregex = [[]client <HOST>[]] user .* not found
+
+The user can do a GET request on:
+
+    GET /[client%20192.168.0.1]%20user%20root%20not%20found HTTP/1.0
+    Host: remote.site
+
+Now the log line will be:
+
+    [Sat Jun 01 02:17:42 2013] [error] [client 192.168.33.1] File does not exist: /srv/http/site/[client 192.168.0.1] user root not found
+
+As this log line doesn't match other expressions, it matches the above regex at
+the injected text and bans 192.168.0.1, an address chosen by the HTTP requester
+(whose own address is 192.168.33.1). This is a denial of service.
+
+3. Over greedy pattern matching
+
+From: https://github.com/fail2ban/fail2ban/pull/426
+
+An example ssh log (simplified):
+
+    Sep 29 17:15:02 spaceman sshd[12946]: Failed password for user from 127.0.0.1 port 20000 ssh1: ruser remoteuser
+
+As we assume the username can include anything including spaces it's prudent to
+put .* here. The remote user can also be anything so let's not make assumptions
+again.
+
+    failregex = ^%(__prefix_line)sFailed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$
+
+So this works. The problem is if the .* after remote user is injected by the
+user to be 'from 1.2.3.4'. The resultant log line is:
+
+    Sep 29 17:15:02 spaceman sshd[12946]: Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4
+
+Testing with:
+
+    fail2ban-regex -v 'Sep 29 17:15:02 Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4' '^ Failed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$'
+
+TIP: I've removed the bit that matches __prefix_line from the regex and log.
+
+Shows:
+
+    1) [1] ^ Failed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$
+       1.2.3.4 Sun Sep 29 17:15:02 2013
+
+It should have matched 127.0.0.1. So the first greedy part of the regex
+matched until the end of the string.
+There was no "from <HOST>" left to match, so the regex engine worked backwards
+from the end of the string until this was matched.
+
+The result was that 1.2.3.4, injected by the user, was matched and the wrong IP
+was banned.
+
+The solution here is to make the first .* non-greedy with .*?. Here it matches
+as little as required and the fail2ban-regex tool shows the output:
+
+    fail2ban-regex -v 'Sep 29 17:15:02 Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4' '^ Failed \S+ for .*? from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$'
+
+    1) [1] ^ Failed \S+ for .*? from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$
+       127.0.0.1 Sun Sep 29 17:15:02 2013
+
+So the general case here is a log line that contains:
+
+    (fixed_data_1)<HOST>(fixed_data_2)(user_injectable_data)
+
+Where the regex that matches fixed_data_1 is greedy and matches the entire
+string before moving backwards, and user_injectable_data can match the entire
+string.
+
+Another case:
+
+ref: https://www.debuggex.com/r/CtAbeKMa2sDBEfA2/0
+
+A webserver logs the following without URL escaping:
+
+    [error] 2865#0: *66647 user "xyz" was not found in "/file", client: 1.2.3.1, server: www.host.com, request: "GET ", client: 3.2.1.1, server: fake.com, request: "GET exploited HTTP/3.3", host: "injected.host", host: "www.myhost.com"
+
+regex:
+
+    failregex = ^ \[error\] \d+#\d+: \*\d+ user "\S+":? (?:password mismatch|was not found in ".*"), client: <HOST>, server: \S+, request: "\S+ .+ HTTP/\d+\.\d+", host: "\S+"
+
+The .* matches to the end of the string. It finds that it can't continue to
+match ", client ... so it moves back from the end and finds the user injected
+web URL:
+
+    ", client: 3.2.1.1, server: fake.com, request: "GET exploited HTTP/3.3", host: "injected.host
+
+In this case there is a fixed host: "www.myhost.com" at the end so the solution
+is to anchor the regex at the end with a $.
+
+If this wasn't the case then the first .* would need to be changed so that it
+didn't capture beyond <HOST>.
+
+4. Application generates two identical log messages with different meanings
+
+If the application generates the following two messages under different
+circumstances:
+
+    client <IP>: authentication failed
+    client <USER>: authentication failed
+
+
+Then it's obvious that a regex of "^client <HOST>: authentication
+failed$" will still cause problems if the user can trigger the second
+log message with a <USER> of 123.1.1.1.
+
+Here there's nothing to do except request/change the application so it logs
+messages differently.
+
+
diff --git a/MANIFEST b/MANIFEST
index 977a9ec5..83cab61e 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -5,6 +5,7 @@ TODO
 THANKS
 COPYING
 DEVELOP
+FILTERS
 fail2ban-client
 fail2ban-server
 fail2ban-testcases