DOC: more on filter regexes - DEVELOP

2013-11-11 08:08:10 +11:00 · 2013-11-11 08:08:10 +11:00 · b8f40fef1b
parent 724c6bfd92
commit b8f40fef1b
1 changed files with 78 additions and 3 deletions
--- a/81
+++ b/81
@@ -331,7 +331,7 @@ failregex, while matching inserted text to the <HOST> part, they have the
 ability to deny any host they choose.

 So the <HOST> part must be anchored on text generated by the application, and
-not the user, to a extent sufficient to prevent user inserting the entire text
+not the user, to an extent sufficient to prevent user inserting the entire text
 matching this or any other failregex.

 Ideally filter regex should anchor at the beginning and at the end of log line.
@@ -381,7 +381,7 @@ Note if we'd just had the expression:
 Then provided the user put a space in their command they would have never been
 banned.

-2. Filter regex can match other user injected data
+2. Unanchored regex can match other user injected data

 From the Apache vulnerability CVE-2013-2178
 ( original ref: https://vndh.net/note:fail2ban-089-denial-service ).
@@ -402,7 +402,82 @@ Now the log line will be:
 As this log line doesn't match other expressions hence it matches the above
 regex and blocks 192.168.33.1 as a denial of service from the HTTP requester.

-3. Application generates two identical log messages with different meanings
+3. Over-greedy pattern matching
+
+From: https://github.com/fail2ban/fail2ban/pull/426
+
+An example ssh log line (simplified):
+
+    Sep 29 17:15:02 spaceman sshd[12946]: Failed password for user from 127.0.0.1 port 20000 ssh1: ruser remoteuser
+
+As we assume the username can contain anything, including spaces, it's prudent to
+use .* here. The remote user can also be anything, so let's not make assumptions there either.
+
+    failregex = ^%(__prefix_line)sFailed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$
+
+So this works. The problem arises if the text matched by the .* after ruser is
+injected by the user as 'from 1.2.3.4'. The resultant log line is:
+
+    Sep 29 17:15:02 spaceman sshd[12946]: Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4
+
+Testing with:
+
+    fail2ban-regex -v 'Sep 29 17:15:02 Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4' '^ Failed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$'
+
+TIP: I've removed the bit that matches __prefix_line from the regex and log.
+
+Shows:
+
+    1) [1] ^ Failed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$
+       1.2.3.4  Sun Sep 29 17:15:02 2013
+
+It should have matched 127.0.0.1. The first greedy .* of the regex matched up to
+the end of the string. There was no "from <HOST>" after that point, so the regex
+engine worked backwards from the end of the string until this was matched.
+
+The result was that 1.2.3.4 was matched, injected by the user, and the wrong IP
+was banned.
+
+The solution here is to make the first .* non-greedy with .*?. Here it matches
+as little as required and the fail2ban-regex tool shows the output:
+
+    fail2ban-regex -v 'Sep 29 17:15:02 Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4' '^ Failed \S+ for .*? from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$'
+
+    1) [1] ^ Failed \S+ for .*? from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$
+       127.0.0.1  Sun Sep 29 17:15:02 2013
+
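The same behaviour can be reproduced outside fail2ban with Python's re module. This is only a minimal sketch: it assumes a simplified, IPv4-only stand-in for <HOST> (the real tag expands to a broader pattern), and the helper name matched_host is hypothetical.

```python
import re

# Simplified stand-in for fail2ban's <HOST> tag (assumption: IPv4 only).
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"

# Log line with the timestamp already stripped, as in the fail2ban-regex test.
LINE = " Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4"

GREEDY = r"^ Failed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$"
NON_GREEDY = r"^ Failed \S+ for .*? from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$"


def matched_host(failregex, line):
    """Return the host captured by failregex, or None if the line does not match."""
    m = re.search(failregex.replace("<HOST>", HOST), line)
    return m.group("host") if m else None


greedy_host = matched_host(GREEDY, LINE)
non_greedy_host = matched_host(NON_GREEDY, LINE)
print(greedy_host)      # 1.2.3.4 (injected by the user)
print(non_greedy_host)  # 127.0.0.1 (written by sshd)
```

Only the quantifier differs between the two patterns, so the change in the captured host comes entirely from how far the first .* is allowed to run.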
+So the general case here is a log line that contains:
+
+    (fixed_data_1)<HOST>(fixed_data_2)(user_injectable_data)
+
+Where the regex that matches fixed_data_1 is greedy and first consumes the entire
+string before backtracking, and user_injectable_data can contain text that itself
+matches (fixed_data_1)<HOST>.
+
+Another case:
+
+ref: https://www.debuggex.com/r/CtAbeKMa2sDBEfA2/0
+
+A webserver logs the following without URL escaping:
+
+    [error] 2865#0: *66647 user "xyz" was not found in "/file", client: 1.2.3.1, server: www.host.com, request: "GET ", client: 3.2.1.1, server: fake.com, request: "GET exploited HTTP/3.3", host: "injected.host", host: "www.myhost.com"
+
+regex:
+
+    failregex = ^ \[error\] \d+#\d+: \*\d+ user "\S+":? (?:password mismatch|was not found in ".*"), client: <HOST>, server: \S+, request: "\S+ .+ HTTP/\d+\.\d+", host: "\S+"
+
+The .* matches to the end of the string, finds that it can't continue to match
+", client ..., so it backtracks from the end and finds the user-injected web URL:
+
+    ", client: 3.2.1.1, server: fake.com, request: "GET exploited HTTP/3.3", host: "injected.host
+
+In this case there is a fixed host: "www.myhost.com" at the end so the solution
+is to anchor the regex at the end with a $.
+
+If this wasn't the case then the first .* would need to be written so that it
+can't match beyond <HOST>.
+
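The web server case can be checked the same way. Again this is a sketch under assumptions: <HOST> is replaced by a simplified IPv4-only pattern, the timestamp is taken as already stripped, and matched_host is a hypothetical helper rather than anything fail2ban provides.

```python
import re

# Simplified stand-in for fail2ban's <HOST> tag (assumption: IPv4 only).
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"

# Log line from the example above, timestamp stripped.
LINE = (
    ' [error] 2865#0: *66647 user "xyz" was not found in "/file", '
    'client: 1.2.3.1, server: www.host.com, request: "GET ", '
    'client: 3.2.1.1, server: fake.com, request: "GET exploited HTTP/3.3", '
    'host: "injected.host", host: "www.myhost.com"'
)

UNANCHORED = (
    r'^ \[error\] \d+#\d+: \*\d+ user "\S+":? '
    r'(?:password mismatch|was not found in ".*"), client: <HOST>, '
    r'server: \S+, request: "\S+ .+ HTTP/\d+\.\d+", host: "\S+"'
)
ANCHORED = UNANCHORED + "$"  # same pattern, anchored at the end of the line


def matched_host(failregex, line):
    """Return the host captured by failregex, or None if the line does not match."""
    m = re.search(failregex.replace("<HOST>", HOST), line)
    return m.group("host") if m else None


unanchored_host = matched_host(UNANCHORED, LINE)
anchored_host = matched_host(ANCHORED, LINE)
print(unanchored_host)  # 3.2.1.1 (the injected client address)
print(anchored_host)    # None: the crafted line no longer matches
```

With this simplified <HOST>, the end anchor stops the injected address from being captured; the crafted line is rejected outright rather than attributed to 1.2.3.1.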
+4. Application generates two identical log messages with different meanings

 If the application generates the following two messages under different
 circumstances: