We use Pattern.matches() to test each regex. It returns true only when the entire input string matches the pattern:
DOT
. (dot) | any single character (line terminators are excluded by default) |
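// Imports needed by the snippets in this post
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

List<String> list = new ArrayList<>();
list.add("a");
list.add("!");
list.add("he");
list.add("~");
// Print whether each entry is exactly one character long
list.forEach(x ->
    System.out.println(Pattern.matches(".", x) + " -> " + x));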
true -> a
true -> !
false -> he
true -> ~
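The "he" result above comes from Pattern.matches() requiring the whole input to match. As a rough sketch of the difference (the class and method names here are made up for illustration), the snippet below compares a whole-input match with a partial match via Matcher.find(), and shows the Pattern.DOTALL flag letting the dot match a line terminator as well:

```java
import java.util.regex.Pattern;

class DotDemo {
    // Whole-input match: true only if the entire string is one non-newline character.
    static boolean isSingleChar(String input) {
        return Pattern.matches(".", input);
    }

    // Partial match: true if the dot matches somewhere inside the input.
    static boolean containsAnyChar(String input) {
        return Pattern.compile(".").matcher(input).find();
    }

    // With DOTALL, the dot also matches line terminators such as "\n".
    static boolean isSingleCharDotAll(String input) {
        return Pattern.compile(".", Pattern.DOTALL).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isSingleChar("he"));       // false
        System.out.println(containsAnyChar("he"));    // true
        System.out.println(isSingleChar("\n"));       // false
        System.out.println(isSingleCharDotAll("\n")); // true
    }
}
```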
CHARACTER CLASSES
Without a quantifier (covered in the next section), each character class below matches a string of exactly one character.
[abc] | a or b or c |
[^abc] | any character except a, b, or c |
[a-z] | all English letters (lowercase) |
[A-Z] | all English letters (uppercase) |
[a-zA-Z] | all English letters (lowercase and uppercase included) |
[a-dm-p] | a through d or m through p |
[a-dM-P] | a through d (lowercase) or M through P (uppercase) |
[a-z&&[^bc]] | a through z except b and c (same as: [ad-z]) |
[a-z&&[^m-p]] | a through z and not m through p (same as: [a-lq-z]) |
System.out.println("a is " + Pattern.matches("[abc]", "a") + " for regex [abc]");
System.out.println("ab is " + Pattern.matches("[abc]", "ab") + " for regex [abc]");
System.out.println("a is " + Pattern.matches("[^abc]", "a") + " for regex [^abc]");
System.out.println("d is " + Pattern.matches("[^abc]", "d") + " for regex [^abc]");
System.out.println("G is " + Pattern.matches("[A-Z]", "G") + " for regex [A-Z]");
System.out.println("g is " + Pattern.matches("[A-Z]", "g") + " for regex [A-Z]");
System.out.println("c is " + Pattern.matches("[a-dM-P]", "c") + " for regex [a-dM-P]");
System.out.println("K is " + Pattern.matches("[a-dM-P]", "K") + " for regex [a-dM-P]");
System.out.println("n is " + Pattern.matches("[a-z&&[^m-p]]", "n") + " for regex [a-z&&[^m-p]]");
System.out.println("f is " + Pattern.matches("[a-z&&[^m-p]]", "f") + " for regex [a-z&&[^m-p]]");
a is | true | for regex | [abc] |
ab is | false | for regex | [abc] // 2 characters, must be 1 |
a is | false | for regex | [^abc] |
d is | true | for regex | [^abc] |
G is | true | for regex | [A-Z] |
g is | false | for regex | [A-Z] // g is lowercase |
c is | true | for regex | [a-dM-P] |
K is | false | for regex | [a-dM-P] // K is not between M and P or a and d |
n is | false | for regex | [a-z&&[^m-p]] // n is between m and p |
f is | true | for regex | [a-z&&[^m-p]] |
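The "same as" claims in the table can be checked mechanically. The following is only a sketch (the helper name sameOnAscii is invented for this example): it compares two single-character regexes on every ASCII character and reports whether they always agree.

```java
import java.util.regex.Pattern;

class ClassEquivalenceDemo {
    // Returns true if both single-character regexes agree on every char from 0 to 127.
    static boolean sameOnAscii(String regexA, String regexB) {
        Pattern a = Pattern.compile(regexA);
        Pattern b = Pattern.compile(regexB);
        for (char c = 0; c < 128; c++) {
            String s = String.valueOf(c);
            if (a.matcher(s).matches() != b.matcher(s).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sameOnAscii("[a-z&&[^bc]]", "[ad-z]"));    // true
        System.out.println(sameOnAscii("[a-z&&[^m-p]]", "[a-lq-z]")); // true
        System.out.println(sameOnAscii("[a-z]", "[A-Z]"));            // false
    }
}
```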
REGEX QUANTIFIERS
In the examples below, X stands for the literal character "X". Any character class can be used in place of X.
X? | X occurs once or not at all |
X+ | X occurs once or more times |
X* | X occurs zero or more times |
X{n} | X occurs exactly n times |
X{n,} | X occurs n or more times |
X{y,z} | X occurs at least y times but no more than z times |
System.out.println("X is " + Pattern.matches("X+", "X") + " for regex X+");
System.out.println("XXXX is " + Pattern.matches("X+", "XXXX") + " for regex X+");
System.out.println("(empty) is " + Pattern.matches("X+", "") + " for regex X+");
System.out.println("X is " + Pattern.matches("X*", "X") + " for regex X*");
System.out.println("XXXX is " + Pattern.matches("X*", "XXXX") + " for regex X*");
System.out.println("(empty) is " + Pattern.matches("X*", "") + " for regex X*");
System.out.println("X is " + Pattern.matches("X{2,}", "X") + " for regex X{2,}");
System.out.println("XXXX is " + Pattern.matches("X{2,}", "XXXX") + " for regex X{2,}");
System.out.println("XX is " + Pattern.matches("X{2,}", "XX") + " for regex X{2,}");
X is | true | for regex | X+ |
XXXX is | true | for regex | X+ |
(empty) is | false | for regex | X+ // the + quantifier needs at least one X |
X is | true | for regex | X* |
XXXX is | true | for regex | X* |
(empty) is | true | for regex | X* |
X is | false | for regex | X{2,} // fewer than 2 |
XXXX is | true | for regex | X{2,} |
XX is | true | for regex | X{2,} |
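The examples above cover X+, X* and X{n,}. To round out the table, here is a small sketch of the remaining quantifiers (the helper name full is made up for this example), including both ends of a bounded range:

```java
import java.util.regex.Pattern;

class QuantifierDemo {
    // Thin wrapper around Pattern.matches(): true only if the whole input matches.
    static boolean full(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // X? : zero or one X
        System.out.println(full("X?", ""));           // true
        System.out.println(full("X?", "X"));          // true
        System.out.println(full("X?", "XX"));         // false

        // X{3} : exactly three X
        System.out.println(full("X{3}", "XXX"));      // true
        System.out.println(full("X{3}", "XX"));       // false

        // X{2,4} : between two and four X, both ends included
        System.out.println(full("X{2,4}", "X"));      // false
        System.out.println(full("X{2,4}", "XX"));     // true
        System.out.println(full("X{2,4}", "XXXX"));   // true
        System.out.println(full("X{2,4}", "XXXXX"));  // false
    }
}
```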
CHARACTER CLASSES WITH REGEX QUANTIFIERS
Place a quantifier after a character class to set how many characters from that class the input must contain.
[!_.%&']+ | These characters: ! _ . % & ', one or more |
[0-9]* | Digits only, zero or more |
[a-z&&[^kmn]]{3,} | a through z except k, m, and n, minimum 3 characters |
[A-Z0-9]{8,16} | A through Z and numbers, minimum 8, maximum 16 characters |
System.out.println("(empty) is " + Pattern.matches("[!_.%&']+", "") + " for regex [!_.%&']+");
System.out.println("!% is " + Pattern.matches("[!_.%&']+", "!%") + " for regex [!_.%&']+");
System.out.println("2_ is " + Pattern.matches("[!_.%&']+", "2_") + " for regex [!_.%&']+");
System.out.println("(empty) is " + Pattern.matches("[0-9]*", "") + " for regex [0-9]*");
System.out.println("1984 is " + Pattern.matches("[0-9]*", "1984") + " for regex [0-9]*");
System.out.println("RC45 is " + Pattern.matches("[0-9]*", "RC45") + " for regex [0-9]*");
System.out.println("abxyz is " + Pattern.matches("[a-z&&[^kmn]]{3,}", "abxyz") + " for regex [a-z&&[^kmn]]{3,}");
System.out.println("ab is " + Pattern.matches("[a-z&&[^kmn]]{3,}", "ab") + " for regex [a-z&&[^kmn]]{3,}");
System.out.println("abk is " + Pattern.matches("[a-z&&[^kmn]]{3,}", "abk") + " for regex [a-z&&[^kmn]]{3,}");
System.out.println("FEVER105 is " + Pattern.matches("[A-Z0-9]{8,16}", "FEVER105") + " for regex [A-Z0-9]{8,16}");
System.out.println("Fever105 is " + Pattern.matches("[A-Z0-9]{8,16}", "Fever105") + " for regex [A-Z0-9]{8,16}");
System.out.println("FE105 is " + Pattern.matches("[A-Z0-9]{8,16}", "FE105") + " for regex [A-Z0-9]{8,16}");
(empty) is | false | for regex | [!_.%&']+ // the + quantifier needs at least one character |
!% is | true | for regex | [!_.%&']+ |
2_ is | false | for regex | [!_.%&']+ // contains 2, which is not in the class |
(empty) is | true | for regex | [0-9]* |
1984 is | true | for regex | [0-9]* |
RC45 is | false | for regex | [0-9]* // contains the non-digits R and C |
abxyz is | true | for regex | [a-z&&[^kmn]]{3,} |
ab is | false | for regex | [a-z&&[^kmn]]{3,} // fewer than 3 characters |
abk is | false | for regex | [a-z&&[^kmn]]{3,} // contains k |
FEVER105 is | true | for regex | [A-Z0-9]{8,16} |
Fever105 is | false | for regex | [A-Z0-9]{8,16} // contains lowercase letters |
FE105 is | false | for regex | [A-Z0-9]{8,16} // fewer than 8 characters |
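Pattern.matches(regex, input) compiles the regex on every call. When the same pattern is checked many times, it is usually worth compiling it once and reusing it. A minimal sketch, assuming a hypothetical validator for the [A-Z0-9]{8,16} rule above (the class, constant and method names are illustrative only):

```java
import java.util.regex.Pattern;

class CompiledPatternDemo {
    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern CODE = Pattern.compile("[A-Z0-9]{8,16}");

    // Same result as Pattern.matches("[A-Z0-9]{8,16}", input), without recompiling.
    static boolean isValidCode(String input) {
        return CODE.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"FEVER105", "Fever105", "FE105"};
        for (String s : samples) {
            System.out.println(isValidCode(s) + " -> " + s);
        }
        // true -> FEVER105
        // false -> Fever105
        // false -> FE105
    }
}
```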
REGEX METACHARACTERS
Think of them as shorthand aliases for common character classes.
. | Any character |
\d | Any digits (same as: [0-9]) |
\D | Any non digits (same as: [^0-9]) |
\s | Any whitespace character (same as: [ \t\n\x0B\f\r]) |
\S | Any non whitespace character |
\w | Any word character (same as: [a-zA-Z_0-9]) |
\W | Any non word character |
\b | A word boundary (matches a position, not a character) |
\B | A non-word boundary (matches a position, not a character) |
PS: \w includes the underscore, so [\w&&[^_]] means word characters without the underscore. We use it below.
System.out.println("19216845124 is " + Pattern.matches("[\\d]{11}", "19216845124") + " for regex [\\d]{11}");
System.out.println("05325320532 is " + Pattern.matches("[\\d]{10}", "05325320532") + " for regex [\\d]{10}");
System.out.println("Atif Imal is " + Pattern.matches("[\\w]{2,}", "Atif Imal") + " for regex [\\w]{2,}");
System.out.println("Atif Imal is " + Pattern.matches("[\\w\\s]{2,}", "Atif Imal") + " for regex [\\w\\s]{2,}");
19216845124 is | true | for regex | [\d]{11} |
05325320532 is | false | for regex | [\d]{10} // must be exactly 10 characters, this has 11 |
Atif Imal is | false | for regex | [\w]{2,} // contains whitespace |
Atif Imal is | true | for regex | [\w\s]{2,} |
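\b is listed in the table but not exercised above. Because it matches a position rather than a character, it is more naturally used with Matcher.find() than with Pattern.matches(). The sketch below is illustrative only (containsWord and digitRuns are invented helper names). Note that every regex backslash is doubled inside a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryDemo {
    // True if `word` appears in `text` as a whole word, not inside a longer word.
    static boolean containsWord(String text, String word) {
        // Pattern.quote() treats `word` as a literal, so its characters are not read as regex syntax.
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        return p.matcher(text).find();
    }

    // Collects every run of digits in the input using \d+ and repeated find() calls.
    static List<String> digitRuns(String text) {
        List<String> runs = new ArrayList<>();
        Matcher m = Pattern.compile("\\d+").matcher(text);
        while (m.find()) {
            runs.add(m.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(containsWord("the cat sat", "cat")); // true
        System.out.println(containsWord("concatenate", "cat")); // false
        System.out.println(digitRuns("RC45 and 1984"));         // [45, 1984]
    }
}
```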
CHAINING CHARACTER CLASSES
We are about to build an e-mail regex:
Username: word characters, digits, dot, and underscore, min. 3 characters | [\w\d.]{3,} | [a-zA-Z0-9._]{3,} |
After the username comes @ | @ | [@] |
Then the domain name: letters and digits, 3-25 characters (for example) | [\w\d&&[^_]]{3,25} | [a-zA-Z0-9]{3,25} |
Then a dot | \. | [.] |
Then the domain extension: letters, min. 1 character (for example); note that [\w&&[^_]] also admits digits, which [a-zA-Z] does not | [\w&&[^_]]+ | [a-zA-Z]+ |
The result: just chain them in order.
[\w\d.]{3,}@[\w\d&&[^_]]{3,25}\.[\w&&[^_]]+
Or use the alternatives (the two styles can be mixed):
[a-zA-Z0-9._]{3,}[@][a-zA-Z0-9]{3,25}[.][a-zA-Z]+
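As a sketch of how the pieces chain together in Java, the snippet below builds the shorthand version by string concatenation and compares it with the explicit version. The constant and method names are made up for this example. The two spellings agree on ordinary addresses but differ on an extension containing a digit, because [\w&&[^_]] still includes 0-9 while [a-zA-Z] does not.

```java
import java.util.regex.Pattern;

class EmailRegexDemo {
    // Building blocks from the table; backslashes are doubled for Java string literals.
    static final String USER = "[\\w\\d.]{3,}";
    static final String DOMAIN = "[\\w\\d&&[^_]]{3,25}";
    static final String EXT = "[\\w&&[^_]]+";

    // Shorthand version: the parts chained in order.
    static final Pattern SHORTHAND = Pattern.compile(USER + "@" + DOMAIN + "\\." + EXT);

    // Explicit version, copied from the alternative spelling.
    static final Pattern EXPLICIT =
            Pattern.compile("[a-zA-Z0-9._]{3,}[@][a-zA-Z0-9]{3,25}[.][a-zA-Z]+");

    static boolean matchesShorthand(String s) {
        return SHORTHAND.matcher(s).matches();
    }

    static boolean matchesExplicit(String s) {
        return EXPLICIT.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesShorthand("testing@test.tes")); // true
        System.out.println(matchesExplicit("testing@test.tes"));  // true
        System.out.println(matchesShorthand("testing@test.t3"));  // true
        System.out.println(matchesExplicit("testing@test.t3"));   // false
    }
}
```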
List<String> list19 = new ArrayList<>();
list19.add(".testing@test.tes");
list19.add("tes.ting@test.tes");
list19.add("testing@test.tes");
list19.add("testing.@test.tes");
list19.add("testing@.test.tes");
list19.add("testing@test..tes");
list19.add("testing@test.");
list19.add("testing@test.t");
list19.add("testing@t.tes");
list19.add("@test.tes");
list19.add(".@test.tes");
list19.add("@.");
list19.add("t@t.t");
list19.add("ttttttttttttttttt@test.tes");
list19.add("tt@test.tes");
list19.add("ttt@test.tes");
list19.forEach(x ->
    System.out.println(Pattern.matches("^[\\w\\d.]{3,}@[\\w\\d&&[^_]]{3,25}\\.[\\w&&[^_]]+$", x) + " -> " + x));
true | -> | .testing@test.tes |
true | -> | tes.ting@test.tes |
true | -> | testing@test.tes |
true | -> | testing.@test.tes |
false | -> | testing@.test.tes |
false | -> | testing@test..tes |
false | -> | testing@test. |
true | -> | testing@test.t |
false | -> | testing@t.tes |
false | -> | @test.tes |
false | -> | .@test.tes |
false | -> | @. |
false | -> | t@t.t |
true | -> | ttttttttttttttttt@test.tes |
false | -> | tt@test.tes |
true | -> | ttt@test.tes |
So we still have some trouble: we don't want the username to start or end with a dot or an underscore.
We need to do ...
To be continued.