This is a tutorial for building a Regular Expression (Regex) that searches text for strings that fit the format required for Discord usernames. Regexes are very powerful tools for making specific searches in text. Compared with a chain of if statements, a regex is much more condensed and usually runs quickly. The primary downside of a regex compared with normal logic is its lack of readability.
The Regex we will be using today is
/^.{3,32}#[0-9]{4}$/
This Regex allows between 3 and 32 of any character (other than a line break) before a # symbol, and then checks for exactly 4 digits afterwards. Discord usernames follow this format. Some examples would be
- TiredProgrammer#4242
- earlybird♡#2048
- 満天星#3317
These can be uppercase, lowercase, and even use special symbols. In short, the conditions this Regex checks are:
- Does the string start with characters that are allowed?
- Are there between 3 and 32 characters after the start of the string but before the # symbol?
- Is there a # symbol?
- Are there exactly 4 numeric characters after the # symbol, with nothing following them?
This tutorial covers the following regex components:
- Anchors
- Quantifiers
- Grouping Constructs
- Bracket Expressions
- Character Classes
- The OR Operator
- Flags
- Character Escapes
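Before going through each component, it may help to see the full pattern run against the example usernames from the top of this tutorial. This is a minimal sketch: the variable names and the three failing strings are made up for illustration.

```javascript
// Pattern from this tutorial: 3 to 32 of any character, a #, then exactly 4 digits.
const usernamePattern = /^.{3,32}#[0-9]{4}$/;

// The first three samples are the examples listed earlier;
// the last three are invented strings that should be rejected.
const usernameSamples = [
  "TiredProgrammer#4242",
  "earlybird♡#2048",
  "満天星#3317",
  "ab#1234",         // only 2 characters before the #
  "NoDigits#12",     // only 2 digits after the #
  "MissingHash1234"  // no # symbol at all
];

for (const sample of usernameSamples) {
  // test() returns true when the whole pattern matches the string
  console.log(sample, usernamePattern.test(sample));
}
```

The first three lines should print true and the last three false.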
Regexes need to be wrapped in slash characters / in JavaScript because they are written as literals. This is the most common way to write them in JavaScript, but you can also use the RegExp constructor, which takes the pattern as a quoted string.
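As a rough sketch of the two ways to write the same pattern, here is the literal form next to the constructor form. The variable names are only for illustration.

```javascript
// Regex literal: the pattern sits between two slashes.
const fromLiteral = /^.{3,32}#[0-9]{4}$/;

// RegExp constructor: the pattern is passed as a quoted string, without the slashes.
const fromConstructor = new RegExp("^.{3,32}#[0-9]{4}$");

// Both objects hold the same pattern text and behave the same way.
console.log(fromLiteral.source === fromConstructor.source); // true
console.log(fromConstructor.test("TiredProgrammer#4242"));  // true
```

The constructor form is mostly useful when the pattern has to be built from a string at runtime.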
Anchors include ^ and $. If you're looking to verify what comes at the start of a string, the ^ anchor is perfect for the job. Whatever expression follows it must match from the very beginning of the string, whether you want an exact match or a range. The $ character is very similar, but looks for the end of a string instead of the beginning. If you look at the beginning and end of our regex /^.{3,32}#[0-9]{4}$/ you will see these anchors. The anchor at the start is followed by a character class (the .) with a quantifier, while the anchor at the end comes right after a bracket expression with a quantifier. We will go over those in the sections below.
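To see what the anchors contribute, the sketch below compares the tutorial's pattern with a copy that has both anchors removed. The unanchored copy and the sample sentence are assumptions made for this demonstration.

```javascript
const anchoredPattern = /^.{3,32}#[0-9]{4}$/;
// Same pattern with ^ and $ removed (for comparison only).
const unanchoredPattern = /.{3,32}#[0-9]{4}/;

// A username buried inside a longer sentence.
const sentence = "my tag is TiredProgrammer#4242, add me";

// With anchors, the whole string has to fit the pattern.
console.log(anchoredPattern.test(sentence));   // false
// Without anchors, a match anywhere inside the string is enough.
console.log(unanchoredPattern.test(sentence)); // true

// Each anchor can also be used on its own.
console.log(/^Tired/.test("TiredProgrammer#4242"));      // true, starts with "Tired"
console.log(/4242$/.test("TiredProgrammer#4242"));       // true, ends with "4242"
console.log(/^Programmer/.test("TiredProgrammer#4242")); // false, "Programmer" is not at the start
```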
A quantifier does a very simple job. It sets how many times the item just before it is allowed to repeat. These quantifiers can come in 3 forms: +, ?, and curly braces {}, which have three variations of their own:
- + matches at least once
- ? matches zero times or once, which makes the preceding item optional
- {number} matches exactly as many times as you specify, no more, no less
- {number,} matches at least as many times as you specify
- {number,number2} matches between the first and second number you specify
In the case of our regex /^.{3,32}#[0-9]{4}$/ we can see 2 quantifiers, both using curly braces. Our first quantifier, {3,32}, has two numbers specified. This means that it will accept anywhere between 3 and 32 characters. The second quantifier we have, {4}, is not as lenient. It accepts only 4, no less and no more. If it were {4,}, it would accept any number equal to or greater than 4.
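The quantifier forms described above can be tried one at a time. This is a small sketch; the short patterns and test strings are invented for illustration, and anchors are added so each test checks the whole string.

```javascript
// + : at least once
console.log(/^a+$/.test("aaa")); // true
console.log(/^a+$/.test(""));    // false, zero repetitions is not enough

// ? : zero times or once
console.log(/^colou?r$/.test("color"));   // true
console.log(/^colou?r$/.test("colour"));  // true
console.log(/^colou?r$/.test("colouur")); // false, two u's is too many

// {4} : exactly four
console.log(/^[0-9]{4}$/.test("123"));   // false
console.log(/^[0-9]{4}$/.test("1234"));  // true
console.log(/^[0-9]{4}$/.test("12345")); // false

// {4,} : four or more
console.log(/^[0-9]{4,}$/.test("12345")); // true

// {3,32} : between 3 and 32
console.log(/^.{3,32}$/.test("ab"));            // false
console.log(/^.{3,32}$/.test("abc"));           // true
console.log(/^.{3,32}$/.test("a".repeat(33)));  // false
```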
Although this regex does not use them, grouping constructs are important. This regex allows for a large variety of characters, but when you need a more sophisticated regex, grouping constructs will be your best friend. If you need to check multiple parts of a string, you will need to use these. These are generally done by writing subexpressions, which are placed inside parentheses (). In this example, two subexpressions are separated by a literal colon: ([a-z]|[0-9]{4}):([0-8]{4})
This matches a string that satisfies the first subexpression, then a colon, then the second subexpression. It will look for either a single lowercase letter or 4 digits before the colon. After the colon it looks for 4 digits, each between 0 and 8 (so 9 is excluded). Using operations like this you can have distinct searches for different parts of each string.
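One practical benefit of groups is that JavaScript hands back what each one matched. The sketch below uses the grouping example from this section with String.prototype.match; the input strings are made up for illustration.

```javascript
const groupedPattern = /([a-z]|[0-9]{4}):([0-8]{4})/;

// match() returns an array: index 0 is the full match,
// index 1 and 2 are what the first and second groups captured.
const digitsMatch = "1234:5678".match(groupedPattern);
console.log(digitsMatch[0]); // "1234:5678"
console.log(digitsMatch[1]); // "1234"
console.log(digitsMatch[2]); // "5678"

const letterMatch = "x:0000".match(groupedPattern);
console.log(letterMatch[1]); // "x"
console.log(letterMatch[2]); // "0000"

// The second group only allows the digits 0 through 8, so a 9 makes the match fail.
console.log("1234:5679".match(groupedPattern)); // null
```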
Looking back at our original regex, /^.{3,32}#[0-9]{4}$/, you can see a bracket expression: [0-9]. Anything inside square brackets [] like these is called a bracket expression. These allow you to set a range of characters to match instead of requiring an exact match. If you used a bracket expression like [!@#], it would match any one of those characters. In the expression we used in the regex, however, we have a -, even though the digits at the end of a username shouldn't contain any dashes. Between two letters or two numbers, a dash defines a range, so you don't have to type out every single character you want to include. Our [0-9] bracket expression includes all digits from 0 to 9. You can stack these as well, so [0-9!@#] would allow all digits and those 3 symbols. Any character that would match either of those bracket expressions separately will also match the combined version.
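Here is a short sketch of the three bracket expressions discussed in this section. The test strings are invented for illustration.

```javascript
// [!@#] matches any one of the three listed symbols.
console.log(/[!@#]/.test("hello!")); // true
console.log(/[!@#]/.test("hello"));  // false

// [0-9] is a range: the dash is not matched literally.
console.log(/^[0-9]$/.test("7")); // true
console.log(/^[0-9]$/.test("-")); // false

// [0-9!@#] combines the range and the symbols.
console.log(/^[0-9!@#]+$/.test("42#!")); // true
console.log(/^[0-9!@#]+$/.test("42a"));  // false, "a" is in neither set
```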
Character classes are very handy, and define a group of characters that can appear in a string. Some common examples are:
- . matches any character except a line break
- \d matches any digit, and is the same as [0-9]
- \w matches alphanumeric characters and the underscore. It's the same as [a-zA-Z0-9_] but much easier to type
- \s matches whitespace: spaces, tabs, line breaks, you name it
You can also capitalize the letters to make them match anything except the thing they used to find. \D will match anything EXCEPT digits, and so on. If you look back at our original regex, /^.{3,32}#[0-9]{4}$/, you will see we do use a . after our start anchor. We could also have written this regex as /^.{3,32}#\d{4}$/ and would get the same results.
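The claim that [0-9] and \d are interchangeable here can be checked directly. The sketch below runs both versions of the pattern over a few strings and then tries the other classes; all variable names and sample strings are assumptions for illustration.

```javascript
const withBrackets = /^.{3,32}#[0-9]{4}$/;
const withShorthand = /^.{3,32}#\d{4}$/;

const classSamples = ["TiredProgrammer#4242", "満天星#3317", "ab#1234", "NoDigits#12"];
for (const sample of classSamples) {
  // Both patterns should always agree with each other.
  console.log(sample, withBrackets.test(sample), withShorthand.test(sample));
}

// \w covers only a-z, A-Z, 0-9 and the underscore.
console.log(/^\w+$/.test("snake_case1")); // true
console.log(/^\w+$/.test("満天星"));       // false

// \s finds whitespace; \D is the opposite of \d.
console.log(/\s/.test("a b"));      // true
console.log(/^\D+$/.test("abc"));   // true
console.log(/^\D+$/.test("abc1"));  // false

// . does not match a line break.
console.log(/^.$/.test("\n")); // false
```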
([a-z]|[0-9]{4}):([0-8]{4})
If you look back at the regex example we used for grouping constructs, you will notice a | symbol. This symbol acts as an OR operator, and in this case separates the initial subexpression into 2 options. It will look for either a single lowercase letter or 4 digits before the colon.
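To isolate the OR operator, the sketch below takes just the first group from the example and wraps it in anchors. The anchors are an addition for this demonstration so that each test checks the entire string.

```javascript
// Either a single lowercase letter OR exactly four digits, and nothing else.
const eitherPattern = /^([a-z]|[0-9]{4})$/;

console.log(eitherPattern.test("q"));    // true, matches the left option
console.log(eitherPattern.test("2024")); // true, matches the right option
console.log(eitherPattern.test("qq"));   // false, two letters fit neither option
console.log(eitherPattern.test("20"));   // false, two digits fit neither option
```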
I told you way up there that we had to wrap regexes in slashes / because they were literals, but there are exceptions to the rule. And not just the one where I mentioned quotes. If you use a flag, you place it AFTER the last slash. JavaScript has several of them, but 3 are common enough that you may find it useful to know them. They are:
- m makes a search work line by line, so ^ and $ match at the start and end of every line
- i makes a search ignore case
- g applies the search to all possible matches instead of stopping at the first one
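The three flags can be seen in action with a few short tests. This is a sketch with invented sample text; the variable names are only for illustration.

```javascript
// i: ignore case
console.log(/discord/.test("DISCORD"));  // false
console.log(/discord/i.test("DISCORD")); // true

// g: find every match instead of only the first
const tagText = "a#1111 b#2222";
console.log(tagText.match(/#[0-9]{4}/)[0]); // "#1111"
console.log(tagText.match(/#[0-9]{4}/g));   // [ "#1111", "#2222" ]

// m: let ^ and $ match at the start and end of each line
const twoLines = "TiredProgrammer#4242\nearlybird♡#2048";
console.log(/^.{3,32}#[0-9]{4}$/.test(twoLines));  // false, the string as a whole does not fit
console.log(/^.{3,32}#[0-9]{4}$/m.test(twoLines)); // true, at least one line fits

// Flags can be combined: gm finds a match on every line.
const perLineMatches = twoLines.match(/^.{3,32}#[0-9]{4}$/gm);
console.log(perLineMatches.length); // 2
```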
A character escape is a way to make the regex not treat a symbol as a command. Normally, a symbol like [ would signify the start of a bracket expression, but if you use a character escape, which is done by putting a backslash \ in front of the symbol, your regex no longer reads it as anything but another character. This is very common if you need to match characters that have a special meaning in regexes. Once within a bracket expression, however, you mostly don't need this. Until the close bracket, most symbols do nothing except look for matches. The exceptions are the backslash itself, the close bracket ], a ^ placed right after the open bracket, and a - placed between two characters.
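A few quick tests show the difference an escape makes. This is a minimal sketch with invented sample strings; the last lines also show that the constructor form needs the backslash doubled, because the pattern passes through a JavaScript string first.

```javascript
// \[ matches a literal open bracket instead of starting a bracket expression.
console.log(/\[/.test("a[b")); // true

// An unescaped . matches any character; \. matches only a real period.
console.log(/^\d+.\d+$/.test("3x14"));  // true
console.log(/^\d+\.\d+$/.test("3x14")); // false
console.log(/^\d+\.\d+$/.test("3.14")); // true

// Inside a bracket expression the period is already literal.
console.log(/[.]/.test("a.b")); // true
console.log(/[.]/.test("ab"));  // false

// In the constructor form, the backslash itself has to be escaped in the string.
const fourDigits = new RegExp("\\d{4}");
console.log(fourDigits.test("2048")); // true
```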
Hello, my name is Keshav and I am a web developer and writer based in Philadelphia. I am still learning but thought it would be helpful to make this both for my own sake and to help make regexs feel more approachable. GitHub profile