

Regular Expressions: Character Classes


What is a character class used for in regular expressions?

View Answer:
Interview Response: A character class is a special notation that matches any single character from a particular set. The most common character classes are \d (a digit), \s (a whitespace character such as a space, tab, or newline), and \w (a "word" character: a Latin letter, a digit, or an underscore). A regular expression may contain both regular symbols and character classes.

Code Example:

let str = 'Is there CSS4?';
let regexp = /CSS\d/;

alert(str.match(regexp)); // alerts CSS4
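
As a rough illustration of the three classes named above, the sketch below runs each one against a made-up sample string. The string and variable names are assumptions for illustration, and console.log stands in for alert so the snippet also runs outside a browser.

```javascript
// Hypothetical sample string, chosen only for illustration.
const sample = 'Room 42_b';

// The "g" flag returns every match instead of only the first one.
const digits = sample.match(/\d/g);    // digits
const spaces = sample.match(/\s/g);    // whitespace characters
const wordChars = sample.match(/\w/g); // Latin letters, digits, underscore

console.log(digits.join(''));    // 42
console.log(spaces.length);      // 1
console.log(wordChars.join('')); // Room42_b
```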

Is it possible to use multiple character classes in regular expressions?

View Answer:
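Interview Response: Yes. A regular expression may contain several character classes, and each one matches exactly one character at its position in the pattern. In the example below, \s matches the space, each \w matches one letter, and \d matches the digit.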

Code Example:

alert('I love HTML5!'.match(/\s\w\w\w\w\d/)); // alerts ' HTML5'
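
A slightly more practical sketch of combining classes: picking clock times out of a sentence. The sample text and variable names are hypothetical, and console.log is used in place of alert so the snippet also runs outside a browser.

```javascript
// Hypothetical sample text, chosen only for illustration.
const text = 'Breakfast at 09:00, dinner at 21:30';

// Four \d classes around a literal colon: two digits, ":", two digits.
const times = text.match(/\d\d:\d\d/g);
console.log(times.join(' | ')); // 09:00 | 21:30

// Mixing \w, \s, and \d: two word characters, one whitespace, then a time.
const firstLabelled = text.match(/\w\w\s\d\d:\d\d/)[0];
console.log(firstLabelled); // at 09:00
```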

In the context of regular expressions, what is an inverse class?

View Answer:
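Interview Response: Every character class has an "inverse class", written with the same letter in uppercase, that matches any character the original class does not: \D matches any non-digit, \S any non-whitespace character, and \W any non-word character. Inverse classes can shorten code. For example, instead of collecting digits with str.match(/\d/g).join(''), we can remove everything that is not a digit with str.replace(/\D/g, '').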

Code Example:

// With \d, we collect every digit and join the matches into one string.
let str = '+7(903)-123-45-67';

alert(str.match(/\d/g).join('')); // 79031234567

// With \D, we remove every non-digit instead, so no join is needed.
alert(str.replace(/\D/g, '')); // 79031234567
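
The other inverse classes work the same way. The sketch below, using made-up strings and variable names, tries \S and \W, and also shows one practical difference between the two approaches: match with the "g" flag returns null when nothing matches, so calling join on the result would throw, while replace simply returns an empty string.

```javascript
// Hypothetical sample strings, chosen only for illustration.
const phrase = 'Hello, wide world!';

// \S matches every non-whitespace character.
const nonSpaceCount = phrase.match(/\S/g).length;
console.log(nonSpaceCount); // 16

// \W matches every non-word character (here: comma, spaces, "!").
const lettersOnly = phrase.replace(/\W/g, '');
console.log(lettersOnly); // Hellowideworld

// With no digits in the input, match(...) yields null, replace(...) yields ''.
const noDigits = 'no digits here';
const viaMatch = noDigits.match(/\d/g);
const viaReplace = noDigits.replace(/\D/g, '');
console.log(viaMatch);          // null
console.log(viaReplace === ''); // true
```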

What does the dot represent in a regular expression?

View Answer:
Interview Response: A dot (.) is a special character class that matches any single character except a newline (\n). In JavaScript, the other line terminators (\r, \u2028, and \u2029) are excluded as well. Note that a dot means "any character", not "the absence of a character": there must be a character for it to match.

Code Example:

alert('Z'.match(/./)); // Z

let regexp = /CS.4/;

alert('CSS4'.match(regexp)); // CSS4
alert('CS-4'.match(regexp)); // CS-4
alert('CS 4'.match(regexp)); // CS 4 (space is also a character)

alert('CS4'.match(/CS.4/)); // null
// no match because there is no character for the dot
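
To see which characters the dot accepts by default, the sketch below tests a handful of candidates between "A" and "B". The candidate list and variable names are assumptions for illustration. In JavaScript, the dot without the "s" flag rejects all four line terminators, not only \n.

```javascript
// Candidate characters to place between "A" and "B" (illustrative list).
const candidates = ['S', ' ', '\t', '\n', '\r', '\u2028', '\u2029'];

// For each candidate, check whether the dot in /A.B/ accepts it.
const results = candidates.map((ch) => /A.B/.test(`A${ch}B`));

console.log(results);
// [ true, true, true, false, false, false, false ]
// Letter, space, and tab match; the four line terminators do not.
```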

How can we make the dot match every character, including the newline character (\n)?

View Answer:
Interview Response: By default, a dot does not match the newline character \n. In many situations, however, we want a dot to mean literally "any character", newline included. The "s" flag (also called dotAll) enables this behavior. Note that Internet Explorer does not support the "s" flag.

Code Example:

// Without the "s" flag
alert('A\nB'.match(/A.B/)); // null (no match)

// With the "s" flag
alert('A\nB'.match(/A.B/s)); // A\nB (match!)
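
A short sketch of working with the "s" flag in an engine that supports it. The sample markup and variable names are hypothetical. The dotAll property reports whether the flag is set, and the flag can also be passed as a string to the RegExp constructor.

```javascript
const plain = /A.B/;
const withS = /A.B/s;

// The dotAll property tells us whether the "s" flag is set.
console.log(plain.dotAll); // false
console.log(withS.dotAll); // true

// The flag can also be supplied through the RegExp constructor.
const built = new RegExp('A.B', 's');
console.log(built.test('A\nB')); // true

// Capturing content that spans several lines (hypothetical markup).
const html = '<p>line one\nline two</p>';
const withoutFlag = html.match(/<p>(.*)<\/p>/);
const inner = html.match(/<p>(.*)<\/p>/s)[1];
console.log(withoutFlag);           // null
console.log(JSON.stringify(inner)); // "line one\nline two"
```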

When we need the behavior of the "s" flag, is there an approach that works in all browsers?

View Answer:
Interview Response: Yes. Because IE does not support the "s" flag, we can use the set [\s\S] as an alternative that works everywhere. [\s\S] means "a whitespace character OR a non-whitespace character", in other words, any character. Any other pair of complementary classes, such as [\d\D], works the same way. In JavaScript, so does [^], an empty negated set, which matches any character. This approach is also useful when we want both kinds of "dot" in the same pattern: the regular dot, which does not match a newline, and [\s\S] or a similar set, which matches any character.

Code Example:

alert('A\nB'.match(/A[\s\S]B/)); // A\nB (match!)
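
The sketch below first checks that the three alternatives behave alike on a newline, then uses both kinds of "dot" in one pattern. The input text and variable names are made up for illustration: the regular dot captures a single line, and [\s\S] captures everything after it, newlines included.

```javascript
// All three sets accept a newline between "A" and "B".
const variants = [/A[\s\S]B/, /A[\d\D]B/, /A[^]B/];
const allMatch = variants.every((re) => re.test('A\nB'));
console.log(allMatch); // true

// Hypothetical input: a one-line title followed by a multi-line body.
const note = 'title: Intro\nbody line 1\nbody line 2';

// (.*) stops at the first newline; ([\s\S]*) runs to the end of the text.
const parts = note.match(/^title: (.*)\n([\s\S]*)$/);
console.log(parts[1]);                 // Intro
console.log(JSON.stringify(parts[2])); // "body line 1\nbody line 2"
```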

Why is it important to pay attention to spaces in regular expressions?

View Answer:
Interview Response: A space is a character, equal in importance to any other character. If the text contains spaces that the regular expression does not account for, the match fails, and we fix it by adding the spaces to the pattern. We cannot add or remove spaces in a regular expression and expect it to work the same. In other words, every character in a regular expression matters, spaces included.

Code Example:

// Wrong Approach
alert('1 - 5'.match(/\d-\d/)); // null, no match!

// Correct Approach
alert('1 - 5'.match(/\d - \d/)); // 1 - 5, now it works

// or we can use \s class:
alert('1 - 5'.match(/\d\s-\s\d/)); // 1 - 5, also works
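
The two working patterns above are not interchangeable. A literal space matches only a space, while \s matches any whitespace character. The sketch below compares them on a tab-separated variant of the same input. The variable names and the tab-separated string are assumptions for illustration.

```javascript
const withSpaces = '1 - 5';
const withTabs = '1\t-\t5'; // same text, but separated by tabs

const literalSpace = /\d - \d/;      // accepts only real spaces
const anyWhitespace = /\d\s-\s\d/;   // accepts spaces, tabs, newlines, ...

console.log(literalSpace.test(withSpaces));  // true
console.log(literalSpace.test(withTabs));    // false
console.log(anyWhitespace.test(withSpaces)); // true
console.log(anyWhitespace.test(withTabs));   // true
```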