Regular Expression Basics : How many Matches?

A regular expression (regex) is a sequence of characters that defines a pattern for searching, matching, locating, replacing, and otherwise manipulating text. I explained the basics of regular expressions in an earlier article: What is a Regular Expression?

If you are not familiar with how to run code like the examples in this article, open a new text file in a plain text editor such as Windows Notepad (not a word processor, which adds formatting characters to the text). Paste the code into the file, save it with the file extension .htm (for example, test.htm), then double-click the file to open it in your web browser.

In this article you'll learn how to use regular expressions to count the number of matches. The example below will match the character a in the string "same sales sample".

<script>
var strTarget = "same sales sample";
var num = strTarget.match(/a/).length;
alert(num);
</script>

The code creates a text string "same sales sample" and then passes the regular expression /a/ to the JavaScript String object method <i>match</i>. Even though we can see that there are 3 a's in the string, the code will return 1, because without any flags <i>match</i> stops at the first occurrence and returns an array holding just that one match. To make it find all occurrences in the entire string we have to use the global flag g, as shown below.

<script>
var strTarget = "same sales sample";
var num = strTarget.match(/a/g).length;
alert(num);
</script>

This time the code will return 3. We can search for more than single characters. The example below searches for occurrences of "ing" in the sentence "From beginning to ending it's exciting."

<script>
var strTarget = "From beginning to ending it's exciting.";
var num = strTarget.match(/ing/g).length;
alert(num);
</script>

The code will return 3. However, the code shown below will return 2.

<script>
var strTarget = "From beginning to ENDING it's exciting.";
var num = strTarget.match(/ing/g).length;
alert(num);
</script>

That's because by default the match is case sensitive, so "ING" in "ENDING" does not match /ing/. We solve that by adding the i flag to make the search case insensitive, as shown below. The code then returns 3 again.

<script>
var strTarget = "From beginning to ENDING it's exciting.";
var num = strTarget.match(/ing/ig).length;
alert(num);
</script>
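
One caveat with the examples above: when a pattern matches nothing, match returns null rather than an empty array, so reading .length throws an error. The following is a minimal sketch of a guard against that. The helper name countMatches is hypothetical (it is not part of JavaScript), and console.log is used in place of alert so the results can be read in the browser console.

```javascript
// Hypothetical helper: counts the matches of a regular expression in a string.
// Pass a regex with the g flag; returns 0 instead of failing when nothing matches.
function countMatches(text, regex) {
  var found = text.match(regex); // null when there is no match at all
  return found === null ? 0 : found.length;
}

console.log(countMatches("same sales sample", /a/g)); // 3
console.log(countMatches("same sales sample", /z/g)); // 0
console.log(countMatches("From beginning to ENDING it's exciting.", /ing/ig)); // 3
```

The same function can be pasted between script tags and called in place of the direct .match(...).length expressions.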

Now let's suppose we want to count the number of "oo" character sequences in the string "look smooth soon". As a first attempt, we might try the code shown below.

<script>
var strTarget = "look smooth soon";
var num = strTarget.match(/o/g).length;
alert(num);
</script>

The code will return 6. That's because it counted each o individually, including successive o's. One way to fix that is to use the + quantifier, as shown below.

<script>
var strTarget = "look smooth soon";
var num = strTarget.match(/o+/g).length;
alert(num);
</script>

The code will return 3. The + quantifier means the preceding character can occur one or more times in each match, so a run of successive o's is counted as a single match.
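
Note that /o+/ also matches a single o standing alone. The sketch below compares the patterns side by side; the string "too good to go" is a made-up example, not from the article, chosen because it contains both single and doubled o's. It assumes console.log in place of alert.

```javascript
var strSample = "too good to go"; // illustrative string, not from the article

var everyO = strSample.match(/o/g);   // each o on its own
var runsOfO = strSample.match(/o+/g); // each run of one or more o's
var doubleO = strSample.match(/oo/g); // only the literal sequence "oo"

console.log(everyO.length);  // 6
console.log(runsOfO.length); // 4
console.log(doubleO.length); // 2
```

If only doubled letters should be counted, the literal pattern /oo/g is the closer fit; /o+/g is the better choice when any run of o's, whatever its length, should count once.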

Regular expression matches are frequently used for counting spaces in text. This can be done with the code shown below, which returns 2. In the regular expression the backslash is an escape character. It means the character following it is actually a code. The escape sequence \s means match any whitespace character. That includes space, tab (\t), new line (\n) and carriage return (\r).

<script>
var strTarget = "same sales sample";
var num = strTarget.match(/\s/g).length;
alert(num);
</script>

The escape sequence \S (capital S) means match any non-whitespace character.
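
As a small illustration of how \s and \S complement each other, the sketch below counts both kinds of character in the same string and checks that together they account for every character. It uses console.log in place of alert.

```javascript
var strTarget = "same sales sample";

var nonSpaces = strTarget.match(/\S/g).length; // letters: 4 + 5 + 6
var spaces = strTarget.match(/\s/g).length;    // the two separating spaces

console.log(nonSpaces); // 15
console.log(spaces);    // 2
console.log(nonSpaces + spaces === strTarget.length); // true
```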

The code shown below will return 3. That's because there is an extra space between "same" and "sales", and the regular expression counts each successive whitespace character separately.

<script>
var strTarget = "same  sales sample";
var num = strTarget.match(/\s/g).length;
alert(num);
</script>

Again, we can solve this using the + quantifier. The code shown below returns 2, because the two successive spaces are counted as one match.

<script>
var strTarget = "same  sales sample";
var num = strTarget.match(/\s+/g).length;
alert(num);
</script>

Why would we need a regular expression that matches any whitespace character, including tab (\t), new line (\n) and carriage return (\r)? Because real text often contains them. For example, to make a long text string fit in a small message box, we can break it into lines as shown below.

<script>
var strTarget = "We hold these truths	to be self-evident,\n\
that all men are created equal, that they are endowed by\n\
their creator with certain unalienable rights, that among\n\
these are life, liberty and the pursuit of happiness.";

alert(strTarget);
</script>

Note the characters at the end of each line of the string (except the last line). The \n escape sequence causes a new line, and the backslash \ at the very end of each line tells JavaScript that the string continues on the next line. Nothing, not even a space, may follow that final backslash. This method avoids having to close the string and add the concatenation operator (+) at the end of each line.
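
For comparison, here is a minimal sketch of three ways to write the same two-line string. The shortened text and the variable names are illustrative only. The third form, the template literal, is not used in this article; it is a newer alternative that assumes a browser supporting ES2015 or later.

```javascript
// 1. Backslash line continuation, the method used in this article.
var viaContinuation = "We hold these truths\n\
to be self-evident.";

// 2. Concatenation with + at the end of each line.
var viaConcat = "We hold these truths\n" +
                "to be self-evident.";

// 3. Template literal (backticks): the line break itself becomes part of the string.
var viaTemplate = `We hold these truths
to be self-evident.`;

console.log(viaContinuation === viaConcat); // true
console.log(viaConcat === viaTemplate);     // true
```

All three produce the same string, so the choice is mostly a matter of readability and of which browsers need to be supported.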

<script>
var strTarget = "We hold these truths	to be self-evident,\n\
that all men are created equal, that they are endowed by\n\
their creator with certain unalienable rights, that among\n\
these are life, liberty and the pursuit of happiness.";

var num = strTarget.match(/\s/g).length;
alert(num);
</script>

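Now, the code shown above will count every whitespace character in the text: spaces, the tab, and new line characters. It will return 34. Because each pair of neighboring words here is separated by exactly one whitespace character, this gives an accurate word count: the number of words is one more than the number of separators, or 35.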
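
Putting the pieces together, the sketch below shows two ways to get a word count from whitespace matching. The shortened text is illustrative, and both approaches rest on stated assumptions: the first assumes the text has no leading or trailing whitespace, and the second treats any run of non-whitespace characters as a word.

```javascript
var text = "We hold these truths\tto be self-evident,\n" +
           "that all men are created equal.";

// Approach 1: count runs of whitespace and add 1.
// Assumes the text does not start or end with whitespace.
var separators = text.match(/\s+/g);
var bySeparators = (separators ? separators.length : 0) + 1;

// Approach 2: count runs of non-whitespace characters directly.
var words = text.match(/\S+/g);
var byWords = words ? words.length : 0;

console.log(bySeparators); // 13
console.log(byWords);      // 13
```

The second approach is a little more forgiving, since extra spaces at the start or end of the text do not change its result.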

