Regular Expression Basics : Match a Set of Characters

A Regular Expression (regex) is a sequence of characters that defines a pattern you can use to search, match, locate, replace, and otherwise manipulate text. I explained the basics of regular expressions in an earlier article: What is a Regular Expression?

If you are not familiar with how to run code like the examples in this article: open a new text file in a plain-text editor such as Windows Notepad (not a word processor, which adds formatting characters to the text). Paste the code into the file, save it with the file extension .htm (for example, test.htm), then double-click the file to open it in your web browser.

In this article you'll learn how to use regular expressions to match a set of characters. The example below matches the uppercase characters A-Z in the string "We hold these Truths to be self-evident, that All Men are Created Equal..."

<script>
let strTarget = "We hold these Truths to be self-evident, that All Men are Created Equal...";
let regexp = /[A-Z]/g;

let result = strTarget.match(regexp);
alert(result);
</script>

To match a set of characters, create a "character set", also called a "character class": place the characters you want to match between square brackets. A character class matches exactly one character from the set. You can use a hyphen inside a character class to specify a range of characters, for example [A-Z].

In the example above, I store the regular expression /[A-Z]/g in a variable named regexp and pass that variable to the JavaScript String object's match method. The g flag makes the search continue through the entire string instead of stopping at the first match.

The value returned by the match method is stored in a variable named result. With the g flag, match returns an array of every match; if there are no matches, it returns null. I use the alert method to display the contents of result.

For the string above, the match method returns W,T,A,M,C,E.
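Because match returns null rather than an empty array when nothing matches, code that reads a property such as length on the result can fail. Here is a minimal sketch of one way to guard against that. The helper name findAll is hypothetical (it is not part of JavaScript), and console.log is used in place of alert so the snippet also runs outside a browser.

```javascript
// Hypothetical helper: always returns an array, even when nothing matches.
function findAll(text, pattern) {
  const matches = text.match(pattern); // null when there are no matches
  return matches === null ? [] : matches;
}

const upper = findAll("We hold these Truths", /[A-Z]/g);
const digits = findAll("We hold these Truths", /[0-9]/g);

console.log(upper.join(","));  // W,T
console.log(digits.length);    // 0 (an empty array instead of null)
```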

In previous examples, I used the keyword var to declare variables. Assuming you are programming for newer browser versions, the keyword let is a better choice. The difference is that var is function scoped and let is block scoped, so let provides much tighter scoping. (In programming, a variable's scope is the part of the code in which it is visible.)

Also, outside strict mode JavaScript lets you use variables without declaring them first, which is bad programming practice. Variables declared with let, by contrast, are not accessible before they are declared.
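The scoping difference can be seen in a short sketch. The function and variable names here are made up for illustration, and console.log stands in for alert so the snippet also runs outside a browser.

```javascript
// var is visible throughout the function; let only inside its block.
function scopeDemo() {
  if (true) {
    var functionScoped = "declared with var";
    let blockScoped = "declared with let";
  }
  // Outside the block, only the var variable still exists.
  return [typeof functionScoped, typeof blockScoped];
}

// Reading a let variable before its declaration throws an error.
function readBeforeDeclaration() {
  try {
    console.log(early); // throws before anything is logged
    let early = 1;
    return "no error";
  } catch (err) {
    return err.name;
  }
}

console.log(scopeDemo().join(","));    // string,undefined
console.log(readBeforeDeclaration());  // ReferenceError
```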

In the example below, the regular expression /[aeiou]/g causes the String match method to return all lowercase vowels.

<script>
let strTarget = "We hold these Truths to be self-evident, that All Men are Created Equal ...";
let regexp = /[aeiou]/g;

let result = strTarget.match(regexp);
alert(result);
</script>

This causes the String match method to return e,o,e,e,u,o,e,e,e,i,e,a,e,a,e,e,a,e,u,a.

In the example below, the regular expression /[A-Ma-m]/g matches every uppercase or lowercase letter from the first half of the alphabet.

<script>
let strTarget = "We hold these Truths to be self-evident, that All Men are Created Equal ...";
let regexp = /[A-Ma-m]/g;

let result = strTarget.match(regexp);
alert(result);
</script>

This causes the String match method to return e,h,l,d,h,e,e,h,b,e,e,l,f,e,i,d,e,h,a,A,l,l,M,e,a,e,C,e,a,e,d,E,a,l.

In the example below, the caret character ^ at the beginning of the character class means match any character NOT in the list. In this case the pattern will not match any letter in the first half of the alphabet; because of the i flag (ignore case), that applies to both uppercase and lowercase letters. It also will not match , (commas), . (periods), or \s (whitespace characters).

<script>
let strTarget = "We hold these Truths to be self-evident, that All Men are Created Equal ...";
let regexp = /[^a-m,.\s]/ig;

let result = strTarget.match(regexp);
alert(result);
</script>

This causes the String match method to return W,o,t,s,T,r,u,t,s,t,o,s,-,v,n,t,t,t,n,r,r,t,q,u.
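The caret only negates a character class when it is the first character inside the brackets; anywhere else it is an ordinary character. The sketch below illustrates this with a made-up sample string, using console.log in place of alert.

```javascript
const sample = "a^b c";

// Caret first: match every character that is NOT a, b, or c.
const negated = sample.match(/[^a-c]/g);

// Caret not first: match a literal a, a literal ^, or a literal c.
const literalCaret = sample.match(/[a^c]/g);

console.log(negated);      // ["^", " "]
console.log(literalCaret); // ["a", "^", "c"]
```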

In the example below, the regular expression extracts a number from the string "extract the number 82501 from this string". The character class [0-9] matches a single digit between 0 and 9. The + quantifier means that the preceding item can occur one or more times, so [0-9]+ matches a run of consecutive digits.

<script>
let strTarget = "extract the number 82501 from this string";
let regexp = /[0-9]+/;

let result = strTarget.match(regexp);
alert(result);
</script>

This causes the String match method to return 82501.
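Without the g flag, match returns an array describing only the first match: element 0 holds the matched text, and the index property holds its position in the string. The sketch below reads both and converts the text to a number with Number. The variable names are chosen for illustration, and console.log replaces alert.

```javascript
const strTarget = "extract the number 82501 from this string";
const result = strTarget.match(/[0-9]+/);

const matchedText = result[0];     // the matched characters, still a string
const position = result.index;     // where the match starts in strTarget
const value = Number(matchedText); // convert the digits to a number

console.log(matchedText); // 82501
console.log(position);    // 19
console.log(value + 1);   // 82502
```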

In the example below, signed numbers and unsigned numbers are extracted from the string "extract the number -825 and 501 from this string".

<script>
let strTarget = "extract the number -825 and 501 from this string";
let regexp = /[-0-9]+/g;

let result = strTarget.match(regexp);
alert(result);
</script>

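To extract signed and unsigned numbers from a string, the character class [-0-9] matches a single character that is either a hyphen or a digit between 0 and 9. A hyphen placed first in a character class is treated as a literal hyphen rather than as a range operator. The + quantifier means the character can occur one or more times. The g flag means that the regular expression should find matches in the entire string. Note that this pattern does not check where the hyphen appears, so it would also match a hyphen on its own.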

This causes the String match method to return -825,501.
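Since [-0-9]+ accepts a hyphen in any position, a stray hyphen in the text also counts as a match. One possible tightening, sketched below, moves the hyphen out of the character class and makes it optional with the ? quantifier, which this article has not covered. The sample string is made up, and console.log replaces alert.

```javascript
const sample = "add -825 and 501 - then stop";

// Loose: any run of hyphens and digits.
const loose = sample.match(/[-0-9]+/g);

// Stricter sketch: an optional leading hyphen, then one or more digits.
const strict = sample.match(/-?[0-9]+/g);

console.log(loose);  // ["-825", "501", "-"]
console.log(strict); // ["-825", "501"]
```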

In the example below, signed or unsigned floating point numbers are extracted from the string "extract the number -825.501 from this string".

<script>
let strTarget = "extract the number -825.501 from this string";
let regexp = /[-.0-9]+/;

let result = strTarget.match(regexp);
alert(result);
</script>

To extract a signed or unsigned floating point number from a string, the character class [-.0-9] matches a single character that is a hyphen, a decimal point, or a digit between 0 and 9. Inside a character class the . (dot) is a literal dot, not the match-any-character wildcard. The + quantifier means the character can occur one or more times.

This causes the String match method to return -825.501.
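The class [-.0-9] treats every dot and hyphen as part of a number, so ordinary punctuation such as an ellipsis also matches. The sketch below compares it with one possible stricter pattern, which uses an optional sign, an escaped dot, and an optional group. Those features go beyond character classes and are shown here only as an assumption about how you might tighten the match. The sample string is made up, and console.log replaces alert.

```javascript
const sample = "Equal... then -825.501 and 42.";

// Loose: any run of hyphens, dots, and digits.
const loose = sample.match(/[-.0-9]+/g);

// Stricter sketch: optional sign, digits, then optionally a dot and more digits.
const strict = sample.match(/-?[0-9]+(?:\.[0-9]+)?/g);

console.log(loose);  // ["...", "-825.501", "42."]
console.log(strict); // ["-825.501", "42"]
```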

In the example below, a regular expression is used to extract the hexadecimal number 0xacc0fd from the string "extract the hexadecimal number 0xacc0fd from this string".

<script>
let strTarget = "extract the hexadecimal number 0xacc0fd from this string";
let regexp = /0[xX][0-9a-fA-F]+/;

let result = strTarget.match(regexp);
alert(result);
</script>

The regular expression /0[xX][0-9a-fA-F]+/ matches a sequence of characters starting with 0, followed by either a lowercase or uppercase x, followed by one or more characters in the ranges 0-9, a-f, or A-F.

The hexadecimal number needs to be prefixed by 0x because the hexadecimal digits a-f are ordinary letters, so without the prefix the regular expression could mistake regular text in the string for hexadecimal digits.
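The effect of the 0x prefix can be seen by running the pattern with and without it. In the sketch below, the version without the prefix stops at the first letter of "extract". The conversion with parseInt and radix 16 is an extra step added for illustration, and console.log replaces alert.

```javascript
const strTarget = "extract the hexadecimal number 0xacc0fd from this string";

// Without the prefix, the "e" in "extract" already counts as a hex digit.
const withoutPrefix = strTarget.match(/[0-9a-fA-F]+/);

// With the prefix, only the intended number matches.
const withPrefix = strTarget.match(/0[xX][0-9a-fA-F]+/);

// parseInt with radix 16 accepts the leading 0x.
const value = parseInt(withPrefix[0], 16);

console.log(withoutPrefix[0]); // e
console.log(withPrefix[0]);    // 0xacc0fd
console.log(value);            // 11321597
```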

