Regular Expression Subexpressions
By Stephen Bucaro
A Regular Expression (regex) is a sequence of characters that defines a pattern
that allows you to search, match, locate, replace, manipulate, and manage text.
I explained the basics of Regular Expressions in an earlier article:
What is a Regular Expression?
In case you are not familiar with how to execute code like that in this article,
open a new text file in a basic ASCII text editor like Windows Notepad (not a word
processor, because word processors add formatting characters to the text). Paste the
code into the text file between <script> and </script> tags. Save the text file with a
name that has the file extension .htm (say test.htm), then double-click on the file to
open it in your web browser.
A regular expression subexpression is created by enclosing part of the pattern in parentheses.
The text matched by each subexpression is stored temporarily in memory. You can then reference
each subexpression using a backreference. The example below shows how to use subexpressions and
backreferences to reformat a name from "first last" to "last, first".
let strTarget = "Robert Sled";
let regexp = /(\w+)\s(\w+)/;  // $1 = first name, $2 = last name
let newstr = strTarget.replace(regexp, "$2, $1");
alert(newstr);  // displays the result in a message box
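In a replacement string, the subexpressions are referenced as $1, $2, and so on. $1 references
the text matched by the first pair of parentheses, $2 references the text matched by the second
pair, and so on. These values belong to the current match only. (The legacy static properties
RegExp.$1 through RegExp.$9 hold the values from the most recent successful match until another
successful match replaces them.)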
The example uses the replace() method to switch the words in the string. It references the
value in $2 (last name) first, and then the value in $1 (first), placing a comma between
the two names. The result of the replace() method is stored in the variable newstr,
which is displayed in a message box using the alert() method.
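As a quick way to inspect what each subexpression captured, the sketch below uses the
match() method, which returns an array holding the full match at index 0 and the
subexpression values from index 1 onward. The variable name groups is only illustrative, and
console.log() is used in place of alert() on the assumption that you may want to run the code
outside a browser.

```javascript
const strTarget = "Robert Sled";
const regexp = /(\w+)\s(\w+)/;

// match() returns null when the pattern does not match,
// otherwise an array: [full match, subexpression 1, subexpression 2, ...]
const groups = strTarget.match(regexp);

if (groups !== null) {
  console.log(groups[0]); // full match: Robert Sled
  console.log(groups[1]); // same text as $1: Robert
  console.log(groups[2]); // same text as $2: Sled
}
```

The array indexes line up with the $n ids, so groups[2] holds the same text that $2 inserts
into a replacement string.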
In case you need an explanation of the regular expression, the \w character class matches
any word character (a letter, a digit, or an underscore), and the + operator indicates one or
more occurrences of the preceding item. The \s character class matches one white space character.
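Because \w matches only letters, digits, and the underscore, \w+ stops at punctuation such as
an apostrophe or a hyphen. The sketch below shows how that affects the name pattern. The
function name swapNames and the sample names are made up for illustration.

```javascript
// Hypothetical helper that applies the name-swapping pattern to any string.
function swapNames(name) {
  const regexp = /(\w+)\s(\w+)/;
  return name.replace(regexp, "$2, $1");
}

// Plain names work as expected.
console.log(swapNames("Robert Sled"));     // Sled, Robert

// The apostrophe is not a word character, so $2 captures only "O".
console.log(swapNames("Mary O'Neil"));     // O, Mary'Neil

// The hyphen is not a word character, so the match starts at "Luc".
console.log(swapNames("Jean-Luc Picard")); // Jean-Picard, Luc
```

For real names, a broader character class would be needed. The simple pattern is kept here
because it keeps the subexpressions easy to follow.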
The name example displays "Sled, Robert". The example below uses the same technique to
rearrange the words in the sentence "dog jumps over cat".
let strTarget = "dog jumps over cat";
let regexp = /(\w+)(\s.*\s)(\w+)/;
// $2 already includes the space on each side, so none are added here
let newstr = strTarget.replace(regexp, "$3$2$1");
alert(newstr);
The regular expression is similar, except that it contains three subexpressions referenced
by the ids $1, $2, and $3. Subexpression $2 uses .* (a period followed by an asterisk) to match
any number of characters between a leading and a trailing white space character, so its value
includes the space on each side. The replace() method references the subexpression values in
reverse order, resulting in the display of the message "cat jumps over dog".
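To see exactly what each of the three subexpressions captures, including the spaces held in
$2, the values can be printed one by one. This is only a sketch: the use of match() and
JSON.stringify() is an illustrative choice, and console.log() stands in for alert() on the
assumption that the code may be run outside a browser.

```javascript
const strTarget = "dog jumps over cat";
const regexp = /(\w+)(\s.*\s)(\w+)/;

// The pattern is known to match this sentence, so groups is not null here.
const groups = strTarget.match(regexp);

// JSON.stringify() adds quotation marks, which makes the spaces visible.
console.log(JSON.stringify(groups[1])); // "dog"
console.log(JSON.stringify(groups[2])); // " jumps over "
console.log(JSON.stringify(groups[3])); // "cat"

// Because $2 carries a space on each side, the references can be
// written back to back in the replacement string.
const newstr = strTarget.replace(regexp, "$3$2$1");
console.log(newstr); // cat jumps over dog
```

The .* in $2 is greedy: it first takes as much text as it can, then gives characters back
until the final \s and (\w+) can still match the last word.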
The example below uses a subexpression to extract the temperature value from the sentence
"Temperature is 26 degrees C today", uses a formula to convert that value from Celsius to
Fahrenheit, and displays the same message with the Fahrenheit temperature.
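let strTarget = "Temperature is 26 degrees C today";
let regexp = /(.*)\s([0-9]+)\s(degrees)\s(\w)\s(\w+)/;
// The pattern matches the whole string, so replacing the match with $2 leaves only "26".
let tempC = strTarget.replace(regexp, "$2");
let tempF = (tempC*9/5) + 32;  // the multiplication converts the string "26" to a number
let newstr = strTarget.replace(regexp, "$1 " + tempF + " $3 F $5");
alert(newstr);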
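The temperature digits are captured by the subexpression with the id $2. The replace() method
references the subexpression's value and stores it in a variable named tempC. This extra step
is needed because $2 in a replacement string is only a placeholder that replace() fills in, so
no arithmetic can be performed on it there. (The legacy RegExp.$1 to RegExp.$9 static
properties are read-only as well, so they can't be modified directly.)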
The value in tempC is used in the Celsius to Fahrenheit conversion formula,
F = (C * 9/5) + 32. The resulting Fahrenheit temperature value is saved in a variable named
tempF. The replace() method is then used a second time to reconstruct the sentence,
substituting tempF for the Celsius value and "F" for the units identifier character,
resulting in the display of the message "Temperature is 78.8 degrees F today".