Sunday 9 June 2013

Examples on LoadRunner Regular Expressions

How to use Regular Expressions in LoadRunner.

Introduction:
This article summarizes the LoadRunner Regular Expressions challenge and its results. I also added code for matching RegExp patterns and subpatterns.
All LoadRunner Regular Expressions functions are shown with examples.

Outline:

  1. How to check whether a RegExp pattern matches a text or not
  2. How to get matched strings (RegExp patterns and subpatterns)

How to check whether a RegExp pattern matches a text or not


I thank Charlie Weiblen and Tim Koopmans for the solution. I modified it slightly.
So, here it is:

  1. Download and unpack Binaries and Developer files for PCRE (Perl Compatible Regular Expressions).
    These and other files are available on the Pcre for Windows page.
  2. Unzip the downloaded archives into the C:\pcre folder
  3. Comment out the include for the stdlib.h file in:
    • C:\pcre\include\pcre.h
    • C:\pcre\include\pcreposix.h
    Commented stdlib.h file
  4. In your LoadRunner script, add to globals.h:
    • #include "c:\\pcre\\include\\pcre.h"
    • #include "c:\\pcre\\include\\pcreposix.h"
    Edited globals.h file
  5. Add the match() function to vuser_init section:

    //////////////////////////////////////////////////////////////////////////
    /// 'match' function matches a 'pattern' against a given 'subject'
    /// It returns 1 for a match, or 0 for a non-match / error
    int match(const char *subject, const char *pattern)
    {
        int rc; // Returned code
        regex_t re; // Compiled regexp pattern

        lr_load_dll("c:\\pcre\\bin\\pcre3.dll");

        if (regcomp(&re, pattern, 0) != 0)
            return 0; // Pattern could not be compiled

        rc = regexec(&re, subject, 0, NULL, 0);
        regfree(&re); // Release memory used for the compiled pattern

        if (rc != 0)
            return 0; // Non-match or error
        else
            return 1;
    }

  6. Let's run a sample LoadRunner script and check the result:
    As you can see, the match() function works correctly. With match(), you can check whether a RegExp pattern matches a text or not.

    This is helpful when you need to verify in LoadRunner that a downloaded page contains text matching a RegExp pattern.

    I tested the match() function with different patterns and subject strings:
    #  Subject string          Pattern     Result of match()  Is the result correct?
    1  abcdef                  b(c(.*))e   1                  Yes
    2  abcdef                  b(z(.*))e   0                  Yes
    3  2008                    \\d{2,5}    1                  Yes
    4  2008                    \\d{5}      0                  Yes
    5  abc 1st of May 2008xyz  \\d.*\\d    1                  Yes
    Note: Since LoadRunner scripts are written in ANSI C, do not forget to double the backslashes (\\) in string literals. For example, to match any digit character (0-9), use the pattern "\\d".
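To make the note about doubled backslashes concrete, here is a minimal sketch in plain C (nothing LoadRunner-specific). It shows what the compiler actually stores for such a literal, and therefore what the regex engine receives. The helper name pattern_length and the two variable names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns how many characters the regex engine
   receives once the C compiler has processed the string literal. */
size_t pattern_length(const char *pattern)
{
    assert(pattern != NULL);
    return strlen(pattern);
}

/* Written as "\\d" in the source, stored as two characters: '\' and 'd'. */
const char *digit_pattern = "\\d";

/* Written as "\\d{2,5}" in the source, stored as the 7 characters \d{2,5} */
const char *range_pattern = "\\d{2,5}";
```

So the pattern column in the table above shows the source spelling; the engine itself sees a single backslash in each case.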

    The match() function is simple enough, but it only reports whether a match exists; it cannot extract matched subpatterns from the text. For example, suppose we have to extract the name of the month from these strings:
    • "abc 1st of May 2008xyz"
    • "abc 25th of February 2031"
    • etc
    We can use the following pattern:
    • \d.+([A-Z]\w+)\s+\d{4}

    The name of the month will be matched by the subpattern ([A-Z]\w+). How do we extract the found text? You can use the matchex() function for that. Let's discuss it in detail...

How to get matched strings (RegExp patterns and subpatterns)

To get the matched (found) strings, we have to update our match() function.
That's why I created the matchex() ('match' EXtended) function.

  1. Add the matchex() function to vuser_init section
    //////////////////////////////////////////////////////////////////////////
    /// 'matchex' (EXtended) function matches a 'pattern' against a given 'subject'
    /// It returns number of matches:
    /// 0 - for a non-match or error
    /// 1 and more - for successful matches
    int matchex(const char *subject, const char *pattern, int nmatch, regmatch_t *pmatch)
    {
        int rc; // Returned code
        regex_t re; // Compiled regexp pattern

        lr_load_dll("c:\\pcre\\bin\\pcre3.dll");

        if (regcomp(&re, pattern, 0) != 0)
            return 0; // Pattern could not be compiled

        rc = regexec(&re, subject, nmatch, pmatch, 0);
        regfree(&re); // Release memory used for the compiled pattern

        if (rc != 0)
            return 0; // Non-match or error (regexec returns 0 only on success)

        // Get total number of matched patterns and subpatterns
        for (rc = 0; rc < nmatch; rc++)
            if (pmatch[rc].rm_so == -1)
                break;

        return rc;
    }

  2. Let's run a sample LoadRunner script and check the result:
    The matchex() function returns the number of matched patterns/subpatterns and fills an array with information about each matched substring.


    What information is stored about each matched substring?


    For each substring, the array entry holds the offset (rm_so) of its first character and the offset (rm_eo) of the first character after its end.

    Note1: The 0th element of the array relates to the entire portion of the string that was matched.
    Note2: Subsequent elements of the array relate to the capturing subpatterns of the regular expression.
    Note3: Unused entries in the array have both structure members set to -1.

    Let's investigate it with an example. This is our subject string:
    Example
    The replay log shows offsets for matched substrings:
    • Action.c(7): Matched 3 patterns
    • Action.c(10): Start offset: 1, End offset: 6
    • Action.c(10): Start offset: 2, End offset: 5
    • Action.c(10): Start offset: 3, End offset: 5

    Start offset: 1 and End offset: 6 match substring "bcdef".
    Note4: The end offset points to the first character after the end of the current substring. That's why the character "g" (with index 6) is not a part of the matched string.
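The offset arithmetic above can be sketched in plain C. This is only an illustration, not part of the original script: the helper copy_span is hypothetical, it takes the offsets as plain ints (in a LoadRunner script you would pass pmatch[i].rm_so and pmatch[i].rm_eo), and the subject "abcdefg" used in the usage note is an assumption inferred from the offsets and from "g" having index 6.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies subject[start .. end-1] into buf.
   'start' and 'end' play the roles of rm_so and rm_eo.
   Returns the number of characters copied, or -1 if the entry is unused
   (offsets are -1), the offsets are invalid, or buf is too small. */
int copy_span(const char *subject, int start, int end, char *buf, size_t bufsize)
{
    int len;

    assert(subject != NULL && buf != NULL);

    if (start < 0 || end < start || (size_t)end > strlen(subject))
        return -1;

    len = end - start;                  /* the end offset is exclusive */
    if ((size_t)len + 1 > bufsize)
        return -1;                      /* no room for text plus '\0' */

    memcpy(buf, subject + start, (size_t)len);
    buf[len] = '\0';
    return len;
}
```

With the assumed subject "abcdefg", calling copy_span(subject, 1, 6, buf, sizeof buf) copies 6 - 1 = 5 characters, "bcdef", and leaves "g" out.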

    As I wrote in Note1, "bcdef" is the entire portion of the string that was matched.
    The other items in the array relate to the matched subpatterns.


    What is a subpattern in Regular Expression?
    It is a part of the RegExp pattern surrounded by parentheses - "(" and ")".

    It's easy to work out the order of subpatterns. Just read your pattern from left to right: each opening parenthesis starts the next subpattern.
    Subpatterns can be nested.

    So, the other captured subpatterns are:
    • Start offset: 2, End offset: 5 matches substring "cde".
      Note: current subpattern is "([acqz](.*))".
    • Start offset: 3, End offset: 5 matches substring "de".
      Note: current subpattern is "(.*)".

    As you can see - this is not so difficult. :)
    Regular Expressions can be very powerful and useful in LoadRunner.
Another example:

Let's practice with the example I mentioned earlier:
Suppose we have to extract the name of the month from these strings:

  • "abc 1st of May 2008xyz"
  • "abc 25th of February 2031"
  • etc
We can use the following pattern:
  • \d.+([A-Z]\w+)\s+\d{4}
The name of the month will be matched by the subpattern ([A-Z]\w+).

Please see the LoadRunner script, which captures and prints the names of the months:

Note: Pay attention that I use arr[1] to get info about the substring.

As you remember, arr[0] contains info about the entire matched pattern, while arr[1], arr[2], and so on contain info about the matched subpatterns.
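As a rough sketch of the arr[1] idea that can be compiled outside LoadRunner, the function below uses the system <regex.h> as a stand-in for pcreposix.h. That is an assumption made for portability, not the setup described above. Plain POSIX extended syntax does not define \d, \w or \s, so the pattern is respelled with bracket expressions and POSIX classes. The helper name extract_month is hypothetical; in a LoadRunner script with PCRE you would keep the original pattern (with doubled backslashes) and call matchex() instead.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: copies the month name found in 'subject' into buf.
   Returns 1 on success, 0 on non-match, error, or too-small buffer. */
int extract_month(const char *subject, char *buf, size_t bufsize)
{
    /* POSIX extended spelling of \d.+([A-Z]\w+)\s+\d{4} */
    const char *pattern = "[0-9].+([A-Z][[:alnum:]_]+)[[:space:]]+[0-9]{4}";
    regex_t re;
    regmatch_t arr[2];      /* arr[0]: whole match, arr[1]: month name */
    size_t len;
    int rc;

    assert(subject != NULL && buf != NULL);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return 0;

    rc = regexec(&re, subject, 2, arr, 0);
    regfree(&re);

    if (rc != 0 || arr[1].rm_so == -1)
        return 0;

    len = (size_t)(arr[1].rm_eo - arr[1].rm_so);   /* end offset is exclusive */
    if (len + 1 > bufsize)
        return 0;

    memcpy(buf, subject + arr[1].rm_so, len);
    buf[len] = '\0';
    return 1;
}
```

For the two sample strings this copies "May" and "February" into buf. Reading arr[0] instead would give the whole matched portion, starting at the first digit and ending with the four-digit year.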