Regular Expression

From BC$ MobileTV Wiki

A Regular Expression is an arrangement of characters that, interpreted under a specific pattern language, describes a pattern to match. The pattern matches a subset of all possible strings, and that subset is defined by the particular arrangement of characters.

Regular Expressions are used for a wide array of functions, but the most common uses are:

- Filters
- Information Retrieval patterns
- Automated location of a pattern in a text or document
- Validation of adherence to a pattern
- etc...
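
The uses listed above can be illustrated with a minimal sketch using Python's standard `re` module. The sample lines and the `ORDER_ID` pattern are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical pattern: an order ID such as "AB-1234"
# (two capital letters, a dash, four digits).
ORDER_ID = re.compile(r"[A-Z]{2}-[0-9]{4}")

lines = [
    "shipped AB-1234 to warehouse",
    "no identifier on this line",
    "returned CD-5678 and EF-9012",
]

# Filter: keep only the lines that contain the pattern somewhere.
filtered = [line for line in lines if ORDER_ID.search(line)]

# Information retrieval: pull every occurrence out of the text.
found = ORDER_ID.findall(" ".join(lines))

# Location: report where the first occurrence starts and ends.
first = ORDER_ID.search(lines[0])
span = first.span() if first else None


# Validation: the whole input must be exactly one ID.
def is_order_id(text: str) -> bool:
    return ORDER_ID.fullmatch(text) is not None
```

Note the difference between `search` (the pattern may appear anywhere) and `fullmatch` (the entire input must conform); validation usually needs the latter.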

Commonly matched patterns

People's Names



Allow French Characters in Names
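
As a rough sketch of what allowing French characters can look like, the hypothetical pattern below accepts ASCII letters plus the accented Latin letters and ligatures used in French, with spaces, apostrophes, and hyphens as separators. The pattern and the function name are assumptions made for illustration, and the sketch assumes the input uses precomposed characters (a letter followed by a separate combining accent would be rejected).

```python
import re

# Hypothetical pattern (an assumption for illustration):
# ASCII letters, the Latin-1 accented letters (the ranges skip the
# multiplication and division signs), the ligatures Œ/œ and Ÿ,
# with single spaces, apostrophes, or hyphens between name parts.
LETTER = "A-Za-zÀ-ÖØ-öø-ÿŒœŸ"
FRENCH_NAME = re.compile(rf"[{LETTER}]+(?:[ '-][{LETTER}]+)*")


def is_plausible_name(text: str) -> bool:
    # fullmatch() requires the entire input to conform to the pattern.
    return FRENCH_NAME.fullmatch(text) is not None
```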




Typically more restrictive, in order to adhere to strict data security and/or data storage standards: rather than allowing just any character, often only a limited list of special characters is permitted.



Typically, you're better off allowing a user to select from a "known valid list" where possible or, for instance, to select their location on a map and geocode it (or, better yet, allow "Geolocation detection" and then reverse geocode to a place name). However, there are some approaches that can be taken to validate inputs if needed.



Sub-Regions include any naming or categorization system that a geo-politically subdivided region may use, such as States, Provinces, Prefectures, Counties, etc.



Postal Code

US Zip Codes:


Canadian Postal Codes:

[A-Za-z][0-9][A-Za-z] [0-9][A-Za-z][0-9]
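
A minimal sketch of applying the Canadian pattern above with Python's `re` module; the function name is hypothetical. The pattern only checks the letter-digit shape and the single space, so it also accepts letters that do not occur in real postal codes.

```python
import re

# Pattern taken from the text above.
CA_POSTAL = re.compile(r"[A-Za-z][0-9][A-Za-z] [0-9][A-Za-z][0-9]")


def is_canadian_postal_code(text: str) -> bool:
    # fullmatch() rejects leading or trailing extra characters.
    return CA_POSTAL.fullmatch(text) is not None
```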

UK Postal Code:

[A-Za-z]{1,2}[0-9Rr][0-9A-Za-z]? [0-9][ABD-HJLNP-UW-Zabd-hjlnp-uw-z]{2}
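
The UK pattern above can be exercised the same way. This is a sketch with a hypothetical function name; it shows that the inward part (after the space) excludes certain letters such as C, and that the `R` in the third position lets the special code "GIR 0AA" through.

```python
import re

# Pattern taken from the text above.
UK_POSTAL = re.compile(
    r"[A-Za-z]{1,2}[0-9Rr][0-9A-Za-z]? "
    r"[0-9][ABD-HJLNP-UW-Zabd-hjlnp-uw-z]{2}"
)


def is_uk_postcode(text: str) -> bool:
    # The single space between outward and inward parts is required.
    return UK_POSTAL.fullmatch(text) is not None
```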

Spanish Postal Code:


Japanese Postal Codes:



Phone Numbers

International Phone Numbers:


US & Canada phone numbers (omitting the "+1" country code that the two nations share):


[4] [5] [6] [7]

Mexico phone numbers:



UK phone numbers:

^\s*\(?(?:(020[78]\)?[ ]?[1-9][0-9]{2}[ ]?[0-9]{4})|(0[1-8][0-9]{3}\)?[ ]?[1-9][0-9]{2}[ ]?[0-9]{3}))\s*$
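
A sketch of the UK phone pattern in use, with a hypothetical function name. The version compiled here writes the character classes as `[78]` and `[0-9]{2}` and wraps both alternatives in one non-capturing group, so that the `^` and `$` anchors apply to either branch; treat it as one plausible reading of the pattern rather than a complete validator for UK numbering.

```python
import re

UK_PHONE = re.compile(
    r"^\s*\(?(?:"
    # Branch 1: 0207/0208 prefix, then 3 + 4 digits.
    r"(020[78]\)?[ ]?[1-9][0-9]{2}[ ]?[0-9]{4})"
    r"|"
    # Branch 2: five-digit code starting 01-08, then 3 + 3 digits.
    r"(0[1-8][0-9]{3}\)?[ ]?[1-9][0-9]{2}[ ]?[0-9]{3})"
    r")\s*$"
)


def is_uk_phone(text: str) -> bool:
    return UK_PHONE.match(text) is not None
```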

France phone numbers:

Japan phone numbers:


[9] [10]

Email Addresses

The email address standard, in its current state, makes perfect validation impractical. Keep in mind that it is nearly impossible to write a pattern that accepts every legal address and nothing else. However, there are some patterns that can be used to validate the vast majority of normal, expected email addresses:
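
To illustrate the trade-off, here is a minimal sketch in Python. The `SIMPLE_EMAIL` pattern and the function name are assumptions made for this example, not patterns taken from the references. It accepts common addresses while rejecting some legal ones, such as addresses with accented characters or a dotless domain.

```python
import re

# Hypothetical, deliberately simple pattern: a local part, one "@",
# and a dotted domain whose last label is at least two letters.
SIMPLE_EMAIL = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)


def looks_like_email(text: str) -> bool:
    # fullmatch() requires the whole input to be an address.
    return SIMPLE_EMAIL.fullmatch(text) is not None
```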



More thorough, but missing French characters and some other legal special characters:


Very thorough with French character support, but still lacking support for very long yet legal domains:


[12] [13]

IP Addresses


The following RegEx does a great job of matching most URL patterns:


In Notepad++, it can be used to swap the matched URL with the other data on the same line, using the following search pattern:

^(http(s)?:\/\/[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#[\]@!\$&'\(\)\*\+,;=.]+ )(.+)

For instance, to reformat "LINK other stuff" into "other stuff: LINK" use the following "Replace" field:

\3: \1  
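
The same swap can be sketched outside Notepad++ with Python's `re.sub`. The function name is hypothetical, and the literal `[` inside the character class is escaped here to avoid any ambiguity in Python's parser. `re.MULTILINE` is needed so that `^` matches at the start of every line.

```python
import re

# Group 1 = URL plus its trailing space, group 2 = optional "s",
# group 3 = the rest of the line.
URL_THEN_REST = re.compile(
    r"^(http(s)?:\/\/[\w.-]+(?:\.[\w\.-]+)+"
    r"[\w\-\._~:/?#\[\]@!\$&'\(\)\*\+,;=.]+ )(.+)",
    re.MULTILINE,
)


def swap_link_and_text(text: str) -> str:
    # \3 is the trailing text, \1 is the URL with its trailing space.
    return URL_THEN_REST.sub(r"\3: \1", text)
```

Because group 1 includes the space that follows the URL, each rewritten line ends with a trailing space; lines that do not start with a URL are left unchanged.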







External Links


  1. What is a reasonable length limit on person “Name” fields?
  2. Understanding (the rules for) Diacritical Marks in French
  3. HTML5 input -- pattern attribute - Postal/Zip Code regex examples
  4. US Area Codes By State
  5. US States - all area codes
  6. Canada area codes
  7. How to call Canada from the USA (area code guide)
  8. Wikipedia: Area codes in Mexico by code
  9. Regular Expression for Japan phone number
  10. HTML5 input -- pattern attribute - Phone regex examples
  11. Basic Email RegEx (mkyong)
  12. RegEx Tester -- Email example
  13. How to Find or Validate an Email Address (an explanation of the challenges of trying to find an email validation that covers all cases)
  14. URL Validation Regex
  15. Rubular - Ruby regular expression editor
  16. Try RegEx (JavaScript interactive RegEx teaching tool)
  17. RegEx in JS "test vs match" performance test/comparison
  18. Read user input from command-line with Java Scanner class

See Also

Validation | JavaScript