Regular Expressions Cookbook

Jan Goyvaerts

Language: English

Pages: 612

ISBN: 1449319432

Format: PDF / Kindle (mobi) / ePub

Take the guesswork out of using regular expressions. With more than 140 practical recipes, this cookbook provides everything you need to solve a wide range of real-world problems. Novices will learn basic skills and tools, and programmers and experienced users will find a wealth of detail. Each recipe provides samples you can use right away.

This revised edition covers the regular expression flavors used by C#, Java, JavaScript, Perl, PHP, Python, Ruby, and VB.NET. You’ll learn powerful new tricks, avoid flavor-specific gotchas, and save valuable time with this huge library of practical solutions.

  • Learn regular expression basics through a detailed tutorial
  • Use code listings to implement regular expressions with your language of choice
  • Understand how regular expressions differ from language to language
  • Handle common user input with recipes for validation and formatting
  • Find and manipulate words, special characters, and lines of text
  • Detect integers, floating-point numbers, and other numerical formats
  • Parse source code and process log files
  • Use regular expressions in URLs, paths, and IP addresses
  • Manipulate HTML, XML, and data exchange formats
  • Discover little-known regular expression tricks and techniques

this regex can be broken into three groups of digits separated by hyphens. The second and third groups simply match any two- or four-digit number, respectively, but use a preceding negative lookahead to rule out the possibility of matching all zeros. The first group of digits is much more complex and harder to read than the others because it matches a numeric range. First, it uses the negative lookahead ‹(?!000|666)› to rule out the specific values “000”
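The structure described in this excerpt can be sketched in Python as follows. This is only an illustration under stated assumptions: the excerpt shows just the lookahead ‹(?!000|666)›, so the digit range used for the first group (`[0-8][0-9]{2}`), the pattern name `THREE_GROUPS`, and the helper `is_valid` are hypothetical and may differ from the book's actual regex.

```python
import re

# Assumption: only (?!000|666) appears in the excerpt. The [0-8][0-9]{2}
# range for the first group is a guess made for illustration.
THREE_GROUPS = re.compile(
    r"(?!000|666)[0-8][0-9]{2}"  # first group: three digits, not 000 or 666
    r"-(?!00)[0-9]{2}"           # second group: two digits, not all zeros
    r"-(?!0000)[0-9]{4}"         # third group: four digits, not all zeros
)


def is_valid(text: str) -> bool:
    """Return True if the whole string fits the three-group pattern."""
    return THREE_GROUPS.fullmatch(text) is not None
```

Each negative lookahead inspects the digits that follow without consuming them, so the plain digit classes after it still perform the actual matching.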

two conditions.

\A
(?=[abx]{1,8}\Z)
a{0,7}xb{0,7}
\Z

Regex options: Free-spacing
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby

The ‹\A› at the start of the regex anchors it to the start of the subject text. Then the positive lookahead kicks in. It checks whether a series of 1 to 8 letters ‹a›, ‹b›, and/or ‹x› can be matched, and that the end of the string is reached when those 1 to 8 letters have been matched. The ‹\Z› inside the
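As a rough illustration, the free-spacing regex from this excerpt can be tried in Python, one of the listed flavors, using `re.VERBOSE` for free-spacing mode. The names `LIMITED_LENGTH` and `matches` are hypothetical and are not taken from the book.

```python
import re

# The regex from the excerpt, written in free-spacing (verbose) mode.
LIMITED_LENGTH = re.compile(
    r"""
    \A
    (?=[abx]{1,8}\Z)   # lookahead: the whole subject is 1 to 8 of a, b, x
    a{0,7}xb{0,7}      # the actual match: a's, exactly one x, then b's
    \Z
    """,
    re.VERBOSE,
)


def matches(text: str) -> bool:
    """Return True if text satisfies both the length and the shape check."""
    return LIMITED_LENGTH.search(text) is not None
```

The lookahead enforces the overall length limit, while the part after it enforces the order of the letters. Neither condition alone would be enough.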

equals sign. Moving along, we get to the third piece of the regex: ‹([^>]*)›. This is a negated character class and a following “zero or more” quantifier, wrapped in a capturing group. Capturing this part of the match allows you to easily bring back the attributes that each matched tag contained as part of the replacement string. And unlike the negative lookahead, this part actually adds the attributes within the tag to the string matched by the regex. Finally, the regex matches the

only once, you should create a Pattern object instead of using the static members of the String class. Though it takes a few lines of extra code, that code will run more efficiently. The static calls recompile your regex each and every time. In fact, Java provides static calls for only a few very basic regex tasks. A Pattern object only stores a compiled regular expression; it does not do any actual work. The actual regex matching is done by the Matcher class. To create a Matcher, call the

result in an array of nine strings: I●like●, <b>, bold, </b>, ●and●, <i>, italic, </i>, and ●fonts.

Solution

C#

You can use the static call when you process only a small number of strings with the same regular expression:

    string[] splitArray = Regex.Split(subjectString, "(<[^<>]*>)");

Construct a Regex object if you want to use the same regular expression with a large number of strings:

    Regex regexObj = new Regex("(<[^<>]*>)");
    string[]
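For comparison, the same idea can be sketched in Python, where `re.split` keeps the text matched by a capturing group in the returned list. The subject string below is an assumption inferred from the nine-string result in the excerpt. The names `TAG` and `split_keeping_tags` are hypothetical.

```python
import re

# Same regex as in the C# listing: a capturing group around one tag.
TAG = re.compile(r"(<[^<>]*>)")


def split_keeping_tags(subject: str) -> list[str]:
    """Split on tags, keeping each tag as its own list item."""
    return TAG.split(subject)


# Assumed subject string, inferred from the result listed in the excerpt.
parts = split_keeping_tags("I like <b>bold</b> and <i>italic</i> fonts")
print(parts)
```

Without the capturing parentheses, `re.split` would drop the tags and return only the text between them.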
