Subject: Re: Favorite REs? - DN [1]


jeff_hobbs@my-dejanews.com - 19 Apr 1999 - comp.lang.tcl

 In article <3717CE4B.CFF699DA@pinebush.com>,
   Christopher Nelson <chris@pinebush.com> wrote:
 > I'm looking for example regular expressions for my book and I'm not feeling
 > very creative today.  Got any favorite REs you care to share?

 Oh, this could be a very long list, but I'll offer some pointers for
 exploring regexp in depth, so you can give it better coverage than any
 book has yet, plus a couple of favorites.

 Starting with the favorites, I like:
     set truth {^(1|yes|true|on)$}
     regexp -nocase $truth $val

 I usually use this when setting values or inside expr, because expr
 doesn't accept the Tcl boolean keywords that are accepted in a lot of
 other places.
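
 A minimal sketch of how the pattern above might be wrapped for reuse.
 The proc name isTrue and the variable verbose are hypothetical and do
 not come from the original post:

```tcl
# Hypothetical helper (name is an assumption): returns 1 if val is one of
# the "true" words, in any letter case, and 0 otherwise.
proc isTrue {val} {
    set truth {^(1|yes|true|on)$}
    return [regexp -nocase $truth $val]
}

# Usage: normalise a user-supplied setting before testing it in expr/if.
set verbose Yes
if {[isTrue $verbose]} {
    puts "verbose mode on"
}
```

 Because the pattern is anchored with ^ and $, a value such as
 "yes please" is rejected rather than matched on its first word.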

 I also like to check whether a value qualifies as a certain type, mostly
 with regexp.  This is from a validation function ($opt and $strict are
 set elsewhere in that function and not shown here):

     switch -glob -- $type {
     alphab*    { # alphabetic
         return [regexp -nocase "^\[a-z\]$opt\$" $val]
     }
     alphan* { # alphanumeric
         return [regexp -nocase "^\[a-z0-9\]$opt\$" $val]
     }
     b*    { # boolean - would be nice if it were more than 0/1
         return [regexp "^\[01\]$opt\$" $val]
     }
     d*    { # date - always strict
         return [expr {![catch {clock scan $val}]}]
     }
     h*    { # hexadecimal
         return [regexp -nocase "^(0x)?\[0-9a-f\]$opt\$" $val]
     }
     i*    { # integer
         return [regexp "^\[-+\]?\[0-9\]$opt\$" $val]
     }
     n*    { # numeric
         return [regexp "^\[0-9\]$opt\$" $val]
     }
     rea*    { # real
         return [regexp -nocase [expr {$strict
         ?{^[-+]?([0-9]+\.?[0-9]*|[0-9]*\.?[0-9]+)(e[-+]?[0-9]+)?$}
         :{^[-+]?[0-9]*\.?[0-9]*([0-9]\.?e[-+]?[0-9]*)?$}}] $val]
     }
     reg*    { # regexp
         return [expr {![catch {regexp $val {}}]}]
     }
     val*    { # value, any valid number type
         return [expr {![catch {expr {0+$val}}]}]
     }
     l*    { # list
         return [expr {![catch {llength $val}]}]
     }
     w*    { # widget
         return [winfo exists $val]
     }
     default {
         return -code error "bad [lindex [info level 0] 0] type \"$type\":\
             \nmust be [join [lsort {alphabetic alphanumeric boolean date \
             hexadecimal integer list numeric real regexp value \
             widget}] {, }]"
     }
     }

 Aside from that, there are certain things that people should understand
 to make best use of regexps (and, of course, regsubs).  Since there is
 (still) no eval switch for regexps, you should describe how the eval/subst
 of a regsub result is done, and the necessary escaping to avoid problems.
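
 One common way to do the regsub-then-subst step is sketched below.  The
 proc name expandUpper and the <word> markup are made up for illustration;
 the point is the order of operations: escape first, then insert the
 command, then subst.

```tcl
# Hypothetical example: upper-case every <word> in a string by having
# regsub build a command call and subst evaluate it.
proc expandUpper {text} {
    # 1. Backslash-escape the characters subst treats specially
    #    ( [ ] $ and \ ) so that ordinary text cannot be evaluated.
    regsub -all {[][$\\]} $text {\\&} text

    # 2. Replace each <word> with a bracketed command.  The capture is
    #    limited to a-z, so it cannot smuggle in anything special.
    regsub -all {<([a-z]+)>} $text {[string toupper \1]} text

    # 3. Let subst run the inserted commands and undo the escaping.
    return [subst $text]
}

puts [expandUpper {price: $5 [sic] <hello>}]
```

 Skipping step 1 would let subst evaluate any $var or [command] that
 happens to appear in the input, which is the problem the escaping avoids.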

 Also, looking at 8.1 regexps (soooo much better than 8.0), you'll note
 the benefits of the new \ sequences.  The metasyntax is also a boon, but
 not always where one thinks.  For example, the (?i) says do a case
 insensitive match.  With regexp, you have always had -nocase, but since
 the same regexp engine is used throughout Tcl (which you should note),
 you can now take advantage of this in commands like switch and lsearch.

 Basically, the matching options you would pass to regexp as switches
 (such as -nocase) can now be placed in the metasyntax, and the metasyntax
 can be used wherever the regexp engine is in place.
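
 As a small illustration of the (?i) metasyntax outside of regexp itself,
 here is a sketch using switch -regexp and lsearch -regexp.  The proc name
 classify and the sample data are hypothetical:

```tcl
# Hypothetical helper: classify a yes/no answer regardless of letter case.
# The (?i) prefix does the job that -nocase does for the regexp command.
proc classify {answer} {
    switch -regexp -- $answer {
        {(?i)^(y|yes)$} { return yes }
        {(?i)^(n|no)$}  { return no }
        default         { return unknown }
    }
}

# The same embedded option works in lsearch -regexp.
set names {alpha Beta GAMMA}
set idx [lsearch -regexp $names {(?i)^gamma$}]

puts [classify YES]
puts $idx
```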

 Newline handling is also always a bit fruity, so you might want to go
 into that a bit.
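
 A short sketch of the newline behaviour, assuming a Tcl whose regexp
 command supports the -line switch; the embedded (?n) option is its
 metasyntax equivalent.  The variable names are illustrative only:

```tcl
set text "first line\nsecond line\nthird line"

# By default ^ and $ anchor only at the start and end of the whole string,
# so a pattern anchored to the second line does not match.
set wholeAnchor [regexp {^second} $text]

# With -line (or the embedded (?n)), ^ and $ also match at line boundaries.
set lineAnchor [regexp -line {^second} $text]
set embedded   [regexp {(?n)^second} $text]

# By default . matches a newline, so .* can span several lines ...
set dotSpans [regexp {first.*third} $text]

# ... but in newline-sensitive mode . stops at the newline.
set dotStops [regexp -line {first.*third} $text]

puts "$wholeAnchor $lineAnchor $embedded $dotSpans $dotStops"
```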

 The {} bound matching (as in {m,n}) is new, and helpful for some, so
 don't forget that.  Also, greedy versus non-greedy matching has tripped
 up many, and is especially good to understand for HTML-like parsing.
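
 A brief sketch of both points, using made-up sample strings.  The first
 part shows {} bounds; the second contrasts a greedy quantifier with its
 non-greedy form on a scrap of HTML-like text:

```tcl
# Bounds: exactly three digits, a dash, exactly four digits.
set phoneOk [regexp {^[0-9]{3}-[0-9]{4}$} 555-1234]

# Bounds with a range: two to four hex digits, so five is too many.
set hexOk [regexp {^[0-9a-f]{2,4}$} abcde]

set html {<b>bold</b> and <i>italic</i>}

# Greedy: .+ grabs as much as it can, from the first < to the last >.
regexp {<.+>} $html greedy

# Non-greedy: .+? stops at the first > that completes a match.
regexp {<.+?>} $html lazy

# Stripping the tags only works as intended with the non-greedy form.
regsub -all {<.+?>} $html {} plain

puts $greedy
puts $lazy
puts $plain
```

 For real HTML a proper parser is the safer choice; the regexp form is
 only reasonable for simple, well-behaved markup like the sample here.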

 There's more I'm sure, but that's all off the top of my head for now.

 --
     jeff.hobbs @SPAM acm.org
     jeffrey.hobbs @SPAM icn.siemens.de


Last modified
1999-09-27
