Subject: Re: Favorite REs? - DN [1]
jeff_hobbs@my-dejanews.com - 19 Apr 1999 - comp.lang.tcl
In article <3717CE4B.CFF699DA@pinebush.com>,
Christopher Nelson <chris@pinebush.com> wrote:
> I'm looking for example regular expressions for my book and I'm not feeling
> very creative today. Got any favorite REs you care to share?
Oh, this could be a very long list, but I'll give some pointers for
exploring regexp in depth, so that you can give it better coverage than
any book has so far, plus a couple of favorites.
Starting with the favorites, I like:
    set truth {^(1|yes|true|on)$}
    regexp -nocase $truth $val
I usually use this for setting vals or in expr's, because expr's don't
allow the Tcl boolean keywords that are allowed a lot of other places.
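
To make that concrete, here is a minimal sketch of how the pattern might be
wrapped up for use inside an expr. The proc name is_true and the variables
verbose and level are made up for illustration, they are not from any
library:

```tcl
# Hypothetical helper (the name is an assumption, not a standard command):
# returns 1 if val looks like a "true" keyword, 0 otherwise.
proc is_true {val} {
    set truth {^(1|yes|true|on)$}
    return [regexp -nocase $truth $val]
}

# Use the 0/1 result inside an expr instead of the raw keyword.
set verbose Yes
set level [expr {[is_true $verbose] ? 2 : 0}]
```

Because the pattern is anchored at both ends, something like "yesterday"
does not count as true.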
I also like to know whether a value qualifies as a certain type (mostly
with regexp). This is from a validation function, where $type, $val, $opt
and $strict are set up before the switch:
    switch -glob -- $type {
        alphab* { # alphabetic
            return [regexp -nocase "^\[a-z\]$opt\$" $val]
        }
        alphan* { # alphanumeric
            return [regexp -nocase "^\[a-z0-9\]$opt\$" $val]
        }
        b* { # boolean - would be nice if it were more than 0/1
            return [regexp "^\[01\]$opt\$" $val]
        }
        d* { # date - always strict
            return [expr {![catch {clock scan $val}]}]
        }
        h* { # hexadecimal
            return [regexp -nocase "^(0x)?\[0-9a-f\]$opt\$" $val]
        }
        i* { # integer
            return [regexp "^\[-+\]?\[0-9\]$opt\$" $val]
        }
        n* { # numeric
            return [regexp "^\[0-9\]$opt\$" $val]
        }
        rea* { # real
            return [regexp -nocase [expr {$strict
                ?{^[-+]?([0-9]+\.?[0-9]*|[0-9]*\.?[0-9]+)(e[-+]?[0-9]+)?$}
                :{^[-+]?[0-9]*\.?[0-9]*([0-9]\.?e[-+]?[0-9]*)?$}}] $val]
        }
        reg* { # regexp
            return [expr {![catch {regexp $val {}}]}]
        }
        val* { # value, any valid number type
            return [expr {![catch {expr {0+$val}}]}]
        }
        l* { # list
            return [expr {![catch {llength $val}]}]
        }
        w* { # widget
            return [winfo exists $val]
        }
        default {
            return -code error "bad [lindex [info level 0] 0] type \"$type\":\
                \nmust be [join [lsort {alphabetic alphanumeric date \
                hexadecimal integer numeric real regexp value \
                list boolean widget}] {, }]"
        }
    }
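
The switch above leans on $opt and $strict from the enclosing function,
which isn't shown. As a rough sketch only, here is one way such a wrapper
might look. The proc name validate, its argument order, and the choice of
"+" for strict and "*" for non-strict are my assumptions for illustration,
not the original function, and only two of the branches are repeated:

```tcl
# Hypothetical wrapper: the name, signature and the +/* choice for opt
# are assumptions made for this sketch.
proc validate {type val {strict 1}} {
    # strict: at least one character required; otherwise empty is allowed,
    # which suits checking partial input as it is being typed.
    set opt [expr {$strict ? "+" : "*"}]
    switch -glob -- $type {
        h* { # hexadecimal
            return [regexp -nocase "^(0x)?\[0-9a-f\]$opt\$" $val]
        }
        i* { # integer
            return [regexp "^\[-+\]?\[0-9\]$opt\$" $val]
        }
        default {
            return -code error "bad type \"$type\""
        }
    }
}

set ok_int   [validate integer -42]      ;# strict integer check
set ok_empty [validate integer "" 0]     ;# non-strict, empty string
set ok_hex   [validate hex 0xFF]
```

Note how the brackets and the trailing $ have to be backslashed because
the pattern is in double quotes so that $opt gets substituted.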
Aside from that, there are certain things that people should understand
to make best use of regexps (and, of course, regsubs). Since there is
(still) no eval switch for regexps, you should describe how the eval/subst
of a regsub result is done, and the necessary escaping to avoid problems.
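
A minimal sketch of that regsub-then-subst idiom, using decoding of %XX
escapes as the example task. The proc name url_decode is made up for this
illustration, and the {2} bound needs the 8.1 regexp engine:

```tcl
# Hypothetical helper (name is an assumption): decode %XX hex escapes.
proc url_decode {str} {
    # 1. Backslash-escape everything subst would otherwise act on:
    #    brackets, backslashes and dollar signs in the original data.
    regsub -all {[][\\$]} $str {\\&} str
    # 2. Turn each %XX into a bracketed command for subst to run.
    regsub -all {%([0-9a-fA-F]{2})} $str {[format %c 0x\1]} str
    # 3. Let subst run the generated commands and undo the escaping.
    return [subst $str]
}

set plain  [url_decode {a%20b}]
set nasty  [url_decode {[exit]%41$x}]
```

Without step 1, the [exit] in the second call would be executed by subst
as a command, which is exactly the problem the escaping is there to avoid.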
Also, looking at 8.1 regexps (soooo much better than 8.0), you'll note
the benefits of the new \ sequences. The metasyntax is also a boon, but
not always where one thinks. For example, the (?i) says do a case
insensitive match. With regexp, you have always had -nocase, but since
the same regexp engine is used throughout Tcl (which you should note),
you can now take advantage of this in commands like switch and lsearch.
Basically, the matching switches for regexp (-nocase, -expanded, -line
and friends) can now be placed in the metasyntax, and the metasyntax can
be used wherever the regexp engine is in place.
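
For instance, a small sketch of (?i) doing the job of -nocase inside
switch -regexp and lsearch -regexp. The proc name parse_bool and the
sample list are invented for the example:

```tcl
# Hypothetical helper (name is an assumption): map keywords to 0/1 using
# case-insensitive patterns embedded via (?i).
proc parse_bool {answer} {
    switch -regexp -- $answer {
        {(?i)^(1|yes|true|on)$}  { return 1 }
        {(?i)^(0|no|false|off)$} { return 0 }
        default { return -code error "not a boolean: \"$answer\"" }
    }
}

set t [parse_bool YES]
set f [parse_bool Off]

# The same embedded option works in lsearch -regexp.
set idx [lsearch -regexp {alpha Beta GAMMA} {(?i)^gamma$}]
```

The (?i) has to come at the very start of the pattern to be recognized
as an embedded option.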
Newline handling is also always a bit fruity, so you might want to go
into that a bit.
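
A short sketch of the main surprises, assuming a reasonably recent Tcl
(the -all and -inline switches used here to collect matches arrived after
8.1). The sample text and variable names are made up:

```tcl
set text "one two\nthree four"

# By default ^ anchors only at the very start of the string.
set first [regexp -all -inline {^\w+} $text]

# With -line, ^ and $ also match at the start and end of each line.
set each [regexp -all -inline -line {^\w+} $text]

# The embedded option (?n) asks for the same newline-sensitive matching.
set each_n [regexp -all -inline {(?n)^\w+} $text]

# By default . happily matches a newline; with -line it does not.
set dot_default [regexp {two.three} $text]
set dot_line    [regexp -line {two.three} $text]
```

-line is shorthand for -linestop plus -lineanchor, so the two behaviours
can also be switched on separately.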
The {} bound matching (as in {m,n}) is new, and helpful for some, so don't forget that.
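
A couple of made-up patterns to show the bound syntax, exact count and
min,max range (the variable names are just for the example):

```tcl
# Exactly three digits, a dash, exactly four digits.
set phone {^[0-9]{3}-[0-9]{4}$}
set p1 [regexp $phone 555-1234]
set p2 [regexp $phone 55-1234]

# Between two and four lowercase letters.
set short {^[a-z]{2,4}$}
set s1 [regexp $short abc]
set s2 [regexp $short abcde]
```

Keep such patterns in braces, otherwise Tcl will try to do its own
substitutions on the pattern before regexp sees it.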
Also, greedy versus non-greedy has tripped up many, and is especially
good to understand for HTML-like parsing.
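
A small sketch of the difference on an HTML-ish string. The sample markup
and variable names are invented, and the non-greedy *? needs the 8.1
engine:

```tcl
set html {<b>one</b> and <b>two</b>}

# Greedy: .* grabs as much as it can, running on to the last </b>.
regexp {<b>(.*)</b>} $html -> greedy

# Non-greedy: .*? stops at the first </b> it can.
regexp {<b>(.*?)</b>} $html -> lazy

# The old 8.0-style workaround: a negated class instead of a dot.
regexp {<b>([^<]*)</b>} $html -> negated
```

Here greedy ends up holding "one</b> and <b>two", while lazy and negated
both hold just "one".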
There's more I'm sure, but that's all off the top of my head for now.
--
jeff.hobbs @SPAM acm.org
jeffrey.hobbs @SPAM icn.siemens.de
Last modified
1999-09-27