Talk About Network






Re: Search pattern for non-ASCII alphabetic characters

by Janis Papanagnou <Janis_Papanagnou@[EMAIL PROTECTED] > Feb 3, 2008 at 07:19 PM

Hermann Peifer wrote:
> Hi,
> 
> Occasionally, I'd like to search for non-ASCII alphabetic characters in 
> UTF-8 encoded text documents.
> 
> In the absence of an appropriate character class (at least I wouldn't 
> know of any), I do something like:
> 
> awk '/[ÀÁÂÃÄÅ ...and so on... ŸŹźŻżŽž]/{ action }'
> 
> This is perhaps not the smartest solution. Any better idea?
> 
> TIA. Hermann

I can't tell whether it is a smarter solution, but you could invert the
logic and negate the existing character classes:

   LANG=C  awk '/[^[:alnum:][:punct:][:blank:][:cntrl:]]/'

(Note: some regex implementations also offer an [:ascii:] class, but it is
not one of the POSIX character classes, and my GNU awk does not seem to
support it.)
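
To try the negated-class idea on a small sample, here is a minimal sketch. The function name non_ascii_lines and the sample text are made up for illustration. It uses LC_ALL=C rather than LANG=C, on the assumption that an LC_ALL or LC_CTYPE already set in the environment would otherwise take precedence over LANG. Keep in mind that the pattern flags any byte outside ASCII, so it also matches non-alphabetic characters such as typographic quotes or currency signs, and it assumes an awk that understands POSIX character classes.

```shell
# Hypothetical helper: print every line that contains at least one
# byte outside the ASCII range. In the C locale the four classes
# below are meant to cover all ASCII characters, so the negated
# bracket expression matches only the remaining high bytes.
non_ascii_lines() {
    LC_ALL=C awk '/[^[:alnum:][:punct:][:blank:][:cntrl:]]/' "$@"
}

# Sample input: one pure-ASCII line and two lines with UTF-8 letters,
# written as octal escapes so the script itself stays ASCII.
printf 'plain ascii\nna\303\257ve caf\303\251\nZ\305\274ywiec\n' | non_ascii_lines
```

With no file arguments the function reads standard input; with arguments it scans the named files, e.g. non_ascii_lines notes.txt.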

Janis




