Re: [OT] collating sequences: using glibc

by pk <pk@[EMAIL PROTECTED] > May 9, 2008 at 10:32 AM

On Friday 9 May 2008 08:51, Steffen Schuler wrote:

> Perhaps the output of the glibc functions iswalpha and wprintf, combined
> with GNU sort, better explains how the alphabetic characters collate in
> the 3 locales C, en_US, en_US.utf8 (my current system: Debian
> GNU/Linux testing (lenny)):
> 
> $ cat collate.c
> #include <stdio.h>
> #include <wchar.h>
> #include <stdlib.h>
> 
> int
> main(int argc, char **argv)
> {
>     wint_t i;
> 
>     if (argc != 2) {
>         fprintf(stderr, "usage: ./collate MAX_CHAR_CODE\n");
>         exit(1);
>     }
> 
>     for (i = 1; i <= atoi(argv[1]); ++i)
>         if (iswalpha(i))
>             wprintf(L"%c\n", i);
>     return 0;
> }
> $ cc -o collate collate.c
> $ export LC_ALL=C; ./collate 255 | sort | tr -d '\n'; echo
> ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
> $ export LC_ALL=en_US; ./collate 255 | sort | tr -d '\n'; echo
> aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ
> $ # ^^^^^^^...: only Latin letters
> $ export LC_ALL=en_US.UTF-8; ./collate 65535 | sort | tr -d '\n'; echo
> aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ
> $
> 
> 
> There seem to be some inconsistencies between gawk and glibc locale
> usage. Especially in the C-code sample only Latin letters are listed for
> LC_ALL=en_US contrary to the similar gawk sample.

Yes, I noticed that, but I could not explain it.
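
A possible explanation, which nobody in the thread confirmed: the quoted program never calls setlocale(), and a C program starts in the "C" locale no matter what LC_ALL says. iswalpha() would then only ever accept the ASCII letters, while sort does call setlocale() and so orders them per locale. The sketch below isolates the classification step. The helper name count_alpha is made up for illustration. It also includes <wctype.h>, which declares iswalpha().

```c
#include <locale.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper, not from the thread.
 * Select the locale `name` ("" means: take it from the environment),
 * then count the codes in 1..max_code that iswalpha() accepts.
 * Returns -1 if the requested locale cannot be selected. */
long count_alpha(const char *name, long max_code)
{
    long n = 0;
    long c;

    if (setlocale(LC_ALL, name) == NULL)
        return -1;

    for (c = 1; c <= max_code; ++c)
        if (iswalpha((wint_t)c))
            ++n;
    return n;
}
```

With glibc, count_alpha("C", 255) should match the 52 letters printed above, and count_alpha("en_US.UTF-8", 65535) should be much larger if that locale is installed. In the quoted program, the equivalent change would be a setlocale(LC_ALL, "") call at the top of main(), plus printing with %lc instead of %c.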

> Besides, the collating order may be different from the order of the
> character codes --- see:
> 
>     http://tinyurl.com/6gdraw

This is something that can immediately be noticed by looking at the en_US
output above.
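
To see the same thing from C rather than through sort, here is a minimal sketch comparing strcmp() (raw character codes) with strcoll() (the LC_COLLATE order of the selected locale). The function names code_order and collate_order are made up for illustration, and whether a locale such as en_US.UTF-8 is installed on a given system is an assumption.

```c
#include <locale.h>
#include <string.h>

/* Hypothetical helpers, not from the thread. */

/* Sign (-1, 0, 1) of comparing a and b by raw character codes. */
int code_order(const char *a, const char *b)
{
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);
}

/* Sign (-1, 0, 1) of comparing a and b under the collation rules of
 * locale `name`; returns 2 if that locale cannot be selected. */
int collate_order(const char *name, const char *a, const char *b)
{
    int r;

    if (setlocale(LC_COLLATE, name) == NULL)
        return 2;
    r = strcoll(a, b);
    return (r > 0) - (r < 0);
}
```

In the C locale both functions agree, so "B" sorts before "a". Under a glibc en_US locale, collate_order("en_US.UTF-8", "B", "a") would be expected to put "a" first, in line with the sort output quoted above.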

> I'm really interested in whether there are any further locale bugs in
> gawk (so that I can try to fix them).
> Perhaps I will have more time for the research this weekend.

I'm not sure I can follow you on this, mostly due to my lack of in-depth
libc/wchar knowledge. However, I'll keep this program and come back to it
when my understanding of the subject is better.

Thanks!

-- 
All the commands are tested with bash and GNU tools, so they may use
nonstandard features. I try to mention when something is nonstandard (if
I'm aware of that), but I may miss something. Corrections are welcome.
