Re: Determining possible encodings of a given text

by richard@[EMAIL PROTECTED] (Richard Tobin) May 6, 2008 at 11:14 AM

In article <e065097e-3238-459f-8b2d-f432210000a7@[EMAIL PROTECTED]>,
Nordlöw <per.nordlow@[EMAIL PROTECTED]> wrote:
>How do I efficiently determine which possible encoding(s) a given text
>is in? Can I use the iconv.h api somehow?

What do you need to know?

If it doesn't contain any bytes above 127, it's probably ASCII.  If it
contains lots of zeros in the even or odd positions it's probably
UTF-16.  If it contains bytes above 127 *and* they're consistent with
UTF-8, then it's almost certainly UTF-8.  If it contains a small
proportion of bytes above 127, it's quite likely some ISO-Latin-N
encoding.  I don't know much about Far Eastern encodings.
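
The rules of thumb above could be sketched in C roughly as follows. This is
only an illustration, not a tested detector: the names guess_encoding,
is_valid_utf8 and enum enc_guess are made up for this example, and the
thresholds (more than a quarter of all bytes being zeros at one parity for
UTF-16, under 30% high bytes for ISO-Latin-N) are assumptions chosen to make
"lots" and "a small proportion" concrete. The UTF-16 test runs first, because
UTF-16 text that is mostly ASCII characters has no bytes above 127 either.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type for this sketch. */
enum enc_guess { ENC_ASCII, ENC_UTF16, ENC_UTF8, ENC_LATIN_N, ENC_UNKNOWN };

/* Structural UTF-8 check: valid lead bytes followed by the right number of
   continuation bytes. It does not reject every overlong form or surrogate. */
static int is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = buf[i];
        size_t extra;
        if (c < 0x80)                    extra = 0;
        else if (c >= 0xC2 && c <= 0xDF) extra = 1;
        else if (c >= 0xE0 && c <= 0xEF) extra = 2;
        else if (c >= 0xF0 && c <= 0xF4) extra = 3;
        else return 0;                   /* stray continuation or bad lead */
        if (extra > len - i - 1)
            return 0;                    /* sequence truncated at end */
        for (size_t k = 1; k <= extra; k++)
            if ((buf[i + k] & 0xC0) != 0x80)
                return 0;
        i += extra + 1;
    }
    return 1;
}

enum enc_guess guess_encoding(const unsigned char *buf, size_t len)
{
    size_t high = 0, zero_even = 0, zero_odd = 0;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] > 127)
            high++;
        if (buf[i] == 0) {
            if (i % 2 == 0) zero_even++;
            else            zero_odd++;
        }
    }
    if (len == 0)
        return ENC_UNKNOWN;
    /* Assumed threshold: zeros fill more than half of one parity. */
    if (zero_even * 4 > len || zero_odd * 4 > len)
        return ENC_UTF16;
    if (high == 0)
        return ENC_ASCII;
    if (is_valid_utf8(buf, len))
        return ENC_UTF8;
    /* Assumed threshold: fewer than 30% of bytes are above 127. */
    if (high * 10 < len * 3)
        return ENC_LATIN_N;
    return ENC_UNKNOWN;
}
```

For example, the bytes of "Nordl", 0xF6, "w" (an ISO-8859-1 "Nordlöw") are not
valid UTF-8 and have one high byte in seven, so this sketch reports
ENC_LATIN_N; the same name with 0xC3 0xB6 in place of 0xF6 reports ENC_UTF8.
A guess like this says nothing about which ISO-Latin-N it is.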

You might look at http://jchardet.sourceforge.net/

-- Richard
-- 
:wq
 




