
Re: Encoding conversion problem

by Silvio Bierman <sbierman@[EMAIL PROTECTED] > Feb 14, 2008 at 09:44 PM

Andrea wrote:
> Hi everyone,
> sorry for my previous double-post (a mistake).
> 
>> Is it possible to ask the database driver to do the conversions for
>> you?  Perhaps internally it is Unicode or some other encoding that can
>> deal with Euros.
> I've checked the properties of the JDBC driver I use
> (http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/ad/rjvdsprp.htm)
> but there's nothing concerning encoding conversions.
> 
>> We have the clue that C++ programs seem to store euros and get them
>> back out.
> Yes we have C and COBOL programs that can store and write non-IBM850
> chars without problems too.
> As pointed out by Sabine in her post the reason may be that C programs
> work with the pure sequences of bytes, without performing any encoding
> conversion.
> 
>>> I do not really understand why a Euro sign would work with 8859-1
>>> since that does not contain that character as far as I am aware of.
> 
> SORRY SORRY SORRY SORRY SORRY
> I tried to insert (through JDBC) the EURO character in a DB2
> configured with
> ...
>  Database territory    = C
>  Database code page    = 819
>  Database code set     = ISO8859-1
> ...
> and I can neither write nor read the EURO character correctly
> in Java :-(
> A COBOL program, by contrast, works correctly.
> 
> Then I tried the same thing on a SQL-Server 2000 instance with
> collation compatibility_51_409_30003 (corresponding to a 1252 codepage,
> i.e. Latin 1) and I can store and read the EURO character via
> Java&JDBC.
> 
> That doesn't work in Java with Oracle 10g configured with
> ...
> NLS_LANGUAGE         = AMERICAN
> NLS_TERRITORY        = AMERICA
> NLS_CHARACTERSET     = US7ASCII
> NLS_LENGTH_SEMANTICS = BYTE
> ...
> store&read through COBOL is ok, and in Java I can even write&read
> accented vowels... even if those characters are outside US7ASCII...
> 
>> You could do an experiment.  Try feeding your database all possible
>> unicode chars in a set of 1-char records, and see which ones come back
>> unmangled.  This is a kludge, but you could preconvert your Euro to
>> one of those invariant unused chars.
> The EURO character is just an example and part of the problem; I can't
> use this kind of kludge.
> The specific problem is much more complex: a password is encrypted and
> stored in the DB by a C program, but the encrypted chars fall outside the
> IBM850 range and in Java I'm unable to read and decrypt the
> string... this works if the database is ISO-8859-1 (that's why I
> thought I was able to write another 'weird' char, the euro char, to an
> ISO-8859-1 DB, sorry...). I also have the more general problem of data
> entry: I don't know which characters users will insert, so I can't
> substitute chars.
> I've found a workaround for my encryption problem, but I'm just trying to
> understand the reason for the problem.
> 
> Now it's clear to me that with a CHAR field Java performs an encoding
> conversion using the encodings of the JVM and of the DBMS: if some
> characters fall outside the destination encoding then they are lost
> (i.e. converted into something completely different).
> The only 'mysterious' thing for me now is the behavior on Oracle (JDBC
> can read&write accented vowels even if they are outside ascii7)... any
> idea? Maybe the Oracle driver is smarter than the DB2 Universal
> Driver...
> 
> Thanks everyone,
> Andrea


Hello Andrea,

Even if you set a database encoding to ASCII, it is very unlikely that
the DB will strip non-ASCII characters. Actually, most databases treat
every byte-sized (i.e. 8-bit) encoding almost identically internally. They
may sometimes have different default collations, but that is about it.
The codepage attribute is mostly important for programs interfacing with
the DB. As most of those (especially older ones) are encoding-unaware,
bytes also pass in and out unharmed. In the end, all 8-bit encodings are
equal until actually interpreted to represent characters, aren't they?
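
To illustrate that last point, here is a minimal Java sketch (the class
and method names are made up for this example, and it assumes the JVM
ships the windows-1252 charset, which common JDKs do). The stored byte
is identical in both cases; only the charset used to interpret it
decides which character you get:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class, not part of any driver or library.
class SameByteDifferentChar {
    // Decode a single byte under the given charset and return the code point.
    static int decode(byte b, Charset cs) {
        return new String(new byte[] { b }, cs).codePointAt(0);
    }

    public static void main(String[] args) {
        byte raw = (byte) 0x80; // one stored byte, no encoding attached yet

        Charset cp1252 = Charset.forName("windows-1252");
        // Under windows-1252 this byte is the euro sign (U+20AC).
        System.out.printf("windows-1252: U+%04X%n", decode(raw, cp1252));
        // Under ISO-8859-1 the same byte is the C1 control character U+0080.
        System.out.printf("ISO-8859-1:   U+%04X%n",
                decode(raw, StandardCharsets.ISO_8859_1));
    }
}
```

A program that never decodes the byte (typical for C or COBOL clients)
simply gets 0x80 back, which would explain why those keep working.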

I have seen applications running on cp-1252 platforms using 8859-1
encoded databases for years without anyone noticing. Same for cp-1257 on
a cp-1252 database. Nobody really cares as long as the same data that was
put in comes out again.
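
The "same data comes out again" property breaks once something in the
path really converts between bytes and characters, as Java does. A rough
sketch of that effect using only the JDK (no database or JDBC driver
involved; the class and helper names are invented for illustration):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; it only mimics a decode/encode hop.
class RoundTripCheck {
    // Decode bytes to a String and encode them back with the same charset,
    // then report whether the original bytes survived.
    static boolean survives(byte[] data, Charset cs) {
        byte[] back = new String(data, cs).getBytes(cs);
        return Arrays.equals(data, back);
    }

    // All 256 possible byte values, e.g. what encrypted data may contain.
    static byte[] allBytes() {
        byte[] data = new byte[256];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        return data;
    }

    public static void main(String[] args) {
        byte[] data = allBytes();
        // ISO-8859-1 maps every byte to a character and back: lossless.
        System.out.println(survives(data, StandardCharsets.ISO_8859_1)); // true
        // US-ASCII has no characters for bytes >= 0x80: they are replaced.
        System.out.println(survives(data, StandardCharsets.US_ASCII));   // false

        // Encoding a character the target charset lacks yields '?' (0x3F).
        byte[] euro = "\u20AC".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(euro.length == 1 && euro[0] == 0x3F);         // true
    }
}
```

So arbitrary bytes such as an encrypted password only survive a
converting client if every byte value has a character in the charsets
involved, which is the case for ISO-8859-1 but not for 7-bit ASCII.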

This is not unlike SMTP, which is supposed to be 7-bit only, but since the
transport encoding passes 8-bit characters freely, people are used to
sending non-ASCII characters in plain-text emails although this is not
supported. This all works great until someone from Lithuania sends me an
email (I am in the Netherlands).

Regards,

Silvio
 




