Andrea wrote:
> > > ...
> > > If I save characters outside the range supported by IBM-850 (i.e. the
> > > euro currency character EURO) then I read garbage...
> >
> > Yes, the Euro symbol is not part of the encodings, so your database
> > can't contain it.
> I've found a strange thing: C and COBOL application can write and read
> (using embedded SQL) characters outside the accepted range without
> problems... So the database can contain those characters without
> losing any information, but I can't understand how...
>
Yes, in theory you can store any value (0 - 255 in case of one byte
strings) in a string, but how that is interpreted (i.e. encoding) is
where it gets hairy. Also, multibyte characters would break the
interpretation.
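
To make the "interpretation" point concrete, here is a small Python sketch (Python only because it ships with these code pages; it is not what your application uses). It decodes one and the same stored byte under three encodings, then checks whether the Euro sign can be encoded in IBM-850 at all. The variable names are made up for the illustration.

```python
# One stored byte, three different readings. The database only keeps the
# byte; which character it "is" depends on the code page used to decode it.
raw = bytes([0xA4])

as_cp850 = raw.decode("cp850")        # IBM-850
as_latin1 = raw.decode("iso8859_1")   # ISO-8859-1
as_latin9 = raw.decode("iso8859_15")  # ISO-8859-15

print(repr(as_cp850), repr(as_latin1), repr(as_latin9))

# The Euro sign has no slot in IBM-850, so a strict encode is rejected.
try:
    "\u20ac".encode("cp850")
    euro_fits_cp850 = True
except UnicodeEncodeError:
    euro_fits_cp850 = False

print("Euro encodable in IBM-850:", euro_fits_cp850)
```

The byte never changes; only the table used to read it does. That is why the same column can look fine to one client and like garbage to another.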
> > If you need it, you would have to change the database's
> > encoding (ISO-8859-15 includes the Euro symbol).
> > Otherwise, you have to take care not to try to write unsupported
> > characters into string/character fields.
> >
> > One solution could be to parse all strings and replace the symbol with
> > the shorthand "EUR", but it might not be acceptable to your client.
> Actually the EURO character is just an example, I have more complex
> strings to handle (and I can't change the encoding of the database).
> If my problem has no solution at all then I'd like to understand why
> other languages don't have this problem...
>
Ah, there are always hacks around limitations. But they aren't usually
pretty. The problem is to funnel a string with these "unsupported"
characters through the JDBC driver (both ways).
You might get around it by using typeless fields (you can put any byte
sequence there), like BLOBs maybe...
Or you write a parser that substitutes the impossible characters with
acceptable replacements. Of course, this is most likely not feasible.
But the customer has to be aware that a database with encoding X can
only hold strings encoded in X. If they need UTF-8 for example now, they
will eventually have to change their database. And it would be better to
migrate to a suitable encoding than to hack around it and in a few
years, have to do it all over again (and then some), when they finally do
want to change the database encoding.
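
For what it's worth, the substitution hack mentioned above could look roughly like this. It is only a sketch in Python: the replacement table, the function name `to_cp850_safe` and the `?` fallback are all invented for the example, and the real list of replacements would have to be agreed with the customer.

```python
# Hypothetical replacement table: characters IBM-850 cannot hold, mapped to
# something it can. Only the Euro sign is listed here as an example.
REPLACEMENTS = {"\u20ac": "EUR"}


def to_cp850_safe(text, fallback="?"):
    """Return text with every character IBM-850 cannot encode replaced."""
    out = []
    for ch in text:
        try:
            ch.encode("cp850")  # strict: raises if the character is missing
            out.append(ch)
        except UnicodeEncodeError:
            out.append(REPLACEMENTS.get(ch, fallback))
    return "".join(out)


safe = to_cp850_safe("Pre\u00e7o: 10 \u20ac")
print(safe)
```

Note that this is one-way: once "EUR" is in the database, nothing tells you whether the user originally typed the symbol or the three letters, which is part of why it is not pretty.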
As for other languages not having the problem: in C, you can treat a
string just like an array of bytes and use those for whatever you like;
the compiler won't complain. Even interpreting them as memory addresses
is possible, adding and subtracting etc...
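
A rough illustration of the difference, again in Python. Two things here are assumptions, not facts about your setup: that the C/COBOL programs privately use some other code page (windows-1252 is picked purely as an example), and that the JDBC driver converts Unicode strings to the database code page on the way in.

```python
text = "10 \u20ac"

# C-style client: passes raw bytes through and gets the same bytes back.
# The bytes mean "Euro" only by the application's own convention.
app_bytes = text.encode("cp1252")   # assumed application code page
stored = app_bytes                  # no conversion on the way in
read_back = stored                  # no conversion on the way out
c_style_result = read_back.decode("cp1252")

# Transcoding client (assumed driver behaviour): the string is converted to
# IBM-850 first, and anything without a slot there is replaced.
driver_bytes = text.encode("cp850", errors="replace")
driver_result = driver_bytes.decode("cp850")

print(repr(c_style_result), repr(driver_result))
```

So the byte-oriented programs "work" only as long as every reader applies the same private convention; the database itself still has no idea that the byte is supposed to be a Euro sign.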
> Thanks,
> Andrea
--
Sabine Dinis Blochberger
Op3racional
www.op3racional.eu

