Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
HomeAnnouncementsWhite Papers
Discussion GroupsFirst AidDatabasesJavaBeansGUIJava 3DVirtual MachineCORBASecurityToolsGeneral
Java DirectoryOpen Source ProjectsSample Book ChaptersUser GroupsWeb Resources
Related Topics
Databases.NETMore Topics ...

Java Forum / Databases / August 2007

Tip: Looking for answers? Try searching our database.

INSERTINg a Clob in a portable way

Thread view: 
Thomas Kellerer - 14 Aug 2007 18:29 GMT
Hello,

I'm trying to insert a values for a CLOB column in a portable way. So far using
PreparedStatement.setCharacterStream() together with a Reader to read the CLOB
contents from a local file was working quite well (with various JDBC drivers).

Now I stumbled across a bug in my code that raised it ugly head when using
Apache Derby.

When reading the contents of a file with a multi-byte encoding I can't seem to
find a way to pass the correct length of the data into the setCharacterStream()
method.

What I did so far:

File f = new File("sourcfile.txt");
Reader r = new InputStreamReader(new FileInputStream(f), "UTF-8");
preparedStatement.setCharacterStream(1, r, f.length());

which worked fine for most JDBC drivers except for Derby as Derby (rightfully)
complains that the value supplied (f.length()) does not match the number of
characters read from the stream.

I can't image a way where I can supply the correct length when using a Reader to
supply the Clob content. If I was able to "create" Clob instance in a generic
way, I think this could solve the problem as the Clob interface contains a
setCharacterStream() which does not require a length parameter.
But how would I create the Clob instance if I don't know which driver will be
used at runtime?

Any input appreciated.

Thanks in advance
Thomas
joeNOSPAM@BEA.com - 14 Aug 2007 20:26 GMT
On Aug 14, 10:29 am, Thomas Kellerer <FJIFALSDG...@spammotel.com>
wrote:
> Hello,
>
[quoted text clipped - 30 lines]
> Thanks in advance
> Thomas

Hi. How about trying setBinaryStream() and send the data as raw bytes?
Joe
Thomas Kellerer - 14 Aug 2007 20:39 GMT
joeNOSPAM@BEA.com wrote on 14.08.2007 21:26:
>> What I did so far:
>>
[quoted text clipped - 4 lines]
> Hi. How about trying setBinaryStream() and send the data as raw bytes?
> Joe

Hi Joe,

thanks for the answer, but this would only work if the file's encoding was the
same as the database's encoding.

Thomas
joeNOSPAM@BEA.com - 14 Aug 2007 22:40 GMT
On Aug 14, 12:39 pm, Thomas Kellerer <FJIFALSDG...@spammotel.com>
wrote:
> joeNOS...@BEA.com wrote on 14.08.2007 21:26:>> What I did so far:
>
[quoted text clipped - 11 lines]
>
> Thomas

Well, yes, but I would separate the issues. If you have an encoding
that needs
conversion, do it explicitly at the client, and then store and send
the resulting
data as bytes.
Thomas Kellerer - 14 Aug 2007 22:59 GMT
joeNOSPAM@BEA.com wrote on 14.08.2007 23:40:
> On Aug 14, 12:39 pm, Thomas Kellerer <FJIFALSDG...@spammotel.com>
> wrote:
[quoted text clipped - 17 lines]
> the resulting
> data as bytes.

Which is exactly what the driver should be doing in setCharacterStream() :)

The problem is, as this is a generic routine, I would first need to find out the
encoding of the database (or the table, or the column). Not an easy task if you
don't know which DBMS and driver will be used.

But even if I could get that to work, I would still need to encode the full
file, because the number of bytes (in the binary representation) of the target
encoding might be different than those in the original encoding.

So I guess my best bet is to test-read the file to get the real length in
characters (using e.g. BufferedReader.skip()) before calling
setCharacterStream(). It does work, but there is obviously an overhead involved
(especially on large files).

Thomas
Frank Meyer - 15 Aug 2007 13:04 GMT
> File f = new File("sourcfile.txt");
> Reader r = new InputStreamReader(new FileInputStream(f), "UTF-8");
> preparedStatement.setCharacterStream(1, r, f.length());

Hi Thomas,

have a look at
http://java.sun.com/mailers/techtips/corejava/2007/tt0207.html#1
where this issue may be clarified.

Frank
Thomas Kellerer - 15 Aug 2007 13:37 GMT
Frank Meyer wrote on 15.08.2007 14:04:
>> File f = new File("sourcfile.txt");
>> Reader r = new InputStreamReader(new FileInputStream(f), "UTF-8");
[quoted text clipped - 5 lines]
> http://java.sun.com/mailers/techtips/corejava/2007/tt0207.html#1
> where this issue may be clarified.

Hi Frank

thanks for the answer, but I don't understand the relation to my question.

If I understand the article correctly, it describes how I can remove (normalize)
problematic characters from a String. But in my case I need to store the
contents of a text file un-altered into a database CLOB field. But to do that I
need to know the number of characters according to a given encoding in a file,
without reading the file into memory.
I can't seem to get the relation to normalizing a text input here...

Thomas
Philipp Taprogge - 15 Aug 2007 22:20 GMT
Hi!

> But in my case I need to store the contents of a text file un-altered into a database CLOB
> field. But to do that I need to know the number of characters according
> to a given encoding in a file, without reading the file into memory.

Hmm... I'd say, you don't want to store "text" at all. What you are trying to
do is store arbitrary data, regardless of it's encoding, in the database and
let the client worry about producing a readable representation.
I'd say that's exactly what a BLOB is for. You should store the file as binary
data, possibly detecting and storing it's encoding alongside and then when
reading it from the DB again, produce appropriate output from it.
A CLOB would only make sense if you wanted the database to do anything "texty"
with that data. But if you can't choose and stick to a certain encoding
beforehand, that would be extremely difficult.

Am I missing something...?

Regards,

    Phil
Thomas Kellerer - 16 Aug 2007 07:54 GMT
Hello Philipp,

> Hi!
>
[quoted text clipped - 13 lines]
>
> Am I missing something...?

Yes and no :)

This is a generic SQL GUI where I support uploading of text files into a
CLOB field (BLOBs are not problem at all). To give you an idea, the
syntax is:

INSERT INTO some_table (col1, clob_col)
VALUES
(1, {$clobfile='test.txt' encoding='UTF-8'});

This is an "extension" so that the end-user can handle file "uploads"
from the client into LOB fields (a similar extension without encoding is
available using {$blobfile=}

So I have no control whatsoever about the data model or the JDBC driver
used in this context. As I said in my initial posting, all (major)
drivers except Derby seem to be able to handle setCharacterStream() with
a length that is bigger than the actual number of characters.

"Counting" the number of characters in the input file using a Reader
does work, and the overhead for a single INSERT isn't that problematic
(not even for larger files). But when it comes to bulk uploads this
might be a problem. And as my current implementation does not comply
with the JDBC definition, I'm trying to find a portable and correct way
to implement it.

Regards
Thomas


Free Magazines

Get these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...

Oracle MagazineNetwork ComputingComputer WorldBio-IT WorldeWeekInformation WeekInfosecurity
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.