Java Forum / General / February 2008
The simplest way to download a file from an HTTP resource that needs authentication
Andrea Francia - 08 Feb 2008 08:53 GMT
I need to write a program that downloads many files from different web sites. The web sites require basic authentication, and I want to spawn many threads, each one downloading a list of files that need different authentication credentials.
Does anyone know the simplest way to achieve this? I found many HTTP libraries, but they all seem very complex.
I studied how to use the standard Java library to achieve this, but it does not seem feasible with it.
For example, the following code downloads a file that does not require authentication.
    URL url = new URL("http://www.example.org/file.txt");
    URLConnection con = url.openConnection();

    BufferedInputStream in = new BufferedInputStream(con.getInputStream());
    OutputStream out = new FileOutputStream("C:\\file.txt");

    // Copy the response body to the file, one buffer at a time.
    int i = 0;
    byte[] bytesIn = new byte[8096];
    while ((i = in.read(bytesIn)) >= 0) {
        out.write(bytesIn, 0, i);
    }
    out.close();
    in.close();
But problems arise when you try to download a file that needs authentication in a threaded environment. To provide the authentication credentials you should use the Authenticator.setDefault() method, which is a static method and therefore not usable in a threaded environment.
I also tried embedding the username and password in the URL, but these were ignored:
    url = new URL("http://username:pass@www.example.org/file.txt");
Thanks
Lew - 08 Feb 2008 09:33 GMT
> Authenticator.setDefault() method which is a static method and therefore
> not usable in a threaded environment.
Static methods can be used in a multi-threaded program.
--
Lew
Andrea Francia - 08 Feb 2008 10:32 GMT
>> Authenticator.setDefault() method which is a static method and therefore
>> not usable in a threaded environment.
>
> Static methods can be used in a multi-threaded program.
Nooo, really?
There is a race condition. Here is an example.
We have two threads, t1 and t2, that execute the following code:
    // username and password are final so the anonymous Authenticator can capture them.
    void download(URL url, final String username, final String password) throws IOException {
        Authenticator.setDefault(new Authenticator() {
            protected PasswordAuthentication getPasswordAuthentication() {
                return new PasswordAuthentication(username, password.toCharArray());
            }
        });
        URLConnection con = url.openConnection();
        BufferedInputStream in = new BufferedInputStream(con.getInputStream());
        OutputStream out = new FileOutputStream("C:\\file.txt");

        int i = 0;
        byte[] bytesIn = new byte[8096];
        while ((i = in.read(bytesIn)) >= 0) {
            out.write(bytesIn, 0, i);
        }
        out.close();
        in.close();
    }
Each thread should download a different URL using a different username and password.
t1 uses "http://example.org/foo" as the URL and "foo", "foo" as username and password.
t2 uses "http://example2.org/bar" as the URL and "bar", "bar" as username and password.
Consider this course of events:
1. t1 starts.
2. t1 calls Authenticator.setDefault() using "foo", "foo" as username and password.
3. t1 is suspended by the scheduler.
4. t2 starts.
5. t2 calls Authenticator.setDefault() using "bar", "bar" as username and password.
6. t2 calls openConnection(), which will use the correct username and password ("bar", "bar").
7. t2 downloads the file.
8. t2 terminates.
9. t1 is resumed by the scheduler.
10. t1 calls openConnection(), but the username and password were changed from the correct values ("foo", "foo") to the values used by t2 ("bar", "bar"). The openConnection() fails.
The correctness of the program depends on the scheduling, hence there is a race condition. The race condition comes from the fact that Authenticator.setDefault() is static, and hence the same data is shared by all threads.
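The interleaving described above can be replayed deterministically on a single thread. What follows is only a sketch under stated assumptions: the class `SharedAuthenticatorDemo` and its methods `fixed` and `replay` are hypothetical names, and `Authenticator.requestPasswordAuthentication` is used merely to ask whichever default authenticator is currently installed for credentials, roughly the way an HTTP client would.

```java
import java.net.Authenticator;
import java.net.PasswordAuthentication;

// Hypothetical demo class; it replays the t1/t2 interleaving on one thread.
final class SharedAuthenticatorDemo {

    // Builds an Authenticator that always answers with the given credentials.
    static Authenticator fixed(final String user, final String password) {
        return new Authenticator() {
            @Override
            protected PasswordAuthentication getPasswordAuthentication() {
                return new PasswordAuthentication(user, password.toCharArray());
            }
        };
    }

    // Returns the user name that "t1" ends up with after "t2" ran in between.
    static String replay() {
        Authenticator.setDefault(fixed("foo", "foo")); // t1 installs its credentials
        Authenticator.setDefault(fixed("bar", "bar")); // t2 runs and overwrites them
        // t1 resumes: the process-wide default is now t2's authenticator.
        PasswordAuthentication seenByT1 = Authenticator.requestPasswordAuthentication(
                "example.org", null, 80, "http", "realm", "basic");
        return seenByT1.getUserName();
    }
}
```

Calling `SharedAuthenticatorDemo.replay()` returns "bar": there is only one default authenticator per process, so the last call to `setDefault()` wins regardless of which thread made it.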
Lew - 08 Feb 2008 15:27 GMT
>>> Authenticator.setDefault() method which is a static method and therefore
>>> not usable in a threaded environment.
>>
>> Static methods can be used in a multi-threaded program.
>>
> Nooo, really?
Yes, really.
> There is a race condition.
...
> The correctness of the program depends on the scheduling, hence there
> is a race condition. The race condition comes from the fact that
> Authenticator.setDefault() is static, and hence the same data is shared
> by all threads.
So?
Non-static methods can have race conditions, too. Deadlocks, even. There's no difference from static methods in that regard. Why do you single out static methods?
Fortunately Java has a number of lovely built-in constructs to keep threads synchronized, starting with the keyword "synchronized", which are notably absent from the example you posted.
Of course if you don't synchronize your threads, there will be trouble, static or non-static methods notwithstanding. If you think using instance methods without synchronization will solve your threading problems, you're doomed.
For an introduction to the topic, read <http://java.sun.com/docs/books/tutorial/essential/concurrency/index.html>
--
Lew
Andreas Leitgeb - 08 Feb 2008 19:26 GMT
>>>> Authenticator.setDefault() method which is a static method and therefore
>>>> not usable in a threaded environment.
>>> Static methods can be used in a multi-threaded program.
Lew, sometimes I really wonder if you aren't actually trolling.
> Non-static methods can have race conditions, too. Deadlocks, even. There's
> no difference from static methods in that regard. Why do you single out
> static methods?
It's perhaps not so much the static methods, but rather the static data that gets set by the former, and which is supposed to be specific to each thread. Having to synchronize the whole "set user-data and fetch file"-block almost voids the whole point of parallelizing the task.
Perhaps it suffices to synchronize setting the user and opening the connection, and leave the actual transfer unsynchronized, but I don't feel very comfortable that way.
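A minimal sketch of that narrower lock, with several assumptions that the thread does not confirm: the class `SynchronizedDownloader` and the lock object `AUTH_LOCK` are hypothetical, the `target` parameter is added for illustration, and the code assumes the credentials are requested while `getInputStream()` performs the request rather than in `openConnection()`, which is why that call is kept inside the lock. An HTTP implementation may also cache credentials internally, so this shows the locking idea only, not a verified fix.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Authenticator;
import java.net.PasswordAuthentication;
import java.net.URL;

// Hypothetical helper, not taken from the thread.
final class SynchronizedDownloader {

    // One lock for every caller, because Authenticator.setDefault() is process-wide.
    private static final Object AUTH_LOCK = new Object();

    static void download(URL url, final String username, final String password, File target)
            throws IOException {
        InputStream in;
        synchronized (AUTH_LOCK) {
            Authenticator.setDefault(new Authenticator() {
                @Override
                protected PasswordAuthentication getPasswordAuthentication() {
                    return new PasswordAuthentication(username, password.toCharArray());
                }
            });
            // Assumption: the authenticator is consulted while the request is
            // made, so getInputStream() must still be inside the lock.
            in = url.openConnection().getInputStream();
        }
        // The transfer itself runs outside the lock, so downloads can overlap.
        try (InputStream src = in; OutputStream out = new FileOutputStream(target)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = src.read(buffer)) >= 0) {
                out.write(buffer, 0, n);
            }
        }
    }
}
```

The trade-off is the one raised above: connection setup is serialized across all threads, and only the body transfer runs in parallel.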
Lew - 09 Feb 2008 00:18 GMT
>>>>> Authenticator.setDefault() method which is a static method and therefore
>>>>> not usable in a threaded environment.
[quoted text clipped - 14 lines]
> connection, and leave the actual transfer unsynchronized, but I don't
> feel very comfortable that way.
Andrea felt the same way, but I really don't understand the reaction. It is true that static methods can be used in a multi-threaded program. The statement to the contrary was not correct, and it is normal in Usenet to set the record straight.
There are any number of programs that find it useful or convenient to share static data and methods among threads. I stuck to the technical facts, and provided correct information that should be useful to the OP and everyone else reading. So why the hostility?
--
Lew
Andreas Leitgeb - 09 Feb 2008 10:24 GMT
>>>>>> Authenticator.setDefault() method which is a static method and therefore
>>>>>> not usable in a threaded environment.
>> It's perhaps not so much the static methods, but rather the static data
>> that gets set by the former, and which is supposed to be specific
>> to each thread.
> Andrea felt the same way, but I really don't understand the reaction.
I think he (or she, can't deduce from name) quite clearly described the problem: User-data is set statically. So despite the exact wording it seemed obvious to me that the problem was the data, and not the method by which it was set. The latter was merely what made the static storage obvious.
We seem to differ on the level of obviousness ;-)
> It is
> true that static methods can be used in a multi-threaded program. The
> statement to the contrary was not correct, and it is normal in Usenet to set
> the record straight.
It might have been worth a comment like: "Of course static methods are not inherently problematic with threads, but ..." ideally followed by a trick to solve the actual problem :-)
> There are any number of programs that find it useful or convenient to share
> static data and methods among threads.
But these actually also share the value stored in those static variables. The point here is that each thread needs a different value.
> So why the hostility?
Because your answer not only focussed on some technical tidbit, but thereby also refuted the mere existence of the actual problem.
Your answer *conveyed*: "static methods are not problematic with multi-threaded usage, so your problem doesn't exist"
At least, it seems like both Andrea and me understood it that way.
Lew - 09 Feb 2008 16:08 GMT
Lew wrote:
>> So why the hostility?
> Because your answer not only focussed on some technical tidbit, but
> thereby also refuted the mere existence of the actual problem.
[quoted text clipped - 3 lines]
>
> At least, it seems like both Andrea and me understood it that way.
OK - you guys were upset about something I didn't say, and blamed me for it.
I assure you I never said, meant or thought, "Your problem doesn't exist." I said only what I said, and what I said was meant to be helpful. I didn't say what I didn't say.
--
Lew
Andreas Leitgeb - 09 Feb 2008 19:48 GMT
>>> So why the hostility?
I don't really see any hostility.
> OK - you guys were upset about something I didn't say, and blamed me for it.
Helpfulness is a strange concept.
Correcting speling errors in a technical question is one example of a "helpful" action that is only rarely appreciated.
Focussing on irrelevant details of a posting is another one.
Answering a detail that appears to be crucial at very first glance, but really isn't, is often even explicitly un-appreciated. Probably because it has a likely effect that future readers of the thread may think it's already answered and skip it, even if they perhaps did know the correct answer.
PS: Don't ask me why, but an initial phrase like "This doesn't really help with the question, but [correction of the tidbit]" would probably boost the acceptance of detail-corrections enormously.
Lew - 09 Feb 2008 20:50 GMT
> PS: Don't ask me why, but an initial phrase like "This doesn't really
> help with the question, but [correction of the tidbit]" would
> probably boost the acceptance of detail-corrections enormously.
Excellent advice, but bear in mind that this is a discussion group and discussions can range over a wide range of topics.
My concern was that when people say here that something is
> a static method and therefore
> not usable in a threaded environment.
the general readership will believe such an inaccurate remark. Over the years I've used Usenet, correction of such misinformation has not generally been taken as an insult.
Furthermore, pruning the original post to a specific point and answering that point alone should make it clear that the primary point is not under discussion in such a post. Calling a person "trollish" for that was completely out of line and downright insulting.
I don't know about you, but I think there is a distinct risk of bad practices burgeoning if such misinformation is allowed to stand.
Suggesting that one coddle a respondent's feelings through the sort of diplomacy you suggest is a good idea, but I suggest in return that people focus on the facts under presentation and apply a little bit of reason and logic to the information instead of getting all bent out of shape.
The fact is that static methods *are* suitable for multi-threaded programs, just as much as instance methods are. No claim was made that that information solved the OP's fundamental problem. OTOH, when such obvious misinterpretation of the technology is evinced, it is possible that the misunderstanding might indeed bear on the original problem.
Static methods are fully capable of managing distinct information per thread if written to do so. It might not always be the best way, but it's often done and quite safely. One might indeed reject a static method in that scenario, but not because they cannot be used safely in multi-threaded programs.
For example, if the OP had followed my advice and used synchronization to protect the static method call, they'd've been able to solve their problem. It might not be the fastest way, but it certainly could work.
So everybody just take a chill pill and focus on the information provided. Please stop the personal attacks.
--
Lew
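The claim that static methods can manage distinct information per thread "if written to do so" can be illustrated with `ThreadLocal`. This is a sketch only: the class `PerThreadCredentials` and its methods are hypothetical, and nothing in the thread establishes that the `Authenticator` callback runs on the thread that started the download, so whether this approach fits the original problem is an open assumption.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical holder: static methods backed by per-thread storage.
final class PerThreadCredentials {

    // Each thread sees its own value, even though the field is static.
    private static final ThreadLocal<String> USER = new ThreadLocal<>();

    static void setUser(String user) {
        USER.set(user);
    }

    static String getUser() {
        return USER.get();
    }

    // Runs two threads that each store and read back their own user name.
    static Map<String, String> demo() throws InterruptedException {
        final Map<String, String> seen = new ConcurrentHashMap<>();
        Thread t1 = new Thread(() -> { setUser("foo"); seen.put("t1", getUser()); });
        Thread t2 = new Thread(() -> { setUser("bar"); seen.put("t2", getUser()); });
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return seen;
    }
}
```

Whatever the scheduling, `demo()` maps "t1" to "foo" and "t2" to "bar", because neither thread can overwrite the other's slot.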
Arne Vajhøj - 10 Feb 2008 00:09 GMT
>>> Authenticator.setDefault() method which is a static method and therefore
>>> not usable in a threaded environment.
[quoted text clipped - 4 lines]
>
> We have two threads, t1 and t2, that execute the following code
> Authenticator.setDefault(new Authenticator() {
>     protected PasswordAuthentication getPasswordAuthentication() {
>         return new PasswordAuthentication(username,
>             password.toCharArray());
>     }});
> URLConnection con = url.openConnection();
Authenticator.setDefault is designed for proxy servers that require authentication, and in that context all requests need the same authenticator.
I think you will need to set HTTP headers manually.
Something like:
con.setRequestProperty ("Authorization", "Basic " + basicauth("user","pass"));
where:
    // MimeUtility and MessagingException come from JavaMail
    // (javax.mail.internet.MimeUtility, javax.mail.MessagingException).
    public static String basicauth(String un, String pw) throws MessagingException, IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        OutputStream b64os = MimeUtility.encode(baos, "base64");
        b64os.write((un + ":" + pw).getBytes());
        b64os.close();
        return new String(baos.toByteArray());
    }
Arne
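The same header can be built without JavaMail. The sketch below assumes Java 8 or later, where `java.util.Base64` is available (it did not exist when this thread was written). The class `BasicAuthHeader` and its methods are hypothetical names, and encoding the credentials as UTF-8 is an assumption about what the server expects. Because the header is set on each connection object, no state is shared between threads.

```java
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: per-connection Basic authentication, no shared state.
final class BasicAuthHeader {

    // Builds the value of the Authorization header for Basic authentication.
    static String basicAuth(String user, String password) {
        // Assumption: the server accepts UTF-8 encoded credentials.
        byte[] raw = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    // Must be called before the connection is opened (before getInputStream()).
    static void apply(URLConnection con, String user, String password) {
        con.setRequestProperty("Authorization", basicAuth(user, password));
    }
}
```

For example, `BasicAuthHeader.basicAuth("user", "pass")` yields "Basic dXNlcjpwYXNz". Each download thread would call `apply(con, username, password)` on its own `URLConnection`.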
Lew - 10 Feb 2008 00:10 GMT
> The correctness of the program depends on the scheduling, hence there
> is a race condition. The race condition comes from the fact that
> Authenticator.setDefault() is static, and hence the same data is shared
> by all threads.
Seems to me that the solution lies in having the registered Authenticator itself be able to split up the logic according to the desired authentication, rather than having multiple Authenticators. IOW, instead of splitting the logic to choose an Authenticator, have the Authenticator implement the split.
--
Lew
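One way to read that suggestion is a single authenticator, installed once, that looks up credentials by the host that is asking for them. The sketch below is an interpretation, not something spelled out in the thread: the class `PerHostAuthenticator` and its `register` method are hypothetical, and keying on `getRequestingHost()` assumes that no two threads need different credentials for the same host.

```java
import java.net.Authenticator;
import java.net.PasswordAuthentication;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical authenticator that picks credentials by requesting host.
final class PerHostAuthenticator extends Authenticator {

    private final Map<String, PasswordAuthentication> byHost = new ConcurrentHashMap<>();

    // Registers the credentials to use for one host.
    void register(String host, String user, String password) {
        byHost.put(host, new PasswordAuthentication(user, password.toCharArray()));
    }

    @Override
    protected PasswordAuthentication getPasswordAuthentication() {
        String host = getRequestingHost();
        // Returning null means "no credentials available" for this request.
        return host == null ? null : byHost.get(host);
    }
}
```

With this approach `Authenticator.setDefault()` is called once at startup, after registering for example "example.org" with "foo"/"foo" and "example2.org" with "bar"/"bar", so the download threads never touch the shared default. If credentials must differ per URL rather than per host, `getRequestingURL()` could serve as the key instead.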