Please help on wu recovery

Message boards : Number crunching : Please help on wu recovery

Profile Cureseekers~Nightanimal
Joined: 20 Nov 05
Posts: 19
Credit: 26,396
RAC: 0
Message 8046 - Posted: 31 Dec 2005, 15:20:55 UTC
Last modified: 31 Dec 2005, 16:15:47 UTC

I have 42 WUs that are complete and sitting in my boinc.bakerlab directory, but they no longer show up in the BOINC Manager. Is there any way to send these WUs again?

What I did was really stupid. I was trying to collect some more WUs but wanted to hold the crunched WUs for today, so I placed the crunched WUs in a temporary folder and connected to the BOINC scheduler to get some more workunits. Stupid, really (I know now). There were error messages all over the place and the WUs dropped out of the Manager. I was only hoping to get new WUs and send the crunched ones later, but that didn't work out so well.

error example:
HTTP::init_post2: couldn't get file size
2005-12-25 18:30:15 [rosetta@home] Temporarily failed upload of 1dcj__abrelax_rand_len10_jit02_omega_sim_filters_102498_1_0: not found
2005-12-25 18:30:15 [rosetta@home] Backing off 1 minutes and 0 seconds on upload of file 1dcj__abrelax_rand_len10_jit02_omega_sim_filters_102498_1_0
2005-12-25 18:30:15 [rosetta@home] Temporarily failed upload of 1di2__abrelax_rand_len10_jit02_omega_sim_filters_40814_1_0: not found

But that's expected when the WUs aren't in the designated directory.
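For anyone in the same spot, here is a minimal sketch (in Python, which is my choice and not something BOINC provides) of pulling the affected file names out of a saved copy of the message log, so you know exactly which files the client is looking for. The pattern is based only on the "Temporarily failed upload of ..." lines shown above; the function name `failed_uploads` is made up for illustration.

```python
import re

# Matches lines like:
#   ... Temporarily failed upload of <file name>: <reason>
# as seen in the error example above.
FAILED_UPLOAD = re.compile(r"Temporarily failed upload of (\S+): (.+)$")


def failed_uploads(log_text):
    """Return a sorted list of unique file names whose upload failed."""
    names = set()
    for line in log_text.splitlines():
        match = FAILED_UPLOAD.search(line)
        if match:
            names.add(match.group(1))
    return sorted(names)


if __name__ == "__main__":
    sample = (
        "2005-12-25 18:30:15 [rosetta@home] Temporarily failed upload of "
        "1dcj__abrelax_rand_len10_jit02_omega_sim_filters_102498_1_0: not found\n"
        "2005-12-25 18:30:15 [rosetta@home] Backing off 1 minutes and 0 seconds "
        "on upload of file "
        "1dcj__abrelax_rand_len10_jit02_omega_sim_filters_102498_1_0\n"
        "2005-12-25 18:30:15 [rosetta@home] Temporarily failed upload of "
        "1di2__abrelax_rand_len10_jit02_omega_sim_filters_40814_1_0: not found\n"
    )
    for name in failed_uploads(sample):
        print(name)
```

The "Backing off" lines are ignored on purpose, since they only repeat a file name that already appeared in a "failed upload" line.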

file example:

1di2__abrelax_rand_len10_jit02_omega_sim_filters_83728_2_0

1dcj__abrelax_rand_len10_jit02_omega_sim_filters_106956_1_0

Is there any possible way to send the workunits manually, with or without the BOINC Manager?

I hope someone can help me out!
Happy New Year's Eve!!!

The signature is away on the moment, just leave a message after the beep
Profile Tern
Joined: 25 Oct 05
Posts: 575
Credit: 4,488,060
RAC: 3
Message 8047 - Posted: 31 Dec 2005, 15:59:17 UTC

The results have to "match" in three places: the file in the directory, the entry in the database on the server, and the entry in the xml files on the host. I have heard of people successfully recovering from mixups, but it's always involved a lot of work getting those xml files "just right". SHUT DOWN BOINC first thing. Every minute it runs, things get worse. Then back up the _entire_ BOINC folder. Then start looking at the xml files and see if you can determine what needs to be changed to "put back" the files in question.

I've not done this, so I don't know exactly how... sorry!
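The "back up the entire BOINC folder" step could be scripted roughly like this. It is only a sketch in Python: the function name `backup_boinc_dir` and both paths are placeholders, and it assumes BOINC has already been shut down so no file changes mid-copy.

```python
import shutil
import time
from pathlib import Path


def backup_boinc_dir(boinc_dir, backup_root):
    """Copy the whole BOINC folder into a new timestamped folder.

    boinc_dir and backup_root are whatever paths apply on your machine;
    backup_root should live outside boinc_dir.
    """
    source = Path(boinc_dir).resolve()
    if not source.is_dir():
        raise FileNotFoundError(f"BOINC directory not found: {source}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{source.name}-backup-{stamp}"
    # copytree refuses to write into an existing target folder, so an
    # earlier backup cannot be overwritten by accident.
    shutil.copytree(source, target)
    return target
```

Calling `backup_boinc_dir("path/to/BOINC", "path/to/backups")` returns the path of the new copy. Only start editing the xml files once that copy exists.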

Profile Cureseekers~Nightanimal
Joined: 20 Nov 05
Posts: 19
Credit: 26,396
RAC: 0
Message 8052 - Posted: 31 Dec 2005, 16:28:15 UTC

Time is not an issue for me. I hope I can still get points for them, and they are results (for the project's sake).

OK, the files are back in the right directory.

The only thing I hope for is to get the crunched WUs back in the Manager.
You said: "Start looking at the xml files and seeing if you can determine what needs to be changed to 'put back' the files in question".
I don't know beforehand which file to look at or what to do to recover any WUs.

Can I get those WUs back in the Manager if I alter the xml files?
Which file must I alter, and what must I change? I hope some BOINC guru can tell me.

If it is possible, I would like to add it to our team project FAQ.


STE\/E

Joined: 17 Sep 05
Posts: 125
Credit: 3,271,639
RAC: 879
Message 8053 - Posted: 31 Dec 2005, 16:36:46 UTC

Personally I think those WU's are Toast. Why in the World would you want to save them to another Directory in the first place? A much easier way would have been to just download the WU's you wanted and then suspend Network Access. The WU's wouldn't have Uploaded that way until you allowed Network Access again.
Profile Cureseekers~Nightanimal
Joined: 20 Nov 05
Posts: 19
Credit: 26,396
RAC: 0
Message 8054 - Posted: 31 Dec 2005, 16:54:28 UTC - in response to Message 8053.  
Last modified: 31 Dec 2005, 17:04:37 UTC

Personally I think those WU's are Toast. Why in the World would you want to save them to another Directory in the first place? A much easier way would have been to just download the WU's you wanted and then suspend Network Access. The WU's wouldn't have Uploaded that way until you allowed Network Access again.

Thanks for your reply, Poorboy.
The way you describe didn't work for me.
I did it that way, and my WUs were still sent off to the scheduler even when network access for the client was suspended. I know it was a weird thing to do.

But I do have the crunched and ready WUs; I just want them resent.
I just want to know whether it is possible or not; otherwise I'll stop searching.
STE\/E

Joined: 17 Sep 05
Posts: 125
Credit: 3,271,639
RAC: 879
Message 8055 - Posted: 31 Dec 2005, 17:05:46 UTC
Last modified: 31 Dec 2005, 17:07:36 UTC

Unless each & every one of the WU's is listed properly in the client_state.xml file, I don't see any way of doing it.

I'm assuming you had to shut down the BOINC Manager each time you saved the WU's, if so then when you restarted the Manager the xml file would be different from the previous xml file & the saved WU's in a different directory wouldn't show up properly then.

Unless you have the xml file for the saved WU's then I think it would be a lost cause ... IMO
Profile Tern
Joined: 25 Oct 05
Posts: 575
Credit: 4,488,060
RAC: 3
Message 8061 - Posted: 31 Dec 2005, 17:19:04 UTC - in response to Message 8055.  

Unless you have the xml file for the saved WU's then I think it would be a lost cause ... IMO


If the saved xml file is there, it should be possible to "run out of work" with the current set of results, then put _all_ the old data back. If not, it's going to be one-result-at-a-time paste work in the xml, putting the result in the right directory, then telling the Client that it's there. (Once you succeed with one, you could try the others in a batch...)

I _think_ I could probably figure out what needs to be in the xml files, given the time. I personally am _not_ going to have the time to help on this one this weekend, but if you study the format of the various files (client_state.xml is the main one, and may be the only one you need to deal with) I think you'll see how it all works. You might need to disable network access (unplug the cable if you have to!) and get one in a "ready to send" state; then you could copy that block of text in the xml file and paste it, changing the WU name, etc...
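As a starting point for studying the file, something like the following sketch could list which results a given client_state.xml knows about. It is Python using only the standard library. Everything about the file layout here is an assumption to check against your own file: that it parses as well-formed XML, that each result sits in a `<result>` block, and that the block has a `<name>` child. The sample text and its names are invented.

```python
import xml.etree.ElementTree as ET

# Invented sample; a real client_state.xml is far larger and its exact
# layout should be checked by eye before trusting this script.
SAMPLE = """\
<client_state>
  <result>
    <name>example_wu_1_0</name>
    <wu_name>example_wu_1</wu_name>
  </result>
  <result>
    <name>example_wu_2_0</name>
    <wu_name>example_wu_2</wu_name>
  </result>
</client_state>
"""


def result_names(xml_text, block_tag="result", name_tag="name"):
    """Return the <name_tag> text of every <block_tag> element.

    Both tag names are parameters because they are assumptions about
    the file format, not something confirmed in this thread.
    """
    root = ET.fromstring(xml_text)  # raises ParseError on malformed XML
    names = []
    for block in root.iter(block_tag):
        name = block.findtext(name_tag)
        if name:
            names.append(name.strip())
    return names


if __name__ == "__main__":
    for name in result_names(SAMPLE):
        print(name)
```

Run it against a copy of the file, never the live one, and compare the printed names with the files sitting in the project directory.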

Profile Cureseekers~Nightanimal
Joined: 20 Nov 05
Posts: 19
Credit: 26,396
RAC: 0
Message 8062 - Posted: 31 Dec 2005, 17:19:33 UTC
Last modified: 31 Dec 2005, 17:24:52 UTC

Yes, every single one of the 42 WUs is in the client_state.xml.

I pressed the update button in the client, but of course it couldn't find any of the WUs.
Then I couldn't crunch anymore (because it tried to send them every second).
So I deleted all 42 WUs from the Manager. Now I'm thinking: why didn't I put the WUs back in the right directory at that time?
I think I drank too much that day.

Hmm, I do have a client_state.xml from 06 December; there is a chance that those WUs are listed in there.

Thanks for your help, by the way!!

Edit:

Ah, thanks guys for your support! I will find out if it works.
Then I'll report it here!

Happy New Year's Eve!!
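A quick way to check whether an old client_state.xml mentions the saved files is a plain text search, which avoids assuming anything about the XML layout. This is a rough Python sketch; the function names and both paths are placeholders.

```python
from pathlib import Path


def split_by_mention(state_text, file_names):
    """Split file_names into (mentioned, missing) relative to state_text.

    This is a plain substring test, so a short name that is a prefix of
    a longer one would also count as mentioned.
    """
    mentioned = [name for name in file_names if name in state_text]
    missing = [name for name in file_names if name not in state_text]
    return mentioned, missing


def check_saved_files(state_path, saved_dir):
    """Compare the files in saved_dir against an old client_state.xml."""
    text = Path(state_path).read_text(errors="replace")
    names = sorted(p.name for p in Path(saved_dir).iterdir() if p.is_file())
    return split_by_mention(text, names)


if __name__ == "__main__":
    demo_state = "<name>example_wu_1_0</name>"
    found, lost = split_by_mention(demo_state, ["example_wu_1_0", "example_wu_2_0"])
    print("mentioned:", found)
    print("missing:", lost)
```

Files that land in the "missing" list have no entry in that backup, so for those the old xml file will not help.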


Profile Cureseekers~Nightanimal
Joined: 20 Nov 05
Posts: 19
Credit: 26,396
RAC: 0
Message 8443 - Posted: 5 Jan 2006, 23:49:53 UTC

Well, I did rescue 2 jobs out of 42, because I had a backup of those.
I was glad it was Friday.


©2024 University of Washington
https://www.bakerlab.org