Problems and Technical Issues with Rosetta@home

tomduly

Joined: 5 Apr 20
Posts: 3
Credit: 578,397
RAC: 0
Message 97626 - Posted: 26 Jun 2020, 7:06:38 UTC - in response to Message 97625.  

@Steven: did you change the default settings? It is the BOINC client on your computer that requests the number of WUs to process.

E.g. one of my PCs is a 4 core machine, running 24/7 with 100% CPU time for Rosetta. Settings are default. After starting the BOINC client, it requests 4 new tasks and starts crunching. While these 4 WUs are still running, it requests another 4 WUs. As soon as one of the WUs is completed, one of the 4 waiting WUs starts processing seamlessly, and a single new WU is requested. Sometimes 2 or 3 WUs are completed at almost the same time, and the client then requests 2 or 3 new tasks. So the computer is never idle, crunches 12 WUs per 24 hours, and no WU is lost to a missed deadline.

Tom

Keith Myers
Joined: 29 Mar 20
Posts: 97
Credit: 332,619
RAC: 25
Message 97627 - Posted: 26 Jun 2020, 7:27:47 UTC - in response to Message 97626.  

What default settings? ? ? ? Whatever project was last updated for global preferences is the one applied to Rosetta when you first join. My global project preferences were from Seti for 1 day cache. My first connection to the Rosetta project downloaded over 260 tasks on the first connection. No way my computer could finish them in five days. Had to abort all but twenty of them to finish the rest in time.

robertmiles
Joined: 16 Jun 08
Posts: 1234
Credit: 14,338,560
RAC: 2,014
Message 97641 - Posted: 26 Jun 2020, 12:16:41 UTC - in response to Message 97627.  
Last modified: 26 Jun 2020, 12:19:29 UTC

Keith Myers,

It is typical for BOINC projects to make wild assumptions about how fast tasks will run for the first few times you download any tasks for a new application version. This is usually corrected after you have returned 10 successful tasks for that version.

Sid Celery
Joined: 11 Feb 08
Posts: 2141
Credit: 41,518,559
RAC: 10,612
Message 97642 - Posted: 26 Jun 2020, 12:18:44 UTC - in response to Message 97623.  

The project team often doesn't have enough ideas for new work to generate 10 times as many tasks as a few months ago

I don't think the problem is a lack of ideas

Twitter Rosetta@home
We have some BIG NEWS:
Researchers @UWproteindesign have succeeded in creating antiviral proteins that neutralize the new coronavirus in the lab.
(These experimental drugs are being optimized for animal trials now)

THANK YOU for helping out!

Maybe (guessing) the issue is they have to completely re-focus on this area of success

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 97654 - Posted: 26 Jun 2020, 18:09:20 UTC - in response to Message 97620.  

Oh wonderful, my messages get deleted for being off topic but yours don't.

UK isn't the only country in the world, Peter.


We have similar lockdown stuff to the USA. Maybe it's because we're healthier, delete that, I know we're healthier. And it's been shown that healthy people suffer less.

The wave is predominantly in the Americas now and isn't significantly reducing. Several states in the US and in Brazil are still increasing.
And immunity, if it even exists, which is completely unproven atm, would require 20x as many to be infected as have currently been.


Every virus goes away of its own accord, we don't get the same flu forever do we? This will just vanish like every other one does, probably before we get a chance to cure it.

If it's any consolation, I've been back at work since the beginning of last week now that the building owners have allowed us to return - it had been a public liability issue for them.
Of the other 7 businesses on my floor, only 1 other has returned. The other 6 were allowed to open throughout, but chose not to due to the lack of footfall, which is continuing tbh. Last week we reached just 22% of the same week last year.


Hey I don't care I'm off work, our government has gone all left wing on us and is handing out huge sums of money to anyone who wants it. I have no idea where it's coming from, probably borrowed from China. Now how did that come about I wonder....

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 97656 - Posted: 26 Jun 2020, 18:14:57 UTC - in response to Message 97625.  

Peter Hucker wrote, "I don't understand what you're getting at. You receive a task that takes 8 hours to complete, and you have to send it back in 3 days (9 times longer than it takes to do it). How can that possibly cause you not to get them done in time?"

In my case, they could not be done in time because Rosetta sent me 26 tasks due in 3 days. My computer takes a little over 7 hours to complete one task. Completing them all would take 182 hours of computer time. There are only 72 hours in three days. The computer might be able to complete ten tasks in that time. The remaining 16 tasks will not be completed and will be reported as errors. (These do not include the 15 tasks that were reported as "errors while downloading," which seems to happen a lot with Rosetta, although it rarely happens with my other projects.)

Now do you understand?

Steven Gaber


Your computer has two cores, so it should be doing two at a time. So in 3 days you should be able to do 18 tasks. But 26 is still too many. I wonder if it's the same bug I encountered 3 times in a row, about a day apart, on my old laptop. It was going nuts and downloading about 100 tasks, so I aborted them and it didn't try to get more. In my case it was actually getting only 1 at a time, but continuously asking for another. I'm 95% sure it was Rosetta that was doing it, and I can't remember if the reason was ever found.

What are your buffer settings?
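
For anyone who wants to check the arithmetic in this exchange, here is a rough Python sketch. It assumes the machine crunches 24/7 at 100% CPU with one task per core, which may not match Steven's setup, and the function names are made up for illustration.

```python
import math

def total_cpu_hours(tasks: int, hours_per_task: float) -> float:
    """CPU time needed to finish every task in the queue."""
    return tasks * hours_per_task

def tasks_before_deadline(cores: int, hours_per_task: float, deadline_hours: float) -> int:
    """Upper bound on tasks finished in time, one task per core at a time.

    Assumes the machine runs around the clock; a PC that is only on for
    a few hours a day will manage proportionally fewer.
    """
    return cores * math.floor(deadline_hours / hours_per_task)

deadline = 3 * 24  # 72 hours

print(total_cpu_hours(26, 7))                  # 182
print(tasks_before_deadline(1, 7, deadline))   # 10
print(tasks_before_deadline(2, 8, deadline))   # 18
```

The first two figures match Steven's numbers for one task at a time, and the third is the two-core, 8-hour case. Either way, 26 tasks do not fit in 72 hours.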

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 97657 - Posted: 26 Jun 2020, 18:16:46 UTC - in response to Message 97627.  

What default settings? ? ? ? Whatever project was last updated for global preferences is the one applied to Rosetta when you first join. My global project preferences were from Seti for 1 day cache. My first connection to the Rosetta project downloaded over 260 tasks on the first connection. No way my computer could finish them in five days. Had to abort all but twenty of them to finish the rest in time.


Why did it do that? Rosetta tasks are very rigidly 8 hours on any machine. The server should have known precisely how many you needed for a 1 day cache.

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 97658 - Posted: 26 Jun 2020, 18:18:36 UTC - in response to Message 97641.  

It is typical for BOINC projects to make wild assumptions about how fast tasks will run for the first few times you download any tasks for a new application version. This is usually corrected after you have returned 10 successful tasks for that version.


It assumes what the server thinks is the time required, which in the case of Rosetta is the standard 8 hours. Unless your computer takes nothing like 8 hours to do a task, it should be correct from the beginning. Wild estimates happen in other projects, where work units can take vastly different times depending on processor power.
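
To show why the runtime estimate matters so much here, this is a deliberately simplified Python sketch. It is not BOINC's actual work-fetch logic, and the 8-core machine and the 45-minute estimate are hypothetical numbers, not anything reported in this thread.

```python
import math

def tasks_to_fill_buffer(cores: int, buffer_days: float, estimated_hours_per_task: float) -> int:
    """Tasks needed to keep every core busy for buffer_days, given the estimate.

    Simplified model: assumes the machine is on 24 hours a day and that
    each task occupies exactly one core.
    """
    buffer_core_hours = cores * buffer_days * 24
    return math.ceil(buffer_core_hours / estimated_hours_per_task)

# Hypothetical 8-core machine with a 1 day cache.
print(tasks_to_fill_buffer(8, 1.0, 8.0))    # 24 with an accurate 8 hour estimate
print(tasks_to_fill_buffer(8, 1.0, 0.75))   # 256 if each task is guessed at 45 minutes
```

With a correct 8-hour estimate the request stays small. If the estimate is badly low for any reason, the same 1 day cache turns into hundreds of tasks.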

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 97660 - Posted: 26 Jun 2020, 18:21:31 UTC - in response to Message 97642.  

I don't think the problem is a lack of ideas

Twitter Rosetta@home
We have some BIG NEWS:
Researchers @UWproteindesign have succeeded in creating antiviral proteins that neutralize the new coronavirus in the lab.
(These experimental drugs are being optimized for animal trials now)

THANK YOU for helping out!

Maybe (guessing) the issue is they have to completely re-focus on this area of success


Does that mean cure rather than just vaccine? "Neutralize" suggests it could kill it once you already have it?

Stevie G
Joined: 15 Dec 18
Posts: 107
Credit: 865,910
RAC: 814
Message 97668 - Posted: 27 Jun 2020, 3:08:17 UTC - in response to Message 97627.  

What default settings? ? ? ? Whatever project was last updated for global preferences is the one applied to Rosetta when you first join. My global project preferences were from Seti for 1 day cache. My first connection to the Rosetta project downloaded over 260 tasks on the first connection. No way my computer could finish them in five days. Had to abort all but twenty of them to finish the rest in time.



That was my experience also.

Steven Gaber
Oldsmar, FL

Stevie G
Joined: 15 Dec 18
Posts: 107
Credit: 865,910
RAC: 814
Message 97669 - Posted: 27 Jun 2020, 3:19:23 UTC - in response to Message 97626.  

@Steven: did you change the default settings? It is the BOINC client on your computer that requests the number of WUs to process.

Tom:

Yes, I did as Mr. Celery suggested. I set the computer preferences to 0.1 days of work and to store 0.5 additional days of work.

Then I deleted 11 tasks. Two tasks were completed within the deadline and 6 tasks were reported as errors, not started by the deadline and cancelled.

Now I have no tasks at all.

Steven Gaber
Oldsmar, FL

Sid Celery
Joined: 11 Feb 08
Posts: 2141
Credit: 41,518,559
RAC: 10,612
Message 97675 - Posted: 27 Jun 2020, 11:11:17 UTC - in response to Message 97669.  

@Steven: did you change the default settings? It is the BOINC client on your computer that requests the number of WUs to process.

Tom:

Yes, I did as Mr. Celery suggested. I set the computer preferences to 0.1 days of work and to store 0.5 additional days of work.

Then I deleted 11 tasks. Two tasks were completed within the deadline and 6 tasks were reported as errors, not started by the deadline and cancelled.

Now I have no tasks at all.

Glad you saw that, but it's swung the opposite way now <sigh>
Do you have no Rosetta tasks because you set Rosetta to "No New Tasks" or is it just Asteroids taking its turn after so many Rosetta?
Am I right in thinking you only have your PC on for a few hours per day? That is, not for 24hrs a day - just while you're using it?

If so, leave things as they are until Rosetta gets around to calling for new tasks and see how many it calls for.
I'm hoping it'll only start calling for 2 or 4 at a time to complete within the 3 day deadline - 2 to run and 2 as a back up.
If that's what happens and they take their turn to run comfortably within the deadline without you having to intervene, then all's well and you might consider edging the additional days in your buffer up by 0.1 at a time to get nearer to where you used to be. The "right" level is where you remain successful for all your projects without having to fiddle with settings all the time (yes, I know I'm asking you to fiddle with it at the moment, but just to find a sweet spot)
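
If you would rather set the two buffer values in a file than through the preferences dialog, the sketch below builds the XML in Python. As far as I know the client reads local overrides from a file called global_prefs_override.xml in the BOINC data directory, using the tag names work_buf_min_days and work_buf_additional_days, but treat the file name, its location and the tag names as assumptions to check against the BOINC documentation. The helper name build_override is made up for this example.

```python
import xml.etree.ElementTree as ET

def build_override(min_days: float, additional_days: float) -> str:
    """Return override XML holding the two work-buffer settings.

    Tag names follow my understanding of BOINC's override file and
    should be verified before use.
    """
    root = ET.Element("global_preferences")
    ET.SubElement(root, "work_buf_min_days").text = str(min_days)
    ET.SubElement(root, "work_buf_additional_days").text = str(additional_days)
    return ET.tostring(root, encoding="unicode")

# The values Steven reported: 0.1 days of work plus 0.5 additional days.
print(build_override(0.1, 0.5))
```

Raising the additional days by 0.1 at a time, as suggested above, would just mean calling it again with 0.6, 0.7 and so on, then having the client re-read its preferences.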

EHM-1
Joined: 21 Mar 20
Posts: 23
Credit: 183,782
RAC: 0
Message 97676 - Posted: 27 Jun 2020, 11:47:02 UTC - in response to Message 97675.  

Hello Sid and all-
Not sure if this is related to what Steven is encountering:
On my computer Rosetta will run for days as expected, then do nothing for a day or two, then resume normal behavior. I've never seen any explanation for this. I'm currently in one of these down times. Yesterday I added a second project to my BOINC acct, and it is running as expected. As far as I know, Rosetta is still paused in mid-task. Anyone have an idea what might be causing this?
Eric

system: up-to-date Windows 10, Intel quad-core 3.6 GHz processor, 8 GB RAM

robertmiles
Joined: 16 Jun 08
Posts: 1234
Credit: 14,338,560
RAC: 2,014
Message 97680 - Posted: 27 Jun 2020, 13:15:04 UTC - in response to Message 97676.  
Last modified: 27 Jun 2020, 13:18:20 UTC

EricM,

Rosetta@home currently has so many new users that it's not keeping up with the demand for tasks.

As for being paused mid-task while another BOINC project runs, that's normal if you have more than one BOINC project providing tasks. Tasks close to their deadlines get higher priority to run, and tasks for the other project catch up on run time later.

SolidAir79
Joined: 5 May 20
Posts: 4
Credit: 2,123,173
RAC: 0
Message 97681 - Posted: 27 Jun 2020, 13:31:38 UTC

Getting some errors on a Windows machine, all with a similar stderr message:
<core_client_version>7.16.5</core_client_version>
<![CDATA[
<message>
Incorrect function.
(0x1) - exit code 1 (0x1)</message>
<stderr_txt>
command: projects/boinc.bakerlab.org_rosetta/rosetta_4.20_windows_x86_64.exe @rb_06_25_30642_30047_ab_t000__robetta_FLAGS -in::file::fasta t000_.fasta -jumps:pairing_file t000_.fasta.bbcontacts.jumps -jumps:random_sheets 2 -constraints::cst_file t000_.fasta.CB.cst -constraints:cst_weight 5.0 -constraints::cst_fa_file t000_.fasta.MIN.cst -constraints:cst_fa_weight 5.0 -in:file:boinc_wu_zip rb_06_25_30642_30047_ab_t000__robetta.zip -frag3 rb_06_25_30642_30047_ab_t000__robetta.200.3mers.index.gz -fragA rb_06_25_30642_30047_ab_t000__robetta.200.12mers.index.gz -fragB rb_06_25_30642_30047_ab_t000__robetta.200.3mers.index.gz -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 -constant_seed -jran 2765475
Using database: database_357d5d93529_n_methylminirosetta_database

[ ERROR ]: Caught exception:


File: C:cygwin64homeboinc4.17Rosettamainsourcesrccore/pack/dunbrack/SingleResidueDunbrackLibrary.hh:306
chi angle must be between -180 and 180: -nan(ind)
------------------------ Begin developer's backtrace -------------------------
BACKTRACE:
------------------------- End developer's backtrace --------------------------


AN INTERNAL ERROR HAS OCCURED. PLEASE SEE THE CONTENTS OF ROSETTA_CRASH.log FOR DETAILS.



</stderr_txt>
]]>

Regards Alan

robertmiles
Joined: 16 Jun 08
Posts: 1234
Credit: 14,338,560
RAC: 2,014
Message 97682 - Posted: 27 Jun 2020, 13:37:58 UTC
Last modified: 27 Jun 2020, 13:40:48 UTC

SolidAir79,

That looks like an error in one of the input files for the workunit.

If so, you can't fix the problem, and all other users who get copies of that workunit or any other workunit using that input file will have it crash the same way.
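
The value in the message, -nan(ind), looks like a NaN (not a number) as printed on Windows. The Python sketch below is not Rosetta's code; it is only a hypothetical stand-in for that kind of range check, to show why a NaN always fails it whatever the volunteer's machine does.

```python
def check_chi(angle: float) -> float:
    """Reject any angle outside [-180, 180], hypothetical stand-in for the real check."""
    if not (-180.0 <= angle <= 180.0):
        raise ValueError(f"chi angle must be between -180 and 180: {angle}")
    return angle

nan = float("nan")

# Every ordered comparison with NaN is False, so NaN is never "in range".
print(-180.0 <= nan <= 180.0)   # False

try:
    check_chi(nan)
except ValueError as err:
    print(err)
```

So once a NaN has crept into the calculation, whether from the input files or somewhere upstream, the task stops at this check every time.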

SolidAir79
Joined: 5 May 20
Posts: 4
Credit: 2,123,173
RAC: 0
Message 97683 - Posted: 27 Jun 2020, 13:43:01 UTC - in response to Message 97682.  

Okay, thanks. Must have got a bad batch!

EHM-1
Joined: 21 Mar 20
Posts: 23
Credit: 183,782
RAC: 0
Message 97684 - Posted: 27 Jun 2020, 14:09:21 UTC - in response to Message 97680.  

EricM,
Rosetta@home currently has so many new users that it's not keeping up with the demand for tasks.
As for being paused mid-task while another BOINC project runs, that's normal if you have more than one BOINC project providing tasks. Tasks close to their deadlines get higher priority to run, and tasks for the other project catch up on run time later.

Hi Robert, and thanks for the info. I added the second project at your suggestion, thanks for that as well.
But the Rosetta pause occurred before that, and has several other times in the past couple months. So I still wonder why it's not finishing the current task.
Eric

system: up-to-date Windows 10, Intel quad-core 3.6 GHz processor, 8 GB RAM

robertmiles
Joined: 16 Jun 08
Posts: 1234
Credit: 14,338,560
RAC: 2,014
Message 97685 - Posted: 27 Jun 2020, 14:29:33 UTC

EricM,

Looks like you need to give more details about not finishing the current task. If it includes leaving a CPU core idle, that's a problem. If it's switching to running another task instead, that's normal.

EHM-1
Joined: 21 Mar 20
Posts: 23
Credit: 183,782
RAC: 0
Message 97686 - Posted: 27 Jun 2020, 15:21:09 UTC - in response to Message 97685.  

EricM,

Looks like you need to give more details about not finishing the current task. If it includes leaving a CPU core idle, that's a problem. If it's switching to running another task instead, that's normal.

Below is a shot of the project properties in BOINC. I don't know of any way to see the status of the current work unit other than to let the screensaver start running and read it from the BOINC status messages that run before the project screensaver does. From the event log, here are the most recent Rosetta-related messages that contain anything other than "no tasks sent/project requested delay":

6/24/2020 7:16:42 PM | Rosetta@home | Result hbnet_surface_design3_0.7_SAVE_ALL_OUT_IGNORE_THE_REST_3yb2cw0d_949347_1_0 is no longer usable
6/24/2020 7:16:42 PM | Rosetta@home | Result miniprotein_relax2_COVID_SAVE_ALL_OUT_IGNORE_THE_REST_9db4ko0a_949448_2_0 is no longer usable
6/24/2020 7:16:42 PM | Rosetta@home | No tasks sent
6/24/2020 7:16:42 PM | Rosetta@home | Project requested delay of 31 seconds

and the most recent Rosetta-related messages prior to the above:

6/23/2020 3:51:03 PM | Rosetta@home | Task hbnet_surface_design3_0.7_SAVE_ALL_OUT_IGNORE_THE_REST_3yb2cw0d_949347_1_0 is 1.74 days overdue; you may not get credit for it.  Consider aborting it.
6/23/2020 3:51:03 PM | Rosetta@home | Task miniprotein_relax2_COVID_SAVE_ALL_OUT_IGNORE_THE_REST_9db4ko0a_949448_2_0 is 0.70 days overdue; you may not get credit for it.  Consider aborting it.
6/23/2020 3:51:03 PM | Rosetta@home | URL https://boinc.bakerlab.org/rosetta/; Computer ID 3864355; resource share 100
6/23/2020 3:51:03 PM | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 8050566; resource share 100
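
If it helps anyone else digging through a long event log, here is a small Python sketch that pulls the overdue warnings out of lines shaped like the ones above. The "timestamp | project | message" layout is assumed from these pasted lines only, and the function name is made up.

```python
import re
from datetime import datetime

# Layout assumed from the pasted lines: "timestamp | project | message".
LINE = re.compile(r"^(?P<ts>\S+ \S+ [AP]M) \| (?P<project>[^|]+?) \| (?P<msg>.*)$")
OVERDUE = re.compile(r"^Task (?P<task>\S+) is (?P<days>[\d.]+) days overdue")

def overdue_tasks(lines):
    """Return (timestamp, project, task name, days overdue) for each overdue warning."""
    found = []
    for line in lines:
        m = LINE.match(line.strip())
        if not m:
            continue
        o = OVERDUE.match(m["msg"])
        if o:
            when = datetime.strptime(m["ts"], "%m/%d/%Y %I:%M:%S %p")
            found.append((when, m["project"], o["task"], float(o["days"])))
    return found

sample = [
    "6/23/2020 3:51:03 PM | Rosetta@home | Task hbnet_surface_design3_0.7_SAVE_ALL_OUT_IGNORE_THE_REST_3yb2cw0d_949347_1_0 is 1.74 days overdue; you may not get credit for it.  Consider aborting it.",
    "6/23/2020 3:51:03 PM | Rosetta@home | Task miniprotein_relax2_COVID_SAVE_ALL_OUT_IGNORE_THE_REST_9db4ko0a_949448_2_0 is 0.70 days overdue; you may not get credit for it.  Consider aborting it.",
    "6/23/2020 3:51:03 PM | Rosetta@home | URL https://boinc.bakerlab.org/rosetta/; Computer ID 3864355; resource share 100",
]

results = overdue_tasks(sample)
for when, project, task, days in results:
    print(when.isoformat(), project, task, days)
```

Both Rosetta tasks were already past their deadline on 23 June, which fits with them being reported as "no longer usable" the next day.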





system: up-to-date Windows 10, Intel quad-core 3.6 GHz processor, 8 GB RAM



©2024 University of Washington
https://www.bakerlab.org