Problems and Technical Issues with Rosetta@home

robertmiles
Joined: 16 Jun 08
Posts: 1231
Credit: 14,242,164
RAC: 3,692
Message 100447 - Posted: 20 Jan 2021, 5:23:19 UTC

One of my tasks is running much longer than expected - about 20.5 hours so far rather than the expected 8 hours.

https://boinc.bakerlab.org/rosetta/result.php?resultid=1323410090

Is something wrong with this task?

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1664
Credit: 17,369,040
RAC: 24,371
Message 100448 - Posted: 20 Jan 2021, 6:21:28 UTC - in response to Message 100447.  
Last modified: 20 Jan 2021, 6:21:53 UTC

One of my tasks is running much longer than expected - about 20.5 hours so far rather than the expected 8 hours.

https://boinc.bakerlab.org/rosetta/result.php?resultid=1323410090

Is something wrong with this task?
Maybe?
I've got a sebv1 Task that finished in 3 hours. There's a sebv2 Task that is running, and is looking at 10.5 hours or so to complete. My sebv2 Tasks that have completed have taken 10hrs 45min (2 of them), 4hrs 20min and 5hrs 40min.
Grant
Darwin NT

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1664
Credit: 17,369,040
RAC: 24,371
Message 100449 - Posted: 20 Jan 2021, 10:02:43 UTC - in response to Message 100448.  

There's a sebv2 Task that is running, and is looking at 10.5 or so to complete
Now looking at 13hrs 15min to completion.
Grant
Darwin NT

robertmiles
Joined: 16 Jun 08
Posts: 1231
Credit: 14,242,164
RAC: 3,692
Message 100453 - Posted: 20 Jan 2021, 14:33:54 UTC - in response to Message 100447.  

One of my tasks is running much longer than expected - about 20.5 hours so far rather than the expected 8 hours.

https://boinc.bakerlab.org/rosetta/result.php?resultid=1323410090

Is something wrong with this task?

It finally finished successfully, before it reached 23 hours. I suspect that the sebv2 tasks need much larger time estimates.

dincon
Joined: 11 Feb 16
Posts: 1
Credit: 1,188,792
RAC: 0
Message 100470 - Posted: 22 Jan 2021, 13:55:33 UTC - in response to Message 100453.  

I've got a sebv2 task that has come to a near standstill since reaching 99.55% about 20 hours ago. It is now at 99.603%, elapsed time of 1d 21:49:00 and has progressed 0.003% in 45 minutes but has not progressed for the last 15 minutes. The time remaining keeps recalculating to 00:10:55 +- 01 second, adjusting between 00:10:54 and 00:10:57 for hours. I've seen it work down to --:--:-- then, eventually, recalculating to 00:10:55. The deadline is in about 5 hours so I'm just going to let it run and see what happens. Oops!! Just made 0.001% progress after 20 minutes!

All the other Rosetta tasks running are not sebv2 and are making consistent progress.

Bryn Mawr
Joined: 26 Dec 18
Posts: 387
Credit: 11,992,939
RAC: 13,701
Message 100471 - Posted: 22 Jan 2021, 18:37:44 UTC - in response to Message 100470.  

I've got a sebv2 task that has come to a near standstill since reaching 99.55% about 20 hours ago. It is now at 99.603%, elapsed time of 1d 21:49:00 and has progressed 0.003% in 45 minutes but has not progressed for the last 15 minutes. The time remaining keeps recalculating to 00:10:55 +- 01 second, adjusting between 00:10:54 and 00:10:57 for hours. I've seen it work down to --:--:-- then, eventually, recalculating to 00:10:55. The deadline is in about 5 hours so I'm just going to let it run and see what happens. Oops!! Just made 0.001% progress after 20 minutes!

All the other Rosetta tasks running are not sebv2 and are making consistent progress.


The timer and percentage figures are typical of a task that's overrunning its estimate: the software has no idea how much longer the task will run, so the figures barely move. I believe that the sebv2 tasks have been doing that fairly consistently.

The good news is that it will just go to completion suddenly; the bad news is that no-one can tell when.
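
To illustrate why the time remaining can hover around the same value for hours, here is a minimal sketch. It assumes the remaining time is extrapolated linearly from the elapsed time and the reported fraction done; that is an assumption for illustration, not the actual BOINC client code, and the function name is hypothetical. The input figures are the ones dincon reported above.

```python
def estimate_remaining(elapsed_s: float, fraction_done: float) -> float:
    """Linear extrapolation (assumed): if fraction_done of the work took
    elapsed_s seconds, the rest should take proportionally as long."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_s * (1.0 - fraction_done) / fraction_done


def as_hms(seconds: float) -> str:
    """Format a duration in seconds as HH:MM:SS."""
    total = int(round(seconds))
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


# Figures from the post: 1d 21:49:00 elapsed, 99.603% done.
elapsed = (45 * 60 + 49) * 60
remaining = estimate_remaining(elapsed, 0.99603)
print("estimated time remaining:", as_hms(remaining))

# 45 minutes later the task has only crept forward by 0.003%,
# so the extrapolated remaining time has barely changed.
later = estimate_remaining(elapsed + 45 * 60, 0.99606)
print("45 minutes later:        ", as_hms(later))
```

Under that assumption the estimate comes out at roughly 11 minutes both times: as long as the percentage creeps up only as fast as the elapsed time grows, the extrapolated remainder stays put.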

Sid Celery
Joined: 11 Feb 08
Posts: 2108
Credit: 40,963,324
RAC: 18,849
Message 100598 - Posted: 10 Feb 2021, 14:28:17 UTC

Server needs a kick - no tasks coming down

Program         Host     Status
transitioner1   bwsrv1   Not Running

Application   Unsent   In progress
Rosetta       5        480390


Kissagogo27
Joined: 31 Mar 20
Posts: 86
Credit: 2,867,026
RAC: 2,891
Message 100599 - Posted: 10 Feb 2021, 15:13:06 UTC


Grant (SSSF)
Joined: 28 Mar 20
Posts: 1664
Credit: 17,369,040
RAC: 24,371
Message 100600 - Posted: 10 Feb 2021, 19:14:26 UTC
Last modified: 10 Feb 2021, 19:19:57 UTC

Server Status page shows the Transitioners are down. Someone needs to give things a nudge.

Tasks ready to send          1
Transitioner backlog (hours) 8.98    (usually zero, or very close to it).

Grant
Darwin NT

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 100601 - Posted: 10 Feb 2021, 19:47:35 UTC - in response to Message 100600.  

Server Status page shows the Transitioners are down. Someone needs to give things a nudge.

Just when I reattach a Ryzen 3900X, things fall apart. Back to OPN.

Sid Celery
Joined: 11 Feb 08
Posts: 2108
Credit: 40,963,324
RAC: 18,849
Message 100602 - Posted: 10 Feb 2021, 21:14:36 UTC - in response to Message 100599.  

and no validation too

https://munin.kiska.pw/munin/rosetta-day.html

No indication of that on the server page, but you're right - several here are still waiting after about 6 hrs.

Hammer
Joined: 7 Mar 07
Posts: 4
Credit: 15,451,390
RAC: 0
Message 100603 - Posted: 11 Feb 2021, 2:16:46 UTC

Always wondered what a transitioner did.

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1664
Credit: 17,369,040
RAC: 24,371
Message 100605 - Posted: 11 Feb 2021, 2:50:31 UTC - in response to Message 100603.  

Always wondered what a transitioner did.
A lot.


From the Seti@home website server page.
transitioner: Handles state transitions of workunits and results. Basically, the transitioners keep track of the results in progress and makes sure they properly move down the pipeline. It is always asking the questions: Is this workunit ready to send out? Has this result been received yet? Is this a valid result? Can we delete it now?
Basically, it moves the Task from one state to another: ready to send, sent & awaiting a result, result received. Is the result valid? If so, move it to the science database & delete it after a set time period. If not, send out another copy, then check its result. Has it timed out? Send another one.
Grant
Darwin NT
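
The state-to-state movement described above can be sketched as a tiny state machine. This is only an illustration of the idea: the state names, the flags on `Task` and the `transition` function are all hypothetical and are not taken from the BOINC server code.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    READY_TO_SEND = auto()
    IN_PROGRESS = auto()       # sent, awaiting a result
    RESULT_RECEIVED = auto()
    VALID = auto()             # accepted, kept for a set time period
    DELETED = auto()


@dataclass
class Task:
    state: State
    deadline_passed: bool = False    # hypothetical: host missed the deadline
    result_is_valid: bool = False    # hypothetical: verdict of a validator
    retention_expired: bool = False  # hypothetical: safe to delete now


def transition(task: Task) -> State:
    """Return the next state for a task, following the description above."""
    if task.state is State.IN_PROGRESS and task.deadline_passed:
        return State.READY_TO_SEND   # timed out: send another copy
    if task.state is State.RESULT_RECEIVED:
        # A valid result moves on; an invalid one is reissued.
        return State.VALID if task.result_is_valid else State.READY_TO_SEND
    if task.state is State.VALID and task.retention_expired:
        return State.DELETED
    return task.state                # nothing to do yet


print(transition(Task(State.RESULT_RECEIVED, result_is_valid=True)))
print(transition(Task(State.IN_PROGRESS, deadline_passed=True)))
```

In this sketch, when nothing calls `transition`, every task simply stays in whatever state it is in, which is one way to picture both the lack of new work and the lack of validation during the outage.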

Garry Heather
Joined: 23 Nov 20
Posts: 10
Credit: 362,743
RAC: 0
Message 100612 - Posted: 11 Feb 2021, 16:36:47 UTC

I see the Scheduler on bwsrv1 has gone AWOL now. I hope it sends us a postcard.

robertmiles
Joined: 16 Jun 08
Posts: 1231
Credit: 14,242,164
RAC: 3,692
Message 100613 - Posted: 11 Feb 2021, 23:10:50 UTC

Looks like none of you bothered to look at a weather report for Seattle, WA, USA.

I did, and found that today's weather includes times above freezing and times below, with snow expected.

That means a lot of ice on paths to and from the building with the server, so delays fixing any problems are likely.

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1664
Credit: 17,369,040
RAC: 24,371
Message 100614 - Posted: 12 Feb 2021, 2:18:05 UTC - in response to Message 100613.  

Looks like none of you bothered to look at a weather report for Seattle, WA, USA.

I did, and found that today's weather includes times above freezing and times below, with snow expected.

That means a lot of ice on paths to and from the building with the server, so delays fixing any problems are likely.
That's why remote management is such a wonderful thing.

And it looks like they were successful. "Project is down for maintenance" is what I got when I first checked in this morning, but now the web site is back up & work is flowing again.
Thanks to whoever it was that got it working again.
Grant
Darwin NT

Dave
Joined: 10 May 09
Posts: 3
Credit: 109,605
RAC: 0
Message 100630 - Posted: 17 Feb 2021, 16:45:02 UTC

Two tasks downloaded on my Fairphone 2. Under the tasks view they both say download complete, 0.000%.

BOINC version is 7.16.16 from the BOINC site, as I understand this version is not available from Google Play.

mikey
Joined: 5 Jan 06
Posts: 1895
Credit: 9,079,572
RAC: 5,563
Message 100631 - Posted: 18 Feb 2021, 0:31:17 UTC - in response to Message 100630.  
Last modified: 18 Feb 2021, 0:32:56 UTC

Two tasks downloaded on my Fairphone 2. Under the tasks view they both say download complete, 0.000%.

BOINC version is 7.16.16 from the BOINC site, as I understand this version is not available from Google Play.


"David Anderson:
Version 7.16.16 of the BOINC Android client has been released. This is the first new Android version in over 4 years, and is a major rewrite of the GUI. Thanks for Vitalii Koshura, Tal Regev, and Isira Seneviratne for their work on this.

The new version is available from the BOINC web site and (for Amazon Fire tablets) from the Amazon app store. It's not on the Google play store because of new restrictions imposed by Google; hopefully this will be resolved in a future version."

I personally don't update every time a new update gets released. I let others try it out, figure out how it actually works and list the things it does differently; then, if I either don't care about the new things or like them, I will update. Sometimes, though, the older versions just work differently enough to make me keep them.

Another reason not to update right away is that the Projects need to implement any necessary compatibility changes too. Some Projects are waaaay behind in their version of the BOINC server-side software.

lohphat
Joined: 22 Apr 06
Posts: 5
Credit: 4,965,549
RAC: 0
Message 100632 - Posted: 18 Feb 2021, 7:31:10 UTC

I have two failed work units in the last batch. One also failed again with another user's attempt on the same platform (windows_x86_64).

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=1197364660

However, this one failed on windows_x86_64 but succeeded on aarch64-unknown-linux-gnu.

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=1197364006

Do these types of asymmetrical failures indicate platform bugs, as opposed to tasks which fail on all platforms?

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1664
Credit: 17,369,040
RAC: 24,371
Message 100637 - Posted: 18 Feb 2021, 23:21:21 UTC - in response to Message 100632.  
Last modified: 18 Feb 2021, 23:22:58 UTC

Do these types of asymmetrical failures indicate platform bugs, as opposed to tasks which fail on all platforms?
Sometimes/maybe.
If the Tasks only ever fail on one particular platform & only ever complete on another, then you can put it down to the application. But when a Task is run, it is started with a random seed value, so even if you were to run the same Task 50 times on the very same system, it may error out on some occasions and not others, all due to the different initial value used.
Grant
Darwin NT
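
A toy sketch of the random seed point above: the same "task" is run 50 times on the same machine, and only the seed differs between runs. The `run_task` function and its `failure_rate` parameter are made up for illustration; a real Rosetta task fails for reasons tied to the path its search takes, not to a coin flip.

```python
import random


def run_task(seed: int, failure_rate: float = 0.2) -> bool:
    """Toy stand-in for a task: the outcome depends only on the seed."""
    rng = random.Random(seed)          # each run gets its own seeded generator
    return rng.random() >= failure_rate


# Same task, same system, 50 different seeds.
outcomes = [run_task(seed) for seed in range(50)]
print(f"{sum(outcomes)} of {len(outcomes)} runs succeeded")

# Re-running with an identical seed always reproduces the same outcome.
print("seed 7 is repeatable:", run_task(7) == run_task(7))
```

With a fixed seed the outcome is repeatable, so a single failure on one platform and a single success on another does not, on its own, show that the platform was the cause.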

©2024 University of Washington
https://www.bakerlab.org