No work

Message boards : Number crunching : No work

Sid Celery
Joined: 11 Feb 08
Posts: 984
Credit: 22,553,222
RAC: 13,824
Message 88559 - Posted: 27 Mar 2018, 2:03:08 UTC - in response to Message 88558.  

> Hmm... Visited the other machines, and this is the only one that can't get any fresh tasks. Even the other Linux machine was fine and got some fresh work when I woke it up. Rebooted this machine (and checked under Windows 10 at the same time), but still no fresh tasks downloading...
>
> As I've said before, the apparent bugginess of the project tends to cast a shadow on the results. If there is something wrong with the Rosetta@home projects on certain machines, then maybe all of the results need to be verified to make sure they ran on "safe" OSes?

It doesn't take any special knowledge to realise that after nearly 2 days of task unavailability, every live cruncher is trying to drag new tasks down to fill depleted buffers, so in the first few hours it'll be a race to create them sufficiently fast to meet that temporary exceptional demand. That doesn't translate into "apparent bugginess of the project", but rather into simple arithmetic. Whether tasks are available has no connection with the validity of results, yet you conflate the two. Neither does it imply preferential supply to some machines rather than others. All your machines have received tasks by now before any of them ran out. Same here too.

I've said this before for other reasons, but we make our machines available for whatever the project needs. We don't pay for tasks so we can't demand them. 24/7/365 availability of tasks has never been guaranteed. If the project doesn't utilise that resource, that's up to them. If any of us want to have our machines utilised 24/7 we're at liberty to hold a backup project. Take that approach and all problems disappear. I only mention the shortage of tasks because I don't actually want to run my backup project at all - by preference, not by necessity.

Jim1348
Joined: 19 Jan 06
Posts: 309
Credit: 10,556,676
RAC: 16,730
Message 88561 - Posted: 27 Mar 2018, 4:41:08 UTC - in response to Message 88559.  

> I've said this before for other reasons, but we make our machines available for whatever the project needs. We don't pay for tasks so we can't demand them. 24/7/365 availability of tasks has never been guaranteed. If the project doesn't utilise that resource, that's up to them. If any of us want to have our machines utilised 24/7 we're at liberty to hold a backup project.

Exactly so. You can of course never precisely match the supply of work units with the requests by the crunchers for them, and it is not the duty of the scientists to provide us a pastime. The scientists do what they have to do. We are here to assist them.

niswes
Joined: 21 Jun 09
Posts: 2
Credit: 3,405,317
RAC: 1,401
Message 88564 - Posted: 27 Mar 2018, 10:46:29 UTC

A statement from Rosetta staff would be nice.

JohnH
Joined: 25 Mar 13
Posts: 41
Credit: 1,792,376
RAC: 788
Message 88566 - Posted: 27 Mar 2018, 12:26:04 UTC - in response to Message 88564.  

Well said. They always seem conspicuous by their absence from these boards.

JohnH
Joined: 25 Mar 13
Posts: 41
Credit: 1,792,376
RAC: 788
Message 88567 - Posted: 27 Mar 2018, 12:37:09 UTC

At the risk of appearing stupid - who can tell me the difference between these status elements?

Computing status
Work
Tasks ready to send: 18125

Tasks by application
Application     Unsent
Rosetta         1
Rosetta Mini    0

[VENETO] boboviz
Joined: 1 Dec 05
Posts: 949
Credit: 3,615,847
RAC: 819
Message 88568 - Posted: 27 Mar 2018, 12:47:52 UTC - in response to Message 88564.  

> A statement from Rosetta staff would be nice.


+1

Warped
Joined: 15 Jan 06
Posts: 47
Credit: 1,505,729
RAC: 0
Message 88571 - Posted: 27 Mar 2018, 18:40:32 UTC - in response to Message 88568.  

>> A statement from Rosetta staff would be nice.
>
> +1

+2

VO
Joined: 4 Nov 05
Posts: 7
Credit: 3,158,878
RAC: 1,133
Message 88572 - Posted: 27 Mar 2018, 19:19:28 UTC

Linux only, I think.

shanen
Joined: 16 Apr 14
Posts: 187
Credit: 11,682,361
RAC: 6,048
Message 88577 - Posted: 28 Mar 2018, 10:08:42 UTC

Back again, apparently affecting all types of machines. The server status page shows very few unsent units (with the requisite scrolling).

I still think I saw sufficient evidence the other day to suggest there was something different going on among the different OS/browser combinations.
#1 Freedom = (Meaningful - Constrained) Choice{5} != (Beer^3 | Speech)

Jim1348
Joined: 19 Jan 06
Posts: 309
Credit: 10,556,676
RAC: 16,730
Message 88579 - Posted: 28 Mar 2018, 12:48:00 UTC

I run the 24-hour work units, and haven't gotten any work for a bit longer than that. So I am all on my backup project, GPUGrid - Quantum Chemistry, which is relatively new, but Linux only, and runs multi-core.

[VENETO] boboviz
Joined: 1 Dec 05
Posts: 949
Credit: 3,615,847
RAC: 819
Message 88592 - Posted: 30 Mar 2018, 5:24:51 UTC

Sooner or later the queue will restart.
I hope.

[VENETO] boboviz
Joined: 1 Dec 05
Posts: 949
Credit: 3,615,847
RAC: 819
Message 88594 - Posted: 30 Mar 2018, 13:13:56 UTC - in response to Message 88559.  
Last modified: 30 Mar 2018, 13:16:18 UTC

> I've said this before for other reasons, but we make our machines available for whatever the project needs. We don't pay for tasks so we can't demand them. 24/7/365 availability of tasks has never been guaranteed. If the project doesn't utilise that resource, that's up to them.


I agree. But if admins write two lines to explain the situation (for example: "hey, guys, we have problems with scheduler")...

Jim1348
Joined: 19 Jan 06
Posts: 309
Credit: 10,556,676
RAC: 16,730
Message 88595 - Posted: 30 Mar 2018, 14:15:53 UTC - in response to Message 88594.  

> I agree. But if admins write two lines to explain the situation (for example: "hey, guys, we have problems with scheduler")...

That would be useful for our planning purposes. A temporary glitch is different from a long-term shortage, and we could make arrangements accordingly.

JohnH
Joined: 25 Mar 13
Posts: 41
Credit: 1,792,376
RAC: 788
Message 88596 - Posted: 30 Mar 2018, 15:08:31 UTC - in response to Message 88594.  

> I agree. But if admins write two lines to explain the situation (for example: "hey, guys, we have problems with scheduler")...


True dat

sbirdsill
Joined: 27 Mar 13
Posts: 2
Credit: 4,381,928
RAC: 704
Message 88598 - Posted: 30 Mar 2018, 16:03:36 UTC

Some of my computers were crunching Rosetta for a little while, but stopped again. I've since installed Folding@home on the more highfalutin machines for now. That way, they're at least doing something worthwhile.

JohnH
Joined: 25 Mar 13
Posts: 41
Credit: 1,792,376
RAC: 788
Message 88601 - Posted: 30 Mar 2018, 21:44:40 UTC

Looks like we're back running ... wonder how long until next "blockage"

Sid Celery
Joined: 11 Feb 08
Posts: 984
Credit: 22,553,222
RAC: 13,824
Message 88602 - Posted: 31 Mar 2018, 2:51:19 UTC - in response to Message 88594.  

>> I've said this before for other reasons, but we make our machines available for whatever the project needs. We don't pay for tasks so we can't demand them. 24/7/365 availability of tasks has never been guaranteed. If the project doesn't utilise that resource, that's up to them.
>
> I agree. But if admins write two lines to explain the situation (for example: "hey, guys, we have problems with scheduler")...

I'm trying to think what the next few words would be after "..." and I can only really come up with "it wouldn't make the slightest difference to anything"

My current issue is to manage down the tasks from my backup project to make space for Rosetta tasks again.

shanen
Joined: 16 Apr 14
Posts: 187
Credit: 11,682,361
RAC: 6,048
Message 88611 - Posted: 2 Apr 2018, 3:36:42 UTC

Just stopped by to see if there was any explanation of the recent outages or for the increasing problem with "computation errors" that terminate long-running tasks... Used to be the computation errors usually happened within a few minutes of starting, but I just saw another as the task approached 8 hours.

As usual, I was unable to find much substantive information in these forums, but perhaps that is mostly a visibility-and-search problem for the information that might exist somewhere on the website. Perhaps I have actually come to prefer the "We don't care, so you shouldn't worry either" attitude of this project? It would be nice to know if I get any credit at all for 8 hours of computation that ends with a "computation error" and it would be nice to know if the computation errors were related to particular hardware or OSes, but if they don't care, why should I?

I guess from a BOINC-level perspective the solution is to run several projects. I've actually run a number of them over the years, but most of them were more or less problematic, so that approach doesn't much appeal to me.
#1 Freedom = (Meaningful - Constrained) Choice{5} != (Beer^3 | Speech)

Jim1348
Joined: 19 Jan 06
Posts: 309
Credit: 10,556,676
RAC: 16,730
Message 88614 - Posted: 2 Apr 2018, 12:16:46 UTC - in response to Message 88611.  

> Perhaps I have actually come to prefer the "We don't care, so you shouldn't worry either" attitude of this project? It would be nice to know if I get any credit at all for 8 hours of computation that ends with a "computation error" and it would be nice to know if the computation errors were related to particular hardware or OSes, but if they don't care, why should I?
>
> I guess from a BOINC-level perspective the solution is to run several projects. I've actually run a number of them over the years, but most of them were more or less problematic, so that approach doesn't much appeal to me.

Let's just say that they don't find communicating with users to be an efficient use of their time. They might be right.

If you want trouble-free, there is really only World Community Grid. I run a lot of others too of course, but set my expectations accordingly.

Sid Celery
Joined: 11 Feb 08
Posts: 984
Credit: 22,553,222
RAC: 13,824
Message 88615 - Posted: 2 Apr 2018, 14:05:13 UTC - in response to Message 88614.  
Last modified: 2 Apr 2018, 14:06:26 UTC

>> Perhaps I have actually come to prefer the "We don't care, so you shouldn't worry either" attitude of this project? It would be nice to know if I get any credit at all for 8 hours of computation that ends with a "computation error" and it would be nice to know if the computation errors were related to particular hardware or OSes, but if they don't care, why should I?
>>
>> I guess from a BOINC-level perspective the solution is to run several projects. I've actually run a number of them over the years, but most of them were more or less problematic, so that approach doesn't much appeal to me.
>
> Let's just say that they don't find communicating with users to be an efficient use of their time. They might be right.
>
> If you want trouble-free, there is really only World Community Grid. I run a lot of others too of course, but set my expectations accordingly.

I would've said the same, except that in the recent period where I've run a lot of WCG tasks I've had 6 errors, all from one sub-project (MIP), although all of them were validly completed by the user to whom they were reissued.

This was on my new Intel i3-8350K desktop, and not at all on my AMD FX8370, which itself has occasional issues with Rosetta 4.07 tasks (but not mini Rosetta 3.78 tasks). However, both are overclocked, so maybe those particular tasks are making specific demands that find the cracks at the outer extremes of my machines, or the errors occur during crashes, power losses, etc. I have a flaky laptop that has occasional errors too, but my non-overclocked, non-flaky devices produce none. That's a pretty big clue as to where my issues originate and explains why I don't begin by blaming something else for my own self-inflicted problems.

As such, demanding to find a cause at the project end seems to be a futile exercise, when it's just as likely (if not more so) that it's caused at the user end. So then it's just as legitimate a question for shanen to ask himself what's happening at his end that might explain his computation errors. Do those machines survive a stress test, for example? That would be my first port of call before repeatedly blaming somewhere else.

©2020 University of Washington
http://www.bakerlab.org