no new tasks?

Message boards : Number crunching : no new tasks?

tomduly

Joined: 5 Apr 20
Posts: 3
Credit: 578,397
RAC: 148
Message 97494 - Posted: 22 Jun 2020, 16:46:27 UTC
Last modified: 22 Jun 2020, 16:47:01 UTC

Rosetta@home is currently not providing new WUs to the volunteer community.
Unfortunately, no reason has been given.
Gigaflops of computing power are sitting idle.

Dear Rosetta team: please be aware that providing distributed computing power for free isn't free for most of us, since it costs a significant amount of electricity that we have to pay for.
Keeping the community uninformed is... disrespectful.

Tom

yoerik
Joined: 24 Mar 20
Posts: 128
Credit: 168,390
RAC: 0
Message 97501 - Posted: 22 Jun 2020, 19:24:23 UTC - in response to Message 97494.  

I hate to complain - but several volunteers have been raising issues with the lack of communication from the project to us for a while now.
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=13533&postid=96797

It's good that only work that helps the project(s) is run, but it's not good to run out of WUs completely without notice.

I hate continuing to compare them to WCG - their communication isn't much better, but shortages on subprojects are noted well in advance.

mikey
Joined: 5 Jan 06
Posts: 1863
Credit: 5,980,047
RAC: 123
Message 97504 - Posted: 22 Jun 2020, 22:03:47 UTC - in response to Message 97501.  

Neither one of you guys would do well in the days before BOINC. SETI, the ONLY project, went down for over a week because someone stole the copper phone lines (yes, dialup days), and NO ONE from SETI said one word until they came back online after the phone company replaced the line. YES, the website was still running, but there were NO communications whatsoever from them to us crunchers; they DID tell a few users individually, and that's how we all found out. Even later during its run, when SETI reissued the exact same workunits over and over and over again, SETI never said a thing until they got caught, and after denying it for weeks finally admitted it was because 'they ran out of workunits and thought people would leave the project if they didn't have workunits to crunch'. NO, the reissued workunits were NOT reprocessed by the project; the results were thrown away and the workunits were sent out again and again. They did eventually apologize, but it was seen as weak and waaaay too late.

Rosetta is dependent on outside vendors to supply tasks to them; when that doesn't happen, they run out of tasks. One thing to know, though, is that within a few days they always get new workunits again. So give it time, guys, and crunch for some other BOINC project right now. This is NOT a once-in-a-lifetime thing for them; no, they have no clue it's about to happen, and no, they have no clue when the units will flow again. World Community Grid has LOTS of their OpenPandemics - COVID-19 workunits available to crunch in the meantime.

yoerik
Joined: 24 Mar 20
Posts: 128
Credit: 168,390
RAC: 0
Message 97508 - Posted: 23 Jun 2020, 1:22:55 UTC - in response to Message 97504.  
Last modified: 23 Jun 2020, 1:24:06 UTC

Neither one of you guys would do well in the days before BOINC. SETI, the ONLY project, went down for over a week because someone stole the copper phone lines (yes, dialup days), and NO ONE from SETI said one word until they came back online after the phone company replaced the line. YES, the website was still running, but there were NO communications whatsoever from them to us crunchers; they DID tell a few users individually, and that's how we all found out. Even later during its run, when SETI reissued the exact same workunits over and over and over again, SETI never said a thing until they got caught, and after denying it for weeks finally admitted it was because 'they ran out of workunits and thought people would leave the project if they didn't have workunits to crunch'. NO, the reissued workunits were NOT reprocessed by the project; the results were thrown away and the workunits were sent out again and again. They did eventually apologize, but it was seen as weak and waaaay too late.

Rosetta is dependent on outside vendors to supply tasks to them; when that doesn't happen, they run out of tasks. One thing to know, though, is that within a few days they always get new workunits again. So give it time, guys, and crunch for some other BOINC project right now. This is NOT a once-in-a-lifetime thing for them; no, they have no clue it's about to happen, and no, they have no clue when the units will flow again. World Community Grid has LOTS of their OpenPandemics - COVID-19 workunits available to crunch in the meantime.


And that's fair - but we're not in the age of dialup anymore. It's more than reasonable to expect some level of communication from the project that we are spending time, money and resources to support. I'm not whining, or demanding more tasks. I have specifically said, multiple times across the forums, that I'm glad they're not wasting resources by giving us bad WUs.
But there hasn't been a news posting since early May, and it's nearly July. That seems to be par for the course for Rosetta, at a glance at past news threads - but it is more than reasonable to ask for a few sentences a month to keep us up to date on the project's work, and for an announcement pushed out to the clients so that non-forum users (of which there are clearly plenty) are aware of the shortage and don't panic when they suddenly stop getting tasks. Not everyone here is a computer expert - I'm certainly not. But I also know that stringing a few sentences together isn't difficult.

I was ridiculed here before during the last potential shortage in the beginning of May. It is certainly not my intention to whine or complain - and I'm sorry if I'm coming off that way, but what OP asks for - and what other users asked for in the news post I linked prior - is not much to ask. It takes a minute of one person's time to keep us informed - and the project is either unaware or unwilling of how important communication is, especially when it requires such little effort.

[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1477
Credit: 6,021,364
RAC: 5,144
Message 97511 - Posted: 23 Jun 2020, 6:58:13 UTC - in response to Message 97504.  

Neither one of you guys would do well in the days before BOINC. SETI, the ONLY project, went down for over a week because someone stole the copper phone lines (yes, dialup days), and NO ONE from SETI said one word until they came back online after the phone company replaced the line. YES, the website was still running, but there were NO communications whatsoever from them to us crunchers; they DID tell a few users individually, and that's how we all found out.

I remember... but this is not a justification.

Rosetta is dependent on outside vendors to supply tasks to them; when that doesn't happen, they run out of tasks.

Only the Robetta work comes from outside. The other simulations originate inside the Baker Lab.

Maybe they have hardware problems
Maybe they want to "clean up" the queues to start new simulation campaigns
Maybe they are analyzing results before issuing new work
Maybe

And volunteers go to other projects and, maybe, they will not return

[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1477
Credit: 6,021,364
RAC: 5,144
Message 97512 - Posted: 23 Jun 2020, 6:59:58 UTC - in response to Message 97508.  

It takes a minute of one person's time to keep us informed - and the project is either unaware or unwilling of how important communication is, especially when it requires such little effort.

+1

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1074
Credit: 12,211,048
RAC: 23,046
Message 97518 - Posted: 23 Jun 2020, 7:49:19 UTC - in response to Message 97512.  

It takes a minute of one person's time to keep us informed - and the project is either unaware or unwilling of how important communication is, especially when it requires such little effort.
Or maybe it's night time over there & everyone is still asleep?
Grant
Darwin NT

gmercato
Joined: 2 May 20
Posts: 9
Credit: 861,494
RAC: 2,118
Message 97519 - Posted: 23 Jun 2020, 7:58:06 UTC

Right, it's 00:57 in Seattle, WA now. Probably nothing is going to happen for the next 7 hours.

[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1477
Credit: 6,021,364
RAC: 5,144
Message 97524 - Posted: 23 Jun 2020, 10:34:31 UTC - in response to Message 97518.  

Or maybe it's night time over there & everyone is still asleep?

The queue has been empty since yesterday. Over 24h of interruption...

Don_Herres
Joined: 24 May 20
Posts: 1
Credit: 596,887
RAC: 990
Message 97527 - Posted: 23 Jun 2020, 12:12:35 UTC - in response to Message 97512.  

Depends on what part of the system is down. It's the usual problem of not being able to send a message because the messaging system itself does not work.

Daedalus
Joined: 1 Aug 08
Posts: 36
Credit: 9,153,516
RAC: 6,719
Message 97528 - Posted: 23 Jun 2020, 12:57:54 UTC
Last modified: 23 Jun 2020, 13:08:33 UTC

I don't know if it is related, but BOINC on my Android smartphone has been unable to communicate with R@h for days. Soon after, the same happened with Einstein@home. I doubt it is related, but there was no warning whatsoever.

Edit: That said, I crunch for Folding@home, which is far from being useless. I used to alternate between projects.

CIA
Joined: 3 May 07
Posts: 100
Credit: 21,056,786
RAC: 0
Message 97530 - Posted: 23 Jun 2020, 15:37:55 UTC
Last modified: 23 Jun 2020, 15:42:34 UTC

Depending on the day I have ~90 cores spread across several machines running Rosetta. Even with the queue mostly empty currently I have ~90 tasks total.

Most machines are set to 24-hour crunch times vs the default 8, and they all have a bare-minimum (.1, .1) or zero cache. So even with the WU drought, things are still coming through. The primary work machine I'm typing this on only had 6 of 24 cores crunching this morning when I came into work, but about 20 minutes ago it randomly grabbed 18 more WUs on its own, all with the same naming convention. Checking that computer's tasks here on the site, all the WUs I just got aren't re-issued ones, so new work is trickling out. Just slowly.
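
A rough sketch of why longer crunch times help during a shortage: with a fixed number of cores, the number of fresh tasks a host needs each day scales inversely with the run time per task. The function name is hypothetical, and the model assumes that every task runs for exactly the target run time and that all cores crunch around the clock, which real hosts will only approximate. The 90-core, 8-hour and 24-hour figures are taken from the post above.

```python
def tasks_needed_per_day(cores: int, runtime_hours: float) -> float:
    """Tasks a host must download per day to keep every core busy.

    Assumption (illustrative only): each task runs for exactly
    runtime_hours and every core crunches 24 hours a day.
    """
    if cores < 0 or runtime_hours <= 0:
        raise ValueError("cores must be >= 0 and runtime_hours must be > 0")
    return cores * 24 / runtime_hours


if __name__ == "__main__":
    # Compare the default 8-hour run time with the 24-hour setting.
    for hours in (8, 24):
        needed = tasks_needed_per_day(90, hours)
        print(f"{hours:>2} h run time: {needed:.0f} tasks/day")
```

Under these assumptions, 90 cores need about 270 tasks a day at the 8-hour default but only about 90 at 24 hours, so a thin trickle of new work goes three times as far.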

/edit. I will add that in the last few days I've seen a lot of "error while downloading" messages spread across all my machines. 21 errors and counting.

gmercato
Joined: 2 May 20
Posts: 9
Credit: 861,494
RAC: 2,118
Message 97531 - Posted: 23 Jun 2020, 17:19:20 UTC

My Intel cores are working for Folding@home now too, but Rosetta seems to have the only ARM64 Linux client.

So my Pi4s will have to wait for the Rosetta queue.

shanen
Joined: 16 Apr 14
Posts: 195
Credit: 12,654,673
RAC: 0
Message 97532 - Posted: 23 Jun 2020, 19:12:41 UTC - in response to Message 97494.  

Might be hard to believe, but some of the other BOINC projects have even worse project management than this one. The management bottleneck on this one seems to be focused on one or two people.

I've run a number of projects over the decades... I actually started with SETI@home before BOINC was created.

Then again this one does seem to be entering its death throes. I tested out another project with one of my machines, but didn't like it much, so I haven't switched yet. Anyone have any recommendations? WCG keeps bugging me to come back, so maybe they've cleaned up their act.
#1 Freedom = (Meaningful - Constrained) Choice{5} != (Beer^3 | Speech)

Jim1348
Joined: 19 Jan 06
Posts: 660
Credit: 47,280,720
RAC: 62,603
Message 97534 - Posted: 23 Jun 2020, 19:37:52 UTC - in response to Message 97532.  

Then again this one does seem to be entering its death throes. I tested out another project with one of my machines, but didn't like it much, so I haven't switched yet. Anyone have any recommendations? WCG keeps bugging me to come back, so maybe they've cleaned up their act.

They are definitely NOT in their death throes, just the usual poor communications. It is probably because they have so many researchers submitting work from various locations that they have no idea what is coming in. At least that is the charitable interpretation.

As for WCG, it is not a bad idea. I am doing Africa Rainfall Project (ARP), which runs well on my Ryzen 3000 series; around 12 to 16 hours.
You could do COVID-19 (and beyond) research with OPN also, but they are developing a GPU version which should be out before long. I would wait for that.

manalog
Joined: 8 Apr 15
Posts: 24
Credit: 227,409
RAC: 1
Message 97536 - Posted: 23 Jun 2020, 20:35:02 UTC - in response to Message 97531.  
Last modified: 23 Jun 2020, 20:39:09 UTC

Put them on TN-Grid (https://gene.disi.unitn.it/test); it works very well on ARM, even if only in 32-bit (if I remember correctly). If you want, you could compile the source for your platform and share the binary with the others. It's extremely easy. Other projects for ARM are: WCG OPN, Universe, LHC SixTrack, Einstein.
EDIT: I've just checked and there is actually a 64-bit ARM version of TN-Grid.

Sid Celery
Joined: 11 Feb 08
Posts: 1693
Credit: 31,654,782
RAC: 19,828
Message 97540 - Posted: 23 Jun 2020, 21:59:28 UTC - in response to Message 97534.  

Then again this one does seem to be entering its death throes. I tested out another project with one of my machines, but didn't like it much, so I haven't switched yet.
Anyone have any recommendations? WCG keeps bugging me to come back, so maybe they've cleaned up their act.

They are definitely NOT in their death throes, just the usual poor communications.
It is probably because they have so many researchers submitting work from various locations that they have no idea what is coming in. At least that is the charitable interpretation.

As for WCG, it is not a bad idea. I am doing Africa Rainfall Project (ARP), which runs well on my Ryzen 3000 series; around 12 to 16 hours.

Agree with pretty much all that - that's how things are here.

Usually Rosetta runs pretty well. Task shortages come once or twice per year. Maybe 3 at most. One of those times is either Christmas or shortly after returning from the holidays in January.
In March, everyone and their dog turned up here - number of hosts increased x10 and the additions were hefty in CPU power, so all tasks got wiped out and it took 7-10 days to sort it out. Is this the first shortage since?

I dug up a message from late March, some of which (point 3) doesn't apply any more, but most does. Worth re-posting it as a reminder of what's involved in bringing tasks to us.

Yes indeed. We burned through all the work units.

We're currently facing 3 issues that sort of made us run out:
1. The compute power is incredible. I queued up something 2 days ago that would take more than a week to run locally and it all picked up in like 18 hours.
2. Creating these jobs takes human and CPU time on our end and we're stretched thin. I'm working my hardest to get stuff pushed through the pipeline, but for a lot of these, I have to develop new software to get the jobs set up correctly. Additionally, some stuff we can't run on R@H and so it has to be precomputed.
3. We almost have the new update out, but not yet. We're updating Rosetta by about 2 years. Once this happens, we'll be able to do interface design and the newer members of the lab who do protein design will have a much easier time submitting work. As it stands, doing design on R@H requires me to use a pretty old Rosetta without all the newest features.

Once we get interface design going, we should hopefully be able to bring you a lot more work. It'll all finally make sense why we've been making these "scaffold" proteins.

But, to give you an idea, queueing up a full day of Rosetta design on Boinc these days is really tough on our end. We'll say each WU needs 10 structures, and R@H is going through 1M per day. That's 10M structures I need to produce or about 700GB of data. And that's to queue 1 day of work.

The structure prediction stuff is a lot easier to queue. You only have to upload 1 structure and then say I want 10M outputs. So you send the same WU 1M times.
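
The arithmetic in the quoted message can be checked with a short sketch. The three input figures (10 structures per WU, 1M WUs per day, about 700GB) are taken from the message; the function name, the use of decimal units (1 GB = 1,000,000 KB) and the derived per-structure size are assumptions for illustration, not numbers the project has stated.

```python
WUS_PER_DAY = 1_000_000   # "R@H is going through 1M per day"
STRUCTURES_PER_WU = 10    # "each WU needs 10 structures"
TOTAL_DATA_GB = 700       # "about 700GB of data"


def design_queue_cost(wus_per_day: int, structures_per_wu: int,
                      total_data_gb: float) -> tuple[int, float]:
    """Return (structures needed per day, implied KB per structure).

    Assumes decimal units: 1 GB = 1_000_000 KB.
    """
    structures = wus_per_day * structures_per_wu
    kb_per_structure = total_data_gb * 1_000_000 / structures
    return structures, kb_per_structure


if __name__ == "__main__":
    structures, kb_each = design_queue_cost(
        WUS_PER_DAY, STRUCTURES_PER_WU, TOTAL_DATA_GB
    )
    print(f"{structures:,} input structures per day, "
          f"roughly {kb_each:.0f} KB each")
```

On these figures a design campaign needs 10 million distinct input structures for a single day of work, at roughly 70 KB apiece, whereas the structure prediction case described in the message uploads one structure and reuses it across all WUs.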

manalog
Joined: 8 Apr 15
Posts: 24
Credit: 227,409
RAC: 1
Message 97548 - Posted: 24 Jun 2020, 0:28:39 UTC
Last modified: 24 Jun 2020, 0:44:27 UTC

Just got some new tasks with this name:
DL_TrR_PSSM_plus_rosetta1998881_0001_0001_fragments_abinitio_SAVE_ALL_OUT_950049_595_0


DeepLearning TrRosetta.
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=14042
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=13921
PSSM: https://en.wikipedia.org/wiki/Position_weight_matrix
IMHO prepare for a breakthrough in R@h after this pseudo-break ;)

Jim1348
Joined: 19 Jan 06
Posts: 660
Credit: 47,280,720
RAC: 62,603
Message 97550 - Posted: 24 Jun 2020, 1:39:23 UTC - in response to Message 97548.  

Just got some new tasks with this name:
DL_TrR_PSSM_plus_rosetta1998881_0001_0001_fragments_abinitio_SAVE_ALL_OUT_950049_595_0

Yes, I got some too. They are back.

Endgame124
Joined: 19 Mar 20
Posts: 63
Credit: 13,677,611
RAC: 33,380
Message 97551 - Posted: 24 Jun 2020, 2:56:21 UTC

Looks like about 30000 tasks ready to download. Some of my hosts that were out of work have started to pick some up, and I’m guessing I’ll have work across the board by morning.



©2021 University of Washington
https://www.bakerlab.org