Posts by TPCBF

1) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 109014)
Posted 17 Mar 2024 by TPCBF
Post:
Please report any issues with work units in this thread.

All recent WUs on at least 3 different hosts got cut short with "error while computing"... :(
2) Message boards : Number crunching : Stuck on uploading is a new problem? (Message 81482)
Posted 17 Apr 2017 by TPCBF
Post:
As mentioned before, I doubt that this is something networking/IP related or protocol/IDS related, as WUs from the same host get uploaded just fine.

And it is certainly not Windows 10 related, as someone else mentioned in his response, as in my case, the host in question is Windows 7 Pro/64 bit...

And those 3 WUs on the other laptop that I mentioned now show up with a validation error, strangely enough...

Ralf
3) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 81467)
Posted 17 Apr 2017 by TPCBF
Post:
I have seen so far one WU that is stuck, though I can't remotely check all the hosts that are running R@H.
On that one host where I noticed this since Friday, other WUs are uploading fine.
And the one that gets stuck is trying to upload but as far as I watched some forced upload retries, it craps out at various amounts of data, between 3KB and 32KB, out of 739.36KB.

The WU in question is https://boinc.bakerlab.org/rosetta/workunit.php?wuid=820755819.

I now see that there are actually a few more WUs sent on the same date that should have all been returned by now, on other hosts as well...
EDIT: Actually, I just checked and there are exactly 3 more WUs from the same date (one is from early morning the next day, 3/12) on one other host, a laptop that I can't check remotely. However, that same laptop has successfully received and returned WUs sent after 3/11 and 3/12...

As other WUs are uploading just fine, even on the same machine, I cannot think of a reason why a networking issue at UW should be causing this... :?

Ralf
4) Message boards : Number crunching : Stuck on uploading is a new problem? (Message 81466)
Posted 17 Apr 2017 by TPCBF
Post:
Have one WU stuck on upload since at least Friday. Other Rosetta tasks seem to complete and upload fine.
If this problem is so elusive, why aren't there any admins/programmers actively communicating with folks that have this problem in order to solve it?

Ralf
5) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 78673)
Posted 2 Sep 2015 by TPCBF
Post:
Two of our servers went down again. We are currently looking into it. Hold tight.


All server status shows disabled except for the data-driven web pages, yet my completed tasks are being successfully uploaded. However, I am not receiving credit for any of the uploads. Will I receive credit for them later?
Well, server status shows all green but for "file deleter".

The account info shows updated stats but it looks like the stats XML files aren't generated as none of the external stats sites are showing any updates for now...

Ralf
6) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 78655)
Posted 31 Aug 2015 by TPCBF
Post:
Well, at least two hosts were able to upload again. Let's see how it all looks in a few hours...

Ralf
7) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 78641)
Posted 31 Aug 2015 by TPCBF
Post:
I think this may be more about the fact that nobody from the project cares to inform its donors about the problem.

The server status page already informs everybody who cares to look at it that something is wrong with the servers (especially since it's not even updating anymore). That's what this page is for, to inform us about the servers.

What could be improved are the messages from the servers. Currently they just say "No work sent". Yeah, I can see that from the previous line, "Scheduler request completed: got 0 new tasks". So instead of this completely useless message, they should send "Project has no tasks available"; then I don't even need to look at the SSP to find out why I'm not getting any work.
Well, that there is something wrong with the servers, that's kind of the obvious part.

But what has been missing from this project (for years!) is basic communication that those responsible for those servers are aware that there is something wrong with them, and at least some basic info that they will be working on it.
I work as a sysadmin myself, and I know that it isn't THAT hard to be kept up to date on server status and in particular on issues that require attention. There are plenty of Open Source tools out there for things like this.

It's about noon on Monday, by this time someone should have been able to take a look at it and post a message "Hey guys, we're on it, it just might take a while" instead of just remaining incommunicado...

Ralf
8) Message boards : Number crunching : Minirosetta 3.59 (Message 78387)
Posted 30 Jun 2015 by TPCBF
Post:
Please report which BOINC version is installed and which host you are making your observations on.
BOINC 7.4.42 on Windows 8.1/64 (4core i3/8GB of RAM).

Task ID 742059437 and Task ID 742059580.

After restarting BOINC 3 or 4 times and shutting down and restarting my laptop a couple of times since yesterday, both WUs are currently working again, but I have the feeling that they will stop and do the same thing all over again in a while...

Ralf


9) Message boards : Number crunching : Minirosetta 3.59 (Message 78375)
Posted 29 Jun 2015 by TPCBF
Post:
Those WUs seem to have some serious problems, at least here on my Windows 8.1 laptop (4core i3, 8GB of RAM)
They just rack up a few minutes of CPU time before kind of stopping, just "real" time advancing but with no apparent processing happening for hours...
And when restarting BOINC, they might pretend to work for a (very) short time, then crapping out with "Computation Error"...

Ralf



Are you seeing this problem with all WUs? There doesn't seem to be a general issue as far as I can tell so far.
Not all, but still weird. 6 other tasks seemed to have finished today fine, though I see messages about "exit status 0" and "you might have to reset the project if this continues" in the logs.

Ralf
Ok, here we go again. Yesterday, two WUs crapped out with a computation error after I stopped and restarted BOINC a couple of times. Today (actually, since last night) there are again two WUs that start fine, then just rack up "real time" without doing anything past the last checkpoint in terms of CPU time, just blocking two cores on this CPU for no good reason at all...

Ralf
10) Message boards : Number crunching : Minirosetta 3.59 (Message 78320)
Posted 17 Jun 2015 by TPCBF
Post:
Those WUs seem to have some serious problems, at least here on my Windows 8.1 laptop (4core i3, 8GB of RAM)
They just rack up a few minutes of CPU time before kind of stopping, just "real" time advancing but with no apparent processing happening for hours...
And when restarting BOINC, they might pretend to work for a (very) short time, then crapping out with "Computation Error"...

Ralf



Are you seeing this problem with all WUs? There doesn't seem to be a general issue as far as I can tell so far.
Not all, but still weird. 6 other tasks seemed to have finished today fine, though I see messages about "exit status 0" and "you might have to reset the project if this continues" in the logs.

Ralf
11) Message boards : Number crunching : Minirosetta 3.59 (Message 78314)
Posted 16 Jun 2015 by TPCBF
Post:
Those WUs seem to have some serious problems, at least here on my Windows 8.1 laptop (4core i3, 8GB of RAM)
They just rack up a few minutes of CPU time before kind of stopping, just "real" time advancing but with no apparent processing happening for hours...
And when restarting BOINC, they might pretend to work for a (very) short time, then crapping out with "Computation Error"...

Ralf
12) Message boards : Number crunching : Minirosetta 3.54 (Message 78040)
Posted 14 Mar 2015 by TPCBF
Post:
Well, there is total silence on the RALPH@Home site...
I got 4 WUs earlier today and all 4 are now stuck at various percentages, though running for more than 4 hours by now.


They ask you the url of wus with problem, but you don't reply
Who asked me where?
I replied to the question about the WU names a bit earlier, after I got back to check in here. The main problem for me is that I don't necessarily always have the time to babysit. And there has been no post on the RALPH forum (if that is what you are referring to) for more than 30 days. And not that they have ever been very communicative in the first place...

Ralf
13) Message boards : Number crunching : Minirosetta 3.54 (Message 78039)
Posted 14 Mar 2015 by TPCBF
Post:
Well, there is total silence on the RALPH@Home site...

I got 4 WUs earlier today and all 4 are now stuck at various percentages, though running for more than 4 hours by now.
Are they still supposed to do something useful? Right now they are just blocking everything else, but I would hate to abort them just yet...

Ralf



Are these Ralph jobs? Can you provide the task names?
Yes, they are RALPH jobs ;-)

I had suspended them for a while so some other project WUs could finish before their deadline and then resumed all 4 jobs. They seem to have picked up on resume just fine and two of them have already reported:

cb_mar11_dock_placestub_EEEH_1035_vegf_ProteinInterfaceDesign_20241_91_0_0
cb_mar11_dock_placestub_EEEH_1038_vegf_ProteinInterfaceDesign_20241_91_0_0

The other two are currently still crunching, named

cb_mar11_dock_placestub_EEEH_1037_vegf_ProteinInterfaceDesign_20241_91_0_0
cb_mar11_dock_placestub_EEEH_1036_vegf_ProteinInterfaceDesign_20241_91_0_0

and should finish within the next 15min or so...

Those previously got stuck for hours at 12% and 20%, IIRC.

Ralf
14) Message boards : Number crunching : Minirosetta 3.54 (Message 78028)
Posted 13 Mar 2015 by TPCBF
Post:
Well, there is total silence on the RALPH@Home site...

I got 4 WUs earlier today and all 4 are now stuck at various percentages, though running for more than 4 hours by now.
Are they still supposed to do something useful? Right now they are just blocking everything else, but I would hate to abort them just yet...

Ralf
15) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 77329)
Posted 12 Aug 2014 by TPCBF
Post:
Is anyone still having issues getting work?
No, now I am getting new WUs, more than before, but for some strange reason, SETI@HOME, which "shares" the CPU cycles on that machine, "isn't getting any"... :-(
16) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 77311)
Posted 9 Aug 2014 by TPCBF
Post:
Just keep getting "no work sent"... :-(
17) Message boards : Number crunching : Only 20 credits for 25,000 seconds (Message 75192)
Posted 2 Mar 2013 by TPCBF
Post:
Even worse :(

Whiskey Tango Foxtrot???
0.77 credits for 11k seconds (and apparently 73 decoys detected only to reset itself...)

Anyone care to explain?

Ralf
And another one.
First crunching happily and generating 59/59 decoys, then starts over to report another, single one for a mere 0.69 credits...

It would really be nice if someone has a reasonable explanation for this or even better, would try to fix this. This is really getting old... :(

Ralf
18) Message boards : Number crunching : Problems with uploading the results (Message 75168)
Posted 26 Feb 2013 by TPCBF
Post:
Yup welcome to the least communicative project in the BOINC community.

Yes they are...but they are WAAAAAAAAAAAAAAAAAAAAAAAAY better than Seti used to be!! I remember back in the VERY old days when someone stole the copper cable out of the ground at Seti, it was A WEEK before anyone at Seti said ANYTHING AT ALL!! They just didn't think about telling anyone! They had plenty of opportunities to tell people, they just didn't! They lost THOUSANDS of crunchers because of it!! Sooo it COULD be worse, but Rosetta IS at the bottom of the pile right now!
Well, that must have been at a time I wasn't active over there.
But since I rejoined SETI (a year after getting back into DC with R@H), their communication hasn't been 'perfect' (try WCG when you want to see something close to that), but here at R@H, it's simply ridiculous...

There might be issues that they can't immediately resolve due to resource constraints, but at the very least it should be possible to acknowledge any problem and give a heads up on a possible ETA for a fix. That takes mere minutes and is possible from anywhere in this day and age of smartphones.
It's this eerie silence here from anyone directly associated with the project that really p***** m* t** h** o**... :-(

Ralf
19) Message boards : Number crunching : Only 20 credits for 25,000 seconds (Message 75167)
Posted 26 Feb 2013 by TPCBF
Post:
Even worse :(

Whiskey Tango Foxtrot???
0.77 credits for 11k seconds (and apparently 73 decoys detected only to reset itself...)

Anyone care to explain?

Ralf
20) Message boards : Number crunching : Problems with uploading the results (Message 75137)
Posted 18 Feb 2013 by TPCBF
Post:
Mr. Baker seems to like basking in the limelight, but doesn't give a **** when it comes to how the project is maintained... :(


I don't think so, his CV speaks for itself (http://depts.washington.edu/bakerpg/drupal/node/336)
As I mentioned, limelight...
And if you think that rosetta has problems with communication, you don't know projects like correlizer, docking, etc....
I am participating in 5 different projects (Rosetta@Home & RALPH@Home, SETI@Home, WCG, Einstein@Home and since recently, Climateprediction) and R@H has pretty much no communication at all.

All projects can have technical issues; it's a matter of how they deal with them and whether they show at least basic communication with the folks doing the crunching for them. And that barely costs anything, mere minutes for at least someone from the project admins/techs/profs/etc. to acknowledge that they are aware of the problem.
Not to mention such gaffes as releasing bad applications untested in the wild and then taking a couple of days to try and remedy the situation...
Dr. Baker might be a good scientist, but given that his name is associated directly with all of this, he is a miserable communicator and project manager...

Ralf





©2024 University of Washington
https://www.bakerlab.org