Posts by TPCBF

41) Message boards : Number crunching : Validator down... :-( (Message 71480)
Posted 24 Oct 2011 by TPCBF
Post:
All my uploads have gone through and all previous uploads have been validated now.

Typically, the server status page shows bk1 is still down, but that's about par for the course.
The two WUs stuck on uploading finally moved and a couple of WUs have been validated overnight, but most of them still show "pending", so not much news on this end...

Ralf
42) Message boards : Number crunching : Validator down... :-( (Message 71470)
Posted 23 Oct 2011 by TPCBF
Post:
At least the laptop where I had added R@H again after it "seemed" that things are working ok for a few weeks has two previous finished WU's still sitting as "uploading"...
Subsequent WU's finished and uploaded fine, but those two just won't budge... :-(

In the meantime, the pending list keeps growing... :-(

Ralf
43) Message boards : Number crunching : Validator down... :-( (Message 71468)
Posted 23 Oct 2011 by TPCBF
Post:
Problem is now that not only the validator doesn't run but that you can not upload finished WU's either... :-(

Is that the case for you? Everything's uploaded here and new downloads coming down too. Just awaiting validation - I think for 14 hours.
1 WU made it mysteriously past the uploading and joined the rest of the previous WU waiting for validation. At least the laptop where I had added R@H again after it "seemed" that things are working ok for a few weeks has two previous finished WU's still sitting as "uploading"...
Get the occasional message that "there is no active Internet connection" (which is absolute bullcrap) now on that one too.
Have stopped R@H from receiving new WU's and added WCG instead. Will see what the R@H WUs currently running will do when they finish in about an hour, tried already to reboot to no avail...

Ralf
44) Message boards : Number crunching : Validator down... :-( (Message 71466)
Posted 23 Oct 2011 by TPCBF
Post:
Well, never a dull moment...

Does anyone know what the issue is here or is this (just) another "it's weekend and no sysadmin is around" kind of typical R@H thing again? :-(

Ralf

I would have to guess that server bk1 either has gone down or failed. All of the processes that are on bk1 are down, but the processes on the other servers are up.
Well, that's a very obvious guess...

Problem is now that not only the validator doesn't run but that you can not upload finished WU's either... :-(

And of course this all just happens to happen when I added another workstation back to crunching for R@H... :?

Ralf
45) Message boards : Number crunching : Validator down... :-( (Message 71461)
Posted 22 Oct 2011 by TPCBF
Post:
Well, never a dull moment...

Does anyone know what the issue is here or is this (just) another "it's weekend and no sysadmin is around" kind of typical R@H thing again? :-(

Ralf
46) Message boards : Number crunching : Web Site Updates (Message 71177)
Posted 1 Sep 2011 by TPCBF
Post:
That's the only (slight) grief about the web site and stats at WCG. They internally use a different point score scheme than BOINC. And that's explained quite clearly on the web site/FAQ...

Is it? Where's that? I went to the Forum. Everything was so bland & there was so much of it I didn't know where to start. So I didn't.
So much of what?
When you click on "Points Generated" on the "My Statistics" page, it explains the point score scheme.
Did you bother to check in their forum and read the FAQ or the BOINC Agent thread/sections in the support sections...
Why do they multiply everything by 7? Why don't they just divide everything by 7 again?
See above...
What else is not to understand?

I don't know what question there is to ask. I just grab the occasional WU, crunch it & send it back. Hopefully it does what the title of the sub-project says it does. I really wouldn't know.
Excuse me? I am more worried about whether anyone at Rosetta@Home is doing anything with the results that are being returned.

And it seems you can't be bothered to check the forums @WCG, I doubt that would leave any questions open...

Ralf
47) Message boards : Number crunching : Web Site Updates (Message 71160)
Posted 28 Aug 2011 by TPCBF
Post:
I've always said, that if rosetta had a website similar to that of WCG... many people would join, just because the website looks professional.

I divert a few resources to WCG and it does look professional but I'll also say I don't understand a single thing about it. I can't even equate my point scores with anything I see on boincstats. I simply don't use it at all.

Very professional looking though...
That's the only (slight) grief about the web site and stats at WCG. They internally use a different point score scheme than BOINC. And that's explained quite clearly on the web site/FAQ...

All credits shown on the "My Statistics" page are exactly 7x what the BOINC "credit score" is. However, on the "Result Status" page, it will show exactly the requested and granted BOINC credits as they will sum up...
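
For anyone who wants to line up the two numbers, the conversion can be sketched like this. It is only a rough illustration that assumes the factor of exactly 7 stated above; the function names are made up for this example and are not part of any WCG or BOINC tool.

```python
# Assumption: WCG "points" are exactly 7x the BOINC credit, as stated in the post.
WCG_POINTS_PER_BOINC_CREDIT = 7


def wcg_points_to_boinc_credit(points: float) -> float:
    """Convert a WCG "My Statistics" point total to BOINC credit."""
    return points / WCG_POINTS_PER_BOINC_CREDIT


def boinc_credit_to_wcg_points(credit: float) -> float:
    """Convert BOINC credit (as shown on "Result Status") to WCG points."""
    return credit * WCG_POINTS_PER_BOINC_CREDIT
```

So a "My Statistics" total of 700 points would correspond to 100 BOINC credits, which is the figure to compare against other BOINC stats sites.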

What else is not to understand?

Ralf
48) Message boards : Number crunching : Daily Limit? (Message 71121)
Posted 21 Aug 2011 by TPCBF
Post:
Oh there are LOTS of other Boinc projects out there that could use your help during your time of excess computing power:
http://www.distributedcomputing.info/projects.html

Those are just the active Distributed Computing projects with some being Boinc and some not, the Boinc ones are noted.
And that classification is misleading as all the World Community Grid projects for example are BOINC projects as well...

Ralf
49) Message boards : Number crunching : Lack of communication from project (Message 71051)
Posted 13 Aug 2011 by TPCBF
Post:
And we still have total silence and sealed keyboard lips from this project.
I guess no one reads the boards anymore.


somethings up.............. have not seen it like this since i started here


We COULD always start a rumor that Rosie is shutting down, that should bring them out of the woodwork!
:shock: You mean they didn't and are supposedly still operating?!?!?

Well, anyone who wants to see what proper communication on serious research projects looks like might want to take a look at the various WCG sub-project forums... ;-)

Ralf
50) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 70971)
Posted 8 Aug 2011 by TPCBF
Post:
We're aware that the queue is empty - a message has been sent out on the appropriate internal mailing list.
So why was this not posted on the front page where everyone can read it instead of buried deep in this topic?
Looks like nobody who cares got the memo... :-(

Ralf
51) Message boards : Number crunching : Lack of communication from project (Message 70967)
Posted 8 Aug 2011 by TPCBF
Post:
I'm curious about the experiences some of you have had with other projects, do they communicate better than Rosetta? I've only had experience with a couple other projects, and frankly Rosetta's communication is pretty good, comparatively speaking
After R@H had its server meltdown over the change of the year, I added World Community Grid projects, first to cover only the downtime of R@H. Have since added it to all my hosts and run it on a few exclusively.
When they know that there is a batch of WUs ending or a whole project is coming to an end, they post it on the project's forum days if not a couple of weeks in advance, so there is no doubt about what is going on.
Didn't really have a server outage that I am aware of, but then their sysadmins might be a bit 'closer to the ball' than for example those of R@H...

Ralf
52) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 70965)
Posted 8 Aug 2011 by TPCBF
Post:
And what you're saying is?

Ralf
53) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 70911)
Posted 4 Aug 2011 by TPCBF
Post:
I did not set a run time for R@H. I did not realize this was required.
It isn't required...
I figured a unit will run to completion. NO?
Unless there's apparently a bad batch of WU's, it will...

Ralf
54) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 70902)
Posted 4 Aug 2011 by TPCBF
Post:
What is your run time for R@H? 4,8 or longer hrs?
The program will/should run until your time limit is up and then stop crunching and compile and upload the information. The reporting you can do manually or it will do it itself automatically at a later time.
That is something that it is certainly not doing for me...

I had reported this "freezing" WU issue several weeks ago and there was no clear answer as to what causes this...

And I just checked on the WU's I recently got and about half of them show up with "compute error", half of them seem to go through fine...

I will stop those hosts still set to run R@H from receiving new tasks and work on WCG only for now until this whole mess hopefully settles. :-(
Just no point in wasting resources this way...

Ralf
55) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 70894)
Posted 4 Aug 2011 by TPCBF
Post:
@TCPBE -
"F" ;-)
Right up front I will say that during the year and a half that I have participated in the Rosetta, the developers and SysAdmins have set the bar for communications pretty low, and then consistently miss the mark.

Now I take a different position than my friends in the “Bangers and Mash” crowd who espouse the viewpoint that we, as volunteers, are due nothing, and we really don't have a leg to stand on when it comes to complaining.

It is my position that each and every one of us that regularly contribute cycles to the effort are full partners in the project. Unpaid partners to be sure, but partners none the less.
That's more or less my POV as well. And I don't think that it is too outlandish to ask for better communication from the side of the "project organizers", especially when such an "outage" was apparently known beforehand, as Rocco alluded to in his post two days into "the (non)event"...
However, there is no expectation that Rosetta is a “full employment” program for our systems. It is expected that from time to time there will be a pause in the work flow. Just keep us informed.
Eggsacktly...
This lack of communication actually caused me to withdraw from the project for a period of time. I believed then, as I believe now, that failing to keep us volunteer partners informed in a timely manner was not an option.
Well, for the time being, I have held off from disconnecting from the project. I don't run any GPUs and don't see a point in acting as one of those "credit mongers", I just see a possibility to participate in some potentially useful research by providing whatever spare processing power I can provide with my hosts...
So while R@H is in hibernation in the midst of summer, my hosts are still busy working for WCG projects with 100% of their available resources instead of the usual 50/50 split...
However, I have ** NEVER ** seen a post censored unless it was blatantly vulgar or contained a personal attack. And there were moments when I was pretty blunt with the moderator Mod.Sense (who at moments I referred to as Non.Sense)
...
If you really think that you had a critical post “censored” it is likely you crossed the line when it came to decorum or personal attacks.
Certainly not...

I am a forum admin in an Open Source software forum myself, so I think I know where "the line" is. And I certainly did not personally attack anyone, just expressed my frustration with the lackluster response (or better the complete lack thereof) from the project admins...
The post ID is 70835 and it seemed to me rather that this was a misguided attempt to try and quell some opposition to the preferred "head in the sand" attitude...

Ralf
56) Message boards : Number crunching : Newbie Q&A, if you're new, have a view! (Message 70892)
Posted 4 Aug 2011 by TPCBF
Post:
I think it is more interesting to know how long the task(s) have been running. By default a task will run for about 3 hours. If a task is stuck the watchdog will/should abort after an additional 4 hours. If a task is stuck (percentage complete does not increase), stopping and restarting BOINC will sometimes help.
I have (had) WUs coming in with a "time to completion" at download anywhere from 2h to 10h, possibly depending on the machine they are running.
And then at some point a couple of months ago, more or less randomly, some WU's would run fine as usual until anywhere from 1% to 60% completed, then they would just "sit" and the only thing moving was the "time to completion" counter. Had terminated those jobs after 20+h for what normally would have been a 4-5h job...

Disabled R@H on three hosts, attached to RALPH@Home on one machine instead (with the WUs that come in few and far between usually running without a hitch for 1.5-2h)...

Ralf
57) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 70891)
Posted 4 Aug 2011 by TPCBF
Post:
The project is out of work again.
Just checked the server status.
Only 1 task ready to go.
There hasn't really been any "work" since Friday, for almost 6 days now, I got maybe (half) a dozen WU's since then, on 6 or 7 hosts running R@H, probably some cleanup jobs...

This picture from the accumulated WU's at the "BOINC Combined Stats" site shows how much this has dropped since...

Again, my complaint is not that there are currently barely any WU's available but that the administrators for R@H don't have the guts/decency to tell (upfront) what the deal is/was.
Looked at first like a typical "Friday afternoon server barf with no sysadmin around" until Rocco posted two days later that they knew they would run out of jobs in the current project(s). Other projects like WCG announce those kinds of things weeks in advance...

And the fact that critical posts now even get censored in here shows that there isn't much hope of anything changing anytime soon... :(

Ralf
58) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 70875)
Posted 3 Aug 2011 by TPCBF
Post:
Hi everyone. I joined today. I added Rosetta to my Seti workload.

Joined around noon. 5 hours later, no work units. Not an issue as I have plenty of Seti WU to keep the computer crunching, but was wondering if it was just me.
No, it's not just you, this has been going on since Friday.
Apparently, according to a post by Rocco, they knew that they would run out of WU's and didn't bother to post an official notification. And it took some complaints to even get that post from Rocco two days later. A few WU's seem to drop in once in a while now, but it is by far not back to normal.
The powers that be at R@H don't seem to think it's necessary to tell anyone what's going on and for how long...

Ralf
59) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 70839)
Posted 1 Aug 2011 by TPCBF
Post:
Calm down everybody. We're trying to cure diseases and make science progress. A little patience. The project is doing excellent science, and while communication could be better sometimes, they're allowed to miss a few days here and there. We might not see it on our end, but I'm sure things are pretty hectic on the other side of the screen...
Sorry, but basic communication from the side of the researchers/sysadmins is an absolute must...
At least for me (and I am sure quite a few others will think similarly) the problem is not that there is an outage of WU's, regardless of whether knowingly (like this time) or by "accident" (as was the case the last few times since the change of the year), but that there is an absolute lack of communication from their side. They need the collaboration of the people running the WU's but time and time again, they don't seem to bother to keep those people informed. As I already said, a simple message on the home page or a quick note here in the forum, up front or within reasonable time, is all that it takes...

Ralf
60) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 70812)
Posted 30 Jul 2011 by TPCBF
Post:
Message from server: No work sent

Server Status: Ready to send: 1
Yup, same here. R@H's home page shows 0 queued jobs and server status pretends all servers are running fine...

Ralf





©2024 University of Washington
https://www.bakerlab.org