Message boards : Number crunching : MiniRosetta 3.17 Problems. (Message 71522)
Posted 28 Oct 2011 by pieface
I really don't mind the small things like the DrSOP problem. They tie up some resources for the download and then the upload, but I don't get charged extra for that. During the same timeframe, though, I also had something like a dozen ProteinInterfaceDesign and Ploop2x3 units run to their full allotted time (6 hrs or so, depending on how watchdog was feeling), and when the validator finally got caught up they were marked as invalid. I had some of these on both machines I had crunching Rosetta: one is a Win XP x64 system and the other a Win7 box, no overclocking at all. Here are a couple of examples - any ideas, or did anyone else get those kinds of results in this last batch?


note: edited to take out 'over the weekend'.
Message boards : Number crunching : Anyone running Milkyway in addition to Rosetta? (Message 70723)
Posted 14 Jul 2011 by pieface
It's been very hot and humid here in upstate New York the past few days.

The people who run things on the RPI campus where Milky Way is home-based announced that when it got too hot they were going to shut down the air conditioning to various buildings on campus to save energy (as many students are home for the summer). So, Milky Way may be up/down with little or no notice depending on the weather.

Better today, in the 50's overnite and 60's (F) this morning. Maybe they will be able to get things going later today.
Message boards : Number crunching : Whats up with Ralph? (Message 64022)
Posted 11 Nov 2009 by pieface
Are the servers down or something? I haven't been able to upload or connect all day...
Message boards : Number crunching : Hughe Upload data sizes / Upload Problems ? (Message 58873)
Posted 17 Jan 2009 by pieface
I have had one of the small guys (~48 KB) that's been stuck trying to upload all nite long, so unless there are a bunch of you folks with those huge guys jamming things up, it may just be that the upload server is fubar?
Message boards : Number crunching : Problems with version 5.90/5.91 (Message 50149)
Posted 28 Dec 2007 by pieface
Something is still fishy with 5.90. I just had two units run for 24 hrs without ever finishing a single decoy. Sounds like some kinda loop-de-loop going on, or it just doesn't know when to say a decoy is 'complete'. Good luck to the next boxes running these two:

wu 117532625
wu 117529650
Message boards : Number crunching : Whats up with Ralph website? (Message 49141)
Posted 28 Nov 2007 by pieface
Anyone know why Ralph has been down all day?
Message boards : Number crunching : Problems with Rosetta version 5.54 (Message 38287)
Posted 25 Mar 2007 by pieface
Been a while since I've seen an error on rosie, but got one here:

RESID 69376321

ERROR:: Exit at: line:761

Rosie 5.54, on DOCKING_3rhj_SYMM_13rhj_1_d.hom027_top10.out.1_1628_270_0
Message boards : Number crunching : Who are you talking to???? (Message 21134)
Posted 25 Jul 2006 by pieface
numbers numbers numbers....
I guess I am one of the lurkers, I visit fairly often (but stick pretty much to the NC forum), and seldom post unless I have a problem.
Message boards : Number crunching : Report Problems with Rosetta Version 5.16 II (Message 17419)
Posted 31 May 2006 by pieface
I also had watchdog knock down one of those pdbblast guys: resultid
like XS DUC's.
Message boards : Number crunching : Report Problems with Rosetta Version 5.16 I (Message 16692)
Posted 20 May 2006 by pieface
This is probably the same 0xc0000005 problem as I reported on 5.13 earlier here
but on a different machine, still Win XP, but this one is a Pentium-M.
The unit died overnite, i.e. no one was messing with the screensaver or anything, and then the security package tied things up with a dialog box because Rosetta was trying to access a DNS server. Output is in result.
I guess this means that with the 'new' debugger code all of the executing programs have to be identified to security software in case they need to go out looking for symbols for a dump or something?
Message boards : Number crunching : Report Problems with Rosetta Version 5.13 (Message 16386)
Posted 16 May 2006 by pieface
Lost a Rosetta 5.13 unit overnite: 20297842
Running BM 5.4.9 on Win XP, hit one of those 0xc0000005 errors. When I looked at the machine this morning there were several dialog boxes saying Rosetta was trying to connect to the internet / DNS server (Norton Internet Security). I don't know if they were related to this unit or one of the others that finished overnite though.
Message boards : Number crunching : Report stuck & aborted WU here please (Message 12930)
Posted 2 Apr 2006 by pieface
Not a problem, I suspended the WU again instead of aborting, so I could get on with some new work without losing it (in case you folks want something else from it).
Message boards : Number crunching : Report stuck & aborted WU here please (Message 12926)
Posted 2 Apr 2006 by pieface
I have a 'stuck' 4.83, wuid=11843998, cpid=163786.
Noticed that it was still running after 20+ hours CPU time. Looked at graphics and it was on 21.742 pct complete. Suspended the unit and BM (this guy is still running 5.2.13), closed down Windows and did a cold start. Brought BM back up and un-suspended the unit. CPU time went back to about 52 minutes, then started moving forward. Graphics looked ok, lots of movement. Now after a couple of hours it's stuck on 21.742 percent complete again, model 8, step 266356. Task Manager says it's pulling 100 pct of the CPU.

Edit: just noticed that someone else with a similar machine (pentium-m, 1.86) had already aborted this unit...interesting...
Message boards : Number crunching : Report stuck work units here (Message 7400)
Posted 23 Dec 2005 by pieface
This is just an update to my message nr 7186 from yesterday on a 'stuck' wu.
I left it (and Rosetta) suspended overnite to see if there would be any reply, and since there wasn't anything new this morning I thought I would just abort the WU and get on with it. I 'resumed' it before aborting, the pct complete went back to zero and, wouldn't you know it, the danged thing went from zero to completion in 4,892 CPU secs. Odd behavior for something that should be 'repeatable' ???
Message boards : Number crunching : Report stuck work units here (Message 7186)
Posted 22 Dec 2005 by pieface
I think I have one of those 'stuck' WU's as well. I have 'suspended' rosetta for a bit, and took a full backup of the BOINC directory if you want it (or any part of it). Let me know if you want it aborted.

Rosetta Version 481 [workunit: 1hz6a_abrelaxmode_test_20349]
1% complete
CPU time: 6 hr 46 min 43 sec
stage: Ab Initio
Step: 2699
Accepted Rmsd: 14.14
Accepted energy: 29.42311

It's running on a P4 2 GHz machine, Win XP Home SP2, BM 5.2.13, sharing 50/50 with Einstein, and left in memory when swapped. Both the CPU time and time to completion increased every 5 secs or so. The 'step' hasn't changed since I noticed it was having a problem.
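
The symptom described above (CPU time keeps climbing while the 'step' counter stays put) can be written down as a simple check. This is only a rough sketch: the function name `looks_stuck`, the `(cpu_seconds, step)` sample format and the one-hour threshold are all made up for illustration, and nothing here is part of BOINC or Rosetta.

```python
from typing import List, Tuple

# Hypothetical sample format: (cpu_seconds, step), read by hand off the graphics window.
Sample = Tuple[float, int]


def looks_stuck(samples: List[Sample], min_cpu_seconds: float = 3600.0) -> bool:
    """Return True if CPU time advanced by at least min_cpu_seconds
    while the step counter never changed across the samples."""
    if len(samples) < 2:
        return False  # one reading cannot show a stall
    first_cpu, first_step = samples[0]
    last_cpu, _ = samples[-1]
    same_step = all(step == first_step for _, step in samples)
    return same_step and (last_cpu - first_cpu) >= min_cpu_seconds


# Two readings an hour of CPU time apart, both at step 2699.
# The second CPU time is the 6 hr 46 min 43 sec from the report; the first is invented.
readings = [(20803.0, 2699), (24403.0, 2699)]
print(looks_stuck(readings))
```

The threshold is a judgment call: too short and a slow but healthy step gets flagged, too long and a dead unit burns hours before anyone notices.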

Message boards : Number crunching : Wu going backwards? (Message 424)
Posted 24 Sep 2005 by pieface
Odd... I suspended the wu, shut down boinc manager then brought it back up and unsuspended the unit. It went back to zero cpu time and then stalled at one pct for a bit, but now it seems to be back on track going up in increments of 8.33 pct.
Message boards : Number crunching : Wu going backwards? (Message 419)
Posted 24 Sep 2005 by pieface
I have had a WU (25451) using Rosetta 4.77 running by itself on my P-IV 2 GHz machine (CPID 1496 - Win XP Pro, BOINC 4.45) for about 16-1/2 hours now, and the pct complete is stuck at 1.00 while the estimated time to finish keeps getting bigger (now at around 1618 hrs). Do some of these units loop-de-loop?
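
One possible reason the estimate keeps growing: if the time to finish is a straight-line extrapolation from elapsed time and fraction done, then a fraction stuck at 1% makes the estimate grow by 99 hours for every hour that passes. That is an assumption about how the estimate is computed, not something confirmed for this BOINC version, and `naive_hours_remaining` is just an illustrative name.

```python
def naive_hours_remaining(elapsed_hours: float, fraction_done: float) -> float:
    """Linear extrapolation: remaining = elapsed * (1 - f) / f.

    Assumes progress is proportional to CPU time, which is exactly
    what fails when a unit is stuck.
    """
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours * (1.0 - fraction_done) / fraction_done


# About 16.5 hours elapsed at 1.00 pct complete, as in the report above.
print(round(naive_hours_remaining(16.5, 0.01), 1))
```

With those numbers the sketch gives roughly 1633 hours, the same ballpark as the ~1618 hrs reported, so the ballooning estimate is at least consistent with a stuck percentage rather than a separate problem.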

