Message boards : Number crunching : Problems with Minirosetta Version 1.67
Author | Message |
---|---|
svincent Joined: 30 Dec 05 Posts: 219 Credit: 12,120,035 RAC: 0 |
I spoke too soon when thinking that updating BOINC to the latest version might have fixed the termination problem: another task 250685314 has just failed in the same way.
Crashed executable name: minirosetta_1.67_i686-apple-darwin built using BOINC library version 6.5.0
Machine type Intel 80486 (32-bit executable)
System version: Macintosh OS 10.4.11 build 8S2167
Tue May 12 09:37:40 2009
Thread 0 Crashed:
0 ...etta_1.67_i686-apple-darwin 0x00efc8bd __ZN7utility7signals9SignalHubIvN4core12conformation7signals16DestructionEventEE11send_signalES5_ + 1701
1 ...etta_1.67_i686-apple-darwin 0x0002a4a7 __ZN4core12conformation12ConformationD1Ev + 7373
2 ...etta_1.67_i686-apple-darwin 0x000910d0 __ZN4core4pose4PoseD1Ev + 4652
3 ...etta_1.67_i686-apple-darwin 0x00518bdc __ZN9protocols3jd214JobDistributor2goEN7utility7pointer10owning_ptrINS_5moves5MoverEEE + 1730
4 ...etta_1.67_i686-apple-darwin 0x00b59c20 __ZN9protocols3jd219BOINCJobDistributor2goEN7utility7pointer10owning_ptrINS_5moves5MoverEEE + 42
5 ...etta_1.67_i686-apple-darwin 0x0013b068 __ZN9protocols8abinitio24Loopbuild_Threading_mainEv + 720
6 ...etta_1.67_i686-apple-darwin 0x00005db8 _main + 7640
7 ...etta_1.67_i686-apple-darwin 0x0000292e __start + 216
8 ...etta_1.67_i686-apple-darwin 0x00002855 start + 41 |
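The mangled C++ symbols in the stack trace above are easier to follow once the namespace and class names are pulled out. The following is only a minimal sketch, assuming the symbols use the Itanium C++ ABI nested-name form (`_ZN` followed by length-prefixed identifiers) with the extra leading underscore that macOS adds. The helper names `split_nested_name` and `readable` are made up for this illustration; template arguments and parameter types are simply dropped, so a real demangler such as `c++filt` is the better tool for anything beyond a quick look.

```python
import re


def split_nested_name(symbol):
    """Split a mangled symbol of the nested `_ZN...` form into its leading
    <length><identifier> components plus whatever text follows them.

    Returns (parts, rest); parts is empty if the symbol is not in that form.
    """
    # Mach-O (macOS) symbols carry one extra leading underscore.
    if symbol.startswith("__Z"):
        symbol = symbol[1:]
    if not symbol.startswith("_ZN"):
        return [], symbol
    rest = symbol[3:]
    parts = []
    while True:
        m = re.match(r"\d+", rest)
        if m is None:
            break  # template args, ctor/dtor codes, or end of the name
        length = int(m.group())
        start = m.end()
        parts.append(rest[start:start + length])
        rest = rest[start + length:]
    return parts, rest


def readable(symbol):
    """Best-effort readable name; not a full demangler."""
    parts, rest = split_nested_name(symbol)
    if not parts:
        return symbol  # not a nested mangled name, return unchanged
    # D0, D1 and D2 are the destructor variants in the Itanium C++ ABI.
    if rest[:2] in ("D0", "D1", "D2"):
        parts.append("~" + parts[-1])
    return "::".join(parts)


if __name__ == "__main__":
    # Frames 1 to 3 from the crash report above.
    for sym in (
        "__ZN4core12conformation12ConformationD1Ev",
        "__ZN4core4pose4PoseD1Ev",
        "__ZN9protocols3jd214JobDistributor2goEN7utility7pointer10owning_ptrINS_5moves5MoverEEE",
    ):
        print(readable(sym))
```

Read this way, frames 1 and 2 are the `Conformation` and `Pose` destructors, called from `JobDistributor::go`, which suggests the crash happens while a pose is being torn down rather than during the run itself.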
TomaszPawel Joined: 28 Apr 07 Posts: 54 Credit: 2,791,145 RAC: 0 |
https://boinc.bakerlab.org/rosetta/result.php?resultid=250780868 iRp40_S_3d85_ip40_1zzo.pdb.gz_00000010_fa_dock.xml_score12_pert38_DOCK_11891_476_1 "The validator error has been found: Our data format was changed and the validator was not updated. We're doing that now." More details please? WWW of Polish National Team - Join! Crunch! Win! |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Download issues with parts of a task:
5/12/2009 8:10:43 PM rosetta@home Requesting new tasks
5/12/2009 8:10:48 PM rosetta@home Scheduler request completed: got 1 new tasks
5/12/2009 8:10:50 PM rosetta@home Started download of threading_lb_reference.loopbuild.t360_.mtyka.flags.boinc
5/12/2009 8:10:50 PM rosetta@home Started download of threading_lb_reference.loopbuild.t360_.mtyka.boinc_files.zip
5/12/2009 8:11:00 PM Project communication failed: attempting access to reference site
5/12/2009 8:11:00 PM rosetta@home Temporarily failed download of threading_lb_reference.loopbuild.t360_.mtyka.boinc_files.zip: HTTP error
5/12/2009 8:11:02 PM Internet access OK - project servers may be temporarily down.
5/12/2009 8:11:02 PM rosetta@home [error] File threading_lb_reference.loopbuild.t360_.mtyka.boinc_files.zip has wrong size: expected 6045722, got 0
5/12/2009 8:11:02 PM rosetta@home Started download of threading_lb_reference.loopbuild.t360_.mtyka.boinc_files.zip
5/12/2009 8:12:11 PM rosetta@home Finished download of threading_lb_reference.loopbuild.t360_.mtyka.boinc_files.zip
5/12/2009 8:15:53 PM Project communication failed: attempting access to reference site
5/12/2009 8:15:53 PM rosetta@home Temporarily failed download of threading_lb_reference.loopbuild.t360_.mtyka.flags.boinc: HTTP error
5/12/2009 8:15:54 PM Internet access OK - project servers may be temporarily down.
5/12/2009 8:15:54 PM rosetta@home [error] File threading_lb_reference.loopbuild.t360_.mtyka.flags.boinc has wrong size: expected 706, got 0
5/12/2009 8:15:54 PM rosetta@home Started download of threading_lb_reference.loopbuild.t360_.mtyka.flags.boinc
5/12/2009 8:15:55 PM rosetta@home Finished download of threading_lb_reference.loopbuild.t360_.mtyka.flags.boinc |
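The log above shows the same pattern twice: a download comes back empty, the size check fails, and the client retries until the file arrives. A rough way to confirm from a saved message log that every failed file was eventually fetched is sketched below. It assumes only the message wording visible in this post; the function name `retry_outcomes` and the two regular expressions are illustrative and are not part of BOINC.

```python
import re

# Patterns based solely on the message wording quoted in the post above.
WRONG_SIZE = re.compile(r"File (\S+) has wrong size: expected (\d+), got (\d+)")
FINISHED = re.compile(r"Finished download of (\S+)")


def retry_outcomes(log_lines):
    """Map each file that failed its size check to True if a later
    'Finished download' line appears for it, otherwise False."""
    outcomes = {}
    for line in log_lines:
        failed = WRONG_SIZE.search(line)
        if failed:
            outcomes[failed.group(1)] = False
            continue
        done = FINISHED.search(line)
        if done and done.group(1) in outcomes:
            outcomes[done.group(1)] = True
    return outcomes


# Four of the lines from the log above.
SAMPLE = [
    "5/12/2009 8:11:02 PM rosetta@home [error] File threading_lb_reference.loopbuild.t360_.mtyka.boinc_files.zip has wrong size: expected 6045722, got 0",
    "5/12/2009 8:12:11 PM rosetta@home Finished download of threading_lb_reference.loopbuild.t360_.mtyka.boinc_files.zip",
    "5/12/2009 8:15:54 PM rosetta@home [error] File threading_lb_reference.loopbuild.t360_.mtyka.flags.boinc has wrong size: expected 706, got 0",
    "5/12/2009 8:15:55 PM rosetta@home Finished download of threading_lb_reference.loopbuild.t360_.mtyka.flags.boinc",
]

if __name__ == "__main__":
    for name, recovered in retry_outcomes(SAMPLE).items():
        print(name, "recovered" if recovered else "still missing")
```

For the lines quoted here both files are recovered on the retry, which fits the client's own message that the project servers were only temporarily unreachable.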
LizzieBarry Joined: 25 Feb 08 Posts: 76 Credit: 201,862 RAC: 0 |
"The validator error has been found: Our data format was changed and the validator was not updated. We're doing that now." The validator shows as running on the server status page, but there have been some delays in awarding credit over the last couple of hours. They must be working on it right now. |
Snags Joined: 22 Feb 07 Posts: 198 Credit: 2,888,320 RAC: 0 |
Another absent output file: lb_alnmatrix_threading_hb_t313__IGNORE_THE_REST_11904_223_0 exit code 193 (0xc1) -63 Snags |
Alien Joined: 10 Nov 05 Posts: 5 Credit: 117,597 RAC: 0 |
Hi all, I have several of these tasks, all starting with "iRp40_S_3d85_ip40_xxxx", that give client errors with the outcome "Validate error". I have even aborted several of them to avoid the error. One example: 250695394. The second problem is that I have 6 finished and uploaded results (so far) with pending credit, to be seen here: Pending credit Alan |
Mike Tyka Joined: 20 Oct 05 Posts: 96 Credit: 2,190 RAC: 0 |
https://boinc.bakerlab.org/rosetta/result.php?resultid=250780868
Lol - sure, if you want the gory details, here they are :) Here's the content of one of our result files:
SEQUENCE: IWELKKDVYVVELDWYPDAPGEMVVLTCDTPEEDGITWTLDQSSEVLGSGKTLTIQVKEFGDAGQYTCHKGGEVLSHSLLLLHKKEDGIWSTDILKDQKNKTFLRCEAKNYSGRFTCWWLTTISTDLTFSVKSSRGSSDPQGVTCGAATLSAERVRNKEYEYSVECQEDSACPAAEESLPIEVMVDAVHKLKYEQYTSSFFIRDIIKPDPPKNLQLKPLQVEVSWEYPDTWSTPHSYFSLTFCVQVQGKDRVFTDKTSATVICRKNASISVRAQDRYYSSSWSEWASVPCAHPLENAWTFWFDNPQGKSRQRDWGSTIHPIHTFSTVEDFWGLYNNIHNPSKLNVGADFHCFKNKIEPKWEDPISANGGKWTISCGRGKSDTFWLHTLLAMIGEQFDFGDEICGAVVSVRQKQERVAIWTKNAANEAAQISIGKQWKEFLDYKDSIGFIVHEDAKRSDKGPKNRYTV
SCORE: score fa_atr fa_rep fa_sol hack_elec hbond_sr_bb hbond_lr_bb hbond_bb_sc hbond_sc dslf_ss_dst dslf_cs_ang dslf_ss_dih dslf_ca_dih fa_dun ref rms description
REMARK BINARY_SILENTFILE
SCORE: -575.033 -705.385 16.172 295.133 -5.204 -24.046 -60.222 -11.093 -8.141 -11.137 -8.813 -2.536 -2.322 19.600 -67.040 6.010 S_S_3d85_ip40_2idv.pdb.gz_00000009.pdb.gz_00000001
FOLD_TREE EDGE 1 117 -1 EDGE 117 290 -1 EDGE 117 405 1 EDGE 405 291 -1 EDGE 405 467 -1 S_S_3d85_ip40_2idv.pdb.gz_00000009.pdb.gz_00000001
RT 0.635059 0.770564 -0.0541478 -0.64237 0.565739 0.51701 0.429023 -0.293549 0.854265 -20.5199 35.9508 31.0298 S_S_3d85_ip40_2idv.pdb.gz_00000009.pdb.gz_00000001
ANNOTATED_SEQUENCE: I[ILE_p:NtermProteinFull]WELKKDVYVVELDWYPDAPGEMVVLTCDTPEEDGITWTLDQSSEVLGSGKTLTIQVKEFGDAGQYTCHKGGEVLSHSLLLLHKKEDGIWSTDILKDQKNKTFLRCEAKNYSGRFTCWWLTTISTDLTFSVKSSRGSSDPQGVTCGAATLSAERVRNKEYEYSVECQEDSACPAAEESLPIEVMVDAVHKLKYEQYTSSFFIRDIIKPDPPKNLQLKPLQVEVSWEYPDTWSTPHSYFSLTFCVQVQGKDRVFTDKTSATVICRKNASISVRAQDRYYSSSWSEWASVPC[CYS_p:CtermProteinFull]A[ALA_p:NtermProteinFull]HPLENAWTFWFDNPQGKSRQRDWGSTIH[HIS_D]PIHTFSTVEDFWGLYNNIHNPSKLNVGADFHCFKNKIEPKWEDPISANGGKWTISCGRGKSDTFWLHTLLAMIGEQFDFGDEICGAVVSVRQKQERVAIWTKNAANEAAQISIGKQWKEFLDYKDSIGFIVH[HIS_D]EDAKRSDKGPKNRYTV[VAL_p:CtermProteinFull] S_S_3d85_ip40_2idv.pdb.gz_00000009.pdb.gz_00000001
CHAIN_ENDINGS 290 S_S_3d85_ip40_2idv.pdb.gz_00000009.pdb.gz_00000001
LWOksBxKHvH0xLCoQ5QbqBJbn1HUO0GoQR2+oB13P8HErc8nQ++JqBZkt+HEv0rnQO0irBZQg8HkSMQoQ99TpBVNeAI0KHXoQf/UtBZ5wCIUMIMoQEYVqBBr8CIUgVhoQDIduB1lIwHUqmEoQk8gsB5SJnHUh4CoQ+/6sBJ6YvHEho0nQUo3oBdqiwHEfMKoQ+TtsBpFZ3HkQgVoQlYQoBNcdDIEEYSoQjs9nBBBW6H057YoQueEuBFYFFIU14SoQKxAvBFYFBI0GvHoQGZLsB9SXFIkR2GoQy2poBtdeEIk++loQNeZrBtz3/HEAAmoQDCsrB5lOGIUKcfoQ S_S_3d85_ip40_2idv.pdb.gz_00000009.pdb.gz_00000001
LBWZmBBrc+HkWk/nQ++5kBx0tCIEW5wnQqGviBFYlFIUep9nQ8nKiBtyBEIUy2HoQ46xjB9na/HEZ7cnQVj3hBd8S2HkoFhnQKxgiBhU4rH0gAlnQ2O/dB9S32HEsyhnQtdOgB557lHEAAonQOJGcBZQgsHUGEmnQRLSaBdSM/H0IbfnQwfqWBhrHqHE+TnnQQ14UBVO08HU6mgnQhrHTBRgVyH04lknQcSclBFXP7H0NJGoQXsPmBVTtFI02GqnQIwqiBlsdCI0GvSnQXPalBtz37HEsyTnQfUokBNy2pH00NlnQXPKgBlBBeHkrHrnQZ7cbBdnvDIEFucnQyLdVBhU4hHkmZqnQFDCSBlupBIUjXenQpw1OBJFuwHEShlnQ S_S_3d85_ip40_2idv.pdb.gz_00000009.pdb.gz_00000001
LZ7shB9RBKIk8S1nQFDifBx0NNIECsAoQgqxZBRK8LIkSM8nQT3EYBd9oMI025pnQiBRgBRhLTIESh9nQBr8cB5k4WI03PFoQWOUeBB/pcI0bSDoQamZhBB/JeIU5QFoQTiBbBlsdfIk789nQnvPiBhVOLIUxgmnQlB8fBh9YMIUzNJoQzhWiBZ78TIkqxAoQam5fB125TIUxgsnQcSsYBlYQWIULyDoQuz3dB1h2VIkQgNoQ S_S_3d85_ip40_2idv.pdb.gz_00000009.pdb.gz_00000001
LgVuWBxJRKIkSMGoQ67HRB1h2II0yhEoQZ78NBZ78NIE7RDoQrcILB9TtOI0Jx2nQKc9OBpvfFIkR2NoQc9IJBNepDI0fqMoQx1jIBVMIAIULyCoQPKcHBd9oAIUhrWoQMdTYB989JIkZmNoQx+uQBVEiGI0eY6nQmupRBtHFCIE7RNoQ2jiPB5OfHIUxgVoQhArGBR2OHIkEDMoQjXaEBxfq9HkBBCoQ+TtJBpFZCIUHa3nQEtILBJaR5HU3kDoQzMTDBxfq+H0P1VoQKxAKBh/U6HEsyXoQ46xHBxJRDIU6mdoQ S_S_3d85_ip40_2idv.pdb.gz_00000009.pdb.gz_00000001
LtIbOB55bRIk8SLoQXktLBJxgWI0eULoQR2OPBR2OaIUIwRoQCBWTBFDiYIEAAWoQOJGGBpZGWIEqGQoQ789FBhUYUIEsyboQ9oQABpvfTI0cofoQepb6A13vYIUKcgoQ6mEvAlX6XIU6mjoQgqxQBxLdQIkWkRoQ0rULBBN/XIEaJDoQ03PEBFDCaI0eUPoQ3k4DBx0NTI0bSLoQR2OIBhArQI04lcoQCB2HBpvfXI04lgoQFue8AdR2QIk++ZoQBWZABNJmRIUNenoQ3kY+ANIQbIUjXmoQIFu6AZktaIE/pYoQueUrA5OfbI0jCkoQamZrA14lVIUJGeoQCsyuANJGWIktzqoQ S_S_3d85_ip40_2idv.pdb.gz_00000009.pdb.gz_00000001
There's a line there starting with CHAIN_ENDINGS - that was added by one of our developers. We didn't realize, and so the validator was not updated; hence it rejected the results that contained that new line. It should be fixed now. http://beautifulproteins.blogspot.com/ http://www.miketyka.com/ |
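The failure described above is the usual risk of an allow-list check: when the output format gains a new record type, a validator that only knows the old keywords rejects otherwise good results. The sketch below is purely illustrative and is not the project's validator. The keyword sets, the name `unknown_keywords`, and the idea of checking only the first token of each header-style line are all assumptions made for this example; the binary coordinate lines are left out of the sample for brevity.

```python
# Record keywords seen in the result file quoted above, before the change.
# Hypothetical allow-list; the real validator's rules are not shown in the thread.
OLD_KEYWORDS = {"SEQUENCE:", "SCORE:", "REMARK", "FOLD_TREE", "RT", "ANNOTATED_SEQUENCE:"}

# The same list once the new record type is known.
NEW_KEYWORDS = OLD_KEYWORDS | {"CHAIN_ENDINGS"}


def unknown_keywords(lines, known):
    """Return the leading keyword of every non-empty line that is not in `known`."""
    unknown = []
    for line in lines:
        tokens = line.split()
        if tokens and tokens[0] not in known:
            unknown.append(tokens[0])
    return unknown


# Header-style lines taken from the quoted file (coordinate lines omitted).
SAMPLE = [
    "REMARK BINARY_SILENTFILE",
    "FOLD_TREE EDGE 1 117 -1 EDGE 117 290 -1 EDGE 117 405 1 EDGE 405 291 -1 EDGE 405 467 -1 S_S_3d85_ip40_2idv.pdb.gz_00000009.pdb.gz_00000001",
    "CHAIN_ENDINGS 290 S_S_3d85_ip40_2idv.pdb.gz_00000009.pdb.gz_00000001",
]

if __name__ == "__main__":
    print("old validator rejects on:", unknown_keywords(SAMPLE, OLD_KEYWORDS))
    print("updated validator rejects on:", unknown_keywords(SAMPLE, NEW_KEYWORDS))
```

With the old keyword set the CHAIN_ENDINGS line is flagged and the result would be rejected; with the updated set nothing is flagged, which matches the fix described in the post.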
Feet1st Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
This one shows max memory used of over 530MB. I believe that is more than is expected, but I have no way to know whether this task was marked to go only to high-memory hosts. Task name is abinitio_norelax_homfrag__plus_native_9rnt_0001A__SAVE_ALL_OUT_11817_1957 Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/ |
Alien Joined: 10 Nov 05 Posts: 5 Credit: 117,597 RAC: 0 |
Hi all, Replying to my own message... I now have 8 tasks, and counting, waiting for their credit to be granted. Without receiving the credit, and counting the other invalid tasks I had, this is a waste of power. If I receive a few more tasks without the credit being granted, I may turn Rosetta@Home off and look for some other project to take part in. Pending credit: 341.13 |
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,265,269 RAC: 4,483 |
So far, most of the iRp40_S workunits I've had ran rather fast - 99 decoys in less than an hour - but then had validate errors. Could you run the new version of your validator software on these workunits, already completed, which the previous version did not accept? In at least one case, a wingman needs this in addition to me or instead of me. https://boinc.bakerlab.org/rosetta/workunit.php?wuid=228683054 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=228690029 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=228717383 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=228737287 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=228721480 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=228701305 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=228826391 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=228809090 |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
So far, most of the iRp40_S workunits I've had ran rather fast - 99 decoys in less than an hour - but then had validate errors. Could you run the new version of your validator software on these workunits, already completed, which the previous version did not accept? In at least one case, a wingman needs this in addition to me or instead of me. Add me to the list, with 1 so far: iRp40_S_3d85_ip40_1lu4.pdb.gz_00000001_fa_dock.xml_score12_pert38_DOCK_11891_469_0 It did a full run to the 99 decoy limit and then got a validate error. |
tralala Joined: 8 Apr 06 Posts: 376 Credit: 581,806 RAC: 0 |
Obviously a widespread problem with iRp40_S workunits: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=228722758 |
Snags Joined: 22 Feb 07 Posts: 198 Credit: 2,888,320 RAC: 0 |
So far, most of the iRp40_S workunits I've had ran rather fast - 99 decoys in less than an hour - but then had validate errors. Could you run the new version of your validator software on these workunits, already completed, which the previous version did not accept? In at least one case, a wingman needs this in addition to me or instead of me. If you click on the task ID (far left column on both the workunit details page and the tasks for computer page) you should see that at least some of these have in fact been given credit. It appears that if both crunchers reported before the fix, they have both now been granted credit. If the second reports after the fix, it goes through the validator in the ordinary order and the first cruncher is left without credit. I know this is a bit annoying, but as the validator appears to be struggling to keep up with its work at the moment, perhaps it's for the best. I believe the two workunits are identical (they were begun from the same seed), so a second copy adds little. If they run those first copies through the validator again or run a script to grant credit, it would be more to please us than to benefit the research. Not necessarily a bad thing (pleasing us) but not at the cost of causing more troubles. And folks are complaining about their pending credit. Snags |
Alien Joined: 10 Nov 05 Posts: 5 Credit: 117,597 RAC: 0 |
And folks are complaining about their pending credit. Excuse my complaining about my pending credit. I'm not doing this for science reasons only. It's also about competing within a team and globally, and the fun involved with it. I like to see the results for the tasks I let my computer run, and it's running 24/7. More will follow... if it works without all of these problems. Without receiving credits there's just no fun left, and it's a waste of my money, my energy and my time. So, I believe it's perfectly OK to complain about pending credits. Alan |
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,265,269 RAC: 4,483 |
So far, most of the iRp40_S workunits I've had ran rather fast - 99 decoys in less than an hour - but then had validate errors. Could you run the new version of your validator software on these workunits, already completed, which the previous version did not accept? In at least one case, a wingman needs this in addition to me or instead of me. Most of those workunits hadn't been finished by anyone else when I last looked. One of them hadn't even been assigned to anyone else. |
Snags Joined: 22 Feb 07 Posts: 198 Credit: 2,888,320 RAC: 0 |
And folks are complaining about their pending credit. I too believe it's perfectly okay to complain about pending credits and certainly didn't intend to imply otherwise. The lag between reporting and getting credit has been vanishingly small on Rosetta, so the existence of pendings, not to mention the increase, is definitely worth noting. Maybe I should have said, "Now folks are noticing their pending credits". Let me try getting at my (wildly speculative) point from another angle: We now have pendings. This might be because two copies of a bunch of iRp40 tasks have been run through the validator a second time, lengthening the lines, slowing things down for all the other WUs. If this is what they did and if this caused the growth in pendings, perhaps they could have mitigated the issue by only running one copy through the validator a second time instead of both. But this would have left the other cruncher without credit for probably valid though unnecessary work. In cases in which the second cruncher reported his WU after the fix was in, the first cruncher has not been granted credit. Maybe they are holding off because maybe they've got validator issues as evidenced by the pendings. If they have already received or have an expectation of receiving the data in the WU through the ordinary procedures, the science needs do not require them to do anything with that first task. And maybe doing something about those first tasks just to please us could cause the project more difficulties, like longer lines at the validator, which would increase the pendings, which would TA DA: displease us. (Just for the record, maybe the pendings have nothing to do with the iRp40 tasks. I'm speculating, wildly, irresponsibly even. In fact I am so regretting making that first post right now. For the other record, insomnia is worse than drugs. Seriously).
A few final points: I would never want to discourage anyone from reporting any observations, noting any concerns or even complaining on these boards. I couldn't care less why you crunch: none of my business, the more the merrier, glad you are having fun, etc. Everything I've said about iRp40 tasks, pendings, the validator, and what the project may or may not be doing about them is merely speculation by someone who has no more information and probably less knowledge than many if not most of the readers here. Snags |
Snags Joined: 22 Feb 07 Posts: 198 Credit: 2,888,320 RAC: 0 |
For most of those workunits, they haven't been finished yet by any one else the last I looked. One of them hadn't even been assigned to anyone else. Now that the fix for those tasks is in the validator, the resends won't require additional intervention by staff. All of my tasks hit the 99 model mark in just about an hour, so the turnaround time was probably pretty quick on average. Doing something with those first tasks will require intervention. I didn't see any WUs without a second cruncher and I can only speculate about any delay in resending. Forgive me, robert, but I'm going to cease speculating at least until I've had some sleep. My brain is grinding away, mind you, but something, perhaps what little shred of good sense I have left, is telling me to stop typing. Snags |
David E K Volunteer moderator Project administrator Project developer Project scientist Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
You'll notice that the invalid jobs have been granted credit if you click on the specific workunit. Looks like the validator has caught up now also. This has been one of those rare occasions where we sent out a batch of workunits that had a change in output format with the latest app update and thus got rejected by our validator. It required us to rebuild our validator to fix the issue, and the system had a little bit of catching up to do, possibly slowed down by the invalid batch. |
Terrasapiens Joined: 25 Apr 08 Posts: 15 Credit: 368,919 RAC: 0 |
I recently built a new PC and figured all the problems I was having with Rosetta Mini on my old machine would go away. However, I'm still seeing most of the mini WUs crashing; at least, all the crashed ones I checked were mini WUs. With the new machine, though, the reason for the crashes is different. Could someone please look at my results and provide some feedback as to what the problem may be? The errors I checked all showed a lot of "Can't acquire lockfile - exiting" messages. I'm running the same OS (Win XP Pro, SP3) and antivirus (Kaspersky v6.0) as on the old machine. Could the antivirus or some other application be causing these errors? BTW, as with the old machine, the Rosetta Beta WUs run just fine. If this problem persists, is there a way to block mini WUs and only allow processing of beta WUs? This would allow my machine to do a lot more useful work and not waste time on WUs that will continue to fail. My tasks: https://boinc.bakerlab.org/rosetta/results.php?userid=254884 Thank you! |
Betting Slip Joined: 26 Sep 05 Posts: 71 Credit: 5,702,246 RAC: 0 |
You'll notice that the invalid jobs have been granted credit if you click on the specific workunit. Looks like the validator has caught up now also. It's a nightmare dealing with the public isn't it :) |
©2024 University of Washington
https://www.bakerlab.org