Message boards : Number crunching : Minirosetta v1.40 bug thread
Author | Message |
---|---|
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
sid, that's pretty odd, as the first 2 tasks have the same output errors as rochester and the others, even with the -226. So something else happened on your system. Looks like it's my turn again for compute errors. |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
https://boinc.bakerlab.org/rosetta/result.php?resultid=209650572 loopbuild_boinc4_tex_cst_hombench_loopbuild_tex_cst_t326__IGNORE_THE_REST_1ZH8A_6_4790_9_0 https://boinc.bakerlab.org/rosetta/result.php?resultid=209650574 loopbuild_boinc4_tex_cst_hombench_loopbuild_tex_cst_t326__IGNORE_THE_REST_1ZH8A_6_4790_10 Both failed the same way: <core_client_version>6.2.19</core_client_version> <![CDATA[ <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> <stderr_txt> # cpu_run_time_pref: 21600 ERROR: NANs occured in hbonding! ERROR:: Exit from: ..\..\src\core\scoring\hbonds\hbonds_geom.cc line: 763 called boinc_finish </stderr_txt> ]]> CPU time was 6905 seconds for the first one and 2956 seconds for the last. |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
Hi. Here's another, after 3hrs, 38mins. https://boinc.bakerlab.org/rosetta/workunit.php?wuid=191817904 1g73A_ZNMP_ABRELAX_tetraL_IGNORE_THE_REST_ZINC_METALLOPROTEIN-1g73A-_4652_57735_0 <core_client_version>6.2.14</core_client_version> <![CDATA[ <message> process exited with code 1 (0x1, -255) </message> ERROR: NANs occured in hbonding! ERROR:: Exit from: src/core/scoring/hbonds/hbonds_geom.cc line: 763 called boinc_finish pete. |
HA-SOFT, s.r.o. Send message Joined: 27 Jan 07 Posts: 10 Credit: 94,518,643 RAC: 0 |
I have a problem on Windows Server 2008 64-bit, where all Minirosetta tasks hang at 0.00 progress. Rosetta beta works OK. BOINC 6.2.19. Zdenek |
BF Send message Joined: 1 Dec 05 Posts: 1 Credit: 3,854,531 RAC: 0 |
I have the same problem. Rosetta beta works well, but Rosetta mini gives a compute error within seconds. Most of the time, I get an access violation: <![CDATA[ <message> - exit code -1073741819 (0xc0000005) </message> <stderr_txt> Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x00030003 Engaging BOINC Windows Runtime Debugger... (I can provide the complete file if needed). This computer has WinXP SP2 and a Core 2 Duo processor (E6600). Another PC with the same configuration but with a Pentium 4 works well. BF |
David Ball Send message Joined: 25 Nov 05 Posts: 25 Credit: 1,439,333 RAC: 0 |
https://boinc.bakerlab.org/rosetta/result.php?resultid=210193423 2vik__BOINC_ABRELAX_SPLIT_SPLIT2_IGNORE_THE_REST-S25-9-S3-3--2vik_-_4768_1689_1 Vista home premium 64 bit system with 5 GB of ram. C2 Quad Q6600. Only running BOINC. 2 rosetta tasks were running along with 2 tasks from other projects. Lots of free memory and disk space. BOINC is set to leave tasks in memory. BOINC is not used as a screensaver. BOINC client version is 6.2.19. The WU above was running but the CPU time (3 hours 50 minutes 2 seconds) and percent complete (about 69%) weren't increasing. I checked with task manager and it WAS using 25% cpu (1 of 4 cores in a C2Q Q6600). I suspended the WU and the status in the BOINC manager changed from running to waiting to run. However, windows task manager showed that it was still running. I had another rosetta task running so I suspended the second WU as well to make sure I had the right one. The second rosetta WU stopped using CPU when it was suspended but remained in memory as it should. BOINC manager now showed NO rosetta tasks running, but windows task manager showed the problem WU was still using all the cpu time it could get. I killed it in task manager and aborted the WU. When looking at the result, I found that I was the second person to get the WU and it had died on the other computer after about 3 minutes. IIRC, the WU was on the 5th model when this happened. Hope this helps. Have you read a good Science Fiction book lately? |
AdeB Send message Joined: 12 Dec 06 Posts: 45 Credit: 4,428,086 RAC: 0 |
For the team to know what is going on, please post links to your affected work units in your next message. FalconFly, I noticed that you are crunching for LHC@home as well. It might be that LHC@home is causing your crashes. I've had some crashes too this week. Next time it happens, check your boinc.log file; the last message there, before the SIGSEGV and the stack trace, is probably: "[lhcathome] Scheduler request". A few weeks ago this was also mentioned by several people on the LHC@home message boards. AdeB |
rochester new york Send message Joined: 2 Jul 06 Posts: 2842 Credit: 2,020,043 RAC: 0 |
|
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=189759587 This is not really an issue with the task; rather, the 10-day deadline to crunch the task and report back has expired. You may have too much work on your system, and it is not active enough to complete the work assigned to it. I see no CPU time on this task, so it appears it never got crunched to begin with. There are no error codes either. This was also the case with another task you reported earlier: it never got crunched in 10 days. |
(_KoDAk_) Send message Joined: 18 Jul 06 Posts: 109 Credit: 1,859,263 RAC: 0 |
https://boinc.bakerlab.org/rosetta/result.php?resultid=208906339 |
Rifleman Send message Joined: 19 Nov 08 Posts: 17 Credit: 139,408 RAC: 0 |
Can someone take a quick look at my results and see if they know why I am getting massive numbers of errors and wasted time? The ones I terminated myself were still running in Task Manager after restarting BOINC, so I'd end up with 8 WUs vying for CPU time while only 4 showed in BOINC. Here is my results page, and thanks. https://boinc.bakerlab.org/rosetta/results.php?userid=288725 |
Mike.Gibson Send message Joined: 3 Nov 07 Posts: 19 Credit: 311,844 RAC: 0 |
I am using a dual-core 3800+ with Vista Premium and BOINC 6.2.19. If I have a mini 1.40 and a Beta 5.98 running and suspend the project, both tasks are shown as suspended by user. However, the mini 1.40 keeps on running, albeit slowly. Two other tasks start to run, one at normal speed and the other slowly. Obviously, one of the new tasks is running on its own in one core and the other new task is sharing the second core with mini 1.40. I have never seen core sharing before. Is this OK, or is this a problem? None of my other projects show any signs of this phenomenon. Any ideas? |
Rifleman Send message Joined: 19 Nov 08 Posts: 17 Credit: 139,408 RAC: 0 |
Check the processes running in Task Manager by pressing Ctrl+Alt+Delete. Do you show more than the normal number of tasks running? |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Can someone take a quick look at my results and see if they know why I am getting massive numbers of errors and wasted time? The ones I terminated myself were still running in Task Manager after restarting BOINC, so I'd end up with 8 WUs vying for CPU time while only 4 showed in BOINC. Your link to your results page is not correct; that is your own internal link, I think. Here is the public one: https://boinc.bakerlab.org/rosetta/results.php?hostid=948562 Read this message and then go up the board a bit and see what others did when it comes to lockfile issues. I see that other results have NaN issues. No one has explained what this is or whether they are working on a fix for it. Your results that errored out for reasons other than NaN and lockfile issues are possibly due to an out-of-date graphics card device driver. Read this message for an explanation. Hope this helps get you back on track. |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
I am using a dual-core 3800+ with Vista Premium and Boinc 6.2.19. There have been a few problems that I and others have experienced with 1.4 tasks not suspending, mostly in the loopbuild tasks. I have found that you just have to exit BOINC and restart it. You may also have to reboot your system, but that is probably a last-ditch step. After one or both of these steps, BOINC Manager will act properly again. |
Mike.Gibson Send message Joined: 3 Nov 07 Posts: 19 Credit: 311,844 RAC: 0 |
Check your processes running in task manager by pressing control, alt, delete. Do you show more than the normal number of tasks running? Already checked - all 3 registered at variable amounts around 44%, 22% & 22%. |
Mike.Gibson Send message Joined: 3 Nov 07 Posts: 19 Credit: 311,844 RAC: 0 |
I am using a dual-core 3800+ with Vista Premium and Boinc 6.2.19. I have tried all sorts of combinations, including reboots, but it recurs the next time. It seems to happen when suspending either the project or a task. However, suspending both can clear the problem until the next time. |
(_KoDAk_) Send message Joined: 18 Jul 06 Posts: 109 Credit: 1,859,263 RAC: 0 |
https://boinc.bakerlab.org/rosetta/result.php?resultid=210335636 ERROR: NANs occured in hbonding! ERROR:: Exit from: ..\..\src\core\scoring\hbonds\hbonds_geom.cc line: 763 called boinc_finish CPU time 39732.38 (((((( |
Alec Rosa Send message Joined: 11 Nov 08 Posts: 18 Credit: 2,635 RAC: 0 |
People said it -- I said it -- I insist: Rosetta mini should be run as a Beta project. It seems SO obvious! We want to crunch Rosetta again. Start sending Rosetta Beta 5.xx WUs again, please! |
rochester new york Send message Joined: 2 Jul 06 Posts: 2842 Credit: 2,020,043 RAC: 0 |
https://boinc.bakerlab.org/rosetta/results.php?hostid=267483 People said it -- I said it -- I insist: |
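
The access violation report above lists the same failure twice, as exit code -1073741819 and as 0xc0000005. These are one 32-bit value read as signed and as unsigned. A minimal Python sketch of the conversion follows; the helper names `to_unsigned32` and `to_signed32` are made up for illustration and are not part of BOINC or Rosetta.

```python
def to_unsigned32(exit_code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return exit_code & 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit exit code."""
    value &= 0xFFFFFFFF
    # Values with the top bit set are negative in two's complement.
    return value - 0x100000000 if value >= 0x80000000 else value


# Exit code taken from the access violation report in this thread.
reported = -1073741819

print(hex(to_unsigned32(reported)))  # 0xc0000005
print(to_signed32(0xC0000005))       # -1073741819
```

Searching for the hexadecimal form is usually more productive, since that is how Windows status codes such as the access violation are normally written.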
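
Several results above end with "NANs occured in hbonding!", and one post notes that nobody has explained the term. NaN ("not a number") is the IEEE 754 floating-point value produced by undefined operations such as infinity minus infinity; once one appears, it propagates through every later sum. The sketch below only illustrates that behaviour in Python and is not Rosetta's scoring code; `total_energy` and its inputs are hypothetical.

```python
import math


def total_energy(terms):
    """Sum hypothetical energy terms and refuse to return a NaN result."""
    total = 0.0
    for term in terms:
        total += term
    if math.isnan(total):
        # A single NaN term poisons the whole sum, so stop here.
        raise ValueError("NaN in energy sum")
    return total


# Infinity minus infinity is undefined, so the result is NaN.
bad_term = float("inf") - float("inf")

print(math.isnan(bad_term))       # True
print(bad_term == bad_term)       # False: NaN never equals itself
print(total_energy([1.5, -0.5]))  # 1.0

try:
    total_energy([1.5, bad_term, -0.5])
except ValueError as err:
    print("rejected:", err)
```

A program that detects a NaN this way can only abort the calculation, which would be consistent with the tasks above exiting with an error instead of reporting a result.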