Message boards : Number crunching : Minirosetta v1.47 bug thread.
Author | Message |
---|---|
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
The runtime should not directly affect the success of a task. But since a longer runtime runs more models, it increases the odds of you hitting a long-running model. So, running 5 models on 5 different 1 hour tasks should give you the same result as running 5 models on a single 5 hour task. But if 20% of the models are long-running, you would say that 100% of your 5hr tasks "fail", and only 20% of your 1hr tasks do. Also, with a 1 hour runtime preference, the watchdog will kick in much sooner. If the watchdog is set to 3 times normal, it would only allow a 1 hour task to run for 3 hours, whereas with the longer 5 hour runtime above, the task could go for up to a total of 15 hours before the watchdog ends it. Rosetta Moderator: Mod.Sense |
sslickerson Send message Joined: 14 Oct 05 Posts: 101 Credit: 578,497 RAC: 0 |
Just for the fun of it I checked my desktop (AMD 4200+) for any errors; typically this one is and has been rock solid for years. Lo and behold, there was one error there that occurred in the past few hours, the same error as on my Vista laptop. So the error is not machine or CPU specific (AMD vs Intel, XP vs Vista); it has happened on each (as far as my setup goes, at least). AMD 4200:218361409 |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2146 Credit: 41,570,180 RAC: 6,141 |
It seems like the 4 or 5 times that I have come back to Rosetta with this setup (64bit Vista) everything works well until the runtime is increased to greater than 1 hour. Perhaps I will increase the runtime but switch to "leave app in memory" to see if there is any change... @Robertsslickerson I had loads of problems (can't acquire lockfile) with Vista64 until BOINC 6.4.5, at which point they disappeared completely. I also reduced my runtime to 2 hours for greater success with earlier versions. With 6.4.5 they seem to have gone. An upgrade may help you too. That said, it hasn't solved any issues with exception errors, which I still get to a small extent (1 out of 93 when I investigated). All your problems seem to be of that type (many more than mine), so it may not solve your problems. For what it's worth, I kept applications in memory, which I understand to be the best advice. Maybe you should try that too. Hope it helps you to some degree. |
Paul D. Buck Send message Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
Also check to see if processor usage is set to 100% ... I saw a note on EaH that with Windows and the processor usage not set to 100% this is a common error. In that this killed about 20 models here for me ... I am interested if this is really the case ... I know Rosetta runs well on OS-X in that I have not had any failures there ... On Win XP I got 10 failures out of about 20 tries ... which is when *I* gave up again on RaH ... I had set usage to 99% to give me a little more head room and that may have been enough to farble things up ... Anyone up for the test? This is addressed to the "Can't acquire lock-file" problem only ... |
sslickerson Send message Joined: 14 Oct 05 Posts: 101 Credit: 578,497 RAC: 0 |
Thanks Paul and everyone else. I'll give these suggestions a try sometime this week (besides Rosie is out of work for the time being). Also check to see if processor usage is set to 100% ... |
LizzieBarry Send message Joined: 25 Feb 08 Posts: 76 Credit: 201,862 RAC: 0 |
Thanks Paul and everyone else. I'll give these suggestions a try sometime this week (besides Rosie is out of work for the time being). Not to forget, it's an ideal time to reset the project too. |
Ian_D Send message Joined: 21 Sep 05 Posts: 55 Credit: 4,216,173 RAC: 0 |
<core_client_version>6.2.12</core_client_version> <![CDATA[ <message> process exited with code 1 (0x1, -255) </message> <stderr_txt> # cpu_run_time_pref: 28800 ERROR: phil how did we get here-2? ERROR:: Exit from: src/core/kinematics/AtomTree.cc line: 1378 called boinc_finish </stderr_txt> ]]> You're having a laugh, right ? https://boinc.bakerlab.org/rosetta/result.php?resultid=218842260 |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
The 2nd cruncher and I both got computer errors on lr5_score12_rlbd_1ubi_IGNORE_THE_REST_DECOY_5559_1100_1. The combined task summary is here: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=198993154 The error is: <core_client_version>6.4.5</core_client_version> <![CDATA[ <message> - exit code -1073741819 (0xc0000005) </message> <stderr_txt> # cpu_run_time_pref: 14400 Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x0049162C read attempt to address 0x00000000 Engaging BOINC Windows Runtime Debugger... It ran for 1089.156 seconds of CPU time out of 14400 on my machine. On the other system: Computer ID 593083 Report deadline 13 Jan 2009 17:11:30 UTC CPU time 526.051 stderr out <core_client_version>6.4.5</core_client_version> <![CDATA[ <message> - exit code -1073741819 (0xc0000005) </message> <stderr_txt> Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x0049162C read attempt to address 0x00000000 Engaging BOINC Windows Runtime Debugger... |
Jim Leatherman Send message Joined: 15 Jun 08 Posts: 2 Credit: 987,127 RAC: 0 |
After upgrading to 6.4.5 BOINC doesn't seem to be downloading any tasks now for Rosetta@Home. Same message all the time: 01/06/09 12:00:56|rosetta@home|Fetching scheduler list 01/06/09 12:01:01|rosetta@home|Master file download succeeded 01/06/09 12:01:06|rosetta@home|Sending scheduler request: To fetch work. Requesting 172801 seconds of work, reporting 0 completed tasks 01/06/09 12:01:11|rosetta@home|Scheduler request completed: got 0 new tasks I have reset the project, but still no downloads -- was working fine prior to 6.4.5. Any ideas? |
Evan Send message Joined: 23 Dec 05 Posts: 268 Credit: 402,585 RAC: 0 |
The problem is not at your end. If you have similar problems in the future always check the server status. Right now there are problems on the other end as you will see by the prominent red boxes. |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
The problem is not at your end. If you have similar problems in the future always check the server status. Right now there are problems on the other end as you will see by the prominent red boxes. The "generate work" servers have been offline today (European time) for quite some time. No news from the team as to what is causing this outage. Keep an eye on the server status page to see when they come back online. |
Rifleman Send message Joined: 19 Nov 08 Posts: 17 Credit: 139,408 RAC: 0 |
I have 3 finished WUs that don't seem to upload to the server---is that because of the problems today? |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
See the server problems thread. Apparently a connection to the outside world got pulled while they were working on the rack. They expect things to be up and running today. But it is midnight Pacific time at the moment, so don't expect anything to happen for at least 8 hours. If you go into BOINC Manager and then go to the Projects tab, you can set RAH to 'accept no new tasks' and that will stop it from requesting new work. This will cut back on your status messages. Turn it back on later tonight (European time). I have 3 finished WUs that don't seem to upload to the server---is that because of the problems today? |
yose-ue Send message Joined: 30 Dec 05 Posts: 3 Credit: 228,710 RAC: 0 |
This job (wuid=198707114) appears to have finished twice, and after using 71456 cpu seconds in total I was only granted 2 points. <core_client_version>6.4.5</core_client_version> <![CDATA[ <stderr_txt> # cpu_run_time_pref: 28800 # cpu_run_time_pref: 28800 ====================================================== DONE :: 1 starting structures 47173.5 cpu seconds This process generated 1 decoys from 1 attempts ====================================================== BOINC :: Watchdog shutting down... # cpu_run_time_pref: 28800 # cpu_run_time_pref: 28800 # cpu_run_time_pref: 28800 # cpu_run_time_pref: 28800 ====================================================== DONE :: 1 starting structures 71456 cpu seconds This process generated 1 decoys from 1 attempts ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down... called boinc_finish </stderr_txt> ]]> Validate state Valid Claimed credit 156.119054549462 Granted credit 2 application version 1.47 |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
DK has now corrected the problem where results are always granted 2 credits per model. See his post. Rosetta Moderator: Mod.Sense |
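
For anyone who wants to play with the runtime numbers from Mod.Sense's post at the top of this page, here is a rough sketch in Python. It is not anything from the Rosetta or BOINC code. The function names are made up for illustration, and it assumes each model is long-running independently with the same probability, which the post does not actually state. The 3x watchdog multiplier is the figure used in the post.

```python
def chance_of_long_model(p_long: float, models_per_task: int) -> float:
    """Chance that a task contains at least one long-running model.

    Assumption (not from the thread): each model is long-running
    independently with probability p_long.
    """
    return 1.0 - (1.0 - p_long) ** models_per_task


def watchdog_limit_hours(runtime_pref_hours: float, multiplier: float = 3.0) -> float:
    """Longest a task may run before the watchdog ends it.

    Uses the "3 times normal" setting described in the post as the default.
    """
    return runtime_pref_hours * multiplier


if __name__ == "__main__":
    # Compare a 1 hour task (1 model) with a 5 hour task (5 models),
    # using the 20% long-running figure from the post.
    for hours, models in ((1, 1), (5, 5)):
        p = chance_of_long_model(0.2, models)
        limit = watchdog_limit_hours(hours)
        print(f"{hours}h task, {models} model(s): "
              f"chance of a long model {p:.1%}, watchdog cutoff {limit:g}h")
```

Under the independence assumption a 5-model task hits a long-running model roughly two times in three rather than every time, but the point of the post stands: the longer task is much more likely to contain one, and it is allowed to run 15 hours instead of 3 before the watchdog steps in.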
©2025 University of Washington
https://www.bakerlab.org