Message boards : Number crunching : Problems with Minirosetta v1.54
Author | Message |
---|---|
Evan Send message Joined: 23 Dec 05 Posts: 268 Credit: 402,585 RAC: 0 |
Another WU with 99 successful decoys. This appears normal; I am getting through them at a rate of about 1.17 minutes per model. If my calculations are correct, you are 0.02 minutes faster per model. |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
first error in a long time! ran 100% and had a compute error at the end abinitio_nohomfrag_129_B_1o73A_SAVE_ALL_OUT_7581_8721_1 Exit status -1073741819 (0xc0000005) CPU time 11314.84 Starting work on structure: _U9X3X_00001 # cpu_run_time_pref: 14400 Starting work on structure: _U9X3X_00002 Starting work on structure: _U9X3X_00003 Starting work on structure: _U9X3X_00004 Starting work on structure: _U9X3X_00005 Starting work on structure: _U9X3X_00006 Starting work on structure: _U9X3X_00007 Starting work on structure: _U9X3X_00008 Starting work on structure: _U9X3X_00009 Starting work on structure: _U9X3X_00010 Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x00587042 write attempt to address 0x34A2BAB7 Engaging BOINC Windows Runtime Debugger... |
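The exit status in the report above appears twice, once as a signed decimal and once as hex. A minimal sketch of how the two relate, assuming the status is a signed 32-bit value; the function name `exit_status_to_hex` is made up for illustration:

```python
def exit_status_to_hex(status: int) -> str:
    """Reinterpret a signed 32-bit exit status as an unsigned hex code."""
    # Masking with 0xFFFFFFFF maps a negative value onto its
    # two's-complement unsigned 32-bit equivalent.
    return f"0x{status & 0xFFFFFFFF:08x}"


print(exit_status_to_hex(-1073741819))
```

On Windows, 0xc0000005 is the access violation status, which matches the "Access Violation" reason printed in the exception record.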
Sid Celery Send message Joined: 11 Feb 08 Posts: 2141 Credit: 41,526,036 RAC: 10,392 |
frb_1_8_bestfrag_hb_t313___IGNORE_THE_REST_1F9TA_5_9696_15_0 I spoke too soon... frb_0_8_template_enriched_hb_t313___IGNORE_THE_REST_1CZ7A_7_9682_18_1 # cpu_run_time_pref: 14400 CPU time 17744.52 [1 decoy] Claimed credit 86.8616680245843 Granted credit 9.36388194088631 :( |
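To put the numbers in the post above side by side, here is a small arithmetic sketch. The values are copied from the post; the variable names are illustrative only, and nothing here reflects how the project actually computes credit:

```python
# Values quoted in the post above.
claimed_credit = 86.8616680245843
granted_credit = 9.36388194088631
cpu_time_s = 17744.52
run_time_pref_s = 14400
decoys = 1

# Fraction of the claimed credit that was actually granted.
granted_fraction = granted_credit / claimed_credit

# How far the task ran past the preferred run time, in seconds.
overrun_s = cpu_time_s - run_time_pref_s

# CPU seconds spent per completed decoy.
seconds_per_decoy = cpu_time_s / decoys

print(f"granted/claimed: {granted_fraction:.1%}")
print(f"overrun: {overrun_s:.2f} s")
print(f"CPU seconds per decoy: {seconds_per_decoy:.2f}")
```

A single decoy that takes longer than the whole preferred run time leaves little to be credited, which is consistent with the gap between claimed and granted credit.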
Feet1st Send message Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
This one cut off after a clean exit of BOINC and a reboot to install an MS fix. What wasn't clean was the restart. I forgot BOINC was in my Win startup folder and so ended up starting two of them. I then ended both, and 61 seconds after starting again, this task was ended. No messages, just that it finished. But it should have run another couple of hours. Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/ |
svincent Send message Joined: 30 Dec 05 Posts: 219 Credit: 12,120,035 RAC: 0 |
Task 241419982 failed on Mac: see below. Oddly, it then went out to someone on a Linux machine and completed fine. Watchdog active. # cpu_run_time_pref: 14400 Hbond tripped. ERROR: dis==0 in pairtermderiv! ERROR:: Exit from: src/core/scoring/methods/PairEnergy.cc line: 338 BOINC:: Error reading and gzipping output datafile: default.out called boinc_finish </stderr_txt> ]]> |
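When comparing failures like the one above across hosts, it can help to pull the message, source file and line number out of the stderr text. A rough sketch, assuming the error always follows the `ERROR: ... ERROR:: Exit from: <file> line: <n>` shape seen in this thread; the `RosettaError` type and `parse_error` function are hypothetical names, not part of Rosetta or BOINC:

```python
import re
from typing import NamedTuple, Optional


class RosettaError(NamedTuple):
    message: str
    source_file: str
    line: int


# Assumed layout, taken only from the stderr excerpts quoted in this thread.
_ERROR_PATTERN = re.compile(
    r"ERROR: (?P<message>.+?)\s+"
    r"ERROR:: Exit from: (?P<source_file>\S+) line: (?P<line>\d+)",
    re.DOTALL,
)


def parse_error(stderr_text: str) -> Optional[RosettaError]:
    """Return the first error found in stderr_text, or None if there is none."""
    match = _ERROR_PATTERN.search(stderr_text)
    if match is None:
        return None
    return RosettaError(
        message=match["message"].strip(),
        source_file=match["source_file"],
        line=int(match["line"]),
    )


sample = (
    "Hbond tripped. ERROR: dis==0 in pairtermderiv! "
    "ERROR:: Exit from: src/core/scoring/methods/PairEnergy.cc line: 338 "
    "BOINC:: Error reading and gzipping output datafile: default.out"
)
print(parse_error(sample))
```

Grouping reports by source file and line makes it easier to see whether two hosts hit the same failure.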
Klimax Send message Joined: 27 Apr 07 Posts: 44 Credit: 2,800,788 RAC: 68 |
Once again a task is not crunching, showing "Accepted Energy:1.#QNAN" and "Accepted RMSD:1.#QQ". It is at 39.50% complete; Model: 11, Step 7788. I have now suspended the task. I can create a dump file. Should I? Or is it already fixed in the next version? |
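Values such as `1.#QNAN` are how older Microsoft C runtimes print a quiet NaN, so a task showing them has most likely lost its numeric state. A minimal sketch for flagging such fields, assuming (as a simplification) that any token starting with `1.#` is non-finite; the function names are illustrative:

```python
import math
import re

# Assumption: tokens beginning with "1.#" (optionally signed) are
# MSVC-style non-finite values such as 1.#QNAN, 1.#IND or 1.#INF.
_MSVC_NON_FINITE = re.compile(r"^[+-]?1\.#")


def is_non_finite(token: str) -> bool:
    """True if token looks like NaN or infinity, in MSVC or standard spelling."""
    token = token.strip()
    if _MSVC_NON_FINITE.match(token):
        return True
    try:
        return not math.isfinite(float(token))
    except ValueError:
        # Not a number at all; leave that for the caller to handle.
        return False


def non_finite_fields(fields: dict[str, str]) -> list[str]:
    """Return the names of the fields whose displayed value is non-finite."""
    return [name for name, token in fields.items() if is_non_finite(token)]


display = {"Accepted Energy": "1.#QNAN", "Accepted RMSD": "1.#QQ", "Step": "7788"}
print(non_finite_fields(display))
```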
Evan Send message Joined: 23 Dec 05 Posts: 268 Credit: 402,585 RAC: 0 |
Error with this one 240746159 ERROR: in::file::boinc_wu_zip fragments_2hkv.zip does not exist! ERROR:: Exit from: ....srcappspublicboincminirosetta.cc line: 108 BOINC:: Error reading and gzipping output datafile: default.out called boinc_finish |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Klimax, why don't you go ahead and take a dump and EMail it to me, along with details on what you observed with it as it ran. I will forward it to the Project Team. Rosetta Moderator: Mod.Sense |
jswolf19 Send message Joined: 3 Apr 09 Posts: 3 Credit: 1,040,577 RAC: 0 |
I'm also having an issue with no progress. Rosetta Beta runs fine, but Rosetta Mini (1.54) never registers any progress even after clocking hours of CPU time (the current process I just aborted clocked at almost 17 hours). I have an Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz (WinXP Professional SP3) . It also won't switch off, freeing up a core for another BOINC (v6.4.7) process to run. |
Klimax Send message Joined: 27 Apr 07 Posts: 44 Credit: 2,800,788 RAC: 68 |
Klimax, why don't you go ahead and take a dump and EMail it to me, along with details on what you observed with it as it ran. I will forward it to the Project Team. Oops, didn't know :-( Last time I reported it, I was told to let it finish and upload. (IIRC) Mail is being prepared. |
TomaszPawel Send message Joined: 28 Apr 07 Posts: 54 Credit: 2,791,145 RAC: 0 |
|
robertmiles Send message Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 2,014 |
I have this error: Looks like you've hit one of the errors still in 1.54 because it's too uncommon to debug quickly. Let's hope your results for that workunit help them finally debug it. Looking at the rest of the jobs your machine has been working on lately, I'd say that you have a lower frequency of errors than I do because you've set up your machine well for aiming at a high score (probably selecting Rosetta@home as your only BOINC project on that machine, selecting leave in memory, and running at 100% CPU usage), while I'm deliberately choosing settings aimed at helping debug problems with the program (giving other BOINC projects enough computer time to prevent workunits from Rosetta@home from being likely to complete without being interrupted to give workunits from other projects a turn, and running at 95% CPU usage, although with leave in memory selected). However, is there any good reason for maintaining such a long queue of jobs waiting for your machine to choose them next, and therefore delaying any work at the Rosetta@home end on your results? I can't tell if you've also tried a few other things I've also found good for getting a high score, such as: 1. Selecting a black screen, instead of the BOINC graphics, as your screen saver, and avoiding activating the BOINC graphics. 2. If you see the lockfile problem in your results, suspend all projects, reboot the machine to clear any lockfiles left behind by failed workunits, then resume the projects. 3. Running the machine 24 hours a day, except when shutting BOINC down for Windows updates or other updates, running antivirus programs, running antispyware programs, and any needed reboots. 4. If you happen to need some update that doesn't require a reboot, such as most Windows Defender updates, only tell BOINC to suspend all jobs while you install the update, instead of shutting it down completely; then resume the projects after the update completes. |
Speedy Send message Joined: 25 Sep 05 Posts: 163 Credit: 808,337 RAC: 1 |
I'm interested to know how Selecting a black screen, instead of the BOINC graphics, as your screen saver, and avoiding activating the BOINC graphics. helps increase a work unit's score? Thanks in advance. Have a crunching good day!! |
TomaszPawel Send message Joined: 28 Apr 07 Posts: 54 Credit: 2,791,145 RAC: 0 |
I have this error: Thanks for the reply. If you look at the scores I achieved for WUs on the host which produced the error, I must tell you 2 important things: 1. That computer originally had a Q6600@3200. On 7 Apr 09 I replaced this CPU with a Q9550@3600. So it is safe to say that credits from 6 Apr 09 and older represent the Q6600, and those from 8 Apr 09 and newer represent the Q9550. 2. I am crunching Rosetta@home on all 4 cores, with GPUGRID on my GTX260. So in reality I run 5 threads under BOINC. Also: Ad 1. I don't use the BOINC screen saver, only the Windows logo screen saver on my CRT NEC 2111SB. Ad 2. I sometimes suspend to play some games... Ad 3. I must shut down my PC at night because it is too loud for me, so it usually crunches from 10 a.m. to 11-12 p.m. Ad 4. Rosetta@home is very GUI friendly because there is no slowdown in the interface. GPUGRID is a real horror in that matter... Running at 100% CPU usage is also set. The leave in memory option was not selected, but today I selected it. I will see what happens :) Also, I work in 32-bit XP with 2x2GB of CL4 DDR2 423 (846). |
dcdc Send message Joined: 3 Nov 05 Posts: 1832 Credit: 119,860,059 RAC: 7,494 |
I'm interested to know how Selecting a black screen, instead of the BOINC graphics, as your screen saver, and avoiding activating the BOINC graphics. Just because your computer doesn't have to do the computation for the graphics thread too. |
Speedy Send message Joined: 25 Sep 05 Posts: 163 Credit: 808,337 RAC: 1 |
I'm interested to know how Selecting a black screen, instead of the BOINC graphics, as your screen saver, and avoiding activating the BOINC graphics. OK thanks Have a crunching good day!! |
robertmiles Send message Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 2,014 |
I'm interested to know how Selecting a black screen, instead of the BOINC graphics, as your screen saver, and avoiding activating the BOINC graphics. Selecting a black screen, which only needs to be calculated once, cuts down on CPU time needed to calculate the graphics, and lets more of what's available be used for the scientific calculations. Since Rosetta@home uses the number of decoys produced as a more important factor in calculating how much credit to give you than the CPU time required to do it, this is likely to increase the number of decoys your computer produces for that workunit, and therefore the resulting score. Also, something involving the graphics seems to be able to trigger the lockfile problem for a workunit, with the results then returned marked as invalid and therefore worth a score of zero. Once a lockfile problem occurs, 1.54 seems to be unable to erase the lockfile from the slot used by that workunit, and therefore lets the problems spread to any 1.54 workunits run later in the same slot but before the next reboot. My results for Ralph@home indicate that the 1.58 now being tested there has kept this same problem, and therefore needs more testing before the 1.54 used at Rosetta@home is replaced with a newer version. |
Evan Send message Joined: 23 Dec 05 Posts: 268 Credit: 402,585 RAC: 0 |
Also, something involving the graphics seems to be able to trigger the lockfile problem for a workunit, with the results then returned marked as invalid I turn the graphics on and off several times during the course of the day to check on the performance and I haven't encountered this lockfile problem for a long time now on both Rosetta and Ralph. Having said that, Murphy's Law states 'watch this space'!! |
robertmiles Send message Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 2,014 |
Also, something involving the graphics seems to be able to trigger the lockfile problem for a workunit, with the results then returned marked as invalid The lockfile problem results could vary depending on what operating system version and what BOINC version is used; if so, my results could easily apply only when using BOINC 6.2.28 under Vista SP1. In other words, I suspect that results from just the two of us aren't enough; we need more people with access to other operating system versions and more versions of BOINC to test for graphics causing the lockfile problem and report the results, along with which operating system version and which BOINC version was used. |
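For anyone reporting lockfile results as suggested above, a small sketch that collects the operating system details in one line. It uses only Python's standard `platform` module; the BOINC version is passed in by hand because this sketch makes no assumption about how to query BOINC itself, and `environment_report` is a made-up name:

```python
import platform


def environment_report(boinc_version: str) -> str:
    """Build a one-line summary of the OS and the BOINC version given by the user."""
    os_name = platform.system()      # e.g. the OS family name
    os_release = platform.release()  # release string reported by the OS
    os_version = platform.version()  # detailed build or version string
    return f"OS: {os_name} {os_release} ({os_version}); BOINC {boinc_version}"


# The BOINC version here is the one mentioned in the post above.
print(environment_report("6.2.28"))
```

Pasting this line alongside each lockfile report would make it easier to see whether the problem tracks a particular OS or BOINC version.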
Paul D. Buck Send message Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
New error: ERROR: dis==0 in pairtermderiv! ERROR:: Exit from: src/core/scoring/methods/PairEnergy.cc line: 338 BOINC:: Error reading and gzipping output datafile: default.out For this task The same task run on an XP machine ran for a long time and only failed on validate. Which is kind of interesting, it almost seems as if my machine (OS-X) tipped over on an assertion or parameter file error ... what is the difference in OS platform guys ... |
©2024 University of Washington
https://www.bakerlab.org