Message boards : Number crunching : Problems with rosetta 5.48
Author | Message |
---|---|
Ingemar Send message Joined: 28 Feb 06 Posts: 20 Credit: 1,680 RAC: 0 |
Please report here for problems you have observed with Rosetta version 5.48. |
ramostol Send message Joined: 6 Feb 07 Posts: 64 Credit: 584,052 RAC: 0 |
Sorry, but our problems are not over. On my Mac iBook G4 10.3.9 the very first WU administered by 5.48 had to restart after being unable to find its process. From my internal notes: (This WU was not docked.) File stdoutdae.txt, line no. 12747-: 2007-03-03 00:01:33 [rosetta@home] Starting FRA_t349_IG9_hom001_1_t349_1_model_1o1za.pdb_1586_12_0 2007-03-03 00:01:33 [rosetta@home] Starting task FRA_t349_IG9_hom001_1_t349_1_model_1o1za.pdb_1586_12_0 using rosetta version 548 2007-03-03 01:14:42 [---] Restarting FRA_t349_IG9_hom001_1_t349_1_model_1o1za.pdb_1586_12_0 - message timeout 2007-03-03 01:14:43 [---] [error] Process 5284 not found At 01:50:00 - after 35 minutes of computing - claims to have used 1:17:00 CPU !! Progress: 38.902 % -- quite abnormal, ordinary processes rise from 0.0 % to about 1.5 % and then to 100 %... The "To completion" time is much lower than currently displayed for an ordinary process. Finished at 2:00:21 after merely 45 min. computing - I can't believe it - no WU has ever used less than one hour even when running undisturbed by competing CPU tasks. Result 65326544: FRA_t349_IG9_hom001_1_t349_1_model_1o1za.pdb_1586_12_0: stderr out <core_client_version>5.8.15</core_client_version> <![CDATA[ <stderr_txt> # random seed: 3185788 # cpu_run_time_pref: 7200 # cpu_run_time_pref: 7200 ====================================================== DONE :: 1 starting structures built 2 (nstruct) times This process generated 2 decoys from 2 attempts 0 starting pdbs were skipped ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down... </stderr_txt> For the record: Most often these missing processes are time-consuming and irritating, but harmless. But occasionally they lead to computing errors - all computing errors produced on my iBook originate from second processings of WUs that lost their processes when run the first time. R. A. Mostol |
Huge Send message Joined: 8 Jan 06 Posts: 1 Credit: 5,034 RAC: 0 |
Hi all, All of a sudden I get the following message: 3-3-2007 22:26:45|rosetta@home|Sending scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi 3-3-2007 22:26:45|rosetta@home|Reason: To fetch work 3-3-2007 22:26:45|rosetta@home|Requesting 8640 seconds of new work 3-3-2007 22:26:51|rosetta@home|Scheduler request succeeded 3-3-2007 22:26:51|rosetta@home|Message from server: Your computer has 447.48MB of memory, and a job requires 476.84MB 3-3-2007 22:26:51|rosetta@home|Message from server: No work sent 3-3-2007 22:26:51|rosetta@home|Message from server: (there was work but your computer doesn't have enough memory) 3-3-2007 22:26:51|rosetta@home|No work from project I have changed NOTHING on my computer. I also run lhcathome, LEIDEN Classical and SETI. Does anyone have any ideas? Best regards, Huge |
netwraith Send message Joined: 3 Sep 06 Posts: 80 Credit: 13,483,227 RAC: 0 |
-- Did 5.48 change the memory requirements... I have a system that was using 80MB of 512mb ... now the jobs are saying 478MB required.. "Your computer does not have enough memory"..... What is up ???? *update* I am showing a few jobs on my larger Linux systems that are taking 390mb in real memory, but, nothing more than that... Average jobs are still in the 100mb range... The machines showing these jobs have 4GB or more each. Looking for a team ??? Join BoincSynergy!! |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
I've got the same this morning, still no work. 3/4/2007 09:50:05|rosetta@home|Sending scheduler request: To fetch work 3/4/2007 09:50:05|rosetta@home|Requesting 5236 seconds of new work 3/4/2007 09:50:10|rosetta@home|Scheduler RPC succeeded [server version 509] 3/4/2007 09:50:10|rosetta@home|Message from server: Your preferences limit memory usage to 460.30MB, and a job requires 476.84MB 3/4/2007 09:50:10|rosetta@home|Message from server: No work sent 3/4/2007 09:50:10|rosetta@home|Message from server: (there was work but your computer doesn't have enough memory) P.S. This job is running at the moment: s036__BOINC_ABRELAX_hom013__1583_1618_0 using rosetta version 548 |
Rob Lilley Send message Joined: 11 Jan 06 Posts: 11 Credit: 133,120 RAC: 0 |
My machine has 512mb of memory too, and I had the same message come up when Rosetta 5.48 appeared. It seems that in General Preferences you can now determine what percentage of memory (actual not virtual) a project can use. I just increased the percentage to use when the computer is idle from 90 to 95% and it downloaded just fine. Now let's see what happens when the crunching / fun starts! |
netwraith Send message Joined: 3 Sep 06 Posts: 80 Credit: 13,483,227 RAC: 0 |
-- Won't help mine... Something meschuga in the PCI/ACPI code.. uses 80MB for kernel space.. I suspect that a bunch of it is wasted, but, can't get it fixed without BIOS code... and the MB manu discontinued the model... no more code to be had... so.. will need to wait for smaller jobs or switch off to other crunching tasks... pity, cuz it's a relatively hot machine... maybe I will add some more RAM next week, but, for now..... Looking for a team ??? Join BoincSynergy!! |
UtahTestLabs Send message Joined: 1 Jan 07 Posts: 4 Credit: 164,281 RAC: 0 |
My WU is at 1% and won't go any higher. The CPU time is counting, but the percentage is constantly 1% and the time remaining is counting up. |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
Well, I have increased both memory usage settings in BOINC to 90% and I still cannot get work. |
hedera Send message Joined: 15 Jul 06 Posts: 76 Credit: 5,263,150 RAC: 87 |
This is interesting; I have not received any of the "you don't have enough memory" messages with 5.48; and I have 1GB memory on my system. I can't imagine why it should need so much that it can't run on a box with 512MB; but I'm having no trouble running 2 tasks (and only Rosetta) on a 1GB system. --hedera Never be afraid to try something new. Remember that amateurs built the ark. Professionals built the Titanic. |
River~~ Send message Joined: 15 Dec 05 Posts: 761 Credit: 285,578 RAC: 0 |
This is interesting; I have not received any of the "you don't have enough memory" messages with 5.48; and I have 1GB memory on my system. I can't imagine why it should need so much that it can't run on a box with 512MB; but I'm having no trouble running 2 tasks (and only Rosetta) on a 1GB system. Yes, and a possibly related effect: 5.48 seems a little more efficient in its effects on other programs on a 2-cpu system. Detail: On a 2 cpu (2 separate chips, not one of these new-fangled multicore jobs), I have been tracking the performance of CPDN running on one cpu with other tasks running on the other cpu, averaging over 12 hours or so of timesteps. Taking the speed of crunching CPDN when the other cpu is idle as 100, then the speed when crunching Rosetta on the other cpu dropped to 95.4 +/- 0.5 with 5.46, but improved to 97.6 +/- 0.6 as soon as 5.48 started to run. This is a real effect: the timesteps get closer together as soon as Rosetta starts on its new version, and although the timesteps vary in length both before and after, there is no overlap at all in the sets of values. By comparison, when running CPDN alongside another CPDN the speed of each drops to 87.5 +/- 0.4, which is why I usually run just one CPDN and let Rosetta have the other cpu. This is a severe test of the combination of tasks, as the box has only 256Mb of RAM, far less than either CPDN or Rosetta would like even when running solo. In its favour it does not have a GUI to soak up extra cycles. Conjecture: My guess is that Rosetta is playing more friendly in one or both of the following ways. The new tasks could simply be doing work that is confined in tighter loops. This would mean that the Rosetta core would be able to keep its code in cache for more of the time, and would not be contending with the other core for RAM access to re-load program code. The new tasks could simply be using less memory overall, meaning that less of both programs' virtual memory is paged out to disk. 
In view of hedera's comments, the second possibility seems more likely. Sadly I have not been monitoring swapfile usage, so I can't actually tell. Question: If I am right that the new tasks are using less memory, is this simply an artefact of the particular jobs they have been given, or is it down to some re-optimisation of the code in the Rosetta app? btw, apols for being OT here, as this is not a *problem* ;-) Like hedera I thought that positive feedback might be interesting ... R~~ |
rumbach Send message Joined: 15 Aug 06 Posts: 1 Credit: 30,180 RAC: 0 |
Have 512mb on a 1.2ghz cpu with w2k. I can no longer get work units after finishing the last one. 3/4/2007 12:16:54 AM|rosetta@home|Message from server: Your preferences limit memory usage to 460.34MB, and a job requires 476.84MB 3/4/2007 12:16:54 AM|rosetta@home|Message from server: No work sent 3/4/2007 12:16:54 AM|rosetta@home|Message from server: (there was work but your computer doesn't have enough memory) Changed preferences to 95%, changed pagefile to 2gb, still cannot get any work. |
William Ostie Send message Joined: 6 Feb 07 Posts: 5 Credit: 1,125,655 RAC: 0 |
Hi all, |
RichardJ Send message Joined: 19 Mar 06 Posts: 8 Credit: 73,014 RAC: 0 |
Hi all, |
RichardJ Send message Joined: 19 Mar 06 Posts: 8 Credit: 73,014 RAC: 0 |
Me too: 04/03/2007 09:58:06|rosetta@home|Message from server: Your computer has 223.48MB of memory, and a job requires 476.84MB Been happily chugging away for nearly a year now and not seen this before. Anything I can do? Hi all, |
288VKYUjwsXfAaTXn6SFJC4LVPRf Send message Joined: 16 Dec 05 Posts: 31 Credit: 153,110 RAC: 0 |
My WU is at 1% and won't go any higher. The CPU time is counting, but the percentage is constantly 1% and the time remaining is counting up. The same for me. Here is the link to my Failed WU Paused the WU, resumed it. Closed Boinc and restarted Boinc. Nothing worked. It was just totally frozen after 1 minute and a couple of seconds. |
Rene Send message Joined: 2 Dec 05 Posts: 10 Credit: 67,269 RAC: 0 |
My Ubuntu (6.10) host just "froze" a couple of minutes ago while crunching this wu... I had to do a "hard reset" because nothing was responding... wu has restarted now. Problem appeared just before reaching 41%. ;-) Edit: just happened for the second time... after reaching 41.602%... did a "hard reset" again... wu restarted and cpu time is at approx 1:19:00 Edit 2: and it's repeating.... third reset of the host was needed... % went back to 41.600 and cpu time to 1:15.44 I will give it another try, but will abort it if it occurs again. |
Rene Send message Joined: 2 Dec 05 Posts: 10 Credit: 67,269 RAC: 0 |
I will give it another try, but will abort it if it occurs again. Just did... after it "froze" the complete system again the wu kicked back again to 41.600% and it looks like the next stage could not be reached. Kept an eye on the running processes and just before things got bad, 5.48 went back to 0% of CPU use. Other Rosetta wu in queue seems to have the same problem... only this one stopped at 1.030% ;-) |
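
For anyone trying to make sense of the "not enough memory" messages quoted in this thread, here is a rough arithmetic sketch. It is not the BOINC scheduler's code. It only assumes that the server compares the job requirement (476.84MB, taken from the messages above) against detected RAM multiplied by the memory percentage set in General Preferences. The function names are made up for illustration, and 512MB is the nominal RAM several posters mention (the client reports slightly less).

```python
# Hypothetical helpers; the comparison rule is an assumption, not BOINC source.
JOB_MB = 476.84  # "a job requires 476.84MB", as quoted in the server messages


def usable_memory_mb(detected_mb: float, pref_percent: float) -> float:
    """Memory the project may use under the assumed rule: RAM * percentage."""
    return detected_mb * pref_percent / 100.0


def can_receive_job(detected_mb: float, pref_percent: float,
                    job_mb: float = JOB_MB) -> bool:
    """True if the assumed usable memory covers the job requirement."""
    return usable_memory_mb(detected_mb, pref_percent) >= job_mb


def min_percent_needed(detected_mb: float, job_mb: float = JOB_MB):
    """Smallest percentage that would admit the job, or None if RAM is too small."""
    percent = job_mb / detected_mb * 100.0
    return percent if percent <= 100.0 else None


if __name__ == "__main__":
    print(can_receive_job(512.0, 90))              # nominal 512MB at 90%
    print(can_receive_job(512.0, 95))              # nominal 512MB at 95%
    print(min_percent_needed(512.0))               # threshold for 512MB
    print(min_percent_needed(447.48))              # machine reporting 447.48MB
    print(round(500_000_000 / 2**20, 2))           # 500,000,000 bytes in MB
```

Under this assumption a nominal 512MB machine fails at 90% (460.8MB) and passes at 95% (486.4MB), which is consistent with the report that raising the setting from 90 to 95% let work download. A machine reporting 447.48MB or 223.48MB cannot pass at any percentage. The requirement also works out to 500,000,000 bytes expressed in units of 1,048,576 bytes, which may or may not be how it was configured. Since one poster still got no work at 95%, the real check evidently involves more than this simple comparison.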
©2024 University of Washington
https://www.bakerlab.org