Minirosetta 3.73-3.78

Sid Celery
Joined: 11 Feb 08
Posts: 2122
Credit: 41,196,472
RAC: 9,799
Message 80163 - Posted: 6 Jun 2016, 23:59:00 UTC - in response to Message 80162.  

> Hi Sid,
>
> Thanks for the alert! Looks like these jobs require lots of memory. We have a way to specify how much memory to use. It will be corrected in the next round of submissions!

That would kind of explain why the task runs for a reasonable while before crashing out, and I've seen occasional tasks using 1.2 GB, but I'm running with just short of 10 GB free out of 16 GB total.

I set BOINC to use 60% of memory when the computer is in use (90% when not in use). Do people routinely allocate more than that? What can I safely adjust that setting to, or is it just trial and error?
ID: 80163 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1232
Credit: 14,276,734
RAC: 1,594
Message 80164 - Posted: 7 Jun 2016, 1:00:02 UTC - in response to Message 80163.  

I've found that 64-bit Windows Vista is rather inefficient at handling memory for running 32-bit applications, so I set that computer to use 30% to 40% of the memory for BOINC out of 8 GB. 64-bit Windows 7 and Windows 10 are more efficient, so I set that computer to use 70% out of 16 GB. 64-bit BOINC is not very good at giving up memory when the computer is in use, so these settings are the same for when the computer is in use as when not in use.
ID: 80164 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2122
Credit: 41,196,472
RAC: 9,799
Message 80167 - Posted: 8 Jun 2016, 18:37:32 UTC - in response to Message 80164.  

Useful, thanks. I'll tweak my Min 60% Max 90% to 65% & 85% on both my Win7 machines and see how it goes for now
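For reference, the percentages discussed in these posts translate into absolute RAM budgets as sketched below. This is plain arithmetic on the figures quoted in the thread (8 GB and 16 GB machines); the helper name `ram_budget_gb` is made up for illustration and is not part of BOINC.

```python
def ram_budget_gb(total_gb: float, percent: float) -> float:
    """Return the amount of RAM (in GB) that a percentage limit allows."""
    return total_gb * percent / 100.0

# Figures quoted in the posts above: (label, total RAM in GB, percent limit).
settings = [
    ("16 GB, 60% (in use)", 16, 60),
    ("16 GB, 90% (not in use)", 16, 90),
    ("16 GB, 65% (in use)", 16, 65),
    ("16 GB, 85% (not in use)", 16, 85),
    ("8 GB, 30%", 8, 30),
    ("8 GB, 40%", 8, 40),
    ("16 GB, 70%", 16, 70),
]

for label, total, pct in settings:
    print(f"{label}: {ram_budget_gb(total, pct):.1f} GB")
```

On these numbers, 60% of 16 GB is 9.6 GB, which lines up with the "just short of 10 GB" figure mentioned earlier.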
ID: 80167 · Rating: 0
Andy_Taximan
Joined: 20 Jan 14
Posts: 1
Credit: 736,798
RAC: 0
Message 80176 - Posted: 14 Jun 2016, 18:17:21 UTC

Not much of a problem, but 3 hours to download minirosetta_database_d0bf94b.zip really is a pain! lol, and no, it's not my internet speed.
ID: 80176 · Rating: 0
David Fickes
Joined: 12 Jul 15
Posts: 1
Credit: 1,113,855
RAC: 0
Message 80196 - Posted: 19 Jun 2016, 5:42:15 UTC

I've been having communication problems with the rosetta@home servers since moving to El Capitan. I had to update the BOINC software, but other projects are still running. The log follows:

Sat Jun 18 22:38:32 2016 | rosetta@home | Requesting new tasks for CPU and AMD/ATI GPU and Intel GPU
Sat Jun 18 22:39:03 2016 | | Project communication failed: attempting access to reference site
Sat Jun 18 22:39:03 2016 | rosetta@home | Scheduler request failed: Server returned nothing (no headers, no data)
Sat Jun 18 22:39:04 2016 | | Internet access OK - project servers may be temporarily down.
Sat Jun 18 22:40:08 2016 | World Community Grid | Sending scheduler request: To fetch work.
Sat Jun 18 22:40:08 2016 | World Community Grid | Requesting new tasks for CPU and Intel GPU
Sat Jun 18 22:40:10 2016 | World Community Grid | Scheduler request completed: got 2 new tasks
Sat Jun 18 22:40:12 2016 | World Community Grid | Started download of fahb.FAH2_avx40811-ls_000076-in1.dms
Sat Jun 18 22:40:12 2016 | World Community Grid | Started download of fahb.FAH2_avx40811-ls_000076-in2.dms
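A log like the one above can be filtered for failures with a few lines of Python. This is a minimal sketch that assumes every line has the `timestamp | project | message` shape seen in this excerpt; `LogEntry` and `parse_line` are illustrative names, not part of BOINC.

```python
from typing import NamedTuple

class LogEntry(NamedTuple):
    timestamp: str
    project: str   # empty string for client-wide messages
    message: str

def parse_line(line: str) -> LogEntry:
    """Split one 'timestamp | project | message' line into its three fields."""
    timestamp, project, message = (part.strip() for part in line.split("|", 2))
    return LogEntry(timestamp, project, message)

# A few lines copied from the log above.
LOG = """\
Sat Jun 18 22:38:32 2016 | rosetta@home | Requesting new tasks for CPU and AMD/ATI GPU and Intel GPU
Sat Jun 18 22:39:03 2016 | | Project communication failed: attempting access to reference site
Sat Jun 18 22:39:03 2016 | rosetta@home | Scheduler request failed: Server returned nothing (no headers, no data)
Sat Jun 18 22:40:10 2016 | World Community Grid | Scheduler request completed: got 2 new tasks
"""

entries = [parse_line(line) for line in LOG.splitlines() if line.strip()]
failures = [e for e in entries if "failed" in e.message.lower()]
for e in failures:
    print(e.timestamp, "|", e.project or "(client)", "|", e.message)
```

Only the rosetta@home request and the client-wide communication check are flagged; the World Community Grid request in the same minute completes normally.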
ID: 80196 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2122
Credit: 41,196,472
RAC: 9,799
Message 80431 - Posted: 25 Jul 2016, 17:51:27 UTC

Not sure what's happening with this task atm

000096_C5_0052_0004_fragments_relax_SAVE_ALL_OUT_402757_2_1

CPU time at last checkpoint 07:09:30
CPU time 07:26:30
Elapsed time 07:59:48

62.135% complete (of 8 hour runtime - lagging behind what it should be)

Only at Model 1 Step 10

Getting full CPU time according to Task Manager - heading for the watchdog at that rate

It looks very complicated when I show graphics. Is all well with it?
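To put numbers on "lagging behind", the sketch below compares the reported progress with the share of the 8-hour target already spent in CPU time. It assumes that progress would normally track CPU time against the target runtime, which is how the post reads but is not confirmed by the project; `to_seconds` is an illustrative helper.

```python
def to_seconds(hms: str) -> int:
    """Convert 'HH:MM:SS' to a number of seconds."""
    hours, minutes, seconds = (int(x) for x in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

cpu = to_seconds("07:26:30")         # CPU time
checkpoint = to_seconds("07:09:30")  # CPU time at last checkpoint
elapsed = to_seconds("07:59:48")     # elapsed (wall-clock) time
target = 8 * 3600                    # the 8-hour target runtime from the post
reported_progress = 62.135           # percent, as reported

time_based_progress = 100.0 * cpu / target     # if progress tracked CPU time alone
since_checkpoint_min = (cpu - checkpoint) / 60
cpu_share = 100.0 * cpu / elapsed              # CPU time as a share of elapsed time

print(f"CPU time vs. target runtime: {time_based_progress:.1f}%")
print(f"Reported progress:           {reported_progress:.1f}%")
print(f"CPU time since checkpoint:   {since_checkpoint_min:.0f} min")
print(f"CPU time / elapsed time:     {cpu_share:.1f}%")
```

CPU time is already about 93% of the 8-hour target while reported progress is about 62%, and 17 minutes of CPU time have passed since the last checkpoint.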
ID: 80431 · Rating: 0
anarchic teapot
Joined: 25 Mar 06
Posts: 2
Credit: 511,007
RAC: 331
Message 80488 - Posted: 5 Aug 2016, 14:39:44 UTC

Rosetta Mini 3.73 is running well past the time it's supposed to take on my computer. One task has been running for over 2 days, is shown as being less than 50% done, but the remaining estimated time is blank.

From my logs, I see I've already had trouble with a different Rosetta module this morning; it ended with this error message:
05/08/2016 11:12:31 | rosetta@home | Aborting task fEbH1149_fold_SAVE_ALL_OUT_402410_390_0; not started and deadline has passed

There's also this on my account:

851723397 769539264 22 Jul 2016 9:12:30 UTC 5 Aug 2016 9:12:30 UTC Over No reply New 0.00 --- ---
851723358 769539231 22 Jul 2016 9:12:30 UTC 5 Aug 2016 9:13:03 UTC Over Client error Aborted by user 0.00 0.00 ---
851723337 769539210 22 Jul 2016 9:12:30 UTC 5 Aug 2016 9:13:03 UTC Over Client error Aborted by user 0.00 0.00 ---
851723269 769539146 22 Jul 2016 9:12:30 UTC 5 Aug 2016 9:12:30 UTC Over No reply New 0.00 --- ---
851718775 769535427 22 Jul 2016 8:58:34 UTC 5 Aug 2016 8:58:34 UTC Over No reply New 0.00 --- ---

No, I haven't (yet) aborted any tasks, so I don't know why that message appears. It does look as if Mini 3.73 tasks are overrunning to the extent of being rejected by the server.
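As a quick check on the deadlines, the sketch below computes the gap between the two timestamps of two "No reply" rows from the list above. It assumes the first timestamp is when the task was sent and the second is its deadline; `parse_utc` is an illustrative helper.

```python
from datetime import datetime

FMT = "%d %b %Y %H:%M:%S"  # e.g. "22 Jul 2016 9:12:30" once " UTC" is stripped

def parse_utc(stamp: str) -> datetime:
    """Parse a timestamp such as '22 Jul 2016 9:12:30 UTC' (naive, taken as UTC)."""
    return datetime.strptime(stamp.replace(" UTC", ""), FMT)

# (task id, sent, assumed deadline) copied from two "No reply" rows above.
rows = [
    ("851723397", "22 Jul 2016 9:12:30 UTC", "5 Aug 2016 9:12:30 UTC"),
    ("851718775", "22 Jul 2016 8:58:34 UTC", "5 Aug 2016 8:58:34 UTC"),
]

for task_id, sent, deadline in rows:
    window = parse_utc(deadline) - parse_utc(sent)
    print(task_id, "deadline window:", window.days, "days")
```

Both rows show a window of exactly 14 days between the two timestamps.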

I'm going to terminate all 8 Mini 3.73 tasks currently in my queue & turn Rosetta off for a bit, to give the devs time to fix the problem.
ID: 80488 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1994
Credit: 9,601,580
RAC: 9,078
Message 80489 - Posted: 5 Aug 2016, 15:05:30 UTC

Error on 857376630 task

Exit status: 194 (0xc2)
<message>
finish file present too long
</message>

ID: 80489 · Rating: 0
nanoprobe
Joined: 5 Apr 09
Posts: 8
Credit: 381,804
RAC: 0
Message 80522 - Posted: 9 Aug 2016, 18:48:49 UTC

I installed Android 5.1.1 on a Pine64 device and attached to Rosetta. I received 1 task, which completed and validated. I'm not receiving any more tasks, and the event log says "Minirosetta is not available for your type of computer" every time I try to update. What's up with that?
ID: 80522 · Rating: 0
nanoprobe
Joined: 5 Apr 09
Posts: 8
Credit: 381,804
RAC: 0
Message 80524 - Posted: 9 Aug 2016, 23:58:51 UTC

Looking again there was an upload error.

<message>
upload failure: <file_xfer_error>
<file_name>db_pred12_7mer_android_7res_t1c.2.86_0001_SAVE_ALL_OUT_344206_6803_3_0</file_name>
<error_code>-161 (not found)</error_code>
</file_xfer_error>

</message>
ID: 80524 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1994
Credit: 9,601,580
RAC: 9,078
Message 80732 - Posted: 11 Oct 2016, 8:34:49 UTC

880278687

Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x0116CAE0 write attempt to address 0x017D7EC1

ID: 80732 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1994
Credit: 9,601,580
RAC: 9,078
Message 80766 - Posted: 21 Oct 2016, 6:51:55 UTC

Some of these...
881841371
881841206
etc

ERROR: unrecognized residue TIP
ERROR:: Exit from: ..\..\..\src\core\io\pose_from_sfr\PoseFromSFRBuilder.cc line: 1030
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

ID: 80766 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1994
Credit: 9,601,580
RAC: 9,078
Message 80768 - Posted: 24 Oct 2016, 5:49:15 UTC

Some errors after over 3h of calc (my default runtime is 2h)

882253156
882249925

And this after 6h :-(
882253123

- Unhandled Exception Record -
Reason: Out Of Memory (C++ Exception) (0xe06d7363) at address 0x75CFA6F2


ID: 80768 · Rating: 0
Jesse Viviano
Joined: 14 Jan 10
Posts: 42
Credit: 2,700,472
RAC: 0
Message 86721 - Posted: 24 Jun 2017, 19:15:47 UTC

I am getting some 197 (0x000000C5) EXIT_TIME_LIMIT_EXCEEDED errors. See tasks 923287511, 923287516, and 923287790 for examples. These happened after the website upgrade. I normally run my work units for 1 day, but it looks like I will have to cut that target time to avoid the time limit errors.
ID: 86721 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1994
Credit: 9,601,580
RAC: 9,078
Message 86752 - Posted: 27 Jun 2017, 9:51:48 UTC

Task 920456670 after 10 minutes:
-1073741819 (0xC0000005) STATUS_ACCESS_VIOLATION

Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x01407AF3 read attempt to address 0x00000000
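The exit codes quoted in this thread (194, 197 and 0xC0000005) are shown in both decimal and hexadecimal. The sketch below shows how the two forms relate when the value is treated as a signed 32-bit integer; `as_signed32` and `describe` are illustrative helpers, not BOINC functions.

```python
def as_signed32(value: int) -> int:
    """Interpret a 32-bit value as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

def describe(code: int) -> str:
    """Format an exit code as 'signed-decimal (0xHEX)'."""
    unsigned = code & 0xFFFFFFFF
    return f"{as_signed32(code)} (0x{unsigned:08X})"

for code in (194, 197, 0xC0000005):
    print(describe(code))
```

Small codes such as 194 and 197 are unchanged, while 0xC0000005 has its top bit set and therefore comes out as a negative number in signed form.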

ID: 86752 · Rating: 0
FurryGuy
Joined: 16 May 11
Posts: 2
Credit: 3,684,958
RAC: 0
Message 87216 - Posted: 6 Sep 2017, 0:52:15 UTC
Last modified: 6 Sep 2017, 0:55:00 UTC

Mini Rosetta 3.73 uploads are being rejected by the upload server:

9/5/2017 4:02:40 PM | Rosetta@home | Started upload of rb_09_04_77181_119995__t000__ab_robetta_IGNORE_THE_REST_514917_880_0_r297470578_0
9/5/2017 4:02:40 PM | Rosetta@home | Started upload of 540f84082d4e4eb07396c6091c3b1110_C2_docking_big_job_17_08_21_16_49_globalDocking_3_SAVE_ALL_OUT_512075_9_0_r118051807_0
9/5/2017 4:02:42 PM | Rosetta@home | [error] Error reported by file upload server: [rb_09_04_77181_119995__t000__ab_robetta_IGNORE_THE_REST_514917_880_0_r297470578_0] locked by file_upload_handler PID=-1
9/5/2017 4:02:42 PM | Rosetta@home | [error] Error reported by file upload server: [540f84082d4e4eb07396c6091c3b1110_C2_docking_big_job_17_08_21_16_49_globalDocking_3_SAVE_ALL_OUT_512075_9_0_r118051807_0] locked by file_upload_handler PID=-1
9/5/2017 4:02:42 PM | Rosetta@home | Temporarily failed upload of rb_09_04_77181_119995__t000__ab_robetta_IGNORE_THE_REST_514917_880_0_r297470578_0: transient upload error
9/5/2017 4:02:42 PM | Rosetta@home | Backing off 04:48:03 on upload of rb_09_04_77181_119995__t000__ab_robetta_IGNORE_THE_REST_514917_880_0_r297470578_0
9/5/2017 4:02:42 PM | Rosetta@home | Temporarily failed upload of 540f84082d4e4eb07396c6091c3b1110_C2_docking_big_job_17_08_21_16_49_globalDocking_3_SAVE_ALL_OUT_512075_9_0_r118051807_0: transient upload error
9/5/2017 4:02:42 PM | Rosetta@home | Backing off 00:05:31 on upload of 540f84082d4e4eb07396c6091c3b1110_C2_docking_big_job_17_08_21_16_49_globalDocking_3_SAVE_ALL_OUT_512075_9_0_r118051807_0
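To see when the client will next retry each upload, the back-off duration can be added to the timestamp of the "Backing off" line. This is a minimal sketch that assumes the timestamps are month/day/year on a 12-hour clock, as they appear in this excerpt; `next_retry` is an illustrative helper, not a BOINC function.

```python
from datetime import datetime, timedelta
import re

STAMP_FMT = "%m/%d/%Y %I:%M:%S %p"   # assumed month/day/year, 12-hour clock
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on upload of (\S+)")

def next_retry(line: str):
    """Return (file_name, retry_time) for a 'Backing off' line, else None."""
    stamp, _project, message = (p.strip() for p in line.split("|", 2))
    match = BACKOFF_RE.search(message)
    if match is None:
        return None
    hours, minutes, seconds = (int(g) for g in match.group(1, 2, 3))
    logged_at = datetime.strptime(stamp, STAMP_FMT)
    delay = timedelta(hours=hours, minutes=minutes, seconds=seconds)
    return match.group(4), logged_at + delay

# One line copied from the log above (split across string literals for width).
line = ("9/5/2017 4:02:42 PM | Rosetta@home | Backing off 00:05:31 on upload of "
        "540f84082d4e4eb07396c6091c3b1110_C2_docking_big_job_17_08_21_16_49_"
        "globalDocking_3_SAVE_ALL_OUT_512075_9_0_r118051807_0")
name, when = next_retry(line)
print(name, "->", when)
```

Lines that are not back-off messages return `None`, so the helper can be run over a whole log and the results filtered.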
ID: 87216 · Rating: 0
kcirza
Joined: 30 May 10
Posts: 3
Credit: 12,416,016
RAC: 280
Message 87218 - Posted: 6 Sep 2017, 2:01:10 UTC - in response to Message 87216.  

Ditto here, and most all day today, on all machines running the project.
ID: 87218 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1232
Credit: 14,276,734
RAC: 1,594
Message 87517 - Posted: 14 Oct 2017, 7:02:52 UTC

ID: 87517 · Rating: 0
bcavnaugh
Joined: 7 Dec 13
Posts: 7
Credit: 2,389,640
RAC: 0
Message 87531 - Posted: 16 Oct 2017, 21:05:37 UTC
Last modified: 16 Oct 2017, 21:07:01 UTC

Seems all the RDKIBLER-2layer_2+1 I got today failed under Windows OS
Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz [Family 6 Model 79 Stepping 1]
Microsoft Windows Server 2012 R2 Standard x64 Edition, (06.03.9600.00)
https://boinc.bakerlab.org/results.php?hostid=3112116&offset=0&show_names=0&state=6&appid
https://boinc.bakerlab.org/workunit.php?wuid=854588205
https://boinc.bakerlab.org/show_host_detail.php?hostid=3112116

Crunching@EVGA The Number One Team in the BOINC Community. Folding@EVGA The Number One Team in the Folding@Home Community.
ID: 87531 · Rating: 0
Mad_Max
Joined: 31 Dec 09
Posts: 209
Credit: 25,945,383
RAC: 13,162
Message 87546 - Posted: 20 Oct 2017, 14:34:23 UTC

Sometimes minirosetta somehow loses already-calculated results when a task restarts (e.g. after a computer or BOINC restart, or just a switch to another project when several are sharing the same CPU).
I am not talking about checkpoints in the middle of a model calculation, but about entire models which had already been calculated successfully but had not yet been reported to the server.

Here is an example: https://boinc.bakerlab.org/result.php?resultid=948310700
======================================================
DONE :: 133 starting structures 28635.4 cpu seconds
This process generated 133 decoys from 133 attempts
======================================================

But after a task restart (NOT a crash or hang, just a normal, clean restart where the task was unloaded from memory and later loaded back from disk), only the last model (decoy) was reported to the server.
======================================================
DONE :: 1 starting structures 28382.7 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================


All 133 previously calculated decoys were lost.

This does not happen often, but I see such a task from time to time (maybe 1-2 per week).
The best way to find such tasks is to query the database for VALID tasks with abnormally low credit compared to the CPU time used, because credit is calculated in proportion to the number of decoys reported. If many decoys were lost, the granted credit will be abnormally low.
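The search described above could look roughly like the sketch below. It runs on made-up records rather than the project database, and the field layout, the 20% threshold, and the name `low_credit_tasks` are all assumptions for illustration.

```python
from statistics import median

# Hypothetical records: (task_id, cpu_seconds, granted_credit). Not real data.
tasks = [
    ("task-a", 28000.0, 250.0),
    ("task-b", 29000.0, 260.0),
    ("task-c", 28400.0, 2.0),    # suspiciously little credit for its CPU time
    ("task-d", 14000.0, 120.0),
]

def low_credit_tasks(records, ratio_threshold=0.2):
    """Return ids of tasks whose credit per CPU-hour is far below the median."""
    rates = {tid: credit / (cpu / 3600.0) for tid, cpu, credit in records if cpu > 0}
    typical = median(rates.values())
    return [tid for tid, rate in rates.items() if rate < ratio_threshold * typical]

print(low_credit_tasks(tasks))
```

Comparing against the median credit rate rather than a fixed number keeps the check usable across hosts of different speeds, though a real query would also need to restrict itself to VALID results.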
ID: 87546 · Rating: 0
©2024 University of Washington
https://www.bakerlab.org