Posts by amgthis

21) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 81435)
Posted 15 Apr 2017 by amgthis
Post:
I'll take a look. Sorry for being late on this.



At 17:40 Pacific time here, I still have 16 queued up waiting in line...

Happy Easter everyone!

Cheers,

/M
22) Message boards : Number crunching : No more work (Message 80960)
Posted 30 Dec 2016 by amgthis
Post:


Alright! More work!! Happy New Year!

cheers!
23) Message boards : Number crunching : No more work (Message 80959)
Posted 30 Dec 2016 by amgthis
Post:
I see the server status shows no more work queued for download. I guess if no one is at work today, Friday,
then we will be out at least over the New Year's weekend. Hopefully more will come next week?
Happy New Year everyone!

24) Message boards : Number crunching : No more work (Message 80958)
Posted 30 Dec 2016 by amgthis
Post:
I see the server status shows no more work queued for download. I guess if no one is at work today, Friday,
then we will be out at least over the New Year's weekend. Hopefully more will come next week?
Happy New Year everyone!
25) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 80608)
Posted 9 Sep 2016 by amgthis
Post:
Here's an update, sorry for the delay.

Our database server is running out of disk space. We had to reconfigure it, which took a long time because it was over 140 GB, and it is still operating at a very sluggish pace. Our project has been quite busy lately, mainly due to Charity Engine providing thousands of new hosts each day. This has been going on for quite some time, and our database finally reached its space limit with the current project configuration. We are working on a temporary solution since our full upgrade will take some time, on the order of months I am told.

At least it's good to know none of this could be foreseen. I mean, really?

It will take some time to settle as there are a lot of jobs (millions) that need to be processed. We plan to have another long period of down time when we transition to the temporary upgrade for the database server. Keith, Darwin, and Patrick, our sys admins, are working on getting it set up now.

So in the near future expect intermittent down time. The project status page may be incorrect, and data dumps and credit granting for failed jobs may also be delayed. Expect this to improve as we catch up on things.

On a side note, I was told that Charity Engine is going to detach from our project soon due to commercial interests/projects. This obviously will help our servers but unfortunately we'll see a huge drop in throughput. We greatly appreciate the massive computing they've provided us and hope to get their hosts crunching again for us in the future if possible.


Hope that was worthwhile, given all the regular users you lose who gave up and moved elsewhere due to no more work from the project.

This is a shame, I think. Lots of volunteers basically getting the heave-ho due to whatever this 'charity engine' thing is. Good luck.
26) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 80262)
Posted 24 Jun 2016 by amgthis
Post:
I now have 15 tasks stuck on "Uploading." Some have missed their deadline. Server Status Page shows all green.


Yep, I've got 12 finished results due earlier today. Orphaned on the vine.

I saw a few uploads get thru when I restarted my NICs but it died pretty quickly.

I've got 73 results queued for upload right now. Stuff happens ....
27) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 80239)
Posted 23 Jun 2016 by amgthis
Post:
Something's busted.
28) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 79971)
Posted 29 Apr 2016 by amgthis
Post:
All of my pred5csxxxxx w/u's terminate after 2-3 hrs. but show no error.
Is this by design and OK?

I don't want to dump good w/u's for no reason.

good luck with the server work.

/mike
29) Message boards : Number crunching : Problems with web site (Message 77186)
Posted 31 Jul 2014 by amgthis
Post:
Let me share what I can. I am not at UW, and so have no first-hand knowledge of this specific situation. But let me offer the following:

Thanks for the update on the situation.

Now would be a great time to suspend network communication until the project gets a chance to get its head back up above water...

Thanks.

/Mike

30) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 77135)
Posted 30 Jul 2014 by amgthis
Post:
It seems like the server 'status' page for Rosetta@home
rarely shows when anything is down. <sigh>

30-Jul-2014 06:42:38 [rosetta@home] Started upload of rtrpv1_full_length_rosettacm_cartrelax_truncated_asymm_IGNORE_THE_REST_176238_47563_0_0
30-Jul-2014 06:42:38 [rosetta@home] Started upload of rb_07_21_48263_94856_ab_stage0_t000___robetta_IGNORE_THE_REST_08_06_179551_12_0_0
30-Jul-2014 06:43:46 [---] Project communication failed: attempting access to reference site
30-Jul-2014 06:43:46 [rosetta@home] Temporarily failed upload of rtrpv1_full_length_rosettacm_cartrelax_truncated_asymm_IGNORE_THE_REST_176238_47563_0_0: connect() failed
30-Jul-2014 06:43:46 [rosetta@home] Backing off 2 hr 30 min 58 sec on upload of rtrpv1_full_length_rosettacm_cartrelax_truncated_asymm_IGNORE_THE_REST_176238_47563_0_0
30-Jul-2014 06:43:46 [rosetta@home] Temporarily failed upload of rb_07_21_48263_94856_ab_stage0_t000___robetta_IGNORE_THE_REST_08_06_179551_12_0_0: connect() failed
30-Jul-2014 06:43:46 [rosetta@home] Backing off 4 hr 11 min 26 sec on upload of rb_07_21_48263_94856_ab_stage0_t000___robetta_IGNORE_THE_REST_08_06_179551_12_0_0
30-Jul-2014 06:43:53 [---] Internet access OK - project servers may be temporarily down.
30-Jul-2014 06:44:58 [rosetta@home] Started upload of rtrpv1_full_length_rosettacm_cartrelax_truncated_asymm_IGNORE_THE_REST_176238_47563_0_0
30-Jul-2014 06:44:58 [rosetta@home] Started upload of rb_07_21_48263_94856_ab_stage0_t000___robetta_IGNORE_THE_REST_08_06_179551_12_0_0
30-Jul-2014 06:46:07 [---] Project communication failed: attempting access to reference site
30-Jul-2014 06:46:07 [rosetta@home] Temporarily failed upload of rtrpv1_full_length_rosettacm_cartrelax_truncated_asymm_IGNORE_THE_REST_176238_47563_0_0: connect() failed
30-Jul-2014 06:46:07 [rosetta@home] Backing off 3 hr 5 min 54 sec on upload of rtrpv1_full_length_rosettacm_cartrelax_truncated_asymm_IGNORE_THE_REST_176238_47563_0_0
30-Jul-2014 06:46:07 [rosetta@home] Temporarily failed upload of rb_07_21_48263_94856_ab_stage0_t000___robetta_IGNORE_THE_REST_08_06_179551_12_0_0: connect() failed
30-Jul-2014 06:46:07 [rosetta@home] Backing off 5 hr 18 min 18 sec on upload of rb_07_21_48263_94856_ab_stage0_t000___robetta_IGNORE_THE_REST_08_06_179551_12_0_0
30-Jul-2014 06:46:08 [---] Internet access OK - project servers may be temporarily down.
30-Jul-2014 06:49:55 [---] Received signal 15
30-Jul-2014 06:49:56 [---] Exit requested by user
30-Jul-2014 06:51:10 [---] Starting BOINC client version 7.0.27 for x86_64-pc-linux-gnu
30-Jul-2014 06:51:10 [---] log flags: file_xfer, sched_ops, task
30-Jul-2014 06:51:10 [---] Libraries: libcurl/7.26.0 OpenSSL/1.0.1e zlib/1.2.7 libidn/1.25 libssh2/1.4.2 librtmp/2.3
30-Jul-2014 06:51:10 [---] Data directory: /var/lib/boinc-client
30-Jul-2014 06:51:10 [---] Processor: 4 GenuineIntel Intel(R) Core(TM) i5-2500K CPU @ 3.30GHz [Family 6 Model 42 Stepping 7]
30-Jul-2014 06:51:10 [---] Processor: 6.00 MB cache
30-Jul-2014 06:51:10 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
30-Jul-2014 06:51:10 [---] OS: Linux: 3.2.0-4-amd64
30-Jul-2014 06:51:10 [---] Memory: 7.79 GB physical, 15.62 GB virtual
30-Jul-2014 06:51:10 [---] Disk: 18.15 GB total, 12.60 GB free
30-Jul-2014 06:51:10 [---] Local time is UTC -7 hours
30-Jul-2014 06:51:10 [---] No usable GPUs found
30-Jul-2014 06:51:10 [---] Config: GUI RPC allowed from:
30-Jul-2014 06:51:10 [---] Config: 192.168.242.174
30-Jul-2014 06:51:10 [---] A new version of BOINC is available. <a href=http://boinc.berkeley.edu/download.php>Download it.</a>
30-Jul-2014 06:51:10 [rosetta@home] URL http://boinc.bakerlab.org/rosetta/; Computer ID 1675855; resource share 100
30-Jul-2014 06:51:10 [rosetta@home] General prefs: from rosetta@home (last modified 19-Dec-2010 18:19:25)
30-Jul-2014 06:51:10 [rosetta@home] Computer location: home
30-Jul-2014 06:51:10 [---] General prefs: using separate prefs for home
30-Jul-2014 06:51:10 [---] Reading preferences override file
30-Jul-2014 06:51:10 [---] Preferences:
30-Jul-2014 06:51:10 [---] max memory usage when active: 7176.28MB
30-Jul-2014 06:51:10 [---] max memory usage when idle: 7176.28MB
30-Jul-2014 06:51:10 [---] max disk usage: 14.84GB
30-Jul-2014 06:51:10 [---] (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
30-Jul-2014 06:51:10 [---] Not using a proxy
Initialization completed
30-Jul-2014 06:51:56 [rosetta@home] Started upload of tj_7_11_2helix_highRadius_X16_BBB_14_GB_1_o_fb_fragments_abinitio_SAVE_ALL_OUT_174752_292_0_0
30-Jul-2014 06:51:56 [rosetta@home] Started upload of ab_Tx767_t000__cstwt_1.0_IGNORE_THE_REST_03_09_179579_2110_0_0
30-Jul-2014 06:51:56 [rosetta@home] Restarting task frxtrimer_b5_04744_r3_A_frxtrimer_b5_04744_r3_B_patchdock_split_06_140721_SAVE_ALL_OUT__179597_176_0 using minirosetta version 352 in slot 3
30-Jul-2014 06:52:21 [rosetta@home] Restarting task 5H2LD_3_A_5H2LD_3_B_patchdock_split_05_140722_SAVE_ALL_OUT__179674_600_0 using minirosetta version 352 in slot 0
30-Jul-2014 06:52:21 [rosetta@home] Restarting task rb_07_20_48108_94838_ab_stage0_h001___robetta_IGNORE_THE_REST_09_09_179487_21_0 using minirosetta version 352 in slot 2
30-Jul-2014 06:52:41 [rosetta@home] Restarting task benchmark_0026_master_9699c665b4702afa86c605c374d1e7c8266f4b0e_T5_0.00_7.10_0.00_contact_opt_iteration_2_b417a9e3170e49b3874bdd0ac2ed91a3_fold_SAVE_ALL_OUT_179672_1875_0 using minirosetta version 352 in slot 1
30-Jul-2014 06:56:48 [---] Project communication failed: attempting access to reference site
30-Jul-2014 06:56:48 [rosetta@home] Temporarily failed upload of tj_7_11_2helix_highRadius_X16_BBB_14_GB_1_o_fb_fragments_abinitio_SAVE_ALL_OUT_174752_292_0_0: connect() failed
30-Jul-2014 06:56:48 [rosetta@home] Backing off 3 hr 26 min 31 sec on upload of tj_7_11_2helix_highRadius_X16_BBB_14_GB_1_o_fb_fragments_abinitio_SAVE_ALL_OUT_174752_292_0_0
30-Jul-2014 06:56:48 [rosetta@home] Started upload of hc_centroids_2bf5_4_0.25_06-01-14_SAVE_ALL_OUT_168127_3439_0_0
30-Jul-2014 06:57:01 [---] Internet access OK - project servers may be temporarily down.
30-Jul-2014 06:58:03 [---] Project communication failed: attempting access to reference site
30-Jul-2014 06:58:03 [rosetta@home] Temporarily failed upload of hc_centroids_2bf5_4_0.25_06-01-14_SAVE_ALL_OUT_168127_3439_0_0: connect() failed
30-Jul-2014 06:58:03 [rosetta@home] Backing off 54 min 35 sec on upload of hc_centroids_2bf5_4_0.25_06-01-14_SAVE_ALL_OUT_168127_3439_0_0
30-Jul-2014 06:58:03 [rosetta@home] Started upload of tj_7_11_2helix_highRadius_X16_BAB_14_BBGB_1_h_fb_fragments_abinitio_SAVE_ALL_OUT_174713_373_0_0
30-Jul-2014 06:58:10 [---] Internet access OK - project servers may be temporarily down.
30-Jul-2014 06:58:12 [---] Project communication failed: attempting access to reference site
30-Jul-2014 06:58:12 [rosetta@home] Temporarily failed upload of ab_Tx767_t000__cstwt_1.0_IGNORE_THE_REST_03_09_179579_2110_0_0: transient HTTP error
30-Jul-2014 06:58:12 [rosetta@home] Backing off 1 hr 45 min 48 sec on upload of ab_Tx767_t000__cstwt_1.0_IGNORE_THE_REST_03_09_179579_2110_0_0
30-Jul-2014 06:58:14 [---] Internet access OK - project servers may be temporarily down.
30-Jul-2014 06:59:12 [---] Project communication failed: attempting access to reference site
30-Jul-2014 06:59:12 [rosetta@home] Temporarily failed upload of tj_7_11_2helix_highRadius_X16_BAB_14_BBGB_1_h_fb_fragments_abinitio_SAVE_ALL_OUT_174713_373_0_0: connect() failed
30-Jul-2014 06:59:12 [rosetta@home] Backing off 7 min 56 sec on upload of tj_7_11_2helix_highRadius_X16_BAB_14_BBGB_1_h_fb_fragments_abinitio_SAVE_ALL_OUT_174713_373_0_0
30-Jul-2014 06:59:14 [---] Internet access OK - project servers may be temporarily down.
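For anyone who wants to tally the retry delays in a log like the one above, here is a rough sketch in Python. It only assumes the "Backing off ... on upload of ..." wording seen in these lines; the function names `backoff_seconds` and `summarize` are made up for illustration, and other BOINC client versions may word the message differently.

```python
import re
from collections import defaultdict

# Matches the message text seen in the log above, for example:
#   Backing off 2 hr 30 min 58 sec on upload of <file>
# The hr/min/sec parts are each treated as optional (an assumption).
BACKOFF_RE = re.compile(
    r"Backing off (?:(\d+) hr )?(?:(\d+) min )?(?:(\d+) sec )?on upload of (\S+)"
)


def backoff_seconds(line):
    """Return (file_name, delay_in_seconds) for a backoff line, else None."""
    m = BACKOFF_RE.search(line)
    if m is None:
        return None
    hours, minutes, seconds, name = m.groups()
    total = int(hours or 0) * 3600 + int(minutes or 0) * 60 + int(seconds or 0)
    return name, total


def summarize(lines):
    """Map each file name to the list of backoff delays (seconds) logged for it."""
    delays = defaultdict(list)
    for line in lines:
        parsed = backoff_seconds(line)
        if parsed is not None:
            delays[parsed[0]].append(parsed[1])
    return dict(delays)
```

Feeding it the lines of the client log (for example `summarize(open(path))`, with `path` pointing at your own log file) gives one list of delays per stuck upload, which makes it easy to see whether the client keeps retrying the same files.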
31) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 77104)
Posted 29 Jul 2014 by amgthis
Post:
I'm having a ton of problems in the last couple of weeks that I *thought* appeared to be DNS lookup related, but I've come to find that the only application with the problem is Rosetta. Reading here about server problems gives me a little bit of relief in thinking it's not all on my end, or a telco problem. At least there is some clue to server status with the project posted here. Thanks for the efforts.

/Mike
32) Questions and Answers : Unix/Linux : firewall rules for ICMP - Rosetta@home (Message 75250)
Posted 18 Mar 2013 by amgthis
Post:
Certainly sounds likely. You (and the project servers) could really make good use of a caching proxy. That would eliminate downloads of the same files (such as the application executable and the database) for each and every machine.


An excellent idea that I've been putting off for too long... thanks Mod.Sense

/Mike
33) Questions and Answers : Unix/Linux : firewall rules for ICMP - Rosetta@home (Message 75245)
Posted 17 Mar 2013 by amgthis
Post:
Iptables log shows ICMP connections to 128.95.160.144 sometimes blocked with a TYPE=3 and CODE=10. Could this be the project rejecting the connection because of too many connections already to my IP? I have 21 boxes sharing the one outgoing IP thru my firewall, maybe the project rejects over x-number of current connections at one time?

Sorry for being so ignorant of Iptables rules.
I'm trying to make sure I don't have a problem on my side, but usually I can connect with Rosetta@home fine. Sometimes downloads stall out for awhile, maybe it's related.
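
For reference, TYPE=3 is ICMP Destination Unreachable, and CODE=10 in the IANA registry reads "Communication with Destination Host is Administratively Prohibited". That message is normally generated by a filtering device somewhere on the path; on its own it does not show whether a per-IP connection limit is involved. Below is a small Python sketch for decoding these fields from a log line. The helper name `describe_icmp` and the sample line in the usage note are illustrative assumptions, not taken from the real log.

```python
import re

# ICMP type 3 (Destination Unreachable) codes, per the IANA ICMP registry.
DEST_UNREACHABLE_CODES = {
    0: "Net Unreachable",
    1: "Host Unreachable",
    2: "Protocol Unreachable",
    3: "Port Unreachable",
    4: "Fragmentation Needed and Don't Fragment was Set",
    5: "Source Route Failed",
    6: "Destination Network Unknown",
    7: "Destination Host Unknown",
    8: "Source Host Isolated",
    9: "Communication with Destination Network is Administratively Prohibited",
    10: "Communication with Destination Host is Administratively Prohibited",
    11: "Destination Network Unreachable for Type of Service",
    12: "Destination Host Unreachable for Type of Service",
    13: "Communication Administratively Prohibited",
    14: "Host Precedence Violation",
    15: "Precedence cutoff in effect",
}

# Looks for the TYPE=<n> CODE=<n> pair as it appears in the post above.
TYPE_CODE_RE = re.compile(r"\bTYPE=(\d+)\s+CODE=(\d+)\b")


def describe_icmp(log_line):
    """Return a readable description of the ICMP type/code in a log line, or None."""
    m = TYPE_CODE_RE.search(log_line)
    if m is None:
        return None
    icmp_type, code = int(m.group(1)), int(m.group(2))
    if icmp_type != 3:
        return f"ICMP type {icmp_type}, code {code} (not decoded by this sketch)"
    name = DEST_UNREACHABLE_CODES.get(code, "unknown code")
    return f"Destination Unreachable: {name}"
```

Calling `describe_icmp("PROTO=ICMP TYPE=3 CODE=10")` on a made-up fragment like this returns the administratively-prohibited description.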

Thanks for any help from you admin guru types.

Regards,

Mike
34) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 75126)
Posted 18 Feb 2013 by amgthis
Post:
Server page is all green however...


I'll bet I've seen that 'server status' page actually show something as 'down'
maybe once over a few years. AFAIK it's rarely if ever accurate or updated.

Oh well.


Looks like the server status pages have been updated to 'disabled'.
Someone is paying attention.

I'm sure the servers will be back up soon.
35) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 75125)
Posted 18 Feb 2013 by amgthis
Post:
Server page is all green however...


I'll bet I've seen that 'server status' page actually show something as 'down'
maybe once over a few years. AFAIK it's rarely if ever accurate or updated.

Oh well.
36) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 75106)
Posted 17 Feb 2013 by amgthis
Post:
Server page is all green however...

I'll bet I've seen that 'server status' page actually show something as 'down'
maybe once over a few years. AFAIK it's rarely if ever accurate or updated.

Oh well.
37) Message boards : Number crunching : 3.43 is causing pop-ups (Message 74409)
Posted 14 Nov 2012 by amgthis
Post:
It should be safe now with the new app update. If you have any 3.43 jobs cached or running, please cancel them and update your client to get the new app update.



confirmed working fine here with win7 x64 - Rosetta 3.45
Boinc 7.0.28 x64

Thanks again, that was fast.

/Mike
38) Message boards : Number crunching : 3.43 is causing pop-ups (Message 74400)
Posted 14 Nov 2012 by amgthis
Post:
It should be safe now with the new app update. If you have any 3.43 jobs cached or running, please cancel them and update your client to get the new app update.


Thanks David! I was checking my debian boxes for this behavior but none
of them were up to the 3.43 release yet....

Thanks.

/Mike
39) Message boards : Number crunching : 3.43 is causing pop-ups (Message 74396)
Posted 14 Nov 2012 by amgthis
Post:
I have Windows 7 x64 on an Intel quad-core CPU.
2 work units for 3.41 show normal progress, etc.
The 3.43 w/u's show zero progress on the progress bar in the
manager (7.0.28 x64). One has been 'running' for 3:47 hrs.,
the other for 1:52. Both show zero progress, and the CPU
status looks like only 2 cores are running (the 3.41 w/u's).

The 3.43 w/u's in progress don't seem to be doing anything. This
box only has 4 gigs. of ram and normally 4 cores running Rosetta will
tie up around 2.8-3.0 gigs. Right now with the 2 cores working I'm
showing a little over 2 gigs working. Maybe if all 4 were really calculating
I'd be maxed out on ram and starting to swap?

I also have this log entry if it's of any help:

11/14/2012 9:37:31 AM | rosetta@home | Restarting task rb_11_13_34696_65379_t000__sufucter_IGNORE_THE_REST_07_12_64159_3_0 using minirosetta version 341 in slot 3
11/14/2012 9:37:31 AM | rosetta@home | Restarting task Ploop5_3_3_1_abinitio_design_y027_008_63498_924_0 using minirosetta version 343 in slot 2
11/14/2012 9:38:17 AM | | Using proxy info from GUI
11/14/2012 9:38:17 AM | | Not using a proxy
11/14/2012 9:39:03 AM | rosetta@home | Task Ploop5_3_3_1_abinitio_design_y027_008_63498_924_0 exited with zero status but no 'finished' file
11/14/2012 9:39:03 AM | rosetta@home | If this happens repeatedly you may need to reset the project.
11/14/2012 9:39:03 AM | rosetta@home | Restarting task Ploop5_3_3_1_abinitio_design_y027_008_63498_924_0 using minirosetta version 343 in slot 2
11/14/2012 9:41:34 AM | rosetta@home | Task Ploop5_3_3_1_abinitio_design_y027_008_63498_924_0 exited with zero status but no 'finished' file
11/14/2012 9:41:34 AM | rosetta@home | If this happens repeatedly you may need to reset the project.
11/14/2012 9:41:34 AM | rosetta@home | Restarting task Ploop5_3_3_1_abinitio_design_y027_008_63498_924_0 using minirosetta version 343 in slot 2
11/14/2012 11:37:14 AM | rosetta@home | Computation for task Rossmann2x2_abinitio_SAVE_ALL_OUT_design_r998_010_64153_1017_0 finished
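
To see at a glance which tasks keep hitting the "no 'finished' file" restart, a quick Python sketch along these lines could count them per task. It assumes the "timestamp | project | message" layout shown in the lines above; the function name `count_unfinished_exits` is made up for illustration.

```python
from collections import Counter

# Message suffix as it appears in the log lines above.
MARKER = "exited with zero status but no 'finished' file"


def count_unfinished_exits(lines):
    """Count, per task name, how often the task exited without a 'finished' file."""
    counts = Counter()
    for line in lines:
        # Manager log lines look like: "<timestamp> | <project> | <message>"
        parts = [p.strip() for p in line.split("|", 2)]
        if len(parts) != 3:
            continue
        message = parts[2]
        if message.startswith("Task ") and message.endswith(MARKER):
            task = message[len("Task "):-len(MARKER)].strip()
            counts[task] += 1
    return counts
```

A task that shows up more than once here is the kind the client message means when it says a project reset may be needed if this happens repeatedly.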


hope this helps...........

/Mike
40) Questions and Answers : Windows : Minirosetta popping up windows (Message 74365)
Posted 14 Nov 2012 by amgthis
Post:


This is SOOOOOOOOOOOO annoying!!!!!

help!!!!





©2024 University of Washington
https://www.bakerlab.org