Posts by JAMES DORISIO

1) Message boards : Number crunching : Rosetta Python floods disks with snapshots (Message 104280)
Posted 16 Jan 2022 by Profile JAMES DORISIO
Post:
I have had the checkpoint interval set to 3600 seconds. Below is the stderr output from a Python task:

2022-01-16 01:57:49 (9405): Setting checkpoint interval to 3600 seconds. (Higher value of (Preference: 3600 seconds) or (Vbox_job.xml: 600 seconds))

2022-01-16 02:58:03 (9405): Creating new snapshot for VM.
2022-01-16 02:58:12 (9405): Checkpoint completed.
2022-01-16 03:36:54 (9405): Status Report: Elapsed Time: '6000.749009'
2022-01-16 03:36:54 (9405): Status Report: CPU Time: '5950.310000'
2022-01-16 03:58:29 (9405): Creating new snapshot for VM.
2022-01-16 03:58:38 (9405): Deleting stale snapshot.
2022-01-16 03:58:38 (9405): Checkpoint completed.
2022-01-16 04:58:55 (9405): Creating new snapshot for VM.
2022-01-16 04:59:04 (9405): Deleting stale snapshot.
2022-01-16 04:59:04 (9405): Checkpoint completed.
2022-01-16 05:15:58 (9405): Status Report: Elapsed Time: '12001.543604'
2022-01-16 05:15:58 (9405): Status Report: CPU Time: '11930.110000'

It looks like it is using the higher value, 3600 seconds (1 hour).
I am going to try changing this to 7200 seconds to see what happens, as these computers are on 24 hours a day and rarely reboot.
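
For anyone who wants to check the same thing on their own machine, here is a rough sketch that measures the gap between snapshots from stderr lines like the ones above. The function names are made up for illustration and are not part of BOINC or vboxwrapper; the "higher value wins" rule is taken only from the wording of the log line, and the timestamp format is assumed to match the sample.

```python
from datetime import datetime

# Sample lines copied from the stderr output quoted in this post.
LOG = """\
2022-01-16 01:57:49 (9405): Setting checkpoint interval to 3600 seconds. (Higher value of (Preference: 3600 seconds) or (Vbox_job.xml: 600 seconds))
2022-01-16 02:58:03 (9405): Creating new snapshot for VM.
2022-01-16 02:58:12 (9405): Checkpoint completed.
2022-01-16 03:58:29 (9405): Creating new snapshot for VM.
2022-01-16 03:58:38 (9405): Checkpoint completed.
2022-01-16 04:58:55 (9405): Creating new snapshot for VM.
2022-01-16 04:59:04 (9405): Checkpoint completed.
"""


def effective_interval(preference_s, job_s):
    """Hypothetical helper: the log line says the higher value is used."""
    return max(preference_s, job_s)


def snapshot_times(log_text):
    """Return the timestamp of every 'Creating new snapshot' line."""
    times = []
    for line in log_text.splitlines():
        if "Creating new snapshot for VM" in line:
            # Assumes the first 19 characters are 'YYYY-MM-DD HH:MM:SS'.
            times.append(datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S"))
    return times


def snapshot_gaps(log_text):
    """Return the seconds between consecutive snapshots."""
    times = snapshot_times(log_text)
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    print(effective_interval(3600, 600))
    print(snapshot_gaps(LOG))
```

On the sample lines the gaps come out a little over the configured 3600 seconds, which fits the interval being a minimum between checkpoints rather than an exact timer.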

Jim
2) Message boards : Number crunching : Minirosetta 3.46 (Message 75510)
Posted 27 Apr 2013 by Profile JAMES DORISIO
Post:
The applications page still shows Windows/x86 as 3.45, and I am getting Rosetta Mini 3.45 on Windows XP x86 32-bit computers.
Are you planning to upgrade this version also?
Thanks Jim.


3) Message boards : Number crunching : Client errors (Message 75269)
Posted 21 Mar 2013 by Profile JAMES DORISIO
Post:
This problem appears to be fixed for me. I have 5 linux computers running under my name, and as of 3-20-13 they all started returning successful tasks; before this they were all client errors. I have made no changes to them, not even a reboot. I don't see any posts from Rosetta admins saying that they changed anything, but something has changed.
I would suggest that anybody with this problem enable new work to see if this problem is really fixed.
Thanks Jim
4) Message boards : Number crunching : Client errors (Message 75171)
Posted 26 Feb 2013 by Profile JAMES DORISIO
Post:
Thanks for the information!

Unfortunately I can't see anything in the logs that might suggest what is causing the issue. I'm planning to update the server scheduler to use the version that Ralph uses. Hopefully this will fix things as users suggest the Ralph server is okay and does not have this issue.

thanks again everyone.


Any idea about the possibility of upgrading to the server scheduler version that Ralph uses? We seem to have confirmed that it solves this problem, at least for my 2 computers that have this problem. They have both completed successful tasks on Ralph but continue to have errors on Rosetta when I try them here.

Thanks Jim
5) Message boards : Number crunching : Client errors (Message 75064)
Posted 8 Feb 2013 by Profile JAMES DORISIO
Post:
Successful tasks completed on ralph@home. I managed to pick up some tasks on ralph.

Computer ralph
http://ralph.bakerlab.org/show_host_detail.php?hostid=29722

Tasks for computer ralph (all success)
http://ralph.bakerlab.org/results.php?hostid=29722

Computer rosetta
http://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=1579123

Tasks for computer rosetta (all client error)
http://boinc.bakerlab.org/rosetta/results.php?hostid=1579123

Intel I7-3770, Ubuntu linux 12.04 amd64, nvidia driver 310.14. Boinc 7.0.27

There were no changes to this computer, same exact setup, it actually ran some ralph and rosetta tasks at the same time.

To David
I hope this confirms that ralph does not have this issue. If you have any questions please post them or PM me.
Thanks Jim
6) Message boards : Number crunching : Client errors (Message 75052)
Posted 6 Feb 2013 by Profile JAMES DORISIO
Post:
This computer also has this problem & is not Ivy Bridge.
Intel(R) Pentium(R) 4 CPU 3.00GHz Nvidia gts450.
Ubuntu linux 12.04 amd64, nvidia driver 310.14, Boinc 7.0.27
It was ok until upgrading from Ubuntu 10.04 & new drivers & boinc that came with it.

http://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=1485068

I have been checking Ralph but it never shows tasks available. I will try to set up a computer there anyway as soon as I get a chance.

Jim
7) Message boards : Number crunching : Client errors (Message 75013)
Posted 29 Jan 2013 by Profile JAMES DORISIO
Post:
David
Please let us know when this is done. I would like to bring some computers back here if this works.
Thanks Jim
8) Message boards : Number crunching : Client errors (Message 75002)
Posted 28 Jan 2013 by Profile JAMES DORISIO
Post:
Log number 2 for this computer

Intel I7-3770, Ubuntu linux 12.04 amd64, nvidia driver 310.14, Boinc 7.0.27
http://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=1579123

task outcome client error, state invalid
http://boinc.bakerlab.org/rosetta/result.php?resultid=558798870

Boinc log, minus the checkpointing messages.

Sun 27 Jan 2013 12:33:42 PM EST | rosetta@home | [task] ACTIVE_TASK::start(): forked process: pid 10428
Sun 27 Jan 2013 12:33:42 PM EST | rosetta@home | [task] task_state=EXECUTING for P2_1_s2_f5_abinitio_design_y022_001_72082_825_0 from start
Sun 27 Jan 2013 12:33:42 PM EST | rosetta@home | Starting task P2_1_s2_f5_abinitio_design_y022_001_72082_825_0 using minirosetta version 345 in slot 7
Sun 27 Jan 2013 07:33:06 PM EST | rosetta@home | [task] Process for P2_1_s2_f5_abinitio_design_y022_001_72082_825_0 exited, status 0, task state 1
Sun 27 Jan 2013 07:33:06 PM EST | rosetta@home | [task] process exited with status 0
Sun 27 Jan 2013 07:33:06 PM EST | rosetta@home | [task] task_state=EXITED for P2_1_s2_f5_abinitio_design_y022_001_72082_825_0 from handle_exited_app
Sun 27 Jan 2013 07:33:06 PM EST | rosetta@home | Computation for task P2_1_s2_f5_abinitio_design_y022_001_72082_825_0 finished
Sun 27 Jan 2013 07:33:06 PM EST | rosetta@home | [task] result state=FILES_UPLOADING for P2_1_s2_f5_abinitio_design_y022_001_72082_825_0 from CS::app_finished
Sun 27 Jan 2013 07:33:11 PM EST | rosetta@home | [task] result state=FILES_UPLOADED for P2_1_s2_f5_abinitio_design_y022_001_72082_825_0 from CS::update_results
Sun 27 Jan 2013 07:33:11 PM EST | rosetta@home | Sending scheduler request: To report completed tasks.
Sun 27 Jan 2013 07:33:11 PM EST | rosetta@home | Reporting 1 completed tasks, not requesting new tasks
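
As a rough sketch, the wall-clock time between the "Starting task" and "Computation for task ... finished" lines in a log like the one above can be worked out as follows. The function names are hypothetical, the " | " field layout is assumed from the sample lines only, and the month and AM/PM parsing assumes an English locale. The result is wall-clock time, so it includes any time the task spent suspended.

```python
from datetime import datetime

# Three lines copied from the Boinc log quoted in this post.
LOG = """\
Sun 27 Jan 2013 12:33:42 PM EST | rosetta@home | Starting task P2_1_s2_f5_abinitio_design_y022_001_72082_825_0 using minirosetta version 345 in slot 7
Sun 27 Jan 2013 07:33:06 PM EST | rosetta@home | [task] process exited with status 0
Sun 27 Jan 2013 07:33:06 PM EST | rosetta@home | Computation for task P2_1_s2_f5_abinitio_design_y022_001_72082_825_0 finished
"""


def parse_stamp(field):
    """Parse 'Sun 27 Jan 2013 12:33:42 PM EST', ignoring the zone name."""
    stamp = field.strip().rsplit(" ", 1)[0]  # drop the trailing 'EST'
    return datetime.strptime(stamp, "%a %d %b %Y %I:%M:%S %p")


def task_run_times(log_text):
    """Map each task name to seconds between its start and finish lines."""
    started = {}
    elapsed = {}
    for line in log_text.splitlines():
        parts = line.split(" | ", 2)
        if len(parts) != 3:
            continue  # not a 'time | project | message' line
        when, message = parse_stamp(parts[0]), parts[2]
        words = message.split()
        if message.startswith("Starting task "):
            started[words[2]] = when
        elif message.startswith("Computation for task ") and message.endswith(" finished"):
            name = words[3]
            if name in started:
                elapsed[name] = int((when - started[name]).total_seconds())
    return elapsed


if __name__ == "__main__":
    for name, seconds in task_run_times(LOG).items():
        print(name, seconds, "seconds")
```

For the task in this log that works out to 25164 seconds, or about 6 hours 59 minutes.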

Jim
9) Message boards : Number crunching : Client errors (Message 74997)
Posted 27 Jan 2013 by Profile JAMES DORISIO
Post:
Intel I7-3770, Ubuntu linux 12.04 amd64, nvidia driver 310.14, Boinc 7.0.27

task outcome client error, state invalid
http://boinc.bakerlab.org/rosetta/result.php?resultid=558662180

Boinc log, minus the checkpointing messages.

Sat 26 Jan 2013 09:23:40 PM EST | rosetta@home | [task] ACTIVE_TASK::start(): forked process: pid 9248
Sat 26 Jan 2013 09:23:40 PM EST | rosetta@home | [task] task_state=EXECUTING for rb_01_26_36263_68951__t000__1_C1_SAVE_ALL_OUT_IGNORE_THE_REST_72986_286_0 from start
Sat 26 Jan 2013 09:23:40 PM EST | rosetta@home | Starting task rb_01_26_36263_68951__t000__1_C1_SAVE_ALL_OUT_IGNORE_THE_REST_72986_286_0 using minirosetta version 345 in slot 2
Sun 27 Jan 2013 04:20:49 AM EST | rosetta@home | [task] Process for rb_01_26_36263_68951__t000__1_C1_SAVE_ALL_OUT_IGNORE_THE_REST_72986_286_0 exited, status 0, task state 1
Sun 27 Jan 2013 04:20:49 AM EST | rosetta@home | [task] process exited with status 0
Sun 27 Jan 2013 04:20:49 AM EST | rosetta@home | [task] task_state=EXITED for rb_01_26_36263_68951__t000__1_C1_SAVE_ALL_OUT_IGNORE_THE_REST_72986_286_0 from handle_exited_app
Sun 27 Jan 2013 04:20:49 AM EST | rosetta@home | Computation for task rb_01_26_36263_68951__t000__1_C1_SAVE_ALL_OUT_IGNORE_THE_REST_72986_286_0 finished
Sun 27 Jan 2013 04:20:49 AM EST | rosetta@home | [task] result state=FILES_UPLOADING for rb_01_26_36263_68951__t000__1_C1_SAVE_ALL_OUT_IGNORE_THE_REST_72986_286_0 from CS::app_finished
Sun 27 Jan 2013 04:20:57 AM EST | rosetta@home | [task] result state=FILES_UPLOADED for rb_01_26_36263_68951__t000__1_C1_SAVE_ALL_OUT_IGNORE_THE_REST_72986_286_0 from CS::update_results
Sun 27 Jan 2013 04:20:57 AM EST | rosetta@home | Sending scheduler request: To report completed tasks.
Sun 27 Jan 2013 04:20:57 AM EST | rosetta@home | Reporting 1 completed tasks, not requesting new tasks

Jim



10) Message boards : Number crunching : Client errors (Message 74976)
Posted 25 Jan 2013 by Profile JAMES DORISIO
Post:
This computer also seems to be affected by this problem. Intel I7-3770, Ubuntu linux 12.04 amd64, nvidia driver 310.14, Boinc 7.0.27, all downloaded from the Ubuntu repository. I have the run time preference set at 6 hours.

http://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=1579123

This machine was built in Nov of 2012. When built, using only the Intel built-in graphics, it completed all work units as valid. After installing a GTX 650 Ti graphics card and nvidia driver 310.14, all work units show client error. I have tried not using the GPU for Boinc and there was no difference: all client errors. I did not uninstall the nvidia driver, just set no new work from the GPU projects.

Hope this helps
Jim
11) Message boards : Number crunching : Client errors (Message 74970)
Posted 24 Jan 2013 by Profile JAMES DORISIO
Post:
I just set this computer to allow new tasks, run time preference 6 hours. I believe from reading other threads that it has this problem and it is the nvidia driver that causes the problem.

http://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=1485068

This computer was running Ubuntu 10.04 amd64, nvidia driver 304.**, Boinc 6.10.17 and was ok. I upgraded it to Ubuntu linux 12.04 amd64, nvidia driver 310.14, Boinc 7.0.27. Since then all work units complete ok but show client error. Before the change all work units were ok and valid, but that was a month ago and they have been removed from the history.

It was and still is running Gpu work from GPUgrid and WCG on a GTS450. Just an upgrade to Ubuntu 12.04 along with the new versions of Boinc and nvidia drivers that came with it. It is successfully completing WCG human proteome folding phase 2 work units from WCG which uses Rosetta software.

I will let this machine run for a few days and add another one with this problem to see if this can help.

Thanks Jim

12) Message boards : Number crunching : Current issues with 7+ boinc client (Message 74779)
Posted 24 Dec 2012 by Profile JAMES DORISIO
Post:
I just upgraded another computer to Ubuntu linux 12.04 amd64, nvidia driver 310.14, Boinc 7.0.27, all downloaded from the Ubuntu repository. I have the run time preference set at 12 hours.

http://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=1485068

This computer was running Ubuntu 10.04 amd64 nvidia driver 304.** Boinc 6.10.17.
All hardware remained the same. It was and still is running GPU work from GPUgrid and WCG on a GTS450. This was just an upgrade to Ubuntu 12.04 along with the new versions of Boinc and nvidia drivers that came with it. Before the upgrade it ran with no errors; after the upgrade it has produced 3 errors out of 3 work units. I have stopped new tasks from Rosetta@home for now. It is successfully completing WCG human proteome folding phase 2 work units, which use Rosetta software.

The below computer also is affected by this bug. See message 74598

http://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=1579123

I have 3 more computers to upgrade but it looks like they will not be able to run here if I do. For now I will hold off. It would be nice if someone from the Rosetta staff could post here so we know they are looking into this.
Thanks Jim
13) Message boards : Number crunching : Current issues with 7+ boinc client (Message 74598)
Posted 27 Nov 2012 by Profile JAMES DORISIO
Post:
This new computer also seems to be affected by this problem. Intel I7-3770, Ubuntu linux 12.04 amd64, nvidia driver 310.14, Boinc 7.0.27, all downloaded from the Ubuntu repository. I have the run time preference set at 12 hours.

http://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=1579123

The 1st weekend I tested it with Rosetta@home and WCG it successfully completed 5 work units with no errors.

The 2nd weekend, with the same setup, it successfully completed 10 of 12 work units with 2 validate errors. Then I installed nvidia drivers to test it on GPU projects, in this case Einstein@home. Since then it has returned all client errors, even after setting no new work from Einstein, finishing all the GPU work units, and rebooting it. Then, running only Rosetta & WCG, it received 5 new work units, which all ended with client errors.

I can see no difference in the log files between the successful work units and the client errors. Interestingly, it has successfully completed all WCG human proteome folding phase 2 work units, which use Rosetta software as per the web site. Quote: "Human Proteome Folding Phase 2 (HPF2) continues where the first phase left off. The two main objectives of the project are to: 1) obtain higher resolution structures for specific human proteins and pathogen proteins and 2) further explore the limits of protein structure prediction by further developing Rosetta software structure prediction."

I post this information hoping it will help the Rosetta staff fix this problem, but I am willing to use this computer at WCG until this problem is fixed. I will continue to test a few work units per weekend along side gpu projects to see what happens if this is of any help.

Also one question for the Rosetta staff, are the work units that seem to complete normally but are marked client error of any scientific value to you? I have noticed that credit is eventually granted on the task id page although the server gives the work unit out again to someone else so it is kind of a waste of time.

Please ask if you have any questions. Thanks Jim
14) Message boards : Rosetta@home Science : Google to donate 1 billion core-hours of computational capacity (Message 70036)
Posted 14 Apr 2011 by Profile JAMES DORISIO
Post:
Google to donate 1 billion core-hours of computational capacity for researchers

News from Google research blog

Google Exacycle for Visiting Faculty website

The second link mentions Rosetta@Home and Boinc
The application deadline is 11:59 p.m. PST May 31, 2011





