Posts by Jesse Viviano

1) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 108110)
Posted 20 Feb 2023 by Jesse Viviano
Post:
I have a task that has been stuck waiting for validation at https://boinc.bakerlab.org/rosetta/result.php?resultid=1498325740. Please diagnose why my task never validated.
2) Message boards : Number crunching : Rosetta 4.0+ (Message 88584)
Posted 29 Mar 2018 by Jesse Viviano
Post:
Rosetta v4.07 crashed on work unit 886791424. The error message is below:
Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Out Of Memory (C++ Exception) (0xe06d7363) at address 0x756C3EF2
3) Message boards : Number crunching : Minirosetta 3.73-3.78 (Message 86721)
Posted 24 Jun 2017 by Jesse Viviano
Post:
I am getting some 197 (0x000000C5) EXIT_TIME_LIMIT_EXCEEDED errors. See tasks 923287511, 923287516, and 923287790 for examples. These happened after the website upgrade. I normally run my work units for 1 day, but it looks like I will have to cut that target time to avoid the time limit errors.
4) Message boards : Rosetta@home Science : New paper in Science Magazine argues that computational models are worsening (Message 80984)
Posted 6 Jan 2017 by Jesse Viviano
Post:
Please see http://arstechnica.com/science/2017/01/in-chemistry-computational-models-may-be-getting-worse/ and http://science.sciencemag.org/content/355/6320/49 for context. I do not know how computational models in chemistry work, but if the models used by Rosetta@home are affected by the phenomenon that is causing many modern chemical models to fail, I think they could be fixed.
5) Questions and Answers : Web site : Server status page lists the transitioner as down (Message 80873)
Posted 26 Nov 2016 by Jesse Viviano
Post:
The server status page at http://srv0.bakerlab.org/rah_status.php lists the transitioner as not running. However, the project still seems to be running fine, which would imply that the transitioner is probably working. Could an administrator please investigate the issue?
6) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 80854)
Posted 15 Nov 2016 by Jesse Viviano
Post:
I got a validate error at https://boinc.bakerlab.org/rosetta/result.php?resultid=886140787. I sincerely doubt that this was caused by the task being reported before its uploaded files were finally written to storage, given the time stamp of the report and the following BOINC client log:
11/15/2016 10:56:30 AM | rosetta@home | Computation for task rb_11_12_70285_113852__t000__3_C1_SAVE_ALL_OUT_IGNORE_THE_REST_449751_697_1 finished
11/15/2016 10:56:31 AM | rosetta@home | Started upload of rb_11_12_70285_113852__t000__3_C1_SAVE_ALL_OUT_IGNORE_THE_REST_449751_697_1_0
11/15/2016 10:56:37 AM | rosetta@home | Finished upload of rb_11_12_70285_113852__t000__3_C1_SAVE_ALL_OUT_IGNORE_THE_REST_449751_697_1_0
11/15/2016 10:56:48 AM | rosetta@home | Computation for task rb_11_12_70288_113830_ab_stage0_t000___robetta_cstwt_3.0_IGNORE_THE_REST_03_09_449736_8103_1 finished
11/15/2016 10:56:51 AM | rosetta@home | Computation for task rb_11_12_70288_113830_ab_stage0_t000___robetta_cstwt_3.0_IGNORE_THE_REST_03_09_449736_7725_1 finished
11/15/2016 10:56:52 AM | rosetta@home | Started upload of rb_11_12_70288_113830_ab_stage0_t000___robetta_cstwt_3.0_IGNORE_THE_REST_03_09_449736_8103_1_0
11/15/2016 10:56:53 AM | rosetta@home | Started upload of rb_11_12_70288_113830_ab_stage0_t000___robetta_cstwt_3.0_IGNORE_THE_REST_03_09_449736_7725_1_0
11/15/2016 10:56:56 AM | rosetta@home | Finished upload of rb_11_12_70288_113830_ab_stage0_t000___robetta_cstwt_3.0_IGNORE_THE_REST_03_09_449736_8103_1_0
11/15/2016 10:56:57 AM | rosetta@home | Finished upload of rb_11_12_70288_113830_ab_stage0_t000___robetta_cstwt_3.0_IGNORE_THE_REST_03_09_449736_7725_1_0
11/15/2016 11:00:01 AM | rosetta@home | Computation for task rb_11_12_70288_113830_ab_stage0_t000___robetta_cstwt_3.0_IGNORE_THE_REST_03_09_449736_7681_1 finished
11/15/2016 11:00:02 AM | rosetta@home | Started upload of rb_11_12_70288_113830_ab_stage0_t000___robetta_cstwt_3.0_IGNORE_THE_REST_03_09_449736_7681_1_0
11/15/2016 11:00:05 AM | rosetta@home | Finished upload of rb_11_12_70288_113830_ab_stage0_t000___robetta_cstwt_3.0_IGNORE_THE_REST_03_09_449736_7681_1_0
11/15/2016 11:03:00 AM | rosetta@home | Computation for task rb_11_12_70293_113866__t000__3_C1_SAVE_ALL_OUT_IGNORE_THE_REST_449742_965_1 finished
11/15/2016 11:03:01 AM | rosetta@home | Started upload of rb_11_12_70293_113866__t000__3_C1_SAVE_ALL_OUT_IGNORE_THE_REST_449742_965_1_0
11/15/2016 11:03:07 AM | rosetta@home | Finished upload of rb_11_12_70293_113866__t000__3_C1_SAVE_ALL_OUT_IGNORE_THE_REST_449742_965_1_0
11/15/2016 11:03:35 AM | rosetta@home | Computation for task rb_11_12_70288_113830_ab_stage0_t000___robetta_cstwt_3.0_IGNORE_THE_REST_03_09_449736_7647_1 finished
11/15/2016 11:03:36 AM | rosetta@home | Started upload of rb_11_12_70288_113830_ab_stage0_t000___robetta_cstwt_3.0_IGNORE_THE_REST_03_09_449736_7647_1_0
11/15/2016 11:03:40 AM | rosetta@home | Finished upload of rb_11_12_70288_113830_ab_stage0_t000___robetta_cstwt_3.0_IGNORE_THE_REST_03_09_449736_7647_1_0
11/15/2016 11:05:08 AM | rosetta@home | Computation for task rb_11_12_70288_113830_ab_stage0_t000___robetta_cstwt_3.0_IGNORE_THE_REST_03_09_449736_7638_1 finished
11/15/2016 11:05:09 AM | rosetta@home | Started upload of rb_11_12_70288_113830_ab_stage0_t000___robetta_cstwt_3.0_IGNORE_THE_REST_03_09_449736_7638_1_0
11/15/2016 11:05:16 AM | rosetta@home | Finished upload of rb_11_12_70288_113830_ab_stage0_t000___robetta_cstwt_3.0_IGNORE_THE_REST_03_09_449736_7638_1_0
11/15/2016 11:05:16 AM | rosetta@home | Computation for task rb_11_12_70293_113866__t000__3_C1_SAVE_ALL_OUT_IGNORE_THE_REST_449742_398_1 finished
11/15/2016 11:05:17 AM | rosetta@home | Started upload of rb_11_12_70293_113866__t000__3_C1_SAVE_ALL_OUT_IGNORE_THE_REST_449742_398_1_0
11/15/2016 11:05:23 AM | rosetta@home | Finished upload of rb_11_12_70293_113866__t000__3_C1_SAVE_ALL_OUT_IGNORE_THE_REST_449742_398_1_0
11/15/2016 11:13:50 AM | rosetta@home | Computation for task rb_11_12_70287_113849__t000__1_C1_SAVE_ALL_OUT_IGNORE_THE_REST_449753_103_1 finished
11/15/2016 11:13:51 AM | rosetta@home | Started upload of rb_11_12_70287_113849__t000__1_C1_SAVE_ALL_OUT_IGNORE_THE_REST_449753_103_1_0
11/15/2016 11:13:56 AM | rosetta@home | Finished upload of rb_11_12_70287_113849__t000__1_C1_SAVE_ALL_OUT_IGNORE_THE_REST_449753_103_1_0
11/15/2016 11:19:39 AM | rosetta@home | update requested by user
11/15/2016 11:19:40 AM | rosetta@home | Sending scheduler request: Requested by user.
11/15/2016 11:19:40 AM | rosetta@home | Reporting 9 completed tasks
11/15/2016 11:19:40 AM | rosetta@home | Not requesting tasks: don't need (CPU: not highest priority project; NVIDIA GPU: job cache full)
11/15/2016 11:19:41 AM | rosetta@home | Scheduler request completed

The file for the affected work unit was uploaded at 11/15/2016 11:05:16 AM UTC-5 (US Eastern Standard Time). The scheduler request was issued at 11/15/2016 11:19:40 AM UTC-5 and finished at 11/15/2016 11:19:41 AM UTC-5. There are at least 14 minutes between the upload and the report. Does the upload server need to have its file system checked?
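
As a check on that arithmetic, here is a minimal Python sketch that computes the gap from the two time stamps quoted above. The function name gap_minutes and the format string are my own choices for illustration; they are not part of BOINC.

```python
from datetime import datetime

# Format of the time stamps in the BOINC client log quoted above
# (local time, UTC-5). This format string is an assumption based on
# how those lines look, not a documented BOINC constant.
LOG_TIME_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def gap_minutes(upload_finished: str, report_sent: str) -> float:
    """Return the minutes elapsed between two event-log time stamps."""
    start = datetime.strptime(upload_finished, LOG_TIME_FORMAT)
    end = datetime.strptime(report_sent, LOG_TIME_FORMAT)
    return (end - start).total_seconds() / 60


# Upload finished vs. scheduler request sent, both copied from the log.
minutes = gap_minutes("11/15/2016 11:05:16 AM", "11/15/2016 11:19:40 AM")
print(f"{minutes:.1f} minutes between upload and report")
```

Both time stamps are in the same time zone, so no offset handling is needed for the difference.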
7) Message boards : Number crunching : Minirosetta 3.73-3.78 (Message 79490)
Posted 6 Feb 2016 by Jesse Viviano
Post:
I can confirm that work unit 717556481 crashes due to out of memory errors in both results. I suspect that work units 717556482 and 717556497 will also fail for the same reason once the other computers assigned to them return their results, because my computer had the same errors in these work units. Their names are of the form 02_2016_(one numeral and then three letters)_backrub_design_(six numerals)_(one or more numerals).
8) Message boards : Number crunching : Minirosetta 3.52 (Message 77895)
Posted 5 Feb 2015 by Jesse Viviano
Post:
Work unit 647152330 generated result files that were too big to upload when the work unit processing time limit is set to 24 hours. Please see my result log and the result log for someone who used a shorter work unit time limit.

I found the relevant BOINC event log entries by digging into the appropriate BOINC data directory. By default, this file is located at C:\ProgramData\BOINC\stdoutdae.old in Windows 7. The BOINC event log entries are listed below.
02-Feb-2015 13:14:03 [rosetta@home] Computation for task A__2_2015_01_29_B__2_2015_01_29_patchdock_split_02_150129_SAVE_ALL_OUT__242418_37_0 finished
02-Feb-2015 13:14:03 [rosetta@home] Output file A__2_2015_01_29_B__2_2015_01_29_patchdock_split_02_150129_SAVE_ALL_OUT__242418_37_0_0 for task A__2_2015_01_29_B__2_2015_01_29_patchdock_split_02_150129_SAVE_ALL_OUT__242418_37_0 exceeds size limit.
02-Feb-2015 13:14:03 [rosetta@home] File size: 65833683.000000 bytes.  Limit: 50000000.000000 bytes

Unless the file upload size limit is raised, I will therefore have to change my preferences to 12-hour work units once my current work units drain out to prevent this error.
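
For anyone who wants to estimate how much headroom a shorter run time gives, here is a rough Python sketch that parses the size-limit line from the log above. It ASSUMES that the output file grows linearly with run time, which I have not verified for Rosetta; the function name parse_size_limit is hypothetical.

```python
import re

# The size-limit line copied from the event log above.
LOG_LINE = "File size: 65833683.000000 bytes.  Limit: 50000000.000000 bytes"


def parse_size_limit(line: str) -> tuple[float, float]:
    """Extract (file_size, limit) in bytes from a size-limit log line."""
    match = re.search(r"File size: ([\d.]+) bytes\.\s+Limit: ([\d.]+) bytes", line)
    if match is None:
        raise ValueError("not a size-limit line")
    return float(match.group(1)), float(match.group(2))


size, limit = parse_size_limit(LOG_LINE)

# How far over the limit the 24-hour result was.
overshoot = size / limit

# ASSUMPTION: output size scales linearly with run time. Under that
# assumption, this is the longest run time whose output would still fit.
max_hours = 24 * limit / size

print(f"Result was {overshoot:.2f}x the limit")
print(f"Estimated longest safe run time: {max_hours:.1f} hours")
```

If the linear assumption roughly holds, a 12-hour target leaves a comfortable margin below the limit.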
9) Message boards : Number crunching : Minirosetta 3.52 (Message 77893)
Posted 4 Feb 2015 by Jesse Viviano
Post:
Work unit 647152330 generated result files that were too big to upload when the work unit processing time limit is set to 24 hours. Please see my result log and the result log for someone who used a shorter work unit time limit.
10) Message boards : Number crunching : Minirosetta 3.52 (Message 77781)
Posted 30 Dec 2014 by Jesse Viviano
Post:
I just got a new computation error. I recently switched from running work units for 1 day to running them for 2 days while result 707976354 was in progress, so I don't know if this is a problem caused by switching in the middle of the work unit or a bug in Minirosetta 3.52.


No, I've not seen changing the runtime preference cause any problem. Only issue with that might be if you made it shorter than an active task had already run.

[edit]...But now I see similar errors on other tasks that tried to run with the 48hr runtime preference, so I've sent a note to the Project Team to look in to that. Appears the tasks have an internal runtime limit that may need to be extended to match the 48hrs (plus the 4hr watchdog).

Until they get a chance to resolve it, I'd suggest going back to the 24hr runtime preference.

Thanks!

By the way, since you noticed that the 2-day limit apparently causes errors, you might want to have that option edited to tell users not to use it. Better yet, remove the option, move everyone on the 2-day limit to the 1-day limit, and announce why on the message board and in the news section.
11) Message boards : Number crunching : GPU Potential (Message 77778)
Posted 29 Dec 2014 by Jesse Viviano
Post:
Please do not forget the fma3 / fma4 capabilities of the AMD CPUs. Crunch3r's fma4 implementation @ Asteroids@Home is extremely efficient!


But:
The incompatibility between Intel's FMA3 and AMD's FMA4 is due to both companies changing plans without coordinating coding details with each other. AMD changed their plans from FMA3 to FMA4 while Intel changed their plans from FMA4 to FMA3 almost at the same time.

AMD's latest CPUs can support both FMA3 and FMA4.
12) Message boards : Number crunching : Minirosetta 3.52 (Message 77777)
Posted 29 Dec 2014 by Jesse Viviano
Post:
I just got a new computation error. I recently switched from running work units for 1 day to running them for 2 days while result 707976354 was in progress, so I don't know if this is a problem caused by switching in the middle of the work unit or a bug in Minirosetta 3.52.
13) Message boards : Number crunching : GPU Potential (Message 77720)
Posted 4 Dec 2014 by Jesse Viviano
Post:
From my standpoint, this project has potential with a GPU app. BOINC regularly polls this project trying to find tasks for my Nvidia, even though I don't see any GPU apps. From what I know, the way data is analyzed and processed, I think a GPU would be able to process tasks much faster than a CPU. There's plenty of RAM to hold the process, and the multiple cores would allow for more data to be sent through. How hard would it be to implement a GPU version of Rosetta@home?

The way Rosetta@home folds proteins is extremely serial: each step of creating a decoy, or a guess as to what a folded protein would be shaped like, feeds upon the previous step. The only point where one step does not feed upon the previous one is when a new decoy is started after the previous decoy is finished. There is not much to parallelize in this program, so a GPU would fail in this job.
14) Message boards : Number crunching : Bad work unit 584007561 (Message 76563)
Posted 29 Mar 2014 by Jesse Viviano
Post:
Work unit 584007561 could not be sent to any computers it was assigned to. I tried to do a project reset, and that did not get the work unit resent to my computer. After that, I detached and reattached to either force a resend or kill my dead result. Could someone investigate this work unit?
15) Message boards : Number crunching : Minirosetta 3.48 (Message 76500)
Posted 2 Mar 2014 by Jesse Viviano
Post:
Work unit 583729340 crashed on both computers it ran on including mine. It was running Minirosetta 3.48 on both computers.
16) Questions and Answers : Wish list : Idea for more accurate progress meter (Message 76393)
Posted 2 Feb 2014 by Jesse Viviano
Post:
Oops, I forgot that I posted this idea over three years ago.
17) Message boards : Number crunching : Obsolete server software (Message 76392)
Posted 31 Jan 2014 by Jesse Viviano
Post:
The problem that I believe is blocking the updating of this server's software is that it has been customized to allow it to display the reports and plots of results from active work units for a user. Since that feature keeps failing (which includes the time of this writing), I feel that the server could be upgraded with no real loss of volunteer functionality with a minimum of customization.

If there is customization for the BOINC project server functionality that would be lost on an upgrade, I can understand the reluctance to upgrade.
18) Questions and Answers : Wish list : Idea for more accurate progress meter (Message 76391)
Posted 31 Jan 2014 by Jesse Viviano
Post:
I have noticed that the progress meter becomes wildly inaccurate when Rosetta@home processes work units that shut down early because they have created enough decoys (typically 99), rather than creating too many decoys for the server. I believe that a more accurate progress meter would result if progress were defined as the maximum of the original CPU time-based progress and the number of decoys completed divided by the number of decoys allowed before an early shutdown is triggered.
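
Here is a minimal Python sketch of the idea, not actual Rosetta@home code. The function and parameter names are hypothetical, and the clamp to 1.0 is my own addition so that the meter never exceeds 100%.

```python
def combined_progress(
    cpu_seconds: float,
    target_cpu_seconds: float,
    decoys_done: int,
    decoy_limit: int,
) -> float:
    """Return progress in [0, 1] as the larger of two estimates.

    time_fraction  : the original CPU time-based progress
    decoy_fraction : decoys completed / decoys allowed before early shutdown
    """
    time_fraction = cpu_seconds / target_cpu_seconds
    decoy_fraction = decoys_done / decoy_limit
    # Clamp to 1.0 (my own addition) in case either estimate overshoots.
    return min(1.0, max(time_fraction, decoy_fraction))


# Fast work unit: 6 of 24 hours used, but 90 of 99 decoys already done.
# The time-based meter alone would show 25%.
fast = combined_progress(6 * 3600, 24 * 3600, 90, 99)

# Slow work unit: 12 of 24 hours used and only 10 of 99 decoys done.
# The time-based estimate dominates here.
slow = combined_progress(12 * 3600, 24 * 3600, 10, 99)

print(f"fast work unit: {fast:.0%}, slow work unit: {slow:.0%}")
```

Taking the maximum means the meter behaves exactly as it does now for work units that run the full target time, and only changes for those that hit the decoy limit first.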
19) Message boards : Number crunching : Rosetta@Home Version 3.22 (Message 72447)
Posted 4 Mar 2012 by Jesse Viviano
Post:
I have another bad work unit: CASP9_be_perfect_aln_hybrid_T0563_SAVE_ALL_OUT_IGNORE_THE_REST_42825_11. It failed for me and the previous person who tried to crunch this work unit.
20) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 72113)
Posted 13 Jan 2012 by Jesse Viviano
Post:
I was wondering if anyone has experienced a problem with the R@H screensaver?

When I initially installed BOINC I joined R@H, SETI@H and ClimatePrediction. All of which have interesting screen saver displays.

At some point the R@H screen saver stopped being displayed although the minirosetta task was running.

Once the server problems were resolved my R@H WU were being processed and the other two projects screen savers were running ok.

But I still am not seeing R@H on my computer.

I'm running Windows 7 Home Professional and have the latest version of BOINC installed.

JohnB

The Rosetta@home screensaver currently crashes upon launch. It is probably missing a file that it needs in order to launch without crashing.





©2025 University of Washington
https://www.bakerlab.org