Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
| Author | Message | 
|---|---|
| Bill Swisher Joined: 10 Jun 13 Posts: 81 Credit: 61,780,497 RAC: 16,820 | 
 "Mine all run 8 hours." I grabbed the results from 6 tasks, the first 6 I found, not picked by beta/whatever, and they averaged 3h 33m. I do have some older slower computers that weren't included. Perhaps one of the wizards around here can explain to me what the Project Resource share is (they're all set to 100) and how to set it for each project (without going in and manually editing the xml files). From my experience WCG will overrun BOINC if given the opportunity. | 
| Ivailo Bonev Joined: 9 May 07 Posts: 16 Credit: 6,196,220 RAC: 0 | 
 "Second biggest issue seems to be the incorrect setting of default runtimes to 4hrs on tasks running through Rosetta Beta 6.06 - instead of 8hrs" The new RosettaVS tasks have been running for 8 hours; the others are still running for 4 hours. | 
| Bill Swisher Joined: 10 Jun 13 Posts: 81 Credit: 61,780,497 RAC: 16,820 | 
 "The important part is the scientific results" Doubly agree! | 
| Grant (SSSF) Joined: 28 Mar 20 Posts: 1895 Credit: 18,534,891 RAC: 0 | 
 "Second biggest issue seems to be the incorrect setting of default runtimes to 4hrs on tasks running through Rosetta Beta 6.06 - instead of 8hrs" "Strange: Mine all run 8 hours. I have seven of those running right now on Linux." If you set a Target CPU Time then they will run for that length of time. If you leave it blank, i.e. the default Target CPU time, then they will run for as long as the project determines they need to run. For Rosetta 4.20 Tasks the default is 8 hours. For the Beta Tasks the default is usually 3 hours; sometimes it can be 8 hours. Grant Darwin NT | 
| Grant (SSSF) Joined: 28 Mar 20 Posts: 1895 Credit: 18,534,891 RAC: 0 | 
 "Mine all run 8 hours." "I grabbed the results from 6 tasks, the first 6 I found, not picked by beta/whatever, and they averaged 3h 33m. I do have some older slower computers that weren't included. Perhaps one of the wizards around here can explain to me what the Project Resource share is (they're all set to 100) and how to set it for each project (without going in and manually editing the xml files). From my experience WCG will overrun BOINC if given the opportunity." You have set your Target CPU run time to 4 hours, so that is how long they will run for, unless they would need to run longer than that time in order to complete, in which case they will end before the 4 hours. The Resource share setting is a ratio, not a percentage (the BOINC Manager shows in brackets what it is as a percentage next to the resource share value). You set the value in the preferences for the particular project whose allocation of computing resources you wish to increase or decrease. What is being shared is not computing time. E.g. take 2 projects with equal Resource share values, one with an extremely efficient application, the other with an extremely inefficient application. The efficient application project may only run a Task every few days, while the other may run dozens of Tasks every single day. The amount of time is obviously very different, however the amount of computing actually done is the same (as per the Resource share settings). It is not based on the Credit (RAC) awarded by the projects. If a project is down for a while, then all the other projects will get that computing resource (time). When the down project comes back up, it will get the lion's share of computing resources (time) until your Resource share settings are being met, at which point BOINC will reduce the work on that project and increase the work for the other projects again to maintain that. 
The longer it will take for your Resource share settings to be honoured (we are talking months here) the larger your cache, the more projects you run, the more you micro manage BOINC, and the less time BOINC has to process work (i.e. system on for limited hours, BOINC limited in when it can do work during those hours, a limit on the number of cores/threads it can use, a limit on "Use at most % of CPU time", or "Suspend when non-BOINC CPU usage is above" set to less than 100%). If you run more than one project, then no cache is best (0.1 days and 0.01 additional days), and your Resource share settings should be honoured within a few days to a week (of course, if one or more projects are having issues with sending or receiving work, then that will cause things to fluctuate). Grant Darwin NT | 
| Sid Celery Joined: 11 Feb 08 Posts: 2475 Credit: 46,506,558 RAC: 3,357 | 
 "Second biggest issue seems to be the incorrect setting of default runtimes to 4hrs on tasks running through Rosetta Beta 6.06 - instead of 8hrs" Admittedly this issue is one I report second-hand; it doesn't happen to me as I personally force 12hr runtimes. As you can see from the comments of others, there are "issues" with runtimes internal to some tasks, even though the default in BOINC begins at 8hrs for all tasks, so it messes up BOINC's scheduling as well. Good to read that the new RosettaVS tasks run correctly to 8hrs, but it should be the default for all tasks. | 
| Sid Celery Joined: 11 Feb 08 Posts: 2475 Credit: 46,506,558 RAC: 3,357 | 
 "Third biggest issue is the daily cleanup job that awards credit to tasks with Validation failures (without Compute errors). This job hasn't run for a year or more" It also just happened to one task on my other home PC. I was under the impression it was very rare, so only a small issue, but now that I check, it happens more often than I thought. Also, the latest batch of work seems to be reporting a disappointingly high rate of errors again. | 
| Sid Celery Joined: 11 Feb 08 Posts: 2475 Credit: 46,506,558 RAC: 3,357 | 
 "Perhaps one of the wizards around here can explain to me what the Project Resource share is, they're all set to 100, and how to set it for each project (without going in and manually editing the xml files). From my experience WCG will overrun boinc if given the opportunity." It allows you to set a weighting for each project. For Rosetta, click Your Account in BOINC Manager, then choose Rosetta@home preferences and edit it as desired. Fwiw I weight Rosetta 2900, WCG 100, SiDock 100 to get a majority Rosetta processed while there's work. (Note: the figure chosen isn't a percentage.) | 
| Bill Swisher Joined: 10 Jun 13 Posts: 81 Credit: 61,780,497 RAC: 16,820 | 
 Ahhh...thanks, and you too Grant. I now have it set as DENIS@home 200 (no work there anyhow), Rosetta@home 500, WCG 100, and Einstein@Home 100. But I limit Einstein to 1 thread via its app_config.xml file (on a couple of machines; the rest have stopped downloads completely). I'll watch things for a week or so. | 
| Sid Celery Joined: 11 Feb 08 Posts: 2475 Credit: 46,506,558 RAC: 3,357 | 
 "Third biggest issue is the daily cleanup job that awards credit to tasks with Validation failures (without Compute errors). This job hasn't run for a year or more" This validation error has now risen to 5 of my last 10 completed tasks, plus several computation errors, but just on the one PC. I now suspect a heat-related issue that I need to investigate further. | 
| Sid Celery Joined: 11 Feb 08 Posts: 2475 Credit: 46,506,558 RAC: 3,357 | 
 "Also, the latest batch of work seems to be reporting a disappointingly high rate of errors again." I've only just got around to checking what's going on, after noticing that all my running tasks are of the type `9a_ce1*`. All RosettaVS tasks are crashing within a few minutes with the following error message: `<core_client_version>8.0.2</core_client_version> <![CDATA[ <message> Incorrect function. (0x1) - exit code 1 (0x1)</message> <stderr_txt> command: projects/boinc.bakerlab.org_rosetta/rosetta_beta_6.06_windows_x86_64.exe @M2_substructure_real_fulldb_zb230H_6_1204.flags -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 Using database: database_f5ae1de8e1database Starting watchdog... Watchdog active. ERROR: Error in core::scoring::ScoringManager::get_rama_prepro_mainchain_torsion_potential(). Could not load RamaPrePro scoring table for GLN from file all.ramaProb. ERROR:: Exit from: src/core/scoring/ScoringManager.cc line: 1520 BOINC:: Error reading and gzipping output datafile: default.out 19:15:46 (8400): called boinc_finish(1) </stderr_txt> ]]>` That's only on my Ryzen 5800X PC, not on my i5 9600K PC. Is anyone else seeing this? I haven't seen it reported yet. Very odd. | 
| Jean-David Beyer Joined: 2 Nov 05 Posts: 221 Credit: 7,572,744 RAC: 0 | 
 It is Wednesday again and about half of the servers are down, as usual. Let's see if they come up later today, as happened at least once in my memory, or if we must wait until late Friday, as is more usual. | 
| Sid Celery Joined: 11 Feb 08 Posts: 2475 Credit: 46,506,558 RAC: 3,357 | 
 "It is Wednesday again and about half of the servers are down, as usual." In previous weeks the server has revived quite early on Friday, but this week it hasn't come back yet. Fingers crossed that it comes back soon. | 
| Sid Celery Joined: 11 Feb 08 Posts: 2475 Credit: 46,506,558 RAC: 3,357 | 
 "Also, the latest batch of work seems to be reporting a disappointingly high rate of errors again." Perhaps not so odd in the end. I was getting significant hard drive failure warning messages last week (luckily not the boot drive). A replacement HDD data drive to clone onto is on its way. All my user files going back to 2003 are on it - yes, 22 years' worth! | 
| Jean-David Beyer Joined: 2 Nov 05 Posts: 221 Credit: 7,572,744 RAC: 0 | 
 "In previous weeks the server has revived quite early on Friday, but this week it hasn't come back yet." It is now almost noon New York City time, and the servers are not yet up. | 
| [VENETO] boboviz Joined: 1 Dec 05 Posts: 2124 Credit: 12,428,047 RAC: 2,329 | 
 | 
| Grant (SSSF) Joined: 28 Mar 20 Posts: 1895 Credit: 18,534,891 RAC: 0 | 
 boinc-process still dead, Validation backlog making its way towards 1 million. Grant Darwin NT | 
| Sid Celery Joined: 11 Feb 08 Posts: 2475 Credit: 46,506,558 RAC: 3,357 | 
 "boinc-process still dead, Validation backlog making its way towards 1 million." It's currently 4:30pm on Friday afternoon, their time. It's not looking good ahead of the weekend. We've been seduced by the last 6 weeks into assuming there's a reliable pattern in the server being revived, but I can't say I've ever been really confident, which is why I've always pointed out when it's come back in previous weeks <sigh>. At least there's still task availability, although how much is slightly in doubt as the front page has been frozen for a while too... | 
| Sid Celery Joined: 11 Feb 08 Posts: 2475 Credit: 46,506,558 RAC: 3,357 | 
 "boinc-process still dead, Validation backlog making its way towards 1 million." No sooner do I write that than everything's showing green on the Server page. | 
| Jean-David Beyer Joined: 2 Nov 05 Posts: 221 Credit: 7,572,744 RAC: 0 | 
 "It's currently 4:30pm on Friday afternoon, their time." From now on, would you please make that post on Wednesday afternoon. | 
         ©2025 University of Washington 
https://www.bakerlab.org