Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
James W Joined: 25 Nov 12 Posts: 130 Credit: 1,766,254 RAC: 0 |
Today I started receiving the following message in BOINC Manager (v 7.16.5) Event Log, as well as in BOINC Notices. 5/1/2020 8:08:57 PM | Rosetta@home | This project is using an old URL. When convenient, remove the project, then add https://boinc.bakerlab.org/rosetta/ Is it really necessary to remove the project to change URL? Doing this will remove all my current and pending tasks and I'd have to reload from square-one. Correct? Another way to fix this issue? |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1684 Credit: 17,950,321 RAC: 23,118 |
"Today I started receiving the following message in BOINC Manager (v 7.16.5) Event Log, as well as in BOINC Notices." Set No New Tasks. "5/1/2020 8:08:57 PM | Rosetta@home | This project is using an old URL. When convenient, remove the project, then add https://boinc.bakerlab.org/rosetta/ Is it really necessary to remove the project to change URL? Doing this will remove all my current and pending tasks and I'd have to reload from square-one. Correct? Another way to fix this issue?" When all Tasks have been completed & returned, then Remove & re-attach to the project. When re-attaching to the project, select the "Existing user" option (or whatever it is actually called). Grant Darwin NT |
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,281,662 RAC: 1,150 |
"Today I started receiving the following message in BOINC Manager (v 7.16.5) Event Log, as well as in BOINC Notices. 5/1/2020 8:08:57 PM | Rosetta@home | This project is using an old URL. When convenient, remove the project, then add https://boinc.bakerlab.org/rosetta/" You can set No new tasks, wait for all current tasks to finish, return those, THEN follow the above instructions before turning off No new tasks. I've done this on other BOINC projects, and it caused no problems other than a few hours with no tasks running for the affected projects. It MIGHT be a good way to delete a few hundred megabytes of obsolete R@h files. |
Sid Celery Joined: 11 Feb 08 Posts: 2125 Credit: 41,249,734 RAC: 9,368 |
"Today I started receiving the following message in BOINC Manager (v 7.16.5) Event Log, as well as in BOINC Notices. Set No New Tasks. 5/1/2020 8:08:57 PM | Rosetta@home | This project is using an old URL. When convenient, remove the project, then add https://boinc.bakerlab.org/rosetta/ Is it really necessary to remove the project to change URL? Doing this will remove all my current and pending tasks and I'd have to reload from square-one. Correct? Another way to fix this issue?" So I've realised. Thanks. Of course, it's also possible to abort all non-running Rosetta tasks to make the process of running down the cache much quicker. I may do that so removing/re-attaching is done at my convenience and not in the middle of the night. |
GoldenHat Joined: 14 Apr 20 Posts: 3 Credit: 122,663 RAC: 0 |
Thanks, very helpful. Could you also explain how one cleans out the cache and old files etc? Thanks. |
MarkJ Joined: 28 Mar 20 Posts: 72 Credit: 25,238,680 RAC: 0 |
"Thanks, very helpful." That happens when you Reset or Detach (Remove on the BOINC Manager screen) the project. BOINC blog |
James W Joined: 25 Nov 12 Posts: 130 Credit: 1,766,254 RAC: 0 |
"Today I started receiving the following message in BOINC Manager (v 7.16.5) Event Log, as well as in BOINC Notices. 5/1/2020 8:08:57 PM | Rosetta@home | This project is using an old URL. When convenient, remove the project, then add https://boinc.bakerlab.org/rosetta/" As a followup, let me state how this process worked for me. Note that I use BOINCstatsBAM as my account manager. I marked the Rosetta project "No new tasks" in my host's BOINC Manager so I could complete the jobs in my cache before deleting the project and replacing it with the current URL. I later noticed a note had been added next to "no new tasks" in the Projects tab saying that when all tasks completed, the project would be deleted and ready for replacement (I've paraphrased the exact wording). Sure enough, after the last Rosetta task completed and the host next reported to the account manager, Rosetta was taken out of my project list. The next time my host reported to the account manager, Rosetta was reinstalled with the correct info and I was given a starter set of jobs for the cache. I was surprised! There was not much I had to do other than be sure the host synchronized with the account manager. Note that I had previously updated to BOINC Manager v7.16.5. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1684 Credit: 17,950,321 RAC: 23,118 |
Seems to be a problem with the web site/database when checking out Tasks. When I go to my Account, I can click on View next to Tasks to see all of my Tasks. But if I click on Valid or Error etc., all I get is "Already logged in" and the URL is https://boinc.bakerlab.org/rosetta/login_form.php?next_url=%2Fresults.php%3Fuserid%3D2125796%26offset%3D0%26show_names%3D0%26state%3D4%26appid%3D For some reason it's pulling up the login form. If on my account page I click on View for Computers on this account, then Tasks for each of the computers, I can then see the Valids, Errors etc. However, at the top right corner my name is replaced with "Sign Up" and next to it Log out is replaced with Login. Grant Darwin NT |
Boiler Paul Joined: 14 Apr 20 Posts: 4 Credit: 775,245 RAC: 0 |
"Seems to be a problem with the web site/database when checking out Tasks." Happening to me too. I logged out and logged back in: no change. I even cleared out cookies and rebooted: no change. |
James W Joined: 25 Nov 12 Posts: 130 Credit: 1,766,254 RAC: 0 |
Same situation with me since last night. All I can see is the first screen of "All tasks for James W." If I click on any option, such as going to the next screen or viewing valid tasks, I get the "already logged in" message that Grant mentioned. Apparently a web site issue. |
Toni Guerrero Joined: 1 Oct 08 Posts: 1 Credit: 163,278 RAC: 0 |
Hello everybody. I get computation errors (exit code 11) in all Junior_HalfRoid_design5_COVID-19 tasks I'm crunching on Android 5.0.2, BOINC 7.4.53, Rosetta v4.20 arm-android-linux-gnu, CPU ARMv7 Processor rev 0 (v7l). Previously, when running Rosetta v4.16, this same device was crunching those tasks with no issues. Is anyone else seeing this behaviour? Thank you. |
Admin Project administrator Joined: 1 Jul 05 Posts: 4805 Credit: 0 RAC: 0 |
Sorry about the web site issues. I made some updates that obviously caused a bug. I'll work on a fix. |
David E K Volunteer moderator Project administrator Project developer Project scientist Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
"Hello everybody." Those jobs appear to have some issues. People have reported that if the job restarts, it can cause an error. We'll look into this but since it's somewhat rare we are continuing these jobs. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1684 Credit: 17,950,321 RAC: 23,118 |
"Sorry about the web site issues. I made some updates that obviously caused a bug. I'll work on a fix." Working again. Thanks. Grant Darwin NT |
svincent Joined: 30 Dec 05 Posts: 219 Credit: 12,120,035 RAC: 0 |
This task failed on Ubuntu 19.10 https://boinc.bakerlab.org/rosetta/result.php?resultid=1172834155 <core_client_version>7.16.3</core_client_version> <![CDATA[ <message> process exited with code 1 (0x1, -255)</message> <stderr_txt> command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.20_x86_64-pc-linux-gnu @rb_05_07_20646_23758_ab_t000__robetta_FLAGS -in::file::fasta t000_.fasta -jumps:pairing_file t000_.fasta.bbcontacts.jumps -jumps:random_sheets 1 -constraints::cst_file t000_.fasta.CB.cst -constraints:cst_weight 5.0 -constraints::cst_fa_file t000_.fasta.MIN.cst -constraints:cst_fa_weight 5.0 -in:file:boinc_wu_zip rb_05_07_20646_23758_ab_t000__robetta.zip -frag3 rb_05_07_20646_23758_ab_t000__robetta.200.3mers.index.gz -fragA rb_05_07_20646_23758_ab_t000__robetta.200.11mers.index.gz -fragB rb_05_07_20646_23758_ab_t000__robetta.200.5mers.index.gz -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 -constant_seed -jran 2100616 Using database: database_357d5d93529_n_methyl/minirosetta_database [ ERROR ]: Caught exception: File: src/core/pack/dunbrack/SingleResidueDunbrackLibrary.hh:306 chi angle must be between -180 and 180: -nan ------------------------ Begin developer's backtrace ------------------------- BACKTRACE: [0x3ce8b7f] [0x62b4e53] [0x408ae82] |
Folding Proteins Joined: 27 Mar 20 Posts: 2 Credit: 349,986 RAC: 0 |
Hello, since changing the URL in the BOINC client (7.16.5, Win10) to the one with the HTTPS prefix, my WUs are not saved correctly when I exit the BOINC Manager. I have the "Leave non-GPU tasks in memory while suspended" option checked in computing preferences. The issue also coincides with the Rosetta 4.20 release, so I am not sure whether the problem comes from the URL change or from how the server handles tasks now. Previously I had no problems with WUs being saved and resumed after client or machine shutdowns. I am not sure what exactly is needed, but I can attach some log files and settings if requested. |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Another recent change was to add checkpointing between completed models in one of the search protocols. When you say the WUs were not saved correctly... what are you looking at to define that? Are you familiar with the CPU time since last checkpoint shown in the task properties? When you "exit" (rather than "close") BOINC Manager, the active tasks are ended, and will revert to their last checkpoints when they restart. The setting for leaving tasks in memory only applies when BOINC is still running, but has suspended the task to run another project, or because the user requested it. Rosetta Moderator: Mod.Sense |
Folding Proteins Joined: 27 Mar 20 Posts: 2 Credit: 349,986 RAC: 0 |
To explain further: I crunch during the bigger part of the day but then have to shut down the machine overnight (for about 8 hours or so). My routine is as follows: while running, update the Rosetta project so all work is check-pointed, then simply exit the BOINC Manager (with the option to suspend all work on exit enabled), then power off my PC. The next day I just power on the machine, with BOINC set to run on startup, and usually the WUs just continue from where they were before shutdown. What happens now is that when I power on the PC, all WUs are gone (marked as failed tasks) and new ones start from scratch. Since the URL/Rosetta 4.20 change I have a 60/40 split between completed and failed tasks. |
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,281,662 RAC: 1,150 |
"To explain further:" Most computers do not have main memory that will retain its contents with the power turned off. Updating Rosetta@home does NOT automatically checkpoint all work. You may need to look into telling BOINC to suspend all work, then telling your computer to Sleep instead of Shut down, so it can write the entire contents of its memory to the hard drive, and then write this back into main memory when you turn the computer on again. This lets it resume any programs that were suspended rather than aborted. I suspect that you previously had your computer set to use sleep instead of shut down, and some change since then (possibly 4.20) has turned off this setting. It could, however, also mean that 4.20 has timing routines that cannot properly handle very long delays, or a resume-from-checkpoint section that fails to work properly 40% of the time it is used. |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
As robertmiles points out, you cannot force checkpoints. Using the "sleep" function (where memory remains powered and everything is preserved) or the "hibernate" function (where memory contents are written to disk and memory is powered off) would be a good idea to maximize the work done on your machine. It should also avoid whatever this error condition is that is being encountered. You mention that you "...exit the BOINC manager (with the option to suspend all work on exit enabled)". I am not familiar with such an option. What is the wording on the screen for this option? Are you referring to the activity option for when to run? And setting it to suspend? Rosetta Moderator: Mod.Sense |
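
The run-down-then-re-attach procedure discussed earlier in the thread (set No new tasks, let the cache finish, remove the project, add it again under the new URL) can also be driven from BOINC's command-line tool, `boinccmd`. What follows is only a minimal sketch, not something any poster in the thread used: it assumes `boinccmd` is installed and on the PATH, and the helper names, `OLD_PROJECT_URL`, and `YOUR_ACCOUNT_KEY` are hypothetical placeholders. The script only prints the commands, so nothing is changed on the host until you run them yourself.

```python
import shlex

# New project URL, taken from the Event Log message quoted in the thread.
NEW_URL = "https://boinc.bakerlab.org/rosetta/"


def stop_new_work(project_url):
    """Equivalent of pressing 'No new tasks' for one project."""
    return ["boinccmd", "--project", project_url, "nomorework"]


def detach(project_url):
    """Equivalent of 'Remove' in BOINC Manager. Run only after all tasks are reported."""
    return ["boinccmd", "--project", project_url, "detach"]


def attach(project_url, account_key):
    """Attach to the project again as an existing user, using the account key."""
    return ["boinccmd", "--project_attach", project_url, account_key]


def migration_plan(old_url, account_key, new_url=NEW_URL):
    """Ordered list of commands; step 2 must wait until the task cache is empty."""
    return [stop_new_work(old_url), detach(old_url), attach(new_url, account_key)]


if __name__ == "__main__":
    # Placeholders (assumptions): substitute the URL shown in your Projects tab
    # and the account key from your Rosetta@home account page.
    for step, cmd in enumerate(migration_plan("OLD_PROJECT_URL", "YOUR_ACCOUNT_KEY"), 1):
        print(f"step {step}: {shlex.join(cmd)}")
```

The plan is kept as data rather than executed so the waiting period between the first and second step stays a manual decision, which matches the advice above to detach only once every task has been completed and returned.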
©2024 University of Washington
https://www.bakerlab.org