Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1735 Credit: 18,532,940 RAC: 14,716 |
"Hi," But it's not an issue that other people are seeing, so your settings are very likely to be a factor. I would strongly suggest changing "10.04.2020 13:23:43 | | don't compute while active" to allow processing, and changing "10.04.2020 13:23:43 | | suspend work if non-BOINC CPU load exceeds 15%" so that the value is left blank. The relevant computing preferences are: "Suspend when computer is in use" (leave unchecked); "Suspend GPU computing when computer is in use"; "'In use' means mouse/keyboard input in last 3 minutes"; "Suspend when no mouse/keyboard input in last --- minutes"; "Suspend when non-BOINC CPU usage is above --- %". You have allocated only 2 CPU threads out of 8 to process BOINC work, so there shouldn't be any benefit to stopping BOINC from processing work when the computer is in use, or when non-BOINC CPU usage is high (Rosetta applications run at Idle priority). See if the errors no longer occur with those settings. Grant Darwin NT |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1735 Credit: 18,532,940 RAC: 14,716 |
"I can't upload this Task https://boinc.bakerlab.org/rosetta/workunit.php?wuid=1029331860 . It is stuck for two days now. Always uploading to 100% but not disappearing from the transfer tab." While the file is uploading (not waiting, but with the elapsed time counting upward), select Activity, "Suspend network activity." Give it a second or two, then set it back to "Network activity based on preferences." This can often get a stuck transfer unstuck. That's showing it's not able to contact the Rosetta servers. If, on the Project tab with Rosetta selected, you click Update, what result do you get in the Event Log? (You haven't updated any AV/malware software recently, or installed a new programme?) Grant Darwin NT |
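For anyone who prefers the command line, the same suspend/resume/update sequence can be sketched with `boinccmd`. This is only a sketch: the project URL below is an assumption (use the one shown for Rosetta in your own client), the helper name `unstick_transfer_commands` is made up for illustration, and the script just prints the commands rather than running them.

```python
import shlex

# Assumed project URL; confirm it against what your BOINC client shows.
PROJECT_URL = "https://boinc.bakerlab.org/rosetta/"


def unstick_transfer_commands(project_url=PROJECT_URL):
    """Return the boinccmd calls, in order, as argv lists.

    Wait a second or two between the first and second command,
    as suggested in the post above.
    """
    return [
        # Equivalent of Activity > "Suspend network activity"
        ["boinccmd", "--set_network_mode", "never"],
        # Equivalent of "Network activity based on preferences"
        ["boinccmd", "--set_network_mode", "auto"],
        # Equivalent of selecting the project and clicking Update
        ["boinccmd", "--project", project_url, "update"],
        # Show recent Event Log messages to see what the update did
        ["boinccmd", "--get_messages"],
    ]


if __name__ == "__main__":
    for argv in unstick_transfer_commands():
        print(shlex.join(argv))
```

Printing instead of executing keeps the sketch harmless on a machine where `boinccmd` is not on the path; copy the printed lines into a terminal once they look right for your setup.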
Tom M Send message Joined: 20 Jun 17 Posts: 97 Credit: 16,726,096 RAC: 36,642 |
"Now I receive again and again error messages that tasks are exited with zero status and no finish file, see below." Are you running the available CPU threads at 90% or less? You often need at least 1 thread "idle" to keep from over-committing your CPU, which can produce that symptom. Tom Help, my tagline is missing..... Help, my tagline is......... Help, m........ Hel..... |
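To put numbers on the "leave at least 1 thread idle" advice, here is a small arithmetic sketch. It assumes the client turns the "use at most N% of the CPUs" preference into a thread count by rounding down; that rounding rule is my assumption, and both function names are hypothetical.

```python
def pct_for_threads(total_threads, wanted_threads):
    """Smallest whole percentage that still yields `wanted_threads`
    when the product is rounded down (ceiling division)."""
    if not 1 <= wanted_threads <= total_threads:
        raise ValueError("wanted_threads must be between 1 and total_threads")
    return -(-100 * wanted_threads // total_threads)


def threads_at_pct(total_threads, pct):
    """Threads used at a given percentage, assuming the client rounds
    down and never goes below one thread (an assumption)."""
    return max(1, total_threads * pct // 100)


if __name__ == "__main__":
    total = 8
    pct = pct_for_threads(total, total - 1)  # leave one thread idle
    print(f"{total} threads: set {pct}% to run {threads_at_pct(total, pct)} tasks")
```

Under that assumption an 8-thread machine needs 88% to run 7 tasks (87% would drop to 6), and 90% also gives 7, which matches the "90% or less" rule of thumb. The 2-of-8 setup mentioned earlier in the thread corresponds to 25%.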
EHM-1 Send message Joined: 21 Mar 20 Posts: 23 Credit: 183,782 RAC: 0 |
FYI to anyone who may be paying attention: Rosetta resumed apparently normal behavior on my desktop this morning, after around 2 days of appearing stalled. I have no idea what is causing this behavior. Any ideas? Original post https://boinc.bakerlab.org/rosetta/forum_thread.php?id=6893&postid=92534#92534. Eric |
Sven Send message Joined: 7 Feb 16 Posts: 8 Credit: 222,005 RAC: 0 |
Jim, Grant, I don't see any problems with any other projects with these settings. And as I said, I tried it on several computers. Only Rosetta frequently stops crunching tasks. The low number of CPU cores is to keep the fan speed down, since the fan can get unnervingly loud. But to find out whether any of my settings are the problem, I will try changing the settings to allow more power consumption. |
svincent Send message Joined: 30 Dec 05 Posts: 219 Credit: 12,120,035 RAC: 0 |
Problem with this task on Ubuntu: https://boinc.bakerlab.org/rosetta/result.php?resultid=1146197010 (4mc4in7o_Mini_Protein_binds_IL1R_COVID-19_design4_SAVE_ALL_OUT_905389_4_1). Validate error after a couple of minutes: ERROR: [ERROR] Unable to open constraints file: mot_HHH_b1_05627_000000248_0001_1_19_H_._28a7dda1a33635c05f5ab621834c8d3e_0001_0001_0001.MSAcst ERROR:: Exit from: src/core/scoring/constraints/ConstraintIO.cc line: 457 00:51:21 (5035): called boinc_finish(0) |
robertmiles Send message Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 677 |
svincent, that error often means that some overaggressive antivirus program somewhere on the path from the download server to your computer prevented successful downloading of the missing file. A less likely cause is the workunit being set up to use a file that wasn't on the download server. Does the BOINC log file say anything about attempts to download that file? That file is emptied when BOINC restarts, so it might already be too late to see them. Do you have any antivirus program running on that computer? If so, does it keep a list of what files it decided to delete or hide? It doesn't have to have been blocked on your computer. Some previous occurrences of that type of error were on servers on the links that connect the R@h download server to their main internet portal. |
svincent Send message Joined: 30 Dec 05 Posts: 219 Credit: 12,120,035 RAC: 0 |
I have no antivirus software on my computer. Here's a portion of the event log that I hope may be helpful Sat 11 Apr 2020 12:49:03 AM PDT | Rosetta@home | Starting task 4mc4in7o_Mini_Protein_binds_IL1R_COVID-19_design4_SAVE_ALL_OUT_905389_4_1 Sat 11 Apr 2020 12:49:04 AM PDT | Rosetta@home | Started upload of hgfp_dimer_3x_42_fold_SAVE_ALL_OUT_906263_790_0_r1313969874_0 Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Info: Too old connection (2574 seconds), disconnect it Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Info: Connection 2712 seems to be dead! Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Info: Closing connection 2712 Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Info: Too old connection (2574 seconds), disconnect it Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Info: Connection 2713 seems to be dead! Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Info: Closing connection 2713 Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Info: Trying 128.95.160.157:80... 
Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Info: TCP_NODELAY set Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Info: Connected to boinc.bakerlab.org (128.95.160.157) port 80 (#2714) Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Sent header to server: POST /rosetta_cgi/file_upload_handler HTTP/1.1 Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Sent header to server: Host: boinc.bakerlab.org Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Sent header to server: User-Agent: BOINC client (x86_64-pc-linux-gnu 7.16.3) Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Sent header to server: Accept: */* Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Sent header to server: Accept-Encoding: deflate, gzip Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Sent header to server: Accept-Language: en_CA Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Sent header to server: Content-Length: 314 Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Sent header to server: Content-Type: application/x-www-form-urlencoded Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Sent header to server: Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Info: We are completely uploaded and fine Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Info: Mark bundle as not supporting multiuse Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Received header from server: HTTP/1.1 200 OK Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Received header from server: Date: Sat, 11 Apr 2020 07:49:05 GMT Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Received header from server: Server: Apache/2.4.18 Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Received header from server: Vary: Accept-Encoding Sat 11 Apr 2020 
12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Received header from server: Content-Encoding: gzip Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Received header from server: Content-Length: 75 Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Received header from server: Content-Type: text/plain Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Received header from server: Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | Sat 11 Apr 2020 12:49:05 AM PDT | Rosetta@home | [http] [ID#4338] Info: Connection #2714 to host boinc.bakerlab.org left intact Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Info: Found bundle for host boinc.bakerlab.org: 0x7f9cb4001520 [serially] Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Info: Can not multiplex, even if we wanted to! Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Info: Re-using existing connection! (#2714) with host boinc.bakerlab.org Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Info: Connected to boinc.bakerlab.org (128.95.160.157) port 80 (#2714) Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Sent header to server: POST /rosetta_cgi/file_upload_handler HTTP/1.1 Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Sent header to server: Host: boinc.bakerlab.org Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Sent header to server: User-Agent: BOINC client (x86_64-pc-linux-gnu 7.16.3) Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Sent header to server: Accept: */* Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Sent header to server: Accept-Encoding: deflate, gzip Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Sent header to server: Accept-Language: en_CA Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Sent header to server: Content-Length: 283230 Sat 11 Apr 2020 12:49:07 
AM PDT | Rosetta@home | [http] [ID#4338] Sent header to server: Content-Type: application/x-www-form-urlencoded Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Sent header to server: Expect: 100-continue Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Sent header to server: Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Info: Mark bundle as not supporting multiuse Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Received header from server: HTTP/1.1 100 Continue Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Info: We are completely uploaded and fine Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Info: Mark bundle as not supporting multiuse Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Received header from server: HTTP/1.1 200 OK Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Received header from server: Date: Sat, 11 Apr 2020 07:49:07 GMT Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Received header from server: Server: Apache/2.4.18 Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Received header from server: Content-Length: 64 Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Received header from server: Content-Type: text/plain Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Received header from server: Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Received header from server: <data_server_reply> Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Received header from server: <status>0</status> Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Received header from server: </data_server_reply> Sat 11 Apr 2020 12:49:07 AM PDT | Rosetta@home | [http] [ID#4338] Info: Connection #2714 to host boinc.bakerlab.org left intact Sat 11 Apr 2020 12:49:08 AM PDT | Rosetta@home | Finished upload of 
hgfp_dimer_3x_42_fold_SAVE_ALL_OUT_906263_790_0_r1313969874_0 Sat 11 Apr 2020 12:51:24 AM PDT | Rosetta@home | Computation for task 4mc4in7o_Mini_Protein_binds_IL1R_COVID-19_design4_SAVE_ALL_OUT_905389_4_1 finished |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1735 Credit: 18,532,940 RAC: 14,716 |
"Jim, Grant," That's our point: you are having these issues on multiple systems, yet other people aren't. So the most likely cause is whatever differs between your setup and everyone else's, and that would appear to be your settings. "The low number of CPU cores is to keep the fan speed down, since the fan can get unnervingly loud." Yep, so because you have limited the number of Tasks that will run, there is no need to start and stop computation when other programmes are running. "But to find out whether any of my settings are the problem, I will try changing the settings to allow more power consumption." Hopefully it will sort things out. Grant Darwin NT |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2146 Credit: 41,570,180 RAC: 6,141 |
"I don't see any problems with any other projects with these settings." Just because Rosetta runs on the BOINC platform, like other projects, there's no requirement for it to run under the same parameters as other projects. Indeed, it's <very> different. Rosetta imposes very high demands on RAM, disk space and CPU power. It needs a relatively long processing time (compared to some projects, but lower than some others) but also requires a relatively short turnaround time. Consequently, the buffer of tasks it allows users to hold offline is lower than you may be used to. If people insist on maintaining the assumptions that apply to entirely different projects, those assumptions are going to fall flat on their face here and problems are inevitable. So be prepared to adapt. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 677 |
"I have no antivirus software on my computer. [snip]" Looks like you found log file information for an upload. You need to look for information on downloads instead, which, for the same task, should be earlier in the log file than the upload. As I mentioned before, the cause is not necessarily on your computer. |
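If the log is long, picking out the transfer lines for one file by hand gets tedious. Below is a rough sketch of a filter over text copied from the Event Log. The sample lines are taken from the upload log posted above; I am assuming download messages follow the same "Started/Finished ... of <name>" wording, and the function name `transfer_lines` is made up for this example.

```python
def transfer_lines(log_text, name_fragment, direction="download"):
    """Return log lines that mention '<direction> of' and the given
    file-name fragment (case-insensitive)."""
    marker = f"{direction} of".lower()
    fragment = name_fragment.lower()
    hits = []
    for line in log_text.splitlines():
        lowered = line.lower()
        if marker in lowered and fragment in lowered:
            hits.append(line)
    return hits


# Lines copied from the log excerpt earlier in this thread.
SAMPLE_LOG = """\
Sat 11 Apr 2020 12:49:03 AM PDT | Rosetta@home | Starting task 4mc4in7o_Mini_Protein_binds_IL1R_COVID-19_design4_SAVE_ALL_OUT_905389_4_1
Sat 11 Apr 2020 12:49:04 AM PDT | Rosetta@home | Started upload of hgfp_dimer_3x_42_fold_SAVE_ALL_OUT_906263_790_0_r1313969874_0
Sat 11 Apr 2020 12:49:08 AM PDT | Rosetta@home | Finished upload of hgfp_dimer_3x_42_fold_SAVE_ALL_OUT_906263_790_0_r1313969874_0
"""

if __name__ == "__main__":
    # The excerpt only contains uploads, so this search comes back empty,
    # which is the situation described in the reply above.
    print(transfer_lines(SAMPLE_LOG, ".MSAcst"))
    # Searching for uploads of the other task finds both lines.
    for hit in transfer_lines(SAMPLE_LOG, "hgfp_dimer", direction="upload"):
        print(hit)
```

An empty result for the missing `.MSAcst` file does not prove the download never happened; it may only mean the log no longer goes back far enough.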
svincent Send message Joined: 30 Dec 05 Posts: 219 Credit: 12,120,035 RAC: 0 |
Sorry: I don't seem to be able to find that info: the log didn't go back that far. Anyway, it's the only task where it happened. |
Evil Penguin Send message Joined: 10 Jun 08 Posts: 5 Credit: 10,168,989 RAC: 0 |
Could someone please help me determine if the system I'm running is to blame or if these are bad WUs? https://boinc.bakerlab.org/rosetta/result.php?resultid=1147046920 https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=3817017 Only thing out of spec is how fast I have the memory running. I ran MemTest Pro for a good 24 hours and no errors. |
VelocityRC Send message Joined: 4 Apr 20 Posts: 4 Credit: 516,338 RAC: 0 |
Hi all. I posted earlier about BOINC dropping Rosetta upon every re-boot and having to re-add it to the projects list. I finally got the time to track it down. I was running BOINC in my AV sandbox. I took it out of the sandbox and all is well. I can now put it back in the sandbox and Rosetta stays in the projects after re-boot. One last question that the above didn't solve: upon every re-boot I have to re-adjust CPU usage. I have 2 different values on 2 different machines. Both behave the same way out of the sandbox or in. Ideas? Thanks. Bill S. EDIT: After a cold boot all the issues are resolved. The above seems to be the procedure for those running BOINC in a sandbox, at least where Avast Premium is concerned. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 677 |
"Sorry: I don't seem to be able to find that info: the log didn't go back that far. Anyway, it's the only task where it happened." I'm now using BOINC 7.16.5. I think the release notes for this version mentioned a fix for this problem - the problem with waits. They consider the oldest parts of the log being truncated a feature, necessary when BOINC runs nonstop for weeks or months at a time. One more detail - the log file is automatically emptied when BOINC restarts after being shut down. |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1735 Credit: 18,532,940 RAC: 14,716 |
"Only thing out of spec is how fast I have the memory running." Programmes such as Rosetta can work the memory harder than memory testing programmes, or just differently enough, to find a problem the testing programme doesn't. I'd suggest reverting to the default (or at least the XMP) settings for the RAM and seeing if that stops the errors from occurring. Grant Darwin NT |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1735 Credit: 18,532,940 RAC: 14,716 |
"One last question that the above didn't solve is upon every re-boot I have to re-adjust CPU usage. I have 2 different values on 2 different machines. Both behave the same way out of the sandbox or in." Looks like you figured it out. That is the whole point of a Sandbox: you can do whatever you want in there and it makes no difference; when you restart it, everything is set back to how it was at the last start. If you make changes that you actually want to keep, you need to explicitly do so using whatever mechanism that Sandbox supports. Grant Darwin NT |
Evil Penguin Send message Joined: 10 Jun 08 Posts: 5 Credit: 10,168,989 RAC: 0 |
"Only thing out of spec is how fast I have the memory running." "Programmes such as Rosetta can work the memory harder than memory testing programmes, or just differently enough, to find a problem the testing programme doesn't." The errors have been from running Rosetta Mini v3.78 windows_intelx86 tasks. Why is it running the 32-bit version instead of 64-bit? The Rosetta Mini v3.78 windows_x86_64 tasks have been running without issue. |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1735 Credit: 18,532,940 RAC: 14,716 |
"The errors have been from running Rosetta Mini v3.78 windows_intelx86 tasks." The BOINC Manager will try all applications for a particular Platform, and go with whichever one appears to be the best. "The Rosetta Mini v3.78 windows_x86_64 tasks have been running without issue." And on my system 90% or more of the Rosetta Mini Tasks have been done with Rosetta Mini v3.78 windows_intelx86 with no recent computation errors (there were some dud Tasks some time back); even the same type of Tasks that your system is erroring on have Validated. It could be a problem with the application, and those Tasks, on the Ryzen platform. But I haven't seen any other people reporting issues, so I'd suggest seeing how things go with the memory at default settings. Grant Darwin NT |
Keith Myers Send message Joined: 29 Mar 20 Posts: 97 Credit: 332,619 RAC: 8 |
I had nothing but errors on both the i686 applications on my Ryzen. Gave up on Rosetta and moved to Einstein. Discovered later that you can set a flag in cc_config.xml to ignore alternate platforms. <no_alt_platform>1</no_alt_platform> That would have told the Rosetta scheduler to not send me x86 applications and just send me the x86_64 applications. |
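For reference, here is a sketch of a minimal `cc_config.xml` carrying that flag, generated with Python's standard library. Placing `<no_alt_platform>` inside an `<options>` section is my understanding of the BOINC client configuration layout and should be checked against the client documentation; the function name is made up, and the script prints the XML rather than writing a file, so an existing configuration is not overwritten.

```python
import xml.etree.ElementTree as ET


def build_cc_config(no_alt_platform=True):
    """Build a minimal cc_config.xml document as a string.

    Assumes the flag belongs under <cc_config><options>.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    flag = ET.SubElement(options, "no_alt_platform")
    flag.text = "1" if no_alt_platform else "0"
    ET.indent(root)  # pretty-print; needs Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_cc_config())
```

If a `cc_config.xml` already exists in the BOINC data directory, add the one line to its existing options section instead of replacing the file, then have the client re-read its config files or restart it so the change takes effect.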
©2025 University of Washington
https://www.bakerlab.org