Problems and Technical Issues with Rosetta@home

karbonade
Joined: 22 Mar 20
Posts: 2
Credit: 1,837,525
RAC: 2
Message 92680 - Posted: 31 Mar 2020, 0:45:42 UTC

Hi,
My PC crashed and I had to install the OS again (Ubuntu VM), and I gave it the same system name as the previous install. After that I installed BOINC and set up R@H.
But now I don't get any new tasks. I just keep getting the message: "Scheduler request completed: got 0 new tasks. No tasks sent"
On my other PC I still get new tasks.
Is it because the tasks that were there before the crash are still "open"? Well, they will not be processed by me...
I also used the "Reset project" button in BOINC, but that didn't help in any way.
Is there something else I have to do?

yoerik
Joined: 24 Mar 20
Posts: 128
Credit: 169,525
RAC: 0
Message 92683 - Posted: 31 Mar 2020, 0:53:10 UTC - in response to Message 92680.  

Hi,
My PC crashed and I had to install the OS again (Ubuntu VM), and I gave it the same system name as the previous install. After that I installed BOINC and set up R@H.
But now I don't get any new tasks. I just keep getting the message: "Scheduler request completed: got 0 new tasks. No tasks sent"
On my other PC I still get new tasks.
Is it because the tasks that were there before the crash are still "open"? Well, they will not be processed by me...
I also used the "Reset project" button in BOINC, but that didn't help in any way.
Is there something else I have to do?


It's not your end. Rosetta is out of WUs. The latest update from a Project Administrator: https://boinc.bakerlab.org/rosetta/forum_thread.php?id=13533&postid=92681#92681
So don't worry. Most people aren't getting new WUs right now. The WUs you were working on have likely been redeployed, or the system will take a few days to realize that you stopped working on them.

MarkJ
Joined: 28 Mar 20
Posts: 72
Credit: 24,868,193
RAC: 32
Message 92685 - Posted: 31 Mar 2020, 1:13:51 UTC - in response to Message 92606.  

There are no silly questions. But HT is enabled. Here are the other performance-related BIOS options, with the default and the current setting for each:
Intel(R) Turbo Boost Technology: default Enabled, current Enabled
ACPI SLIT: default Enabled, current Enabled

Node Interleaving: default Disabled, current Disabled
Intel NIC DMA Channels (IOAT): default Enabled, current Enabled
HW Prefetcher: default Enabled, current Enabled
Adjacent Sector Prefetch: default Enabled, current Enabled
DCU Stream Prefetcher: default Enabled, current Enabled
DCU IP Prefetcher: default Enabled, current Enabled
QPI Snoop Configuration: default Home Snoop, current Home Snoop
QPI Home Snoop Optimization: default Directory + OSB, current Enabled
QPI Bandwidth Optimization (RTID): default Balanced, current Balanced
Memory Proximity Reporting for I/O: default Enabled, current Enabled
I/O Non-posted Prefetching: default Enabled, current Enabled
NUMA Group Size Optimization: default Clustered, current Clustered
Intel Performance Monitoring Support: default Disabled, current Disabled

I would compare these settings with those on one of your other machines that is working properly, in particular:
Node Interleaving
NUMA Group Size Optimization

About the only other thing I can think of would be the Windows version. Are you running Home, Pro or Enterprise on it? Is it the same as on the other (working) machines?

robertmiles
Joined: 16 Jun 08
Posts: 1217
Credit: 13,359,660
RAC: 431
Message 92687 - Posted: 31 Mar 2020, 1:16:17 UTC - in response to Message 92678.  

I have 2 tasks that ended with a Computation error:

https://boinc.bakerlab.org/rosetta/result.php?resultid=1136321350
https://boinc.bakerlab.org/rosetta/result.php?resultid=1136320461

Both look like they ran out of memory at some point.
I have 64 GB of RAM; is it because Rosetta@home is 32-bit or something?

Thanks,
Iulian

It looks likely that you have enough memory, but haven't given BOINC permission to use enough of it.

Try this:

If you are using the Simple view, click on View near the top line, then Advanced view....

Click on Projects, then Rosetta@home, then Your account.

Under Preferences, click on Computing preferences.

For each of the following sections, if they are present:
Primary (default) preferences
Separate preferences for home
Separate preferences for work
Separate preferences for school

Scroll down to Memory.

If the memory percentages are too low for 64 GB divided by the number of processors (12 in your case) to come out at 2 GB or more, then scroll down to Other and click on Edit preferences, then scroll down to Memory and increase the percentages, then scroll down to Other and click on Update preferences.

Click on the X at the top right corner of the Computing preferences window to close it.

Click on Projects, then Rosetta@home, then Update.

If you want to go back to the Simple view, click on View, then Simple view....
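
The arithmetic behind the memory check in the steps above can be sketched in a few lines of Python. This is only a rough illustration. It assumes the memory BOINC may use is simply total RAM times the percentage set in the computing preferences, shared evenly across the running tasks. The function names are made up for this example, and the 2 GB per task figure is the rule of thumb from this post, not an official requirement.

```python
def memory_per_task_gb(total_ram_gb, percent_allowed, n_tasks):
    """RAM share per running task, in GB, under an even-split assumption."""
    if n_tasks <= 0:
        raise ValueError("n_tasks must be positive")
    return total_ram_gb * (percent_allowed / 100.0) / n_tasks


def min_percent_needed(total_ram_gb, n_tasks, per_task_gb=2.0):
    """Smallest memory percentage that leaves per_task_gb for every task."""
    return min(100.0, 100.0 * per_task_gb * n_tasks / total_ram_gb)


if __name__ == "__main__":
    # Numbers from the post: 64 GB of RAM and 12 processors.
    for percent in (25, 50, 90):
        share = memory_per_task_gb(64, percent, 12)
        verdict = "ok" if share >= 2.0 else "too low"
        print(f"{percent}% allowed: {share:.2f} GB per task ({verdict})")
    print("minimum percentage:", min_percent_needed(64, 12))
```

With these numbers, any memory percentage below 37.5% would leave less than 2 GB per task on a 12-processor machine with 64 GB.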

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1380
Credit: 13,693,695
RAC: 136
Message 92705 - Posted: 31 Mar 2020, 4:41:49 UTC
Last modified: 31 Mar 2020, 4:53:56 UTC

Can't upload anything at present; uploads are going into instant timeout.

31/03/2020 13:56:53 | Rosetta@home | Computation for task rb_03_21_19079_18887_ab_t000__robetta_cstwt_5.0_IGNORE_THE_REST_03_07_902861_8_1 finished
31/03/2020 13:56:54 | Rosetta@home | Starting task 1xl3zs2u_Junior_HalfRoid_design2_COVID-19_SAVE_ALL_OUT_904232_1_0
31/03/2020 13:56:55 | Rosetta@home | Started upload of rb_03_21_19079_18887_ab_t000__robetta_cstwt_5.0_IGNORE_THE_REST_03_07_902861_8_1_r402737099_0
31/03/2020 13:56:57 | Rosetta@home | [error] Error reported by file upload server: can't open log file '../log_bwsrv2/file_upload_handler.log' (errno: 9)
31/03/2020 13:56:57 | Rosetta@home | Temporarily failed upload of rb_03_21_19079_18887_ab_t000__robetta_cstwt_5.0_IGNORE_THE_REST_03_07_902861_8_1_r402737099_0: transient upload error
31/03/2020 13:56:57 | Rosetta@home | Backing off 00:03:13 on upload of rb_03_21_19079_18887_ab_t000__robetta_cstwt_5.0_IGNORE_THE_REST_03_07_902861_8_1_r402737099_0
31/03/2020 13:58:40 | Rosetta@home | Computation for task 0jb7gi3t_jhr_design1_COVID-19_SAVE_ALL_OUT_903456_1_0 finished
31/03/2020 13:58:41 | Rosetta@home | Starting task rb_03_29_19780_19680_ab_t000__robetta_IGNORE_THE_REST_05_08_904234_12_0
31/03/2020 13:58:43 | Rosetta@home | Started upload of 0jb7gi3t_jhr_design1_COVID-19_SAVE_ALL_OUT_903456_1_0_r1714937121_0
31/03/2020 13:58:45 | Rosetta@home | [error] Error reported by file upload server: can't open log file '../log_bwsrv2/file_upload_handler.log' (errno: 9)
31/03/2020 13:58:45 | Rosetta@home | Temporarily failed upload of 0jb7gi3t_jhr_design1_COVID-19_SAVE_ALL_OUT_903456_1_0_r1714937121_0: transient upload error
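
To see at a glance how many uploads in an event log like the one above failed, a small script can tally the messages. This is a minimal sketch that assumes the `timestamp | project | message` layout shown in the excerpt. The function name and the message prefixes it matches are taken from this excerpt only, not from any BOINC specification.

```python
from collections import Counter

# Message prefixes as they appear in the log excerpt above (assumption:
# other client versions or languages may word these differently).
PREFIXES = {
    "Started upload of": "started",
    "Temporarily failed upload of": "failed",
    "Backing off": "backoff",
    "[error] Error reported by file upload server": "server_error",
}

SAMPLE = [
    "31/03/2020 13:56:55 | Rosetta@home | Started upload of rb_03_21_19079_18887_ab_t000__robetta_cstwt_5.0_IGNORE_THE_REST_03_07_902861_8_1_r402737099_0",
    "31/03/2020 13:56:57 | Rosetta@home | [error] Error reported by file upload server: can't open log file '../log_bwsrv2/file_upload_handler.log' (errno: 9)",
    "31/03/2020 13:56:57 | Rosetta@home | Temporarily failed upload of rb_03_21_19079_18887_ab_t000__robetta_cstwt_5.0_IGNORE_THE_REST_03_07_902861_8_1_r402737099_0: transient upload error",
    "31/03/2020 13:56:57 | Rosetta@home | Backing off 00:03:13 on upload of rb_03_21_19079_18887_ab_t000__robetta_cstwt_5.0_IGNORE_THE_REST_03_07_902861_8_1_r402737099_0",
]


def summarize_uploads(lines):
    """Count upload-related messages in BOINC event log lines."""
    counts = Counter()
    for line in lines:
        # Split on the column separator instead of parsing the timestamp,
        # because the timestamp format depends on the machine's locale.
        parts = line.strip().split(" | ", 2)
        if len(parts) != 3:
            continue
        message = parts[2]
        for prefix, label in PREFIXES.items():
            if message.startswith(prefix):
                counts[label] += 1
                break
    return counts


if __name__ == "__main__":
    print(dict(summarize_uploads(SAMPLE)))
```

A "failed" count that stays level with the "started" count, as in this excerpt, points to a problem on the server side rather than with any individual result file.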



EDIT-
finally managed to get them to upload.
Grant
Darwin NT

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1380
Credit: 13,693,695
RAC: 136
Message 92706 - Posted: 31 Mar 2020, 4:44:18 UTC - in response to Message 92600.  
Last modified: 31 Mar 2020, 4:51:08 UTC

I really don't know what's wrong. I use the exact same settings on all my servers. As said, the Gen8 servers, even with 64 cores, are fully loaded. The Gen9 servers only take half...
Any luck with this problem?
Not sure if you answered one of my earlier questions- "Are all of the threads in use on just the one CPU?"
If you run another CPU intensive programme, does it use one of the other unused threads, or does it end up on the threads presently in use?

Win Server 2016: could it be a licencing issue? If the licence is expired or no longer valid, perhaps only 1 socket is usable, even though both CPUs are detected & recognised by the OS? (I've never had to deal with socket/core/thread licencing myself.)
Grant
Darwin NT

yoerik
Joined: 24 Mar 20
Posts: 128
Credit: 169,525
RAC: 0
Message 92707 - Posted: 31 Mar 2020, 4:52:49 UTC - in response to Message 92705.  

Can't upload anything at present; uploads are going into instant timeout.

31/03/2020 13:56:53 | Rosetta@home | Computation for task rb_03_21_19079_18887_ab_t000__robetta_cstwt_5.0_IGNORE_THE_REST_03_07_902861_8_1 finished
31/03/2020 13:56:54 | Rosetta@home | Starting task 1xl3zs2u_Junior_HalfRoid_design2_COVID-19_SAVE_ALL_OUT_904232_1_0
31/03/2020 13:56:55 | Rosetta@home | Started upload of rb_03_21_19079_18887_ab_t000__robetta_cstwt_5.0_IGNORE_THE_REST_03_07_902861_8_1_r402737099_0
31/03/2020 13:56:57 | Rosetta@home | [error] Error reported by file upload server: can't open log file '../log_bwsrv2/file_upload_handler.log' (errno: 9)
31/03/2020 13:56:57 | Rosetta@home | Temporarily failed upload of rb_03_21_19079_18887_ab_t000__robetta_cstwt_5.0_IGNORE_THE_REST_03_07_902861_8_1_r402737099_0: transient upload error
31/03/2020 13:56:57 | Rosetta@home | Backing off 00:03:13 on upload of rb_03_21_19079_18887_ab_t000__robetta_cstwt_5.0_IGNORE_THE_REST_03_07_902861_8_1_r402737099_0
31/03/2020 13:58:40 | Rosetta@home | Computation for task 0jb7gi3t_jhr_design1_COVID-19_SAVE_ALL_OUT_903456_1_0 finished
31/03/2020 13:58:41 | Rosetta@home | Starting task rb_03_29_19780_19680_ab_t000__robetta_IGNORE_THE_REST_05_08_904234_12_0
31/03/2020 13:58:43 | Rosetta@home | Started upload of 0jb7gi3t_jhr_design1_COVID-19_SAVE_ALL_OUT_903456_1_0_r1714937121_0
31/03/2020 13:58:45 | Rosetta@home | [error] Error reported by file upload server: can't open log file '../log_bwsrv2/file_upload_handler.log' (errno: 9)
31/03/2020 13:58:45 | Rosetta@home | Temporarily failed upload of 0jb7gi3t_jhr_design1_COVID-19_SAVE_ALL_OUT_903456_1_0_r1714937121_0: transient upload error


My PC is having trouble transferring files as well. The server on their end is probably overwhelmed by the number of WUs out in the wild, trying to upload.

vaughan
Joined: 17 Sep 05
Posts: 4
Credit: 21,277,287
RAC: 53
Message 92710 - Posted: 31 Mar 2020, 5:39:34 UTC

Why don't the Rosetta stats sent to Free-DC update more often?

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1380
Credit: 13,693,695
RAC: 136
Message 92712 - Posted: 31 Mar 2020, 6:28:18 UTC

Still unable to report; the Scheduler has been down for a while now (3+ hrs).
Grant
Darwin NT

karbonade
Joined: 22 Mar 20
Posts: 2
Credit: 1,837,525
RAC: 2
Message 92713 - Posted: 31 Mar 2020, 6:38:54 UTC - in response to Message 92683.  

Hi,
My PC crashed and I had to install the OS again (Ubuntu VM), and I gave it the same system name as the previous install. After that I installed BOINC and set up R@H.
But now I don't get any new tasks. I just keep getting the message: "Scheduler request completed: got 0 new tasks. No tasks sent"
On my other PC I still get new tasks.
Is it because the tasks that were there before the crash are still "open"? Well, they will not be processed by me...
I also used the "Reset project" button in BOINC, but that didn't help in any way.
Is there something else I have to do?


It's not your end. Rosetta is out of WUs. The latest update from a Project Administrator: https://boinc.bakerlab.org/rosetta/forum_thread.php?id=13533&postid=92681#92681
So don't worry. Most people aren't getting new WUs right now. The WUs you were working on have likely been redeployed, or the system will take a few days to realize that you stopped working on them.


Thanks for your answer, yoerik.
Good to know that there are updates coming up.

ToeBlister
Joined: 24 Mar 20
Posts: 3
Credit: 426,122
RAC: 0
Message 92714 - Posted: 31 Mar 2020, 6:39:15 UTC - in response to Message 92712.  
Last modified: 31 Mar 2020, 6:39:40 UTC

Rosetta appears to be down for maintenance?
31-Mar-20 2:12:46 PM | Rosetta@home | Project is temporarily shut down for maintenance

HPE Belgium
Joined: 27 Mar 20
Posts: 16
Credit: 358,070,414
RAC: 27,914
Message 92727 - Posted: 31 Mar 2020, 11:21:57 UTC - in response to Message 92685.  

There are no silly questions. But HT is enabled. Here are the other performance-related BIOS options, with the default and the current setting for each:
Intel(R) Turbo Boost Technology: default Enabled, current Enabled
ACPI SLIT: default Enabled, current Enabled

Node Interleaving: default Disabled, current Disabled
Intel NIC DMA Channels (IOAT): default Enabled, current Enabled
HW Prefetcher: default Enabled, current Enabled
Adjacent Sector Prefetch: default Enabled, current Enabled
DCU Stream Prefetcher: default Enabled, current Enabled
DCU IP Prefetcher: default Enabled, current Enabled
QPI Snoop Configuration: default Home Snoop, current Home Snoop
QPI Home Snoop Optimization: default Directory + OSB, current Enabled
QPI Bandwidth Optimization (RTID): default Balanced, current Balanced
Memory Proximity Reporting for I/O: default Enabled, current Enabled
I/O Non-posted Prefetching: default Enabled, current Enabled
NUMA Group Size Optimization: default Clustered, current Clustered
Intel Performance Monitoring Support: default Disabled, current Disabled

I would compare these settings with those on one of your other machines that is working properly, in particular:
Node Interleaving
NUMA Group Size Optimization

About the only other thing I can think of would be the Windows version. Are you running Home, Pro or Enterprise on it? Is it the same as on the other (working) machines?


I will check the settings... All servers are running Windows Server 2016 Standard, 64-bit.

HPE Belgium
Joined: 27 Mar 20
Posts: 16
Credit: 358,070,414
RAC: 27,914
Message 92746 - Posted: 31 Mar 2020, 14:23:17 UTC - in response to Message 92706.  

I really don't know what's wrong. I use the exact same settings on all my servers. As said, the Gen8 servers, even with 64 cores, are fully loaded. The Gen9 servers only take half...
Any luck with this problem?
Not sure if you answered one of my earlier questions- "Are all of the threads in use on just the one CPU?"
If you run another CPU intensive programme, does it use one of the other unused threads, or does it end up on the threads presently in use?

Win Server 2016: could it be a licencing issue? If the licence is expired or no longer valid, perhaps only 1 socket is usable, even though both CPUs are detected & recognised by the OS? (I've never had to deal with socket/core/thread licencing myself.)


All servers are fresh installs. To be honest, I did not really troubleshoot any further since I was busy setting up all the other hosts. This seems to be the only host with the issue. I will dig into it later, tomorrow I think.

I installed BoincTasks to get a decent overview, and it seems this specific host is not underperforming. But you are right, everything is running on only 1 CPU.

I'll come back to that later. First I'm setting up all the other hosts.

Horny
Joined: 5 May 12
Posts: 2
Credit: 1,067,923
RAC: 0
Message 92782 - Posted: 31 Mar 2020, 17:33:28 UTC

Hello,
90% of the WUs I uploaded today have only been awarded 8-10 points. Is there a reason why these COVID WUs were sent out twice and now give hardly any points?
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=1023883360
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=1023826450
Kind regards

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 1,128
Message 92783 - Posted: 31 Mar 2020, 17:45:17 UTC - in response to Message 92782.  

90% of the WUs I uploaded today have only been awarded 8-10 points

I run the 18-hour work units, and see one with 38 points and several more with 64 points.

I used to post lengthy analyses trying to understand their scoring, but have given up on it.
There are more interesting things to do.

Horny
Joined: 5 May 12
Posts: 2
Credit: 1,067,923
RAC: 0
Message 92785 - Posted: 31 Mar 2020, 17:57:14 UTC

Well, what is noticeable is that all of these 90% were sent out twice; the other 10% were not, and those got normal credit.

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 92793 - Posted: 31 Mar 2020, 18:35:52 UTC

Some of the WUs are having problems, and with no other work on the server, it becomes quite likely that all you get are ones that others have already failed on. WUs are created so that they are only attempted on two host machines. After that, they die.

The Project Team is working on this issue, which seems to affect primarily COVID tasks.
Rosetta Moderator: Mod.Sense

Plomos
Joined: 4 Mar 11
Posts: 11
Credit: 435,400
RAC: 48
Message 92799 - Posted: 31 Mar 2020, 19:55:03 UTC

I'm running BOINC 7.16.1 on Fedora 31, and Rosetta tasks do not give me the option to view the graphics, which I would use to see how many decoys a unit is running.

I have 12 GB of RAM and a Core i5-8250U CPU, but only use 4 cores because of heat issues, so there should be plenty of RAM to go around for the screensaver graphics. The option is greyed out, though, and I cannot click on it. Does anyone know why it is not available?

Mariusz Borowski
Joined: 18 Jul 10
Posts: 1
Credit: 10,383
RAC: 0
Message 92812 - Posted: 31 Mar 2020, 21:34:37 UTC

As of 21:20 UTC I'm not getting any tasks from R@H. My computer was busy with another BOINC project for several days. I have just logged in to my Rosetta account and there is something strange: only 8 abandoned tasks are listed, while there should be about 50 with various statuses. Is that normal, or is there something wrong with my account?
https://boinc.bakerlab.org/rosetta/results.php?userid=386774&offset=0&show_names=0&state=0&appid=0

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 92813 - Posted: 31 Mar 2020, 21:37:11 UTC

It is normal that results do not stay on the website forever. They are typically removed after a week or two.
Rosetta Moderator: Mod.Sense

©2022 University of Washington
https://www.bakerlab.org