Problems and Technical Issues with Rosetta@home

Bill Swisher
Joined: 10 Jun 13
Posts: 81
Credit: 61,780,497
RAC: 16,820
Message 111896 - Posted: 11 Jan 2025, 1:11:18 UTC - in response to Message 111895.  

There was quite a conversation a while back about those transient errors. Someone found that if the line
128.95.160.156 boinc-files.bakerlab.org
is included in the /etc/hosts file that things work better. I don't know if it's still needed or not, I haven't removed it.
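
For anyone who would rather check before editing the file by hand, here is a minimal Python sketch. It only works on the text of a hosts file passed in as a string, so it never touches the real /etc/hosts. The function names `has_hosts_entry` and `with_hosts_entry` are made up for illustration, and the sketch cannot tell you whether the pinned address is still the right one.

```python
# Illustrative only: the IP and hostname come from the post above,
# everything else (names, structure) is an assumption for this sketch.
PINNED_IP = "128.95.160.156"
HOSTNAME = "boinc-files.bakerlab.org"


def has_hosts_entry(hosts_text, ip, hostname):
    """Return True if an uncommented line maps hostname to ip."""
    for raw_line in hosts_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # ignore comments
        if not line:
            continue
        fields = line.split()
        # First field is the address, the rest are names for it.
        if fields[0] == ip and hostname in fields[1:]:
            return True
    return False


def with_hosts_entry(hosts_text, ip, hostname):
    """Return hosts_text with the mapping appended if it is missing."""
    if has_hosts_entry(hosts_text, ip, hostname):
        return hosts_text
    # Make sure the new entry starts on its own line.
    separator = "" if (not hosts_text or hosts_text.endswith("\n")) else "\n"
    return f"{hosts_text}{separator}{ip} {hostname}\n"


if __name__ == "__main__":
    sample = "127.0.0.1 localhost\n"
    print(with_hosts_entry(sample, PINNED_IP, HOSTNAME), end="")
```

To apply it for real you would read /etc/hosts, pass the text through `with_hosts_entry`, and write the result back with administrator rights; keeping a backup copy first seems prudent.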
ID: 111896 · Rating: 0
Matthias Lehmkuhl
Joined: 20 Nov 05
Posts: 13
Credit: 2,685,355
RAC: 102
Message 111897 - Posted: 11 Jan 2025, 1:36:54 UTC - in response to Message 111896.  

There was quite a conversation a while back about those transient errors. Someone found that if the line
128.95.160.156 boinc-files.bakerlab.org
is included in the /etc/hosts file that things work better. I don't know if it's still needed or not, I haven't removed it.


Thanks, it is working for me too
Matthias

ID: 111897 · Rating: 0
Stevie G
Joined: 15 Dec 18
Posts: 129
Credit: 1,028,210
RAC: 0
Message 111898 - Posted: 11 Jan 2025, 4:26:21 UTC - in response to Message 111885.  

Does this mean that the processed data can be uploaded again soon and that there will soon be new work?
My BOINC client can't get rid of the work for Rosetta@home, and new work can't seem to be downloaded.
It would be nice if it could continue again soon.


I finally got 11 Rosetta tasks, three of which were almost immediately rejected as Errors While Computing.

They were all Rosetta Beta v6.06
windows_x86_64

S. Gaber
ID: 111898 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2475
Credit: 46,506,558
RAC: 3,357
Message 111901 - Posted: 11 Jan 2025, 8:20:01 UTC - in response to Message 111867.  

The latest update reads:
January 4, 2025
Update from the data centre: "Having issues with the physical network, likely can't get it diagnosed and fixed until Monday, January 6." As a result, we still do not have a connection to our servers.


It is now the morning of Tuesday, January 7, and still nothing.


Now:

January 7, 2025
Networking issues have been resolved. The data centre staff is finalizing our access.

Since then:
January 8, 2025
Most of our infrastructure is back online. Unfortunately, some issues with the network and specific virtual machines remain. Thus, the BOINC database node remains unavailable, and the website and forums also do not function properly.
The Sharcnet data center team is working to restore access to these instances in priority order.
Once this is resolved, we will have a smooth restart of the workunit management and BOINC components on the backend, and be able to isolate and diagnose any remaining issues as we restart.

And finally
January 9, 2025
BOINC database is up and in a good state. We are waiting on two more servers to regain access to the network, at which point we will be restarting the scheduler, transitioner, assimilators and validators.
All deadlines for outstanding MCM1 work units have been extended to just after 6:00 p.m. Eastern Standard Time on January 15th, 2025.
Web site is up; stats will be updated soon.
Forums are up.

One hour ago I got 50 new tasks, which struggled to download with everyone piling in, but after several retries I succeeded.
Hoping for a period of normality
ID: 111901 · Rating: 0
ServicEnginIC
Joined: 17 Mar 20
Posts: 2
Credit: 4,166,628
RAC: 163
Message 111902 - Posted: 11 Jan 2025, 10:58:16 UTC - in response to Message 111896.  

There was quite a conversation a while back about those transient errors. Someone found that if the line
128.95.160.156 boinc-files.bakerlab.org
is included in the /etc/hosts file that things work better. I don't know if it's still needed or not, I haven't removed it.

That remedy worked for me too.
It took effect immediately after editing the hosts file; no need to restart the computer or BOINC Manager.
Thank you very much!
ID: 111902 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2475
Credit: 46,506,558
RAC: 3,357
Message 111903 - Posted: 13 Jan 2025, 7:55:08 UTC - in response to Message 111901.  

January 9, 2025
BOINC database is up and in a good state. We are waiting on two more servers to regain access to the network, at which point we will be restarting the scheduler, transitioner, assimilators and validators.
All deadlines for outstanding MCM1 work units have been extended to just after 6:00 p.m. Eastern Standard Time on January 15th, 2025.
Web site is up; stats will be updated soon.
Forums are up.

One hour ago I got 50 new tasks, which struggled to download with everyone piling in, but after several retries I succeeded.
Hoping for a period of normality

And a final housekeeping update
January 10, 2025
All our servers are back online. We are downloading processed work units and sending out new ones, and they have the right deadlines (after the initial glitch with the time).
We have noticed 92,194 results that ended up in an error state. We saved all these in a file so we can repair this issue. We'll be able to rescue these, especially since there is no file deleter or db purge daemon running for now, so all data related to these workunits is still on the filesystem.
We will continue to monitor the system to make sure it is stable.
We are also working on getting MAM project into beta.
Thank you all for continuing to support research. Happy new year.

ID: 111903 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 2124
Credit: 12,428,047
RAC: 2,329
Message 111904 - Posted: 13 Jan 2025, 9:47:19 UTC - in response to Message 111903.  

We are also working on getting MAM project into beta.
Thank you all for continuing to support research. Happy new year.


What is MAM??
ID: 111904 · Rating: 0
kotenok2000
Joined: 22 Feb 11
Posts: 288
Credit: 540,373
RAC: 0
Message 111905 - Posted: 13 Jan 2025, 10:11:14 UTC

Mapping Arthritis Markers.
ID: 111905 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 2124
Credit: 12,428,047
RAC: 2,329
Message 111906 - Posted: 13 Jan 2025, 10:28:48 UTC - in response to Message 111905.  

Mapping Arthritis Markers.


Interesting!!
ID: 111906 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 2124
Credit: 12,428,047
RAC: 2,329
Message 111916 - Posted: 15 Jan 2025, 12:54:15 UTC

It's Wednesday.
And the servers are down.

As usual
ID: 111916 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1895
Credit: 18,534,891
RAC: 0
Message 111918 - Posted: 15 Jan 2025, 18:28:17 UTC - in response to Message 111916.  

It's Wednesday.
And the servers are down.
Just the usual one.
Grant
Darwin NT
ID: 111918 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2475
Credit: 46,506,558
RAC: 3,357
Message 111923 - Posted: 18 Jan 2025, 2:22:36 UTC - in response to Message 111918.  

It's Wednesday.
And the servers are down.
Just the usual one.

Not sure when, but boinc-server is back and some tasks keep sneaking through even though I never see any stock available on the server page or front page.
Luck of the draw if you see any.
ID: 111923 · Rating: 0
Anonymous
Joined: 21 Nov 08
Posts: 2
Credit: 1,107,476
RAC: 0
Message 111931 - Posted: 20 Jan 2025, 22:57:03 UTC - in response to Message 111902.  
Last modified: 20 Jan 2025, 22:59:04 UTC

There was quite a conversation a while back about those transient errors. Someone found that if the line
128.95.160.156 boinc-files.bakerlab.org
is included in the /etc/hosts file that things work better. I don't know if it's still needed or not, I haven't removed it.

That remedy worked for me too.
It took effect immediately after editing the hosts file; no need to restart the computer or BOINC Manager.
Thank you very much!



This also worked for me on Ubuntu. Thank you!
ID: 111931 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2475
Credit: 46,506,558
RAC: 3,357
Message 111934 - Posted: 21 Jan 2025, 6:37:42 UTC

With some of the recent robetta tasks, single decoys are running for 5 or 6 hours, so tasks are running short of the 8hr default.
With my 12hr runtime my tasks are mostly running just over 6hrs, with a few completing 2 decoys in about 10.5hrs.
Not ideal, but it is what it is.
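
A rough way to picture the arithmetic, as a Python sketch. It assumes a simplified model that is only inferred from the runtimes reported in this thread, not taken from the project's code: a task always finishes at least one decoy, and starts another only if one more decoy of the same length would still fit inside the target runtime. The name `simulate_task` and the per-decoy times passed to it are illustrative.

```python
def simulate_task(target_hours, decoy_hours):
    """Return (decoys_completed, runtime_hours) under the assumed model.

    Assumption: every decoy takes the same time, at least one decoy is
    always completed, and a new decoy is started only if it would finish
    within target_hours.
    """
    if decoy_hours <= 0:
        raise ValueError("decoy_hours must be positive")
    decoys = 1
    elapsed = decoy_hours
    # Keep going only while another full decoy still fits in the target.
    while elapsed + decoy_hours <= target_hours:
        decoys += 1
        elapsed += decoy_hours
    return decoys, elapsed


if __name__ == "__main__":
    for target, per_decoy in [(12, 5.25), (12, 6.1), (8, 5.5)]:
        decoys, runtime = simulate_task(target, per_decoy)
        print(f"target {target}h, decoy {per_decoy}h: "
              f"{decoys} decoy(s) in {runtime}h")
```

Under that assumption, a 12hr target with 5.25hr decoys gives 2 decoys in 10.5hrs, while decoys of just over 6hrs leave room for only one, so the task stops well short of its target.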
ID: 111934 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 2124
Credit: 12,428,047
RAC: 2,329
Message 111935 - Posted: 21 Jan 2025, 11:36:06 UTC - in response to Message 111934.  

With some of the recent robetta tasks, single decoys are running for 5 or 6 hours, so tasks are running short of the 8hr default.
With my 12hr runtime my tasks are mostly running just over 6hrs, with a few completing 2 decoys in about 10.5hrs.
Not ideal, but it is what it is.


I have 4hrs as my default running time.
Now I have some WUs running over 11hrs.
Big decoys!!
ID: 111935 · Rating: 0
Bill Swisher
Joined: 10 Jun 13
Posts: 81
Credit: 61,780,497
RAC: 16,820
Message 111936 - Posted: 21 Jan 2025, 16:26:36 UTC - in response to Message 111934.  

2 decoys

OK, I'll bite. What's a "decoy"? I've seen that term used around here in my results, something I seldom look at.
ID: 111936 · Rating: 0
kotenok2000
Joined: 22 Feb 11
Posts: 288
Credit: 540,373
RAC: 0
Message 111937 - Posted: 21 Jan 2025, 21:25:26 UTC - in response to Message 111936.  

I have found this: https://foldit.fandom.com/wiki/Decoy

ID: 111937 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2475
Credit: 46,506,558
RAC: 3,357
Message 111939 - Posted: 22 Jan 2025, 3:12:29 UTC - in response to Message 111937.  

I have found this: https://foldit.fandom.com/wiki/Decoy

That's most likely the correct answer.
But I used the term in line with what's shown in the Stderr output of the task detail, like here, to discover what was making the tasks run short.
At one time they accidentally imposed a maximum decoy limit which caused the task to terminate when it was reached (very short time for each decoy).
The decoy limit was quickly corrected, but sometimes those corrections get undone in a new batch of work.
But as it turned out, in this case the time for each decoy was too long, not too short.
The current rb tasks are running fine - 27 decoys and a full 12hrs for me again.
More a simple comment than a complaint.
ID: 111939 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 2124
Credit: 12,428,047
RAC: 2,329
Message 111942 - Posted: 22 Jan 2025, 10:34:49 UTC - in response to Message 111916.  

It's Wednesday.
And the servers are down.

As usual



It's Wednesday. Again.
ID: 111942 · Rating: 0



©2025 University of Washington
https://www.bakerlab.org