Moderators Contact Point (Explanations, Assistance etc) Post here!

Message boards : Cafe Rosetta : Moderators Contact Point (Explanations, Assistance etc) Post here!

Greg_BE
Joined: 30 May 06
Posts: 4871
Credit: 3,948,842
RAC: 2,879
Message 90114 - Posted: 31 Dec 2018, 13:32:46 UTC - in response to Message 90102.  

Hey Mod,

Why is there never any news on the home page or in the technical page?
Who was it, DVK or DEK, who used to do tech for you guys and write when there was a problem?
Who's the tech guy now, and why does he not write something?
This non-communication is what drives me nuts with this project.
I know it's not your fault and that Dr. B does not have time to write about tech stuff, but someone needs to put up info about system issues. We should not have to go hunting all over the place for news.
I looked here as a last resort.

One other little thing... why is there no "create a new thread" button on the technical section page?

Happy New Year when it arrives.
ID: 90114 · Rating: 0

Mod.Sense
Volunteer moderator
Project administrator
Joined: 22 Aug 06
Posts: 3690
Credit: 0
RAC: 0
Message 90115 - Posted: 31 Dec 2018, 20:54:19 UTC - in response to Message 90114.  

The Technical News section is for administrators to post to. That is why you do not see an option to create a new thread on that message board.

I don't have any information on workflow. Much of it is automated systems. Each researcher submits their own tasks when they are ready to analyze the results. Since I would imagine many of the researchers took the holidays off, they were not in a position to perform that follow-up.
Rosetta Moderator: Mod.Sense
ID: 90115 · Rating: 0

Greg_BE
Joined: 30 May 06
Posts: 4871
Credit: 3,948,842
RAC: 2,879
Message 90125 - Posted: 1 Jan 2019, 22:22:26 UTC - in response to Message 90115.  

Oh, right about the technical thing.
I was thinking of other forums where you can post questions about workflow issues and such.

Well, thanks Mod., but let the people at the lab know we need better communication, because the work unit issue has been going on for a long time. My stats dropped really badly and I thought it was other projects gulping up time, but then I saw in my status log what was happening.

It's just a persistent problem with this project that there is very little communication from the lab about various technical issues.

Anyway, thanks for the explanation.
ID: 90125 · Rating: 0

mixtrak
Joined: 26 Apr 12
Posts: 1
Credit: 8,179
RAC: 0
Message 91160 - Posted: 25 Sep 2019, 7:50:29 UTC

Please delete my account and all associated information from your systems as far as practicable and email me with confirmation.

An option for the users to do this themselves would be appreciated.
ID: 91160 · Rating: 0

Mod.Sense
Volunteer moderator
Project administrator
Joined: 22 Aug 06
Posts: 3690
Credit: 0
RAC: 0
Message 91359 - Posted: 12 Nov 2019, 16:31:06 UTC - in response to Message 91160.  
Last modified: 12 Nov 2019, 16:31:26 UTC

Please delete my account and all associated information from your systems as far as practicable and email me with confirmation.

An option for the users to do this themselves would be appreciated.


https://boinc.bakerlab.org/rosetta/forum_thread.php?id=3291&postid=31231
Rosetta Moderator: Mod.Sense
ID: 91359 · Rating: 0

SirMark
Joined: 15 Jul 19
Posts: 1
Credit: 101,812
RAC: 3,138
Message 91878 - Posted: 5 Mar 2020, 23:01:04 UTC

I am not receiving any Android-related jobs for Rosetta. I am running the BOINC client available from the Google Play store and have successfully attached to the project. The Rosetta statistics page shows my Android devices correctly, along with the last communication to the server; however, no jobs are sent. Is there anything else I need to do?
ID: 91878 · Rating: 0

Mod.Sense
Volunteer moderator
Project administrator
Joined: 22 Aug 06
Posts: 3690
Credit: 0
RAC: 0
Message 91894 - Posted: 7 Mar 2020, 22:09:29 UTC

Nothing else to do on your end. The project has not been creating Android work units recently.
Rosetta Moderator: Mod.Sense
ID: 91894 · Rating: 0

Jim1348
Joined: 19 Jan 06
Posts: 390
Credit: 14,135,942
RAC: 55,543
Message 91898 - Posted: 8 Mar 2020, 12:48:23 UTC

Is Rosetta aware of the stalled downloads?
http://boinc.bakerlab.org/rosetta/forum_thread.php?id=13519

I see newly issued ones every day or two, indicating that they are not. It has been going on for weeks.
ID: 91898 · Rating: 0

Mod.Sense
Volunteer moderator
Project administrator
Joined: 22 Aug 06
Posts: 3690
Credit: 0
RAC: 0
Message 91941 - Posted: 11 Mar 2020, 14:04:25 UTC

I've sent an email to David Kim and asked that he look into it, pointing out that the hung downloads are preventing machines from getting work so they can resume processing.
Rosetta Moderator: Mod.Sense
ID: 91941 · Rating: 0

Jim1348
Joined: 19 Jan 06
Posts: 390
Credit: 14,135,942
RAC: 55,543
Message 91942 - Posted: 11 Mar 2020, 14:37:44 UTC - in response to Message 91941.  

Thanks. It will affect their output at some point.
ID: 91942 · Rating: 0

robertmiles
Joined: 16 Jun 08
Posts: 735
Credit: 9,979,591
RAC: 4,035
Message 92052 - Posted: 18 Mar 2020, 12:37:52 UTC

Another repeated failure of a rather small download:

3/18/2020 5:30:34 AM | Rosetta@home | Scheduler request completed: got 1 new tasks
3/18/2020 5:30:36 AM | Rosetta@home | Started download of 11v2nmgb_c17434_11mer_gb_001197.zip
3/18/2020 5:30:36 AM | Rosetta@home | Started download of 11v2nmgb_c17434_11mer_gb_001197.flags
3/18/2020 5:30:38 AM | Rosetta@home | Finished download of 11v2nmgb_c17434_11mer_gb_001197.flags
3/18/2020 5:35:43 AM | | Project communication failed: attempting access to reference site
3/18/2020 5:35:43 AM | Rosetta@home | Temporarily failed download of 11v2nmgb_c17434_11mer_gb_001197.zip: transient HTTP error
3/18/2020 5:35:43 AM | Rosetta@home | Backing off 00:03:54 on download of 11v2nmgb_c17434_11mer_gb_001197.zip
3/18/2020 5:35:45 AM | | Internet access OK - project servers may be temporarily down.

[skipping entries for a few attempts]

3/18/2020 6:05:39 AM | Rosetta@home | Started download of 11v2nmgb_c17434_11mer_gb_001197.zip
3/18/2020 6:10:45 AM | Rosetta@home | Temporarily failed download of 11v2nmgb_c17434_11mer_gb_001197.zip: transient HTTP error
3/18/2020 6:10:45 AM | Rosetta@home | Backing off 00:31:20 on download of 11v2nmgb_c17434_11mer_gb_001197.zip
3/18/2020 6:10:46 AM | | Project communication failed: attempting access to reference site
3/18/2020 6:10:48 AM | | Internet access OK - project servers may be temporarily down.
3/18/2020 6:42:06 AM | Rosetta@home | Started download of 11v2nmgb_c17434_11mer_gb_001197.zip
3/18/2020 6:47:14 AM | Rosetta@home | Temporarily failed download of 11v2nmgb_c17434_11mer_gb_001197.zip: transient HTTP error
3/18/2020 6:47:14 AM | Rosetta@home | Backing off 00:59:31 on download of 11v2nmgb_c17434_11mer_gb_001197.zip
3/18/2020 6:47:15 AM | | Project communication failed: attempting access to reference site
3/18/2020 6:47:17 AM | | Internet access OK - project servers may be temporarily down.

This is another case of attempting to download a file larger than 2 KB but smaller than 4 KB, getting it in blocks of about 1 KB, but with the last block (less than 1 KB) repeatedly failing to download. For me, this is always a *.zip file.

I aborted the download, since it would otherwise block downloading any other tasks from Rosetta@Home.
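For anyone who wants to tally failures like these from their own event log, here is a minimal sketch. It assumes the "timestamp | project | message" layout and the exact message wording seen in the excerpt above; other client versions may word things differently. The name `summarize_failures` is made up for illustration and is not part of BOINC.

```python
import re
from collections import defaultdict

# Lines copied from the log excerpt above.
SAMPLE_LOG = """\
3/18/2020 5:35:43 AM | Rosetta@home | Temporarily failed download of 11v2nmgb_c17434_11mer_gb_001197.zip: transient HTTP error
3/18/2020 5:35:43 AM | Rosetta@home | Backing off 00:03:54 on download of 11v2nmgb_c17434_11mer_gb_001197.zip
3/18/2020 5:35:45 AM | | Internet access OK - project servers may be temporarily down.
3/18/2020 6:10:45 AM | Rosetta@home | Temporarily failed download of 11v2nmgb_c17434_11mer_gb_001197.zip: transient HTTP error
3/18/2020 6:10:45 AM | Rosetta@home | Backing off 00:31:20 on download of 11v2nmgb_c17434_11mer_gb_001197.zip
"""

# Message formats as they appear in the excerpt (an assumption for other versions).
FAILED = re.compile(r"Temporarily failed download of (\S+): (.+)$")
BACKOFF = re.compile(r"Backing off (\d+):(\d+):(\d+) on download of (\S+)$")


def summarize_failures(log_text):
    """Count failed downloads and collect backoff delays (in seconds) per file."""
    summary = defaultdict(lambda: {"failures": 0, "backoff_seconds": []})
    for line in log_text.splitlines():
        # Each entry is "timestamp | project | message"; skip anything else.
        parts = line.split("|", 2)
        if len(parts) != 3:
            continue
        message = parts[2].strip()
        failed = FAILED.search(message)
        if failed:
            summary[failed.group(1)]["failures"] += 1
            continue
        backoff = BACKOFF.search(message)
        if backoff:
            hours, minutes, seconds = (int(backoff.group(i)) for i in (1, 2, 3))
            summary[backoff.group(4)]["backoff_seconds"].append(
                hours * 3600 + minutes * 60 + seconds
            )
    return dict(summary)


if __name__ == "__main__":
    for name, info in summarize_failures(SAMPLE_LOG).items():
        print(name, info)
```

Run against a saved copy of the event log, this gives one entry per file with the number of failed attempts and the backoff delays the client chose, which makes it easier to show the project how long a given file has been stuck.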
ID: 92052 · Rating: 0

robertmiles
Joined: 16 Jun 08
Posts: 735
Credit: 9,979,591
RAC: 4,035
Message 92145 - Posted: 22 Mar 2020, 23:32:10 UTC
Last modified: 23 Mar 2020, 0:09:12 UTC

Another repeatedly stalled SMALL download - 1.20 of 2.62 KB this time.

3/22/2020 3:39:46 PM | Rosetta@home | Sending scheduler request: To report completed tasks.
3/22/2020 3:39:46 PM | Rosetta@home | Reporting 2 completed tasks
3/22/2020 3:39:46 PM | Rosetta@home | Requesting new tasks for CPU
3/22/2020 3:39:48 PM | Rosetta@home | Scheduler request completed: got 3 new tasks
3/22/2020 3:39:50 PM | Rosetta@home | Started download of 8v2nm_gb_c601_8mer_gb_005028.zip
3/22/2020 3:39:50 PM | Rosetta@home | Started download of 8v2nm_gb_c601_8mer_gb_005028.flags
3/22/2020 3:39:52 PM | Rosetta@home | Finished download of 8v2nm_gb_c601_8mer_gb_005028.flags
3/22/2020 3:39:52 PM | Rosetta@home | Started download of 2fo1ke1p_3h3_design3_COVID-19.zip
3/22/2020 3:39:53 PM | Rosetta@home | Finished download of 2fo1ke1p_3h3_design3_COVID-19.zip
3/22/2020 3:39:53 PM | Rosetta@home | Started download of 2fo1ke1p_3h3_design3_COVID-19.flags
3/22/2020 3:39:54 PM | Rosetta@home | Finished download of 2fo1ke1p_3h3_design3_COVID-19.flags
3/22/2020 3:39:54 PM | Rosetta@home | Started download of 10v2nmgb_c18957_10mer_gb_000051.zip
3/22/2020 3:39:55 PM | Rosetta@home | Finished download of 10v2nmgb_c18957_10mer_gb_000051.zip
3/22/2020 3:39:55 PM | Rosetta@home | Started download of 10v2nmgb_c18957_10mer_gb_000051.flags
3/22/2020 3:39:56 PM | Rosetta@home | Finished download of 10v2nmgb_c18957_10mer_gb_000051.flags
3/22/2020 3:44:57 PM | Rosetta@home | Temporarily failed download of 8v2nm_gb_c601_8mer_gb_005028.zip: transient HTTP error
3/22/2020 3:44:57 PM | Rosetta@home | Backing off 00:02:22 on download of 8v2nm_gb_c601_8mer_gb_005028.zip
3/22/2020 3:44:58 PM | | Project communication failed: attempting access to reference site
3/22/2020 3:44:59 PM | | Internet access OK - project servers may be temporarily down.

[snip]

3/22/2020 5:59:29 PM | Rosetta@home | Started download of 8v2nm_gb_c601_8mer_gb_005028.zip
3/22/2020 6:04:36 PM | | Project communication failed: attempting access to reference site
3/22/2020 6:04:36 PM | Rosetta@home | Temporarily failed download of 8v2nm_gb_c601_8mer_gb_005028.zip: transient HTTP error
3/22/2020 6:04:36 PM | Rosetta@home | Backing off 01:42:36 on download of 8v2nm_gb_c601_8mer_gb_005028.zip
3/22/2020 6:04:38 PM | | Internet access OK - project servers may be temporarily down.


Speed of last attempt was shown as 1.01 KBps, but I don't trust that to be accurate. My internet connection can handle 3 MBps.

Using Windows 10. Downloads of much larger files don't show this problem, even if they start and finish after the stalled download starts.

Under Tools, I clicked on Retry pending Transfers, but that doesn't seem to help.

I aborted this stalled download, then tried repeatedly to Update R@H, but every attempt shows this in the log:

3/22/2020 6:18:00 PM | Rosetta@home | Not requesting tasks: some download is stalled.

Is there some way to tell the BOINC Manager that since the task has already been reported as failing due to a failed download, it is no longer useful to keep trying to download this file?

Is there some way to tell the BOINC Manager to show more details about why the download attempts failed?

Is someone at the project trying to reduce the amount of output from tasks to have less to look through?
ID: 92145 · Rating: 0

robertmiles
Joined: 16 Jun 08
Posts: 735
Credit: 9,979,591
RAC: 4,035
Message 92146 - Posted: 23 Mar 2020, 0:05:58 UTC
Last modified: 23 Mar 2020, 0:10:13 UTC

More Update attempts.

I'm no longer getting the stalled download messages in the log; I'm getting these instead:

3/22/2020 6:55:24 PM | Rosetta@home | Not requesting tasks: don't need (CPU: not highest priority project; NVIDIA GPU: job cache full)

I currently have 5 COVID-19 tasks; so not all of the time is wasted.

The project with download failures had the server disk full; now corrected enough that those files have downloaded.
ID: 92146 · Rating: 0

Aveena
New member
Joined: 26 Mar 20
Posts: 2
Credit: 3,377
RAC: 205
Message 92342 - Posted: 26 Mar 2020, 14:39:10 UTC

Hello!

I work for a top-50 Fortune 500 company. I am working with a global team to deploy Rosetta@home on our lab servers, potentially putting hundreds of high-spec servers to work number crunching.

Our lab admins have some concerns regarding man-in-the-middle attacks, and I was wondering if you could provide some guidance on setting up BOINC with rosetta@home on a secure HTTPS connection.

I was unable to find any log entries to show what ports are being utilized, and I was wondering if you had more in-depth details on the application itself.

Initially I was looking at folding@home, but it uses exclusively HTTP traffic on port 80 and 8080.

I would appreciate it if you could assist us on this matter so that we can put some significant power into this project.

Cheers,
Martin
ID: 92342 · Rating: 0

xotwod
New member
Joined: 23 Mar 20
Posts: 3
Credit: 34,057
RAC: 1,904
Message 92370 - Posted: 27 Mar 2020, 0:05:06 UTC

Updates are needed to a few pages:

On https://boinc.bakerlab.org/rosetta/stats.php there is a link titled "BOINC Statistics for the WORLD! developed by Zain Upton". This links to a spam site; the domain must have expired.

Same thing on https://boinc.bakerlab.org/rosetta/home.php , "BOINC Statistics for the WORLD!" links to that same spam site.

Fixing this would be much appreciated :)
ID: 92370 · Rating: 0

Mod.Sense
Volunteer moderator
Project administrator
Joined: 22 Aug 06
Posts: 3690
Credit: 0
RAC: 0
Message 92408 - Posted: 27 Mar 2020, 18:56:50 UTC - in response to Message 92342.  

The executables are all signature verified using BOINC code signing.

I was not clear what you are getting at with the rest. I believe all of the HTTP downloads and server interactions are on port 80. The BOINC client does use other ports locally on the PC to interact with the application threads running there.
Rosetta Moderator: Mod.Sense
ID: 92408 · Rating: 0

xotwod
New member
Joined: 23 Mar 20
Posts: 3
Credit: 34,057
RAC: 1,904
Message 92433 - Posted: 28 Mar 2020, 0:49:05 UTC - in response to Message 92408.  

The executables are all signature verified using BOINC code signing.

I was not clear what you are getting at with the rest. I believe all of the HTTP downloads and server interactions are on port 80. The BOINC client does use other ports locally on the PC to interact with the application threads running there.


I think they were referring to the fact that the file transfers aren't TLS-encrypted.
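To make the distinction concrete, here is a small sketch that reports whether a given URL would use an encrypted connection and which port it implies. It only inspects the URL text and does not contact any server. The helper name `describe_url` is hypothetical, and the two URLs in the usage section are simply links that were posted earlier in this thread.

```python
from urllib.parse import urlsplit

# Standard default ports for the two schemes discussed in this thread.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_url(url):
    """Return host, effective port, and whether the scheme is TLS-encrypted."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # An explicit port in the URL wins; otherwise fall back to the scheme default.
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return {"host": parts.hostname, "port": port, "encrypted": scheme == "https"}


if __name__ == "__main__":
    for url in (
        "http://boinc.bakerlab.org/rosetta/forum_thread.php?id=13519",
        "https://boinc.bakerlab.org/rosetta/forum_thread.php?id=3291&postid=31231",
    ):
        print(describe_url(url))
```

Note that this says nothing about what the BOINC client actually negotiates for file transfers; it only shows what each URL scheme implies.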
ID: 92433 · Rating: 0

xotwod
New member
Joined: 23 Mar 20
Posts: 3
Credit: 34,057
RAC: 1,904
Message 92484 - Posted: 28 Mar 2020, 19:59:19 UTC
Last modified: 28 Mar 2020, 19:59:36 UTC

I have noticed that a few tasks have disappeared from my statistics page in the Valid and Error section - I'm not concerned about points or anything, I'm just wondering if this is a bug and if not, I'm curious as to what's happening.

Also, what happens to work units that error twice? Are they just done forever?

Thanks :)
ID: 92484 · Rating: 0

Mod.Sense
Volunteer moderator
Project administrator
Joined: 22 Aug 06
Posts: 3690
Credit: 0
RAC: 0
Message 92488 - Posted: 28 Mar 2020, 20:22:20 UTC - in response to Message 92484.  

I have noticed that a few tasks have disappeared from my statistics page in the Valid and Error section - I'm not concerned about points or anything, I'm just wondering if this is a bug and if not, I'm curious as to what's happening.


They are not kept forever. The points add up, but the specifics on the tasks are removed after a few days.

Also, what happens to work units that error twice? Are they just done forever?


When a work unit fails to run properly and is reported back to the project as failing, the project server queues it up to send to another host... maybe it will run better there. R@h has things set so that this is as far as it goes. A given WU will not be sent to more than two machines, regardless of the outcome on that second machine.

The way most WUs work at R@h, there is a pool of tasks created for a given project. Each task in that pool will produce many models. The only difference between WUs is a randomized starting configuration of the protein. The researcher typically needs between 10,000 and 100,000 models to feel they have rather completely covered the possibilities. If a few work units fall out, or are never reported back, etc. it does not really impact the overall work effort.
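The two points above can be sketched in a few lines. This is only an illustration of the rule as described in this post, not the project's actual server code: the names `MAX_HOSTS_PER_WU`, `should_reissue`, and `models_collected` are made up, and the task counts in the usage section are hypothetical numbers chosen to land inside the 10,000 to 100,000 model range mentioned above.

```python
# From the description above: a work unit goes to at most two machines.
MAX_HOSTS_PER_WU = 2


def should_reissue(hosts_tried, succeeded):
    """Reissue only after a failure, and only if the host limit is not reached."""
    return (not succeeded) and hosts_tried < MAX_HOSTS_PER_WU


def models_collected(tasks_issued, tasks_lost, models_per_task):
    """Total models returned when some tasks fail twice or never report back."""
    return (tasks_issued - tasks_lost) * models_per_task


if __name__ == "__main__":
    print(should_reissue(hosts_tried=1, succeeded=False))  # first failure
    print(should_reissue(hosts_tried=2, succeeded=False))  # second failure
    # Hypothetical pool: 1000 tasks, 50 models each, 20 tasks lost.
    print(models_collected(tasks_issued=1000, tasks_lost=20, models_per_task=50))
```

With those hypothetical numbers, losing 20 of 1000 tasks still leaves 49,000 models, which is why a few dropped work units do not really affect the overall effort.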
Rosetta Moderator: Mod.Sense
ID: 92488 · Rating: 0

©2020 University of Washington
http://www.bakerlab.org