Problems and Technical Issues with Rosetta@home

Message boards : Number crunching : Problems and Technical Issues with Rosetta@home

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1725
Credit: 18,382,444
RAC: 19,446
Message 102667 - Posted: 18 Sep 2021, 6:05:58 UTC

Oh well, a few more hours & i'll be out of work again, even though this time there's still millions available.
Grant
Darwin NT

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1725
Credit: 18,382,444
RAC: 19,446
Message 102668 - Posted: 18 Sep 2021, 6:32:35 UTC - in response to Message 102667.  

Oh well, a few more hours & i'll be out of work again, even though this time there's still millions available.
I don't want to jinx things, but work appears to be flowing again.
Messages complaining about the lack of VirtualBox keep occurring, but at least I can get work again.
Grant
Darwin NT

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1725
Credit: 18,382,444
RAC: 19,446
Message 102669 - Posted: 18 Sep 2021, 6:45:49 UTC - in response to Message 102668.  

Oh well, a few more hours & i'll be out of work again, even though this time there's still millions available.
I don't want to jinx things, but work appears to be flowing again.
Messages complaining about the lack of VirtualBox keep occurring, but at least I can get work again.
I did jinx it.
Work has stopped flowing again.
Grant
Darwin NT

Bryn Mawr
Joined: 26 Dec 18
Posts: 399
Credit: 12,294,748
RAC: 6,222
Message 102673 - Posted: 18 Sep 2021, 8:35:38 UTC - in response to Message 102668.  

Oh well, a few more hours & i'll be out of work again, even though this time there's still millions available.
I don't want to jinx things, but work appears to be flowing again.
Messages complaining about the lack of VirtualBox keep occurring, but at least I can get work again.


I’m with you. I will not be running VirtualBox or the Python tasks, so if that screws up running normal Rosetta, so be it.

Sputnik
Joined: 7 Aug 17
Posts: 1
Credit: 9,868,633
RAC: 3,211
Message 102674 - Posted: 18 Sep 2021, 8:52:21 UTC - in response to Message 80621.  
Last modified: 18 Sep 2021, 9:03:55 UTC

Hi. I still get download errors for the Rosetta python projects 1.03 files, because BOINC Manager 7.16.11 (x64) reports checksum (MD5) errors for the co-loaded 2 GB file AIMNet_vm_v2.vdi:

18.09.2021 10:42:42 | Rosetta@home | Finished download of AIMNet_vm_v2.vdi
18.09.2021 10:43:23 | Rosetta@home | [error] MD5 check failed for AIMNet_vm_v2.vdi
18.09.2021 10:43:23 | Rosetta@home | [error] expected d41d8cd98f00b204e9800998ecf8427e, got 61fef19456bb58ec941845ef08d8c5ef
18.09.2021 10:43:23 | Rosetta@home | [error] Checksum or signature error for AIMNet_vm_v2.vdi
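
For anyone who wants to check this outside of BOINC, here is a minimal Python sketch (not something from the project) that hashes a local copy of the file and compares it with the two values in the log. The function names and the file path in the usage note are made up for illustration. One thing worth noting: d41d8cd98f00b204e9800998ecf8427e is the well-known MD5 of zero bytes of input, so the "expected" value in these messages looks like the hash of an empty file rather than of a 2 GB disk image. That would point at the server-side record rather than at the download, but this is an inference from the log, not something the project has confirmed.

```python
import hashlib

# Values copied from the event log above.
EXPECTED_MD5 = "d41d8cd98f00b204e9800998ecf8427e"  # what the server says it wants
REPORTED_MD5 = "61fef19456bb58ec941845ef08d8c5ef"  # what the client computed

# MD5 of zero bytes of input, computed locally for comparison.
EMPTY_MD5 = hashlib.md5(b"").hexdigest()


def md5_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so a 2 GB image does not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify(local_md5, expected_md5):
    """Return a short label for how the local hash relates to the expected one."""
    if local_md5 == expected_md5:
        return "match"
    if expected_md5 == EMPTY_MD5:
        # The expected value is the hash of empty input (hypothetical label).
        return "expected-is-empty-hash"
    return "mismatch"
```

Usage would be something like `classify(md5_of_file("/path/to/AIMNet_vm_v2.vdi"), EXPECTED_MD5)`, where the path is a placeholder for wherever your BOINC data directory keeps the Rosetta project files. If the local hash equals the "got" value on several different machines, the downloads are at least consistent with each other.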

How can I stop the download of this faulty co-loaded 2 GB file for the Python projects in the Rosetta project preferences (https://boinc.bakerlab.org/rosetta/prefs.php?subset=project)?
Or do I have to stop crunching for the whole Rosetta project (e.g. Rosetta 4.20)?

Oracle VirtualBox 6.1.26 is installed and running fine; I use it for LHC@home projects.


Thanks, Sputnik

winny33
Joined: 15 Sep 09
Posts: 1
Credit: 666,704
RAC: 2
Message 102679 - Posted: 18 Sep 2021, 11:12:59 UTC
Last modified: 18 Sep 2021, 11:15:37 UTC

Hello, I keep getting this error on all my machines.


18/09/2021 13:08:21 | Rosetta@home | [error] MD5 check failed for AIMNet_vm_v2.vdi
18/09/2021 13:08:21 | Rosetta@home | [error] expected d41d8cd98f00b204e9800998ecf8427e, got 61fef19456bb58ec941845ef08d8c5ef
18/09/2021 13:08:21 | Rosetta@home | [error] Checksum or signature error for AIMNet_vm_v2.vdi


<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error>
<file_name>AIMNet_vm_v2.vdi</file_name>
<error_code>-119 (md5 checksum failed for file)</error_code>
<error_message>MD5 check failed</error_message>
</file_xfer_error>
</message>
]]>

StarCastle
Joined: 25 Apr 20
Posts: 7
Credit: 1,025,975
RAC: 183
Message 102680 - Posted: 18 Sep 2021, 11:49:41 UTC

I am receiving a checksum error on the Python downloads. This started happening in the middle of last week.

Here is the event log for one of those downloads:


09/18/2021 07:13:49 | Rosetta@home | Started download of AIMNet_vm_v2.vdi
09/18/2021 07:13:49 | Rosetta@home | Started download of aaaf-mNMABU_pp-mPTAMBA_pp-NHM_pp-NMBEN3_pp_0.gz
09/18/2021 07:13:51 | Rosetta@home | Finished download of aaaf-mNMABU_pp-mPTAMBA_pp-NHM_pp-NMBEN3_pp_0.gz
09/18/2021 07:13:51 | Rosetta@home | Started download of aaaf-mNMALA_pp-PTAMBA_pp-TIC_pp-NMBEN3_pp_0.gz
09/18/2021 07:13:52 | Rosetta@home | Finished download of aaaf-mNMALA_pp-PTAMBA_pp-TIC_pp-NMBEN3_pp_0.gz
09/18/2021 07:13:52 | Rosetta@home | Started download of aaaf-mNMABU_pp-PTAMBA_pp-PIP_pp-NMBEN3_pp_0.gz
09/18/2021 07:13:53 | Rosetta@home | Finished download of aaaf-mNMABU_pp-PTAMBA_pp-PIP_pp-NMBEN3_pp_0.gz
09/18/2021 07:13:53 | Rosetta@home | Started download of aaaf-mNMABU-ACPC-ABU_pp-NMBEN3_pp_0.gz
09/18/2021 07:13:54 | Rosetta@home | Finished download of aaaf-mNMABU-ACPC-ABU_pp-NMBEN3_pp_0.gz
09/18/2021 07:13:54 | Rosetta@home | Started download of aaaf-mNMABU-ACPC-SER_pp-NMBEN3_pp_0.gz
09/18/2021 07:13:56 | Rosetta@home | Finished download of aaaf-mNMABU-ACPC-SER_pp-NMBEN3_pp_0.gz
09/18/2021 07:13:56 | Rosetta@home | Started download of aaaf-mNMABU-ACPC-TBA_pp-NMBEN3_pp_0.gz
09/18/2021 07:13:57 | Rosetta@home | Finished download of aaaf-mNMABU-ACPC-TBA_pp-NMBEN3_pp_0.gz
09/18/2021 07:13:57 | Rosetta@home | Started download of aaaf-mNMASN-ACPC_pp-PIP_pp-NMBEN3_pp_0.gz
09/18/2021 07:13:58 | Rosetta@home | Finished download of aaaf-mNMASN-ACPC_pp-PIP_pp-NMBEN3_pp_0.gz
09/18/2021 07:13:58 | Rosetta@home | Started download of aaaf-mNMASN-ACPC_pp-NMLEU_pp-NMBEN3_pp_0.gz
09/18/2021 07:13:59 | Rosetta@home | Finished download of aaaf-mNMASN-ACPC_pp-NMLEU_pp-NMBEN3_pp_0.gz
09/18/2021 07:21:59 | Rosetta@home | Finished download of AIMNet_vm_v2.vdi
09/18/2021 07:23:12 | Rosetta@home | [error] MD5 check failed for AIMNet_vm_v2.vdi
09/18/2021 07:23:12 | Rosetta@home | [error] expected d41d8cd98f00b204e9800998ecf8427e, got 61fef19456bb58ec941845ef08d8c5ef
09/18/2021 07:23:12 | Rosetta@home | [error] Checksum or signature error for AIMNet_vm_v2.vdi

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 102699 - Posted: 18 Sep 2021, 21:46:47 UTC

This is all very interesting, but the problems that I see here are all on Windows 10 machines. I will be able to start running them on Ubuntu 20.04.3 machines by Monday. If anyone is seeing problems on that, I would appreciate knowing about it.

My machines have enough memory to run four to six at a time. If there are any cache limitations on running that many, it would be interesting to know about that too. Good luck.

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1725
Credit: 18,382,444
RAC: 19,446
Message 102701 - Posted: 18 Sep 2021, 21:54:21 UTC

Event log no longer showing complaints about lack of VirtualBox, and system is now able to get work again.
Grant
Darwin NT

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 102716 - Posted: 19 Sep 2021, 9:45:14 UTC - in response to Message 102699.  

This is all very interesting, but the problems that I see here are all on Windows 10 machines. I will be able to start running them on Ubuntu 20.04.3 machines by Monday. If anyone is seeing problems on that, I would appreciate knowing about it.

My machines have enough memory to run four to six at a time. If there are any cache limitations on running that many, it would be interesting to know about that too. Good luck.



But it is not an issue for Linux machines because you're running the native environment of the task.
We Windows users are emulating Linux via a virtual machine, and that seems to be where the problem is.
The Virtual Disk Image file Checksum code that we get does not seem to match the code the server wants and then we get a code -119 MD5 checksum error.

I've been fortunate not to get that.
There is a guy in Italy in another thread with a Windows machine that gets nothing but these errors, no matter what we have tried. And it is only related to Python tasks, which are VM tasks.

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 102720 - Posted: 19 Sep 2021, 11:50:34 UTC - in response to Message 102716.  

But it is not an issue for Linux machines because you're running the native environment of the task.
We Windows users are emulating Linux via a virtual machine, and that seems to be where the problem is.
The Virtual Disk Image file Checksum code that we get does not seem to match the code the server wants and then we get a code -119 MD5 checksum error.

I have to run VirtualBox on my Ubuntu machines also to do the pythons.

But it does not always work quite the same way as on Windows. It is hopefully a small discrepancy, but we all need someone at Rosetta to look into it, and they never bother to even acknowledge problems. Maybe someone can get the attention of the Admin.

Nature Boy
Joined: 16 Jul 10
Posts: 1
Credit: 2,217,155
RAC: 0
Message 102727 - Posted: 19 Sep 2021, 14:21:11 UTC - in response to Message 102699.  

This is all very interesting, but the problems that I see here are all on Windows 10 machines. I will be able to start running them on Ubuntu 20.04.3 machines by Monday. If anyone is seeing problems on that, I would appreciate knowing about it.


Running Ubuntu 20.04 and the file keeps trying to download and failing. BOINC Manager 7.16.6. The file is attempting to download, again, as I type.

The latest failure part of the log:

Sun 19 Sep 2021 02:09:53 AM EDT | Rosetta@home | Finished download of AIMNet_vm_v2.vdi
Sun 19 Sep 2021 02:10:55 AM EDT | Rosetta@home | [error] MD5 check failed for AIMNet_vm_v2.vdi
Sun 19 Sep 2021 02:10:55 AM EDT | Rosetta@home | [error] expected d41d8cd98f00b204e9800998ecf8427e, got 61fef19456bb58ec941845ef08d8c5ef
Sun 19 Sep 2021 02:10:55 AM EDT | Rosetta@home | [error] Checksum or signature error for AIMNet_vm_v2.vdi

The first occurrence in my log is:

Fri 17 Sep 2021 08:26:54 AM EDT | Rosetta@home | Started download of AIMNet_vm_v2.vdi
Fri 17 Sep 2021 08:26:54 AM EDT | Rosetta@home | Started download of AIMNet_minimization_python_project.py
Fri 17 Sep 2021 08:26:56 AM EDT | Rosetta@home | Finished download of AIMNet_minimization_python_project.py
Fri 17 Sep 2021 08:26:56 AM EDT | Rosetta@home | Started download of aaaf-PTAMBA-mTBG_pp-NMABU_pp-NMBEN3_pp_1.gz
Fri 17 Sep 2021 08:26:59 AM EDT | Rosetta@home | Finished download of aaaf-PTAMBA-mTBG_pp-NMABU_pp-NMBEN3_pp_1.gz
Fri 17 Sep 2021 09:24:18 AM EDT | Rosetta@home | Starting task degrader_site_1tfq_plait_-1.5_bcov_25.hbnet_5_SAVE_ALL_OUT_IGNORE_THE_REST_5bw7nj6v_1731293_5_0
Fri 17 Sep 2021 09:24:22 AM EDT | Rosetta@home | Starting task 5nvx_graft_buwei_xad_SAVE_ALL_OUT_IGNORE_THE_REST_9mp6yp0h_1731808_3_0
Fri 17 Sep 2021 10:19:13 AM EDT | Rosetta@home | Finished download of AIMNet_vm_v2.vdi
Fri 17 Sep 2021 10:20:12 AM EDT | Rosetta@home | [error] MD5 check failed for AIMNet_vm_v2.vdi
Fri 17 Sep 2021 10:20:12 AM EDT | Rosetta@home | [error] expected d41d8cd98f00b204e9800998ecf8427e, got 61fef19456bb58ec941845ef08d8c5ef
Fri 17 Sep 2021 10:20:12 AM EDT | Rosetta@home | [error] Checksum or signature error for AIMNet_vm_v2.vdi
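
Since several people are pasting the same three error lines, here is a rough Python sketch for pulling the file name and the two hashes out of a saved event log, so reports from different machines can be compared. It is only an illustration: the function name is made up, and the patterns are based solely on the message text shown in this thread, so other BOINC versions may word things differently.

```python
import re

# Patterns based on the "[error]" lines quoted in this thread.
FAILED_RE = re.compile(r"\[error\] MD5 check failed for (\S+)")
HASHES_RE = re.compile(r"\[error\] expected ([0-9a-f]{32}), got ([0-9a-f]{32})")

# Three lines copied from the log above.
SAMPLE_LOG = [
    "Fri 17 Sep 2021 10:20:12 AM EDT | Rosetta@home | [error] MD5 check failed for AIMNet_vm_v2.vdi",
    "Fri 17 Sep 2021 10:20:12 AM EDT | Rosetta@home | [error] expected d41d8cd98f00b204e9800998ecf8427e, got 61fef19456bb58ec941845ef08d8c5ef",
    "Fri 17 Sep 2021 10:20:12 AM EDT | Rosetta@home | [error] Checksum or signature error for AIMNet_vm_v2.vdi",
]


def md5_failures(lines):
    """Return (file_name, expected_md5, got_md5) for each failure in the log."""
    failures = []
    current_file = None
    for line in lines:
        failed = FAILED_RE.search(line)
        if failed:
            # Remember the file name until the matching "expected ..., got ..." line.
            current_file = failed.group(1)
            continue
        hashes = HASHES_RE.search(line)
        if hashes and current_file is not None:
            failures.append((current_file, hashes.group(1), hashes.group(2)))
            current_file = None
    return failures
```

Calling `md5_failures(SAMPLE_LOG)` gives one entry for AIMNet_vm_v2.vdi with the expected and computed hashes. The date prefix is ignored on purpose, because the posts in this thread use at least three different date formats.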

How can I make it stop? I have been a long time supporter of RH, but will disconnect from the project if it keeps wasting my bandwidth with 2GB failed downloads.

Dr Who Fan
Joined: 28 May 06
Posts: 79
Credit: 273,880
RAC: 243
Message 102728 - Posted: 19 Sep 2021, 15:58:59 UTC - in response to Message 102727.  

How can I make it stop? I have been a long time supporter of RH, but will disconnect from the project if it keeps wasting my bandwidth with 2GB failed downloads.

Temporarily STOP REQUESTING NEW WORK FOR THE PROJECT by setting NO NEW WORK for Rosetta on your PC(s) in BOINC until the project fixes the problem.
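
For machines without the Manager GUI, the same thing can be done from the command line with boinccmd, which has `nomorework` and `allowmorework` project operations. A minimal Python sketch that builds and runs that command is below. The function names are made up, and the project URL is an assumption taken from the preferences link earlier in this thread; it has to match the URL your client is attached with (`boinccmd --get_project_status` lists it).

```python
import subprocess

# Assumed project URL; check it against what your own client reports.
ROSETTA_URL = "https://boinc.bakerlab.org/rosetta/"


def no_new_work_command(project_url=ROSETTA_URL, allow=False):
    """Build the boinccmd argument list to stop (or resume) fetching new work."""
    operation = "allowmorework" if allow else "nomorework"
    return ["boinccmd", "--project", project_url, operation]


def set_no_new_work(project_url=ROSETTA_URL, allow=False):
    """Run boinccmd; raises CalledProcessError if boinccmd exits with an error."""
    return subprocess.run(no_new_work_command(project_url, allow), check=True)
```

`set_no_new_work()` stops new work requests for Rosetta only, and `set_no_new_work(allow=True)` turns them back on once the file is fixed. Tasks already downloaded keep running either way.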


Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 102730 - Posted: 19 Sep 2021, 18:00:05 UTC - in response to Message 102720.  

But it is not an issue for Linux machines because you're running the native environment of the task.
We Windows users are emulating Linux via a virtual machine, and that seems to be where the problem is.
The Virtual Disk Image file Checksum code that we get does not seem to match the code the server wants and then we get a code -119 MD5 checksum error.

I have to run VirtualBox on my Ubuntu machines also to do the pythons.

But it does not always work quite the same way as on Windows. It is hopefully a small discrepancy, but we all need someone at Rosetta to look into it, and they never bother to even acknowledge problems. Maybe someone can get the attention of the Admin.



Sid Celery is already sending emails to Baker Lab, but they are a Monday-to-Friday, office-hours-only project.
There is no dedicated IT person who monitors the forums, at least not that we have seen.
The only way they know there is a problem is if the researcher on this specific project sees a lot of failures in his data. Other than that, you're SOL.

You were in a conversation with Mikey who showed you how to isolate python tasks so you could limit the cores. The only thing further (which i don't know the commands for) is to "block" the specific task type. Maybe Mikey can help you with that in that other thread.

You're not alone with this MD5 issue; a Windows user in Italy is also having the exact same problem.
So far everything we know to try to make this disappear has failed. So it's something back in Seattle on their system that is causing this. Now why some Windows machines and not others is a mystery, because I am cranking through these just fine.

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 102733 - Posted: 19 Sep 2021, 20:06:20 UTC - in response to Message 102730.  

You were in a conversation with Mikey who showed you how to isolate python tasks so you could limit the cores. The only thing further (which i don't know the commands for) is to "block" the specific task type. Maybe Mikey can help you with that in that other thread.

You're not alone with this MD5 issue; a Windows user in Italy is also having the exact same problem.
So far everything we know to try to make this disappear has failed. So it's something back in Seattle on their system that is causing this. Now why some Windows machines and not others is a mystery, because I am cranking through these just fine.

I am not sure what you are referring to on MW. If it is "max concurrent" (or "project max concurrent") in an app_config, that causes a problem with excessive downloads. I have posted on it already here.

But that appears to be irrelevant to the present MD5 issue anyway. And I am seeing the same thing on Ubuntu as the Windows users, just a slightly different message.
https://boinc.bakerlab.org/rosetta/results.php?hostid=6143731&offset=0&show_names=0&state=6&appid=

I am trying a new version of BOINC. I was on 7.16.11, the released version from Ubuntu. Now it is 7.16.17 from Locutus-of-Borg.
That solved a problem on QuChemPedIA, but not necessarily here.

Seattle certainly needs to fix it.

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 102737 - Posted: 19 Sep 2021, 21:52:58 UTC - in response to Message 102733.  

I am trying a new version of BOINC. I was on 7.16.11, the released version from Ubuntu. Now it is 7.16.17 from Locutus-of-Borg.
That solved a problem on QuChemPedIA, but not necessarily here.

The new BOINC 7.16.17 didn't fix it. It is something they have to do.

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 102739 - Posted: 19 Sep 2021, 23:35:36 UTC - in response to Message 102733.  

You were in a conversation with Mikey who showed you how to isolate python tasks so you could limit the cores. The only thing further (which i don't know the commands for) is to "block" the specific task type. Maybe Mikey can help you with that in that other thread.

You're not alone with this MD5 issue; a Windows user in Italy is also having the exact same problem.
So far everything we know to try to make this disappear has failed. So it's something back in Seattle on their system that is causing this. Now why some Windows machines and not others is a mystery, because I am cranking through these just fine.

I am not sure what you are referring to on MW. If it is "max concurrent" (or "project max concurrent") in an app_config, that causes a problem with excessive downloads. I have posted on it already here.

But that appears to be irrelevant to the present MD5 issue anyway. And I am seeing the same thing on Ubuntu as the Windows users, just a slightly different message.
https://boinc.bakerlab.org/rosetta/results.php?hostid=6143731&offset=0&show_names=0&state=6&appid=

I am trying a new version of BOINC. I was on 7.16.11, the released version from Ubuntu. Now it is 7.16.17 from Locutus-of-Borg.
That solved a problem on QuChemPedIA, but not necessarily here.

Seattle certainly needs to fix it.


But it is weird, MD5 shows up on your machine and the Italian machine, but not on mine and not that many others are talking about it. But is it all python tasks or just certain specific strings that are affected?

Tomcat雄猫
Joined: 20 Dec 14
Posts: 180
Credit: 5,386,173
RAC: 0
Message 102742 - Posted: 19 Sep 2021, 23:51:09 UTC - in response to Message 102739.  
Last modified: 19 Sep 2021, 23:56:11 UTC

I also have the MD5 issue on Rosetta Python tasks. Presently it's been a 100% fail rate, error while downloading.

aaaf-IDC_pp-FPR_pp-mNHM_pp-NMBEN3_pp_0_1737815_2_1
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error>
  <file_name>AIMNet_vm_v2.vdi</file_name>
  <error_code>-119 (md5 checksum failed for file)</error_code>
  <error_message>MD5 check failed</error_message>
</file_xfer_error>
</message>
]]>


aaaf-IDC_pp-SAR-AIB_pp-NMBEN3_pp_0_1737353_4_0
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error>
  <file_name>AIMNet_vm_v2.vdi</file_name>
  <error_code>-119 (md5 checksum failed for file)</error_code>
  <error_message>MD5 check failed</error_message>
</file_xfer_error>
</message>
]]>


aaaf-mAZE_pp-NMALA_pp-mNMVAL_pp-NMBEN3_pp_0_1736941_7_0
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error>
  <file_name>AIMNet_vm_v2.vdi</file_name>
  <error_code>-119 (md5 checksum failed for file)</error_code>
  <error_message>MD5 check failed</error_message>
</file_xfer_error>
</message>
]]>


I see you have had good luck with these Python tasks, what's your vBox setup?

Tomcat雄猫
Joined: 20 Dec 14
Posts: 180
Credit: 5,386,173
RAC: 0
Message 102743 - Posted: 20 Sep 2021, 0:00:12 UTC - in response to Message 102742.  
Last modified: 20 Sep 2021, 0:18:45 UTC

Good lord, I just checked my VirtualBox and all the failed tasks appear to still be in there.

I'm using 6.1.12, the version supplied on the BOINC website, with the Oracle VM VirtualBox Extension Pack installed on Windows 10. Virtualization extensions have been enabled in the BIOS.
Should I use the latest version (6.1.26) or the one supplied by BOINC? What about extensions, should I install any?

Pegasus
Joined: 22 Oct 06
Posts: 5
Credit: 6,837,701
RAC: 0
Message 102745 - Posted: 20 Sep 2021, 0:52:37 UTC

I received an odd message today:

Rosetta@home: Notice from server
VirtualBox is not installed
9/19/2021 6:01:50 PM

It is correct that I do not have VirtualBox installed, and I have not used VirtualBox to run BOINC programs.

©2024 University of Washington
https://www.bakerlab.org