Problems and Technical Issues with Rosetta@home

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 104553 - Posted: 28 Jan 2022, 21:22:08 UTC - in response to Message 104551.  
Last modified: 28 Jan 2022, 21:23:14 UTC

It's the freaking M.2 drive.
Data is now on E: on the HDD and things work.
So either the M.2 is defective or there are not enough PCIe lanes available.
Taking the drive back tomorrow since I can't flash the firmware.
This is just nuts!
I'm sure they wouldn't design a board and CPU which can't handle one GPU and one M2 drive. I'm sure it's a broken drive.



24 lanes.
2 x 8 for the GPUs
4 for CPU
4 should be open for the 4-lane M.2.
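
The lane (channel) budget above can be checked with a quick sum. This is only a minimal sketch using the figures quoted in this post (24 in total, two GPUs at x8 each, 4 listed for the CPU, 4 for the M.2 drive). The names are hypothetical, and real boards can share or disable lanes in ways this ignores, so the board manual is the authority.

```python
# Hypothetical lane budget built from the numbers quoted in the post.
TOTAL_LANES = 24

allocations = {
    "gpu_1": 8,
    "gpu_2": 8,
    "cpu_link": 4,  # the "4 for CPU" entry from the post
    "m2_drive": 4,
}


def lanes_remaining(total, used):
    """Return lanes left after all allocations (negative means oversubscribed)."""
    return total - sum(used.values())


remaining = lanes_remaining(TOTAL_LANES, allocations)
print(f"used={sum(allocations.values())} remaining={remaining}")
```

On these numbers the budget comes out exactly full, which is consistent with the drive having the lanes it needs on paper.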

Then I am getting my money's worth out of the gear (16 cores, 2 GPUs and a dedicated storage drive). Everything is used to its max potential then.

If I could ever find another 1080 at the low price this server guy was offering a few years back it would be great. But the 1050 does fine, so might as well use it. Again, not in this game to compete against the big machines. Just an average machine that can do lots of things: crunch, browse, play back DVDs (need to fire up Metallica's newest release with the San Francisco orchestra), CDs or YouTube. I am now at the point where I have spent enough money on BOINC related stuff, so might as well use it until it becomes outdated or dies.
ID: 104553
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 104554 - Posted: 28 Jan 2022, 22:16:36 UTC - in response to Message 104552.  

I don't understand this. Boinc puts some basic stuff into C:. But the majority goes where you tell it to. You should have been able to select E:.
Because the main section is a program. It falls under Program Files in Windows.
The other stuff is data, you can put that anywhere.
Correct, so you should have been installing to C and E. Why was it looking on D?
ID: 104554
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1725
Credit: 18,380,064
RAC: 20,136
Message 104555 - Posted: 28 Jan 2022, 22:44:07 UTC - in response to Message 104552.  

Because the main section is a program. It falls under Program Files in Windows.
Only if you choose to install it there.
When you install BOINC, instead of going with the defaults, you select the Advanced option.
It allows you to select which drive the programme is installed on, and which drive the data is kept on.

https://boinc.mundayweb.com/wiki/index.php?title=How_to_set_up_BOINC_on_another_drive



Grant
Darwin NT
ID: 104555
rjzak
Joined: 3 Jan 13
Posts: 2
Credit: 2,290,836
RAC: 0
Message 104556 - Posted: 28 Jan 2022, 22:58:43 UTC

I recently installed VirtualBox, since it came to a point where jobs wouldn't download unless VirtualBox was installed. Now the VBox python jobs get stuck: a job ends up running for a day or two, stuck at 99%. The worst was an instance when a job was at 99.999% for 8 days, then was "aborted by project". The use of VirtualBox seems to just add problems, in my limited experience. I'll save logs the next time this happens.

OS: Ubuntu 20.04.3 LTS
Kernel: 5.4.0
Boinc: 7.16.6 from the Ubuntu repo.
ID: 104556
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 104557 - Posted: 28 Jan 2022, 23:54:31 UTC - in response to Message 104555.  
Last modified: 28 Jan 2022, 23:55:07 UTC

Because the main section is a program. It falls under Program Files in Windows.
Only if you choose to install it there.
When you install BOINC, instead of going with the defaults, you select the Advanced option.
It allows you to select which drive the programme is installed on, and which drive the data is kept on.

https://boinc.mundayweb.com/wiki/index.php?title=How_to_set_up_BOINC_on_another_drive





I used to split it up. But then with the 250GB and 4.2 on the SSD and all the other stuff on the HDD, I just left it all on the SSD. But then python blew up that idea. For now I am SSD main and HDD for data, and python can use as much of the leftover space on the 1TB HDD as it wants until I get a working M.2. If the next M.2 fails then it just stays with SSD and HDD.
ID: 104557
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 104563 - Posted: 29 Jan 2022, 18:42:03 UTC - in response to Message 104557.  

I used to split it up. But then with the 250GB and 4.2 on the SSD and all the other stuff on the HDD, I just left it all on the SSD. But then python blew up that idea. For now I am SSD main and HDD for data, and python can use as much of the leftover space on the 1TB HDD as it wants until I get a working M.2. If the next M.2 fails then it just stays with SSD and HDD.
You could always get a SATA SSD. You said you had a couple of ports left?
ID: 104563
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 104564 - Posted: 30 Jan 2022, 0:10:40 UTC - in response to Message 104563.  
Last modified: 30 Jan 2022, 0:11:49 UTC

I used to split it up. But then with the 250GB and 4.2 on the SSD and all the other stuff on the HDD, I just left it all on the SSD. But then python blew up that idea. For now I am SSD main and HDD for data, and python can use as much of the leftover space on the 1TB HDD as it wants until I get a working M.2. If the next M.2 fails then it just stays with SSD and HDD.
You could always get a SATA SSD. You said you had a couple of ports left?


Just swapped out the M2 for a new one.
Been playing catch up using the HDD.
Since I am shutting down now, I'll put in the new M2 when I get up.
Stay tuned for any new drama.

And yeah, I could go for another SATA SSD if this fails.

Oh BTW Grant, I already knew the advanced thing.
But thanks anyway.
ID: 104564
Falconet
Joined: 9 Mar 09
Posts: 354
Credit: 1,276,393
RAC: 2,018
Message 104566 - Posted: 30 Jan 2022, 8:25:46 UTC

Looks like a small batch of 600,000 Rosetta 4.2 tasks.
Got na10v7nt on my computers.
ID: 104566
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 104567 - Posted: 30 Jan 2022, 11:51:30 UTC
Last modified: 30 Jan 2022, 12:12:29 UTC

New drive, same problems as before.
Taking it out and swapping it for a SATA drive, I guess.
Waiting on Samsung now.

You guys have not had problems with M2's and BOINC have you?
ID: 104567
kotenok2000
Joined: 22 Feb 11
Posts: 272
Credit: 507,897
RAC: 334
Message 104568 - Posted: 30 Jan 2022, 13:22:12 UTC
Last modified: 30 Jan 2022, 13:32:22 UTC

I have this in app_config.xml.
Why do I still get and run Rosetta 4.20 workunits?
<app_config>

   <app>
      <name>rosetta</name>
      
    <max_concurrent>0</max_concurrent>
      
    </app>
	
	   <app>
      <name>rosetta_python_projects</name>
       <max_concurrent>2</max_concurrent>
     
      
    </app>

	</app_config>
ID: 104568
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 104569 - Posted: 30 Jan 2022, 18:03:39 UTC - in response to Message 104567.  

New drive, same problems as before.
Taking it out and swapping it for a SATA drive, I guess.
Waiting on Samsung now.

You guys have not had problems with M2's and BOINC have you?
It's possible Boinc doesn't like them, or maybe just your specific hardware combination. I found it very fussy a few years ago when trying to get multiple graphics cards to work, but I think I ended up blaming that on worn out power connectors. They corrode eventually with all that current.

I've never actually tried an M2 drive, since I hadn't heard of them before my last SSD purchase. Feel free to post me yours :-)
ID: 104569
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 104570 - Posted: 30 Jan 2022, 18:04:34 UTC - in response to Message 104568.  

I have this in app_config.xml.
Why do I still get and run Rosetta 4.20 workunits?
<app_config>

   <app>
      <name>rosetta</name>
      
    <max_concurrent>0</max_concurrent>
      
    </app>
	
	   <app>
      <name>rosetta_python_projects</name>
       <max_concurrent>2</max_concurrent>
     
      
    </app>

	</app_config>


Because the project is configured to be covered under "rosetta" and does not distinguish between 4.2 and python. That would require too much work on their end.
ID: 104570
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 104571 - Posted: 30 Jan 2022, 18:06:44 UTC - in response to Message 104568.  
Last modified: 30 Jan 2022, 18:08:36 UTC

I have this in app_config.xml.
Why do I still get and run Rosetta 4.20 workunits?
<app_config>

   <app>
      <name>rosetta</name>
      
    <max_concurrent>0</max_concurrent>
      
    </app>
	
	   <app>
      <name>rosetta_python_projects</name>
       <max_concurrent>2</max_concurrent>
     
      
    </app>

	</app_config>
Unfortunately that's not seen by the server. Those files just tell your own computer which CPU/GPU to direct tasks to and how many to run at once.

I can't seem to find an option in the server settings to do that, they must have forgotten.... I notice you can turn off the VB jobs, but not the other way round!
ID: 104571
kotenok2000
Joined: 22 Feb 11
Posts: 272
Credit: 507,897
RAC: 334
Message 104572 - Posted: 30 Jan 2022, 18:08:20 UTC - in response to Message 104571.  

I have <max_concurrent>0</max_concurrent> for rosetta app, but boinc still computes them.
ID: 104572
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 104573 - Posted: 30 Jan 2022, 18:10:07 UTC - in response to Message 104572.  
Last modified: 30 Jan 2022, 18:20:49 UTC

I have <max_concurrent>0</max_concurrent> for rosetta app, but boinc still computes them.
It's possible there's no zero option. Zero often means no limit in Boinc.

Also check Messages tab in Boinctasks or Event Log in Boinc Manager in case there's an error in your file. It looks ok to me, although the indenting is all over the place, but I think it ignores that. Mine would look like this:


<app_config>
   <app>
      <name>rosetta</name>
      <max_concurrent>0</max_concurrent>
   </app>
   <app>
      <name>rosetta_python_projects</name>
      <max_concurrent>2</max_concurrent>
   </app>
</app_config>
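
A quick way to see what the file actually says, independent of BOINC, is to parse it and list each app's limit. This is a minimal sketch using Python's standard XML parser on the file contents shown above; the function names are hypothetical, and it only flags a 0 value for a closer look. It does not confirm how the BOINC client treats 0.

```python
import xml.etree.ElementTree as ET

# Same content as the tidy app_config.xml above.
APP_CONFIG = """<app_config>
   <app>
      <name>rosetta</name>
      <max_concurrent>0</max_concurrent>
   </app>
   <app>
      <name>rosetta_python_projects</name>
      <max_concurrent>2</max_concurrent>
   </app>
</app_config>"""


def read_limits(xml_text):
    """Return {app name: max_concurrent} for every <app> entry."""
    root = ET.fromstring(xml_text)
    limits = {}
    for app in root.findall("app"):
        name = app.findtext("name", default="").strip()
        # A missing <max_concurrent> is recorded as 0 in this sketch.
        limit = int(app.findtext("max_concurrent", default="0").strip())
        limits[name] = limit
    return limits


def zero_limit_apps(limits):
    """Apps whose limit is 0, which may mean 'no limit' rather than 'none'."""
    return [name for name, limit in limits.items() if limit == 0]


limits = read_limits(APP_CONFIG)
print(limits)
print("check these:", zero_limit_apps(limits))
```

The parser ignores indentation, so a file with messy whitespace gives the same result as a tidy one, provided the tags are properly nested.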
ID: 104573
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1725
Credit: 18,380,064
RAC: 20,136
Message 104575 - Posted: 30 Jan 2022, 18:20:13 UTC - in response to Message 104567.  

You guys have not had problems with M2's and BOINC have you?
Both my systems have M2 drives and they are no different to other drives.
You physically install them.
Install the driver if Windows doesn't do it.
Use Disk Management to partition it, then format it.
Once it is visible in Explorer with its own drive letter, it's ready to use just like any other drive.
Grant
Darwin NT
ID: 104575
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 104576 - Posted: 30 Jan 2022, 18:30:30 UTC - in response to Message 104570.  
Last modified: 30 Jan 2022, 18:30:59 UTC

I have this in app_config.xml.
Why do I still get and run Rosetta 4.20 workunits?
<app_config>

   <app>
      <name>rosetta</name>
      
    <max_concurrent>0</max_concurrent>
      
    </app>
	
	   <app>
      <name>rosetta_python_projects</name>
       <max_concurrent>2</max_concurrent>
     
      
    </app>

	</app_config>


Because the project is configured to be covered under "rosetta" and does not distinguish between 4.2 and python. That would require too much work on their end.
Actually it does. To find these names, I paste in something with the wrong <name> (e.g. an app_config.xml copied from another project), then re-read the config files and check the Messages tab / Event Log. I got this:

Your app_config.xml file refers to an unknown application 'milkyway'. Known applications: 'rosetta_python_projects', 'rosetta'
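
If you want to pull the names out of that message rather than read them by eye, something like this works. It is a minimal sketch that assumes the message always has the shape shown above (a "Known applications:" marker followed by single-quoted names); the function name is hypothetical and the format is taken from this one log line only.

```python
import re

# The event log line quoted above.
MESSAGE = (
    "Your app_config.xml file refers to an unknown application 'milkyway'. "
    "Known applications: 'rosetta_python_projects', 'rosetta'"
)


def known_applications(message):
    """Return the quoted names listed after 'Known applications:' (empty if absent)."""
    marker = "Known applications:"
    if marker not in message:
        return []
    tail = message.split(marker, 1)[1]
    # Each application name is wrapped in single quotes.
    return re.findall(r"'([^']+)'", tail)


print(known_applications(MESSAGE))
```

Those are the exact strings to use in the <name> tags of app_config.xml.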
ID: 104576
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 104577 - Posted: 30 Jan 2022, 18:30:30 UTC - in response to Message 104575.  

You guys have not had problems with M2's and BOINC have you?
Both my systems have M2 drives and they are no different to other drives.
You physically install them.
Install the driver if Windows doesn't do it.
Use Disk Management to partition it, then format it.
Once it is visible in Explorer with its own drive letter, it's ready to use just like any other drive.



I installed it, partitioned it and BOINC says to hell with your GPUs.
I put the data on C: (SATA SSD) or E: (HDD) and everything is happy.
Outside of BOINC it works fine in Windows.
ID: 104577
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 104578 - Posted: 30 Jan 2022, 18:33:18 UTC - in response to Message 104577.  
Last modified: 30 Jan 2022, 18:33:46 UTC

You guys have not had problems with M2's and BOINC have you?
Both my systems have M2 drives and they are no different to other drives.
You physically install them.
Install the driver if Windows doesn't do it.
Use Disk Management to partition it, then format it.
Once it is visible in Explorer with its own drive letter, it's ready to use just like any other drive.



I installed it, partitioned it and BOINC says to hell with your GPUs.
I put the data on C: (SATA SSD) or E: (HDD) and everything is happy.
Outside of BOINC it works fine in Windows.
My vote is now for: you've found one of the billion weird bugs in Boinc's sloppy programming. I doubt they'll ever sort it, since it appears to only happen once in a blue moon. You must have chosen the set of hardware that confuses it. You're writing to the disk and you're using Folding@home on the GPUs, so I can't see how your hardware can be blamed.
ID: 104578
kotenok2000
Joined: 22 Feb 11
Posts: 272
Credit: 507,897
RAC: 334
Message 104579 - Posted: 30 Jan 2022, 18:42:05 UTC - in response to Message 104576.  
Last modified: 30 Jan 2022, 18:44:25 UTC

I have <max_concurrent>0</max_concurrent> for rosetta app, but boinc still computes them.
It's possible there's no zero option. Zero often means no limit in Boinc.

Also check Messages tab in Boinctasks or Event Log in Boinc Manager in case there's an error in your file. It looks ok to me, although the indenting is all over the place, but I think it ignores that. Mine would look like this:


<app_config>
   <app>
      <name>rosetta</name>
      <max_concurrent>0</max_concurrent>
   </app>
   <app>
      <name>rosetta_python_projects</name>
      <max_concurrent>2</max_concurrent>
   </app>
</app_config>

I have created a GitHub issue.
ID: 104579



©2024 University of Washington
https://www.bakerlab.org