Problems and Technical Issues with Rosetta@home

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 104538 - Posted: 27 Jan 2022, 22:28:21 UTC - in response to Message 104535.  

Well taking the 1050 out did not solve the problem.
So I guess it goes to my original theory that the drive is defective.
Going to finish off the last work and take the drive out and see what happens on reinstall.
Did you have the drive in M2 slot 2 when you tried a single GPU? Otherwise I'm thinking it might be trying to take the GPU's lanes.

Actually I think it won't make any difference removing a GPU. Two GPUs use 2x8 lanes. One GPU uses 1x16 lanes. There is no way your board and CPU are designed so they can't cope with one GPU and one M2 drive. If you have the drive in slot 2 and it fails, it's busted. Add in you not being able to flash it. Something funny going on with that drive. I got a Crucial SSD drive once (just on SATA) and it suddenly failed to be recognised. "Known fault, flash it" they said. I couldn't flash what wasn't detected. Back it went.



I'm done with all this hardware swap nonsense. Going to box it up when things are done and swap it out.



Now I can't install BOINC mgr because D does not exist. I used revo uninstall to take the program out and ran wise365 to clean everything. But BOINC keeps telling me D does not exist and can not go further.
But windows does not see D anymore, so what now?!?!
Thinking the M.2 drive thing was a big mistake.
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,020,126
RAC: 16,456
Message 104539 - Posted: 27 Jan 2022, 23:17:10 UTC - in response to Message 104538.  

Now I can't install BOINC mgr because D does not exist. I used revo uninstall to take the program out and ran wise365 to clean everything. But BOINC keeps telling me D does not exist and can not go further.
But windows does not see D anymore, so what now?!?!
Thinking the M.2 drive thing was a big mistake.
That's odd, you should be asked where to install during the installation of Boinc. Make sure you're clicking "advanced".

I think you got a faulty drive. Find somewhere local that can sell you a tested one.
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 104542 - Posted: 28 Jan 2022, 6:53:15 UTC - in response to Message 104539.  

Now I can't install BOINC mgr because D does not exist. I used revo uninstall to take the program out and ran wise365 to clean everything. But BOINC keeps telling me D does not exist and can not go further.
But windows does not see D anymore, so what now?!?!
Thinking the M.2 drive thing was a big mistake.
That's odd, you should be asked where to install during the installation of Boinc. Make sure you're clicking "advanced".

I think you got a faulty drive. Find somewhere local that can sell you a tested one.


The drive is out.
BOINC was removed.
The registry was cleaned with wise365 and CCleaner.
But the installer seems to think drive D still exists.

It looks like the only way around this is to put the drive back, see what happens, and then, if it's happy, put the data onto E, which is the HDD.
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 104549 - Posted: 28 Jan 2022, 19:02:11 UTC
Last modified: 28 Jan 2022, 20:01:30 UTC

It's the freaking M.2 drive.
Data is now on E: on the HDD and things work.
So either the M.2 is defective or there are not enough PCIe lanes available.
Taking the drive back tomorrow since I can't flash the firmware.
This is just nuts!

So here is what I find on PCIe lanes:

Expansions

In addition to the x4 lanes that are reserved for the chipset, the Ryzen 7 3700X has x16 for a discrete graphics processor and x4 for storage (NVMe or 2 ports SATA Express).

Expansion Options
PCIe Revision: 4.0
Max Lanes: 20
Configuration: 1x16+x4, 2x8+x4, 1x8+2x4+x4

But also I find:
The number of PCIe lanes on AMD Ryzen CPUs ranges from 24 on consumer-grade Ryzen 3, 5, 7 and 9 CPUs all the way to 128 on workstation-grade Ryzen Threadripper CPUs.

The popular desktop Ryzen series, i.e. 3000 and 5000, have 24 lanes.

So we get back to: there should be 4 available for the M.2.
But M.2 and BOINC, at least here, don't get along (possibly because of a defective drive).

And I find this about the drive: SSD-interface PCI Express 3.0 x4

So it must be something wrong with the drive because this should work with my system.

2 x 8 for GPU
4 for M.2
4 for the chipset link

But someone at work was saying that SATA also affects the PCIe lane count?
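
A quick tally of the figures quoted above, as a minimal sketch. The 24-lane total and the per-device allocations are the numbers from this post, not measured values, and the function and dictionary names are made up for illustration.

```python
# Lane figures are the ones quoted in the post above, not measurements.
CPU_LANES = 24  # total quoted for desktop Ryzen 3000/5000 parts


def lanes_left(total, allocations):
    """Return the lanes remaining after subtracting each named allocation."""
    used = sum(allocations.values())
    if used > total:
        raise ValueError(f"over-subscribed: {used} lanes requested, {total} available")
    return total - used


# Two GPUs at x8 each, one x4 NVMe drive, and the x4 link to the chipset.
two_gpus_plus_m2 = {"gpu_1": 8, "gpu_2": 8, "m2_nvme": 4, "chipset_link": 4}

print(lanes_left(CPU_LANES, two_gpus_plus_m2))  # prints 0
```

On paper the tally comes out exactly even, so the M.2 slot should still get its 4 lanes with both GPUs installed. Whether SATA ports share lanes with an M.2 slot depends on the motherboard, so the board manual is the place to check that.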
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,020,126
RAC: 16,456
Message 104550 - Posted: 28 Jan 2022, 20:00:42 UTC - in response to Message 104542.  

Now I can't install BOINC mgr because D does not exist. I used revo uninstall to take the program out and ran wise365 to clean everything. But BOINC keeps telling me D does not exist and can not go further.
But windows does not see D anymore, so what now?!?!
Thinking the M.2 drive thing was a big mistake.
That's odd, you should be asked where to install during the installation of Boinc. Make sure you're clicking "advanced".

I think you got a faulty drive. Find somewhere local that can sell you a tested one.


The drive is out.
BOINC was removed.
The registry was cleaned with wise365 and CCleaner.
But the installer seems to think drive D still exists.

It looks like the only way around this is to put the drive back, see what happens, and then, if it's happy, put the data onto E, which is the HDD.
I don't understand this. Boinc puts some basic stuff into C:. But the majority goes where you tell it to. You should have been able to select E:.
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,020,126
RAC: 16,456
Message 104551 - Posted: 28 Jan 2022, 20:01:36 UTC - in response to Message 104549.  

It's the freaking M.2 drive.
Data is now on E: on the HDD and things work.
So either the M.2 is defective or there are not enough PCIe lanes available.
Taking the drive back tomorrow since I can't flash the firmware.
This is just nuts!
I'm sure they wouldn't design a board and CPU which can't handle one GPU and one M2 drive. I'm sure it's a broken drive.
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 104552 - Posted: 28 Jan 2022, 21:17:00 UTC - in response to Message 104550.  

Now I can't install BOINC mgr because D does not exist. I used revo uninstall to take the program out and ran wise365 to clean everything. But BOINC keeps telling me D does not exist and can not go further.
But windows does not see D anymore, so what now?!?!
Thinking the M.2 drive thing was a big mistake.
That's odd, you should be asked where to install during the installation of Boinc. Make sure you're clicking "advanced".

I think you got a faulty drive. Find somewhere local that can sell you a tested one.


The drive is out.
BOINC was removed.
The registry was cleaned with wise365 and CCleaner.
But the installer seems to think drive D still exists.

It looks like the only way around this is to put the drive back, see what happens, and then, if it's happy, put the data onto E, which is the HDD.
I don't understand this. Boinc puts some basic stuff into C:. But the majority goes where you tell it to. You should have been able to select E:.


Because the main section is a program. It falls under program files in windows.
The other stuff is data, you can put that anywhere.
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 104553 - Posted: 28 Jan 2022, 21:22:08 UTC - in response to Message 104551.  
Last modified: 28 Jan 2022, 21:23:14 UTC

It's the freaking M.2 drive.
Data is now on E: on the HDD and things work.
So either the M.2 is defective or there are not enough PCIe lanes available.
Taking the drive back tomorrow since I can't flash the firmware.
This is just nuts!
I'm sure they wouldn't design a board and CPU which can't handle one GPU and one M2 drive. I'm sure it's a broken drive.



24 lanes.
2 x 8 for GPU
4 for the chipset link
4 should be open for the 4-lane M.2.

Then I am getting my money's worth out of the gear (16 cores, 2 GPUs and a dedicated storage drive). Everything is used to its max potential then.

If I could ever find another 1080 at the low price this server guy was offering a few years back it would be great. But the 1050 does fine, so I might as well use it. Again, I'm not in this game to compete against the big machines. It's just an average machine that can do lots of things: crunch, browse, play back DVDs (need to fire up Metallica's newest release with the San Fran orchestra), CDs or YouTube. I am now at the point where I have spent enough money on BOINC-related stuff, so I might as well use it until it becomes outdated or dies.
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,020,126
RAC: 16,456
Message 104554 - Posted: 28 Jan 2022, 22:16:36 UTC - in response to Message 104552.  

I don't understand this. Boinc puts some basic stuff into C:. But the majority goes where you tell it to. You should have been able to select E:.
Because the main section is a program. It falls under program files in windows.
The other stuff is data, you can put that anywhere.
Correct, so you should have been installing to C and E. Why was it looking on D?
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1697
Credit: 18,164,734
RAC: 24,334
Message 104555 - Posted: 28 Jan 2022, 22:44:07 UTC - in response to Message 104552.  

Because the main section is a program. It falls under program files in windows.
Only if you choose to install it there.
When you install BOINC, instead of going with the defaults, you select the Advanced option.
It allows you to select which drive the programme is installed on, and which drive the data is kept on.

https://boinc.mundayweb.com/wiki/index.php?title=How_to_set_up_BOINC_on_another_drive



Grant
Darwin NT
rjzak
Joined: 3 Jan 13
Posts: 2
Credit: 2,290,836
RAC: 0
Message 104556 - Posted: 28 Jan 2022, 22:58:43 UTC

I recently installed VirtualBox, since it came to a point where jobs wouldn't download unless VirtualBox was installed. Now the VBox python jobs get stuck: a job ends up running for a day or two at 99%. The worst was a job that sat at 99.999% for 8 days, then was "aborted by project". The use of VirtualBox seems to just add problems, in my limited experience. I'll save logs the next time this happens.

OS: Ubuntu 20.04.3 LTS
Kernel: 5.4.0
Boinc: 7.16.6 from the Ubuntu repo.
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 104557 - Posted: 28 Jan 2022, 23:54:31 UTC - in response to Message 104555.  
Last modified: 28 Jan 2022, 23:55:07 UTC

Because the main section is a program. It falls under program files in windows.
Only if you choose to install it there.
When you install BOINC, instead of going with the defaults, you select the Advanced option.
It allows you to select which drive the programme is installed on, and which drive the data is kept on.

https://boinc.mundayweb.com/wiki/index.php?title=How_to_set_up_BOINC_on_another_drive





I used to split it up. But then with the 250GB and 4.2 on the SSD and all the other stuff on the HDD, I just left it all on the SSD. But then python blew up that idea. For now I am SSD main and HDD for data, and python can use as much of the leftover space on the 1TB HDD as it wants until I get a working M.2. If the next M.2 fails then it just stays with SSD and HDD.
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,020,126
RAC: 16,456
Message 104563 - Posted: 29 Jan 2022, 18:42:03 UTC - in response to Message 104557.  

I used to split it up. But then with the 250GB and 4.2 on the SSD and all the other stuff on the HDD, I just left it all on the SSD. But then python blew up that idea. For now I am SSD main and HDD for data, and python can use as much of the leftover space on the 1TB HDD as it wants until I get a working M.2. If the next M.2 fails then it just stays with SSD and HDD.
You could always get a SATA SSD. You said you had a couple of ports left?
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 104564 - Posted: 30 Jan 2022, 0:10:40 UTC - in response to Message 104563.  
Last modified: 30 Jan 2022, 0:11:49 UTC

I used to split it up. But then with the 250GB and 4.2 on the SSD and all the other stuff on the HDD, I just left it all on the SSD. But then python blew up that idea. For now I am SSD main and HDD for data, and python can use as much of the leftover space on the 1TB HDD as it wants until I get a working M.2. If the next M.2 fails then it just stays with SSD and HDD.
You could always get a SATA SSD. You said you had a couple of ports left?


Just swapped out the M2 for a new one.
Been playing catch up using the HDD.
Since I am shutting down now, I'll put in the new M2 when I get up.
Stay tuned for any new drama.

And yeah, I could go for another SATA SSD if this fails.

Oh BTW Grant, I already knew the advanced thing.
But thanks anyway.
Falconet
Joined: 9 Mar 09
Posts: 354
Credit: 1,248,587
RAC: 2,162
Message 104566 - Posted: 30 Jan 2022, 8:25:46 UTC

Looks like a small batch of 600,000 Rosetta 4.2 tasks.
Got na10v7nt on my computers.
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 104567 - Posted: 30 Jan 2022, 11:51:30 UTC
Last modified: 30 Jan 2022, 12:12:29 UTC

New drive, same problems as before.
Taking it out and swapping it for a SATA drive, I guess.
Waiting on Samsung now.

You guys have not had problems with M.2 drives and BOINC, have you?
kotenok2000
Joined: 22 Feb 11
Posts: 264
Credit: 507,897
RAC: 1,097
Message 104568 - Posted: 30 Jan 2022, 13:22:12 UTC
Last modified: 30 Jan 2022, 13:32:22 UTC

I have this in app_config.xml.
Why do I still get and run Rosetta 4.20 work units?
<app_config>
   <app>
      <name>rosetta</name>
      <max_concurrent>0</max_concurrent>
   </app>
   <app>
      <name>rosetta_python_projects</name>
      <max_concurrent>2</max_concurrent>
   </app>
</app_config>
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,020,126
RAC: 16,456
Message 104569 - Posted: 30 Jan 2022, 18:03:39 UTC - in response to Message 104567.  

New drive, same problems as before.
Taking it out and swapping it for a SATA drive, I guess.
Waiting on Samsung now.

You guys have not had problems with M.2 drives and BOINC, have you?
It's possible Boinc doesn't like them, or maybe just your specific hardware combination. I found it very fussy a few years ago when trying to get multiple graphics cards to work, but I think I ended up blaming that on worn out power connectors. They corrode eventually with all that current.

I've never actually tried an M2 drive, since I hadn't heard of them before my last SSD purchase. Feel free to post me yours :-)
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 104570 - Posted: 30 Jan 2022, 18:04:34 UTC - in response to Message 104568.  

I have this in app_config.xml
why do i still get and run rosetta 4.20 workunits?
<app_config>
   <app>
      <name>rosetta</name>
      <max_concurrent>0</max_concurrent>
   </app>
   <app>
      <name>rosetta_python_projects</name>
      <max_concurrent>2</max_concurrent>
   </app>
</app_config>


Because the project is configured so that everything is covered under "rosetta", and it does not distinguish between 4.2 and python. Separating them would require too much work on their end.
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,020,126
RAC: 16,456
Message 104571 - Posted: 30 Jan 2022, 18:06:44 UTC - in response to Message 104568.  
Last modified: 30 Jan 2022, 18:08:36 UTC

I have this in app_config.xml
why do i still get and run rosetta 4.20 workunits?
<app_config>
   <app>
      <name>rosetta</name>
      <max_concurrent>0</max_concurrent>
   </app>
   <app>
      <name>rosetta_python_projects</name>
      <max_concurrent>2</max_concurrent>
   </app>
</app_config>
Unfortunately that's not seen by the server; those files just tell your own computer what CPU/GPU to direct them to and how many to run at once.

I can't seem to find an option in the server settings to do that; they must have forgotten. I notice you can turn off the VB jobs, but not the other way round!
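
To see what that file actually tells the local client, here is a minimal sketch that reads the same app_config.xml and lists the per-app limits. The helper names are hypothetical, and the reading of 0 as "no limit" rather than "run none" is an assumption about how the BOINC client treats the value, so check it against the client documentation.

```python
import xml.etree.ElementTree as ET

# The app_config.xml from the post above, with the indentation tidied.
APP_CONFIG = """\
<app_config>
   <app>
      <name>rosetta</name>
      <max_concurrent>0</max_concurrent>
   </app>
   <app>
      <name>rosetta_python_projects</name>
      <max_concurrent>2</max_concurrent>
   </app>
</app_config>
"""


def concurrency_limits(xml_text):
    """Map each app name to its max_concurrent value (None if the tag is absent)."""
    root = ET.fromstring(xml_text)
    limits = {}
    for app in root.findall("app"):
        name = (app.findtext("name") or "").strip()
        raw = app.findtext("max_concurrent")
        limits[name] = int(raw) if raw is not None else None
    return limits


def describe(limit):
    # Assumption: the client reads 0 (or a missing value) as "no limit".
    if not limit:
        return "no limit (assumed meaning of 0 or unset)"
    return f"at most {limit} running at once"


for name, limit in concurrency_limits(APP_CONFIG).items():
    print(f"{name}: {describe(limit)}")
```

Either way, these settings only shape how the client runs tasks it already has. They do not change which applications the server sends work for.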


©2024 University of Washington
https://www.bakerlab.org