Posts by BoredEEdude

1) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 105113)
Posted 21 Feb 2022 by BoredEEdude
Post:
@ BoredEEdude
Try this
...
Use no more than - 500 GB . . don't worry about setting this BIG. even if you set this bigger than the disk , it works.
Leave at least ## GB free . . [untick this box not needed]
Use no more than ## % of total . . [untick this box not needed]
try it , see what happens , pythons are a pain.

I changed to using just the "Leave at least _____ GB free" setting about 30 minutes before reading your post about using just the "Use no more than _____ % of total" setting.

Well, by the next day the BOINC client was only running 3-4 python tasks with about 12 more waiting to run, even though the CPU was almost down to idle and there was plenty of free memory. Those 15 tasks were also the only tasks in the queue, and requesting an update didn't get more tasks. The same server-side error about low disk space had also shown up again. The day before, with the new settings in place, disk usage had gotten up to about 218 GB, then it fell back down overnight to around 70 GB as tasks were completed.

So I unchecked the "Leave at least 50 GB free" setting,
and used .clair.'s suggestion of just "Use no more than 500 GB" instead.


Since the client had not started any of the existing tasks to use the available free memory and CPUs, I then restarted BOINC Manager. After the restart, about 20 new tasks were downloaded immediately, and in short order 10 python tasks were up and running at the same time.

Now to wait and see if this new setting is more stable for me over time.
2) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 105092)
Posted 21 Feb 2022 by BoredEEdude
Post:
@ BoredEEdude
Try this
...
Use no more than - 500 GB . . don't worry about setting this BIG. even if you set this bigger than the disk , it works.
Leave at least ## GB free . . [untick this box not needed]
Use no more than ## % of total . . [untick this box not needed]
try it , see what happens , pythons are a pain.

I changed to using just the "Leave at least _____ GB free" setting about 30 minutes before reading your post about using just the "Use no more than _____ % of total" setting.

As I write this, the BOINC server notice about needing more space has gone away, so I'm assuming my change is working for now. If I get a new server error about lacking space, I will give your approach a try.

I also just saw my first 3 valid python tasks accepted by the server a few minutes ago, so my main problem of VirtualBox tasks not working seems to be fixed. For now I just want to see if everything keeps running smoothly for the next few days. If I keep tinkering with the settings it might just confuse the server further in the short term.
3) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 105091)
Posted 21 Feb 2022 by BoredEEdude
Post:
I've been here a long time. This used to be a project that was progressive in keeping tabs on things here. But as they have grown more dependent on their neural network, they stopped paying attention here.

The old procedure was to test new tasks on the beta program RALPH. But now the approach seems to be to just put the tasks here without checking them and see what happens.

I wouldn't give up this project, because the science we process for them is useful. It's just a shame they don't communicate anymore.

I have also been supporting Rosetta@home for a long time (will be 10 years as of 2022-04-11), but have been thinking lately (without any input except what the program and website statistics indicate) that something about the project's culture seems to have changed.

In the early days, they really seemed to appreciate the huge amount of "free" computational power being made available. Stories of what kinds of results they were getting could be found easily, and it was clear that a lot of the progress being made was because of the computing resources being volunteered. 100% of my many computers were dedicated to processing Rosetta tasks for most of my time here.

When COVID hit, the amount of available resources skyrocketed to "help find a cure" or at least better solutions to the problem. My individual climb up the crunching statistics ladder slowed, stopped, then backslid as all the other resources came online. It was all good, since getting science done was the main thing. Personal credit standing was just a nice-to-see thing, and I knew I would eventually level out at some point below the biggest crunchers. It was also interesting over the years to watch power crunchers with massive server farms rocket up the ladder, then eventually disappear, and finally see their peak ranking backslide down the ladder as new power users appeared with better computers, or how the slow-but-steady crunchers would just continue their slow rise on the rankings ladder.

When GridCoin came along, I also figured what the heck, I may as well try getting some of that crypto for the computer work I'm already doing anyhow. GRC will probably never amount to anything, but one thing it did do was prod me to add on other projects when Rosetta was having an outage and my computers were sitting idle. So it served to open the crack of my previous exclusive support to Rosetta, as other projects got CPU time whenever Rosetta went down.

Now I'm reading about their changing focus (what's this neural network stuff about?), seeing a lot of problematic tasks getting released for days on end with no supervision, and having many of my computers sitting idle from lack of compatible work. Reading that a computational-based project has no dedicated IT support staff (not even 1 part-time individual?), science researchers writing programs (with possibly weak basic computer skills), and expecting the individuals already volunteering their computer resources to be the ones troubleshooting issues back to a project that may not really be listening to that feedback is slightly disturbing.

After getting millions of dollars worth of free computing resources donated, which in turn undoubtedly helped to validate research directions and define individual researchers' careers along the way, I would expect that some small but significant amount of support/effort be put into addressing at least the obvious problems encountered by the volunteers to the project. That no one from the project will apparently ever read this comment for themselves would indicate a level of assumed entitlement to those donated resources. Or possibly an expectation of moving away from needing those resources in the future, so why bother supporting them now? Just as SETI@home eventually stopped supplying computational work in March 2020, maybe Rosetta@home is heading in a similar direction at some point?

I will continue to give Rosetta priority over other BOINC projects (for now). But if this project expects to retain this computing capability (from "everyone", not just me), they should start paying more attention to supporting it. Once users start leaving due to no available work, or after a lot of aggravation from bad/useless work (I'm looking at you, movingstub), those users may never return. And I hope we won't see another COVID-type crisis to spike the arrival of new computational donors, so they should be managing the ones they still have more carefully.
4) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 105089)
Posted 20 Feb 2022 by BoredEEdude
Post:
Rosetta@home: Notice from server
rosetta python projects needs 14411.41MB more disk space. You currently have 4662.07 MB available and it needs 19073.49 MB.
2/16/2022 10:25:14 PM


Say what?
500 GB dedicated drive
Boinc set to leave 2GB free use the rest
Windows says 97.1 used and 358 free out of 465 available

running 10 python and 4x 4.2

Weird

Now that I just got VirtualBox working, I got a similar server message. Went digging into it, and made the following notes:

- - -
Windows 11 file manager stats:
1 TB HDD
264 GB used
667 GB free
- - -
Within BOINC:

Computing preferences for the Disk

[x] Use no more than 100 GB
[x] Leave at least 1 GB free
[x] Use no more than 50% of total


Showing in the tab "Notices"

Rosetta@home: Notice from server
rosetta python projects needs 4874.94MB more disk space. You currently have 14198.55 MB available and it needs 19073.49 MB.
2/20/2022 4:36:59 PM


Showing in the tab "Disk"

Project #1: 530.50 KB
Project #2: 2.18 GB
Project #3: 21.49 MB
Rosetta@home: 83.74 GB

So, Rosetta is using about 97% of the disk space BOINC currently occupies, with a BOINC total of about 86 GB used of the 100 GB limit.

- - -

Re-writing the server notice to be more readable:

rosetta python projects needs 4.87 GB more disk space.
You currently have 14.19 GB available and it needs 19.07 GB.

Not clear where these limits and free space values are coming from. Nonetheless I can easily increase the available HDD space for BOINC usage. Currently still have 667 GB of free space, and no plans to use any of it for now. Changing the Disk settings to have only one limit, so that at least 50 GB of free space is left available.

[ ] Use no more than ____ GB
[x] Leave at least 50 GB free
[ ] Use no more than ____ % of total

With the old settings, the available HDD limit should have been 100 GB.

With the new settings, the available HDD limit should be ~617 GB, assuming nothing else uses any of the 667 GB of free space currently available.

UPDATE: By listing these different settings and notices in one place, I see the 86 GB used of 100 GB available leaves only 14 GB free. So the notice about 14.19 GB free but needs 19.07 GB for python (VM based) projects now makes sense.
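
The arithmetic behind that conclusion can be sketched in a few lines of Python. This is only an illustration using the figures quoted in this post; the helper name `disk_shortfall_mb` and the idea that headroom is simply the limit minus current usage are assumptions for the sketch, not something taken from the BOINC client source.

```python
def disk_shortfall_mb(available_mb: float, needed_mb: float) -> float:
    """Return how many more MB are needed (0 if there is already enough)."""
    return max(0.0, needed_mb - available_mb)

# Per-project usage from the "Disk" tab, converted to GB (decimal units).
usage_gb = {
    "Project #1": 530.50 / 1_000_000,  # KB -> GB
    "Project #2": 2.18,
    "Project #3": 21.49 / 1_000,       # MB -> GB
    "Rosetta@home": 83.74,
}

limit_gb = 100.0                        # "Use no more than 100 GB"
total_used_gb = sum(usage_gb.values())  # roughly 86 GB
headroom_gb = limit_gb - total_used_gb  # roughly 14 GB left under the limit

# Figures from the server notice, in MB.
shortfall_mb = disk_shortfall_mb(available_mb=14198.55, needed_mb=19073.49)

print(f"BOINC total used: {total_used_gb:.2f} GB")
print(f"Headroom under the limit: {headroom_gb:.2f} GB")
print(f"Shortfall reported by the server: {shortfall_mb:.2f} MB")
```

The headroom computed from the Disk tab (about 14 GB) lines up with the "available" figure in the server notice, which is what makes the notice understandable once the numbers are side by side.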


With the increased available disk space change, I now need to wait and see if the server recognizes more space is available, and starts sending out more and/or larger tasks.
5) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 105084)
Posted 20 Feb 2022 by BoredEEdude
Post:
One of your Vbox tasks shows this error in the log: Failed to create the VirtualBox object!

2022-02-17 17:27:44 (8792): Detected: VirtualBox VboxManage Interface (Version: 6.1.12)
2022-02-17 17:27:51 (8792): Error in host info for VM: -2147024891
Command:
VBoxManage -q list hostinfo
Output:
VBoxManage.exe: error: Failed to create the VirtualBox object!
VBoxManage.exe: error: The object is not ready
VBoxManage.exe: error: Details: code E_ACCESSDENIED (0x80070005), component VirtualBoxClientWrap, interface IVirtualBoxClient

2022-02-17 17:27:51 (8792): WARNING: Communication with VM Hypervisor failed.
2022-02-17 17:27:51 (8792): ERROR: VBoxManage list hostinfo failed

Maybe someone else can help him with this?
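
As a side note on the log above, the decimal value -2147024891 and the hexadecimal code 0x80070005 (E_ACCESSDENIED) are the same 32-bit error code written two ways: once as a signed integer and once as unsigned hex. A minimal Python sketch of the conversion follows; the helper names are made up for illustration.

```python
def to_signed_32(value: int) -> int:
    """Interpret a 32-bit value as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value

def to_unsigned_32(value: int) -> int:
    """Interpret a (possibly negative) integer as an unsigned 32-bit value."""
    return value & 0xFFFFFFFF

E_ACCESSDENIED = 0x80070005  # code shown in the VBoxManage error details

print(to_signed_32(E_ACCESSDENIED))      # -2147024891
print(hex(to_unsigned_32(-2147024891)))  # 0x80070005
```

So both lines of the log point at the same access-denied condition rather than two separate failures.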

I seem to have gotten VirtualBox working by doing a few things (so unsure exactly what fixed the VBoxManage.exe errors).

First, I let the last batch of "Rosetta v4.20 windows_x86_64" tasks complete. Had dozens of "movingstub_..." tasks all fail immediately, and the BOINC GUI would hang without updates while those movingstub tasks started and then failed. So stopped getting new tasks, and then allowed the good tasks to complete. (Went to work and left the system alone.)

Second, reenabled the "VirtualBox VM jobs" by clicking the "Allow" button. New tasks for "rosetta python projects v1.03 (vbox64) windows_x86_64" started to download. These tasks also started failing with more of the "Failed to create the VirtualBox object!" errors. Shut down BOINC so the few remaining tasks would not get a chance to start running.

Third, downloaded and installed the latest BOINC version. This updated my existing installation from
BOINC_7.16.11_with_Virtualbox_6.1.12 to the next version of
BOINC_7.16.20_with_Virtualbox_6.1.12
Note that the VirtualBox version remains the same. But it is still older than the current available version that VirtualBox says is available when its GUI is started up manually.

Fourth, when running the update executable, I made sure to right click on the installer program and use the "Run as administrator" option.
Previously I had the BOINC program files installed on SSD Drive C: and the ProgramData files on HDD D:
For this install, I also put the Program Files onto D: as well.
I also unchecked the option to run BOINC in a separate service account, as I think I saw other comments saying that using a service account could be a problem.

Fifth, after the update BOINC seemed to start, but I was not sure if the Rosetta tasks were working. After a minute or so, I shut down BOINC, then rebooted the PC, since a reboot had not been required during the upgrade. I figured I should give the system a clean restart anyhow.

Sixth, when BOINC started this time, I manually started the Oracle VM VirtualBox Manager GUI (again, ignoring the message that a newer version is available). After a while I started seeing VMs getting spooled up one at a time. I had not seen these background VMs appearing before, and the errors make sense if the images weren't getting started in the first place. Until I saw the individual VM images listed in the VirtualBox GUI, I was not sure whether I would see anything in the leftmost column under the Tools icon. These images were starting up pretty slowly, one at a time, with HDD activity maxed out at 100% during each start and only a brief drop in HDD activity between image starts. So VM spool-up speed seems to be limited by HDD bandwidth on this system. Eventually 8 images were started, and the HDD activity thankfully dropped greatly.

To start off, BOINC allowed 100% CPU usage, which I reduced to 50%. All 8 images kept running, which makes sense, as this machine has an 8 core, 16 thread processor, so 1 VM per core. I then backed off the BOINC CPU usage to 45% so only 7 cores would be in use, to reduce fan noise slightly and allow other computer activities to have one full core to play with while the VMs ran in the background. (50% / 8 cores = 6.25% per core, so the 45% limit will only support 7 cores at 43.75% total.)
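
The percentage-to-task arithmetic in the parenthetical can be sketched as below. The sketch assumes the client counts logical CPUs (16 threads here) and rounds the allowed number down to a whole task; that rounding rule and the function names are assumptions for illustration, not confirmed from the BOINC source.

```python
def usable_cpus(logical_cpus: int, percent: int) -> int:
    """Whole number of CPUs allowed at a given 'use at most N% of CPUs' setting.

    Assumes the result is rounded down, with a floor of one CPU.
    """
    return max(1, logical_cpus * percent // 100)

def effective_percent(logical_cpus: int, tasks: int) -> float:
    """Share of all logical CPUs actually occupied by `tasks` single-CPU tasks."""
    return 100.0 * tasks / logical_cpus

LOGICAL_CPUS = 16  # 8 cores, 16 threads

for setting in (100, 50, 45):
    tasks = usable_cpus(LOGICAL_CPUS, setting)
    share = effective_percent(LOGICAL_CPUS, tasks)
    print(f"{setting}% setting -> {tasks} tasks, {share:.2f}% of CPU capacity")
```

Under these assumptions any setting from 44% up to just under 50% yields the same 7 tasks, so 45% is simply a convenient value inside that band.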

So the reinstall made these changes:
- Ran the combined BOINC and VirtualBox installer with Administrator Privileges (not sure if this was done previously).
- The BOINC Version was updated.
- VirtualBox appeared to have some/all updates run on it as well, including changes to system privileges.
- The BOINC Program Files were placed in a new location.
- The removal of BOINC running in a separate service account.

As BIOS VM support was already turned on, and Microsoft VM support in Windows 10 (Hyper-V ?) was previously removed, the reinstall didn't mess with these other system issues.


With the New Rosetta Python now running, and using the Suspend/Resume button to verify switching between Rosetta and non-Rosetta tasks all run as expected, I just need to wait to see these Python VM Tasks complete and report successfully. But I expect that everything is now running OK.

Finally, on my non-Windows machines the "movingstub_..." tasks were not failing immediately. Some seemed to have completed. Didn't look into it further, but it seems the problem with all the movingstub tasks is related to Linux vs Windows platform differences.
6) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 104885)
Posted 17 Feb 2022 by BoredEEdude
Post:
If you aren't receiving Python work, go to your Rosetta account, list of computers, click on details on whatever computer you wish to run the Pythons on, scroll down and click allow where it says "allow VirtualBox jobs". Then try again.


Thanks! That was the missing step I needed to get a VM to download.

Of the three initial "rosetta python" tasks downloaded, all three failed with "Computational error" after 15 seconds. So it doesn't look like Python will be running smoothly on my system anytime soon. :(
7) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 104881)
Posted 17 Feb 2022 by BoredEEdude
Post:
My movingstub tasks fail also.

Sorted tasks by name, then suspended the non-movingstub ones. That forced all of the movingstubs to fail immediately, report back as failed (not canceled) and then download more good tasks.
8) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 104872)
Posted 17 Feb 2022 by BoredEEdude
Post:
Thanks to all for the responses about VirtualBox.

I installed and run BOINC from a mechanical hard disk that is separate from the SSD that the Windows OS runs from, specifically to reduce the constant writes I expected BOINC to be doing. So getting VirtualBox running on that same drive should not be a problem for the SSD.

At this point, what I still don't understand is how to get the VM up and running in VirtualBox to begin with.

VirtualBox was installed at the same time as BOINC, using the combined BOINC + VB installer, so the two program versions should be the approved/compatible pairing. I can start VirtualBox separately, and it looks to be running OK, it just doesn't have any VMs loaded. From the BOINC Event Log, it knows that VirtualBox is available.

2/17/2022 10:44:30 AM | | Host name: DELL-XPS8930
2/17/2022 10:44:30 AM | | Processor: 16 GenuineIntel Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz [Family 6 Model 158 Stepping 12]
2/17/2022 10:44:30 AM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 vmx smx tm2 pbe fsgsbase bmi1 hle smep bmi2
2/17/2022 10:44:30 AM | | OS: Microsoft Windows 10: Professional x64 Edition, (10.00.22000.00)
2/17/2022 10:44:30 AM | | Memory: 31.81 GB physical, 40.31 GB virtual
2/17/2022 10:44:30 AM | | Disk: 931.39 GB total, 751.12 GB free
2/17/2022 10:44:30 AM | | Local time is UTC -5 hours
2/17/2022 10:44:30 AM | | No WSL found.
2/17/2022 10:44:30 AM | | VirtualBox version: 6.1.12


Is Rosetta@home supposed to download and start a VM automatically? Or do I have to manually kick off the VM installation somehow? After reading countless forum posts for a few hours, I have seen lots of discussion about trying to get stuff to run correctly, and a few mentions of an image checksum being wrong so the VM apparently won't install, but haven't seen an explanation of how the initial VM installation is supposed to work.

On my system the original tasks will download and run when they are available, but nothing seems to be happening as it relates to VirtualBox.
9) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 104844)
Posted 16 Feb 2022 by BoredEEdude
Post:
I have been running Rosetta on multiple computers for years, and it has been a mostly hands-off background task requiring minimal supervision.

For the past few months, Rosetta work units have been unavailable for days on end. No errors are shown, just "got 0 new tasks".

2/16/2022 11:45:16 AM | Rosetta@home | update requested by user
2/16/2022 11:45:20 AM | Rosetta@home | Sending scheduler request: Requested by user.
2/16/2022 11:45:20 AM | Rosetta@home | Requesting new tasks for CPU
2/16/2022 11:45:22 AM | Rosetta@home | Scheduler request completed: got 0 new tasks
2/16/2022 11:45:22 AM | Rosetta@home | No tasks sent
2/16/2022 11:45:22 AM | Rosetta@home | Project requested delay of 31 seconds


When this happens, the online website server status shows approximately 5000 tasks are ready to send, with some large number (~100k) of tasks in progress, and little server side processing occurring.

Computing status
Work
Tasks ready to send 4992
Tasks in progress 115529
Workunits waiting for validation 0
Workunits waiting for assimilation 1
Workunits waiting for file deletion 1
Tasks waiting for file deletion 1
Transitioner backlog (hours) 0.00


It seems that whenever the available number of tasks gets down to around 5000, all work units are considered sent, and the server backend is now just waiting for completed work to be returned. I don't recall ever seeing the available tasks go down to zero.

When I do eventually get some tasks, everything runs as expected locally until all tasks are finished. Then I go idle for days waiting for more tasks to become available.

It seems to me that the project is just not generating as much work for all of its users these days. I don't know if that is because the number of work units is down, or there are many more users available to process the same number of generally available units, or if the type of work has changed and I am unaware of what my system is lacking so it can be sent some of these "new" types of tasks now being made available.

Is there a checklist somewhere that I can use to verify my system is setup correctly? Because my BOINC Manager currently thinks everything is running just fine.

I used to run Rosetta work exclusively. But to keep my computers occupied (non-idle) I have since added other projects so I can pick up other tasks when no Rosetta tasks are available. The downside is that when Rosetta tasks are available, these other projects dilute the amount of resources I can devote to Rosetta in the hands-off processing approach I prefer, as all projects now have to share the available CPU time.

If many Rosetta users are running out of work, but there are still 10s or 100s of thousands of tasks in progress, can Rosetta start limiting the number of tasks sent to individual users (even if they are willing to backlog a large number of tasks locally)?

I have seen other projects where tasks were only generated in large bursts, and the users knew to backlog days or weeks worth of tasks since the server would quickly run out of new tasks to send out. The result was that if you didn't stockpile tasks during the initial big release, you would virtually never see any tasks unless BOINC happened to check in during a new big release of tasks days or weeks in the future.

Limiting the size of individual user backlogs would spread the available work out across all the available users. That would help retain more users, since everyone would feel like they are contributing to the project. At this point, I feel like I'm getting sidelined with no work, while others are sitting on a lot of work units they cannot run immediately. And results will reach Rosetta later than necessary, as the project waits for backlogged tasks to be returned by a few users instead of sending those tasks to idle machines.

My Rosetta@home Statistics graph clearly shows 3 bursts of activity over a total of 8 days within the past 30 days. That leaves me sitting idle for 22 days (or about 75% of that time). My main PC (which the graph comes from) is capable of running 16 concurrent tasks in 32 GB of RAM at ~3.5 GHz CPU speed, so while I can normally complete many concurrent tasks in about 8 hours, 75% of the month Rosetta gets ZERO results from me for lack of tasks to run.

https://drive.google.com/file/d/1X5aBWy0xj2wgV7DpF9tqjrRg8i8E-XEY/view
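
For reference, the idle share described in this post can be checked with a quick calculation. The sketch below uses only the day counts given above (8 active days out of 30); the function name is made up for illustration.

```python
def idle_fraction(active_days: int, window_days: int) -> float:
    """Fraction of the window during which no work was processed."""
    if not 0 <= active_days <= window_days:
        raise ValueError("active_days must be between 0 and window_days")
    return (window_days - active_days) / window_days

idle_days = 30 - 8
share = idle_fraction(active_days=8, window_days=30)
print(f"{idle_days} idle days out of 30 = {share:.1%} idle")
```

That works out to a little over 73%, in line with the rough "about 75%" figure.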
10) Message boards : Number crunching : Results not reporting (Stuck at Uploading) and no new jobs. (Message 78640)
Posted 31 Aug 2015 by BoredEEdude
Post:
While I didn't see any other posts about this problem before I made the post above, I found this other post from 18 hours ago in a later search:

https://boinc.bakerlab.org/rosetta/forum_thread.php?id=6701&nowrap=true#78619
11) Message boards : Number crunching : Results not reporting (Stuck at Uploading) and no new jobs. (Message 78638)
Posted 31 Aug 2015 by BoredEEdude
Post:
I have two Windows 7 machines that have completed all work units. Both are stuck at "uploading", and are not retrieving new tasks.

A non-Windows computer has reported completed work units since the Windows computers stopped working.

I updated one Windows client from the last version to the current version 7.6.6 with no change to the stuck uploading status. This was for Computer ID 2314530.

The server status page also seems to be stuck at "29 Aug 2015 23:25:15 UTC" and is not updating every 10 minutes.

https://boinc.bakerlab.org/rosetta/rah_status.php






