Problems and Technical Issues with Rosetta@home

Falconet
Joined: 9 Mar 09
Posts: 350
Credit: 1,002,599
RAC: 188
Message 103640 - Posted: 1 Dec 2021, 11:29:44 UTC - in response to Message 103639.  
Last modified: 1 Dec 2021, 11:31:32 UTC

The created VMs still have the same hard disk and RAM setup as before: 8 GB HD and 6 GB RAM.
All tasks I had running had to be aborted. They weren't using any CPU cycles.

Is anyone successfully completing one?



Mine are saying 2.79 GB of RAM and using barely over 56 MB each.
I've completed 4 so far on my laptop.


I see you got a few done. Others were aborted. If they weren't using CPU cycles, it's possible they were some of those that hang for hours and hours.
ID: 103640 · Rating: 0
Jonathan
Joined: 4 Oct 17
Posts: 42
Credit: 1,337,472
RAC: 60
Message 103641 - Posted: 1 Dec 2021, 11:47:52 UTC - in response to Message 103640.  

What BOINC reports and what the VM is created with are two different things. I get about the same results as you when looking at the BOINC task properties.
All the successful tasks were the previous Python versions, whose names started with "boinc_cages_IL_".

I haven't got a single newer one to run yet. They all hang, and when I call up the VM's screen the last line is
"Intel MKL FATAL ERROR: Error on loading function mkl_lapack_ps_mc3_dsytrf_l_small."

If you call up VirtualBox and look at a VM, is your last line on the monitor like that? Just go to Machine - Detach GUI to close it afterwards.
ID: 103641 · Rating: 0
Falconet
Joined: 9 Mar 09
Posts: 350
Credit: 1,002,599
RAC: 188
Message 103642 - Posted: 1 Dec 2021, 12:30:59 UTC - in response to Message 103641.  

I see VB reports 6144 MB of base memory.
No, I don't have that line when I open the VM on either of the 2 Pythons that seem to be running fine.
ID: 103642 · Rating: 0
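
For anyone wanting to compare what BOINC reports with what the VM was actually created with, here is a minimal Python sketch. It assumes that `VBoxManage showvminfo <vm> --machinereadable` prints plain `key=value` lines that include `memory` and `cpus`. The VM name and the sample text are made up for illustration, and `vm_base_memory_mb` is a hypothetical helper, not part of BOINC or VirtualBox.

```python
import subprocess


def parse_vm_settings(machine_readable: str) -> dict:
    """Parse key=value lines in the style of `VBoxManage showvminfo --machinereadable`."""
    settings = {}
    for line in machine_readable.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that contain no '='
            settings[key.strip().strip('"')] = value.strip().strip('"')
    return settings


def vm_base_memory_mb(vm_name: str) -> int:
    """Ask VirtualBox for one VM's base memory in MB (needs VBoxManage on PATH)."""
    out = subprocess.run(
        ["VBoxManage", "showvminfo", vm_name, "--machinereadable"],
        capture_output=True, text=True, check=True,
    ).stdout
    return int(parse_vm_settings(out)["memory"])


# Sample text for illustration only, not output captured from a real task.
sample = 'name="boinc_example"\nmemory=6144\ncpus=1\n'
settings = parse_vm_settings(sample)
print(settings["memory"], "MB base memory,", settings["cpus"], "CPU(s)")
```

The parsing is kept separate from the `VBoxManage` call so it can be tried on pasted text without VirtualBox installed.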
dcdc
Joined: 3 Nov 05
Posts: 1829
Credit: 114,446,264
RAC: 56,273
Message 103643 - Posted: 1 Dec 2021, 14:22:48 UTC

I've got one here that's saying 6GB, but also one that has 10200MB! And 20GB of disk space...
ID: 103643 · Rating: 0
Falconet
Joined: 9 Mar 09
Posts: 350
Credit: 1,002,599
RAC: 188
Message 103644 - Posted: 1 Dec 2021, 14:30:10 UTC - in response to Message 103643.  

I'm still running 2 Pythons and Rosetta is using 24 GB of my SSD.
ID: 103644 · Rating: 0
.clair.
Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 3
Message 103645 - Posted: 1 Dec 2021, 17:51:04 UTC - in response to Message 103635.  

snip...
Now then what can I meddle with next . . . .
How about going to `vbox64_mt`, setting the default CPU count to two and then halving the default run time?
That would offset the memory use and use more CPUs, hmm ;)

You'd better check if Oracle provides vbox64_mt and whether the Python tasks are able to use it before doing much with that.
The Python tasks now only reserve 2.79GB of memory each, at least for Windows 10, so the project staff HAS found a way to control the amount of memory reserved.

Cosmology@home has been using `vbox64_mt` for a few years on its vbox work.

I see there are still a lot of big Python _1 tasks in the pipeline, resends that will take a while to clean up, that still demand big memory and stop the Python minis from running.
ID: 103645 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1223
Credit: 13,824,497
RAC: 2,340
Message 103646 - Posted: 1 Dec 2021, 22:17:52 UTC - in response to Message 103645.  

snip...
Now then what can I meddle with next . . . .
How about going to `vbox64_mt`, setting the default CPU count to two and then halving the default run time?
That would offset the memory use and use more CPUs, hmm ;)

You'd better check if Oracle provides vbox64_mt and whether the Python tasks are able to use it before doing much with that.
The Python tasks now only reserve 2.79GB of memory each, at least for Windows 10, so the project staff HAS found a way to control the amount of memory reserved.

Cosmology@home has been using `vbox64_mt` for a few years on its vbox work.

I see there are still a lot of big Python _1 tasks in the pipeline, resends that will take a while to clean up, that still demand big memory and stop the Python minis from running.

Good. The other problem is whether the Rosetta Python tasks are written to know how to use it.
ID: 103646 · Rating: 0
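
To put the two reservation figures from this thread side by side (6 GB per VM earlier, 2.79 GB per task now), here is a rough Python sketch of how many tasks fit in RAM at once. The 16 GB host and the 2 GB held back for the OS are made-up example values, and the calculation ignores everything else BOINC considers when scheduling.

```python
def max_concurrent_tasks(host_ram_gb: float, per_task_gb: float,
                         reserve_gb: float = 0.0) -> int:
    """Whole number of tasks whose memory reservations fit in the usable RAM."""
    usable = host_ram_gb - reserve_gb
    if usable <= 0 or per_task_gb <= 0:
        return 0
    return int(usable // per_task_gb)


HOST_RAM_GB = 16  # hypothetical host, not taken from the thread
RESERVE_GB = 2    # hypothetical headroom for the OS and other programs

for label, per_task in (("6 GB VM", 6.0), ("2.79 GB task", 2.79)):
    n = max_concurrent_tasks(HOST_RAM_GB, per_task, RESERVE_GB)
    print(f"{label}: {n} at once")
```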
Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,670,291
RAC: 2,328
Message 103647 - Posted: 1 Dec 2021, 23:26:41 UTC

Someone needs to open the download hose some more.
2 GB files, and in 5 minutes it has downloaded only 38.75% of the vdi file.
That's slower than a handicapped snail!
3200 KB/s? For 2 gigs! Geez!
ID: 103647 · Rating: 0
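
As a quick check of the numbers in the post above, this small Python sketch works out the average rate implied by "38.75% of a 2 GB file in 5 minutes" and the time left at that rate. It assumes decimal units (1 GB = 1000 MB = 1,000,000 KB) and a constant transfer rate, neither of which is stated in the thread.

```python
def throughput_kb_per_s(total_mb: float, fraction_done: float,
                        elapsed_s: float) -> float:
    """Average rate so far in KB/s, using 1 MB = 1000 KB."""
    return total_mb * 1000 * fraction_done / elapsed_s


def remaining_seconds(total_mb: float, fraction_done: float,
                      rate_kb_per_s: float) -> float:
    """Time left in seconds if the rate stays constant."""
    return total_mb * 1000 * (1 - fraction_done) / rate_kb_per_s


rate = throughput_kb_per_s(total_mb=2000, fraction_done=0.3875, elapsed_s=300)
left = remaining_seconds(total_mb=2000, fraction_done=0.3875, rate_kb_per_s=rate)
print(f"average rate: {rate:.0f} KB/s, about {left / 60:.1f} minutes left")
```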
Falconet
Joined: 9 Mar 09
Posts: 350
Credit: 1,002,599
RAC: 188
Message 103648 - Posted: 1 Dec 2021, 23:31:32 UTC - in response to Message 103647.  
Last modified: 1 Dec 2021, 23:32:12 UTC

Mine downloaded at around 2500 kbps last night. Which is fine, I'm patient lol.

I've noticed the amount of ready to send Pythons has been decreasing for a while now.
Currently at 1815; 2 hours or so ago it was at 3730. Wonder if there's an issue.
ID: 103648 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,670,291
RAC: 2,328
Message 103649 - Posted: 1 Dec 2021, 23:36:20 UTC

Now to add insult: the first 5 files, then the next 4 files, and the next 4 files after that have all failed with download errors!!!

What now?
ID: 103649 · Rating: 0
nairb
Joined: 8 Dec 05
Posts: 17
Credit: 990,147
RAC: 0
Message 103650 - Posted: 2 Dec 2021, 0:12:56 UTC
Last modified: 2 Dec 2021, 0:13:38 UTC

Just added Rosetta to a new machine and all WUs have failed with download errors.
It says:
Giving up on download AIMNet_minimization_python_project.py:permanent HTTP error.

It downloaded the 2 gig file OK (very slowly)...
ID: 103650 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1223
Credit: 13,824,497
RAC: 2,340
Message 103651 - Posted: 2 Dec 2021, 0:14:37 UTC - in response to Message 103649.  

Now to add insult: the first 5 files, then the next 4 files, and the next 4 files after that have all failed with download errors!!!

What now?

One thing to do is to wait for another try at downloading those files.
ID: 103651 · Rating: 0
mmonnin
Joined: 2 Jun 16
Posts: 54
Credit: 20,058,207
RAC: 31,720
Message 103652 - Posted: 2 Dec 2021, 0:26:59 UTC

Nothing but download errors now with the new vdi file

<core_client_version>7.16.6</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error>
<file_name>AIMNet_minimization_python_project.py</file_name>
<error_code>-224 (permanent HTTP error)</error_code>
<error_message>permanent HTTP error</error_message>
</file_xfer_error>
</message>
]]>

The file actually exists on disk, but it has zero size.
-rw-r--r-- 1 boinc boinc 0 Dec 1 18:56 AIMNet_minimization_python_project.py
ID: 103652 · Rating: 0
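
The zero-byte file in the listing above is easy to spot by hand, but with many inputs a small script can help. This is a minimal Python sketch that lists empty files in a directory. `find_empty_files` is a hypothetical helper, and the demo runs in a temporary directory because the location of the BOINC project folder varies by install.

```python
import tempfile
from pathlib import Path


def find_empty_files(directory) -> list:
    """Return the sorted names of regular files in `directory` that are 0 bytes long."""
    return sorted(
        p.name for p in Path(directory).iterdir()
        if p.is_file() and p.stat().st_size == 0
    )


# Demo in a throwaway directory; on a real host you would pass the
# Rosetta project folder inside the BOINC data directory instead.
with tempfile.TemporaryDirectory() as demo_dir:
    Path(demo_dir, "AIMNet_minimization_python_project.py").touch()  # empty file
    Path(demo_dir, "other_input.txt").write_text("data")             # non-empty file
    print(find_empty_files(demo_dir))
```

Only the top level of the directory is checked; subdirectories are not searched.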
.clair.
Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 3
Message 103654 - Posted: 2 Dec 2021, 1:25:35 UTC

Something's broken the server [wasn't me . . . ]
As of 2 Dec 2021, 0:00:15 UTC [ Scheduler running ]
Total queued jobs: 2,163,843
Ready to send: 0
So there's nothing to do for a while.
ID: 103654 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,670,291
RAC: 2,328
Message 103656 - Posted: 2 Dec 2021, 7:31:17 UTC

Don't they even check their stuff before dropping it on the server?
Really doesn't seem that way.
ID: 103656 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103659 - Posted: 2 Dec 2021, 14:50:06 UTC - in response to Message 103656.  

They are sending me stuff (pythons), but it is somewhat futile.
A significant number either fail with "0 CPU", or else "Vm job unmanageable".

That should have been taken care of before they even started with the pythons.
But then they might have had to listen to feedback from the crunchers.
ID: 103659 · Rating: 0
SirMikan
Joined: 14 May 20
Posts: 1
Credit: 709,623
RAC: 0
Message 103660 - Posted: 2 Dec 2021, 18:14:11 UTC

I haven't received any new tasks since November 28. Are there any issues right now with the task server?
ID: 103660 · Rating: 0
Falconet
Joined: 9 Mar 09
Posts: 350
Credit: 1,002,599
RAC: 188
Message 103661 - Posted: 2 Dec 2021, 18:40:05 UTC - in response to Message 103660.  

There are few of the standard Rosetta app tasks available.
There are plenty of the Python Rosetta tasks, though, but you need to install VirtualBox in order to run those.
ID: 103661 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,670,291
RAC: 2,328
Message 103665 - Posted: 2 Dec 2021, 19:29:54 UTC - in response to Message 103660.  

I haven't received any new tasks since November 28. Are there any issues right now with the task server?



No...just very small limited batches of 4.2 stuff.
The focus seems to be on Python projects now.
ID: 103665 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,670,291
RAC: 2,328
Message 103666 - Posted: 2 Dec 2021, 19:38:08 UTC - in response to Message 103654.  

Something's broken the server [wasn't me . . . ]
As of 2 Dec 2021, 0:00:15 UTC [ Scheduler running ]
Total queued jobs: 2,163,843
Ready to send: 0
So there's nothing to do for a while.


There may be 2 million queued, but they are not ready to be released yet.
If you look at the top right of the server status page, you will see: Tasks ready to send 5007
And then below left is the breakdown:
Rosetta 9 69053 6.93 (0.24 - 41.66) 2284
Rosetta Mini 0 0 --- 0
rosetta python projects 5001 21761 4.76 (0.01 - 36.33) 1457

That does not match, as the total here is 5010.
But what are 3 jobs in the big picture anyway?

And at 23:00 UTC I was getting lots of download errors, so maybe they pulled the tasks to repair the problem.
I have not looked at my event log to see when I got my first new 4.2 and Python tasks, but I have those now.
ID: 103666 · Rating: 0
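
The mismatch noted in the post above can be reproduced with a few lines of Python. This sketch assumes that the first whole number after each application name in the breakdown is that application's ready-to-send (unsent) count; the thread does not give the column headers, so that reading is an assumption.

```python
# Breakdown rows as quoted in the post above.
rows = [
    "Rosetta 9 69053 6.93 (0.24 - 41.66) 2284",
    "Rosetta Mini 0 0 --- 0",
    "rosetta python projects 5001 21761 4.76 (0.01 - 36.33) 1457",
]
HEADLINE_READY_TO_SEND = 5007  # "Tasks ready to send" figure from the same post


def unsent_count(row: str) -> int:
    """Return the first all-digit token, assumed here to be the unsent count."""
    for token in row.split():
        if token.isdigit():
            return int(token)
    raise ValueError(f"no integer found in row: {row!r}")


total = sum(unsent_count(row) for row in rows)
print(f"sum of rows: {total}, headline: {HEADLINE_READY_TO_SEND}, "
      f"difference: {total - HEADLINE_READY_TO_SEND}")
```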

©2024 University of Washington
https://www.bakerlab.org