Problems and Technical Issues with Rosetta@home

Message boards : Number crunching : Problems and Technical Issues with Rosetta@home

Greg_BE
Joined: 30 May 06
Posts: 5664
Credit: 5,711,666
RAC: 233
Message 106002 - Posted: 23 Apr 2022, 23:40:10 UTC - in response to Message 105999.  

Sid, don't you use a disk cleaner? I have never had your issue.
The disk space thing, yeah we know that now after investigating what happens when you do 0.
(logic error between the programmers and the users)
I always suspend-shutdown client-exit at the end of my crunching day.

The ProgramData folder is on my C drive - an SSD. I don't routinely use a disk cleaner, no, and I wouldn't normally mess around within a directory on C
On the disk space thing, the settings we knew, but I haven't heard anyone say their drive has exceeded 500Gb, let alone that it's made up of .tmp files.
That's the news I'm providing so other people can check if they have the same issue and solve it in the simple way I have without repercussions

I get that...your case is really odd.
I have a dedicated 450 GB data drive, and with all the projects I run I am only using 38.6 GB of it.
And it's weird you have so many .tmp files. I thought BOINC deleted all files once the data was uploaded and reported?

It is odd - I have no explanation or any idea what circumstances created it.
I only have this observation, for whatever value anyone else can find in it.
People might want to examine their own ProgramData folder and pre-empt the events that affected me if they see the same thing.
We're all in self-help mode


After that 0 value error thing, and before the guys at GitHub figured it out, I dropped some cash on a new SSD to put the data on. You might consider that if your budget allows it. Then you split program files to c: and data to d: (or whatever). Then data can have all the space it needs and only BOINC program files reside on c:

On my c: I have only 91 GB free out of 220 GB (only BOINC program files are here), but on d: I have 419 GB free out of 450 GB, and the data from 6 projects uses only 15.2 GB. Rosetta is 8.63 GB of this total: 5,357 files and 424 folders.

You might not need 450 GB; maybe something like a 250 GB drive would work. Just move the data there, and then no matter what it wants, there is always lots of room. It's an idea, not sure if your bank account would like that idea or not.
ID: 106002

Paddles
Joined: 15 Mar 15
Posts: 11
Credit: 5,026,006
RAC: 2,331
Message 106003 - Posted: 24 Apr 2022, 0:44:39 UTC

I'm encountering a new problem. Has anyone else seen it? I have three Python tasks in the state "Postponed: VM Hypervisor failed to enter an online state in a timely fashion."

I'm running BOINC 7.6.20 and was on VirtualBox 6.1.32. That combination had generally been working happily, and I haven't changed any BOINC settings recently (or had any significant changes to available disk space). I've updated VBox to 6.1.34 to see if that resolves it, but it looks like the Python tasks are being postponed for a day and none of the others have started.
ID: 106003

kotenok2000
Joined: 22 Feb 11
Posts: 237
Credit: 352,859
RAC: 1,472
Message 106004 - Posted: 24 Apr 2022, 0:46:18 UTC - in response to Message 106003.  

Can you unupdate to 5.2.44?
ID: 106004

Sid Celery
Joined: 11 Feb 08
Posts: 2003
Credit: 39,119,432
RAC: 23,845
Message 106005 - Posted: 24 Apr 2022, 2:56:11 UTC - in response to Message 106000.  

And it's weird you have so many .tmp files. I thought BOINC deleted all files once the data was uploaded and reported?

It is odd - I have no explanation or any idea what circumstances created it.
I only have this observation, for whatever value anyone else can find in it.
People might want to examine their own ProgramData folder and pre-empt the events that affected me if they see the same thing.
We're all in self-help mode

It TRIES to delete all files once the data was uploaded and reported. However, improperly shutting it down can prevent this from happening.

I use a disk cleaner on an SSD. It only looks for files it can delete.

https://www.google.com/search?client=firefox-b-1-d&q=disk+cleanup+windows+10

It should delete any temporary files that BOINC left in place when it was improperly shut down, such as having the power turned off without first telling BOINC to suspend and exit.

I only allow it to delete types of files that I agree are no longer useful.

OK, I've just used Windows Disk Cleanup, ensured Storage Sense is enabled, and freed up a few GB, but that's on a PC that isn't running VBox.
I'll give that a go when I get back to my main PC tomorrow evening.
ID: 106005

Paddles
Joined: 15 Mar 15
Posts: 11
Credit: 5,026,006
RAC: 2,331
Message 106006 - Posted: 24 Apr 2022, 6:11:42 UTC - in response to Message 106004.  

Can you unupdate to 5.2.44?


I'll give that a try if there are continued problems. I just went back and looked, the existing tasks are still postponed (1 day hasn't elapsed yet, and there isn't an obvious way to manually resume them), but one of the other python tasks seems to be running ok so maybe it was just a transient issue.
ID: 106006

Paddles
Joined: 15 Mar 15
Posts: 11
Credit: 5,026,006
RAC: 2,331
Message 106007 - Posted: 24 Apr 2022, 12:15:54 UTC - in response to Message 106006.  

Update: The first task to be postponed reached the end of its one day postponement, and now appears to be computing successfully (in VBox 6.1.34). Haven't tried reverting to previous version to see what happens, but whatever the problem was it seems to have resolved.
ID: 106007

robertmiles
Joined: 16 Jun 08
Posts: 1226
Credit: 13,940,531
RAC: 3,125
Message 106008 - Posted: 24 Apr 2022, 13:23:47 UTC - in response to Message 106006.  

Can you unupdate to 5.2.44?


I'll give that a try if there are continued problems. I just went back and looked, the existing tasks are still postponed (1 day hasn't elapsed yet, and there isn't an obvious way to manually resume them), but one of the other python tasks seems to be running ok so maybe it was just a transient issue.

I've found a way to manually resume such tasks. Suspend all tasks, then exit BOINC. I've forgotten if it's necessary to restart Windows at this point. Then restart BOINC and tell it to resume tasks.
ID: 106008

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 9,852,023
RAC: 5,596
Message 106009 - Posted: 24 Apr 2022, 18:50:35 UTC - in response to Message 106005.  

Ok, I've just used Windows Disk cleanup and ensured storage sense is enabled and freed up a few Gb, but that's on a PC that isn't running VBox
I'll give that a go when I get back to my main PC tomorrow evening
I run the Windows Disk Cleanup (including system files), then run TreeSize, which shows me which folders are using the most space, so I can manually remove stuff I don't want anymore. Last time I reduced the stuff on my disk by a third.

I only bother doing this when the line changes from blue to red in Windows Explorer.
ID: 106009

Jean-David Beyer
Joined: 2 Nov 05
Posts: 178
Credit: 5,759,439
RAC: 3,860
Message 106010 - Posted: 24 Apr 2022, 19:24:37 UTC - in response to Message 105989.  

I run Linux and have never run out of disk space (because spinning hard drives are now so big and so cheap). I have about 512 GB of SSD and two 4 TB spinning hard drives.
But on Linux, you can find out how your disk space is being used very easily. Here is what is in my /var/lib/boinc directory and everything under it. To keep from boring you, I printed out only the first 24 lines. The numbers are in 1024-byte blocks. Right now, I have only Universe and Rosetta tasks running on my machine, so I seem to be using about 2.4 GB of disk space in a partition of about 500 GB. When I have a lot of ClimatePrediction and WCG tasks, I use a lot more, but even then I come nowhere close to using it all.
[/var/lib/boinc]$ du . | sort -nr | head -n 24
2373204	.
2282044	./projects
1763172	./projects/boinc.bakerlab.org_rosetta
996448	./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database
996448	./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl
454540	./projects/www.worldcommunitygrid.org
310248	./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/chemical
273928	./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/chemical/pdb_components
243200	./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/sampling
236248	./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/scoring
191412	./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/scoring/score_functions
190452	./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/rotamer
91416	./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/rotamer/ncaa_rotlibs
86112	./slots
84812	./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/rotamer/ncaa_rotlibs/ncaa_rotamer_libraries
58676	./projects/climateprediction.net
53652	./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/scoring/score_functions/rama
51688	./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/scoring/score_functions/mhc_epitope
45804	./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/rotamer/ncaa_rotlibs/ncaa_rotamer_libraries/n_methyl_amino_acid
39948	./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/scoring/score_functions/P_AA_pp
39672	./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/scoring/score_functions/P_AA_pp/shapovalov
37292	./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/scoring/score_functions/P_AA_pp/shapovalov/2.5deg
34520	./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/chemical/residue_type_sets
32532	./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/scoring/motif
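For anyone without du to hand (on Windows, say), the listing above can be approximated with a short Python sketch. The function names dir_sizes and largest_dirs are made up for illustration and are not part of BOINC. The sketch adds up file sizes in bytes rather than allocated 1024-byte blocks, so the numbers will not match du exactly.

```python
import os
from pathlib import Path


def dir_sizes(root):
    """Return {directory: total bytes of every file at or below it}.

    Hypothetical helper, not part of BOINC. Sizes are apparent file sizes,
    not allocated disk blocks, so they differ slightly from du.
    """
    totals = {}
    # Walk bottom-up so each child's total exists before its parent is summed.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        here = Path(dirpath)
        size = 0
        for name in filenames:
            try:
                size += (here / name).lstat().st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
        for name in dirnames:
            # Symlinked directories are not walked, so they count as 0.
            size += totals.get(here / name, 0)
        totals[here] = size
    return totals


def largest_dirs(root, count=24):
    """Return the `count` largest directories as (path, bytes), biggest first."""
    ranked = sorted(dir_sizes(root).items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:count]
```

Calling largest_dirs("/var/lib/boinc") and printing each pair gives roughly the same view as du . | sort -nr | head -n 24. On Windows, pass the BOINC data directory instead.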


As far as .tmp files are concerned, there are very few: here are all of them:
[/var/lib/boinc]$ du -a | grep tmp
484	./slots/0/data0.tmp
0	./slots/0/data1.tmp
0	./slots/0/data2.tmp
12	./slots/0/error.tmp
228	./slots/1/data0.tmp
0	./slots/1/data1.tmp
0	./slots/1/data2.tmp
8	./slots/1/error.tmp
4	./slots/2/rosetta_tmp.txt
512	./slots/3/data0.tmp
0	./slots/3/data1.tmp
0	./slots/3/data2.tmp
12	./slots/3/error.tmp
4	./slots/4/rosetta_tmp.txt
4	./slots/5/rosetta_tmp.txt
4	./slots/6/rosetta_tmp.txt
4	./slots/7/rosetta_tmp.txt
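The same check can be done on any platform with a small Python sketch, which may help Windows users who want to look inside their ProgramData folder. The names find_tmp_files and summarize are invented for this example. Unlike the grep above, it matches only names that end in .tmp, so rosetta_tmp.txt is not counted. It only reports and never deletes anything.

```python
import os
from pathlib import Path


def find_tmp_files(root, suffix=".tmp"):
    """Return [(size_in_bytes, path), ...] for files ending in `suffix`.

    Hypothetical helper, not part of BOINC. Results are sorted largest first.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(suffix):
                path = Path(dirpath) / name
                try:
                    hits.append((path.lstat().st_size, path))
                except OSError:
                    continue  # file vanished or is unreadable; skip it
    hits.sort(key=lambda item: item[0], reverse=True)
    return hits


def summarize(hits):
    """Return (number_of_files, total_bytes) for a find_tmp_files result."""
    return len(hits), sum(size for size, _path in hits)
```

Point it at the BOINC data directory: /var/lib/boinc here, or on Windows the BOINC folder under ProgramData (usually C:\ProgramData\BOINC, but that is an assumption about the install). A total in the hundreds of gigabytes would match the problem Sid described.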

ID: 106010

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 9,852,023
RAC: 5,596
Message 106011 - Posted: 24 Apr 2022, 19:27:16 UTC - in response to Message 106010.  
Last modified: 24 Apr 2022, 19:27:45 UTC

I run Linux and have never run out of disk space (because spinning hard drives are now so big and so cheap).
You forgot "and slow". I've banned spinning disks from anything BOINC-related in my house. I have 7 PCs running BOINC and it's difficult to control them all when one is sat waiting on a disk! The only things rust spinners are used for are backups, TV/film storage, and security cameras.
ID: 106011

.clair.
Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 106012 - Posted: 24 Apr 2022, 20:30:00 UTC

Is it time to panic ?
There are less than a million tasks left on the front page . . .
Does this mean we may run out of pythons sometime this year :-)
and then what will we do for `entertainment`
ID: 106012

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 106013 - Posted: 24 Apr 2022, 21:02:29 UTC - in response to Message 106012.  

Is it time to panic ?
There are less than a million tasks left on the front page . . .
Does this mean we may run out of pythons sometime this year :-)
and then what will we do for `entertainment`

I am wondering that myself. The pythons are from a single researcher, and I don't know if there will be more.
Maybe it is just a one-shot experiment?

Since they never tell us anything, planning is not possible.
ID: 106013

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 9,852,023
RAC: 5,596
Message 106014 - Posted: 24 Apr 2022, 21:09:20 UTC - in response to Message 106012.  

Is it time to panic ?
There are less than a million tasks left on the front page . . .
Does this mean we may run out of pythons sometime this year :-)
and then what will we do for `entertainment`
Play with WCG. If they ever work out how to move a server from one building to another. Another delay until 9th May....
ID: 106014

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1509
Credit: 15,236,281
RAC: 23,398
Message 106016 - Posted: 24 Apr 2022, 21:25:01 UTC - in response to Message 106012.  

Is it time to panic ?
There are less than a million tasks left on the front page . . .
Does this mean we may run out of pythons sometime this year :-)
Given that the most In progress for them was a bit over 21,000, that they tend to average around 15,000 or less, and that there are presently only 10,500 In progress, I think it will be a long, long, long time before they get cleared, due to the minuscule number of systems that are actually processing them.
Grant
Darwin NT
ID: 106016

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 9,852,023
RAC: 5,596
Message 106017 - Posted: 24 Apr 2022, 21:39:54 UTC - in response to Message 106016.  
Last modified: 24 Apr 2022, 21:41:18 UTC

Is it time to panic ?
There are less than a million tasks left on the front page . . .
Does this mean we may run out of pythons sometime this year :-)
Given that the most In progress for them was a bit over 21,000, they tend to average around 15,000 or less, and that there are presently only 10,500 In progress, i think it will be a long, long, long time before they get cleared due to the very minuscule number of systems that are actually processing them.
I make that four months. Depends how soon you want to panic. Anyway why panic when there's about 50 projects to play with? I'm off doing Milkyway (DP cards), Cosmology (CPUs), and Folding (SP cards) just now.
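The working behind the four-month figure isn't shown in the post. One way to arrive at an estimate like it is to divide the queue by the daily throughput, where throughput is the number of tasks in progress divided by the average turnaround per task. The sketch below assumes that simple model. The turnaround is not given anywhere in the thread, so turnaround_days is a hypothetical input and both function names are invented here. With about 1,000,000 tasks queued and 10,500 in progress, a 120-day drain works out to a turnaround of roughly 1.26 days per task.

```python
def days_to_clear(queued_tasks, tasks_in_progress, turnaround_days):
    """Estimate the days needed to drain a queue at a steady rate.

    Assumed model (not from the thread): throughput per day equals
    tasks_in_progress / turnaround_days. turnaround_days is a guess.
    """
    if tasks_in_progress <= 0 or turnaround_days <= 0:
        raise ValueError("tasks_in_progress and turnaround_days must be positive")
    throughput_per_day = tasks_in_progress / turnaround_days
    return queued_tasks / throughput_per_day


def implied_turnaround_days(queued_tasks, tasks_in_progress, target_days):
    """Invert the model: the turnaround that makes the queue last target_days."""
    if queued_tasks <= 0:
        raise ValueError("queued_tasks must be positive")
    return target_days * tasks_in_progress / queued_tasks
```

If the number in progress climbed back toward the earlier peak of about 21,000 with the same turnaround, the same model would roughly halve the estimate.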
ID: 106017

Greg_BE
Joined: 30 May 06
Posts: 5664
Credit: 5,711,666
RAC: 233
Message 106018 - Posted: 24 Apr 2022, 21:41:32 UTC - in response to Message 106017.  

Is it time to panic ?
There are less than a million tasks left on the front page . . .
Does this mean we may run out of pythons sometime this year :-)
Given that the most In progress for them was a bit over 21,000, they tend to average around 15,000 or less, and that there are presently only 10,500 In progress, i think it will be a long, long, long time before they get cleared due to the very minuscule number of systems that are actually processing them.
I make that four months. Depends how soon you want to panic.



And then we get to have fun with the buggy stuff.
ID: 106018

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 9,852,023
RAC: 5,596
Message 106019 - Posted: 24 Apr 2022, 21:50:48 UTC - in response to Message 106018.  

And then we get to have fun with the buggy stuff.
I prefer dune buggies.
ID: 106019

robertmiles
Joined: 16 Jun 08
Posts: 1226
Credit: 13,940,531
RAC: 3,125
Message 106020 - Posted: 24 Apr 2022, 22:17:00 UTC - in response to Message 106018.  

Is it time to panic ?
There are less than a million tasks left on the front page . . .
Does this mean we may run out of pythons sometime this year :-)
Given that the most In progress for them was a bit over 21,000, they tend to average around 15,000 or less, and that there are presently only 10,500 In progress, i think it will be a long, long, long time before they get cleared due to the very minuscule number of systems that are actually processing them.
I make that four months. Depends how soon you want to panic.



And then we get to have fun with the buggy stuff.

I prefer both to the situation at Predictor@Home. They lost the two members of their project team who knew how to create useful new workunits (probably because they graduated). For several months, they kept the project running by repeatedly raising the number of times a workunit could fail before no more tasks would be sent out for it. Some of the remaining workunits failed over 30 times before the professor in charge decided it was not worthwhile to let the project continue, and it shut down.
ID: 106020

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 106021 - Posted: 24 Apr 2022, 23:03:48 UTC - in response to Message 106018.  

And then we get to have fun with the buggy stuff.

This IS the buggy stuff. That is one reason I am concerned we may not get more.
They did not bother to fix it, so it may be good enough for what they need it for.
ID: 106021

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 9,852,023
RAC: 5,596
Message 106022 - Posted: 24 Apr 2022, 23:20:20 UTC - in response to Message 106020.  

I prefer both to the situation at Predictor@Home. They lost the two members of their project team who knew how to create useful new workunits (probably because they graduated). For several months, they kept the project running by repeatedly raising the number of times a workunit could fail before no more tasks would be sent out for it. Some of the remaining workunits failed over 30 times before the professor in charge decided it was not worthwhile to let the project continue, and it shut down.
ROFL, Wikipedia says "Though it was quite successful, a "disagreement" between the project administration and the user base caused a mass exodus of participating users"
ID: 106022



©2024 University of Washington
https://www.bakerlab.org