Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Jean-David Beyer Joined: 2 Nov 05 Posts: 196 Credit: 6,613,600 RAC: 5,541 |
I run Linux and have never run out of disk space (because spinning hard drives are now so big and so cheap). I have about 512 GBytes of SSD, and two 4-Terabyte spinning hard drives. But on Linux, you can find out how your disk space is being used very easily. Here is what is in my /var/lib/boinc directory and everything under it. To keep from boring you, I printed out only the first 24 lines. The numbers are in 1024-byte blocks. Right now, I have only universe and rosetta tasks running on my machine. So I seem to be using about 2.4 GigaBytes of disk space in a partition of about 500 GigaBytes. When I have a lot of ClimatePrediction tasks and WCG tasks, I use a lot more, but even then, I come nowhere close to using it all.

    [/var/lib/boinc]$ du . | sort -nr | head -n 24
    2373204 .
    2282044 ./projects
    1763172 ./projects/boinc.bakerlab.org_rosetta
    996448 ./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database
    996448 ./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl
    454540 ./projects/www.worldcommunitygrid.org
    310248 ./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/chemical
    273928 ./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/chemical/pdb_components
    243200 ./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/sampling
    236248 ./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/scoring
    191412 ./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/scoring/score_functions
    190452 ./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/rotamer
    91416 ./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/rotamer/ncaa_rotlibs
    86112 ./slots
    84812 ./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/rotamer/ncaa_rotlibs/ncaa_rotamer_libraries
    58676 ./projects/climateprediction.net
    53652 ./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/scoring/score_functions/rama
    51688 ./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/scoring/score_functions/mhc_epitope
    45804 ./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/rotamer/ncaa_rotlibs/ncaa_rotamer_libraries/n_methyl_amino_acid
    39948 ./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/scoring/score_functions/P_AA_pp
    39672 ./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/scoring/score_functions/P_AA_pp/shapovalov
    37292 ./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/scoring/score_functions/P_AA_pp/shapovalov/2.5deg
    34520 ./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/chemical/residue_type_sets
    32532 ./projects/boinc.bakerlab.org_rosetta/database_357d5d93529_n_methyl/minirosetta_database/scoring/motif

As far as .tmp files are concerned, there are very few. Here are all of them:

    [/var/lib/boinc]$ du -a | grep tmp
    484 ./slots/0/data0.tmp
    0 ./slots/0/data1.tmp
    0 ./slots/0/data2.tmp
    12 ./slots/0/error.tmp
    228 ./slots/1/data0.tmp
    0 ./slots/1/data1.tmp
    0 ./slots/1/data2.tmp
    8 ./slots/1/error.tmp
    4 ./slots/2/rosetta_tmp.txt
    512 ./slots/3/data0.tmp
    0 ./slots/3/data1.tmp
    0 ./slots/3/data2.tmp
    12 ./slots/3/error.tmp
    4 ./slots/4/rosetta_tmp.txt
    4 ./slots/5/rosetta_tmp.txt
    4 ./slots/6/rosetta_tmp.txt
    4 ./slots/7/rosetta_tmp.txt

|
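For anyone who wants to turn `du` block counts like these into gigabytes and see each directory's share of the total, here is a minimal Python sketch. It is not from the thread: the helper names `parse_du` and `blocks_to_gb` are hypothetical, the sample is just a few lines copied from the listing, and it assumes 1024-byte blocks as stated in the post.

```python
def parse_du(text):
    """Parse `du` output into a list of (blocks, path) pairs."""
    rows = []
    for line in text.strip().splitlines():
        blocks, path = line.split(maxsplit=1)
        rows.append((int(blocks), path))
    return rows


def blocks_to_gb(blocks, block_size=1024):
    """Convert a block count to decimal gigabytes (10**9 bytes)."""
    return blocks * block_size / 1e9


# A few lines copied from the listing in the post (assumed 1024-byte blocks).
SAMPLE = """
2373204 .
2282044 ./projects
1763172 ./projects/boinc.bakerlab.org_rosetta
454540 ./projects/www.worldcommunitygrid.org
86112 ./slots
"""

rows = parse_du(SAMPLE)
total_blocks = rows[0][0]  # "." is the largest entry, so it sorts first
for blocks, path in rows:
    share = 100 * blocks / total_blocks
    print(f"{blocks_to_gb(blocks):6.2f} GB  {share:5.1f}%  {path}")
```

On this sample the total works out to roughly 2.43 GB (about 2.26 GiB), with the Rosetta project directory accounting for about three quarters of it.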
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 8,091 |
> Jean-David Beyer wrote: I run Linux and have never run out of disk space (because spinning hard drives are now so big and so cheap).

You forgot "and slow". I've banned spinning disks from anything BOINC-related in my house. I have 7 PCs running BOINC and it's difficult to control them all when one is sat waiting on a disk! The only things rust spinners are used for are backups, TV/film storage, and security cameras. |
.clair. Joined: 2 Jan 07 Posts: 274 Credit: 26,399,595 RAC: 0 |
Is it time to panic? There are fewer than a million tasks left on the front page... Does this mean we may run out of pythons sometime this year :-) and then what will we do for "entertainment"? |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
> .clair. wrote: Is it time to panic?

I am wondering that myself. The pythons are from a single researcher, and I don't know if there will be more. Maybe it is just a one-shot experiment? Since they never tell us anything, planning is not possible. |
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 8,091 |
> .clair. wrote: Is it time to panic?

Play with WCG. If they ever work out how to move a server from one building to another. Another delay until 9th May... |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1725 Credit: 18,417,319 RAC: 20,286 |
> .clair. wrote: Is it time to panic?

Given that the most In progress for them was a bit over 21,000, they tend to average around 15,000 or less, and that there are presently only 10,500 In progress, I think it will be a long, long, long time before they get cleared due to the minuscule number of systems that are actually processing them.

Grant
Darwin NT |
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 8,091 |
> .clair. wrote: Is it time to panic?
>
> Grant (SSSF) wrote: Given that the most In progress for them was a bit over 21,000, they tend to average around 15,000 or less, and that there are presently only 10,500 In progress, I think it will be a long, long, long time before they get cleared due to the minuscule number of systems that are actually processing them.

I make that four months. Depends how soon you want to panic. Anyway, why panic when there are about 50 projects to play with? I'm off doing Milkyway (DP cards), Cosmology (CPUs), and Folding (SP cards) just now. |
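A rough way to sanity-check an estimate like "four months" is to divide the remaining tasks by a completion rate. The sketch below is not the poster's calculation: it takes the thread's figures (fewer than a million tasks left, about 10,500 in progress) and assumes, purely for illustration, that the in-progress pool turns over once every `turnaround_days`. That turnaround value and the function name `days_to_drain` are assumptions.

```python
def days_to_drain(remaining_tasks, in_progress, turnaround_days):
    """Days to clear a queue if `in_progress` tasks finish every `turnaround_days`."""
    tasks_per_day = in_progress / turnaround_days
    return remaining_tasks / tasks_per_day


remaining = 1_000_000  # "less than a million" in the thread, so an upper bound
in_progress = 10_500   # figure quoted in the thread

# Turnaround times below are assumed values, not taken from the thread.
for turnaround in (1.0, 1.25, 3.0):
    days = days_to_drain(remaining, in_progress, turnaround)
    print(f"turnaround {turnaround} days -> about {days:.0f} days to drain")
```

Under these assumptions, a turnaround of about 1.25 days gives roughly 119 days, which is in the region of four months; a slower turnaround stretches the estimate proportionally.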
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
> .clair. wrote: Is it time to panic?
>
> Grant (SSSF) wrote: Given that the most In progress for them was a bit over 21,000, they tend to average around 15,000 or less, and that there are presently only 10,500 In progress, I think it will be a long, long, long time before they get cleared due to the minuscule number of systems that are actually processing them.
>
> Mr P Hucker wrote: I make that four months. Depends how soon you want to panic.

And then we get to have fun with the buggy stuff. |
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 8,091 |
> Greg_BE wrote: And then we get to have fun with the buggy stuff.

I prefer dune buggies. |
robertmiles Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 1,653 |
> .clair. wrote: Is it time to panic?
>
> Grant (SSSF) wrote: Given that the most In progress for them was a bit over 21,000, they tend to average around 15,000 or less, and that there are presently only 10,500 In progress, I think it will be a long, long, long time before they get cleared due to the minuscule number of systems that are actually processing them.
>
> Mr P Hucker wrote: I make that four months. Depends how soon you want to panic.

I prefer both to the situation at Predictor@Home. They lost the two members of their project team who knew how to create useful new workunits (probably because they graduated). For several months, they kept the project running by repeatedly raising the number of times a workunit could fail before no more tasks would be sent out for it. Some of the remaining workunits failed over 30 times before the professor in charge decided it was not worthwhile to let the project continue, and it shut down. |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
> Greg_BE wrote: And then we get to have fun with the buggy stuff.

This IS the buggy stuff. That is one reason I am concerned we may not get more. They did not bother to fix it, so it may be good enough for what they need it for. |
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 8,091 |
> robertmiles wrote: I prefer both to the situation at Predictor@Home. They lost the two members of their project team who knew how to create useful new workunits (probably because they graduated). For several months, they kept the project running by repeatedly raising the number of times a workunit could fail before no more tasks would be sent out for it. Some of the remaining workunits failed over 30 times before the professor in charge decided it was not worthwhile to let the project continue, and it shut down.

ROFL, Wikipedia says "Though it was quite successful, a "disagreement" between the project administration and the user base caused a mass exodus of participating users" |
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 8,091 |
> Greg_BE wrote: And then we get to have fun with the buggy stuff.

I'm not concerned. If all future tasks are made with 4.2, things will work properly again. Python is a shit programming language and Oracle is a shit virtual machine. |
Jean-David Beyer Joined: 2 Nov 05 Posts: 196 Credit: 6,613,600 RAC: 5,541 |
I do not know how slow my spinning disks are. True, these 7200 rpm SATA hard drives are not as fast as the 10,000 rpm SCSI/320 LVD hard drives on a former machine, but BOINC does not do all that much disk IO as to slow me down much. All my BOINC stuff is on one of those spinning hard drives. I note that BOINC homework assignments are severely compute-limited, so disk IO is just a small part of the work load. I use half my cores on BOINC stuff that runs mainly at nice level 19. Since the machine is doing little else at the moment, note that the machine is running about 50% computing, about 50% idle, and no time waiting for IO. More subjectively, the disk IO light blinks a very very short blink with about a 5-second interval; i.e., hardly any disk IO. The machine is running 5 rosetta and 3 universe jobs at the moment.

    top - 22:01:25 up 4 days, 13:20, 1 user, load average: 8.00, 8.13, 8.31
    Tasks: 454 total, 9 running, 444 sleeping, 0 stopped, 1 zombie
    %Cpu(s): 0.4 us, 0.1 sy, 49.7 ni, 49.7 id, 0.0 wa, 0.1 hi, 0.0 si, 0.0 st
    MiB Mem : 63902.1 total, 2784.7 free, 6276.7 used, 54840.8 buff/cache
    MiB Swap: 15992.0 total, 15987.0 free, 5.0 used. 56835.9 avail Mem

|
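The "about 50% computing, about 50% idle, no time waiting for IO" reading can be taken straight from the `%Cpu(s)` line of `top`. As a minimal sketch (not from the thread), the Python below parses that one line and adds up the busy fields; `parse_cpu_line` is a hypothetical helper, and it assumes the "value name" layout shown in the post, which other `top` versions may format differently.

```python
import re


def parse_cpu_line(line):
    """Parse a top '%Cpu(s):' summary line into a {field: percent} dict."""
    _, _, rest = line.partition(":")
    fields = {}
    for value, name in re.findall(r"([\d.]+)\s+([a-z]+)", rest):
        fields[name] = float(value)
    return fields


# Line copied from the top output in the post.
line = "%Cpu(s): 0.4 us, 0.1 sy, 49.7 ni, 49.7 id, 0.0 wa, 0.1 hi, 0.0 si, 0.0 st"
cpu = parse_cpu_line(line)

# us = user, sy = system, ni = niced user processes (here the BOINC tasks),
# id = idle, wa = waiting for I/O.
busy = cpu["us"] + cpu["sy"] + cpu["ni"]
print(f"busy {busy:.1f}%, idle {cpu['id']:.1f}%, I/O wait {cpu['wa']:.1f}%")
```

Nearly all of the busy time sits in `ni` because the tasks run at a low priority, and `wa` at 0.0 is what "no time waiting for IO" refers to.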
robertmiles Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 1,653 |
> robertmiles wrote: I prefer both to the situation at Predictor@Home. They lost the two members of their project team who knew how to create useful new workunits (probably because they graduated). For several months, they kept the project running by repeatedly raising the number of times a workunit could fail before no more tasks would be sent out for it. Some of the remaining workunits failed over 30 times before the professor in charge decided it was not worthwhile to let the project continue, and it shut down.
>
> Mr P Hucker wrote: ROFL, Wikipedia says "Though it was quite successful, a "disagreement" between the project administration and the user base caused a mass exodus of participating users"

I'd expect a user base to disagree a lot and start exiting once every task started failing. What I wrote came from the professor in charge. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1725 Credit: 18,417,319 RAC: 20,286 |
> Jean-David Beyer wrote: but BOINC does not do all that much disk IO as to slow me down much.

It depends on the application. In the case of Rosetta, the Rosetta 4.20 Tasks don't require much disk I/O; however, the Python Tasks require massive amounts of disk I/O when starting up and ending. And apparently they also require quite a bit during processing. The more cores and threads a system has and uses, the higher the disk I/O requirements will be.

Grant
Darwin NT |
Sid Celery Joined: 11 Feb 08 Posts: 2141 Credit: 41,538,222 RAC: 10,691 |
> Ok, I've just used Windows Disk cleanup and ensured storage sense is enabled and freed up a few GB, but that's on a PC that isn't running VBox
>
> I run the Windows disk cleanup (including system files) then run treesize which shows me what folders are using the most, so I can manually remove stuff I don't want anymore. Last time I reduced the stuff on my disk by a third.

Looking at this message was a reminder to do all this. No new .tmp files, freed up a few GB here too, grabbed Treeview but it's not telling me anything I expect to find useful so removed again.

I've got BoincTasks but hadn't set it up to run at startup, which I've now done. Yes, very useful in finding tasks that are very far behind in CPU time compared to Elapsed time. More useful when running VBox tasks compared to running plain Rosetta tasks - I'll keep this going now. All good, ta |
Paddles Joined: 15 Mar 15 Posts: 11 Credit: 5,434,545 RAC: 2,362 |
> Update: The first task to be postponed reached the end of its one day postponement, and now appears to be computing successfully (in VBox 6.1.34). Haven't tried reverting to previous version to see what happens, but whatever the problem was it seems to have resolved.

I may have spoken too soon. The tasks were running for exceptionally long times (18-26 hours), although unlike the normal "not doing anything" VBox tasks, they were showing significant CPU time utilised (rather than the tasks that "run" for 18 hours but have only consumed 10-20 seconds of CPU). I shut down BOINC, rolled VBox back to version 6.1.12 (BOINC recommended version, not 6.1.32 which I had been running), restarted, and all the VBox tasks came up with computation errors. Oh well, will see what happens with the next tasks to run. |
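The distinction drawn here, between a task that is slow but working and one that "runs" for 18 hours on 10-20 seconds of CPU, comes down to the ratio of CPU time to elapsed time. The Python sketch below illustrates that check on made-up numbers; the function `is_stalled`, the 5% threshold, the one-hour grace period, and the task names are all assumptions for illustration, not anything BOINC or BoincTasks defines.

```python
def is_stalled(elapsed_s, cpu_s, min_ratio=0.05, min_elapsed_s=3600):
    """Return True if a task has run long enough to judge and used very little CPU."""
    if elapsed_s < min_elapsed_s:
        return False  # too early to tell
    return cpu_s / elapsed_s < min_ratio


# Hypothetical tasks: (name, elapsed seconds, CPU seconds).
tasks = [
    ("task_a", 18 * 3600, 20),         # 18 h elapsed, 20 s of CPU
    ("task_b", 18 * 3600, 17 * 3600),  # long-running but busy
    ("task_c", 600, 5),                # only just started
]

for name, elapsed, cpu in tasks:
    status = "possibly stalled" if is_stalled(elapsed, cpu) else "ok"
    print(f"{name}: CPU/elapsed = {cpu / elapsed:.3f} -> {status}")
```

The grace period keeps freshly started tasks from being flagged while they are still loading; both thresholds would need tuning for real workloads.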
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 8,091 |
> Jean-David Beyer wrote: I do not know how slow my spinning disks are. True, these 7200 rpm SATA hard drives are not as fast as the 10,000 rpm SCSI/320 LVD hard drives on a former machine, but BOINC does not do all that much disk IO as to slow me down much. All my BOINC stuff is on one of those spinning hard drives. I note that BOINC homework assignments are severely compute-limited, so disk IO is just a small part of the work load. I use half my cores on BOINC stuff that runs mainly at nice level 19. Since the machine is doing little else at the moment, note that the machine is running about 50% computing, about 50% idle, and no time waiting for IO. More subjectively, the disk IO light blinks a very very short blink with about a 5-second interval; i.e., hardly any disk IO. The machine is running 5 rosetta and 3 universe jobs at the moment.

Try 24 cores running VirtualBox. 2GB disk read and 2GB disk write to start each one, followed by many checkpoints. |
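To put rough numbers on the startup load described here: 24 tasks at 2 GB read plus 2 GB written each is 96 GB of disk traffic. The Python sketch below turns that into a best-case time for a given sustained throughput. The throughput figures (150 MB/s for a spinning disk, 500 MB/s for a SATA SSD) are assumed round numbers, not measurements from the thread, and `startup_io_seconds` is a hypothetical helper.

```python
def startup_io_seconds(tasks, read_gb, write_gb, mb_per_s):
    """Best-case seconds to move all startup data at a sustained throughput."""
    total_mb = tasks * (read_gb + write_gb) * 1000
    return total_mb / mb_per_s


tasks = 24                     # cores running VirtualBox tasks, from the post
read_gb, write_gb = 2.0, 2.0   # per-task startup figures, from the post

# Assumed sustained throughputs in MB/s; real drives vary widely.
for label, mb_per_s in (("spinning disk", 150), ("SATA SSD", 500)):
    secs = startup_io_seconds(tasks, read_gb, write_gb, mb_per_s)
    print(f"{label}: about {secs / 60:.1f} minutes for {tasks} task startups")
```

This treats the transfer as one sequential stream, which flatters a spinning disk: with many tasks starting at once the heads seek between files, so real times would likely be considerably worse than the figure printed here.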
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 8,091 |
> robertmiles wrote: I'd expect a user base to disagree a lot and start exiting once every task started failing.

I'm not that arrogant. I keep trying to help a project in difficulty. |
©2024 University of Washington
https://www.bakerlab.org