Questions and Answers : Wish list : I would like to see::
Author | Message |
---|---|
Sir Antony Magnus Joined: 28 Nov 05 Posts: 31 Credit: 526,750 RAC: 0 |
The ability to delete an account no matter what the project. It should be a user-controlled issue anyway, and it could be made a one-stop shop via the BOINC software. That would make it easier. Antony |
joseps Joined: 25 Jun 06 Posts: 72 Credit: 8,173,820 RAC: 0 |
I would like to see a regular report listing the accomplishments achieved by the Rosetta@home volunteers, just to show that the group has done something worthwhile in its effort to help. We have been volunteering for more than a year now. Did we accomplish something during that period? I hope I am not behind in keeping up with the news. joseps I turned off my 5 computers when I went on vacation. When I returned today, I could not upload work. Need work units to run computers. joseps |
Natronomonas Joined: 11 Jan 06 Posts: 38 Credit: 536,978 RAC: 0 |
"I would like to see a regular report listing the accomplishments achieved by the Rosetta@home volunteers, just to show that the group has done something worthwhile in its effort to help. We have been volunteering for more than a year now. Did we accomplish something during that period? I hope I am not behind in keeping up with the news. joseps" On the front page : ) News Oct 17, 2007 An article about Rosetta@home is in Nature. Congrats and thank you to all the volunteers that made this possible! http://www.nature.com/news/2007/071016/full/449765a.html So things are progressing! : ) Crunching Rosetta as a member of the Whirlpool BOINC Teams |
khanson Joined: 8 Jan 08 Posts: 2 Credit: 307,805 RAC: 0 |
I would like to see a news area where you would report project problems like server outages with an estimate as to when you expect to be back on line. The server status page is nice, but a little text to explain what is going on would also be helpful. I also read some threads where users asked to know about accomplishments and scientific breakthroughs. This is an excellent suggestion. Keep up the good work!! |
Keenan Whitmore Joined: 27 Jun 08 Posts: 1 Credit: 17,703 RAC: 0 |
I would still like to see a User Certificate, similar to the ones available on other BOINC projects. People have posted about this since '05 and I'm still wondering where they are. |
WBT112 Joined: 11 Dec 05 Posts: 11 Credit: 1,382,693 RAC: 0 |
I would like to see project badges like WCG has (Bronze/Silver/Gold and so on) for the various contributions a user has made to Rosetta, or badges when a user or team predicted the lowest-energy structure for a WU, and so on. I'm not that kind of achievement hunter, but I think a lot of people could be motivated by this to join Rosetta. |
Paul Joined: 19 Jun 13 Posts: 1 Credit: 997,853 RAC: 233 |
I would like to see the BOINC Manager give priority to the projects with the next deadline. I often see projects running that are due in several days consuming resources while projects due the same day are waiting to run. I often miss deadlines because of poor resource allocation. My only workaround is to manually pause projects, which is a waste of my time. I also see new tasks starting when there are numerous incomplete tasks; this again wastes resources when initially started tasks miss their deadline because resources were given to lower-priority projects. |
Sid Celery Joined: 11 Feb 08 Posts: 2140 Credit: 41,518,559 RAC: 10,612 |
"I would like to see the BOINC Manager give priority to the projects with the next deadline. I often see projects running that are due in several days consuming resources while projects due the same day are waiting to run. I often miss deadlines because of poor resource allocation. My only workaround is to manually pause projects, which is a waste of my time. I also see new tasks starting when there are numerous incomplete tasks; this again wastes resources when initially started tasks miss their deadline because resources were given to lower-priority projects." This is a fair question, but task scheduling is controlled by the Boinc Manager, not any of the individual projects. What resource share does each of your projects have? Have you allocated sufficient RAM and disk space to Boinc? The settings are in Boinc > Options > Computing Preferences, on the Computing and Disk & Memory tabs. If you can list those we might see why Boinc is failing to start tasks that are about to miss deadlines. |
Melvin Joined: 11 Dec 05 Posts: 13 Credit: 11,636,709 RAC: 3,060 |
Even with other projects paused to focus the scheduler on R@H tasks I often notice R@H deadlines are reached and passed, even with work units running 24/7. I realise the total compute time required may be hard to judge until the work unit is well under-way, though that seems a good reason to set deadlines further out. I often see huge elapsed times compared to other projects. Work buffer set to 1 day min. 2 day max. so as not to get too big a work unit queue, yet there are 32 R@H units queued, with deadlines up to 25/10. (and a similar number of non-R@H units) which seems to imply at least 4 units/core/day is expected, but some units take way longer than that. I don't usually monitor tasks closely because I tend to assume/hope the software can manage itself - but I do wonder to what depth it does this. A low progress when approaching a deadline may cause user concern. Firstly, there may be the temptation to abort rather than use more compute time for uncertain/no credits. It would seem wasteful if all users running the same work unit came to a similar conclusion to abort when it could be just short of a useful result. Is 'data-so-far' then just lost, with no part-result communicated back to base to help decide if the unfinished work should be re-allocated or not? (with risk of events repeating unless a longer deadline is set). Does/can the software, at some point in the calculations, determine if it will definitely converge to a useful result, even if that would take it beyond the original deadline? What about a dynamically re-defined deadline to recognise this situation? Or, can the software decide on an early completion to retrieve a part-result for re-allocation to a continuing work-unit? If results are diverging beyond hope of finishing in reasonable time (whatever that is?) does the software have the means to terminate itself at that point? 
Secondly, perceived completion estimates can seem an order of magnitude longer compared to projects that set longer deadlines (even with shorter compute times). At the moment I see one R@H work-unit due 7/10 showing less than 3% progress in the below table with an elapsed time of 366 hours ( >15 days) that could seem to imply about 500 days left to finish! Yet the estimated remaining time left shows less than 10 hours - though very often this figure clocks up instead of down as time elapses. Other work-units on the same PC: due 13/10 shows 166 elapsed hours as 32% progress, due 16/10 shows 37 elapsed hours as 20% progress, and one due yesterday with 30 elapsed hours as <1%. (activity setting 'Run Always' on Linux Mint 17.3 using AMD FX4100 quad-core CPU @1.4GHz, typical memory use 2/3 of 12GB free, no swap, other projects paused). Although compute time seems inherently unpredictable 'I would like to see' some means to a) assure the user their overdue results may still be useful and credited, and/or b) the software can make an abort/terminate decision when it is sensible to do so and credit user up to that point. |
Melvin Joined: 11 Dec 05 Posts: 13 Credit: 11,636,709 RAC: 3,060 |
Insert of table failed, see http://frintonet.dlinkddns.com/boinc.jpg. |
Sid Celery Joined: 11 Feb 08 Posts: 2140 Credit: 41,518,559 RAC: 10,612 |
"Insert of table failed, see http://frintonet.dlinkddns.com/boinc.jpg." Blimey! Something is going very wrong with these tasks. I've had a quick glance at your PCs and 3 of the 4 look very well behaved, but this one is a disaster. You should certainly abort those tasks as they've missed their deadline and have been allocated to other machines. As you fear, the time spent has effectively been wasted, I'm sorry to say. I'd suggest you look at your computing preferences (1st & 3rd tabs in particular) on this particular machine and compare them to one of your others to see if there's some setting mistakenly applied. You shouldn't be getting this outcome at all. |
Melvin Joined: 11 Dec 05 Posts: 13 Credit: 11,636,709 RAC: 3,060 |
Thanks for looking at this. Other machines have occasionally run high priority for R@H and sometimes also shown very high elapsed compute times. This PC does so a bit more often, but I'm not sure if it is the settings, hardware compatibility or just the luck of the work-unit draw. All are set to similar preferences, except that CPU usage on this one is backed down a little as it has a greater tendency to overheat, though if it was running exceedingly slowly I'd expect the elapsed compute time to be low, and all projects to be adversely affected. What would seem more wasteful is if this behaviour is repeated many times, with units simply re-allocated until finally landing on a much faster machine that can complete within the deadline period (assuming the compute is deterministic). I didn't abort the tasks because when I next looked a couple of days ago they had already all been replaced by new ones. Based on earlier elapsed/remaining proportions those earlier tasks would all still have been crunching now. A screenshot of the new set of tasks is now added to the earlier link; you will see all are already overdue again. I had wondered if this was unlike the initial wild time estimates that soon converge to a mostly accurate one, such as when downloading a large file, and whether the often up-counting remaining compute time might instead converge or finish abruptly (I've not watched closely enough to know), or whether there is a termination mechanism already built into the software (and if not, would it be very difficult to program in?). Looking at the log I now see these lengthy units were indeed terminated rather than ending with a natural/useful answer. Log extract http://frintonet.dlinkddns.com/boinclog.jpg Not sure, however, if a result becomes "no longer usable" based on a simple time-beyond-deadline for reallocation, or (preferably) is actioned by a stability-assessment algorithm? 
Software based on general knowledge of result characteristics should be able to make a more informed decision than the average user, some of whom may not even monitor this at all. I'd also like to think that a result shown to be 'awkward to compute' (especially if confirmed so by others) has some value in helping to decide on a longer allowance if it is to be reallocated. Melvin |
Sid Celery Joined: 11 Feb 08 Posts: 2140 Credit: 41,518,559 RAC: 10,612 |
Thanks for updating your earlier link to show your settings, Melvin. That helps a lot. I run an AMD FX8350, but under Windows rather than Linux, so I've got a reasonable idea what you should have set. While several of my settings are different to yours, can I point you to two on your "Disk and Memory Usage" tab in particular. "Use at Most 0.00Gb disk space" doesn't make much sense. I use 10Gb on an 8-core machine, but somewhere between 6 & 10Gb ought to be fine for your 4-core machine. "Leave applications in memory while suspended" needs to be ticked. It's known to produce problems if left unticked. The same should apply on your other PCs even if you aren't showing any problems with them right now. I could quibble with some of your other settings, but give these 2 changes a chance to see what difference they make, if any. No need to speculate about other things if this sorts things out for you. Good luck. "Thanks for looking at this. Other machines have occasionally run high priority for R@H and sometimes also shown very high elapsed compute times." |
Melvin Joined: 11 Dec 05 Posts: 13 Credit: 11,636,709 RAC: 3,060 |
Thanks for those suggestions Sid. I have made those changes on all my machines. I think I'd misunderstood a zero to mean 'no restriction' as it does in some other preferences. |
Sid Celery Joined: 11 Feb 08 Posts: 2140 Credit: 41,518,559 RAC: 10,612 |
"Thanks for those suggestions Sid. I have made those changes on all my machines." You may even be right that a zero means no restriction, but it can help to be explicit. Anyway, I've just taken a quick look. Problem solved? Looks like a dozen tasks have completed successfully since that change and your RAC has already gone up 50%. |
Melvin Joined: 11 Dec 05 Posts: 13 Credit: 11,636,709 RAC: 3,060 |
Yes, and it was still going well a further week after changing those settings, looking as though that had solved it, but that may have been too early to be sure because not all units previously stalled. Certainly several units went by quickly to bump up the RAC, though after those earlier stalling units anything completing was an improvement. I'd explicitly allocated 7Gb for disk as you'd suggested, and according to boinc-manager stats, this should be more than adequate. Screenshot at http://frintonet.dlinkddns.com/boinc1.jpg shows BOINC currently using 4.3Gb of this (70% of which is taken by R@H) with 2.7Gb still available to BOINC. The "Leave applications in memory" thing was ticked, as you'd said (I later found reference to it causing some people problems depending on project check-pointing). As before there seems ample available memory (never seen this machine swap since extra memory fitted). Process monitor shows BOINC gets a decent share of CPU. Now, just when it was looking as though I might not get any more 'awkward' units, a couple of suspects come along! Conspicuous by their high 'elapsed time' (about 120 hours) despite low 'remaining time' estimates of only a few hours, with a couple of days left to deadline, they are raised to 'high-priority'. The displayed 'remaining' time estimates again do not seem to tally with the few % of 'progress' (taking elapsed/remaining on a pro-rata basis) and are faltering (e.g. slowly increased by a minute or so in the last hour). |
Sid Celery Joined: 11 Feb 08 Posts: 2140 Credit: 41,518,559 RAC: 10,612 |
"Yes, and it was still going well a further week after changing those settings, looking as though that had solved it, but that may have been too early to be sure because not all units previously stalled." Your average credit looks up to par, but those 2 tasks in your latest pic are terrible again. I understand you have heat problems so I won't suggest you change "use at most 85% CPU time", even though this can be problematic if less than 100%. Instead, can you try changing Computing allowed "while processor usage is below 70%" to 0% (unrestricted)? Also, I'd be concerned long before a task was overrunning by 8 hours, let alone 108 and more. Is there nothing you can do to improve cooling? A cleaning out of the case makes a massive difference to getting rid of excess heat. |
Melvin Joined: 11 Dec 05 Posts: 13 Credit: 11,636,709 RAC: 3,060 |
Had to reboot after a crash earlier today and later noticed these two units displayed similar (to each other) 'Elapsed' and 'Remaining' times. I'd not realised this seems to be an elapsed real time since reboot, as opposed to the cumulative time elapsing whilst the units have been crunching! Have added today's table below the screenshots I uploaded earlier, to show the same work-units with only a handful of elapsed hours but around 66% 'Progress' - a big jump from the earlier few % - confusing? (I tend to think of 'progress' as an estimate of percentage complete, but had previously concluded I'd misunderstood how various terms are calculated.) The most significant change I'd noticed after the reboot was that 'remaining' times, which previously had started to move slowly upward, were now lower and going down slowly. One is an hour or so lower than a few days ago, the other less than an hour higher than a few days ago, but both are now in near unison. Have now changed the 70% to 0% as you suggested. Will give the 85% a try at 100% as you mention this may be problematic. Now summer has ended, we'll see if the extra heat is tolerated thanks to a few degrees lower ambient. The tower runs on its 'side' with the other 'side' open (as top), so the ribs on the GPU card are vertical and more heat can convect away (no integral fan). Otherwise, in 'normal' orientation, the card would be horizontal but with ribs on the underside accumulating heat (not much case fan flow in that area). I've found this to be an alternative to removing the GPU (which R@H does not use but other projects do). |
Melvin Joined: 11 Dec 05 Posts: 13 Credit: 11,636,709 RAC: 3,060 |
Thanks for making those suggestions Sid. The two potentially awkward work-units completed earlier today. Will have to wait and see if any more like those come along. |
Sid Celery Joined: 11 Feb 08 Posts: 2140 Credit: 41,518,559 RAC: 10,612 |
"Had to reboot after a crash earlier today and later noticed these two units displayed similar (to each other) 'Elapsed' and 'Remaining' times." When you don't have "Leave non-GPU in memory while suspended" ticked it's possible that a task drops back to the start of processing after a reboot, but now you have it ticked it'll only drop back to the last checkpoint, as I understand things. What you need to do is select the task in Boinc Manager and click Properties on the left. This allows you to compare "CPU time" to "Elapsed time". The problem I suspect you had was that the task wasn't getting any CPU time. It's possible the reboot rectified whatever was causing it rather than any other changes made. But hopefully that's all in the past now - fingers crossed. "Have now changed the 70% to 0% as you suggested. Will give the 85% a try at 100% as you mention this may be problematic." Normally I'd recommend changing just one setting at a time to see whether it cures the problem or not before moving on, but you're certainly right that the temperature drop makes things a lot simpler. I overclock and have to knock everything down in the summer months or suffer the consequences - and that's with water cooling in use. Glad everything seems to be settling down and you're getting the credits your work deserves. |
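For anyone wanting to repeat the two sanity checks discussed in this thread, here is a rough Python sketch. It is only an illustration and not anything BOINC itself runs: the function names are made up for this example, the pro-rata estimate assumes progress is linear in elapsed time (which may well not hold for Rosetta tasks), and the second function is simply the "CPU time" versus "Elapsed time" comparison from the task's Properties window. The 366 hours and 3% are the figures Melvin reported; the 30 and 120 hours in the second call are invented example values.

```python
def pro_rata_remaining_hours(elapsed_hours, progress_fraction):
    """Estimate remaining hours, ASSUMING progress is linear in elapsed time."""
    if not 0 < progress_fraction <= 1:
        raise ValueError("progress_fraction must be in (0, 1]")
    # Projected total run time if the task keeps its average pace so far.
    projected_total = elapsed_hours / progress_fraction
    return projected_total - elapsed_hours


def cpu_efficiency(cpu_hours, elapsed_hours):
    """Fraction of wall-clock time the task actually spent on the CPU."""
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    return cpu_hours / elapsed_hours


if __name__ == "__main__":
    # Figures from the thread: 366 elapsed hours at roughly 3% progress.
    remaining = pro_rata_remaining_hours(366, 0.03)
    print(f"pro-rata remaining: {remaining:.0f} h (about {remaining / 24:.0f} days)")

    # Hypothetical values: 30 h of CPU time over 120 h elapsed.
    ratio = cpu_efficiency(30, 120)
    print(f"CPU time / elapsed time: {ratio:.0%}")
```

On Melvin's figures the pro-rata estimate comes out at about 11,834 hours, or roughly 493 days, which matches his "about 500 days". A CPU/elapsed ratio far below 1 would point to a task that is sitting in the queue or being throttled rather than crunching, which is the situation Sid suspected.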
©2024 University of Washington
https://www.bakerlab.org