Posts by River~~

21) Message boards : Number crunching : All about Rosetta memory requirements (Message 37592)
Posted 7 Mar 2007 by River~~
Post:
Thanks for this, it is a good start on a useful set of FAQs

In A7, you might like to give the example that with BOINC using 100% of memory, even basic Windows features like the Start button can take a minute or two to react. Some users find this disconcerting, others don't mind. If you don't like this behaviour, experiment with dropping the max % while machine is idle 1% at a time (98%, 97%, ...) till the machine feels responsive.

Note also that even with a smaller setting, a big spreadsheet left open can take memory away from the Start button, etc, as can any large program. This setting only regulates BOINC's use of memory, not that of programs you have left dormant.

R~~
22) Message boards : Number crunching : Predictor of the day (Message 37528)
Posted 6 Mar 2007 by River~~
Post:
Yes, thanks for getting it going again. I guess I'm not the only geek who checks fairly often to see if I'm the POTD or not...



yes thanks for automating it for us

R~~
23) Message boards : Number crunching : Problems with rosetta 5.48 (Message 37527)
Posted 6 Mar 2007 by River~~
Post:
... I have not got any more HINGE WU's in queue, instead it's back to ABRELAX and NMRREF, why is this? Is my system with 512MB RAM not powerful enough to run HINGE or is that just luck of the draw for WU's?


It is the luck of the draw, with four exceptions, if I understand how it is designed to work correctly:

- HINGE will not be issued to a machine with < 477 or whatever

- once a specific HINGE has been held back from a machine, that HINGE remains at the top of the pile until it is placed with a suitable machine

- a small machine that is being refused work will be offered each of the priority HINGEs in turn, then if there were less than 50 of them, gets (50-N) goes at picking jobs at random. If it gets jobs that fit, fine, if it does not get any in the rest of its 50 prize draws, then it gets the 'There was work but' message and is told to try again in about 5 mins.

- when there are 50 HINGE jobs that have all been prioritised, then the scheduler stops issuing smaller jobs (which is when everyone with small boxes sees the 'There was work but' message for several tries in a row, until some bigger boxes come along and remove the priority pile).

'50' is a project settable limit, so may be some other number.

As you will see, none of those exceptions prevent a small job going to a big machine, even when there is big work waiting.
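
The rules above can be put into a toy model. This is only a sketch of my reading of them, not BOINC's actual scheduler code: the function name `pick_job`, the two lists, and the idea of representing each job simply by its memory requirement in MB are all assumptions made for illustration.

```python
import random

MAX_ATTEMPTS = 50  # project-settable; 50 is just the figure quoted in the post


def pick_job(host_mem_mb, priority_pile, general_pool, rng, max_attempts=MAX_ATTEMPTS):
    """Return the memory requirement (MB) of the job issued, or None.

    priority_pile: jobs already held back from some host; these are tried first.
    general_pool:  all other jobs; these are sampled at random.
    (Hypothetical model: a job is represented only by the MB it needs.)
    """
    attempts = 0
    # Previously rejected jobs are offered first, in order.
    for job in list(priority_pile):
        if attempts >= max_attempts:
            return None
        attempts += 1
        if job <= host_mem_mb:
            priority_pile.remove(job)
            return job
    # Whatever attempts remain are random draws from the general pool.
    while attempts < max_attempts and general_pool:
        attempts += 1
        job = rng.choice(general_pool)
        general_pool.remove(job)
        if job <= host_mem_mb:
            return job
        # A job that did not fit is held back and joins the priority pile.
        priority_pile.append(job)
    return None  # the 'There was work but' case
```

With 50 oversized jobs in `priority_pile`, a 256MB host uses up all its attempts and gets nothing, however many small jobs sit in `general_pool`; a 512MB host takes one job off the pile. With an empty pile, nothing stops the 512MB host from drawing a 200MB job.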

John Keck has already passed back to BOINC the suggestion that it would work better for this project if the work was released from different queues. In fairness this is BOINC's first attempt to address this issue and the best thing about all first attempts is that they show you how to do the second attempt better...

R~~
24) Message boards : Number crunching : System Requirements (Message 37525)
Posted 6 Mar 2007 by River~~
Post:
In 'Recommended System Requirements' memory is showing 256mb for all OS. Wondering if this should be amended so that newcomers are not disappointed.


Most jobs will run in 256, it is just a few that need more, so whatever goes on the page needs to reflect that. Most of the time 256 will continue to be fine.

You are right that it is a good idea to put something on that page to explain that the memory requirements vary. Then newcomers with smaller / older machines don't immediately panic and leave if they are unlucky enough to see the '476Mb needed' message in their first scheduler contact.

What might be useful is a warning on that page that there are a few times a year when the project is running mainly larger work, and that although the project will 'try to' keep some work available for 256Mb, there will be times when the 'normal sized' work runs out, perhaps even for a few days at a time, and especially over weekends etc if the admins are not at work to add more small tasks to the queue.

R~~
25) Message boards : Number crunching : Problems with rosetta 5.48 (Message 37521)
Posted 6 Mar 2007 by River~~
Post:
Don't know if this problem is specific to this Rosetta version.

http://boinc.bakerlab.org/rosetta/result.php?resultid=65982275
http://boinc.bakerlab.org/rosetta/result.php?resultid=65982274
http://boinc.bakerlab.org/rosetta/result.php?resultid=65921927
http://boinc.bakerlab.org/rosetta/result.php?resultid=65759711

These four tasks all failed to get started during a 14min period when the CPU was in use intensively on mathematical calculations. According to top, Rosetta was getting about 5% cpu time but was also being squashed out of memory.

OK so this was not a normal occurrence, but even so Rosetta should wait politely for the higher priority cpu usage to go away and then pick up again nicely.

On the other hand, probably not a critical issue to put a lot of work into resolving.

R~~
26) Message boards : Number crunching : Problems with rosetta 5.48 (Message 37508)
Posted 6 Mar 2007 by River~~
Post:
WAH! I have 2 WUs running at once! Of course, only one of them is a HINGE; the other is a new one:

03/05/2007 8:06:30 PM|rosetta@home|Starting 1xpv_1_NMRREF_1_1xpv_1_idid_model_01IGNORE_THE_REST_idl_1597_2544_0

So maybe there's hope.


Of course, if you don't want to run HINGE you can always reduce your box's memory ;-)
27) Message boards : Number crunching : Problems with rosetta 5.48 (Message 37493)
Posted 5 Mar 2007 by River~~
Post:
My user average has lost a serious amount of points since HINGE started coming through. I keep getting credit and my user total keeps going up, but my user average keeps diving to the bottom of the graph every time a WU completes and credit is granted.

Anyone got any ideas?


Hi Greg,

Are you on BOINC v 5.8.x, or still on v 5.4.x?

How much memory do you have on the relevant box?

The credits given are an average for other people who have run similar work - ie HINGE credits should be adjusted in line with the average cruncher's progress on those tasks.

If (and I am taking a wild guess here, so forgive me if I am wrong on either/both counts) you have less than 400Mb AND you are on the older BOINC client, then you may be running work that is too large for your system to run effectively - the result would be heavy swap file usage, and your system running slower than the typical systems that were used to set the level of credit.

This would not happen on the newer BOINC client, as if you had too little memory you would not have been given the work in the first place.

EDIT - add: On the other hand, with the cpu you have, I bet you have plenty of memory, so I am probably way off target in your case. But I will leave my comments on the board in any case in case they apply to anyone else.

Don't know if that helps.

R~~
28) Message boards : Number crunching : Problems with rosetta 5.48 (Message 37487)
Posted 5 Mar 2007 by River~~
Post:
I appear to be in the minority in thinking that a message in my BOINC Manager telling me Rosetta is running units with more RAM than I have counts as communication -_-


hi Matt,

of course that counts as communication.

Two things are going on in response to this new communication.

Firstly, some people are reacting to the newness of the message, as they recognise that they have never seen it before. They wonder, did I do something or did the project team do something or is it a new feature. In fact, having asked, they discover it is two of those things combined. All of that counts as communication too, clarification in response to the original communication.

And, I think it is totally fair comment that communication would have been even better had there been a short note on the forum, or even on the front page, to the effect that there are some big jobs on the way and that the project team are going to take advantage of the new facilities in BOINC to screen the big jobs away from smaller hosts.

Communication in advance is *even* *more* *effective* than communication that starts at the point of change.

Secondly, some people (including myself) think we are spotting flaws in this new automated screening process. We are pointing out those (alleged) flaws. Whether we turn out to be right, or whether the project or BOINC folk come back and explain why we are mistaken, all of that is communication too.

But I would also add that, while I think more communication up front would have helped here, this is in the context of a project that consistently has the best communication of any BOINC project I've been part of, and far better than the non-BOINC DC projects I have been part of.

So when I say that, on this occasion, communication in advance would have helped, I do not mean it as a complaint. I am offering a pointer to how (IMO) things could have been even more betterer than other projects.

R~~
29) Message boards : Number crunching : there was work but your computer doesn't have enought memory (Message 37484)
Posted 5 Mar 2007 by River~~
Post:
The BOINC server code is not as specific as to distinguish between a Linux system with reduced RAM operating system, and no graphics, from a Windows system with 50MB of extra drivers loaded and full graphics enabled. So the memory requirement pertains to the created task, not to the operating system that happens to download it.


Exactly my point.

But when the task arrives on a box that is asking to display graphics (explicitly or the screensaver) it loads lots of gfx code from the shared libraries (.so on Linux, .dll on Windows). When it arrives on a machine that does not want gfx, it does not load that code. If it tried to on my machine it would fall over (as the Leiden WU do).

So the same task runs in two very different sizes depending where it lands. This is nothing to do with the code used by the OS, it is code selectively loaded by the Rosetta app as required.

Which means the mem requirement is either wrong for a user with graphics, or is wrong for a user without.

This problem must exist on every project that responds properly to the presence/absence of graphics. It is a BOINC issue not a Rosetta issue, but at present it is affecting all of us who don't want graphics. In effect we are being told to provide enough RAM for the graphics we don't want...

R~~
30) Message boards : Number crunching : Problems with Rosetta version 5.46 (Message 37483)
Posted 5 Mar 2007 by River~~
Post:
I got these messages:

01.03.2007 16:38:01|rosetta@home|Restarting task vp10__BOINC_ABRELAX_cterm_hom002__1581_12037_0 using rosetta version 546
01.03.2007 16:38:09|rosetta@home|Task vp10__BOINC_ABRELAX_cterm_hom002__1581_12037_0 exited with zero status but no 'finished' file
01.03.2007 16:38:09|rosetta@home|If this happens repeatedly you may need to reset the project.

What to do???? I run Rosetta on an AMD Duron with 750MHz on WIN 2000.



If it happens rarely, do nothing and don't worry. If that task finishes OK then don't worry.

If it happens repeatedly, especially if it happens repeatedly without finishing any task, then reset the project - to do this from the manager you need to be in the advanced view if you have the new version, then go to the projects tab, highlight Rosetta, and click the button marked 'Reset Project'

This error occurs when timing problems arise between different processes of BOINC & Rosetta. If it happens rarely it probably means you were doing something unusual at the time that slowed the communication down from BOINC to Rosetta. It happens to all of us who actually use our machines for real work as well as letting BOINC run!

Hope that reassures.
River~~

R~~
31) Message boards : Number crunching : The Cost of Power? (Message 37466)
Posted 5 Mar 2007 by River~~
Post:
There's a really good PSU guide here:
http://www.silentpcreview.com/article699-page1.html


...

What it does not do is tell me how efficient any of the supplies is. What I want to see is the AC power draw for each PSU at a standard 80W and 200W DC delivery, or (what is equivalent) the %efficiency at each of these as a number, not as 'pretty good' etc. (%efficiency = 100 x DC Watts drawn by motherboard / AC Watts drawn from wall)

If I am going to use the PSU on a machine that is boot-on-LAN, I also want to know what the PSU's AC power draw is when only powering the LAN components. ...


Hi Danny,

I owe you an apology.

The site you pointed to *does* have all that info, just not on the overview page. By clicking on the review link for each PSU it is all there, the boot-on-LAN figs being shown as +5SB (ie standby).

Thanks for the link and sorry for my initial obtuseness.

River~~
32) Message boards : Number crunching : BOINC thread: Questions re scheduler & v5.8.x (Message 37457)
Posted 5 Mar 2007 by River~~
Post:
Thanks for replying John.

If my memory serves me correctly it is closer to d) none of the above.

1) A clean scheduler will issue results randomly from the available pool.

2) If a result is deemed unsuitable for a host it is prioritized so that work that has failed to be sent to a host will be attempted before a clean result is attempted.

The server will generally attempt about 50 results per RPC. If none are suitable then you get the no work message.

This may be outdated information. It was accurate prior to the implementation of locality scheduling. Locality and HR schedulers work differently.

edit: The design goal is to get bad workunits out of the system. The prioritization is by incrementing a variable called infeasable_count associated with the workunit. Once that reaches a project settable threshold the workunit is automatically cancelled.


This is a counter-productive goal, plausible as it sounds at first sight.

The effect is that when 50 prioritised WU are in the system, work will only be issued to large boxes.

It does have the advantage that large machines are more likely to get large work.

Keeping to the spirit of this approach, a better approach would be that the prioritisation of previously-rejected work is not applied where, earlier in the same RPC, a workunit has already been rejected. That way the scheduler is not setting itself up to fail. The first offering would be prioritised, but if that were too large then the other 49 random offerings would be from the general pool not from the already-too-big sub-pool.
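
To make that suggestion concrete, here is a rough sketch of the modified rule. It is an illustration only and not a patch against the real scheduler: `pick_job_modified`, the list-of-MB job model, and the default of 50 attempts are assumptions, and the bookkeeping that would move newly rejected jobs onto the priority pile is left out.

```python
import random


def pick_job_modified(host_mem_mb, priority_pile, general_pool, rng, max_attempts=50):
    """Suggested variant: only the FIRST offer comes from the priority pile.

    Jobs are modelled as bare memory requirements in MB (hypothetical model).
    Returns the job issued, or None if nothing fitted within max_attempts.
    """
    attempts = 0
    # First offering: the job at the top of the previously-rejected pile.
    if priority_pile:
        attempts += 1
        if priority_pile[0] <= host_mem_mb:
            return priority_pile.pop(0)
    # That one was too big (or the pile was empty): the remaining offers
    # are random draws from the general pool, not from the too-big sub-pool.
    while attempts < max_attempts and general_pool:
        attempts += 1
        job = rng.choice(general_pool)
        if job <= host_mem_mb:
            general_pool.remove(job)
            return job
    return None
```

Under this rule a 256MB host still finds a 200MB job even when 50 oversized jobs are prioritised, while a 512MB host is still handed the prioritised job first.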

I still feel that my (c) would be even better - keeping subpools of work of different sizes and issuing the largest suitable work each time. However that may be more work than the change is worth, and may have knock on effects on other scheduling decisions that I don't know about.

If you think there is any merit in either of these suggestions, please feel free to pass them on to the BOINC lists.

R~~
33) Message boards : Number crunching : there was work but your computer doesn't have enought memory (Message 37456)
Posted 5 Mar 2007 by River~~
Post:
...None of my machines are graphics loaded.. too much work for too little return...


Exactly my strategy.

btw, for Linux users, if you want a more detailed breakdown of memory usage for the machine as a whole, including swapfile, disk cache, etc, try

cat /proc/meminfo

which provides a listing with headings of all kinds of memory related stuff. The Swapfile free is a useful indicator - if it is almost as big as the total swapfile size then your box is not suffering too much.

For the one process, you could try (where [pid] is the process id number from top) cat /proc/[pid]/status
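
If you would rather have the swap check as a single number, something like the following would do it. It is a minimal sketch that assumes the usual "Name:   value kB" layout of /proc/meminfo and the standard SwapTotal and SwapFree fields; the function names are made up for the example.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field name: value in kB}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def swap_free_fraction(info):
    """Fraction of swap still free, or None if no swap is configured."""
    total = info.get("SwapTotal", 0)
    if total == 0:
        return None
    return info.get("SwapFree", 0) / total


def report(path="/proc/meminfo"):
    """Print the swap-free fraction for this machine (Linux only)."""
    with open(path) as f:
        frac = swap_free_fraction(parse_meminfo(f.read()))
    print("no swap configured" if frac is None else f"swap free: {frac:.1%}")
```

Calling `report()` on a Linux box prints the figure; a value close to 100% corresponds to the "not suffering too much" case.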

R~~
34) Message boards : Cafe Rosetta : ALTERNATIVES FOR THOSE WHO CAN'T RUN "HINGE" (Message 37454)
Posted 5 Mar 2007 by River~~
Post:
Hi, everyone. I have been running Rosetta on 2 machines for several months. Suddenly there is no work for my older machine. It seems that this venerable legacy machine (a 4 year old Compaq 2500) is no longer capable of running the new “HINGE” WUs. Not enough RAM. It runs other WUs just fine, if somewhat slowly.

If anyone out there is in the same situation you might consider running malariacontrol.net until the return of more standard WUs. If you tried to join before, but, ran into a notice saying that they were not creating any new accounts be advised that it has reopened to new accounts. Its graphics may be somewhat strange, but, it is fighting a disease that is every bit as dangerous as AIDS to the people of Africa.


hi

Thanks PUDDIN, that is a good suggestion.

For those wanting to keep Rosetta as your main priority, when suitable work for your machine is on offer, this is what to do.

Join Malaria, and set its Resource share to 1, leaving Rosetta at 100.

How it works

In normal circumstances, when suitable work is available on both projects, these settings mean you will run some Malaria work every so often, but run Rosetta most of the time. The timing of downloading Malaria will be adjusted by the client so that you get the 1:100 ratio over a period of months.

However, when Rosetta refuses to send you work at a time when your box is already empty, in order to keep your machine busy the client will download a single Malaria WU. This will only count towards the 1:100 ratio from the time that Rosetta again offers you work (which could be part way through the Malaria work).
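
As a quick check on what 1:100 means over the long run, here is the arithmetic as a small sketch. It only computes the target split; the client's real work-fetch logic is more involved than this, and the function name is invented for the example.

```python
def long_run_fractions(shares):
    """Target fraction of crunching time per project, given resource shares."""
    total = sum(shares.values())
    return {name: share / total for name, share in shares.items()}


fractions = long_run_fractions({"Rosetta": 100, "Malaria": 1})
# Malaria's target is 1/101 of the time, i.e. just under 1%.
print({name: f"{frac:.2%}" for name, frac in fractions.items()})
```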

It works for me

I use this trick the other way round. My major project is LHC, which only has work rarely. Rosetta is the fill-in project for me, but as you can see from my stats, the 'fill-in' actually runs more than LHC, purely because of availability of work on LHC. I find that when work is available on LHC my boxes are running it again within 4hrs, and I can leave the settings the same whether work is available on LHC or not.

River~~
35) Message boards : Rosetta@home Science : Rosetta@Home mentioned at CES 2007 by INTEL (Message 37423)
Posted 4 Mar 2007 by River~~
Post:
Well that's blown his handle.

Maybe he will change it now to Oh-Thats-Who!

;-)

Um, yeah, he is...

it's one of Who?'s machines - there's a link to the discussion about it here in the number crunching thread somewhere...


According to INTEL, it was a machine built *by them* specifically to show off an 8 core machine using 'off the shelf' parts just FOR CES 2007.

Unless Who? is actually Francois Piednoel of INTEL's Benchmarking team, then I doubt it's one of his.


36) Message boards : Number crunching : there was work but your computer doesn't have enought memory (Message 37422)
Posted 4 Mar 2007 by River~~
Post:
--

BTW... Only the HINGE_* WU's are set to 478MB required... They are actually using 390MB... so this is not a BOINC issue, but, an issue of a parameter set for this particular class of WU....




I think this supports my view, or at least might do.

I notice that all your boxes are Linuces of various flavours. My first guess is not that the Rosetta folk got the numbers wrong, but that they have fed in the numbers for the majority platform, Win-XP, and that BOINC offers no way to make the Linux offering 88MB smaller.

Also, as a matter of interest, do you have graphics enabled on the box that shows a 390MB usage (eg Gnome/KDE plus X, etc)? Did you look at the size only before displaying graphics, or both during and after displaying graphics?

This is relevant because my second guess is that the size reflects a user who views the Rosetta graphics, and clearly memory will be saved when this is not done, because the relevant .so or .dll libraries will not be loaded.

Finally, it is however reassuring to know that other work units are not so big.

River~~

R~~
37) Message boards : Number crunching : Problems with rosetta 5.48 (Message 37414)
Posted 4 Mar 2007 by River~~
Post:
Have 512mb on a 1.2ghz cpu with w2k. I can no longer get work units after finishing the last one.

3/4/2007 12:16:54 AM|rosetta@home|Message from server: Your preferences limit memory usage to 460.34MB, and a job requires 476.84MB
3/4/2007 12:16:54 AM|rosetta@home|Message from server: No work sent
3/4/2007 12:16:54 AM|rosetta@home|Message from server: (there was work but your computer doesn't have enough memory)

Change preferences to 95%, change pagefile to 2gb, still can not get any work.


Your prefs at 95% would be enough for 477Mb, 0.95 x 512 = 486 (now there's a nostalgic number!) so I am guessing that you have shared memory with your video card, yes? If not, I am totally confuzdled.

If so, you could go into the BIOS and reduce graphics memory to the smallest setting and see if that helped. Or possibly (used to be the case, don't know if it is still true with modern cards) there might be a program that came with the card to adjust the amount of video memory.

If BOINC's 477Mb is 477 x 1024 x 1024 then you need to get the video down to around 35Mb. If BOINC's 477 is 477 x 1000 x 1000, then that corresponds to about 454 x 1024 x 1024, and you need a video allocation < 58 Mb, or maybe 56Mb depending on rounding, etc. So either way a setting of 32Mb would show whether that is the way forward.
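
For anyone who wants to check the sums, here they are as a short sketch. It assumes 512 units of 1024 x 1024 bytes installed and, as in the paragraph above, ignores the percentage preference; the helper name is made up for the example.

```python
MIB = 1024 * 1024  # binary megabyte
MB = 1000 * 1000   # decimal megabyte


def video_headroom_mib(total_ram_mib, required, unit_bytes):
    """Largest shared-video allocation (in MiB) that still leaves `required`
    units of `unit_bytes` bytes of main memory."""
    required_mib = required * unit_bytes / MIB
    return total_ram_mib - required_mib


# Case 1: BOINC's 477 means 477 x 1024 x 1024 bytes.
binary_case = video_headroom_mib(512, 477, MIB)   # 35.0
# Case 2: BOINC's 477 means 477 x 1000 x 1000 bytes (about 454.9 MiB).
decimal_case = video_headroom_mib(512, 477, MB)   # roughly 57.1

# The 95% preference on its own: 0.95 x 512 is about 486.
pref_limit = 0.95 * 512

# Side observation: the 476.84MB in the server message, read as binary
# megabytes, comes to almost exactly 500,000,000 bytes.
job_bytes = 476.84 * MIB

print(binary_case, round(decimal_case, 1), round(pref_limit), round(job_bytes / 1e6))
```

Either way the video allocation has to come in under the smaller of the two figures to be safe, which is why a 32Mb setting makes a reasonable test.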

Remember, if you change any settings, to note what they are now, so that you can get back to where you are now if the effect on your system is too horrible...

R~~
38) Message boards : Number crunching : 咋没人用中文发帖子~ (Why is nobody posting in Chinese?) (Message 37404)
Posted 4 Mar 2007 by River~~
Post:
!!!!!!!!!!!

R~~
39) Message boards : Number crunching : BOINC thread: Questions re scheduler & v5.8.x (Message 37403)
Posted 4 Mar 2007 by River~~
Post:
Q1

I am not clear just what the scheduler does when it refuses to give me work because of not having enough memory, in the case where there is a variety of sizes of work waiting to be issued.

Does it

a) issue jobs strictly from a single queue, (so that I am refused work if the next job happens to be too big for my machine, and so that a big machine might get given a small-memory job)

b) try to issue the next job in the queue, but look for a smaller one if need be (so that I get work if there is any that fits, but so that a big machine might still be given a small job)

c) issue the largest job that will fit in my machine (so that big machines do not take all the small jobs leaving the small machines without work)

Clearly from a user perspective (c) is what we want, and if not (c) then at least (b). My impression (on insufficient evidence to be certain as yet) is that the scheduler is operating (a).
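
To pin down the difference between the three options, here they are as toy functions. This is purely illustrative: jobs are modelled as bare memory requirements in MB held in a Python list, and none of the names come from BOINC.

```python
def policy_a(host_mem_mb, queue):
    """(a) Strict single queue: offer only the head job, refuse if it is too big."""
    if queue and queue[0] <= host_mem_mb:
        return queue.pop(0)
    return None


def policy_b(host_mem_mb, queue):
    """(b) Try the head of the queue, then fall back to the next job that fits."""
    for i, job in enumerate(queue):
        if job <= host_mem_mb:
            return queue.pop(i)
    return None


def policy_c(host_mem_mb, queue):
    """(c) Issue the largest job that fits, leaving small jobs for small hosts."""
    fitting = [job for job in queue if job <= host_mem_mb]
    if not fitting:
        return None
    best = max(fitting)
    queue.remove(best)
    return best
```

With a queue of `[477, 200, 300]`, a 256MB host gets nothing under (a) but gets the 200 under (b) or (c). With `[200, 477]`, a 512MB host takes the 200 under (a) and (b) but the 477 under (c).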

Q2.

Is it possible for users & projects to set things up so that a job asks for less memory when there are no graphics needed? (It could be that the user is running a Linux command-line only box, like me, or that the user is running a Windows box in service mode, no shared graphics.)

My suspicion is that I am being denied work that my box could run perfectly well, simply because I do not have enough RAM to run graphics, but that is not an issue because I don't want to run graphics anyway.

Does anyone know how the new code works and/or what the design intentions were?
River~~
40) Message boards : Number crunching : there was work but your computer doesn't have enought memory (Message 37401)
Posted 4 Mar 2007 by River~~
Post:
Rosetta is one of the projects with the biggest memory requirements.


Very true, but how much of this is needed by the science and how much by the graphics?

I have never seen much swapfile usage on Rosetta, even on my boxes that have only 256Mb mem, but no graphics. If I am right (and, for example, if the programmers have not recently added 100% to the size of the code) then the scheduler is preventing me from downloading work which would actually run without problems.

For example right now I have a CPDN and a Rosetta both sharing just 256M RAM, and the swapfile usage is only up to 2M. (Linux, no GUI, no graphics, not even X)

R~~





©2024 University of Washington
https://www.bakerlab.org