Posts by Darren

1) Message boards : Number crunching : Report Problems with Rosetta Version 5.12 (Message 16141)
Posted 13 May 2006 by Darren
Post:
I've also got an error with 5.12 for "exceeded disk limit". This is on a Gentoo Linux system, so the windows debug code shouldn't be the problem.

Sat May 13 00:41:08 2006|rosetta@home|Aborting result JUMP_CLOSE_CHAINBREAK_ALLBARCODE_1q7sA_SAVE_ALL_OUT_493_16783_0: exceeded disk limit: 103092700.000000 > 100000000.000000
Sat May 13 00:41:08 2006|rosetta@home|Unrecoverable error for result JUMP_CLOSE_CHAINBREAK_ALLBARCODE_1q7sA_SAVE_ALL_OUT_493_16783_0 (Maximum disk usage exceeded)
Sat May 13 00:41:09 2006||request_reschedule_cpus: process exited
Sat May 13 00:41:09 2006|rosetta@home|Computation for result JUMP_CLOSE_CHAINBREAK_ALLBARCODE_1q7sA_SAVE_ALL_OUT_493_16783_0 finished


It's this work unit, which is my only 5.12 work unit - the next is 5.13.


2) Message boards : Number crunching : Problem with upload/download speed settings (Message 13005)
Posted 3 Apr 2006 by Darren
Post:
Does Rosetta for some reason not honor the speed limits set in boinc?

My internet connection is by cellular wireless service, which has a max speed capability of about 130 kilobits sustained and bursts of about 200 kilobits. However, a problem arises on prolonged file transfers at near-maximum sustained throughput - that problem being that the provider breaks the connection because it's overtaxing their system.

I always have to fight with my connection (suspend network during transfers and reconnect my internet several times) during rosetta downloads, as it doesn't seem to want to honor my speed settings. I have scaled this speed all the way down to 5 kilobytes (which should be ~40 kilobits, and well under what my provider will allow me to sustain - the provider says it will drop if I sustain more than ~120 kilobits).

My settings in global_prefs.xml read:
<max_bytes_sec_down>5000</max_bytes_sec_down>
<max_bytes_sec_up>5000</max_bytes_sec_up>


However, my averages from client_state.xml show:

<net_stats>
    <bwup>4863.07</bwup>
    <bwdown>18008.8</bwdown>


And from my most recent download today, I have log entries like:

2006-04-03 14:38:36 [rosetta@home] Started download of cc1a68_03_05.200_v1_3.gz
2006-04-03 14:39:24 [rosetta@home] Finished download of cc256bA09_05.200_v1_3.gz
2006-04-03 14:39:24 [rosetta@home] Throughput 9644 bytes/sec

and

2006-04-03 14:46:51 [rosetta@home] Started download of cc1a68_09_05.200_v1_3.gz
2006-04-03 14:48:48 [rosetta@home] Finished download of cc1a68_09_05.200_v1_3.gz
2006-04-03 14:48:48 [rosetta@home] Throughput 15357 bytes/sec


Obviously, these are being sent faster than my setting should allow, and are causing my provider to drop my connection (repeatedly). The small files are never a problem as they don't sustain long enough to cause a drop, but every time I get to one of the 1MB+ files, they usually sustain long enough to get me dropped a few times on each one. Occasionally, one like above will be just under the point of getting me dropped, but that's very rare.
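For anyone who wants to check the numbers, here is a minimal sketch comparing the logged throughput against the configured cap. The figures are copied from the settings and log lines above; the function names are made up for illustration, and it assumes 1 kilobit = 1000 bits.

```python
# Values copied from the post: the configured cap and the logged throughputs.
LIMIT_BYTES_PER_SEC = 5000               # <max_bytes_sec_down> in global_prefs.xml
OBSERVED_BYTES_PER_SEC = [9644, 15357]   # "Throughput" lines in the log


def to_kilobits(bytes_per_sec: float) -> float:
    """Convert bytes/sec to kilobits/sec (assumes 1 kilobit = 1000 bits)."""
    return bytes_per_sec * 8 / 1000


def over_limit_factor(observed: float, limit: float) -> float:
    """How many times faster than the configured cap a transfer ran."""
    return observed / limit


if __name__ == "__main__":
    print(f"cap: {to_kilobits(LIMIT_BYTES_PER_SEC):.1f} kbit/s")
    for rate in OBSERVED_BYTES_PER_SEC:
        print(f"{rate} B/s = {to_kilobits(rate):.1f} kbit/s, "
              f"{over_limit_factor(rate, LIMIT_BYTES_PER_SEC):.2f}x the cap")
```

Both logged transfers ran well above the 5000 bytes/sec setting, which is the whole complaint.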

In contrast, Einstein's 8MB files don't get me kicked off my connection, as they transfer very slowly. Is it that no projects honor the speed limit and it just happens that Einstein's servers are just that much slower with the data anyway, or is it that for some reason only Rosetta isn't honoring the speed limit?


3) Message boards : Number crunching : Any objections to reducing the maximum run time to 12-16 hours? (Message 12484)
Posted 22 Mar 2006 by Darren
Post:
So the question is: are there any objections to reducing the maximum time to 12 hours. Unfortunately, this is a work unit level parameter, and cannot be changed by the user.


I'm one of those (probably few) who use long runtimes. I would just ask that you keep us updated as to the time frame - specifically, let us know when the option for long runtimes is back. My preference for it is a little different from that of the dial-up users in that my internet connection is by cellular modem. The amount of data transferred isn't a problem, but rosetta for some reason gives me much more grief than other projects when there are gaps in the data flow. If a particular file times out on other projects it starts right back up 60 seconds later and the work unit runs fine after it all finally gets here. When any individual file in the download times out with rosetta, the download always starts back up but once it all gets here the work unit immediately reports a download error.

This requires me to "babysit" the connection and manually suspend network access during any data flow gaps while rosetta is downloading work. Oddly, if I manually suspend it, it picks right back up without causing an error - but if I let it timeout and resume it doesn't work. With that, I'll probably just suspend rosetta unless I know I'll be home long enough to babysit the connection through a few downloads - thus my request that you be sure to let us know when the option is available again.

Thanks.


4) Message boards : Number crunching : unnecessary earliest-deadline-first scheduling (Message 12047)
Posted 15 Mar 2006 by Darren
Post:
I *am* agreeing to what you've said -- but did you note my "minor correction"?


Yes, I did, that's why I said "Since it occurred before that manual connection, it's something a bit more complex than that, though."

But from the info that's available - nothing really more than the host info showing your work units download and report times - it will probably never be possible to figure out what triggered it. If you can go back far enough in your local logs to see it, there may be more of an indication there, but still not likely.

It could have been any of a few different things. It could have been something so simple as the benchmarks running and getting a slightly lower benchmark than before, at which point the client will assume - based on those new benchmarks - that the work units will take longer. Remember that the boinc client doesn't know how rosetta works with the defined time, so if the benchmarks change the client will assume the run times will also change. After a couple more work units finish, the DCF will have adjusted and even with a new benchmark your estimated run times (as far as the boinc client is concerned) will be back to 10 hours, though as far as rosetta is concerned they will never have left 10 hours.

If it's not happening every time you get new work, it's nothing to worry about and is probably going to be completely untraceable as to the cause. If it happens every time you get new work, then watching it closely to see what is changing as it goes into and out of EDF will help narrow it down.

5) Message boards : Number crunching : unnecessary earliest-deadline-first scheduling (Message 12038)
Posted 15 Mar 2006 by Darren
Post:
On another thread, it was said: 'if a project deadline is less than double the cache size, the box will run in EDF mode'.


That is correct, and that is what I thought had happened in your case as the manual connect you made 2 days after the prior connection would have "reset" the time frame for determining when the 6 day connection setting would apply - basically making boinc think it had to fit 6 + 6 (for 2 normal connections) days worth of work into 2 + 6 (the manual connection plus the next normal connection) days on the calendar. Since it occurred before that manual connection, it's something a bit more complex than that, though.

Working like it should, if the due date isn't at least 2 normal connection times away, it will always think it has to do all the work before the next connection - thus the "deadline must be at least double the connection time (or cache size, if you prefer)" guidelines.
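The guideline can be written down as a one-line check. This is only a sketch of the rule of thumb as stated here, not the actual boinc scheduler code, and the function name is hypothetical.

```python
def likely_edf(days_until_deadline: float, connect_interval_days: float) -> bool:
    """Rule of thumb from the thread: expect EDF mode when the deadline is
    less than double the "connect every" interval (a simplification, not
    the real scheduler logic)."""
    return days_until_deadline < 2 * connect_interval_days


if __name__ == "__main__":
    # Figures from the original report: deadline about 10 days away,
    # "connect every" set to 6 days.
    print(likely_edf(days_until_deadline=10, connect_interval_days=6))
```

With a 6 day interval, anything due in under 12 days trips the check, which fits a deadline about 10 days out.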


And yes the crashed WUs would affect a change in the expected run time but it should make it shorter


In the case of a 1 minute crashed work unit, the change would be very small - but enough of them followed by a quick reconnect before it adjusts back up could cause an extra work unit to download.

Just FYI for anyone who's interested, any work unit that runs less than 10% of the estimated run time only adjusts the estimated run time of the remaining work units by 1% of the run time difference. If the normal run time is 10 hours (600 minutes), a work unit that crashes in 1 minute would only reduce the estimated run time by 6 minutes (5.99 minutes to be exact) as the difference in what it expected and what happened was 599 minutes and the total run time was less than 10% of the expected run time.

If the run time is between 10% and 100% of the expected run time, the dcf adjusts the run time by 50% of the difference. So if a 600 minute work unit for some reason only ran 300 minutes, it would adjust the estimated run time down 150 minutes (50% of the 300 minute difference between what it expected and what happened).

In short, all that just means that a work unit that crashes quickly will have only an absolutely minimal effect. One that runs for a reasonable time then crashes would have a much more dramatic effect. This 10% factor was put in specifically so corrupt work units wouldn't cause a drastic change in estimated run times for the remaining work units.
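Here is a small sketch of the two cases described above, using the same 600 minute example. It only encodes the rule as I've described it (1% below the 10% mark, 50% from there up to the estimate); it is not lifted from the client source, and the function name is made up.

```python
def adjusted_estimate(expected_min: float, actual_min: float) -> float:
    """New run-time estimate after a work unit finishes early.

    Encodes only the two cases described in the post:
      - ran less than 10% of the estimate: move 1% of the difference
      - ran 10% to 100% of the estimate:   move 50% of the difference
    Longer-than-expected runs are not covered by the post, so they are rejected.
    """
    if actual_min > expected_min:
        raise ValueError("runs longer than the estimate are not covered here")
    difference = expected_min - actual_min
    factor = 0.01 if actual_min < 0.10 * expected_min else 0.50
    return expected_min - factor * difference


if __name__ == "__main__":
    # 1 minute crash: estimate drops by 1% of 599 minutes.
    print(round(600 - adjusted_estimate(600, 1), 2))
    # 300 minute run: estimate drops by 50% of 300 minutes.
    print(600 - adjusted_estimate(600, 300))
```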


6) Message boards : Number crunching : unnecessary earliest-deadline-first scheduling (Message 12019)
Posted 14 Mar 2006 by Darren
Post:
Rosetta is the only BOINC project on this system, which runs off-line (except for occasional connects). My preferences specify the CPU time per WU as 10 hours, and the interval between connects as 6 days.

On Mar 12 I did a download to "refresh" the queue of ready WUs at my system. When the currently-running WU completed, the next WU failed in less than one minute with code 1 (its problem is known, and has been solved by now). Two WUs then completed normally; the next WU after that again got code 1 (same problem). HOWEVER, after the 'finished' message for this failed WU, there now was the message "Resuming round-robin CPU scheduling". There had been *no* previous messages about scheduling methods.

The next WU completed normally. HOWEVER, __eight hours__ into the processing of the WU after that (the WU eventually completed normally), there was the message "Using earliest-deadline-first scheduling because computer is overcommitted".

Note that there were about 100 hours of work in the queue, with the earliest expiration-deadline about 10 days away. Plus the system had been processing 10 hour CPU time WUs for weeks now. There was NO REASON for the client to say the computer was overcommitted.


<Today I was again able to do a download to "refresh" the queue of ready WUs at the system, *despite* the last (spurious?) scheduling method message having told me the computer was 'overcommitted'.>


This is just one of those weird little oddities that boinc has. Keep in mind that when you give it a "connect every" setting, it assumes that it will actually be restricted exactly to that time frame.

In your case, with a 6 day connect setting, once boinc has gone a couple days into the queue, if it is allowed to actually reconnect again it will determine that it must finish everything that is due in less than about 10 days before the next connection that will occur in 6 more days - because it assumes that after the next connection it won't be allowed to connect again until 6 days later (and after the expiration of some of the work units).

I see from your results that 2 of the work units downloaded on the 12th were reported on the 14th. At that point, boinc assumes 6 days till the next connection (20th) - then 6 days after that for another (26th). Since that connection was made after the time of day for the workunits due on the 26th, boinc would assume that all work units due on the 26th would now have to be finished before the 20th - otherwise the connection (based on a strict 6 day connect interval) on the 26th would have occurred about an hour too late to report those work units due on the 26th.
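The date arithmetic can be sketched like this. The dates (14th, 20th, 26th) and the 6 day interval come from the explanation above; the clock times are hypothetical, picked only so the connection falls an hour after the deadline's time of day, and the function names are made up.

```python
from datetime import datetime, timedelta


def assumed_connections(last_connect: datetime, interval_days: int, count: int = 2):
    """Connection times boinc would assume under a strict "connect every" interval."""
    return [last_connect + timedelta(days=interval_days * i)
            for i in range(1, count + 1)]


def latest_usable_connection(deadline: datetime, connections):
    """Last assumed connection that still happens on or before the deadline."""
    usable = [c for c in connections if c <= deadline]
    return max(usable) if usable else None


if __name__ == "__main__":
    last_connect = datetime(2006, 3, 14, 13, 0)   # time of day is hypothetical
    deadline = datetime(2006, 3, 26, 12, 0)       # time of day is hypothetical
    connections = assumed_connections(last_connect, interval_days=6)
    print(connections)
    print(latest_usable_connection(deadline, connections))
```

Because the assumed connection on the 26th lands after the deadline, the only one that counts is the 20th, so everything due on the 26th has to be done by then.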

The message will have no real-world effect with the host only attached to one project as everything will keep processing just like it normally would have anyway.

7) Message boards : Number crunching : WUs that die in 30 seconds.. (Message 11964)
Posted 13 Mar 2006 by Darren
Post:
Did any of the HOMSdt_homDB WUs get processed successfully?


I have this HOMSdt_homDB005_1dtj__352_78 that processed successfully.

8) Message boards : Number crunching : "Concerned my P.C. isnot well" (Message 11960)
Posted 12 Mar 2006 by Darren
Post:
As it stands I only can crunch for 2 experiments now.Somehow I meesed-up the O.V. and Boinc crashe several times.Now to start Boinc,I must go to home.
"Conquerer" and hold down CNTL click the run Client go to tools and execute
the shell.Then I have to repeat the same to get the Boinc Manager operational.


As well as I can follow what you're having to do to start it, I have to say I'm not sure what would cause that. How were you starting it before? Were you clicking on run_client or were you opening a shell first and typing it in? What does it do when you try to start it the way you used to - does it give you any messages or does it just not start?


All my Malaria w/u failed,but, Einstein and Rosetta still work and am getting better Credits than ever.I may have to do a Complete install of this O.S. Unless you can Delete all Boinc files and reaaply them????


Your benchmarks look better now, so whatever you did there helped a lot. Your floating point went from 378 to 585, and your integer went from 1001 to 1637. That looks much more in line with what other systems like yours seem to report. And that will be why your credits have gone up. They seem to have gone from less than 6 credits for 2 hours to just over 9 credits - that looks like a decent improvement for your cpu.
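As a rough check on those credit figures, here is a sketch assuming the usual benchmark-based claim: 100 credits per day of CPU time on a machine that benchmarks 1000 on both tests, scaled by the average of the two benchmarks. The formula is an assumption, not something stated in this thread, but it does reproduce the numbers above. The function name is made up.

```python
def claimed_credit(whetstone: float, dhrystone: float, cpu_hours: float) -> float:
    """Benchmark-based credit claim (assumed formula, see lead-in):
    100 credits per CPU-day for a host averaging 1000 on both benchmarks."""
    average_benchmark = (whetstone + dhrystone) / 2
    cpu_days = cpu_hours / 24
    return 100 * cpu_days * average_benchmark / 1000


if __name__ == "__main__":
    # Benchmarks quoted in the post, for a 2 hour work unit.
    print(round(claimed_credit(378, 1001, 2), 2))   # old benchmarks
    print(round(claimed_credit(585, 1637, 2), 2))   # new benchmarks
```

The old benchmarks come out a bit under 6 credits and the new ones a bit over 9, matching what the results show.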

As for the malaria work units failing, that's one of the few projects I've never joined so I can't offer any suggestions on that one short of letting it download a new malaria executable. The best way to do that is to do a project reset on malaria - don't detach, just do the reset. Or if you prefer you can delete the files in the project folder for malaria in your boinc directory - then it will have to download new ones. If you do the project reset, it will keep your client_state file clean as you go whereas deleting the files will trigger an error in boinc which it will then correct by downloading new ones.


9) Message boards : Number crunching : resource allotment (Message 11949)
Posted 12 Mar 2006 by Darren
Post:
if i am running more than one boinc project, how do i allot the amount of resources i want each project to receive? ie, rosetta gets 66 percent and seti 33 percent, for example. thanks, paul.


You can adjust it in your account settings. If you go into your account then click on "rosetta@home preferences" you'll see a resource share setting. You'll have to define the share for each project at that project's site to get the balance you want. Also, be aware that what you set will be interpreted as a proportion rather than a percentage, meaning if you set 1 project at 50 and another at 450, it will result in 10%/90%.
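A quick sketch of how shares turn into percentages, using the 50/450 example above. The function name and project labels are just for illustration.

```python
def share_percentages(shares: dict[str, float]) -> dict[str, float]:
    """Turn per-project resource share values into percentages of the total."""
    total = sum(shares.values())
    return {name: 100 * value / total for name, value in shares.items()}


if __name__ == "__main__":
    # The example from the post: 50 and 450 work out to 10% and 90%.
    print(share_percentages({"project_a": 50, "project_b": 450}))
    # For roughly two thirds / one third, any 2:1 pair of values works.
    print(share_percentages({"rosetta": 200, "seti": 100}))
```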

10) Message boards : Number crunching : Result was reported too late to validate ???????????? (Message 11947)
Posted 12 Mar 2006 by Darren
Post:
My apologies if I misunderstood your original statement, but I took it the same way "Scribe" took it.


And I apologize for not being clear enough. I thought I had made it very clear that I don't like the way boinc addresses the issue (by implementing redundancy) either, but that it does serve a purpose, so an alternative should be used by any project that chooses not to use redundancy. I even said that I thought projects should not use redundancy if not scientifically necessary, so I am kind of baffled how anyone could interpret what I said to mean projects should do exactly as boinc/seti does and are somehow "violating" some concept if they don't.


But the idea that they will somehow create some sort of fair "leveling" process and remove credit from people on a wide scale would be more repugnant to many than the present situation. Imagine the outcry from people who have credits reduced, through such a process.


I think any outcry would be surprisingly small. Any rosetta participant who also participates in any other project is already fully accustomed to the fact that the amount of credit they ask for and the amount of credit they get can be different. People have complained about the benchmark variations long and loud all the way back to when boinc was still in beta. If rosetta implemented a "leveling" process that corrected totally out of line benchmarks rather than the normal throw-out-the-high-and-low-claim leveling process other projects now use, I think rosetta would be considered a hero, not a villain.


Moreover this is not as trivial a process as you imply. There would have to be testing, benchmarking, code preparation, and the increase load on the servers would impact the operation of the project.


My original suggestion of averaging is not so complex. It would require running against the database, which is why I suggested it be run once per day or so. If, for example, a script extracted the benchmarks of all the p4 2.8 GHz hosts then averaged those results, you now have a starting point. The project then determines how much a benchmark can reasonably and legitimately vary from the average of all similar systems - then applies that as the maximum and minimum for that cpu. This would have to be done for each type and speed of cpu, but it would only have to be done once. And one script (ok, it has to be written, but it's not a complex script) running one time could get all the averages and determine the high and low variation tolerance numbers. Those numbers are then used to write another quite simple script to run at the project defined intervals and look for benchmarks that are out of range. If it finds any - too high or too low - it recalculates the credit since its last run using the project-defined maximum or minimum benchmark for that cpu. For that matter, granting of credit could even be held until after the script has ok'd the benchmarks - it's not like people aren't already used to waiting days for credit to be issued anyway on projects that use redundancy. Of course, any process that defines an acceptable range doesn't totally eliminate the ability to manipulate, but it does cap just how far any manipulation can take you.
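To show the kind of script I mean, here is a minimal sketch. Everything in it is an assumption for illustration: the sample benchmark values, the 25% tolerance, the function names, and the idea that credit scales in direct proportion to a single benchmark score. A real version would pull the benchmarks from the project database per cpu type.

```python
from statistics import mean


def allowed_range(benchmarks, tolerance):
    """Min and max acceptable benchmark for one cpu type, as a fraction
    either side of the average of all similar hosts."""
    average = mean(benchmarks)
    return average * (1 - tolerance), average * (1 + tolerance)


def clamp(value, low, high):
    """Pull an out-of-range benchmark back to the nearest allowed bound."""
    return max(low, min(value, high))


def corrected_credit(claimed, reported_benchmark, low, high):
    """Recalculate a claim as if the host had reported the clamped benchmark
    (assumes credit is proportional to the benchmark)."""
    return claimed * clamp(reported_benchmark, low, high) / reported_benchmark


if __name__ == "__main__":
    similar_hosts = [1250, 1300, 1350, 1400]   # hypothetical p4 2.8 GHz benchmarks
    low, high = allowed_range(similar_hosts, tolerance=0.25)   # hypothetical tolerance
    print(low, high)
    # A host reporting far above the range gets its claim scaled down.
    print(corrected_credit(claimed=100, reported_benchmark=6000, low=low, high=high))
    # A host inside the range is left alone.
    print(corrected_credit(claimed=100, reported_benchmark=1300, low=low, high=high))
```

The first function is the one-time pass that sets the limits; the other two are what the periodic script would apply to each host it finds out of range.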


There are over 40,000 users attached to the project, less than 100 individual people have raised this issue, and some of those raise it often and loudly. When it is raised the project looks at the problem and takes action if it is appropriate, but the forums give a false impression of the actual size of the problem. If people are cheating it stands to reason that they would appear among the top individual systems. But even that is not easy to say when a high RAC number can be created by infrequent results uploads, and high credit claims can be made by identical systems that for a number of reasons have different but legitimate benchmarks.


I learned long ago not to form opinions based on forum rants. The only reason I joined in this discussion here is because I can look at the list of top computers myself and see for myself what I'm talking about. At the time that I'm writing this, the top computer has the following details:

CPU type GenuineIntel Intel(R) Pentium(R) 4 CPU 2.80GHz
Number of CPUs 2
Operating System Microsoft Windows XP Professional Edition, Service Pack 2, (05.01.2600.00)
Memory 502.98 MB
Cache 976.56 KB
Measured floating point speed 6346.35 million ops/sec
Measured integer speed 13034.85 million ops/sec


Now, as you know I'm no computer expert, but even I know that there is no legitimate way the benchmarking method boinc uses could produce those benchmarks on a p4 2.8 GHz system. Not that I'm saying this person did anything to intentionally cheat (boinc itself with no user intervention has done weirder things all by itself before), but the benchmarks are still clearly wrong. If a script ran every day or so and adjusted those to an acceptable maximum, they may still be wrong - but not as wrong.

Who could complain about that? The legitimate user with totally out of whack benchmarks will not care that they were corrected. I see no way anyone could complain about a fair crediting system unless they're not playing fair themselves.


The answer to this problem is the proposed flops counting system being incorporated into BOINC right now, not some arbitrary credit value assigned to each work unit and imposed by a single project. That would be no different that simply saying one work unit equals one credit. The only way to get the flops system in place, is to work with the BOINC team to do it, and the only way to get that done is to demonstrate a real world need on one of the projects to have it. That is precisely what Rosetta has done, and is doing.


I give rosetta full credit on that count. However, I also know from being a seti beta tester that there are some current flaws with that concept that are just as big as the existing flaws. Granted, it's still not ready for general release so there is some time to work some of that out. As has been pointed out over there though, it doesn't seem to really be fixing the problem - it just buries it a little deeper and makes it even harder to find.

Anyway, I think all of my views here are pretty well known. Not that anyone listens or cares, but at least I feel better having screamed about it for a while. So, I'll just slip back into obscurity now.


11) Message boards : Number crunching : Result was reported too late to validate ???????????? (Message 11930)
Posted 12 Mar 2006 by Darren
Post:
Darren said -

Your position that Rosetta has somehow violated the intent of BOINC by requesting improvements to the Client, is just not correct.


I don't think I have in any way implied such a position.



but Darren said earlier -

...Again, only half credit here. Boinc is designed to operate a certain way, it is Rosetta that does not conform to that standard........


Which says to me that Darren did imply that Rosetta had violated the intent of Boinc!


And what does the rest of that very paragraph say:

Again, only half credit here. Boinc is designed to operate a certain way, it is Rosetta that does not conform to that standard. Now, I fully hope that boinc is successfully modified to support everything Rosetta wants (and any other project that wants to use boinc), but this adamant "it's all boinc and no rosetta to blame" stance is really astounding. As it is rosetta that is deviating from the boinc norm, rosetta does have some obligation in addressing the issue.


I never said they can't or shouldn't deviate from the boinc norm - just that they are deviating from the norm. But they shouldn't simply discard functionality because they don't like the process that generates that functionality. If they don't like the process, they should implement a local alternative, not simply ignore it.

What I said and what you're implying I said are two very different things.



12) Message boards : Number crunching : Result was reported too late to validate ???????????? (Message 11928)
Posted 12 Mar 2006 by Darren
Post:
Most people who look at this objectively would agree that redundancy, while a nice patch for a design flaw, is not the best solution for the validation issue if the only purpose for using it is credit granting. If I read you correctly, you see it that way as well. In certain cases redundancy is required to validate the quality of the science, but Rosetta is not one of those cases. While a number of projects do use redundancy, it is a significant waste of resources unless it serves the project.


That is my exact take on redundancy. On some projects it serves a scientific need and should exist. On most it doesn't and should not exist. That is why I made it perfectly clear that I do not think rosetta should implement redundancy.


Your position that Rosetta has somehow violated the intent of BOINC by requesting improvements to the Client, is just not correct.


I don't think I have in any way implied such a position.


You have correctly pointed out that BOINC redundancy was instituted to solve known weaknesses in the BOINC credit system. So in truth BOINC is forcing the projects to give up between 1/2 and 3/4 of the available project computing power to fix a problem in the BOINC infrastructure and design concept.


But as everyone knows, you're not forced to use the redundancy concept that boinc has integrated. However - and this is a big however - when you decide NOT to use it, you should have a viable alternative in its place. Right or wrong, good or bad - it does serve a purpose. Considering that, it shouldn't simply be discarded unless an alternative is put in its place. Your database is under your control, not Berkeley's. When rosetta decided not to implement the boinc solution at the boinc level, an alternative solution should have been put in place at the rosetta level. That's not to say that suggestions (very forceful ones if necessary) shouldn't be presented to Berkeley to improve the boinc program overall, but until that happens it's still rosetta that chose no redundancy, therefore it's still rosetta's responsibility to have a (hopefully temporary) alternative in place.


If we are to expect no changes in BOINC that are driven by the needs of the projects, then all of the projects should conform to the original SETI standard when BOINC was first released. Instead, most people recognize that BOINC itself is actually in a state of development. ... Can anyone honestly say that BOINC should NOT be improved? Is it so perfect in concept and current implementation that change is unnecessary?


I'm not a boinc cheerleader - I think boinc (the program, not the concept) still needs a lot of improvement. In my mind, though, that still doesn't equate to a project just ignoring one of the "less-than-desirable" aspects of boinc without somehow compensating for the lost function that is introduced by ignoring that bad aspect (in this case, fair credit). Again, it goes back to right or wrong, good or bad, that less-than-desirable aspect does still serve a purpose.


Frankly, the new time feature in Rosetta is a quite elegant and simple solution to a range of problems beyond simple bandwidth reduction that face all BOINC projects. For example how about allowing slower machines to run the projects and still meet the reporting deadlines.


While my machine is not overly slow, this one thing was the sole trigger for me deciding to allocate much more of my machine to rosetta. I currently run them for 48 hours, and plan to increase to 96 very soon. I guess I'm a little backwards from most crunchers in that aspect - I prefer long work units over short ones.


What Rosetta is asking of BOINC is precisely what the BOINC environment is all about. Flexibility, creativity, science. innovation, and reaching for more. I am sorry you feel that all of this is melodramatic.


Well, that was not the melodramatic part of your post. The melodramatic part was the implication that the human race may become extinct if the project were to devote just a bit of attention to fixing a problem its own design caused instead of devoting every minute towards the science, which "is life and death for everyone".

Not that the science is not extremely important - I think it is. But I guess having been a paramedic for 23 years, my definition of "life and death for everyone" is a bit different from that of a rosetta scientist.


13) Message boards : Number crunching : Result was reported too late to validate ???????????? (Message 11915)
Posted 12 Mar 2006 by Darren
Post:
...So the problem you have pointed out is a BOINC problem.


Well, I only partly agree with that. Having been with boinc since it was in beta testing, boinc addressed this from the very beginning. You have not seen me recommend redundancy here - and you won't see it - but that is the built in method that boinc uses to prevent people from claiming falsely high credit. It is ROSETTA that chooses not to run their project in a way that allows the boinc software to prevent this as it was intended. The problem that I mention does not occur on other projects - it is unique to rosetta. Boinc released the software as open-source knowing it could be manipulated, then implemented redundancy to negate any benefit anyone would gain from making those manipulations. Again, I don't advocate redundancy here (or on most projects for that matter) - just some simple sanity check to detect falsely high credit claims.


...melodramatic stuff edited out...


While clearly a few people are concerned with credit awards to the exclusion of all else, the awarding of credits is not the primary purpose of ANY of the BOINC projects.


And it should not be, but neither should it be totally ignored. I really don't care about credits myself. I've participated in projects knowing credit would be lost, such as boinc beta. Yeah, I post my credits in my signature, but they're just numbers - they have no value and I know that. However, they are the only thing that crunchers get in exchange for their participation. And as that, the project should demonstrate some degree of respect for them rather than a "we don't care if people cheat" attitude. The primary purpose of the Olympics is not awarding medals, but the Olympic committees ensure that the ones actually awarded were really earned.


...But frankly, Rosetta is not currently planning to institute redundancy, and I have heard of no plans to develop a new credit system unique to Rosetta.


Again, I've never suggested that. The most I've suggested is a script to trigger an automated process of "correcting" credit claims that are clearly abnormal.


That position does not constitute hiding from the issue, or ignoring the users, it constitutes fixing the problem at the source. The benchmark system is controlled by BOINC, the credit claims are calculated by BOINC, the ability to hack certain important BOINC files is a BOINC security issue, and the lack of support for non redundant projects is a BOINC problem. Therefore the fix lies with some future version of the BOINC client package, and that is precisly what the project is trying to get.


Again, only half credit here. Boinc is designed to operate a certain way, it is Rosetta that does not conform to that standard. Now, I fully hope that boinc is successfully modified to support everything Rosetta wants (and any other project that wants to use boinc), but this adamant "it's all boinc and no rosetta to blame" stance is really astounding. As it is rosetta that is deviating from the boinc norm, rosetta does have some obligation in addressing the issue.


14) Message boards : Number crunching : Result was reported too late to validate ???????????? (Message 11911)
Posted 12 Mar 2006 by Darren
Post:
All of the complaints in this area are based on the use of so call "optimized" clients by some participants, which are in fact available to everyone equally.


While I don't want to get in the middle of the personal dispute going on here, I kind of have some lingering doubt as to whether all the complaints are really pointing at legitimately optimized clients.

I use an optimized client myself (the one made by Harold Naparst and available to anyone), but some of the benchmarks I've seen don't appear to be coming from any legitimate optimized clients.

Now, I'm as far as could possibly be from being a computer programmer - and I couldn't interpret a single line of code if my life depended on it. That said, it took me well under 2 minutes to find a way to alter the code to get bogus benchmarks. This host is the one I created, with no difficulty whatsoever, with bogus benchmarks. Since I altered the code, I did let it run a single work unit for one hour (the shortest time possible) just to make sure the client really did still work properly and return results that would validate. The benchmarks on that host came from my first guess at making changes, and I didn't do too good as my whetstone is lower than a legitimate optimized client. But my dhrystone is much higher than legitimate. The end result was that my system that normally claims 10.125 cs/hour with a legitimate optimized client claimed 33 cs/hour after my changes - changes that are now built into the code so it will rebenchmark this way every time. THESE CHANGES ARE CLEARLY CHEATING, and I made these changes by doing nothing more than altering 1 single item in 2 files, but without some kind of a sanity check from rosetta ANYONE can cheat like this.

And to show how completely absurd it can get, I then decided to go even further and see what it would give. I won't attach it unless there is some compelling need to prove the benchmarks it gives me now - as I would have to trash the work unit that would be initially assigned. However, here's the output from executing the absurd version (and there appears to be no limit on how far it can go if I just change it even more than I did):

platinum client # ./boinc
2006-03-11 18:46:55 [---] Starting BOINC client version 5.2.14 for i686-pc-linux-gnu
2006-03-11 18:46:55 [---] libcurl/7.15.2 OpenSSL/0.9.7i zlib/1.2.3
2006-03-11 18:46:55 [---] Data directory: /home/darren/extract/boinc/client
2006-03-11 18:46:55 [---] Processor: 2 GenuineIntel Intel(R) Pentium(R) 4 CPU 3.06GHz
2006-03-11 18:46:55 [---] Memory: 883.68 MB physical, 494.18 MB virtual
2006-03-11 18:46:55 [---] Disk: 13.80 GB total, 7.96 GB free
2006-03-11 18:46:55 [---] No general preferences found - using BOINC defaults
2006-03-11 18:46:55 [---] Remote control not allowed; using loopback address
2006-03-11 18:46:55 [---] Primary listening port was already in use; using alternate listening port
2006-03-11 18:46:55 [---] This computer is not attached to any projects.
2006-03-11 18:46:55 [---] There are several ways to attach to a project:
2006-03-11 18:46:55 [---] 1) Run the BOINC Manager and click Projects.
2006-03-11 18:46:55 [---] 2) (Unix/Mac) Use boinc_cmd --project_attach
2006-03-11 18:46:55 [---] 3) (Unix/Mac) Run this program with the -attach_project command-line option.
2006-03-11 18:46:55 [---] Visit http://boinc.berkeley.edu for more information
2006-03-11 18:46:57 [---] Running CPU benchmarks
2006-03-11 18:47:56 [---] Benchmark results:
2006-03-11 18:47:56 [---] Number of CPUs: 2
2006-03-11 18:47:56 [---] 39714 double precision MIPS (Whetstone) per CPU
2006-03-11 18:47:56 [---] 2310738 integer MIPS (Dhrystone) per CPU
2006-03-11 18:47:56 [---] Finished CPU benchmarks

2006-03-11 18:47:57 [---] Resuming computation and network activity
2006-03-11 18:47:57 [---] request_reschedule_cpus: Resuming activities


Notice the benchmarks, and just imagine what kind of credit that would claim.
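To put a rough number on it, here is a minimal sketch of the arithmetic. It assumes the benchmark-based claim works out to 100 credits per CPU-day for a host averaging 1000 MIPS across whetstone and dhrystone. That constant and the function name are assumptions for illustration, not something taken from the boinc source, although the same rule gives 10.125 cs/hour for benchmarks averaging 2430 MIPS.

```python
# Assumed rule (not taken from the boinc source): 100 credits per CPU-day
# for a host whose whetstone and dhrystone benchmarks average 1000 MIPS.
CREDITS_PER_DAY_AT_REFERENCE = 100.0
REFERENCE_MIPS = 1000.0
HOURS_PER_DAY = 24.0


def claimed_credit_per_hour(whetstone_mips, dhrystone_mips):
    """Credit claimed for one CPU-hour under the assumed rule."""
    average_mips = (whetstone_mips + dhrystone_mips) / 2.0
    credits_per_day = average_mips / REFERENCE_MIPS * CREDITS_PER_DAY_AT_REFERENCE
    return credits_per_day / HOURS_PER_DAY


# Benchmarks from the log output above.
absurd = claimed_credit_per_hour(39714, 2310738)
print(f"{absurd:.1f} credits claimed per CPU-hour")
```

Under that assumption the absurd build would ask for several thousand credits for every hour of CPU time.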

But, my whole point here is that there needs to be some kind of a sanity check put in place by rosetta. If nothing else, a script that runs every day or so that just looks for totally out of whack benchmarks when compared to the cpu they were run on - then adjusts and recalculates the granted credit to those hosts that are too far from reasonable.
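A minimal sketch of what such a script might look like, assuming the project can pull a list of hosts with their cpu model and benchmarks. The host records, field names, and the 2x threshold below are all made up for illustration; a real check would read from the project database and need a tuned threshold.

```python
from collections import defaultdict
from statistics import median


def flag_outliers(hosts, max_ratio=2.0):
    """Return ids of hosts whose average benchmark exceeds max_ratio times
    the median average benchmark of hosts reporting the same cpu model."""
    by_cpu = defaultdict(list)
    for host in hosts:
        by_cpu[host["cpu"]].append((host["whetstone"] + host["dhrystone"]) / 2.0)

    # Median is used so one wildly inflated host does not drag the baseline up.
    typical = {cpu: median(values) for cpu, values in by_cpu.items()}

    flagged = []
    for host in hosts:
        average = (host["whetstone"] + host["dhrystone"]) / 2.0
        if average > max_ratio * typical[host["cpu"]]:
            flagged.append(host["id"])
    return flagged


# Hypothetical records: two ordinary hosts plus the absurd benchmarks above.
sample_hosts = [
    {"id": 1, "cpu": "Pentium 4 3.06GHz", "whetstone": 1200, "dhrystone": 2100},
    {"id": 2, "cpu": "Pentium 4 3.06GHz", "whetstone": 1300, "dhrystone": 2300},
    {"id": 3, "cpu": "Pentium 4 3.06GHz", "whetstone": 39714, "dhrystone": 2310738},
]
print(flag_outliers(sample_hosts))
```

The median only gives a sane baseline when there are enough hosts of the same cpu model, so rare processors would need some other reference.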

OK, stepping off of my soapbox now...


15) Message boards : Number crunching : "Concerned my P.C. isnot well" (Message 11878)
Posted 11 Mar 2006 by Darren
Post:
Have looked at those "Optimised Clients", and it is a bit above my knowledge.


Ah, don't let it scare you. Most of the optimized boinc clients are just a single file that you download to replace your existing file with. Now, to get an optimized seti app working is a little more involved, but still not hard.

If you decide to try one, just download the file and extract it into your boinc directory. In the case of Harold Naparst, you'll get a file named "boinc_naparst_p4", which is the replacement for your file named "boinc" that you have in your boinc directory now. Rather than try to make everything that may be linked find the new filename, it's much easier if you just stop boinc from running, then change the name of the file now named "boinc" to something else (boinc_old, for instance). After that, change the name of the file "boinc_naparst_p4" to be just "boinc". You'll probably have to make it executable - which is the only kind of weird part of the whole thing. You can do this by right clicking on the file and choosing "properties", then click the "permissions" tab and put a mark in the box for "execute" on the "owner" line (or if you use the command prompt, from within the boinc directory just type "chmod +x boinc" without the quotes, making sure to leave the space after chmod). After that, start boinc just like you normally do and you'll be using the optimized client - it should tell you right at startup that a new version is detected and it will then run new benchmarks automatically.

In the event there is any kind of a problem, you still have the old one, so you can just delete the new "boinc" and rename the other one back to boinc and you're right back where you were.
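If you would rather script it, the same steps can be sketched roughly like this. This is only an illustration: the function name is made up, it assumes boinc is already stopped, and it assumes the three file names are exactly the ones mentioned above.

```python
import os
import stat


def swap_in_optimized_client(boinc_dir, optimized_name="boinc_naparst_p4"):
    """Keep the current client as boinc_old, then put the optimized one in
    its place and mark it executable for the owner (like chmod u+x)."""
    current = os.path.join(boinc_dir, "boinc")
    backup = os.path.join(boinc_dir, "boinc_old")
    optimized = os.path.join(boinc_dir, optimized_name)

    # Note: on Linux this silently replaces an existing boinc_old.
    os.rename(current, backup)
    os.rename(optimized, current)

    mode = os.stat(current).st_mode
    os.chmod(current, mode | stat.S_IXUSR)
```

Going back is just the reverse: delete the new "boinc" and rename "boinc_old" to "boinc".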

It sounds like a lot when you read it, but it's not really difficult at all. If you've gotten as far in linux as you obviously already have, I don't think this will give you much trouble.


Run Boinc as User instead of Root due to problems I could cause as a newbie.


That's not just a newbie thing. Everyone running boinc should be running it as a user, and not as root. The only thing you should normally do as root is make general system changes, like upgrading core software, setting up your network, and stuff such as that. Running everything else as a user instead of root not only keeps things from getting messed up, but it's one of the key things that makes linux systems so safe from hackers and virus/trojan software and such.

But again, don't be intimidated by it. Have your friend look over the web sites with the optimized clients and have him look over the instructions I gave you for installing one and he'll likely agree that you'll do ok with it.

16) Message boards : Number crunching : "Concerned my P.C. isnot well" (Message 11875)
Posted 11 Mar 2006 by Darren
Post:
Doug, your benchmarks seem a bit low for your cpu from others I've seen, but I'm not all that up on what different cpu's should be giving so I could be wrong.

I see you're using boinc 5.2.13, and I assume you're using the standard download rather than an optimized version. If you're not aware, the standard versions don't benchmark properly on linux - which would be my guess as to why yours look on the low side. And your credit is directly determined by your benchmarks.

If you go back to the download page (from the homepage of any of the projects) and follow the link at the bottom for "download executables from a third-party site" (which will take you here), you'll find a few optimized versions of boinc that may work better for you. I personally use the version from Harold Naparst's site (here), as his does better for me than anything I could come up with for myself. If you go there, just be sure you get the one labeled "moral", as it is fully legitimate and doesn't try to implement any kind of cheats.

Short of trying an optimized boinc client, you can try rerunning your benchmarks (in the boinc manager, choose Commands | Run Benchmarks) and see if you can get better benchmarks. If you try that, make sure the system isn't doing anything else really cpu intensive at the same time (no busy programs running, stop any music, etc.).

Other than that, there's probably not really a lot you can easily do. I assume from your kernel name you're running Mandrake (or Mandriva). There's not going to be a lot you can easily change there - if anything at all - to get any big improvements.

Good luck.

17) Message boards : Number crunching : Points Per Day Averages (Message 11864)
Posted 10 Mar 2006 by Darren
Post:
That happens to be my machine. It is a dual socket Opteron 280, which makes it a 4 way box. Now when a 2.8 p4 reports a higher RAC than a 64 bit AMD with hypertransport memory access, then I would throw stones at the P4. Now if you think that 8 hours of CPU time on this machine is not worth 220 points, then you just have the admins say the word and I will be gone! This machine is clean.


Not to speak for him, but I think maybe the original poster just overlooked the changes in run time when he was looking at your system.

As for me, I don't know much about AMD processors (or any processor for that matter) so I admit I go out on a limb when I decide to throw stones at them. But since I've jumped in and been pretty vocal in this thread, I will take a moment to elaborate before I wind up getting attacked. When your system first got mentioned, I looked up the processor and didn't think anything about it claiming that much. It's a $1000 quad processor - no doubt it can do a lot of work in a very short time. But that still brings me back to one of the systems mentioned in the original post. That one is a Sempron 2800+. That's a $75 processor and it's claiming higher benchmarks than your $1000 processor.

That's the kind of stuff that causes me to throw stones.

18) Message boards : Number crunching : Points Per Day Averages (Message 11859)
Posted 10 Mar 2006 by Darren
Post:
However, following the threads on their forum has led me to seriously wonder if someone has found a way to cheat. They have some very interesting threads in their public forum that were moved to their private forum just because of this topic.


I read a lot of their public forums yesterday and thought some of that was very strange, too. They seem to have a lot of threads that get made private so people won't see them and think they're cheating. Well, logic sort of says that's backwards - hiding them makes it look like they are cheating, whereas if they really weren't cheating and left them public, people would see it all and know there was no cheating.

I also thought it was interesting that they don't seem to advocate participating in projects that require multiple crunchers to form a quorum (not that they don't, but I couldn't find any mention of it if they do), thus doing away with any ability to compare their crunchers side by side with other crunchers.

Since rosetta doesn't require multiple crunchers to form a quorum, they really need to come up with some kind of a way to introduce a sanity check on the participating computers. Perhaps set a maximum deviation from the average benchmark for each cpu. Anyone who understands the code and compiles their own boinc software can screw with the way the benchmarks are determined as well as the amount of credit the computer asks for, but on other projects the quorum method of granting credit kills off any real benefit of doing that. Since that's not logical on rosetta, maybe they need to devise a script that runs every so often looking for out-of-whack benchmarks or credit claims based on the machine's cpu and then resets those benchmarks to some determined maximum allowable deviation for that particular cpu and recalculates the credit granted to those hosts with some real-world benchmarks applied.
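The capping step could be sketched roughly as below. It assumes credit is claimed at 100 credits per CPU-day for an average benchmark of 1000 MIPS, and it assumes a "typical" average benchmark is already known for the cpu model. The function name, the 50% allowance, and the example numbers are all hypothetical.

```python
SECONDS_PER_DAY = 86400.0


def capped_credit(cpu_seconds, whetstone, dhrystone,
                  typical_avg_mips, max_deviation=0.5):
    """Recalculate credit with the host's average benchmark clamped to at
    most (1 + max_deviation) times the typical value for its cpu model."""
    reported_avg = (whetstone + dhrystone) / 2.0
    ceiling = typical_avg_mips * (1.0 + max_deviation)
    used_avg = min(reported_avg, ceiling)
    # Assumed rule: 100 credits per CPU-day at an average of 1000 MIPS.
    return cpu_seconds / SECONDS_PER_DAY * used_avg / 1000.0 * 100.0


# Hypothetical cpu model with a typical average benchmark of 2000 MIPS.
inflated = capped_credit(3600, 3000, 9000, typical_avg_mips=2000)
honest = capped_credit(3600, 1500, 2100, typical_avg_mips=2000)
print(inflated, honest)
```

With a cap like this, a host inside the allowed range keeps exactly what it claimed, and an inflated host is granted only what the ceiling permits.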

19) Message boards : Number crunching : Points Per Day Averages (Message 11834)
Posted 9 Mar 2006 by Darren
Post:
So don't be puzzled... there isn't a trick, he is just uploading a lot of wu's at one time and never connecting in between.


There's a bit more to it than just that. Just as an example, one of the systems mentioned (this one) is reporting its work every day, and is getting 60+ credits for 2 hours of work. The data being returned does not show that any of the common optimized clients is being used (but he certainly could have optimized his own client). What grabs me as out of place, though, is the benchmarks:

Measured floating point speed 3705.26 million ops/sec
Measured integer speed 10957.09 million ops/sec

and it is identified as:

AuthenticAMD
AMD Sempron(tm) Processor 2800+

Now, admittedly, I don't know a lot about what the normal benchmarks for the various AMD processors should be, but come on - integer speed of almost 11000.
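As a rough cross-check, those benchmarks line up with the credit being claimed. The sketch below assumes the claim is 100 credits per CPU-day at an average of 1000 million ops/sec, which is an assumption about the formula rather than something read out of the boinc code.

```python
def claimed_credit(cpu_hours, float_mops, integer_mops):
    """Credit claimed for cpu_hours of work, assuming 100 credits per
    CPU-day at an average benchmark of 1000 million ops/sec."""
    average_mops = (float_mops + integer_mops) / 2.0
    return cpu_hours / 24.0 * average_mops / 1000.0 * 100.0


# Benchmarks reported by the Sempron 2800+ host, for 2 hours of work.
sempron_claim = claimed_credit(2, 3705.26, 10957.09)
print(f"{sempron_claim:.2f} credits for 2 hours")
```

Under that assumption it comes to roughly 61 credits for 2 hours, so the credit is simply following the benchmarks, and the benchmarks are the part that looks wrong.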


20) Message boards : Number crunching : Report Maximum CPU Time Exceeded WU HERE (Message 11065)
Posted 21 Feb 2006 by Darren
Post:
Whoa now, what is this???

I set my cpu time for 24 hours and I get a max cpu time exceeded after 10 hours.

Here is the WU, and here is the pertinent info:

CPU time 36185.368987

stderr out

<core_client_version>5.2.14</core_client_version>
<message>Maximum CPU time exceeded
</message>
<stderr_txt>
# random seed: 910501
# cpu_run_time_pref: 86400

</stderr_txt>

Validate state Invalid
Claimed credit 101.245443882576
Granted credit 0
application version 4.81
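
Converting the two figures above from seconds to hours makes the mismatch plain. This is just arithmetic on the numbers in the result, with a made-up helper name:

```python
SECONDS_PER_HOUR = 3600.0


def seconds_to_hours(seconds):
    """Convert a duration in seconds to hours."""
    return seconds / SECONDS_PER_HOUR


preferred_hours = seconds_to_hours(86400)        # cpu_run_time_pref
actual_hours = seconds_to_hours(36185.368987)    # CPU time at the abort
print(f"preferred run time: {preferred_hours:.2f} hours")
print(f"cpu time at abort:  {actual_hours:.2f} hours")
```

So the work unit was cut off a little past 10 hours, well under half of the 24 hours that was set.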






©2024 University of Washington
https://www.bakerlab.org