Posts by verdy_p

1) Questions and Answers : Windows : Rosetta bogous: only errors, never credited any jobs (Message 11393)
Posted 25 Feb 2006 by verdy_p
Post:
Many thousands of people are running Rosetta with complete success. While the application switching problem may be frustrating for you, it is a known condition of running rosetta. The latest release of the application (as of 18/2/06) contains a fix for this issue and others. But if you decide to stay with Rosetta@home, please take a few moments to read the FAQs that are posted in the number crunching forum, and the project requirements posted as a link on the home page. This will help you understand the limitations and requirements for running Rosetta.


Completely stupid. Such a requirement is not only unacceptable, it also goes against all recommendations for BOINC projects, and it does not even solve the problem: Rosetta has a severe bug that prevents anyone from simply suspending the project temporarily or shutting down their PC for whatever reason they like (including to finish antivirus updates).

A BOINC project that can't be stopped at any time is not a BOINC project, just unnecessary pollution (in every sense of the word). If you're not able to fix this in Rosetta (as was done for all other projects), get out of BOINC, because your "recommended" setting in fact affects ALL projects, not only Rosetta.
2) Questions and Answers : Windows : Computation Error (Message 10996)
Posted 20 Feb 2006 by verdy_p
Post:
3) Questions and Answers : Windows : Aborted work unit and memory usage (Message 10995)
Posted 20 Feb 2006 by verdy_p
Post:
I have also seen the memory issue on my system. I have a P4 2.8GHz with HT enabled. Each WU is currently using 180MB of virtual memory. Luckily I have 2GB of RAM, so it is not affecting performance much, but that is still a lot of memory to be using.


You're right. It's completely ridiculous to permanently dedicate so much virtual memory on disk while Rosetta is suspended, whether because we are running another application or because time is shared with other BOINC projects.

The main reason is a serious bug in Rosetta's interrupt handler, which is definitely not thread-safe and corrupts the main computing thread.

I don't like leaving Rosetta in memory. It goes against the philosophy of BOINC projects, which should use ONLY idle computing resources. Rosetta should withdraw from BOINC for as long as this bug is not corrected. The bug may even produce false scientific results, because the data corruption it causes can occur even when a work unit does not visibly terminate with an unrecoverable error.

Also: please save computing snapshots more often. When there's a failure, the work unit state should be recovered without too much CPU time lost, and the job should then progress far enough before the next snapshot to get past a single failure caused by an external event. Note that when BOINC is running as a screensaver, it may be interrupted very quickly, before any significant progress has been made, and such an event may occur several times in rapid succession. This is not a failure but a common issue with screensavers, which are sometimes triggered when the user is just reading a document or has paused for a short time and does not want the screensaver to interrupt their work.
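
To make the snapshot request concrete, here is a minimal Python sketch of time-based checkpointing with an atomic file replace. This is not Rosetta's code: the function names (`save_checkpoint`, `load_checkpoint`, `run`), the JSON state format, and the toy "work" loop are all assumptions chosen for illustration.

```python
import json
import os
import tempfile
import time


def save_checkpoint(path, state):
    """Write the snapshot to a temp file, then atomically swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        # os.replace is atomic: a kill at any moment leaves either the old
        # snapshot or the new one on disk, never a half-written file.
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def load_checkpoint(path, default):
    """Return the last snapshot, or `default` if none is usable."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return default


def run(path, total_steps, checkpoint_every=5.0, clock=time.monotonic):
    """Toy work loop (hypothetical): sums step indices, snapshotting as it goes."""
    state = load_checkpoint(path, {"step": 0, "acc": 0})
    last = clock()
    while state["step"] < total_steps:
        state["acc"] += state["step"]  # stand-in for the real computation
        state["step"] += 1
        if clock() - last >= checkpoint_every:
            save_checkpoint(path, state)
            last = clock()
    save_checkpoint(path, state)
    return state
```

With a short `checkpoint_every`, a job killed by the screensaver stopping loses at most a few seconds of CPU time, and restarting `run` with the same path continues from the last snapshot instead of from zero.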

To solve this problem: when it gets paused, Rosetta should enter sleep mode for a short time, and if it has not been resumed after 2 minutes, it should save its computing state to disk and exit. If the computing thread is holding a lock on a critical data section, the interrupt handler may not be able to terminate the job immediately; it should then start a timer and retry in a loop, after 2 seconds and for up to 1 minute. It MUST make every effort to exit the Rosetta process and free memory as fast as possible.
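
Here is a rough Python sketch of that pause logic, under one reading of the timings: wait up to 2 minutes for a resume, then try to take the data lock every 2 seconds for up to 1 minute before saving. The names (`handle_pause`, `resumed`, `data_lock`, `save_state`) are hypothetical and this is not how Rosetta or the BOINC API is actually structured.

```python
import threading
import time

GRACE_PERIOD = 120.0   # seconds to stay asleep in memory (the "2 minutes")
RETRY_INTERVAL = 2.0   # seconds between attempts to take the data lock
RETRY_LIMIT = 60.0     # stop retrying after this long (the "1 minute")


def handle_pause(resumed, data_lock, save_state, sleep=time.sleep,
                 grace=GRACE_PERIOD, retry_interval=RETRY_INTERVAL,
                 retry_limit=RETRY_LIMIT):
    """Return "resumed", "saved" or "gave_up".

    resumed    -- threading.Event set by whoever resumes the job
    data_lock  -- threading.Lock the computing thread holds while it is
                  inside a critical data section
    save_state -- callable that writes the computing state to disk
    """
    # Sleep phase: stay in memory, using no CPU, in case we are resumed soon.
    if resumed.wait(timeout=grace):
        return "resumed"

    # Shutdown phase: never snapshot while the worker is mid-update.
    waited = 0.0
    while True:
        if data_lock.acquire(blocking=False):
            try:
                save_state()
            finally:
                data_lock.release()
            return "saved"   # caller can now exit and free the memory
        if waited >= retry_limit:
            return "gave_up"  # caller exits anyway, keeping the older snapshot
        sleep(retry_interval)
        waited += retry_interval


if __name__ == "__main__":
    # Tiny demo with a short grace period so it finishes immediately.
    result = handle_pause(threading.Event(), threading.Lock(),
                          lambda: print("state saved"), grace=0.1)
    print(result)
```

The point of the non-blocking `acquire` is that the handler never touches the data while the computing thread is in the middle of changing it, which is exactly the kind of corruption I suspect here.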

For now, Rosetta just stinks, and abuses users' computing resources.
4) Questions and Answers : Windows : Only client errors (Message 10994)
Posted 20 Feb 2006 by verdy_p
Post:
5) Questions and Answers : Windows : Rosetta bogous: only errors, never credited any jobs (Message 10894)
Posted 18 Feb 2006 by verdy_p
Post:
Rosetta is the ONLY BOINC project for which I have NEVER been credited for any work.

There is unambiguously a severe bug in Rosetta that causes it to fail when it is suspended (whether because of CPU scheduling by BOINC, by suspending the work manually, or when logging out).

Here is an example log taken from BOINC:

18/02/2006 17:24:57|rosetta@home|Restarting result PRODUCTION_ABINITIO_QUADRUPLELONGRANGEANTIPARALLEL_1vls__311_31_1 using rosetta version 481
18/02/2006 17:29:57|rosetta@home|Pausing result PRODUCTION_ABINITIO_QUADRUPLELONGRANGEANTIPARALLEL_1vls__311_31_1 (removed from memory)
18/02/2006 17:29:58|rosetta@home|Unrecoverable error for result PRODUCTION_ABINITIO_QUADRUPLELONGRANGEANTIPARALLEL_1vls__311_31_1 ( - exit code -1073741819 (0xc0000005))
18/02/2006 17:29:58||request_reschedule_cpus: process exited
18/02/2006 17:29:58|rosetta@home|Computation for result PRODUCTION_ABINITIO_QUADRUPLELONGRANGEANTIPARALLEL_1vls__311_31_1 finished
18/02/2006 17:31:01|rosetta@home|Sending scheduler request to http://boinc.bakerlab.org/rosetta_cgi/cgi
18/02/2006 17:31:01|rosetta@home|Reason: To report results
18/02/2006 17:31:01|rosetta@home|Reporting 1 results
18/02/2006 17:31:06|rosetta@home|Scheduler request to http://boinc.bakerlab.org/rosetta_cgi/cgi succeeded
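
A note on the exit code in this log: -1073741819 is just 0xc0000005 read as a signed 32-bit number, and 0xc0000005 is the Windows status code STATUS_ACCESS_VIOLATION, meaning the process touched memory it was not allowed to access. The short Python sketch below shows the conversion; the function name and the one-entry lookup table are made up for illustration and are not part of BOINC.

```python
# Only the code seen in the log above is listed here.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def decode_exit_code(exit_code):
    """Map a signed 32-bit exit code to (hex string, status name)."""
    unsigned = exit_code & 0xFFFFFFFF  # reinterpret as unsigned 32-bit
    return f"0x{unsigned:08x}", KNOWN_NTSTATUS.get(unsigned, "unknown")


print(decode_exit_code(-1073741819))
```

An access violation one second after "removed from memory" fits a crash during shutdown of the process, although the code alone does not say where in Rosetta the fault is.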

My PC is a notebook with an AMD Turion 64 ML-30, definitely not overclocked, and it runs other projects perfectly.

The only reason this happens is definitely a bug in Rosetta's code, which CAN'T properly handle job suspensions, a REQUIRED feature for any BOINC client.

More than 15 jobs have already run, all ending with this "unrecoverable" error, and with NO credits, even though many hours have been spent on the project. All I can do now is withdraw from this project and choose other, more useful BOINC projects, because the way Rosetta works just wastes my CPU time.

Such a bug implies a lack of quality in the Rosetta software, and I can't expose my computer to constant bugs from poor-quality software that I can't trust on my machine.

I have tried to run Rosetta on another PC, and it also fails (coincidentally, it also has an AMD processor, an Athlon XP 1800+). I really think that the code that implements job suspension was optimized only for Intel processors and never tested on AMD processors, which some of Rosetta's internal assembly code incorrectly identifies.

Rosetta stinks... I doubt that it can even return usable and reliable results.

Note: the AMD Turion is the mobile version of the Athlon XP. At 1.6GHz, the Turion runs with the same performance as an Athlon XP at 3.2GHz, because it integrates two parallel execution pipelines while using much less energy and producing less heat. This processor is therefore more expensive than the equivalent Athlon model, but it is great for notebooks. (This is quite similar to the difference between the Intel Pentium M and the Pentium 4.)

Are there people having any success with Rosetta? Do other people get so many "client" errors on Intel processors too?






©2024 University of Washington
https://www.bakerlab.org