Posts by magiceye04

1) Message boards : Number crunching : Work done - no "pay" for it (Message 95326)
Posted 24 Apr 2020 by magiceye04
Post:
Thank you for the explanation!

Today I also had to abort some WUs. They consumed about 1.8 GB each and froze the PC. I only allowed about 12 WUs, but that was still too much for 16 GB of RAM.
Maybe these WUs could be sent only to PCs with at least 4 GB per CPU core...
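
As a client-side workaround until then, the number of simultaneously running tasks can be capped. A rough sketch, assuming about 1.8 GB per WU as observed above: 12 tasks would need roughly 21.6 GB, while 7 tasks need about 12.6 GB and leave some room for the OS on a 16 GB machine. The value 7 is only an assumption for this machine, not a project recommendation. The file is app_config.xml in the Rosetta project folder inside the BOINC data directory:

```xml
<!-- app_config.xml: limit how many tasks of this project run at once. -->
<!-- The value 7 is an assumed example (7 x ~1.8 GB = ~12.6 GB of 16 GB). -->
<app_config>
  <project_max_concurrent>7</project_max_concurrent>
</app_config>
```

The client has to re-read its config files (or be restarted) before the limit takes effect.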
2) Message boards : Number crunching : Work done - no "pay" for it (Message 95218)
Posted 23 Apr 2020 by magiceye04
Post:
Today I got defective WUs without checkpointing.
But the computer had to restart for an external reason.
AGAIN many hours of wasted computing - everything starts from zero.
Maybe I will try a non-beta project over the next few days...
3) Message boards : Number crunching : Work done - no "pay" for it (Message 95183)
Posted 23 Apr 2020 by magiceye04
Post:
I also had about 70 WUs aborted by the project last night.
Many of them were partly computed, some even fully computed.

I would really recommend testing these beta WUs on the Ralph project.
4) Message boards : Number crunching : many erors: rb_04_20_22201_ (rosetta 4.15, Ubuntu) (Message 95150)
Posted 22 Apr 2020 by magiceye04
Post:
I have found many WUs with computing errors on both PCs.
All are rb_04_20_22201_21746_ab_t000__robetta_cstwt_5.0_FT_IGNORE_THE_REST_*
They ran about 50,000 s instead of the normal 14,000 s that I have set.
Only a few of them got credits - about 500-600 for 50,000 s.
But most of them just ended in an error.
http://boinc.bakerlab.org/rosetta/results.php?userid=419076&offset=0&show_names=1&state=6&appid=

I hope they will not return.
The question I have: why is the watchdog not working in this case?
Earlier, faulty WUs were aborted after +4 h.
5) Message boards : Number crunching : "Rosetta v4.12 i686-pc-linux-gnu" : fixed 20 h CPU time, fixed 20 credits (Message 94391)
Posted 13 Apr 2020 by magiceye04
Post:
OK - so i686 = 32-bit and x64 = 64-bit?

But the needed solution is still: run only the 64-bit WUs on AMD CPUs.

Why is the 20-year-old i686 code still in use if the problem is known?
6) Message boards : Number crunching : "Rosetta v4.12 i686-pc-linux-gnu" : fixed 20 h CPU time, fixed 20 credits (Message 94373)
Posted 13 Apr 2020 by magiceye04
Post:


If you would like to try creating a simple cc_config.xml file, you can get BOINC to use the x86_64 version instead of the i686 version that is having trouble. There is an example here.

What exactly from this example is needed?

<no_alt_platform>1</no_alt_platform> ?

I have an existing config file and only want to add the relevant line.
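
For reference, a minimal sketch of where that line would sit, assuming a cc_config.xml with no other options set. In an existing file, only the no_alt_platform line should need to be added inside the already present options element:

```xml
<!-- cc_config.xml in the BOINC data directory. -->
<!-- no_alt_platform=1 tells the client to run applications only for its -->
<!-- primary platform (here: x86_64), so the i686 version is not used. -->
<cc_config>
  <options>
    <no_alt_platform>1</no_alt_platform>
  </options>
</cc_config>
```

The client has to re-read its config files (or be restarted) before the change applies. Whether this fully avoids the i686 tasks on a given machine is something I would still check in the task list afterwards.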
7) Message boards : Number crunching : "Rosetta v4.12 i686-pc-linux-gnu" : fixed 20 h CPU time, fixed 20 credits (Message 94372)
Posted 13 Apr 2020 by magiceye04
Post:
To any that have found this thread because they are having Linux i686 issues, please join Ralph (project url is: http://ralph.bakerlab.org/) with your machine. This will help with testing when changes are made to address this, and confirm they are working.

No promises on when a new version will be available there. It may take some time.

If you would like to try creating a simple cc_config.xml file, you can get BOINC to use the x86_64 version instead of the i686 version that is having trouble. There is an example here.


I have some broken WUs on my PC with the new version 4.15
*i686*
Why are INTEL686 WUs sent to AMD PCs?
The x86 WUs run perfectly; please keep these i686 WUs away from non-Intel PCs.

http://boinc.bakerlab.org/rosetta/result.php?resultid=1148091774
8) Message boards : Number crunching : What do all of these little credit scores mean? (Message 94041)
Posted 10 Apr 2020 by magiceye04
Post:
For the last 2 days I have had no errors from long-running/watchdog-aborted WUs.
I reduced the run time to 4 hours. Maybe this helped, or the project guys fixed something. :)
9) Message boards : Number crunching : What do all of these little credit scores mean? (Message 93732)
Posted 7 Apr 2020 by magiceye04
Post:

These people aren't running jobs for the benefit of the project. They're running for the stats.
They think they're more important than the project they're running. Essentially, they are [*censored word*]

It's hard to argue with that. In fact, impossible to argue with it, credibly.

The problem is not that these WUs get very few credits - the problem is that these WUs do not produce any results for the science. The computation always ends in an error. Hours of computing and energy spent for an error message!

@Project: please try to find the root cause of the errors.
Even the new version 4.12 is not free of the bug.


http://boinc.bakerlab.org/rosetta/result.php?resultid=1140411318

BOINC:: CPU time: 72062.5s, 14400s + 57600s[2020- 4- 6 19:32:47:] :: BOINC
WARNING! cannot get file size for default.out.gz: could not open file.
Output exists: default.out.gz Size: -1
InternalDecoyCount: 0 (GZ)
-----
0
-----
Stream information inconsistent.
Writing W_0000001
======================================================
DONE :: 1 starting structures 72062.5 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================
19:32:47 (6643): called boinc_finish(0)
10) Message boards : Number crunching : What do all of these little credit scores mean? (Message 93464)
Posted 5 Apr 2020 by magiceye04
Post:
Hi!

I have seen that all the tasks I got on 30.03.2020 had this low-credit bug and were ended by the watchdog.
So I sent the remaining bugged tasks back to the server before they started.
But it seems they have now been sent out to other users, e.g.
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=1024166306
Maybe the bug will stay alive for some more days...
Wouldn't it be better for the server to cancel the tasks that have not started yet, as they all seem to have a problem?

Best Regards
MagicEye





