Posts by Klimax

1) Message boards : Cafe Rosetta : DreamLab now available for Android and iOS (Message 93954)
Posted 9 Apr 2020 by Klimax
Post:
Hm, mobile only, and it is not clear whether it is even available outside the UK.
2) Message boards : Number crunching : Rosetta@home using AVX / AVX2 ? (Message 93952)
Posted 9 Apr 2020 by Klimax
Post:
Is it time for SSEx/AVX support?

Are there even any x86-64 CPUs without at least SSE2 support?


SSE2 from Wiki:
Introduced by Intel with the initial version of the Pentium 4 in 2000... AMD added support for SSE2 in 2003

It should be noted that x86-64 mandates SSE2 support, so any x86-64 CPU supports it.
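
A minimal sketch of how one could check which SIMD extensions a Linux machine reports, assuming the usual /proc/cpuinfo layout with a "flags" line. The helper name cpu_flags and the sample text are made up for illustration; on x86-64 the "lm" (long mode) flag should always come together with "sse2".

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


# Hypothetical, heavily trimmed sample; a real machine lists many more flags.
SAMPLE = "processor\t: 0\nflags\t\t: fpu sse sse2 lm avx\n"

flags = cpu_flags(SAMPLE)
for name in ("sse2", "avx", "avx2"):
    print(name, "yes" if name in flags else "no")
```

On a real Linux box, pass the contents of /proc/cpuinfo instead of SAMPLE.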
3) Message boards : Number crunching : How might memory effect R@h processing? (Message 93463)
Posted 5 Apr 2020 by Klimax
Post:
[snip]

Yes, smt/hyper threading is off. The point is though, on my Intel laptops, the speed has remained the same between 4.07 to 4.12. On my tr it's dropped over 60% regardless of what I try, I can't put it much simpler than that.

Note that all of your Intel laptops have a much lower number of cores, and will therefore have much less of a problem with too many cores trying to share the limited speed path to main memory. I can't put it much simpler than that, either.



4.07 is the reference, yes. 4.12 is 60% slower than that. I don't know what is so difficult to understand; it has nothing to do with memory or anything else. You don't have a 64/128 chip to only run it on 5 cores to keep the same productivity because of a software change. Is there a mod/developer that can possibly comment on this issue?

Resource contention, nothing less, nothing more. Nothing exotic or unknowable. The Zen architecture is strong on memory bandwidth but weak on memory latency. Even with a relatively large cache, there will still be a lot of memory accesses that have to be served from main memory. Because of that, too many cores accessing memory at once will overload the memory bus and their performance will drop off sharply (especially when memory access latency is already high).

I suggest you experiment with the number of concurrent tasks for R@h to find the equilibrium. Alternatively, you could see whether you can run the memory at a higher frequency without worsening its timings. Also ensure that there is no swapping. (IIRC each 4.12 task needs about 2GB of RAM, which with 64 tasks amounts to about 128GB of RAM in use, besides all the other processes already running.)
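
The RAM arithmetic above can be sketched as follows. The 2GB per task is the recalled figure from this post, not a measured one, and the 8GB reserved for the OS and other processes is purely an assumption for illustration.

```python
def max_concurrent_tasks(total_ram_gb, per_task_gb, reserved_gb=0.0):
    """How many tasks fit in RAM without swapping (rough estimate)."""
    usable = total_ram_gb - reserved_gb
    if usable <= 0 or per_task_gb <= 0:
        return 0
    return int(usable // per_task_gb)


# 64 tasks at about 2 GB each (figure recalled from memory, see above).
print(64 * 2.0)  # 128.0

# Hypothetical machines: 128 GB and 64 GB installed, 8 GB kept free for the OS.
print(max_concurrent_tasks(128, 2.0, reserved_gb=8))  # 60
print(max_concurrent_tasks(64, 2.0, reserved_gb=8))   # 28
```

This only covers the swapping limit; the contention on the memory bus may well set in at a lower task count, which is why experimenting is still needed.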
4) Message boards : Number crunching : Stalled downloads (Message 93194)
Posted 3 Apr 2020 by Klimax
Post:
So maybe allow repeated attempts to download for a maximum 24hrs, else abort? But could be a loss of connectivity at the user end, or at the project end?


Yes, some means for BOINC to give up is required. Perhaps it should wait even longer if the file is "particularly large" (intentionally left for others to define). The other issue is that in the case here at R@h the problems seemed to be with small files, but the small file was required to run a given WU. That WU might have had many GBs of other required files that came down ok. So the actual size of the particular file having problems is not really the only consideration. If you abort the WU, all of the WU files are being lost (unless other WUs your machine has on-board use them as well).

I should think the BOINC Manager can see the difference between a lack of connectivity being the cause, and a dropped frame. So that info. could be used. If there was a loss of connectivity, then perhaps you double the maximum timeout.

...all good discussion for BOINC message boards.


Connectivity detection is also partially broken in BOINC clients:
If the BOINC client cannot download a file or contact a project server, it contacts a "reference site", which has been google.com for the last few years.
If it gets an "HTTP 200 - OK" response from Google, everything works as expected: BOINC understands that there is a problem with the particular project but that the Internet connection is OK, and proceeds with other projects.

But if Google returns any other response (not "200 OK"), BOINC stupidly interprets it as a connectivity issue (no Internet connection), temporarily ceases all Internet activity and increases the "back-off" timer each time.
And Google often returns other codes, because many similar automated requests from thousands of computers around the world running BOINC often trigger Google's anti-bot filters (it blocks such requests and offers a CAPTCHA to solve in order to continue). BOINC of course cannot understand this and reacts by declaring "connectivity issues".

Although the fact that it was able to get an error message from Google (no matter which and for what reason) means exactly the opposite: that the Internet connection is working OK.

I think it is a workaround for various captive portals and security appliances/gateways that might block the connection with a custom error page.
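
The two readings of a reference-site probe can be put side by side in a small sketch. This is not BOINC's code; the function name link_is_up and the "strict" switch are hypothetical and only restate the behaviour described in the quote (strict) and the alternative it argues for (any answer counts).

```python
def link_is_up(status, strict):
    """Decide whether the Internet link works from one reference-site probe.

    status: HTTP status code that came back, or None if nothing answered.
    strict: True  -> only "200 OK" counts (behaviour described in the quote).
            False -> any HTTP answer at all counts as a working link.
    """
    if status is None:
        return False
    if strict:
        return status == 200
    return True


for status in (200, 429, 503, None):
    print(status, link_is_up(status, strict=True), link_is_up(status, strict=False))
```

The lenient rule would also treat an error page served by a captive portal or gateway as a working connection, which may be why the strict rule exists.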
5) Message boards : Number crunching : Rosetta 4.1+ and 4.2+ (Message 92859)
Posted 1 Apr 2020 by Klimax
Post:
Would it be possible to support Windows on ARM (WOA) too?
6) Message boards : News : Rosetta's role in fighting coronavirus (Message 92718)
Posted 31 Mar 2020 by Klimax
Post:
I was surprised when I saw a comment from someone who had a queue for a couple of days. I joined the project about 10 days ago and I usually got 2-3 extra tasks besides the 12 running tasks. My tasks take 7-8 hours to complete so I never got a large queue. I wonder why there's such a big difference.

Also since yesterday I didn't receive any new tasks and I'm now running the last 5 tasks.

BOINC settings. For example, I have 1 day of work plus an extra 1 day as a buffer, with the Rosetta runtime set to 12 hours.
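
As a back-of-the-envelope sketch of why these settings change the queue so much: the real BOINC work fetch uses its own runtime estimates and other factors, so this is only a rough model. The 12 cores come from the quoted post; the 0.1-day buffer is an assumed value chosen for illustration.

```python
def queued_tasks(buffer_days, cores, task_hours):
    """Rough number of tasks needed to keep `cores` busy for `buffer_days`."""
    return buffer_days * 24 * cores / task_hours


# 1 day + 1 extra day of buffer, 12-hour tasks, 12 cores.
print(queued_tasks(2.0, 12, 12))  # 48.0

# Assumed small buffer of 0.1 day with 8-hour tasks on the same 12 cores.
print(round(queued_tasks(0.1, 12, 8), 1))  # 3.6
```

With a small buffer the estimate lands near the "2-3 extra tasks" mentioned in the quote, while a two-day buffer asks for dozens.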
7) Message boards : Number crunching : Problems with Minirosetta v1.54 (Message 60540)
Posted 7 Apr 2009 by Klimax
Post:
Klimax, why don't you go ahead and take a dump and email it to me, along with details on what you observed with it as it ran. I will forward it to the Project Team.

Oops, didn't know :-(
Last time I reported it, I was told to let it finish and upload. (IIRC)
Mail is being prepared.
8) Message boards : Number crunching : Problems with Minirosetta v1.54 (Message 60526)
Posted 7 Apr 2009 by Klimax
Post:
Once again, another task has stopped crunching, showing "Accepted Energy:1.#QNAN" and "Accepted RMSD:1.#QQ".
It is 39.50% complete; Model: 11, Step: 7788. I have now suspended the task.

I can create a dump file. Should I?

Or is it already fixed in the next version?
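
For reference, "1.#QNAN" is how older Microsoft C runtimes print a quiet NaN. A small Python sketch (not Rosetta code, and no claim about what minirosetta does internally) of why a NaN value tends to stick once it appears: it propagates through arithmetic and every ordered comparison with it is false.

```python
import math

energy = float("nan")

# Every ordered comparison with NaN is false, and NaN is not equal to itself.
print(energy < 0.0, energy > 0.0, energy == energy)  # False False False

# Arithmetic on NaN yields NaN again, so later steps cannot recover from it.
print(math.isnan(energy + 1.0))  # True
```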
9) Message boards : Cafe Rosetta : S'tel klat drawkcab! (Message 59708)
Posted 21 Feb 2009 by Klimax
Post:
Siht si ssendab...
10) Message boards : Number crunching : Problems with Minirosetta v1.54 (Message 59465)
Posted 8 Feb 2009 by Klimax
Post:
Hello.
The following task (http://boinc.bakerlab.org/rosetta/result.php?resultid=225859224) is suspended because it has produced "accepted energy": QNAN (Not a Number?) and RMSD: QO. Model number 25, step 9518. Running time: 20h 2min 21sec.
Set runtime: 24h.
Suspended for now. No crash before.
OS: Windows 7 beta. I can create a dump file using Task Manager.

Should I let it try to finish?

Thanks


I'd suggest allowing it to run normally. Was it still using CPU time? If you want to kind of cut it off, but get it to report in, let it run, then exit (not close) BOINC and restart it, let it run about 2 minutes, then exit again and restart, until you've done that 5 times and the task should be ended and report in with "too many restarts".


OK, I set the runtime to 8 hours so that the watchdog would cut it off at 24 hours. It has now uploaded and reported. I have dump files as well, if somebody on the team is interested. (Captured at the reported time and step.)
And I see I was not alone... :-(
11) Message boards : Number crunching : Problems with Minirosetta v1.54 (Message 59418)
Posted 7 Feb 2009 by Klimax
Post:
Hello.
The following task (http://boinc.bakerlab.org/rosetta/result.php?resultid=225859224) is suspended because it has produced "accepted energy": QNAN (Not a Number?) and RMSD: QO. Model number 25, step 9518. Running time: 20h 2min 21sec.
Set runtime: 24h.
Suspended for now. No crash before.
OS: Windows 7 beta. I can create a dump file using Task Manager.

Should I let it try to finish?

Thanks
12) Message boards : Number crunching : Optimized versions (Message 58414)
Posted 3 Jan 2009 by Klimax
Post:
Hi,
I have been crunching for Seti@Home for one year now and have used an optimized version of the crunching program. This results in much higher performance.
I was wondering if Rosetta@Home also has an optimized version? At the moment I see in my Rosetta Projects folder the following files:
minirosetta_1.47_x86_64-pc-linux-gnu
minirosetta_graphics_1.40_x86_64-pc-linux-gnu
rosetta_beta_5.98_x86_64-pc-linux-gnu
Are these files the "normal" programs to be used or is there an optimized version among them already? Btw, which program does what?


There are no optimized apps, as the code base changes very often. Although one can request the source code for Rosetta and build an optimized app, within a few months it would be replaced by a newer version with a slightly different core calculation, and the entire optimization would have to be done again...

This is as far as my memory goes. So this project uses the "raw" power of the processor and nothing more.
13) Message boards : Number crunching : Team Points STOLEN!!!!! Over 6 Million Points! (Message 54683)
Posted 27 Jul 2008 by Klimax
Post:
After reading the entire thread, I have some comments.
- Hmmm, once again Czechs show how to tunnel in a new area (DC). How sad that they are Czechs. (But not surprising.)
- Automatic foundership transfer was possibly added because of LHC and other projects with very busy admins, where people waited two years (not sure) to become the new founder after the old one went "missing".
14) Message boards : Number crunching : minirosetta v1.24 bug thread (Message 53324)
Posted 25 May 2008 by Klimax
Post:
This one was going well until I stopped it for a defrag. Then I restarted it and the WU just crashed.

WU:

<core_client_version>5.10.18</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
# cpu_run_time_pref: 86400


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x004484E8 read attempt to address 0x015B6000

Engaging BOINC Windows Runtime Debugger...



********************


BOINC Windows Runtime Debugger Version 6.3.2


Dump Timestamp : 05/25/08 07:12:02
Loaded Library : C:\Program Files\BOINC\dbghelp.dll
Loaded Library : C:\Program Files\BOINC\symsrv.dll
Loaded Library : C:\Program Files\BOINC\srcsrv.dll
LoadLibraryA( C:\Program Files\BOINC\version.dll ): GetLastError = 126
Loaded Library : version.dll
Debugger Engine : 4.0.5.0
Symbol Search Path: C:\Program Files\BOINC\slots;C:\Program Files\BOINC\projects\boinc.bakerlab.org_rosetta;srv*C:\DOCUME~1\Klimax\LOCALS~1\Temp\symbols*http://msdl.microsoft.com/download/symbols;srv*C:\DOCUME~1\Klimax\LOCALS~1\Temp\symbols*http://boinc.bakerlab.org/rosetta/symstore;srv*C:\DOCUME~1\Klimax\LOCALS~1\Temp\symbols*http://boinc.berkeley.edu/symstore

<snip>
*** Dump of the Process Statistics: ***

- I/O Operations Counters -
Read: 7, Write: 0, Other 209

- I/O Transfers Counters -
Read: 0, Write: 90, Other 0

- Paged Pool Usage -
QuotaPagedPoolUsage: 56076, QuotaPeakPagedPoolUsage: 56076
QuotaNonPagedPoolUsage: 2248, QuotaPeakNonPagedPoolUsage: 2304

- Virtual Memory Usage -
VirtualSize: 35221504, PeakVirtualSize: 35221504

- Pagefile Usage -
PagefileUsage: 10870784, PeakPagefileUsage: 10870784

- Working Set Size -
WorkingSetSize: 10285056, PeakWorkingSetSize: 10285056, PageFaultCount: 2542

*** Dump of thread ID 8052 (state: Ready): ***

- Information -
Status: Base Priority: Above Normal, Priority: Above Normal, , Kernel Time: 156250.000000, User Time: 7812500.000000, Wait Time: 11237628.000000

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x004484E8 read attempt to address 0x015B6000

- Registers -
eax=01958b4a ebx=0196a008 ecx=00000000 edx=00000001 esi=015b5ffc edi=015b6000
eip=004484e8 esp=0013ed18 ebp=0013ed38
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010202

- Callstack -
ChildEBP RetAddr Args to Child
0013ed38 00448669 0196a008 00000001 015b5ffc 00000004 minirosetta_1.24_windows_intelx!+0x0
0013ed50 00445a06 00000004 015b5ff0 83874561 01950e98 minirosetta_1.24_windows_intelx!+0x0
0013ed8c 0044656d 00000000 00000000 015a56f0 01950e98 minirosetta_1.24_windows_intelx!+0x0
00000000 00000000 00000000 00000000 00000000 00000000 minirosetta_1.24_windows_intelx!+0x0

*** Dump of thread ID 6492 (state: Waiting): ***

- Information -
Status: Wait Reason: ExecutionDelay, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 11237624.000000

- Registers -
eax=0000b471 ebx=00000000 ecx=018ff0d8 edx=00951634 esi=00000000 edi=018fff70
eip=7c90eb94 esp=018fff40 ebp=018fff98
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202

- Callstack -
ChildEBP RetAddr Args to Child
018fff98 7c802451 00000064 00000000 018fffec 0042945b ntdll!KiFastSystemCallRet+0x0
018fffa8 0042945b 00000064 00000000 7c80b683 00000000 kernel32!Sleep+0x0
018fffec 00000000 00429450 00000000 00000000 48920000 minirosetta_1.24_windows_intelx!+0x0


*** Debug Message Dump ****


*** Foreground Window Data ***
Window Name :
Window Class :
Window Process ID: 0
Window Thread ID : 0

Exiting...

</stderr_txt>
]]>

And I am on my way back to RALPH...
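
A short sketch of how the two numbers in the dump relate: the negative exit code and 0xc0000005 are the same 32-bit value, once read as signed and once as unsigned. The helper name ntstatus_hex is made up for illustration.

```python
def ntstatus_hex(exit_code):
    """Show a signed 32-bit Windows exit code as unsigned hex."""
    return "0x{:08x}".format(exit_code & 0xFFFFFFFF)


print(ntstatus_hex(-1073741819))  # 0xc0000005

# The faulting read address from the dump sits exactly on a 4 KiB page boundary.
print(0x015B6000 % 4096 == 0)  # True
```

A read that faults right at a page boundary is consistent with running past the end of a buffer, though the dump alone does not prove that.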
15) Message boards : Cafe Rosetta : Beating a dead horse.. (Message 53309)
Posted 24 May 2008 by Klimax
Post:
Agreed.

At least there has been very recent communication from the Project that they will again be in contact with MS re a possible xBox application.

Not certain why the xBox appears to be preferred over the PS3, but if they eventually do develop an xBox app, there is hope for (imho, the superior) PS3.

I am cautiously optimistic that this is not just talk, but real action on the part of the Project.

In the interim, both of my PS3's are crunching for F@H, and when I get around to installing linux, I'll likely do some crunching for PS3Grid and yoyo@Home's OGR wrapper.


Well, have you ever noticed one logo at the bottom of the front page? It is the logo of Microsoft Research...
16) Message boards : Number crunching : NOD32 3 says Virus in file! (Message 52557)
Posted 17 Apr 2008 by Klimax
Post:
And there was never ever a false positive or false negative!


I'll join in for the heck of it.

One day I woke up to my computers and half of my customers' computers having AVG reporting files within QuickBooks as a virus. More specifically, they were Help Files made in Macromedia/Adobe Flash, to which I can only assume other products were affected as well. As far as false negatives, I've seen a decent chunk of spyware that has just flown right over AVG at the time of infection.

What version of AVG was that?


No one is perfect; you're bound to have some problem with just about anything. AVG works relatively well; now if only it scanned a little faster, had more malware cleaning tools built-in rather than sometimes directing the user to an external tool, and that incident mentioned above didn't happen...

I chose NOD32 because, when I used the trial (simply for comparative purposes), it scanned a heck of a lot faster than any other virus scanner I've seen, and because I've seen it catch a few things, some of which were quite nasty, that others let by. However, I've also seen it let a few things slip by as well, so it's not perfect either. If only this incident with Minirosetta didn't happen...

Although the problem is gone, for future reference, you can set exclusions in many Anti-Virus programs to temporarily ignore false positives. When NOD32 on my computer started griping about minirosetta, I just told it to exclude *minirosetta*, and problem solved. Of course, for those a bit more paranoid, you could be more specific, only whitelisting it from the bakerlab.org site and in the RALPH/Rosetta Folders in BOINC.

Also, cleaning a computer is usually a different story than preventing the problem. Heck, most of the computers I've had to clean involved me finding a third-party freeware tool designed specifically to cure a common virus/malware/et cetera. Otherwise, AVG Anti-Malware with all its bells and whistles can scan not only files but also the registry, making it more ideal for cleaning a computer, while NOD32 has its own set of bells and whistles (remembering these are two different products), most of which focused on preventing anything from getting in in the first place.

Don't get me wrong... I'll usually recommend AVG, among one or two other free ones, simply because most people I know don't want to pay for an Anti-Virus (although some, after trying the free AVG, will in fact pay for one of the full-blown versions), and most of the computers that I put AVG on usually die of old age before they get another virus... most.

Just my five cents.


Interesting...that is all I can say. :-)
17) Message boards : Number crunching : small question (Message 52071)
Posted 22 Mar 2008 by Klimax
Post:
i just put it on 2 hours. :)

Small warning: Some decoys will take more than two hours, so do not be surprised if the setting is not honoured exactly in such a case. (So far, however, these long tasks are rare... for now :-) )

what are decoys??


Hopefully I can put it simply: each decoy is one successful attempt at folding a given protein.
I am not so sure about this description, but a full explanation should be somewhere here. The search function is your friend. I recommend trying the science subforum. (Unfortunately I do not have enough time to do the search myself. :-( )
18) Message boards : Number crunching : small question (Message 51999)
Posted 18 Mar 2008 by Klimax
Post:
i just put it on 2 hours. :)

Small warning: Some decoys will take more than two hours, so do not be surprised if the setting is not honoured exactly in such a case. (So far, however, these long tasks are rare... for now :-) )
19) Message boards : Number crunching : Problems with Minirosetta version 1.09 (Message 51998)
Posted 18 Mar 2008 by Klimax
Post:
I'm not sure we have an ENTIRELY AV problem here: of the two problems I reported with mini-Rosetta, one WU was rerun successfully; the other, WU 135123822, says it's still rerunning - which is odd because all the tasks showing in BOINC manager are Rosetta 5.95. Also, I'm not running ESET (whatever that is) - I have ZoneAlarm Internet Security Suite with the Kaspersky anti-virus, and it hasn't reported any problems with any of the Rosetta programs.


Small clarification: ESET is the maker of NOD32 AV.
20) Message boards : Number crunching : small question (Message 51975)
Posted 16 Mar 2008 by Klimax
Post:
smaller work units are always better.

when they crash, you lose less crunching time.
is there an advantage to have one WU work for longer time?
or better yet.. whats better... longer WU or shorter WU?


But then more computers are needed to get the same amount of "decoys".

with shorter task run-times a computer will produce fewer decoys per task but will run more tasks so you'll end up with the same work produced either way.

As Angus says, shorter tasks reduce the impact of failures, but longer tasks reduce the upload/download overhead to the user and the project.


More tasks, but there will be a very interesting overhead, and remember that we sometimes receive a task with a huge time to finish per decoy, and then it is not only the client that gets confused.

And so far the only problem I had was a huge memory requirement, with BOINC cycling through WUs and killing any free memory...
And I participate in RALPH with no errors... (24h setting!)

Just my POV. :-)
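
To put rough numbers on the short-versus-long task discussion above, here is a minimal sketch for one core. It assumes a task always finishes the decoy it is working on and that each task costs a fixed overhead for download, startup and upload. The function name and all figures (0.5 h per decoy, 0.05 h overhead, the 3 h slow decoy) are made up for illustration.

```python
def decoys_per_day(task_hours, hours_per_decoy, overhead_hours_per_task):
    """Decoys one core produces per day for a given target task runtime."""
    # Whole decoys that fit into the target runtime; at least one is always finished.
    per_task = max(1, int(task_hours // hours_per_decoy))
    # Real length of the task, including the fixed per-task overhead.
    actual_task_hours = per_task * hours_per_decoy + overhead_hours_per_task
    return 24 * per_task / actual_task_hours


# Without overhead, 2-hour and 24-hour tasks give the same output.
print(decoys_per_day(2, 0.5, 0), decoys_per_day(24, 0.5, 0))  # 48.0 48.0

# With a per-task overhead, the longer runtime comes out slightly ahead.
print(decoys_per_day(24, 0.5, 0.05) > decoys_per_day(2, 0.5, 0.05))  # True

# A slow decoy (3 h each) makes a 2-hour task overrun its target runtime.
print(decoys_per_day(2, 3.0, 0.05))
```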





©2021 University of Washington
https://www.bakerlab.org