Posts by [AF>Le_Pommier] Jerome_C2005

21) Message boards : Number crunching : Rosetta 4.0+ (Message 95287)
Posted 24 Apr 2020 by Profile [AF>Le_Pommier] Jerome_C2005
Post:
@Grand + Sid: I don't quite understand what the problem is with having tasks (whatever the number) cancelled by the server because the deadline is reached. Are they not sent on to other crunchers? The calculation will be done in the end, and no resource will actually be "wasted", correct? Or is it just about the "error count"? That should only affect me in the end, not the project...?

Regarding the rosetta deadline, I had indeed not noticed it was so short. But my cache is not "rosetta only": I've always been a multi-project boincer. It's true that it's an old habit from when internet connections were not so stable and projects would often run short of tasks, so having a cache was always a pleasant idea.

But again: this was absolutely not the problem I faced with the mini tasks (see the whole history of my explanations above). And again, I "solved" it by blocking the mini tasks on that machine; that was enough for me and did no harm to the project's research.

Thanks.
22) Message boards : Number crunching : Rosetta 4.0+ (Message 95217)
Posted 23 Apr 2020 by Profile [AF>Le_Pommier] Jerome_C2005
Post:
The problems I had with mini tasks had *nothing* to do with missed deadline.

The "canceled by the server" mini tasks had not started to run = sent back to other users, no problem.

The mini tasks that started to run on that machine never terminated / succeeded due to the problems I documented before.

Besides I had solved the problem on that host by blocking mini tasks execution (documented also, app_info = anonymous platform = no problem).

Rosetta tasks were running fine.

For the moment I have switched this host to another project, so even fewer problems :)
23) Message boards : Number crunching : Rosetta 4.0+ (Message 95198)
Posted 23 Apr 2020 by Profile [AF>Le_Pommier] Jerome_C2005
Post:
The platform was not anonymous before and my cache was 2 days. I don't know why boinc got so many tasks at the beginning, it's not a problem for me if the project cancels them as long as I have rosetta tasks running and it is the case.
I'm not concerned about you as you obviously don't care about the project.
But i am concerned about the project as trashing Tasks doesn't help. Having a reasonably sized cache would.


???

Instead of judging completely off base maybe you can read the problems that I have reported here, from the beginning, and that have not been solved.

I don't want to have mini tasks running forever with errors in the log, no longer using CPU and never terminating, and so blocking other tasks from running. All the mini tasks showed the same behavior on that host. None of the rosetta tasks did.

This is what I call "caring about the project", not yelping like a twitter troll.

Since when is 2 days not a "reasonable cache" for boinc?
24) Message boards : Number crunching : Rosetta 4.0+ (Message 95036)
Posted 21 Apr 2020 by Profile [AF>Le_Pommier] Jerome_C2005
Post:
The platform was not anonymous before and my cache was 2 days. I don't know why boinc got so many tasks at the beginning, it's not a problem for me if the project cancels them as long as I have rosetta tasks running and it is the case.

Mini tasks were failing after starting to run, not before.

You only see canceled tasks because rosetta is purging the list all the time and I have no real history.

The platform is now anonymous because I created the app_info (this is a direct consequence) in order to stop the project from sending mini tasks, and now things are OK for me.
25) Message boards : Number crunching : Rosetta 4.0+ (Message 94988)
Posted 20 Apr 2020 by Profile [AF>Le_Pommier] Jerome_C2005
Post:
A method using an app_info.xml file, re-describing only the rosetta app and not the mini app, allowed me to get rid of the horrible mini tasks for good; the rosetta tasks are now crunching like a charm.
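
For readers unfamiliar with the approach, here is a minimal sketch of generating such a file. It assumes the standard BOINC anonymous-platform layout (`app`, `file_info`, `app_version`) and that the project's app short name is `rosetta`. The executable name and version number below are hypothetical placeholders, not the values actually used on this host, and a real file may need further `file_info` entries for any extra files the app requires.

```python
import xml.etree.ElementTree as ET


def build_app_info(app_name, executable, version_num):
    """Build an app_info.xml tree that describes a single application."""
    root = ET.Element("app_info")

    # Declare the one app we accept work for.
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name

    # Describe the executable file that must already exist in the project dir.
    file_info = ET.SubElement(root, "file_info")
    ET.SubElement(file_info, "name").text = executable
    ET.SubElement(file_info, "executable")

    # Tie the app to that executable.
    app_version = ET.SubElement(root, "app_version")
    ET.SubElement(app_version, "app_name").text = app_name
    ET.SubElement(app_version, "version_num").text = str(version_num)
    file_ref = ET.SubElement(app_version, "file_ref")
    ET.SubElement(file_ref, "file_name").text = executable
    ET.SubElement(file_ref, "main_program")
    return root


# Placeholder values (assumptions): substitute the real executable name
# and version number found in the project directory.
tree = build_app_info("rosetta", "ROSETTA_EXECUTABLE_PLACEHOLDER", 1)
xml_text = ET.tostring(tree, encoding="unicode")
```

Because only one app is described, the client reports itself as an anonymous platform and the scheduler has no mini app to send work for.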
26) Message boards : Number crunching : Rosetta 4.0+ (Message 94667)
Posted 17 Apr 2020 by Profile [AF>Le_Pommier] Jerome_C2005
Post:
But this is on linux debian... ?

I see there is a 7.16.6 development version for linux... but the "stable" version is supposed to be 7.4.22 (old!); I got my 7.14.2 from the default repositories (I used the apt-get install command).
27) Message boards : Number crunching : Rosetta 4.0+ (Message 94640)
Posted 16 Apr 2020 by Profile [AF>Le_Pommier] Jerome_C2005
Post:
I realize the last mini task I had has been stuck for 3 days without using any CPU (at least no progress is being made on the task); none of the files in the slot have been updated for 3 days.

I don't know how to extract the err file from the linux-hosted machine, so I made screenshots, because I'm going to abort this task and, as we said, rosetta doesn't keep the task log on the server after one or two days.





I was given a solution to exclude all mini on that machine by using an app_info config file (re-describe all rosetta apps, and no mini app).

The rosetta tasks continue to run normally on that machine...
28) Message boards : Number crunching : Rosetta 4.0+ (Message 94462)
Posted 14 Apr 2020 by Profile [AF>Le_Pommier] Jerome_C2005
Post:
I had posted these details in my message above (April 12), but it seems the tasks have now been purged from the website: my first link no longer shows the example task I had given (how long do they remain visible? this was only 2 days ago).

But I posted examples of the error messages I could find in the slot directory at that time in that same message above.
29) Message boards : Number crunching : Rosetta 4.0+ (Message 94449)
Posted 14 Apr 2020 by Profile [AF>Le_Pommier] Jerome_C2005
Post:
It is a dedicated server hosted by a foreign provider, cheap and not recent : I cannot upgrade memory or anything.

All rosetta tasks are running fine (they are the biggest users of RAM) and finishing successfully, even with 6 concurrent tasks running, while all mini tasks (except one) are ending in error, even when limited to 1 at a time. So it cannot be a lack of RAM: rosetta uses more than mini, and I double-checked that I still have a fair amount of free RAM at any given time.

I realize I cannot select applications to exclude mini in rosetta preferences ! (unlike all other boinc projects)

And in app_config I cannot set the max number to 0 because it is ignored; I have to set it to 1 for boinc to take the limit into account... Do I have any other way to completely exclude mini and avoid wasting processing time?
30) Message boards : Number crunching : Rosetta 4.0+ (Message 94360)
Posted 13 Apr 2020 by Profile [AF>Le_Pommier] Jerome_C2005
Post:
I managed to access that boinc using boinctasks from another machine now, and I aborted all pending mini tasks (BT is great for managing many tasks at once).

I'll let it run for some days to see how it goes; for the moment there are enough rosetta (normal) tasks for some time, I think.
31) Message boards : Number crunching : Rosetta 4.0+ (Message 94352)
Posted 13 Apr 2020 by Profile [AF>Le_Pommier] Jerome_C2005
Post:
I have limited to 4 rosetta + 4 mini using an app_config. The rest is running TN-Grid.

The system is currently only using 4 GB out of 8, so plenty of RAM left.

The mini tasks keep having the same issue: I have some that have been running for over 30 hours without using any CPU or completing.

I have now limited mini to 1 task and rosetta to 6. I have trouble accessing the machine now except from a linux ssh command line, I have loads of mini tasks waiting, and I don't know how to bulk cancel them all...
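
One possible way to bulk-abort from an ssh session is to drive `boinccmd`, sketched below. It assumes that `boinccmd --get_tasks` prints a `name:` line followed later by a `project URL:` line for each task, and it relies on the `boinccmd --task URL task_name abort` operation. How to recognise a mini task is not established anywhere in this thread, so the filter is a caller-supplied (hypothetical) predicate, and the sketch defaults to a dry run that only prints the commands.

```python
import subprocess


def parse_tasks(get_tasks_output):
    """Return (project_url, task_name) pairs from `boinccmd --get_tasks` text.

    Assumes each task block has a 'name:' line before its 'project URL:' line.
    """
    tasks, name = [], None
    for raw in get_tasks_output.splitlines():
        line = raw.strip()
        if line.startswith("name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("project URL:") and name is not None:
            url = line.split(":", 1)[1].strip()
            tasks.append((url, name))
            name = None
    return tasks


def abort_commands(tasks, is_unwanted):
    """Build one boinccmd abort command per task accepted by the predicate."""
    return [
        ["boinccmd", "--task", url, name, "abort"]
        for url, name in tasks
        if is_unwanted(url, name)
    ]


def run(commands, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

A cautious workflow would be to capture the `--get_tasks` output, pass it through `parse_tasks` and `abort_commands`, inspect the dry-run listing, and only then call `run(..., dry_run=False)`.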
32) Message boards : Number crunching : Rosetta 4.0+ (Message 94267)
Posted 12 Apr 2020 by Profile [AF>Le_Pommier] Jerome_C2005
Post:
Hi

I have minirosetta tasks running on a linux machine, like this one, which I realized had been running for a long time with almost no CPU used.

In the slot file I found some errors so I decided to cancel them

No heartbeat from core client for 30 sec - exiting
FILE_LOCK::unlock(): close failed.: Bad file descriptor
*** glibc detected *** ../../projects/boinc.bakerlab.org_rosetta/minirosetta_3.78_x86_64-pc-linux-gnu: double free or corruption (!prev): 0x0fdefa10 ***

FILE_LOCK::unlock(): close failed.: Bad file descriptor
SIGSEGV: segmentation violation
*** glibc detected *** ../../projects/boinc.bakerlab.org_rosetta/minirosetta_3.78_x86_64-pc-linux-gnu: free(): corrupted unsorted chunks: 0x101dabd0 ***
*** glibc detected *** ../../projects/boinc.bakerlab.org_rosetta/minirosetta_3.78_x86_64-pc-linux-gnu: corrupted double-linked list: 0x101dadd8 ***
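
To triage many slot directories at once, a small script can count how often each kind of message appears in a stderr file. This is only a sketch: the patterns are taken from the excerpt above, while the category labels are my own naming, not anything BOINC or Rosetta defines.

```python
import re
from collections import Counter

# Patterns copied from the stderr excerpt; labels are illustrative only.
SIGNATURES = {
    "heartbeat_lost": re.compile(r"No heartbeat from core client"),
    "file_lock": re.compile(r"FILE_LOCK::unlock\(\): close failed"),
    "segfault": re.compile(r"SIGSEGV"),
    "heap_corruption": re.compile(r"\*\*\* glibc detected \*\*\*"),
}


def classify_stderr(text):
    """Count the lines of a stderr dump that match each known signature."""
    counts = Counter()
    for line in text.splitlines():
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


# Three lines taken verbatim from the excerpt above.
SAMPLE = (
    "No heartbeat from core client for 30 sec - exiting\n"
    "FILE_LOCK::unlock(): close failed.: Bad file descriptor\n"
    "*** glibc detected *** ../../projects/boinc.bakerlab.org_rosetta/"
    "minirosetta_3.78_x86_64-pc-linux-gnu: double free or corruption (!prev): 0x0fdefa10 ***\n"
)
counts = classify_stderr(SAMPLE)
```

Several heap-corruption hits per task, as in the excerpt, would point at the application binary on this host rather than at a missed deadline.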

It seems that I have others going the same way: the CPU time is almost zero despite a substantial run time...

I had only 2 rosetta mini that succeeded.

My rosetta tasks seem to be all OK.

What should I do? Completely stop rosetta mini from being sent to that machine?

Thanks
33) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 89949)
Posted 2 Dec 2018 by Profile [AF>Le_Pommier] Jerome_C2005
Post:
Hi

I have tasks erroring after 10 hours of calculation

<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
finish file present too long</message>
<stderr_txt>
command: rosetta_4.09_x86_64-apple-darwin -run:protocol jd2_scripting @flags_rb_12_01_955_1018__t000__0_C1_robetta -silent_gz -mute all -out:file:silent default.out -in:file:boinc_wu_zip input_rb_12_01_955_1018__t000__0_C1_robetta.zip -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 3814287
Starting watchdog...
Watchdog active.
======================================================
DONE :: 43 starting structures 28348 cpu seconds
This process generated 43 decoys from 43 attempts
======================================================
BOINC :: WS_max 5.21523e+08

BOINC :: Watchdog shutting down...
12:42:37 (98417): called boinc_finish(0)

</stderr_txt>
]]>


A few did succeed from the same lot after the same amount of calculation time

<core_client_version>7.14.2</core_client_version>
<![CDATA[
<stderr_txt>
command: rosetta_4.09_x86_64-apple-darwin -run:protocol jd2_scripting @flags_rb_12_01_948_1013__t000__1_C1_robetta -silent_gz -mute all -out:file:silent default.out -in:file:boinc_wu_zip input_rb_12_01_948_1013__t000__1_C1_robetta.zip -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 3810154
Starting watchdog...
Watchdog active.
======================================================
DONE :: 7 starting structures 28105.7 cpu seconds
This process generated 7 decoys from 7 attempts
======================================================
BOINC :: WS_max 9.90781e+08

BOINC :: Watchdog shutting down...
12:37:56 (98460): called boinc_finish(0)

</stderr_txt>
]]>
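
To compare the two outputs above more systematically, the summary lines can be pulled out with a few regular expressions. The sketch below assumes the stderr layout shown in these two excerpts; the field names in the returned dictionary are my own.

```python
import re

DONE_RE = re.compile(r"DONE ::\s+(\d+) starting structures\s+([\d.]+) cpu seconds")
DECOY_RE = re.compile(r"generated (\d+) decoys from (\d+) attempts")
TARGET_RE = re.compile(r"-cpu_run_time (\d+)")


def summarize(stderr_text):
    """Extract decoy counts and CPU time from a Rosetta stderr dump."""
    done = DONE_RE.search(stderr_text)
    decoys = DECOY_RE.search(stderr_text)
    target = TARGET_RE.search(stderr_text)
    if not (done and decoys and target):
        return None  # layout differs from the excerpts above
    cpu_seconds = float(done.group(2))
    target_seconds = int(target.group(1))
    return {
        "decoys": int(decoys.group(1)),
        "attempts": int(decoys.group(2)),
        "cpu_seconds": cpu_seconds,
        "target_seconds": target_seconds,
        "fraction_of_target": cpu_seconds / target_seconds,
        "finished_cleanly": "called boinc_finish(0)" in stderr_text,
    }


# Lines from the failing task above; the command line is trimmed to the
# options the parser needs.
SAMPLE = (
    "command: rosetta_4.09_x86_64-apple-darwin -nstruct 10000 -cpu_run_time 28800 -watchdog\n"
    "DONE :: 43 starting structures 28348 cpu seconds\n"
    "This process generated 43 decoys from 43 attempts\n"
    "12:42:37 (98417): called boinc_finish(0)\n"
)
summary = summarize(SAMPLE)
```

Both the failed and the successful task used roughly 98% of the 28800-second target and both called `boinc_finish(0)`, which suggests (without proving) that the difference lies in how the client handled the exit rather than in the calculation itself.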
34) Message boards : Number crunching : Rosetta 4.0+ (Message 88155)
Posted 24 Jan 2018 by Profile [AF>Le_Pommier] Jerome_C2005
Post:
PLUS it would be nice if the admin who created this topic came and read it sometimes!
35) Message boards : Number crunching : Rosetta 4.0+ (Message 88081)
Posted 16 Jan 2018 by Profile [AF>Le_Pommier] Jerome_C2005
Post:
Hi

All Rosetta 4.06 = "process got signal 4" on macOS 10.13.2 + boinc 7.8.4

https://boinc.bakerlab.org/rosetta/results.php?hostid=1665975&offset=0&show_names=0&state=0&appid=2
36) Message boards : Number crunching : Client Errors (Message 72734)
Posted 11 Apr 2012 by Profile [AF>Le_Pommier] Jerome_C2005
Post:
Hi people,

I don't have time to read everything that's been written since my previous post, but I just wanted to let you know that after installing the new stable boinc 7.0.25 on my iMac I resumed rosetta to give it another try, and WUs are running fine now.

They still credit some very small amounts, but it works fine ;)

(and I don't do it for credits of course :) )

Well I realize that the other information is that I upgraded from Snow Leopard to Lion just after I posted (on the 24/03), so there may be a link also...
37) Message boards : Number crunching : Client Errors (Message 72586)
Posted 24 Mar 2012 by Profile [AF>Le_Pommier] Jerome_C2005
Post:
OK, thanks for the info. Yes, I know about the potential instability. I'm trying version 7 because we are looking at GPU support under Mac OS X: the GPU started to be recognized by boinc with v7, but no GPU project works on my iMac so far. It seems I'd have to upgrade to Lion, but I don't want to for the moment. Some of my Alliance Francophone fellows have better results (with Lion), but afaik it's not working 100% yet.

For information, I have quite a high number of projects running on my machine and it's only happening with Rosetta for the moment.

I'll see what I do regarding my version, thanks.
38) Message boards : Number crunching : Client Errors (Message 72582)
Posted 23 Mar 2012 by Profile [AF>Le_Pommier] Jerome_C2005
Post:
Hi

for information I have the "no finished file error" since at least 15/03 (my results are all erroring, I hadn't noticed) on an iMac 27 (2010 / i7 / 16 GB) with Mac OS X 10.6.8


I reset the project yesterday, no success.

////
Ven 23 mar 09:41:17 2012 | rosetta@home | Task rb_03_22_29991_60688__t000__SAVE_ALL_OUT_IGNORE_THE_REST_45429_2203_0 exited with zero status but no 'finished' file
Ven 23 mar 09:41:17 2012 | rosetta@home | If this happens repeatedly you may need to reset the project.
Ven 23 mar 09:41:17 2012 | rosetta@home | Restarting task rb_03_22_29991_60688__t000__SAVE_ALL_OUT_IGNORE_THE_REST_45429_2203_0 using minirosetta version 324 in slot 1
Ven 23 mar 09:41:18 2012 | rosetta@home | Task rb_03_22_29991_60688__t000__SAVE_ALL_OUT_IGNORE_THE_REST_45429_2203_0 exited with zero status but no 'finished' file
Ven 23 mar 09:41:18 2012 | rosetta@home | If this happens repeatedly you may need to reset the project.
Ven 23 mar 09:41:18 2012 | rosetta@home | Restarting task rb_03_22_29991_60688__t000__SAVE_ALL_OUT_IGNORE_THE_REST_45429_2203_0 using minirosetta version 324 in slot 1
Ven 23 mar 09:41:19 2012 | rosetta@home | Computation for task rb_03_22_29991_60688__t000__SAVE_ALL_OUT_IGNORE_THE_REST_45429_2203_0 finished
Ven 23 mar 09:41:19 2012 | rosetta@home | Output file rb_03_22_29991_60688__t000__SAVE_ALL_OUT_IGNORE_THE_REST_45429_2203_0_0 for task rb_03_22_29991_60688__t000__SAVE_ALL_OUT_IGNORE_THE_REST_45429_2203_0 absent
Ven 23 mar 09:41:21 2012 | rosetta@home | Scheduler request completed: got 0 new tasks
Ven 23 mar 09:41:21 2012 | rosetta@home | No work sent
Ven 23 mar 09:41:21 2012 | rosetta@home | (reached daily quota of 8 results)
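
To see how often each task hits this restart loop, the event log can be scanned for the "no 'finished' file" message. This is a minimal sketch that assumes the pipe-separated log layout shown above; the function name is my own.

```python
import re
from collections import Counter

NO_FINISH_RE = re.compile(
    r"\| Task (\S+) exited with zero status but no 'finished' file"
)


def count_no_finish(log_text):
    """Count 'exited with zero status but no finished file' events per task."""
    return Counter(match.group(1) for match in NO_FINISH_RE.finditer(log_text))


# Two lines taken verbatim from the log above.
TASK = "rb_03_22_29991_60688__t000__SAVE_ALL_OUT_IGNORE_THE_REST_45429_2203_0"
SAMPLE = (
    "Ven 23 mar 09:41:17 2012 | rosetta@home | Task " + TASK
    + " exited with zero status but no 'finished' file\n"
    "Ven 23 mar 09:41:18 2012 | rosetta@home | Task " + TASK
    + " exited with zero status but no 'finished' file\n"
)
counts = count_no_finish(SAMPLE)
```

A task that appears several times within a few seconds, as here, is being restarted in a loop before the client gives up and reports the output file as absent.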


Boinc :
--------
Mer 21 mar 17:27:44 2012 | | Starting BOINC client version 7.0.20 for x86_64-apple-darwin
Mer 21 mar 17:27:44 2012 | | log flags: file_xfer, sched_ops, task
Mer 21 mar 17:27:44 2012 | | Libraries: libcurl/7.21.7 OpenSSL/0.9.7l zlib/1.2.3 c-ares/1.7.4
Mer 21 mar 17:27:44 2012 | | Running as a daemon
Mer 21 mar 17:27:44 2012 | | Data directory: /Library/Application Support/BOINC Data
Mer 21 mar 17:27:44 2012 | | Processor: 8 GenuineIntel Intel(R) Core(TM) i7 CPU 860 @ 2.80GHz [x86 Family 6 Model 30 Stepping 5]
Mer 21 mar 17:27:44 2012 | | Processor features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1 SSE4.2 POPCNT
Mer 21 mar 17:27:44 2012 | | OS: Mac OS X 10.6.8 (Darwin 10.8.0)
Mer 21 mar 17:27:44 2012 | | Memory: 16.00 GB physical, 829.07 GB virtual
Mer 21 mar 17:27:44 2012 | | Disk: 1.82 TB total, 828.83 GB free
Mer 21 mar 17:27:44 2012 | | Local time is UTC +1 hours
Mer 21 mar 17:27:44 2012 | | VirtualBox version: 4.1.10
Mer 21 mar 17:27:44 2012 | | WARNING: get_ati_mem_size_from_opengl failed to create PixelFormat
Mer 21 mar 17:27:44 2012 | | OpenCL: ATI GPU 0: Radeon HD 4850 (driver version 1.0, device version OpenCL 1.0, 512MB, 512MB available)
Mer 21 mar 17:27:44 2012 | | Config: report completed tasks immediately
Mer 21 mar 17:27:44 2012 | | Config: GUI RPC allowed from any host




