Posts by Idan Shifres

1) Message boards : Number crunching : Report long-running models here (Message 58672)
Posted 8 Jan 2009 by Idan Shifres
Post:
Thanks for the reply. I was talking about the scientific value of letting the WU run for a long time until it aborts itself, not about the points.

I would like to know whether it matters to you guys that the watchdog stopped the application later than 3 times the time it was supposed to run, or whether it doesn't matter to you whatsoever.
2) Message boards : Number crunching : Report long-running models here (Message 58665)
Posted 8 Jan 2009 by Idan Shifres
Post:
I would really like to know if it would be worth it to let the WU finish by itself or should we just abort it after 25-30 hours?

Do you look at the problems reported by the long-running models? Does it help to let them run or not? If it does help, I'll happily keep running them to the bitter end, but if not, I'd like to do some real science-helping work... :)
3) Message boards : Number crunching : Report long-running models here (Message 58612)
Posted 7 Jan 2009 by Idan Shifres
Post:
Idan, what is the normal runtime preference for the host that is running that task? (note to self, why hasn't watchdog ended it? If pref. is <24hrs)


The computer's working on BOINC 24/7... better watch that watchdog... :)
I have another WU with 40+ hours running right now, I'll let it finish to see if it gets the same watchdog message...

Whoops, looks like it just finished, as you can see: HERE.
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<stderr_txt>
**********************************************************************
Rosetta is going too long. Watchdog is ending the run!
CPU time: 183132 seconds. Greater than 3X preferred time: 10800 seconds
**********************************************************************
called boinc_finish

</stderr_txt>
]]>

Same watchdog being late again, now only after 50 hours... gave me 80 credits... which is 1.6 credits per hour... I think you should change your credit system a bit... :P
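
For anyone who wants to check the numbers, here is a rough sketch in Python that reads the watchdog lines quoted above and works out the run length and credit rate. The function name `watchdog_summary` is made up for illustration, and it assumes the 10800 seconds in the message is the preferred runtime itself (so the 3X cutoff would be 32400 seconds); the message wording leaves that open.

```python
import re

# The two watchdog lines as they appear in the stderr output quoted above.
STDERR = (
    "Rosetta is going too long. Watchdog is ending the run!\n"
    "CPU time: 183132 seconds. Greater than 3X preferred time: 10800 seconds\n"
)

def watchdog_summary(stderr_text, credits):
    """Hypothetical helper: pull the two numbers out of the watchdog
    message and derive hours run and credits per hour."""
    m = re.search(
        r"CPU time: (\d+) seconds\. Greater than 3X preferred time: (\d+) seconds",
        stderr_text,
    )
    if m is None:
        raise ValueError("no watchdog message found")
    cpu_seconds = int(m.group(1))
    preferred_seconds = int(m.group(2))  # ASSUMPTION: this is the preference, not the 3X cutoff
    hours = cpu_seconds / 3600
    return {
        "hours": hours,
        "credits_per_hour": credits / hours,
        # How many times the preferred runtime the task actually ran.
        "times_preferred": cpu_seconds / preferred_seconds,
    }

summary = watchdog_summary(STDERR, credits=80)
print(summary)
```

With the 80 credits mentioned above, this works out to roughly 50.9 hours and about 1.57 credits per hour, and under the stated assumption the task ran close to 17 times the preferred runtime before the watchdog ended it.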
4) Message boards : Number crunching : Report long-running models here (Message 58558)
Posted 6 Jan 2009 by Idan Shifres
Post:
The WU 1nkuA_BOINC_MPZN_vanilla_abrelax_5901_51691_0 finally finished after more than 66 hours only to get 80 credits... quite disappointing... :(
5) Message boards : Number crunching : Report long-running models here (Message 58557)
Posted 6 Jan 2009 by Idan Shifres
Post:
The WU 1nkuA_BOINC_MPZN_vanilla_abrelax_5901_51691_0 finally finished after more than 66 hours only to get 80 credits... quite disappointing... :(
6) Message boards : Number crunching : Report long-running models here (Message 58523)
Posted 5 Jan 2009 by Idan Shifres
Post:
I know this WU doesn't look promising, but I want it to finish by itself or quit for passing the deadline...

I see no one else returned a result for this WU, I hope it will get taken care of, so it won't repeat again...

Good day! :D


You might as well. Rosetta@home has finally got its workunit generators working again, but they haven't caught up with the demand for more workunits yet. I believe that Rosetta@home is one of the BOINC projects that will even let you return a workunit after the deadline and get credit for it, as long as not enough other people have already returned it to meet the quorum.


I will let it run. As you said in another post, it came to the last 10 mins of "estimated" time and just got stuck there, for 63 hours and 30 mins so far... hopefully this WU will finally finish, and even better if I get credit for it... :)

I'll keep updating... :)
7) Message boards : Number crunching : Report long-running models here (Message 58519)
Posted 5 Jan 2009 by Idan Shifres
Post:
I know this WU doesn't look promising, but I want it to finish by itself or quit for passing the deadline...

I see no one else returned a result for this WU, I hope it will get taken care of, so it won't repeat again...

Good day! :D
8) Message boards : Number crunching : Report long-running models here (Message 58514)
Posted 5 Jan 2009 by Idan Shifres
Post:
I have the WU 1nkuA_BOINC_MPZN_vanilla_abrelax_5901_51691_0 running for over 62 hours now; most of that time it has just been crawling towards the 100% mark from the 99.730 mark...

Going to hit deadline soon :(
9) Message boards : Number crunching : Why did I get Compute Error? (Message 58296)
Posted 31 Dec 2008 by Idan Shifres
Post:
One failed work unit is no reason for alarm.

Switching between projects usually does not cause a failure of the work unit. I have been 100% R@H for a while so I only have a little experience.

One of the complaints about R@H has been failed work units. The WU will go to another computer for processing. If the same unit fails on multiple computers, the project team will figure out what is wrong.

Keep crunching.


Thanks for the feedback, I'm not alarmed or anything, it's just unusual and I thought maybe there might be something wrong with my computer... :)

I have another WU that has been running for 25 hours, hopefully this one will finish well :) it's just that I don't want my computer to work so hard for a result that can't be used...
10) Message boards : Number crunching : Why did I get Compute Error? (Message 58293)
Posted 31 Dec 2008 by Idan Shifres
Post:
Did you reboot your computer several times while running the project? The statement "too many restarts" usually points to this problem. Rosetta will only allow, I think, 3 restarts per work unit before declaring a compute error. If you have to reboot the computer several times for any reason, suspend the project using the activity tab at the top of the page and then suspend.


I didn't restart the computer that crunched this WU.
Does a restart only mean restarting the computer, or also restarting work on Rosetta? Because if the WU is big (which means it takes about 20 hours) and my BOINC preferences switch between projects every 1 hour or so, it means that Rosetta will be "restarted" quite a few times with this big WU...

Could that be the problem?
11) Message boards : Number crunching : Why did I get Compute Error? (Message 58289)
Posted 31 Dec 2008 by Idan Shifres
Post:
Hi, I have no idea why I got a Compute/Client error, maybe someone can shed some light on this: http://boinc.bakerlab.org/rosetta/result.php?resultid=217026695

Thanks in advance.





