Posts by Warped

1) Message boards : Technical News : Starting at around 8:20 (PST) this morning the University of Washington network began to experience widespread connectivity problems (Message 108021)
Posted 2 Feb 2023 by Warped
Post:
Downloads stuck for two days already.
Is this being sorted out?

2023/02/02 08:45:21 | Rosetta@home | Started download of rb_01_30_484113_479201_ab_t000__robetta_FLAGS
2023/02/02 08:45:21 | Rosetta@home | Started download of rb_01_30_484113_479201_ab_t000__robetta.zip
2023/02/02 08:45:23 | | Project communication failed: attempting access to reference site
2023/02/02 08:45:23 | Rosetta@home | Temporarily failed download of rb_01_30_484113_479201_ab_t000__robetta_FLAGS: transient HTTP error
2023/02/02 08:45:23 | Rosetta@home | Backing off 03:24:02 on download of rb_01_30_484113_479201_ab_t000__robetta_FLAGS
2023/02/02 08:45:23 | Rosetta@home | Temporarily failed download of rb_01_30_484113_479201_ab_t000__robetta.zip: transient HTTP error
2023/02/02 08:45:23 | Rosetta@home | Backing off 05:12:47 on download of rb_01_30_484113_479201_ab_t000__robetta.zip
2023/02/02 08:45:23 | Rosetta@home | Started download of rb_01_30_484113_479201_ab_t000__robetta.200.3mers.index.gz
2023/02/02 08:45:23 | Rosetta@home | Started download of rb_01_30_484113_479201_ab_t000__robetta.200.8mers.index.gz
2023/02/02 08:45:25 | | Internet access OK - project servers may be temporarily down.
2023/02/02 08:45:25 | Rosetta@home | Temporarily failed download of rb_01_30_484113_479201_ab_t000__robetta.200.3mers.index.gz: transient HTTP error
2023/02/02 08:45:25 | Rosetta@home | Backing off 04:27:16 on download of rb_01_30_484113_479201_ab_t000__robetta.200.3mers.index.gz
2023/02/02 08:45:25 | Rosetta@home | Temporarily failed download of rb_01_30_484113_479201_ab_t000__robetta.200.8mers.index.gz: transient HTTP error
2023/02/02 08:45:25 | Rosetta@home | Backing off 04:29:20 on download of rb_01_30_484113_479201_ab_t000__robetta.200.8mers.index.gz
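
For anyone wanting to see at a glance how long each stuck file is backed off for, here is a minimal sketch that pulls the "Backing off" lines out of a pasted event log. The regular expression is only based on the line format shown in the log above; the function name parse_backoffs is made up for this example and is not part of BOINC.

```python
import re
from datetime import timedelta

# Matches e.g. "Backing off 03:24:02 on download of some_file_name"
# (pattern inferred from the pasted log lines; an assumption, not a BOINC spec)
BACKOFF_RE = re.compile(
    r"Backing off (\d+):(\d{2}):(\d{2}) on (download|upload) of (\S+)"
)

def parse_backoffs(log_text):
    """Return {file_name: timedelta} for every back-off line found."""
    backoffs = {}
    for line in log_text.splitlines():
        match = BACKOFF_RE.search(line)
        if match:
            hours, minutes, seconds, _direction, name = match.groups()
            backoffs[name] = timedelta(
                hours=int(hours), minutes=int(minutes), seconds=int(seconds)
            )
    return backoffs

sample = """\
2023/02/02 08:45:23 | Rosetta@home | Backing off 03:24:02 on download of rb_01_30_484113_479201_ab_t000__robetta_FLAGS
2023/02/02 08:45:23 | Rosetta@home | Backing off 05:12:47 on download of rb_01_30_484113_479201_ab_t000__robetta.zip
"""

for name, delay in parse_backoffs(sample).items():
    print(f"{name}: retry in {delay}")
```

Lines that do not mention a back-off are simply skipped, so the whole log can be pasted in unchanged.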
2) Message boards : Number crunching : No work (Message 88571)
Posted 27 Mar 2018 by Warped
Post:
statement from rosetta staff would be nice


+1

+2
3) Message boards : Number crunching : No Tasks Sent [Resolved] (Message 87786)
Posted 2 Dec 2017 by Warped
Post:
Is anyone else having the same problem, or is it an issue on my side?

2017/12/02 07:41:22 AM | Rosetta@home | update requested by user
2017/12/02 07:41:26 AM | Rosetta@home | Sending scheduler request: Requested by user.
2017/12/02 07:41:26 AM | Rosetta@home | Requesting new tasks for CPU
2017/12/02 07:41:28 AM | Rosetta@home | Scheduler request completed: got 0 new tasks
2017/12/02 07:41:28 AM | Rosetta@home | No tasks sent
2017/12/02 07:41:38 AM | Rosetta@home | Sending scheduler request: To fetch work.
2017/12/02 07:41:38 AM | Rosetta@home | Requesting new tasks for CPU
2017/12/02 07:41:41 AM | Rosetta@home | Scheduler request completed: got 0 new tasks
2017/12/02 07:41:41 AM | Rosetta@home | No tasks sent

Edit:
This was the case about 8 hours ago and happened again a few minutes ago.
However, I just received a number of tasks.
4) Message boards : Number crunching : Upload errors. (Message 87207)
Posted 5 Sep 2017 by Warped
Post:
Same for me.

Tue 05 Sep 2017 17:22:55 SAST | Rosetta@home | Started upload of rb_09_03_77151_119988_ab_stage0_t000___robetta_IGNORE_THE_REST_05_09_514908_71_0_r1550815990_0
Tue 05 Sep 2017 17:22:59 SAST | Rosetta@home | [error] Error reported by file upload server: [rb_09_03_77151_119988_ab_stage0_t000___robetta_IGNORE_THE_REST_05_09_514908_71_0_r1550815990_0] locked by file_upload_handler PID=255
Tue 05 Sep 2017 17:22:59 SAST | Rosetta@home | Temporarily failed upload of rb_09_03_77151_119988_ab_stage0_t000___robetta_IGNORE_THE_REST_05_09_514908_71_0_r1550815990_0: transient upload error
Tue 05 Sep 2017 17:22:59 SAST | Rosetta@home | Backing off 04:48:33 on upload of rb_09_03_77151_119988_ab_stage0_t000___robetta_IGNORE_THE_REST_05_09_514908_71_0_r1550815990_0
5) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 81283)
Posted 10 Mar 2017 by Warped
Post:
Welcome Back!

See Kel's post about the issue:
http://boinc.berkeley.edu/dev/forum_thread.php?id=10279&postid=76320#76320
6) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 80596)
Posted 5 Sep 2016 by Warped
Post:
I have to ask what's going on atm.

Servers all seemed to be reset the other day, lots of tasks showing on the homepage and server status page now but little or nothing coming down, though at the same time lots of tasks seem to be in progress.

Meanwhile validation is anything up to a day and a half behind with some of my team.


I share your concern.
The server status page bears little resemblance to what we experience. Even reporting tasks has become a lottery, validation is way behind, and external statistics reporting is sporadic.

Has Rosetta@home become a victim of its own success?
7) Questions and Answers : Web site : Website status report incorrect (Message 77215)
Posted 2 Aug 2014 by Warped
Post:
Strange as it may seem, the Server Status Page is actually correct. If your computer were inside the firewall at the University of Washington, you would not be aware of any issue. The problem is that the internet connection from the campus is being throttled to the point where our uploads and downloads are timing out. This is not monitored on the Server Status Page.

Based on what I see in the Number Crunching section of the Message Boards, I do not expect any resolution until Monday since the Rosetta staff have been away and I expect the UW IT staff will only be back on Monday. In addition, that's Pacific Time so it will likely only be about 15h00 UTC before resolution can be expected. On top of this, when resolved, the routers and switches will get hammered with data.
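
The 15h00 UTC figure can be checked with a few lines of Python. This is a rough sketch that assumes the working day in Seattle starts at 08:00 local time on Monday 4 Aug 2014 (the start time is an assumption for illustration); Pacific Daylight Time, which applies in August, is UTC-7.

```python
from datetime import datetime, timedelta, timezone

# Pacific Daylight Time (in effect in August) is UTC-7.
PDT = timezone(timedelta(hours=-7), name="PDT")

# Assumed start of the working day in Seattle: 08:00 on Monday 4 Aug 2014.
start_local = datetime(2014, 8, 4, 8, 0, tzinfo=PDT)

# Convert the same instant to UTC.
start_utc = start_local.astimezone(timezone.utc)
print(start_utc.strftime("%Y-%m-%d %H:%M UTC"))
```

A fixed offset is used here to keep the example self-contained; for dates that may fall on either side of a daylight saving change, a proper time zone database would be the safer choice.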
8) Message boards : Number crunching : Can't get tasks! (Message 76642)
Posted 21 Apr 2014 by Warped
Post:
Somewhat annoying, I must admit.

Do the "Make_work", "Feeder" and "Scheduler" servers need to be manually controlled?
9) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 76054)
Posted 19 Sep 2013 by Warped
Post:
endo_ae__ results cause (and suffer from) BOINC heartbeat problems and they do not checkpoint properly on one of my boxes, my guess is that they have very high RAM requirements (my internet PC with only 2GB RAM, having Firefox nearly always running, one Rosetta task plus 3 projects with very low RAM requirements). They should probably be limited to boxes with more than 3GB physical RAM.

Unfortunately I could not catch/spy on one just before it crashed, so the RAM thing is only a guess. After the crash the RAM history is lost with the PID so I cannot check the maximum usage. Other result types seem not to be affected.


Indeed. The endo_ae tasks are terrible:
1. The first checkpoint takes a number of hours.
2. I have at least three which have crashed after a few minutes.
3. The credit from them is poor. In one example over 8 hours for only 20 points.
10) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 75132)
Posted 18 Feb 2013 by Warped
Post:
Rosetta folks take their weekends seriously, THEY DO NOT WORK ON WEEKENDS. That means until they show up for work today, Monday, they have NO CLUE we are having any problems! As long as the fix is not broken or missing hardware it should be back up and running pretty quickly. Obviously broken or missing hardware could take a bit longer.

They know; I sent a PM yesterday morning (roughly 36 hours ago) to the mail address provided by Ethan (to use when there are issues).


That's wishful thinking:
1. Monday is a public holiday in the USA.
2. The e-mail address is likely only looked at when Ethan is at work.

I'm expecting at least a further 24 hours without any life from the project.

Edit: I am pleasantly surprised. Looks like the router's power supply was switched back on!
11) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 75089)
Posted 17 Feb 2013 by Warped
Post:
Oh well. It's Sunday so I expect we will have to wait another day for the comms to be sorted out.
12) Message boards : Number crunching : Mini Rosetta 3.45 (Message 74714)
Posted 11 Dec 2012 by Warped
Post:
This thing ran for over ten (10) hours on my 6hr limit, then failed.

Others have had this fail as well. Thanks for nothing!

hyb_af_bench_4aimA_SAVE_ALL_OUT_IGNORE_THE_REST_57052_456_2

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=496983312

Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
# cpu_run_time_pref: 21600
BOINC:: CPU time: 36354.1s, 14400s + 21600s[2012-12-11 17:50: 2:] :: BOINC
WARNING! cannot get file size for default.out.gz: could not open file.
Output exists: default.out.gz Size: -1
InternalDecoyCount: 0 (GZ)
-----
0
-----
Stream information inconsistent.
Writing W_0000001
======================================================
DONE :: 1 starting structures 36354.1 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================
called boinc_finish

</stderr_txt>
]]>

Validate state Workunit error - check skipped
Claimed credit 277.111340820468
Granted credit 0
application version 3.45


I try to remember to abort these "hyb" tasks as soon as they arrive.
However, I just discovered one and it's already way over limit without a single checkpoint :-(
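
The numbers in the pasted output above can be put side by side with a little arithmetic. This sketch assumes that the "14400s + 21600s" in the "CPU time" line is a cut-off made up of a 4-hour allowance plus the 6-hour run time preference; that reading is a guess from the log, not something confirmed by the project.

```python
# Values copied from the pasted stderr output
cpu_run_time_pref = 21600   # seconds, from "# cpu_run_time_pref: 21600"
extra_allowance = 14400     # seconds, from "14400s + 21600s" (assumed meaning)
cpu_time = 36354.1          # seconds, from "BOINC:: CPU time: 36354.1s"

# Assumed cut-off: run time preference plus the extra allowance
cutoff = extra_allowance + cpu_run_time_pref

print(f"cut-off: {cutoff} s ({cutoff / 3600:.1f} h)")
print(f"CPU time used: {cpu_time / 3600:.2f} h")
print(f"over the cut-off: {cpu_time > cutoff}")
```

On that reading the task used just over ten hours of CPU time against a six-hour preference before it was ended, which matches the "over ten hours" in the quoted complaint.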
13) Message boards : Number crunching : Rosetta@Home version 3.26 (Message 72757)
Posted 14 Apr 2012 by Warped
Post:
Thank you for improving the interval between checkpoints. For those of us who reboot from time to time, the work lost is now minimal.
14) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 71963)
Posted 7 Jan 2012 by Warped
Post:
Well done on getting the project back up and running!

Contrary to nearly every other IT-related promise/prediction I can recall, you managed expectations rather well, even working on a weekend ;-)
15) Message boards : Number crunching : 32-bit Windows XP vs 64-bit Linux (Message 71225)
Posted 8 Sep 2011 by Warped
Post:

The easiest way to check is to drop a page of results into a spreadsheet and work out the average of granted credit divided by cpu time for each OS.


Hi Danny

That's exactly what I did. They both use a 4-hour runtime preference and have both been averaging within a few percent of 14400 seconds per work unit.

I am aware that the application is only 32-bit, but was thinking that perhaps the ability of a 64-bit system to address the memory was part of the issue. Ideally, someone who also has a dual-boot but with both running 64-bit or both 32-bit could comment.

Regarding the credit claimed issue, you're saying that the credit granted is an accurate reflection of the relative work done. This means that the Ubuntu O/S is "doing more work", which gets back to my curiosity about how it achieves this.
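
For anyone who would rather not use a spreadsheet, the same check (average of granted credit divided by CPU time, per OS) can be sketched in a few lines of Python. The figures below are made up purely for illustration and are not real results; the function name credit_rate_by_os is also just for this example.

```python
from collections import defaultdict

def credit_rate_by_os(results):
    """Average (granted credit / CPU seconds) for each operating system."""
    rates = defaultdict(list)
    for os_name, cpu_seconds, granted_credit in results:
        rates[os_name].append(granted_credit / cpu_seconds)
    return {os_name: sum(values) / len(values) for os_name, values in rates.items()}

# (OS, CPU seconds, granted credit) -- hypothetical figures, not real data
results = [
    ("Windows XP 32-bit", 14400, 60.0),
    ("Windows XP 32-bit", 14400, 60.0),
    ("Ubuntu 64-bit", 14400, 80.0),
    ("Ubuntu 64-bit", 14400, 80.0),
]

rates = credit_rate_by_os(results)
ratio = rates["Ubuntu 64-bit"] / rates["Windows XP 32-bit"]
print(f"Linux/Windows credit rate ratio: {ratio:.2f}")
```

Averaging the per-task rates, rather than dividing total credit by total time, follows the spreadsheet method as described; with run times all close to 14400 seconds the two approaches give nearly the same answer.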
16) Message boards : Number crunching : 32-bit Windows XP vs 64-bit Linux (Message 71223)
Posted 8 Sep 2011 by Warped
Post:
I know this subject has been discussed before but cannot find that any conclusions were reached.

I run a machine with dual boot Windows XP 32-bit and Ubuntu 64-bit. I am finding that the Linux setup is giving about one-third more credit for the same run time.

I have also noticed that the Linux system claims about double the credit claimed under Windows and is consistently awarded less credit whereas the Windows set-up claims less than it is awarded.

What I am interested in is whether it's the 64-bit or the Linux (or maybe both?) which is giving the performance boost.

Perhaps there's no difference in the actual benefit to the project and the anomaly lies in the credit claiming and granting system.

Any comments?
17) Message boards : Number crunching : UNFINISHED WU RUNNING PAST DEADLINE (Message 71167)
Posted 31 Aug 2011 by Warped
Post:
Reduce the size of your cache. If you have a permanent (always on) internet connection, then make the cache about 3 or 4 days. As long as the project has work, you should be fine.
18) Message boards : Number crunching : Problems with web site (Message 71084)
Posted 18 Aug 2011 by Warped
Post:
As far as I know, they are different queues. Total queued jobs feeds the "ready to send"-queue.


That is my understanding as well.

If you look at the glossary at the bottom of the server status page, the Feeder sends tasks to the Scheduler. The queued jobs figure (on the home page) is the number available to the Feeder. The ready to send figure on the server status page is the number of tasks in the Scheduler that are immediately available to clients (our computers).
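
To make the two-queue picture concrete, here is a toy model of it. It only follows the description in this post; the class and method names (ToyServer, feed, request_work) and the slot limit are invented for illustration and have nothing to do with the actual BOINC server code.

```python
from collections import deque

class ToyServer:
    """Toy model: queued jobs -> (Feeder) -> ready to send -> (Scheduler) -> clients."""

    def __init__(self, queued_jobs, ready_slots):
        self.queued = deque(queued_jobs)  # "queued jobs" figure on the home page
        self.ready = deque()              # "ready to send" on the server status page
        self.ready_slots = ready_slots    # hypothetical cap on the ready queue

    def feed(self):
        """Feeder step: top up the ready-to-send queue from the queued jobs."""
        while self.queued and len(self.ready) < self.ready_slots:
            self.ready.append(self.queued.popleft())

    def request_work(self, wanted):
        """Scheduler step: hand out up to `wanted` tasks from the ready queue."""
        sent = []
        while self.ready and len(sent) < wanted:
            sent.append(self.ready.popleft())
        return sent

server = ToyServer(queued_jobs=range(10), ready_slots=3)
print(server.request_work(2))   # ready queue not fed yet, so nothing is sent
server.feed()
print(server.request_work(2))   # now tasks come down
print(len(server.queued), len(server.ready))
```

In this model a large queued jobs count with an empty ready to send queue means clients get nothing, which is one way the two figures can appear to disagree.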
19) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 70821)
Posted 30 Jul 2011 by Warped
Post:
Hi Warped

All BAM! gets is "Creating account at World Community Grid -> No response from project".

superlinkattechnion has no tasks.

Makes one feel surplus to requirements :-)

David


The server at BAM! is having problems. I suggest you attach using the standard BOINC Manager, Advanced View. In my Ubuntu version of BOINC Manager (6.10.17) the attach to project menu item is in Tools.
20) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 70815)
Posted 30 Jul 2011 by Warped
Post:
Hi Snagletooth

I have never installed a firewall in Ubuntu - I go through a router which protects me.

As I say, everything was fine (therefore port access in the router was set up OK) until it suddenly wasn't.

And since 6am this morning on this machine, 2 tasks completed ok - the next 9 exited with errors after a few minutes (1hr 09, 30m, 19m, 07m, 17m, 15m, 01m, 0.01m, 0.02m respectively).

It seems odd that 2 computers suddenly became error prone at the same time and in the same approximate quantity.

I feel very despondent about this - research into cancer is very personal to me and I thought I was contributing, albeit in a small way.

Please help as I am not technical in sorting out why this is being wasted. I'm afraid your paste of the error is double-dutch to me.

David


Hi David

Seeing that Rosetta is out of work, you may wish to try Help Conquer Cancer at World Community Grid. If the tasks run successfully there, I would suggest that the problem you had is merely coincidental. I run 64-bit Ubuntu and have no problems running BOINC.





©2024 University of Washington
https://www.bakerlab.org