Sometimes with Boincmanager time changes backwards.

Grutte Pier [Wa Oars]~MAB The Frisian
Joined: 6 Nov 05
Posts: 87
Credit: 497,588
RAC: 0
Message 7651 - Posted: 26 Dec 2005, 8:42:46 UTC

Sometimes when I check BOINC Manager again after a few minutes, I see the elapsed time going backwards.
Just now it went back from 2:53:31 to 2:23:17, although I saw 2:53:31 less than 10 minutes ago.
Does anybody have any idea?
Rytis

Joined: 17 Sep 05
Posts: 9
Credit: 183,185
RAC: 0
Message 7652 - Posted: 26 Dec 2005, 9:08:39 UTC - in response to Message 7651.  

Sometimes when I check BOINC Manager again after a few minutes, I see the elapsed time going backwards.
Just now it went back from 2:53:31 to 2:23:17, although I saw 2:53:31 less than 10 minutes ago.
Does anybody have any idea?

Most probably you have set BOINC not to leave applications in memory, so whenever applications switch you lose some of the work done, unless the switch occurs just after a checkpoint. Turning the "leave apps in memory" setting on will hopefully solve this. If you have very little RAM it may slow your computer a bit, but you can always turn the setting back off.
PrimeGrid
Administrator
Webmaster Yoda
Joined: 17 Sep 05
Posts: 161
Credit: 162,253
RAC: 0
Message 7653 - Posted: 26 Dec 2005, 9:16:58 UTC - in response to Message 7651.  
Last modified: 26 Dec 2005, 9:20:10 UTC

Sometimes when I check BOINC Manager again after a few minutes, I see the elapsed time going backwards.


It sounds like you may have the following setting in your preferences:
Leave applications in memory while preempted? no
Switch between applications every 30 minutes (or maybe less)

Rosetta has a known bug (documented in several threads on these boards) which is usually fixed by changing that first setting to yes.

EDIT: Rytis, you beat me to it :-)
*** Join BOINC@Australia today ***
Grutte Pier [Wa Oars]~MAB The Frisian
Joined: 6 Nov 05
Posts: 87
Credit: 497,588
RAC: 0
Message 7655 - Posted: 26 Dec 2005, 9:44:47 UTC

I only run R@H, and some of the machines have only 256 MB of RAM, so I don't want to slow them down too much.
Would setting "leave apps in memory" to yes change something when I run only one application?
Will I lose work done for instance?

"Rosetta has a known bug (documented in several threads on these boards) which is usually fixed by changing that first setting to yes."
Link please.

Webmaster Yoda
Joined: 17 Sep 05
Posts: 161
Credit: 162,253
RAC: 0
Message 7657 - Posted: 26 Dec 2005, 10:49:01 UTC - in response to Message 7655.  

I only run R@H, and some of the machines have only 256 MB of RAM, so I don't want to slow them down too much.


256MB is below recommended specs but it should not be a problem in itself, particularly if the machine only runs Rosetta.

Would setting "leave apps in memory" to yes change something when I run only one application? Will I lose work done for instance?


I'd say the opposite is the case. With a setting of "no", any time Rosetta or the work unit is suspended you lose the CPU time spent since the last save (which can be quite a while). The work unit can be suspended for a number of reasons, including "no" on the first two settings in preferences, running benchmarks, or a manual suspend.

Another reason it may have gone backwards would be if BOINC was restarted or the computer rebooted during the 10 minutes you mentioned (in which case, obviously, none of the settings would make a difference).

If none of the above happened, I don't know what caused the clock to go backwards - maybe someone else has ideas.
River~~
Joined: 15 Dec 05
Posts: 761
Credit: 285,578
RAC: 0
Message 7666 - Posted: 26 Dec 2005, 18:26:29 UTC - in response to Message 7657.  
Last modified: 26 Dec 2005, 18:28:13 UTC

If none of the above happened, I don't know what caused the clock to go backwards - maybe someone else has ideas.


The other side of the issue is that with a single project, setting yes can't hurt either: if the app is always running, it is always going to be taking part of your 256 MB! So I would still be inclined to set yes.

By the way, that setting does not affect the operating system swapping Rosetta out to virtual memory when you run a big program that needs most of your 256 MB. The setting only affects what happens when BOINC itself loads a different task.

While I think of it, the most useful setting if you have limited RAM is to set the prefs so BOINC does not run while the machine is in use. This means that it waits 3 minutes before bringing Rosetta back from virtual memory. Otherwise both Rosetta and your 'live' program can be slowed down by the constant swapping in and out of virtual memory. (Apologies if you already knew that.)

R~~
Grutte Pier [Wa Oars]~MAB The Frisian
Joined: 6 Nov 05
Posts: 87
Credit: 497,588
RAC: 0
Message 8632 - Posted: 9 Jan 2006, 9:52:19 UTC
Last modified: 9 Jan 2006, 9:54:10 UTC

Just now it happened again.
Time jumped back from 3.57.xx to 3.46.xx while I expected it to get over 4.00.00.

Leave applications in memory while preempted?
(suspended applications will consume swap space if 'yes') yes

I would still like an explanation for this.
This way you lose time even though the machines crunch full time?

River~~
Joined: 15 Dec 05
Posts: 761
Credit: 285,578
RAC: 0
Message 8634 - Posted: 9 Jan 2006, 11:31:35 UTC - in response to Message 8632.  

Just now it happened again.
Time jumped back from 3.57.xx to 3.46.xx while I expected it to get over 4.00.00.

Leave applications in memory while preempted?
(suspended applications will consume swap space if 'yes') yes

I would still like an explanation for this.
This way you lose time even though the machines crunch full time?


Sometimes an application stops for no apparent reason; you may see a message saying that the client exited without an error code. The most common reason is that a safety mechanism cuts in inappropriately (it's designed to prevent an app running on when BOINC has been shut down).

BOINC responds by restarting the app, if it can. The restart picks up from the previous checkpoint. Soon after the restart, the time jumps backward to the time at that checkpoint.

If this is what has happened, the symptom is that the time it jumps back to will be the lowest time that was previously displayed with the same %complete, plus or minus a few seconds.
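
The restart behaviour described above can be sketched as follows. This is only a minimal illustration, not BOINC's actual code: the function names and the checkpoint times are hypothetical, chosen so that the example reproduces a jump like the one reported (3:57 back to 3:46).

```python
def cpu_time_after_restart(cpu_time_s, checkpoint_times_s):
    """CPU time shown after a restart: the latest checkpoint already reached.

    Hypothetical model: a restarted task resumes from its most recent
    checkpoint, or from zero if no checkpoint has been written yet.
    """
    reached = [t for t in checkpoint_times_s if t <= cpu_time_s]
    return max(reached, default=0)


def hms(seconds):
    """Format a number of seconds as H:MM:SS."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Assumed checkpoint times (in seconds of CPU time); purely illustrative.
checkpoints = [1 * 3600 + 50 * 60, 3 * 3600 + 46 * 60]
before = 3 * 3600 + 57 * 60          # display showed 3:57:00
after = cpu_time_after_restart(before, checkpoints)
print(hms(before), "->", hms(after))  # 3:57:00 -> 3:46:00
```

In this model the work done between 3:46 and 3:57 is simply repeated after the restart, which is the lost time being discussed.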

On some projects checkpoints happen every few minutes; sadly, on Rosetta there are only nine checkpoints in total for the whole WU (ten if you count completion as a checkpoint, eleven if you count the start!).

Yes it is a way projects lose time. The proper fix is to make more checkpoints (so less time is lost) and not to interfere with a safety trap that is there for good reason. Playing with the safety trap is like putting a nail into a fusebox to stop the fuse blowing.

Users have asked the project team here to look at providing more checkpoints, but it seems it is particularly difficult to do on this project. So far the programmers have not had time to figure out how to do it, but they are well aware it is a high priority for us.

R~~
Grutte Pier [Wa Oars]~MAB The Frisian
Joined: 6 Nov 05
Posts: 87
Credit: 497,588
RAC: 0
Message 8644 - Posted: 9 Jan 2006, 14:14:45 UTC
Last modified: 9 Jan 2006, 14:32:58 UTC

As long as it's not just my problem I can accept it.
However, it remains a waste of time until they find a cure.

Snake Doctor
Joined: 17 Sep 05
Posts: 182
Credit: 6,401,938
RAC: 0
Message 8651 - Posted: 9 Jan 2006, 15:14:50 UTC - in response to Message 8644.  

As long as it's not just my problem I can accept it.
However, it remains a waste of time until they find a cure.


I assume you are looking at the correct column of the display. I regularly see the "to completion" time back up and move forward. Each time it moves forward it goes a little further toward completion. Usually this only happens in the first 50% of a WU. After that it just runs down to zero as expected. But the "CPU time" always moves forward as one would expect. The cause of this fluctuation in the completion time is the system adjusting to the actual run time of the WU as compared to the estimated run time.

If you are actually seeing this in the CPU time column, then I have no idea what is causing it.

Someone mentioned in this thread that setting the app to stay in memory only affects the project where you make the setting = YES. This is not true. If you set any of your projects to keep in memory = YES, that change will spread to all of your projects as they contact the respective servers.

As for lost time. ALL of the projects will lose some processing time if the application is removed from memory. The distance between checkpoints and the time of the swap in relation to the checkpoint will determine just how much time you lose. On a project like R@H the checkpoints are VERY far apart, so you can lose as much as 2 hours per swap on a slow machine. This can cause a WU to actually run forever under certain circumstances. Projects like Climate are a little better, usually a climate model will checkpoint every 15 min of CPU time, so that is the most you would ever lose. But even projects like S@H, E@H, and P@H will lose a few min of time if the swap occurs just before a checkpoint is reached.
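
As a rough worked example of the point about checkpoint spacing, the sketch below assumes evenly spaced checkpoints (a simplification; real work units need not checkpoint at regular intervals) and uses the figures quoted above: about 2 hours between checkpoints for R@H on a slow machine, and 15 minutes for a climate model. The function name is made up for illustration.

```python
def lost_cpu_time(cpu_time_at_swap_s, checkpoint_interval_s):
    """CPU seconds discarded when a task is removed from memory.

    Assumes checkpoints are written at exact multiples of
    checkpoint_interval_s, so everything since the last multiple is lost.
    """
    return cpu_time_at_swap_s % checkpoint_interval_s


swap_at = 115 * 60  # task swapped out after 1 h 55 min of CPU time

# R@H on a slow machine: checkpoints about 2 h apart, nothing saved yet.
print(lost_cpu_time(swap_at, 120 * 60) / 60)  # 115.0 minutes lost
# Climate model: a checkpoint every 15 min of CPU time.
print(lost_cpu_time(swap_at, 15 * 60) / 60)   # 10.0 minutes lost
```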

I have found that keeping the apps in memory has no significant impact on other systems operations. But I have more than 256 megs of memory. At the price of memory these days, that might be something you should consider even if you don't crunch for these projects. 256 megs is not a lot of memory on any system these days.

One other finer point would be to set your system to swap applications less often. You should probably have the switch interval set to at least 1 hour 30 minutes; 2 hours would be better. This gives the system the opportunity to reach a checkpoint before swapping, thus reducing the amount of lost CPU time.
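
To see why the switch interval matters, here is a small simulation under simplifying assumptions: evenly spaced checkpoints, and a task that is removed from memory at every switch and resumes from its last checkpoint (that is, "leave in memory" set to no). The function and its parameters are hypothetical, not part of BOINC.

```python
def saved_progress(slice_s, checkpoint_interval_s, n_slices):
    """CPU seconds safely checkpointed after n_slices run periods.

    Each period the task runs for slice_s seconds starting from its last
    checkpoint, then is swapped out and loses anything not checkpointed.
    """
    saved = 0
    for _ in range(n_slices):
        reached = saved + slice_s
        saved = (reached // checkpoint_interval_s) * checkpoint_interval_s
    return saved


interval = 120 * 60  # assumed 2 h between checkpoints
for switch_minutes in (60, 120, 150):
    done = saved_progress(switch_minutes * 60, interval, n_slices=4)
    print(switch_minutes, "min switch ->", done // 60, "min of work kept")
```

With a switch interval shorter than the checkpoint spacing, the task in this model never keeps any work, however many times it runs. Once the interval is at least as long as the checkpoint spacing, every run period reaches a checkpoint.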

Regards
Phil


We Must look for intelligent life on other planets as,
it is becoming increasingly apparent we will not find any on our own.
Grutte Pier [Wa Oars]~MAB The Frisian
Joined: 6 Nov 05
Posts: 87
Credit: 497,588
RAC: 0
Message 8654 - Posted: 9 Jan 2006, 16:05:33 UTC

It happened on the one with 1Gig of memory and R@H is the only program running, if you don't count Kaspersky and WAS and some surfing.

Processor usage
Do work while computer is running on batteries? (matters only for portable computers): no
Do work while computer is in use? yes
Do work only between the hours of: (no restriction)
Leave applications in memory while preempted? (suspended applications will consume swap space if 'yes'): yes
Switch between applications every (recommended: 60 minutes): 60000 minutes
On multiprocessors, use at most: 2 processors

Disk and memory usage
Use no more than: 2 GB disk space
Leave at least: 0.2 GB disk space free
Use no more than: 75% of total disk space
Write to disk at most every: 60 seconds
Use no more than: 75% of total virtual memory

Network usage
Connect to network about every (determines size of work cache; maximum 10 days): 10 days
Confirm before connecting to Internet? (matters only if you use a modem): no
Disconnect when done? (matters only if you use a modem): no
Maximum download rate: no limit
Maximum upload rate: no limit
Use network only between the hours of (enforced by versions 4.46 and greater): (no restriction)
Skip image file verification? (check this ONLY if your Internet provider modifies image files; UMTS does this, for example. Skipping verification reduces the security of BOINC.): no

Snake Doctor
Joined: 17 Sep 05
Posts: 182
Credit: 6,401,938
RAC: 0
Message 8800 - Posted: 11 Jan 2006, 20:45:03 UTC - in response to Message 8654.  

It happened on the one with 1Gig of memory and R@H is the only program running, if you don't count Kaspersky and WAS and some surfing.....


Rosetta is a little different from other projects in a lot of subtle ways. One of those is the percent complete counter. While most projects increment this counter in 0.01% increments, Rosetta jumps in 10% increments. Between these jumps the "Time to Completion" will increase as the CPU time increases. This is because time to completion is calculated using the CPU time and percent complete. If the percent complete does not change then the calculated time to completion must go up. It's a math thing.

When the WU hits the next 10% increase the time to completion will suddenly decrease in one big jump by a large amount. The amount it jumps represents the amount of time it takes to process 10% of the WU on your particular machine. This time will of course vary depending on the size of the WU as well.
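
The arithmetic can be sketched like this. It assumes the simple proportional estimate described above (remaining time derived only from CPU time and percent complete); the real BOINC client may compute its estimate differently, and the function name is invented for illustration.

```python
def time_to_completion(cpu_time_s, percent_complete):
    """Proportional estimate of remaining CPU seconds.

    If percent_complete of the work took cpu_time_s, the remaining
    (100 - percent_complete) is assumed to proceed at the same rate.
    """
    if percent_complete <= 0:
        raise ValueError("no progress reported yet")
    return cpu_time_s * (100 - percent_complete) / percent_complete


# Progress stuck at 10% while CPU time grows: the estimate climbs.
print(time_to_completion(30 * 60, 10) / 3600)  # 4.5 hours
print(time_to_completion(60 * 60, 10) / 3600)  # 9.0 hours
# Progress jumps to 20% at the same CPU time: the estimate drops.
print(time_to_completion(60 * 60, 20) / 3600)  # 4.0 hours
```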

If you are processing WUs successfully I would not worry too much about the Manager displays. As long as the system does not run for very long periods showing no progress on the CPU time and percent counters, and you are getting about the same amount of credit for the same class of WU, I would not worry about it.

In the information you provided I am assuming the "60000 minutes" between application switches is a typo. So long as this value is set somewhere between 90 and 120 it should be sufficient. This is particularly true if you are only running the one project. You have "Leave in memory" set to "YES", which is correct. As long as you do not shut BOINC down, or do so only very soon after a 10% jump in the percent complete of a WU, this is the best setup you can have for the project.

Regards
Phil


Grutte Pier [Wa Oars]~MAB The Frisian
Joined: 6 Nov 05
Posts: 87
Credit: 497,588
RAC: 0
Message 8803 - Posted: 11 Jan 2006, 21:43:18 UTC

It says 60000 minutes.
I can also set it to 60 minutes, because R@H is the only project running.
And it's the progress that jumps back now and again.

Snake Doctor
Joined: 17 Sep 05
Posts: 182
Credit: 6,401,938
RAC: 0
Message 8807 - Posted: 11 Jan 2006, 22:09:55 UTC - in response to Message 8803.  

...
And it's the progress that jumps back now and again.


I can't help you with that one, I have never seen that happen. The progress on mine always jumps to 1% when the WU starts. After a while it jumps to 10%, and continues to jump in 10% increments until it completes.

Regards
Phil





©2024 University of Washington
https://www.bakerlab.org