7-day penalty when project is down for a few hours

Wurgl

Joined: 15 Apr 06
Posts: 6
Germany
Message 3902 - Posted: 15 Apr 2006, 8:57:12 UTC

Hello,

I already mentioned this on the Einstein@Home forum. The project was down for about 5 hours, and this downtime caused BOINC not to connect for 604800 seconds, which is seven long days.

It seems that after several retries BOINC tries to refetch the master scheduler list, and when this fails, it immediately sets this penalty. Please note that I was asleep during that time; I did not click any button for fun or any other reason, and I did not touch the computer.

And I really think that this penalty is much too high. I am okay with half a day, or with no connection until all WUs are crunched, or something like that. But seven days might be beyond the report deadline.

It might not be a problem for cobblestone junkies sitting in front of their computers and watching the CPU cycles the whole day. But this penalty is a problem for anyone with a large number of machines at different locations. A few (hopefully not all) will run out of work, and the poor guy has to manually check every machine and update a few of them, if he notices the problem at all.

Please set this penalty to a lower limit, at least for the first refetch of the scheduler list.
Jord
Volunteer tester
Help desk expert

Joined: 29 Aug 05
Posts: 15482
Netherlands
Message 3906 - Posted: 15 Apr 2006, 14:43:22 UTC

I have just gotten an answer on this from David Anderson yesterday, as I saw it back off to 163 hours when I forgot to allow test version 5.4.2 through my firewall for a day!

From email:
The exponential backoff mechanism ramps up to 2 weeks
if network communication is not working
(because of firewall or lack of connection).
In principle this shouldn't matter because everything
gets retried when communication is enabled.

However: exponential backoff is to protect servers.
We shouldn't be using exponential backoff when
we know that the network problems are client-side.
We should just retry every few minutes.
I don't want to make this change now,
but we should eventually do it.

-- David
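
The policy described in the email can be sketched roughly as follows. This is only an illustration, not BOINC's actual code: the function name `backoff_delay`, the 1-minute starting delay, and the 5-minute client-side retry interval are all assumptions made for the example. Only the two-week cap comes from the email. Real clients usually add some random jitter as well, which is left out here to keep the sketch deterministic.

```python
# Hypothetical sketch of capped exponential backoff (not BOINC source code).

MIN_DELAY = 60                    # assumed starting delay: 1 minute, in seconds
MAX_DELAY = 14 * 24 * 3600        # cap of 2 weeks, as stated in the email
CLIENT_RETRY_INTERVAL = 5 * 60    # assumed value for "every few minutes"


def backoff_delay(n_failures, client_side=False):
    """Return the number of seconds to wait before the next retry.

    n_failures  -- count of consecutive failed attempts so far
    client_side -- True if the failure is known to be local
                   (firewall, no network connection)
    """
    if n_failures <= 0:
        return 0
    if client_side:
        # Backoff exists to protect servers, so a local problem
        # is simply retried at a short fixed interval.
        return CLIENT_RETRY_INTERVAL
    # Double the delay on each consecutive failure, up to the cap.
    return min(MIN_DELAY * 2 ** (n_failures - 1), MAX_DELAY)


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 15, 20):
        print(n, backoff_delay(n), backoff_delay(n, client_side=True))
```

With a doubling schedule like this one, the delay grows gradually from minutes to hours to days, which is different from the jump straight to 604800 seconds (7 * 24 * 3600) reported at the start of this thread.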

Wurgl

Joined: 15 Apr 06
Posts: 6
Germany
Message 3908 - Posted: 15 Apr 2006, 14:58:57 UTC - in response to Message 3906.  

I have just gotten an answer on this from David Anderson yesterday, as I saw it back off to 163 hours when I forgot to allow test version 5.4.2 through my firewall for a day!

From email:
The exponential backoff mechanism ramps up to 2 weeks
...


Thanks for the answer. As you can see here, there is actually no exponential backoff, but rather an "explosional" one :-) Similar to the Big Bang.

It was trying every few minutes, and then suddenly 7 days.


Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.