Invalid pointer issue

manalog

Joined: 8 Apr 15
Posts: 24
Credit: 233,155
RAC: 0
Message 98449 - Posted: 9 Aug 2020, 12:18:36 UTC

Hi all,
A couple of weeks ago I noticed that some tasks processed by my Intel T7250 laptop had error messages like this one in their stderr (visible from the task page on the website):
*** Error in `../../projects/boinc.bakerlab.org_rosetta/rosetta_4.20_x86_64-pc-linux-gnu': free(): invalid pointer: 0x00000000067bd783 ***

repeated hundreds of times. I thought it was a problem with the laptop, so I switched it to other projects. A few days ago I tried crunching Rosetta on that laptop again. It ran two tasks overnight for 8 hours, with no suspensions and no restart of boinc-client: the (valid) tasks showed a clean stderr with a good number of decoys. The next day I needed to turn the laptop off while two tasks were at the 4-hour mark. So I changed the Rosetta runtime preference to 2 hours, issued an update on the laptop and restarted the BOINC client via /etc/init.d/boinc-client restart.
When the tasks were sent to the server, I checked the stderr on the webpage and the "invalid pointer" issue was present again. Nonetheless, the tasks were considered valid and had a correct number of decoys (around 30 for a JHR task, which is a reasonable number for 4 hours of work on a T7250 core).
I have another host running Rosetta, an old iMac with a T7200 processor running Debian. It has run Rosetta 24 hours a day for a month, first under Windows 7 and for the past week under Debian, without any issue of this kind, always set to 8 hours. I do not remember whether I restarted boinc-client under Debian before today, but I am fairly sure I suspended the tasks a couple of times.
Today I restarted the boinc-client on this machine and then checked the stderr of the running rosetta processes via
cat /proc/(PID)/fd/2
and... Invalid pointer again! Same issue as the laptop!
So now I know that this issue is not isolated to my laptop and can be reproduced on other hosts. It could therefore be dangerous for the science of the project, and I think we as volunteers should do a bit of debugging.
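For anyone who wants to repeat the stderr check without reading the output by eye, here is a minimal sketch that counts the error lines for every running Rosetta process. It assumes `pgrep` is installed and that you have permission to read `/proc/<pid>/fd/2` (on most installs the client runs as a separate user, so this probably needs root). The function names are made up for illustration.

```shell
#!/bin/sh
# Count lines containing "free(): invalid pointer" in one file.
# The argument can be a regular file or /proc/<pid>/fd/2.
count_invalid_pointer() {
    # -c prints the number of matching lines, -F matches the text literally.
    # "|| true" keeps a zero-match result from counting as a failure.
    grep -c -F 'free(): invalid pointer' "$1" 2>/dev/null || true
}

# Check the stderr of every process whose command line contains "rosetta".
scan_rosetta_stderr() {
    for pid in $(pgrep -f rosetta); do
        n=$(count_invalid_pointer "/proc/$pid/fd/2")
        echo "PID $pid: ${n:-0} invalid-pointer line(s)"
    done
}
```

Calling `scan_rosetta_stderr` right after a client restart prints one line per Rosetta process; a non-zero count means the message has shown up for that task.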

1) This invalid pointer error in stderr does not affect the validation of the tasks, the credit, or the number of decoys. It could be a harmless bug that only prints a warning and can be ignored, but it could also mean that results are accepted as valid when they are not. Only someone from the staff can answer this, and I think it matters, because if the problem affects the results it is dangerous.
2) It seems to affect only the 64-bit version of Rosetta for Linux. I restarted boinc-client on a 32-bit Pentium 4 and cat /proc/(PID)/fd/2 showed a clean stderr containing only the command issued by the BOINC manager.
3) I am not sure how to replicate it but my guess is that it could be caused:
a) by a suspension of the task (unlikely)
b) by a change in the runtime preferences during the execution of the task, followed by a restart of the boinc-client
c) by a restart of the boinc-client while tasks are running
d) randomly

I would ask any volunteers running Linux (the problem appeared on both Mint (laptop) and Debian (iMac)) to try steps a, b and c separately and then in combination, checking as soon as the task starts again whether the stderr (cat /proc/(PID)/fd/2) shows any invalid pointer error message.
Thank you :)
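To test causes a, b and c one at a time, a small wrapper can compare the error count before and after each action. This is only a sketch: `check_step` is a made-up helper, and it assumes the task's stderr is also available as a file that survives a client restart (the path in the usage note is an assumption about a Debian-style install and may differ on your system).

```shell
#!/bin/sh
# check_step LABEL STDERR_FILE COMMAND [ARGS...]
# Runs COMMAND and reports whether STDERR_FILE gained new error lines.
check_step() {
    label=$1
    file=$2
    shift 2
    # Count matching lines before the action (empty if the file is unreadable).
    before=$(grep -c -F 'free(): invalid pointer' "$file" 2>/dev/null || true)
    "$@"
    # Count again after the action has finished.
    after=$(grep -c -F 'free(): invalid pointer' "$file" 2>/dev/null || true)
    before=${before:-0}
    after=${after:-0}
    if [ "$after" -gt "$before" ]; then
        echo "$label: new invalid-pointer lines ($before -> $after)"
    else
        echo "$label: no new invalid-pointer lines"
    fi
}
```

For case c this could be called as `check_step "restart" /var/lib/boinc-client/slots/0/stderr.txt sh -c '/etc/init.d/boinc-client restart; sleep 120'`, where the sleep gives the task time to start again before the second count. The slot path and the 120-second wait are guesses, not verified values.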
Ray Murray

Joined: 22 Apr 20
Posts: 17
Credit: 270,864
RAC: 0
Message 98450 - Posted: 9 Aug 2020, 15:39:20 UTC - in response to Message 98449.  

Hi Manalog,
I'm Windows only, so not much help for your Linux except for your question of changing the runtime. This change would only affect future tasks. It will have no influence on any tasks already in progress or waiting to run. That's my only input, sorry, but it rules out one cause.
manalog

Joined: 8 Apr 15
Posts: 24
Credit: 233,155
RAC: 0
Message 98460 - Posted: 10 Aug 2020, 9:01:40 UTC - in response to Message 98450.  

That's weird. On Linux I can change the preferred runtime of a task while it is executing: just change the preference, issue an 'update' and then restart the client. If a task is, let's say, at 4 hours and was set to 8 hours, and you change the preference to 2 and restart the client, it will finish the decoy it is working on and send the result to the server.
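From the command line, that sequence might look like the sketch below. The preference itself is changed on the website; the script only does the 'update' and the restart. The project URL is inferred from the project directory name in the error message and is an assumption, as are the 30-second wait and the `run` / `apply_new_runtime` helper names. `boinccmd` may also need to be run from the BOINC data directory or with the RPC password.

```shell
#!/bin/sh
# Assumed project URL; override by setting PROJECT_URL in the environment.
PROJECT_URL=${PROJECT_URL:-https://boinc.bakerlab.org/rosetta/}

# Run a command, or just print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Fetch the new preferences from the project, then restart the client.
apply_new_runtime() {
    run boinccmd --project "$PROJECT_URL" update
    # The update is asynchronous, so wait a little before restarting.
    run sleep "${WAIT_SECS:-30}"
    run /etc/init.d/boinc-client restart
}
```

Setting `DRY_RUN=1` before calling `apply_new_runtime` prints the three commands without executing them, which is a safe way to check the sequence first.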
Ray Murray

Joined: 22 Apr 20
Posts: 17
Credit: 270,864
RAC: 0
Message 98462 - Posted: 10 Aug 2020, 12:23:04 UTC - in response to Message 98460.  
Last modified: 10 Aug 2020, 12:29:18 UTC

My understanding was that the Target Runtime is set when the task is allocated, but I am happy to be proved wrong in that assumption.

If the results are returned as Valid, then it could just be that that particular part of the Task didn't return what was being looked for but the Task otherwise returned good work.




©2024 University of Washington
https://www.bakerlab.org