mirror of
http://galexander.org/git/simplesshd.git
synced 2025-01-14 17:10:55 +00:00
musings about the dump, which must be caused by atexit()
parent 2d8d649cdd
commit e204c1ea74
69
NOTES
@@ -898,6 +898,75 @@ the manifest...
allowBackup="false" took immediate effect and had no surprises...


August 4, 2019.

I finally got a dump from a user (Hammad), and it's quite distressing.
The stack trace is roughly:
   backtrace()
   sigsegv_handler()
   /system/bin/app_process64+0x2a90
   __kernel_rt_sigreturn()
   A5xContext::HwAddNop(unsigned int *, unsigned int)
   EsxCmdMgr::IssuePendingIB1s(EsxFlushReason, int, int)
   EsxCmdMgr::Flush(EsxFlushReason)
   EsxContext::Destroy()
   EglContext::DestroyEsxContext()
   EglDisplay::MarkContextListForDestroy()
   EglDisplay::Terminate(int)
   EglDisplayList::Destroy()
   EglDisplay::DestroyStaticListsMutexesAndTlsKeys()
   EsxEntryDestruct()
   /system/vendor/lib64/egl/libGLESv2_adreno.so+0x12780
   [... cut off at 16 ...]

So many questions!  I think app_process64 must be the actual C main() of
a process, responsible for branching into all the android system
libraries?  I imagine it's involved because it's somehow intercepted the
SIGSEGV and re-dispatched it to my handler?  I don't see any way we could
have branched into libGLESv2_adreno from userland, so the SIGSEGV must
come from the UI thread, I guess?  Maybe this SIGSEGV is actually the
sort of thing we'd get if we tried to call UI code from the non-UI
thread??

It looks like GLES is busy cleaning itself up, and it crashes.  Why does
it crash?  Why's it trying to clean itself up?

Hammad says there is no problem using sshd...I thought he meant that the
re-start logic is working for him, but his dropbear.err has multiple dumps
in it!  The SIGSEGVs are apparently not killing the daemon.

There are no timestamps on the dumps, but it looks like they're
associated with activity anyways.  Each dump happens between "Disconnect
received" and "sigchld".  Some of them have "server select out"
interleaved into the dump, which I think is the result of Hammad running:
   while true; do ssh phone 'exit'; done
That is, it appears he starts a new connection the very instant the old
connection ends.  So the new connection comes into the server process
while the child process is in the act of dying.

The thing is, I don't see how it could possibly be getting signals from
the Java side of things, because it fork()s before setting up the signal
handling.  It's not just running in a different thread, it should be a
totally separate process.  I can test this but I don't think I'm wrong
about that.

So I guess just about the only thing that's really possible is that
there's an atexit() which survives the fork() because it isn't followed
up with an execve().  It's not caused by ARM, or even necessarily by
Android 9...the reason it doesn't show up in the emulator is that the
libGLES that registers the atexit() is vendor-supplied for specific
hardware ("Adreno").

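To convince myself of the mechanism, here's a minimal sketch of a handler
that is registered before fork() and then runs in the child when the child
calls exit().  It is plain POSIX C and has nothing to do with Adreno;
cleanup_hook() and inherited_hook_ran() are names made up for the sketch,
and the pipe is just a way for the parent to see whether the hook ran.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of a pipe; only the forked child ever sets this. */
static int report_fd = -1;

/* Hypothetical stand-in for the cleanup a vendor library might register. */
static void cleanup_hook(void)
{
    if (report_fd >= 0 && write(report_fd, "X", 1) != 1) {
        /* nothing useful to do on failure in this sketch */
    }
}

/* Returns 1 if the hook registered before fork() ran in the child,
 * 0 if it did not, -1 on error. */
int inherited_hook_ran(void)
{
    static int registered = 0;
    int fds[2];
    char mark;
    ssize_t n;
    pid_t pid;

    if (!registered) {              /* register once, in the parent */
        if (atexit(cleanup_hook) != 0)
            return -1;
        registered = 1;
    }
    if (pipe(fds) != 0)
        return -1;
    fflush(NULL);                   /* don't let the child re-flush our stdio */
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                 /* child: no execve(), so the hook is still there */
        close(fds[0]);
        report_fd = fds[1];
        exit(0);                    /* exit() runs the inherited atexit() handlers */
    }
    close(fds[1]);
    n = read(fds[0], &mark, 1);     /* 1 byte = hook ran, EOF = it did not */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n == 1 ? 1 : (n == 0 ? 0 : -1);
}
```

Calling inherited_hook_ran() from a test main() should return 1 on an
ordinary Linux box, which is consistent with the theory but says nothing
yet about what the vendor libGLES actually registers.
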
So I need to figure out how to bypass the atexit() somehow, perhaps by
calling _exit() directly?


XXX - merge back into main branch, because I'll want to keep the dump facility
XXX - make the dump go deeper in the stack
XXX - put a crash in an atexit() to be sure it presents about this way
XXX - test re-start mechanism, which doesn't seem to work on the first try if it crashes
XXX - test bypassing that crash
XXX - remove the crash, remove the debug fprintfs (select in/out, sigchld)

--- new release
