Discussion:
debugging Windows application crashes
muta...@gmail.com
2021-06-25 03:41:23 UTC
Hi.

I come from an IBM MVS environment, and when
you have a memory violation, there is a dump
produced, including stack traceback, and you send
that to the vendor and they can normally solve the
problem from that one single instance, without
needing to annoy the customer any further.

As opposed to "sorry, I can't reproduce the problem
on my system, and until I can, there's nothing I can do".

Now I'm in a Windows environment (Windows
Server 2019) and I have no idea what to do.

Event Viewer shows the below. Maybe it's too late
for this time, but what do I need to change so that
from now on, no application crash ever occurs
without full analysis at the assembler level?

At a minimum I wish to see the stack traceback.
The crash occurred in Microsoft's runtime, but
if I at least knew which function was being called,
from where in the application (and who called
that), I'd be in a much stronger position to look
for a fault in the code.

The application is C++ built with Visual Studio 6
(i.e. decades old), but the executables were
recently built.

The executables are being built as "release", not
"debug", and it is the "release" versions from
production that I wish to debug whenever they
fail.

Also, it's a 32-bit application and I'm a bit surprised
that it is the MSVCRT.DLL from System32, not the one
from SysWow64, that is mentioned as the failure point.
Both the System32 and SysWow64 versions are large, so
I assume one doesn't just call the other. I also
assume that the redirection to SysWow64 is something
that Windows does internally.

Also, it is almost technically impossible to move
this application to a later version of Visual Studio
due to third-party dependencies. I already tried.
Never mind 64-bit.

Thanks. Paul.



Faulting application name: yyy.EXE, version: a.b.c.d, time stamp: 0x60d2becf
Faulting module name: MSVCRT.dll, version: 7.0.17763.475, time stamp: 0xba51b082
Exception code: 0x40000015
Fault offset: 0x0003b83b
Faulting process id: 0xf50
Faulting application start time: 0x01d768ceb5d88885
Faulting application path: d:\xxx\yyy.EXE
Faulting module path: C:\Windows\System32\MSVCRT.dll
Report Id: cd0871e9-c3be-4555-9d90-200d8ad6b636
Faulting package full name:
Faulting package-relative application ID:


muta...@gmail.com
2021-06-25 11:31:19 UTC
I found this:

https://docs.microsoft.com/en-gb/windows/win32/wer/collecting-user-mode-dumps

but it looks like that is for products that Microsoft are
meant to debug.

This product does not belong to Microsoft, although in
this particular case it makes use of Microsoft's compiler
and runtime library.

But the problem is most likely to be in the non-Microsoft
component, so I need a stack trace to get started.

I also found something that you can intrusively insert into
an application to catch and print an error, but that seems
quite an odd thing to do compared to letting the OS capture
it with a lot more integrity.

BFN. Paul.
muta...@gmail.com
2021-06-27 00:34:06 UTC
I tried putting a deliberate error (writing to NULL) into my
own C program compiled with my own C library, so there
is nothing surprising, but it didn't produce any dump.

I found this:

https://www.meziantou.net/tip-automatically-create-a-crash-dump-file-on-error.htm

found that the "LocalDumps" key didn't exist in my registry,
and ran this:

enabdump.ps1:
New-Item -Path "HKLM:\SOFTWARE\Microsoft\Windows\Windows Error Reporting" -Name "LocalDumps"
New-ItemProperty -Path "HKLM:\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps" -Name "DumpFolder" -Value "%LOCALAPPDATA%\CrashDumps" -PropertyType "ExpandString"
New-ItemProperty -Path "HKLM:\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps" -Name "DumpCount" -Value 10 -PropertyType DWord
New-ItemProperty -Path "HKLM:\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps" -Name "DumpType" -Value 2 -PropertyType DWord

to get full dumps, and I got a 6 MB dump file:

Directory of C:\Users\kerra\AppData\Local\CrashDumps

2021-06-27 10:23 <DIR> .
2021-06-27 10:23 <DIR> ..
2021-06-27 10:23 5,995,853 pdptest.exe.3648.dmp

for my 32k executable:

Directory of C:\devel\pdos\pdpclib

2021-06-26 11:31 31,744 pdptest.exe


I tried just typing in the dmp filename and it opened
Visual C++ 2008 Express Edition which I have installed
on my computer, but it didn't do anything obvious like
show me a stack trace.

BFN. Paul.
muta...@gmail.com
2021-06-27 04:36:31 UTC
Ok, I went here:

https://developer.microsoft.com/en-us/windows/downloads/windows-10-sdk/

Clicked on the first "download the installer"

Deselected everything except "Debugging Tools for Windows"

That got me windbg.exe (and there are also some others
like cdb.exe that might be useful).

In windbg I chose "Open Crash Dump" and selected
my dump file.

Then I did "!analyze -v" as suggested by windbg, and
lo and behold, I got a stack trace:

STACK_TEXT:
WARNING: Stack unwind information not available. Following frames may be wrong.
0061ff20 00401b68 00000001 004098d0 0061ff50 pdptest!main+0x38
0061ff50 0040100d 00000000 00000000 00000000 pdptest!_start+0x4f8
0061ff70 772cfa29 0030c000 772cfa10 0061ffdc pdptest!mainCRTStartup+0xd
0061ff80 77717a9e 0030c000 44e9b9e2 00000000 kernel32!BaseThreadInitThunk+0x19
0061ffdc 77717a6e ffffffff 77738a33 00000000 ntdll!__RtlUserThreadStart+0x2f
0061ffec 00000000 00401000 0030c000 00000000 ntdll!_RtlUserThreadStart+0x1b

Not sure what its complaint about "stack unwind information"
not being available is about. It looked correct to me.
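
For reading the frames above: in WinDbg's 32-bit stack display the columns are ChildEBP, RetAddr, three stack words shown as arguments, and then the symbolised location. So the first column is a stack address (the frame pointer), not a code address. A minimal Python sketch that splits the lines along those columns follows; the name `parse_frame` is made up for illustration, and the regular expression only assumes the layout seen in this one trace.

```python
import re

# Column layout of the stack display shown above (32-bit):
# ChildEBP RetAddr Arg1 Arg2 Arg3 module!symbol+0xoffset
FRAME_RE = re.compile(
    r"^(?P<ebp>[0-9a-f]{8}) (?P<ret>[0-9a-f]{8}) "
    r"(?P<args>(?:[0-9a-f]{8} ){3})"
    r"(?P<module>[^!\s]+)!(?P<symbol>[^+\s]+)(?:\+0x(?P<off>[0-9a-f]+))?$"
)

def parse_frame(line):
    """Split one stack line into its fields; return None for other lines."""
    m = FRAME_RE.match(line.strip())
    if m is None:
        return None
    return {
        "child_ebp": int(m["ebp"], 16),   # stack address of the frame
        "ret_addr": int(m["ret"], 16),    # where this frame returns to
        "args": [int(a, 16) for a in m["args"].split()],
        "module": m["module"],
        "symbol": m["symbol"],
        "offset": int(m["off"] or "0", 16),
    }

STACK_TEXT = """\
0061ff20 00401b68 00000001 004098d0 0061ff50 pdptest!main+0x38
0061ff50 0040100d 00000000 00000000 00000000 pdptest!_start+0x4f8
0061ff70 772cfa29 0030c000 772cfa10 0061ffdc pdptest!mainCRTStartup+0xd
"""

frames = [f for f in map(parse_frame, STACK_TEXT.splitlines()) if f]

# A frame's RetAddr lies inside the function named on the next line,
# so subtracting that line's offset recovers the function's start address.
for cur, nxt in zip(frames, frames[1:]):
    start = cur["ret_addr"] - nxt["offset"]
    print(f'{nxt["module"]}!{nxt["symbol"]} starts at {start:#010x}')
```

The second line printed gives 0x00401000 for mainCRTStartup, which agrees with the linker map further down, so the frames do look self-consistent despite the unwind warning.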

Anyway, here is my generated assembler:

movl $LC0, -36(%ebp)
movl $LC1, -40(%ebp)
movl $0, -44(%ebp)
call _puts
movb $0, 0


corresponding to:

printf("welcome to pdptest\n");
*(char *)0 = 0;

Compiled with:

gccwin -S -O2 -D__WIN32__ -D__STATIC__ -D__NOBIVA__ -I . -I../src -o pdptest.s pdptest.c

And when assembled with:

aswin -a -o pdptest.o pdptest.s >temp.txt

gives me:

72 01f9 C745D815 movl $LC1, -40(%ebp)
72 000000
73 0200 C745D400 movl $0, -44(%ebp)
73 000000
74 0207 E8000000 call _puts
74 00
75 020c C6050000 movb $0, 0
75 000000
76 0213 59 popl %ecx

Not the offset I need, because of constants.

So here is the important line:

57 01d1 8D7600 .align 4
58 .globl _main
59 _main:
60 01d4 55 pushl %ebp


So 1d1 plus the reported offset 38 gives me 209.

Not what I expected, but getting close.

Oh. That "1d4" is the first instruction. That's what I need.
Let's try again.

1d4 + 38 = 20C.

And Houston, we have liftoff.

It has reported the line of code that faulted. I was actually
expecting it to point to the next line (213) after the error,
but that's fine, so long as I know "the rules".
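
The arithmetic above can be sketched in Python. The offsets and instructions are copied from the aswin listing; `instruction_at` is a hypothetical helper name. As far as I understand "the rules", for a memory access fault the recorded exception address is that of the faulting instruction itself, which is why the result lands on 20C rather than 213.

```python
import bisect

# (offset, instruction) pairs copied from the aswin listing above
LISTING = [
    (0x1F9, "movl $LC1, -40(%ebp)"),
    (0x200, "movl $0, -44(%ebp)"),
    (0x207, "call _puts"),
    (0x20C, "movb $0, 0"),
    (0x213, "popl %ecx"),
]

# Offset of the first instruction of _main (the pushl),
# not of the .align padding at 0x1d1 that precedes the label.
MAIN_LABEL = 0x1D4

def instruction_at(symbol_offset):
    """Map main+symbol_offset to (listing offset, instruction containing it)."""
    target = MAIN_LABEL + symbol_offset
    offsets = [off for off, _ in LISTING]
    i = bisect.bisect_right(offsets, target) - 1
    if i < 0:
        raise ValueError("offset precedes the listing excerpt")
    return target, LISTING[i][1]

target, insn = instruction_at(0x38)   # from "pdptest!main+0x38"
print(f"{target:#x}: {insn}")         # prints "0x20c: movb $0, 0"
```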

One other thing I ideally need is the module load point
so that I'm not dependent on symbols. In fact, how does
windbg even know my function is called "_main"? I thought
symbols were being stripped?

ldwin -s -o pdptest.exe w32start.o pdptest.o pdpwin32.a ../src/kernel32.a

Looks like stripping to me.

I have this:

0071E0 61627300 6C646976 006D6169 6E006D61 abs.ldiv.main.ma
0071F0 696E4352 54537461 72747570 006D616C inCRTStartup.mal

near the end of the file.

And this:

006FD0 746F7574 005F5F69 73627566 005F5F6D tout.__isbuf.__m
006FE0 61696E00 5F5F705F 5F656E76 69726F6E ain.__p__environ

So one "main" with 2 underscores, one with none. But
mine is just 1. Not sure what that is about.

Anyway, from the map I have:

ldwin -M -s -o pdptest.exe w32start.o pdptest.o pdpwin32.a ../src/kernel32.a >temp.txt

.text 0x00401000 0x5e00
*(.init)
*(.text)
.text 0x00401000 0x20 w32start.o
0x00401000 mainCRTStartup
0x00401010 __main
.text 0x00401020 0x520 pdptest.o
0x004011f4 main
.text 0x00401540 0x870 pdpwin32.a(start.o)
0x00401ca0 _cexit


So the "main" I'm interested in is at offset 4011f4 - 401000 = 1f4.

I have this address:

0061ff20 00401b68 00000001 004098d0 0061ff50 pdptest!main+0x38

not sure if that (61ff20) is of "main" or the failing location.

If I know the load point I'll be in business. Does windbg give me that?

It gives me registers, which is great:

eax=00000000 ebx=00000000 ecx=fe388eab edx=0000000a esi=00000003 edi=00000003
eip=77722f8c esp=0061ef58 ebp=0061f0e8


Gives me the instruction too:

0040122c c6050000000000 mov byte ptr ds:[0],0 ds:002b:00000000=??

Repeats the address:

ExceptionAddress: 0040122c (pdptest!main+0x00000038)

Ok, the other command I know is "lmv"

That gives me:

start end module name
00400000 00412000 pdptest (export symbols) pdptest.exe

That's strange. I thought there was address space
randomization. Why is it loaded at 400000? There
should be relocation information for this executable.

I think this (.reloc) is what proves that:

objdump -p pdptest.exe

The Data Directory
Entry 0 0000f000 000009ab Export Directory [.edata (or where ever we found it)]
Entry 1 00010000 0000027c Import Directory [parts of .idata]
Entry 2 00000000 00000000 Resource Directory [.rsrc]
Entry 3 00000000 00000000 Exception Directory [.pdata]
Entry 4 00000000 00000000 Security Directory
Entry 5 00011000 000002ac Base Relocation Directory [.reloc]
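
On the randomization question, one possible explanation: as far as I know, Windows by default only randomizes the load address of an image that opts in through the IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE bit (0x0040) in the DllCharacteristics field of the PE optional header. Having a .reloc directory makes relocation possible but does not request it. Whether that bit is clear in pdptest.exe is an assumption, since nothing shown here displays that field. A sketch for checking it from the raw bytes of an image (the function name `has_dynamic_base` is made up):

```python
import struct

IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040

def has_dynamic_base(image):
    """Return True if the PE image opts in to ASLR (DYNAMIC_BASE bit set)."""
    # e_lfanew at offset 0x3c of the DOS header points at the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    opt = e_lfanew + 4 + 20
    # DllCharacteristics sits at offset 70 of the optional header
    # in both the PE32 and PE32+ layouts.
    (dll_chars,) = struct.unpack_from("<H", image, opt + 70)
    return bool(dll_chars & IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE)

# Possible usage (file name taken from the post):
#   with open("pdptest.exe", "rb") as f:
#       print(has_dynamic_base(f.read()))
```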

Anyway, with exception address of 0040122c it means I'm
looking for offset 122c in my module.

I have this:
0x004011f4 main

But I thought it would be just 1f4.

Regardless, 11f4 + 38 gives me the 122C I am looking for.

I just need to find out what the missing 1000 is about.
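
The lookup done by hand above (exception address to nearest map symbol plus offset) can be sketched like this. The addresses are copied from the linker map and the lmv output; `symbolize` is a hypothetical helper, and it is only as good as the symbol list fed to it, which here is the partial map excerpt.

```python
import bisect

IMAGE_BASE = 0x00400000   # module start reported by lmv
OBJ_TEXT = 0x00401020     # where the map places pdptest.o's .text
MAIN_IN_LISTING = 0x1D4   # offset of _main in the assembler listing

# (address, name) pairs copied from the linker map
MAP_SYMBOLS = sorted([
    (0x00401000, "mainCRTStartup"),
    (0x00401010, "__main"),
    (0x004011F4, "main"),
    (0x00401CA0, "_cexit"),
])

def symbolize(address):
    """Return (name, offset) of the closest map symbol at or below address."""
    addrs = [a for a, _ in MAP_SYMBOLS]
    i = bisect.bisect_right(addrs, address) - 1
    if i < 0:
        raise ValueError("address below first symbol")
    base, name = MAP_SYMBOLS[i]
    return name, address - base

fault = 0x0040122C        # ExceptionAddress from windbg
name, off = symbolize(fault)
print(f"{name}+{off:#x}, module offset {fault - IMAGE_BASE:#x}")

# Cross-check: object start plus listing offset gives the map address of main.
print(hex(OBJ_TEXT + MAIN_IN_LISTING))
```

The cross-check in the last line ties the listing to the map: 0x401020 + 0x1d4 is 0x4011f4, the address the map gives for main.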

BFN. Paul.
muta...@gmail.com
2021-06-27 05:14:53 UTC
Post by muta...@gmail.com
I just need to find out what the missing 1000 is about.
Ok, the map actually has:

0x00400000 __image_base__ = 0x400000

.text 0x00401000

So the executable code doesn't start exactly where
the module is loaded, it is at offset 1000. So all
seems to be in order except the lack of address
space randomization which I thought existed in
Windows 10.

Regardless, this now means I can debug problems
on Windows in a professional manner, the same as
I can on MVS.

Although I still need to know how to look at the
memory.

I did a "view", "memory" and got some strange
default. Then I put in 400000 and what do I see?

00400000 4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00 b8 00 MZ................
00400012 00 00 00 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00 ......@...........
00400024 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ..................
00400036 00 00 00 00 00 00 80 00 00 00 0e 1f b4 09 ba 10 00 cd ..................
00400048 21 b0 01 b4 4c cd 21 00 49 6e 73 74 61 6c 6c 20 48 58 !...L.!.Install HX
0040005a 20 6f 72 20 75 70 67 72 61 64 65 20 74 6f 20 50 44 4f or upgrade to PDO
0040006c 53 2f 33 38 36 20 6f 72 20 57 69 6e 65 20 65 74 63 0d S/386 or Wine etc.
0040007e 0a 24 50 45 00 00 4c 01 06 00 75 83 d6 60 00 00 00 00 .$PE..L...u..`....
00400090 00 00 00 00 e0 00 0e 02 0b 01 02 38 00 5e 00 00 00 1a ...........8.^....
004000a2 00 00 00 6a 00 00 00 10 00 00 00 10 00 00 00 70 00 00 ...j...........p..
004000b4 00 00 40 00 00 10 00 00 00 02 00 00 04 00 00 00 01 00 ..@...............


The MSDOS stub!!!

So that's what is in the first x'1000'.
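
The MS-DOS stub is only part of it: the same window also shows the "PE" signature and the start of the PE headers. A sketch that decodes a few of those fields directly from the bytes displayed above (the bytes are copied from the memory window; `parse_headers` and the dictionary keys are made-up names, and the field offsets are the standard PE32 layout as I understand it):

```python
import struct

# Bytes copied from the memory window above (0x00400000 .. 0x004000c5).
HEX = """
4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00 b8 00
00 00 00 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 80 00 00 00 0e 1f b4 09 ba 10 00 cd
21 b0 01 b4 4c cd 21 00 49 6e 73 74 61 6c 6c 20 48 58
20 6f 72 20 75 70 67 72 61 64 65 20 74 6f 20 50 44 4f
53 2f 33 38 36 20 6f 72 20 57 69 6e 65 20 65 74 63 0d
0a 24 50 45 00 00 4c 01 06 00 75 83 d6 60 00 00 00 00
00 00 00 00 e0 00 0e 02 0b 01 02 38 00 5e 00 00 00 1a
00 00 00 6a 00 00 00 10 00 00 00 10 00 00 00 70 00 00
00 00 40 00 00 10 00 00 00 02 00 00 04 00 00 00 01 00
"""
DUMP = bytes.fromhex("".join(HEX.split()))

def parse_headers(img):
    """Decode a handful of DOS/COFF/optional-header fields from a PE32 image."""
    if img[:2] != b"MZ":
        raise ValueError("no MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", img, 0x3C)
    if img[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("no PE signature")
    machine, sections, timestamp = struct.unpack_from("<HHI", img, e_lfanew + 4)
    opt = e_lfanew + 24                      # start of the optional header
    (magic,) = struct.unpack_from("<H", img, opt)
    (size_of_code,) = struct.unpack_from("<I", img, opt + 4)
    entry_rva, base_of_code = struct.unpack_from("<II", img, opt + 16)
    image_base, sect_align, file_align = struct.unpack_from("<III", img, opt + 28)
    return {
        "e_lfanew": e_lfanew, "machine": machine, "sections": sections,
        "timestamp": timestamp, "magic": magic, "size_of_code": size_of_code,
        "entry_rva": entry_rva, "base_of_code": base_of_code,
        "image_base": image_base, "section_alignment": sect_align,
        "file_alignment": file_align,
    }

for key, value in parse_headers(DUMP).items():
    print(f"{key:18} {value:#x}")
```

Decoded this way the header gives an image base of 0x400000, an entry point and code base at offset 0x1000, and a code size of 0x5e00, all of which match the linker map. The section alignment of 0x1000 would be why .text cannot start any earlier than 0x401000.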

In the executable itself my 32-bit code seems to start at x'400' though:

0003C0 00000000 00000000 00000000 00000000 ................
0003D0 00000000 00000000 00000000 00000000 ................
0003E0 00000000 00000000 00000000 00000000 ................
0003F0 00000000 00000000 00000000 00000000 ................
000400 5589E583 EC146A00 E8630600 00C9C390 U.....j..c......
000410 5589E5C9 C3909090 90909090 90909090 U...............
000420 77656C63 6F6D6520 746F2070 64707465 welcome to pdpte
000430 73740072 0077006D 61696E20 66756E63 st.r.w.main func
000440 74696F6E 20697320 61742025 700A0061 tion is at %p..a

Yep, definitely:

.text 0x00401000 0x20 w32start.o
0x00401000 mainCRTStartup
0x00401010 __main
.text 0x00401020 0x520 pdptest.o
0x004011f4 main

4 0000 77656C63 .ascii "welcome to pdptest\0"


So what's needed now for the Windows environment to
be professional is for users to know they need to switch
on crash dumps, and for programmers to know that
crash dumps are a thing, and how to use them, regardless
of what language/compiler you are using. The information
is all there.

BFN. Paul.
