onhacks.org – where I am

It has been great writing here, deepening my own knowledge and, hopefully, that of everyone who dropped by. Thanks a lot to those who left comments or emailed me ideas and improvements. =)

As some of you might know, this page is not readily accessible in China. Hence, I am now moving to :

http://www.onhacks.org

There I will write in English, Traditional Chinese and Simplified Chinese, covering various aspects of security and also how security is viewed in China.

Some Good Stuffs to Read

Well, these articles are really enlightening, even if you already know the topics. I recommend them to my fellow knowledge-thirsty visitors.

Lookout.net – Chris Weber, specializes in Internationalized Software Security

Unicode attacks and test cases – Visual Spoofing, IDN homograph attacks, and the Confusables

Unicode attacks and test cases – Visual Spoofing, IDN homograph attacks, and the Single Script Confusables

Alex’s Corner – Kuza55, specializes in webappsec.

Racing to downgrade users to cookie-less authentication

Understanding Cookie Security

First look on Cookies

cookie_monster

I wrote a simple script to set some cookies, and found some cute numbers on the maximum cookies to be set per domain name per path. The cookies are in the form of <key>=<val>, e.g. 1=1, 2=1, 3=1, 4=1. The length of the cookie name matters, as I found out.
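The original script is not shown here, so the following is only a minimal sketch of the idea, assuming a CGI-style program that prints its own response headers. The function names are hypothetical. The second helper estimates how long the Cookie request header becomes if the browser keeps every cookie, which is why the length of the cookie name matters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: builds `count` Set-Cookie header lines named
// 1..count, each with the value 1, as a CGI-style script would print them.
std::string build_set_cookie_headers(std::size_t count) {
    std::string headers;
    for (std::size_t i = 1; i <= count; ++i) {
        headers += "Set-Cookie: " + std::to_string(i) + "=1\r\n";
    }
    return headers;
}

// Length of the Cookie request header the browser would send back if it
// kept all `count` cookies: "Cookie: 1=1; 2=1; ...".
std::size_t cookie_request_header_length(std::size_t count) {
    assert(count > 0);
    std::size_t length = std::string("Cookie: ").size();
    for (std::size_t i = 1; i <= count; ++i) {
        length += std::to_string(i).size() + 2;  // name plus "=1"
        if (i < count) {
            length += 2;                         // "; " separator
        }
    }
    return length;
}
```

For three cookies the request header would be `Cookie: 1=1; 2=1; 3=1`, which is 21 bytes long.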

Internet Explorer 7 – 20 cookies, maximum of 244 Set-Cookies per page.

Firefox 3 – 50 cookies.

Safari 3 – 1161 cookies, no limit of Set-cookies per page. See analysis below.

Opera 9 – 30 cookies.

Chrome 0.4 – 59~70 cookies, I have no idea why it is varying.

Tencent Traveller 2 – 20 cookies, follows the behaviour of Internet Explorer 7.

Except for Safari 3, every browser limits the number of cookies that can be set. I guess Safari is using a linked list for them. For most browsers, although the HTTP response code is 200, the page is reported as one that cannot be displayed. Safari, however, has no limit, so when the cookie headers grow too long ( > 7619 ), Apache replies with a 400 Bad Request.

I haven’t thought of any interesting tests yet, but feel free to discuss if there is anything we can do with these numbers. By the way, I remember Hotmail sets a whole lot of cookies, and some of them, like BrowserSense and BS, are obviously just duplicates (legacy code, yeehh!). I wonder whether they will hit the limits soon? =)

Tencent Traveller 2, which I bet none of you outside China will know about, is a browser in China that is built on top of IE7. Think of it as a GUI on top of IE7; it even uses IE7’s cookies. I have no idea how widely it is adopted in China. Only after testing did I realize I was using a very old version of it. I’ll see if there’s anything interesting in its newest version, 4.4.

So much for debugging last time. Let’s get back to the web. =P

Tencent Traveller – http://www.skycn.com/soft/14500.html

RFC2109 – http://www.faqs.org/rfcs/rfc2109.html

RFC2965 – http://www.faqs.org/rfcs/rfc2965.html

How to debug a Stack Overflow for beginners?

How do you debug a stack overflow?

If you rarely touch debuggers, the above question is difficult to answer, especially when you are faced with some cryptic failure and error codes. Awww.

Today I am going to share with you my experience with a powerful debugger called WinDBG. This is going to be a very long journey. On we go!

===

The Beginning

The first step in dealing with any bug is to find a solid way to reproduce it. If it cannot be reproduced, how can you prove it is gone once you have fixed it? Absence of evidence does not imply evidence of absence! Since the steps are different for every bug, find the steps to reproduce yours now and come back.

Have you got it? Make sure you do. You really need it.

Let’s begin.

Start the faulting process and attach WinDBG to it. Supply the path to the right symbols, and to the source files if you feel you need them. The symbol files are called PDBs. Without symbols, you will have a very hard time debugging in general. With the right source, you can free yourself from looking at assembly. ( Note that the source can be incorrect! Assembly does not lie. )

An example for the symbols path could be the Microsoft Symbol Server, and my own symbols :

srv*DownstreamStore*D:\local_SymbolDownstreamStore*http://msdl.microsoft.com/download/symbols;
D:\Symbols;

Now let’s download the symbols; /f forces an immediate reload. Each DLL has embedded information that tells the debugger which PDB to look for on the symbol server.

.reload /f

This will force all the modules to find their corresponding PDB symbols. It will take some time. The symbols will be cached at D:\local_SymbolDownstreamStore as specified above. Next time you do not need to wait that long.

List the loaded modules; with no arguments, lm shows all of them. This shows every DLL that has been loaded into memory so far. Before we start debugging, we have to make sure that the modules we want to debug have the right symbols.

lm

If you are successful, you will see something like below :

01000000 012ac000 CrashingProgram (private pdb symbols) D:\symbols\CrashingProgram.pdb

This means the symbols are not right :

10000000 100c8000 ws03res (no symbols)

Let’s set a breakpoint on all first chance exceptions, “*” for all of them. The breakpoint freezes the program so we can examine it.

sxe *

Let’s return the control flow back to the application.

g

It should show a little *BUSY* status. Now it is your job to reproduce the bug.

Reproduce the bug now.

The Debug

(9df4.4f7c): Stack overflow - code c00000fd (first/second chance not available)
eax=0000c94a ebx=80000000 ecx=00d6389c edx=7ffb001c esi=00000104 edi=77f670e9
eip=77d06628 esp=00d62fc0 ebp=00d63858 iopl=0         nv up ei pl nz na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010206
oleaut32!LoadTypeLibEx+0x13:
77d06628 53              push    ebx

Got it? Good. WinDBG should have halted by now. And the *BUSY* is gone. The next step is optional : create a memory dump in case you need to bring the debugging elsewhere or do it later.

.dump /ma C:\memory.dmp

The /m flag creates a minidump, and adding “a” is equivalent to “fFhut”, which effectively means dump everything. The funny thing is that such a minidump ends up bigger than a full dump; legacy stuff.

The first thing to do on a crash is to run !analyze, with -v for verbose output. It will do all the grunt work of analyzing the information for you and save you a lot of time.

!analyze -v


0:001> !analyze -v
*******************************************************************************
* *
* Exception Analysis *
* *
*******************************************************************************

FAULTING_IP:
ole32!ModalLoop+5b [d:\nt\com\ole32\com\dcomrem\chancont.cxx @ 200]
776c1d74 57 push edi

EXCEPTION_RECORD: ffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 77d06628 (oleaut32!LoadTypeLibEx+0x00000013)
ExceptionCode: c00000fd (Stack overflow)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 00000001
Parameter[1]: 00d62fbc

BUGCHECK_STR: c00000fd
DEFAULT_BUCKET_ID: STATUS_STACKOVERFLOW
PROCESS_NAME: CrashingProgram.exe
ERROR_CODE: (NTSTATUS) 0xc00000fd - A new guard page for the stack cannot be created.
RECURRING_STACK: From frames 0x70 to 0x98
NTGLOBALFLAG: 0
APPLICATION_VERIFIER_FLAGS: 0
LAST_CONTROL_TRANSFER: from 77d06c53 to 77d06628

STACK_COMMAND: ~1s; .ecxr ; kb

FOLLOWUP_IP:
CrashingProgram!_com_ptr_t<_com_IIID<CrashingServiceLib::IShared,&_GUID_e55d5bc5_0eff_4ca9_ae3f_63f6203afe18> >::CreateInstance+3a [d:\l\src\sdk\inc\comip.h @ 516]
010ab6ca 8945fc mov dword ptr [ebp-4],eax

FAULTING_SOURCE_CODE:
512:
513: if (dwClsContext & (CLSCTX_LOCAL_SERVER | CLSCTX_REMOTE_SERVER)) {
514: IUnknown* pIUnknown;
515:
> 516: hr = CoCreateInstance(rclsid, pOuter, dwClsContext, __uuidof(IUnknown), reinterpret_cast<void**>(&pIUnknown));
517:
518: if (FAILED(hr)) {
519: return hr;
520: }
521:

SYMBOL_STACK_INDEX: 82
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: CrashingProgram
IMAGE_NAME: CrashingProgram.exe
DEBUG_FLR_IMAGE_TIMESTAMP: 4937c905
FAULTING_THREAD: 00004f7c
SYMBOL_NAME: CrashingProgram!_com_ptr_t<_com_IIID<CrashingServiceLib::IShared,&_GUID_e55d5bc5_0eff_4ca9_ae3f_63f6203afe18> >::CreateInstance+3a
FAILURE_BUCKET_ID: c00000fd_CrashingProgram!_com_ptr_t__com_IIID_CrashingServiceLib::IShared,__GUID_e55d5bc5_0eff_4ca9_ae3f_63f6203afe18___::CreateInstance+3a
BUCKET_ID: c00000fd_CrashingProgram!_com_ptr_t__com_IIID_CrashingServiceLib::IShared,__GUID_e55d5bc5_0eff_4ca9_ae3f_63f6203afe18___::CreateInstance+3a
Followup: MachineOwner
---------

We are concerned only with these :

FAULTING_IP – The CPU instruction that was about to execute when the crash happened.

STACK_COMMAND – This gives us a short hand to get more information on the stack by executing it in WinDBG command prompt.

MODULE_NAME – The crashing module name in the executable.

IMAGE_NAME – The crashing module file name in the file system.

FAULTING_THREAD – The thread ID of the thread that is active at the moment of crash.

FAULTING_SOURCE_CODE – If you have the right source code and symbols, this can pinpoint the source code where the crash happens.

STACK_TEXT – If you do not have the right source code, this gives you an idea of what happened.

ERROR_CODE – The error code of the exception that caused this crash.

DEFAULT_BUCKET_ID – The category of the problem we experience.

We got a stack overflow, 0xc00000fd ( you can find it in ntstatus.h ). We also know that the faulting thread is thread 1, as shown by the ~1s in STACK_COMMAND. Sometimes you will not get FAULTING_SOURCE_CODE but STACK_TEXT instead. STACK_TEXT is the stack trace of the faulting thread, and it is shown when the source code is not available. To switch threads manually, type ~<thread number>s, where the thread number is the thread you want to examine. In this case, it is 1. You can also run the STACK_COMMAND supplied by the analysis above, ~1s; .ecxr ; kb.

Let’s work on thread one.

~1s

Display the exception context for more information, just in case. Registers do not lie either. =)

.ecxr

0:001> .ecxr
eax=0000c94a ebx=80000000 ecx=00d6389c edx=7ffb001c esi=00000104 edi=77f670e9
eip=77d06628 esp=00d62fc0 ebp=00d63858 iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010206
oleaut32!LoadTypeLibEx+0x13:
77d06628 53 push ebx

Now dump the thread’s stack with “kb”, which also shows the first three arguments passed to each function. Stack traces can be corrupted, so do not trust them blindly.

kb

As we are debugging a stack overflow, we are probably using up the whole of the stack reserve. Let’s see how much memory we are allowed by dumping the Thread Environment Block (TEB) :

!teb

0:001> !teb
TEB at 7ffdc000
ExceptionList: 00d63ca0
StackBase: 00da0000
StackLimit: 00d61000

SubSystemTib: 00000000
FiberData: 00001e00
ArbitraryUserPointer: 00000000
Self: 7ffdc000
EnvironmentPointer: 00000000
ClientId: 00009df4 . 00004f7c
RpcHandle: 00000000
Tls Storage: 00000000
PEB Address: 7ffdf000
LastErrorValue: 14007
LastStatusValue: 0
Count Owned Locks: 0
HardErrorMode: 0

Ahhh, do you see the StackBase and StackLimit? Their difference is the stack limit for this thread; note that it can be different for every other thread. ( DA0000 - D61000 = 3F000 ), which is 252kb.

Now let’s dump all of those stack frames and see what we’ve got.

~*kb 0xffff


0:001> ~1kb 0xffff
ChildEBP RetAddr  Args to Child
00d63858 77d06c53 00d638d4 00000000 00d6389c oleaut32!LoadTypeLibEx+0x13 [(truncated)]
00d6386c 77d0e9f8 00d638d4 00d6389c 07df2d4c oleaut32!LoadTypeLib+0x12 [(truncated)]
00d63c4c 77d0ed1b 07df2d4c 00d63c68 0012f070 oleaut32!GetTypeInfoOfIID+0x371 [(truncated)]
00d63c6c 7778d01b 07df2d38 07defd58 022117dc oleaut32!CUnivStubWrapper::Invoke+0x7c [(truncated)]
… ( truncated for clarity )

00d9ffa4 77f65e91 00000001 000a142c 00000000 CrashingModule!CServiceModule::_ServiceMain+0x57 [d:\l\src\CrashingProgram\CrashingServiceMain.cpp @ 514]
00d9ffb8 77e64829 000a1420 00000000 00000000 advapi32!ScSvcctrlThreadA+0x21 [(truncated)]
00d9ffec 00000000 77f65e70 000a1420 00000000 kernel32!BaseThreadStart+0x34 [(truncated)]

The first column is the address in the stack. The second column is the return address. The third, fourth and fifth columns are the arguments to the function call. We are concerned with the first column at the top and bottom of the stack trace. Their difference gives the amount of memory used on the stack.

To calculate the amount of memory used, do a subtraction ( D9FFEC – D63858 = 3C794 ) and … 242kb. There we go! This thread is topping the limit of 252kb, and it seems it just hit the top. However, we still have to find out what caused this memory usage in the first place.
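If you would rather not do hex subtraction by hand, the two calculations above can be sketched in a few lines of C++. This is only an illustration; the function names are my own and the addresses are the ones from this particular dump.

```cpp
#include <cassert>
#include <cstdint>

// Bytes between two stack addresses. The stack grows downward on x86,
// so `high` is the older (base) address and `low` is the newer one.
std::uint32_t stack_span(std::uint32_t high, std::uint32_t low) {
    assert(high >= low);
    return high - low;
}

// Converts a byte count to kilobytes (1024 bytes), rounded to nearest.
std::uint32_t to_kb(std::uint32_t bytes) {
    return (bytes + 512) / 1024;
}
```

With the numbers from this dump, `stack_span(0x00DA0000, 0x00D61000)` is 0x3F000 bytes (252kb), and `stack_span(0x00D9FFEC, 0x00D63858)` is 0x3C794 bytes (about 242kb).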

Phew!

The Cause

Now we have to find out what is happening. Since the problem lies in using up the whole stack, let’s analyze the stack frames for any recursion. According to the WinDBG online help, a stack overflow can have these causes :

  • A thread uses the entire stack reserved for it. This is often caused by infinite recursion.
  • A thread cannot extend the stack because the page file is maxed out, and therefore no additional pages can be committed to extend the stack.
  • A thread cannot extend the stack because the system is within the brief period used to extend the page file.

For cause 1, infinite recursion, the crash can be exacerbated if you are allocating huge strings on the stack.
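To get a feel for how much a big stack buffer costs per recursion level, here is a rough sketch that measures it with a bounded recursion, so it never actually overflows. The names are hypothetical, and it assumes a conventional contiguous stack; comparing the addresses of locals in different frames is not something the C++ standard promises anything about, so treat the numbers as an estimate.

```cpp
#include <cstddef>
#include <cstdint>

// Recurses `depth` more times with a BufferSize-byte local alive in every
// frame, and returns the address of the buffer in the innermost frame.
template <std::size_t BufferSize>
std::uintptr_t innermost_buffer_address(int depth) {
    volatile char buffer[BufferSize];
    buffer[0] = 0;
    if (depth == 0) {
        return reinterpret_cast<std::uintptr_t>(&buffer[0]);
    }
    std::uintptr_t inner = innermost_buffer_address<BufferSize>(depth - 1);
    buffer[BufferSize - 1] = 0;  // touched after the call, so no tail call
    return inner;
}

// Approximate number of stack bytes between the caller and the innermost
// frame of a recursion that is `depth` levels deep.
template <std::size_t BufferSize>
std::size_t stack_bytes_used(int depth) {
    volatile char marker = 0;
    std::uintptr_t top = reinterpret_cast<std::uintptr_t>(&marker);
    std::uintptr_t bottom = innermost_buffer_address<BufferSize>(depth);
    return static_cast<std::size_t>(top > bottom ? top - bottom : bottom - top);
}
```

Comparing `stack_bytes_used<4096>(10)` with `stack_bytes_used<16>(10)` shows how quickly a large local eats into a small stack reserve, even when the recursion depth stays the same.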

So. Let’s look into the full stack trace and see if there are any apparent recursions :

0:001> ~1kb 0xffff
00d652f8 010af822 01012b68 00000000 00000017 CrashingProgram!_com_ptr_t<_com_IIID<CrashingServiceLib::IShared3,&_GUID_e55d5bc5_0eff_4ca9_ae3f_63f6203afe18> >::CreateInstance+0x3a [d:\l\src\sdk\inc\comip.h @ 516]
00d65344 010adf2e 00bb93c8 6c3ba3d9 00d6540c CrashingProgram!CShared2::CreateShared3Service+0x72 [d:\l\src\CrashingProgram\Shared2.cpp @ 958]
00d65408 77c80193 00bb93cc 00d65618 02020202 CrashingProgram!CShared2::GetSettings+0x15e [d:\l\src\CrashingProgram\Shared2.cpp @ 458]
… (truncated for clarity)
00d67178 77c80193 00bc2f80 00000018 00000001 CrashingProgram!CrashingProgram::SetHealthStatus+0x84 [d:\l\src\CrashingProgram\CrashingService.cpp @ 13073]
… (truncated for clarity)
00d68c80 010af822 01012b68 00000000 00000017 CrashingProgram!_com_ptr_t<_com_IIID<CrashingServiceLib::IShared3,&_GUID_e55d5bc5_0eff_4ca9_ae3f_63f6203afe18> >::CreateInstance+0x3a [d:\l\src\sdk\inc\comip.h @ 516]
00d68ccc 010adf2e 00bc2658 6c3b7a41 00d68d94 CrashingProgram!CShared2::CreateShared3Service+0x72 [d:\l\src\CrashingProgram\Shared2.cpp @ 958]
00d68d90 77c80193 00bc265c 00d68fa0 02020202 CrashingProgram!CShared2::GetSettings+0x15e [d:\l\src\CrashingProgram\Shared2.cpp @ 458]
… (truncated for clarity)

I have truncated the above stack trace, which runs to thousands of lines. CreateInstance keeps recurring. I counted 37 instances of it, each using about 6.8kb. ( 6.8kb * 37 = 251.6kb ) Boom!!!

Remember what we see above in the “!analyze -v” results?

CrashingProgram!_com_ptr_t<_com_IIID<CrashingServiceLib::IShared,&_GUID_e55d5bc5_0eff_4ca9_ae3f_63f6203afe18> >::CreateInstance+3a [d:\l\src\sdk\inc\comip.h @ 516]

Ignore this paragraph : After some studying, it turns out that the cause is that a COM Single-Threaded Apartment allows pre-emption while the main thread is performing an out-of-proc call, which is a by-product of the Windows message loop. The official workaround is to use the IFilter and implement the whole thing yourself. YUCKS! Whatever, that is the cause in the program I’m working on. It might be different for you.

MACHINE BENT! ( This is slang in my native language for almost anything, in this case “Gotcha!” )

By the way, you can also check whether this thread was specially allocated only 252kb of stack reserve, or whether it is an executable-wide limit. Let’s dump the executable header information. The generic command is “!dh <module start addr>” :

!dh 01000000|CrashingProgram.exe -f

We are looking for the stack reserve, which was about 252kb in the TEB. I got the 01000000 from the “lm” command above. The two hex numbers are the starting and ending addresses of the loaded module :

01000000 012ac000 CrashingProgram (private pdb symbols) D:\symbols\CrashingProgram.pdb

Then scroll down the header output to the stack and heap sizes; in this case they are :

00040000 size of stack reserve
00002000 size of stack commit
00100000 size of heap reserve
00001000 size of heap commit

The stack reserve is 0x40000, or 256kb; the 252kb we measured in the TEB is one 4kb page short of that. That is pathetically small. MSDN says the default stack reservation used by the linker is 1MB, so it seems whoever built this executable imposed their own limit.

The Words

Wow! That’s all for such a boring tutorial. I removed some information from the stacktraces above, though I believe the information above is adequate for your understanding. I hope you find this article helpful for bootstrapping your debugging experience, as it can be very fun.

If you know where I am doing badly, remember to tell me as I am … a beginner! ( Hey! I am a web application security dude! )

===

Resources :

WinDBG help online – http://msdn.microsoft.com/en-us/library/cc267445.aspx

Thread Stack Size – http://msdn.microsoft.com/en-us/library/ms686774(VS.85).aspx

Crash Dump Analysis – http://www.dumpanalysis.org/blog/

Advanced Windows Debugging – http://www.amazon.com/Advanced-Debugging-Addison-Wesley-Microsoft-Technology/dp/0321374460

Windows Internals – http://www.amazon.com/Microsoft-Windows-Internals-4th-Server/dp/0735619174/

I am NOT dead.

I know I look like I’m dead and have abandoned this place.

Nah.

I was just too busy last month with life matters and such, and I nearly got killed by the food in China, but now I am getting back on track!
I might also be moving my blog once again, because my friends in China cannot reach this webpage. So sad. Anyway, I have my next post ready, with images!

Upcoming next : How to debug a Stack Overflow for beginners.

(updated the link above)

Basics of The Integer in the Binary World

Talking about overflows, Tom mentioned the -(-x) != x problem in my previous post. Precisely, -(2^31) * -1 != 2^31 . What happened?

To understand this, we must understand how integers work in reality and in the binary world.

In reality, integers form a countably infinite set [1]. They have no upper or lower limits. So, in our mind, we can visualize it this way :

The line expands forever to the left and forever to the right.

In the binary world, this is another story. For an integer, we have only 4 bytes ( 32 bits ). By nature, an integer can only represent as many as 2^32 = 4294967296 values. Which means, integers in computers cannot represent the countably infinite nature of integers as in reality. Once the limits are exceeded, it wraps around. Like a wheel :

As you can see, if you subtract 1 from -2147483648 (-2^31), the integer in the binary world no longer behaves the way we expect.

(-2^31) - 1
= -2147483648 - 1
= +2147483647

Notice that overflowing by subtracting 1 from -(2^31) does not yield (2^31) but (2^31-1). Why? Because 0 is also an integer value, and thus requires one representation as well. Now there are only (2^32-1) choices left for the non-zero values, and so the positive counterpart of -(2^31) is missing.

That is what Tom is talking about. Since the positive of -(2^31) cannot be represented, -(-(2^31)) = -(2^31).

The same goes for :

-(2^31) * -1 = -(2^31)
-(2^31) / -1 = -(2^31)
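The wrap-around above can also be shown without relying on what a particular compiler does with signed overflow. The sketch below, with function names of my own choosing, negates the raw 32-bit pattern using unsigned arithmetic, which is defined to wrap modulo 2^32, and adds a checked variant that refuses to negate -(2^31) at all.

```cpp
#include <cstdint>
#include <limits>

// Two's complement negation done on the raw 32-bit pattern. Unsigned
// arithmetic wraps modulo 2^32, so this is well defined for every input.
std::uint32_t negate_bits(std::uint32_t bits) {
    return static_cast<std::uint32_t>(0u - bits);
}

// Negation that reports failure instead of wrapping around.
bool checked_negate(std::int32_t x, std::int32_t& result) {
    if (x == std::numeric_limits<std::int32_t>::min()) {
        return false;  // +2^31 does not fit in 32 signed bits
    }
    result = -x;
    return true;
}
```

`negate_bits(0x80000000u)` gives back 0x80000000, the bit pattern of -(2^31), which is exactly the -(-(2^31)) = -(2^31) case.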

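The following code, compiled with VC++, demonstrates this. ( Signed integer overflow is undefined behaviour in standard C and C++, so other compilers may give different results. )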

#include <cstdio>
#include <climits>

int main()
{
	printf("%12s\t%12s\t%12s\t%12s\t%12s\n","x","-x","x*-1","x/-1","x-1");
	for ( int i=0; i<10; ++i )
	{
		int x = INT_MIN+i;
		printf("%12d\t%12d\t%12d\t%12d\t%12d\n",x,-x,x*-1,x/-1,x-1);
	}
	return 0;
}

The output of the program is :

           x              -x            x*-1            x/-1             x-1
 -2147483648     -2147483648     -2147483648     -2147483648      2147483647
 -2147483647      2147483647      2147483647      2147483647     -2147483648
 -2147483646      2147483646      2147483646      2147483646     -2147483647
 -2147483645      2147483645      2147483645      2147483645     -2147483646
 -2147483644      2147483644      2147483644      2147483644     -2147483645
 -2147483643      2147483643      2147483643      2147483643     -2147483644
 -2147483642      2147483642      2147483642      2147483642     -2147483643
 -2147483641      2147483641      2147483641      2147483641     -2147483642
 -2147483640      2147483640      2147483640      2147483640     -2147483641
 -2147483639      2147483639      2147483639      2147483639     -2147483640

Look carefully at the 1st line. -2147483648 is -(2^31), our number of interest.

This is the basics of integer overflow problems, and I hope you have learned more about how integers work.

References :

[1] http://en.wikipedia.org/wiki/Integers

Do Not Detect Overflow With Overflow

Credits to a gweilo for the sharing below.

Integer overflows and underflows often manifest themselves as vulnerabilities. Here is an overflow bug filed by Sir BugFinder. I assigned ownership of the bug to our fictional developer Sir FastFix, and he jumped straight into the code.

First, look at this problematic pseudo-code snippet below :

SWORD param = 0;

while ( flag )
{
	param ++ ;
	//
	//manipulate the flag value...
	//
}

buffer = malloc(sizeof(BYTE) * param);
...

The param can increase indefinitely. No good. Sir FastFix quickly identifies the problem and sends me the code review below.

// sirfastfix: now uses unsigned.
UWORD param = 0;

while ( flag )
{
	// sirfastfix: code fix for overflow bug.
	if ( param > param + 1 )
	{
		TRACE_ERROR("Overflow occurred at param\n");
		return E_UNEXPECTED;
	}
	param ++ ;
	//
	//manipulate the flag value...
	//
}

buffer = malloc(sizeof(BYTE) * param);
...

Now, I have to review it. Let’s look at the changes.

  1. param is now checked with the condition (param > param + 1). Since that should always be false, an overflow must have occurred if it is true. Intuitive.
  2. param is now unsigned, using UWORD instead of the signed SWORD. I find no reason for negative buffer sizes. A good move.

But, something smells stinky. Let’s think again.

  1. Why not use well-defined constants like MAX_INT, MAX_SHORT or MAX_LONG to check before incrementing param? Like MAX_INT - a < b ?
  2. Why is the code that detects an overflow using yet another overflow to do the check?
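To make the first question concrete, here is a minimal sketch of checking against the type’s maximum before incrementing. It assumes UWORD is a 16-bit unsigned type; UWord and the function names are my own stand-ins. Under that assumption, and with a 32-bit int, param + 1 is promoted to int, so (param > param + 1) can never be true and the check would not fire at all.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the UWORD counter in the snippet above.
using UWord = std::uint16_t;

// Increments `value` only if the result still fits in the type.
// Returns false, leaving `value` untouched, when it would overflow.
bool checked_increment(UWord& value) {
    if (value == std::numeric_limits<UWord>::max()) {
        return false;
    }
    ++value;
    return true;
}

// The "MAX_INT - a < b" idea for non-negative a and b: the subtraction
// itself cannot overflow, so the check never depends on wrap-around.
bool add_would_overflow(int a, int b) {
    assert(a >= 0 && b >= 0);
    return std::numeric_limits<int>::max() - a < b;
}
```

In the loop, `if (!checked_increment(param))` would then take the error path instead of relying on the wrap-around.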

Sir FastFix, I am not approving this code check-in. This fix is not going anywhere near the source tree. Who knows what using an overflow to check for an overflow can result in? Let’s write solid code, not college-quality code, and not rush to resolve the bug.


