Tuesday, October 2, 2012

Hashtables are not so innocent after all!

I have always thought that most problems with seemingly bad complexity are often solved very easily by hashing. While that is true in most cases, one should note that the time complexity offered by a hashtable depends on the quality of your hash function. While lookups definitely reduce to O(1) in the average case, the worst case can still be O(n). How O(n), you ask?
Let's talk about linear probing. This is one of the techniques to resolve collisions. So, if the hash of item A (while you are inserting it) comes out to be, say, 34, while item B already sits at slot 34, then you put item A at 35 if it is free, or continue in the same fashion otherwise. This part (the linear probing) is also called secondary search. Similarly, if you are searching for item A rather than inserting it, you probe linearly until you find either A or an empty slot. Reaching an empty slot while probing ensures that A is not in the table, precisely because of the way we inserted it. Alternatively, each slot can hold a linked list of items (chaining), so that after hashing to a slot one iterates over the list to finish the search. One could also borrow ideas from caching when maintaining those lists; for example, a "most recently used first" ordering is both easy and effective. In either case, a hashtable can have a poor worst-case time complexity if the hash function is not good. A sketch of linear probing follows.
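Here is a minimal sketch of linear probing with string keys, assuming a tiny fixed-size table; the names toy_hash, probe_insert and probe_search are mine, purely for illustration:

#include <stdio.h>
#include <string.h>

#define TSIZE 8                      /* deliberately tiny to force collisions */

static const char *table[TSIZE];     /* NULL means "empty slot" */

/* toy hash: sum of characters modulo the table size */
static unsigned int toy_hash(const char *key)
{
    unsigned int h = 0;
    while (*key)
        h += (unsigned char)*key++;
    return h % TSIZE;
}

/* insert: start at hash(key) and walk forward until an empty slot is found */
static int probe_insert(const char *key)
{
    unsigned int i = toy_hash(key), start = i;
    do {
        if (table[i] == NULL) {      /* empty slot: claim it */
            table[i] = key;
            return 1;
        }
        i = (i + 1) % TSIZE;         /* the linear probe: try the next slot */
    } while (i != start);
    return 0;                        /* table is full */
}

/* search: an empty slot reached before the key means "not present" */
static int probe_search(const char *key)
{
    unsigned int i = toy_hash(key), start = i;
    do {
        if (table[i] == NULL)
            return 0;                /* empty slot: key is not in the table */
        if (strcmp(table[i], key) == 0)
            return 1;                /* found it */
        i = (i + 1) % TSIZE;
    } while (i != start);
    return 0;
}

int main(void)
{
    probe_insert("abc");
    probe_insert("cba");             /* same toy hash as "abc": collides, lands in the next slot */
    printf("cba present? %d\n", probe_search("cba"));   /* prints 1 */
    printf("xyz present? %d\n", probe_search("xyz"));   /* prints 0 */
    return 0;
}

With a bad hash that sends everything to the same slot, the probe sequence degenerates into a linear scan of the table, which is exactly the O(n) worst case mentioned above.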
Now, designing a good hash function has its own costs. A robust hash that ensures a good distribution of the data may be expensive in itself. I had seen an MIT paper with a very good hash function; it involved some bit-twiddling hacks to make it run super fast.
EDIT: Here is the code

/* Hash a string into the range [0, tsize) using cheap shifts and XORs. */
unsigned int
bitwisehash(char *word, int tsize, unsigned int seed)
{
    char c;
    unsigned int h;

    h = seed;
    for ( ; (c = *word) != '\0'; word++)
    {
        /* mix the current character into the running hash */
        h ^= ((h << 5) + c + (h >> 2));
    }
    /* mask off the sign bit and reduce modulo the table size */
    return (unsigned int)((h & 0x7fffffff) % tsize);
}


One more problem arises when the table gets full. If it is implemented as an array, we have to allocate another array of double the size and then rehash all the items. Too bad, you will agree. For the chained implementation this particular issue does not arise, although the chains themselves keep growing if you never resize. A sketch of the doubling step is below.
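For the array (open addressing) case, the doubling step could look roughly like this sketch, which reuses the bitwisehash function above; grow_table and the fixed SEED are my own names, not from any library:

#include <stdlib.h>

#define SEED 1159241u                /* an arbitrary seed, purely illustrative */

unsigned int bitwisehash(char *word, int tsize, unsigned int seed);  /* from above */

/*
 * Double the table and rehash every existing key into the new array.
 * Every key moves, so on top of the allocation this is an O(n) pass.
 */
char **grow_table(char **old_table, int old_size, int *new_size)
{
    int i;
    char **new_table;

    *new_size = old_size * 2;
    new_table = calloc(*new_size, sizeof(char *));
    if (new_table == NULL)
        return NULL;                 /* grow failed; caller keeps the old table */

    for (i = 0; i < old_size; i++) {
        unsigned int j;
        if (old_table[i] == NULL)
            continue;                /* empty slot, nothing to move */
        /* reinsert with linear probing against the new, larger size */
        j = bitwisehash(old_table[i], *new_size, SEED);
        while (new_table[j] != NULL)
            j = (j + 1) % *new_size;
        new_table[j] = old_table[i];
    }
    free(old_table);
    return new_table;
}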

EDIT: Just a random note for those working in Java. Know your data structures. Learn the difference between Hashtable and HashMap. HashMap may give a significant performance gain over Hashtable when concurrency is not an issue, since its methods are not synchronized.

Monday, October 1, 2012

Some notes on virtual memory (focusing mostly on Windows)

Just thought I would revise my virtual memory understanding for some work I got to do.
Each process has its own address space (private to it). The page table keeps the mapping from virtual addresses to physical addresses (TLBs are caches for the page table). The extra amount of virtual space is backed by treating some area of the hard disk as RAM. This is called "swap space" in Linux and the "paging file" in Windows. One can go to "Advanced System Settings->Advanced->Settings->Advanced" to view the size of the paging file in Windows (did you notice the hierarchy... Windows is a funny OS ;) ). This page summarises the physical memory limits for the different Windows platforms.

On a default 32-bit Windows each process has a 4GB virtual address space. The lower 2GB (0 to 0x7FFFFFFF) is used by the process and the higher 2GB (0x80000000 to 0xFFFFFFFF) is used by the system. One can also increase the process memory limit to 3GB by 4-gigabyte tuning (called 4GT). Actually, this limit is configurable within a 2GB-3GB window via the command BCDEdit /set increaseuserva Megabytes. The virtual-to-physical mapping is managed in terms of pages of size 4KB, so the pages which are not in physical RAM live in the page file. A valid page fault occurs if a page is not in RAM but is in the swap space (or page file), and the page is brought back in. An invalid page fault occurs if the page is not even in the page file; in kernel mode this is one of the reasons you see a BSOD ;).

So, RAM consists of a) the non-paged area, which is, as the name says, never paged, i.e. kernel modules are loaded in that space and its pages are never swapped out, and b) the paged pool, which contains the code, the data and the file cache. One thing to note is that there will always be very little free RAM. Free RAM is wasted RAM. So, do not think that if you have 8GB of RAM the free RAM will increase; your OS will always find ways to fill it. Manually emptying the RAM is thus overkill, and the programs which do this are actually degrading performance.

The page file is pagefile.sys. Obviously, it is kept on the C: drive by default. Seek times for C: will generally be lower than for other drives, as this drive handles most of the disk activity.

One interesting aspect I have come across is that "you cannot access all the memory of your process, even when some parts seem to be free". But first, let's discuss memory fragmentation. If you have lots of physical RAM free and low committed memory, but still get a "no memory available" error, then it is most probably a case of memory fragmentation. If multiple chunks are malloc'ed and then freed, this creates fragments in the virtual address space, and after some time the space may become so fragmented that there is no contiguous region of n bytes available, even though a lot of memory is free in total. For example, a hundred scattered 1MB holes cannot satisfy a single 2MB request, although 100MB is free overall. So, large applications generally pre-create memory blocks and cache them for runtime use. A sketch of this situation is below.
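A rough sketch of how fragmentation bites, assuming a default 32-bit process: it fills the address space with 1MB reservations, releases every other one, and then tries a single 2MB reservation. Whether that final call fails depends on how the reservations happened to be laid out, but with only 1MB holes left it typically does:

#include <windows.h>
#include <stdio.h>

#define CHUNK (1024 * 1024)          /* 1 MB reservations */
#define MAX_CHUNKS 4096              /* enough to exhaust a 2 GB user space */

int main(void)
{
    void *blocks[MAX_CHUNKS];
    void *big;
    int n = 0, i;

    /* 1. Fill the address space with 1 MB reservations. */
    while (n < MAX_CHUNKS) {
        blocks[n] = VirtualAlloc(NULL, CHUNK, MEM_RESERVE, PAGE_NOACCESS);
        if (blocks[n] == NULL)
            break;
        n++;
    }
    printf("reserved %d MB before running out of address space\n", n);

    /* 2. Release every other block: about half the space is free again,
     *    but only in 1 MB holes. */
    for (i = 0; i < n; i += 2)
        VirtualFree(blocks[i], 0, MEM_RELEASE);

    /* 3. A single 2 MB request now typically fails, despite ~n/2 MB being
     *    free, because no free hole is 2 MB of contiguous addresses. */
    big = VirtualAlloc(NULL, 2 * CHUNK, MEM_RESERVE, PAGE_NOACCESS);
    printf("2 MB reservation after fragmenting: %s\n",
           big ? "succeeded" : "failed (fragmented)");
    return 0;
}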

But the first line of the previous paragraph mentions something even more interesting, right! Even if you have enough memory, with big enough contiguous blocks, you can still get a "can't alloc" problem. ntdll!ZwAllocateVirtualMemory is the primitive function used to allocate memory. It is rarely called directly, or even through its user-mode stub VirtualAlloc, because we mostly use the heap manager provided by the OS. Now, VirtualAlloc has an allocation granularity of 64KB, with a 4KB minimum allocation (the default page size). (Note that the first 64KB and the last 64KB of the address space can't be used by a process.) So, if I keep VirtualAlloc'ing at fresh addresses, I will exhaust the address space in fewer than 2GB/64KB iterations (as some of the space is already used by the process, for example by the default heap, some DLLs, the threads' stacks, etc). Now, if you start allocating from the heap such that you overflow the heap space, the heap manager will, as expected, try to grow the heap using VirtualAlloc, and will fail because all the 64KB regions have been exhausted. If you use a memory-footprint tool you will see lots of 60KB empty blocks (as only 4KB of each 64KB region was actually committed by VirtualAlloc). Thus, even though you are allocating much less than 60KB at a time and have so much free memory (60KB times the number of VirtualAlloc iterations), you will still get "out of memory". So, if such a case ever occurs, it usually means VirtualAlloc (or ntdll!ZwAllocateVirtualMemory directly) is being used for small allocations somewhere in the program and is the source of the problem. A sketch of this exhaustion is below.
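And here is a sketch of the exhaustion scenario itself, again assuming a 32-bit process: committing just 4KB at a fresh address on every call still consumes a full 64KB of address space each time, so the loop stops far sooner than 2GB/4KB iterations would suggest:

#include <windows.h>
#include <stdio.h>

int main(void)
{
    SIZE_T page = 4096;              /* one page committed per call */
    long count = 0;

    /* Each reservation is rounded up to the 64 KB allocation granularity,
     * so roughly 2 GB / 64 KB (about 32K) calls fit in a 32-bit user space,
     * even though each call commits only 4 KB. */
    for (;;) {
        void *p = VirtualAlloc(NULL, page,
                               MEM_RESERVE | MEM_COMMIT,
                               PAGE_READWRITE);
        if (p == NULL)
            break;                   /* address space exhausted (or other failure) */
        count++;
    }
    printf("VirtualAlloc succeeded %ld times before failing\n", count);
    /* On a default 32-bit process this is well short of 2 GB / 4 KB,
     * because 60 KB of every 64 KB region is left unusable. */
    return 0;
}

To keep the whole 64KB usable, reserve a larger region up front (or let the heap manager suballocate) instead of making many small VirtualAlloc calls.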

Besides, this is a useful tool to monitor the address space of a process.


I plan to cover exception handling in Windows very soon.