Why Write MASM In The Modern Era?
There are many reasons why experienced programmers choose to write assembler code: performance where speed matters, the architectural freedom to lay out code in any way you like, and the capacity to do things that cannot be done in many compilers. The main reason, though, is simply because you can. Many conjure up the image of cobbling together a few DOS interrupts in unintelligible notation to prop up the shortcomings of compilers, yet a modern assembler like MASM has the range of a high level language and can be written that way for high level code while retaining all of its power at the lowest level.
With the introduction of the 32 bit Windows API functions, MASM gained access to the same operating system functions that compilers had, but without the clutter and assumptions of many of the compilers available. When you write Windows API code in MASM you get clear, minimal, precise code that leverages the full power of the Windows operating system, and you get it at the code size you write, not with a pile of unwanted extras dumped into your executable by a compiler. MASM has never been for the faint of heart. It is an uncompromised tool that has never been softened into a user friendly toy and it requires the development of expertise to use correctly, but for the programmer who already has experience in low level C and similar code, MASM offers power and flexibility that the best of compilers cannot deliver and, contrary to popular opinion, code can be written in it in about the same development time as C code.

MASM handles C style structures with ease, as it does a number of other familiar high level constructions.

    ; ---------------------------------------------------
    ; set window class attributes in WNDCLASSEX structure
    ; ---------------------------------------------------
    mov wc.cbSize, sizeof WNDCLASSEX
    mov wc.style, CS_BYTEALIGNCLIENT or CS_BYTEALIGNWINDOW
    m2m wc.lpfnWndProc, OFFSET WndProc
    mov wc.cbClsExtra, NULL
    mov wc.cbWndExtra, NULL
    m2m wc.hInstance, hInstance
    m2m wc.hbrBackground, COLOR_BTNFACE+1
    mov wc.lpszMenuName, NULL
    mov wc.lpszClassName, OFFSET szClassName
    m2m wc.hIcon, hIcon
    m2m wc.hCursor, hCursor
    m2m wc.hIconSm, hIcon

Instead of the nightmares of old with unreadable and unmaintainable code, the high level aspects of MASM look and read very much like traditional C code, yet at any time it can work directly in mnemonics where performance matters.
Windows API calls are clean, clear code.

    ; -----------------------------------------------------------------
    ; create the main window with the size and attributes defined above
    ; -----------------------------------------------------------------
    invoke CreateWindowEx,WS_EX_LEFT or WS_EX_ACCEPTFILES,
                          ADDR szClassName,
                          ADDR szDisplayName,
                          WS_OVERLAPPEDWINDOW,
                          Wtx,Wty,Wwd,Wht,
                          NULL,NULL,
                          hInstance,NULL
    mov hWnd,eax

For non critical high level code it has a number of built in loop techniques that make for clear and maintainable code.

    .while eax > 0
      sub eax, 1
    .endw

And the matching,

    .repeat
      sub eax, 1
    .until eax < 1

Where genuine performance matters, loop code is written directly in mnemonics using any part of the instruction set that the processor supports, and you are by no means restricted to high level style loop code or the theory behind it. You can write multi-entry and multi-exit loops, nested loops, interdependent loops and manually unrolled loops of many different designs, most of which will give even the best compilers nightmares.
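For readers coming from C, the two loop forms shown above differ in where the condition is tested. The sketch below is only an illustration in C, not MASM output, and the function names `count_while` and `count_repeat` are hypothetical. It assumes the register comparison is unsigned, which is how MASM treats a plain register operand in these directives.

```c
#include <stdint.h>

/* Pre-test loop, the shape of `.while eax > 0 ... .endw`.
   Returns how many times the body ran. */
uint32_t count_while(uint32_t eax)
{
    uint32_t runs = 0;
    while (eax > 0) {          /* condition checked before the body */
        eax -= 1;
        runs += 1;
    }
    return runs;
}

/* Post-test loop, the shape of `.repeat ... .until eax < 1`.
   The body always runs at least once, so the caller should
   not pass 0: the first decrement would wrap around. */
uint32_t count_repeat(uint32_t eax)
{
    uint32_t runs = 0;
    do {
        eax -= 1;
        runs += 1;
    } while (!(eax < 1));      /* condition checked after the body */
    return runs;
}
```

For any non-zero starting value both forms run the body the same number of times. Only the pre-test form is safe when the counter may already be zero on entry.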
Block structure conditional testing is routine in MASM, with an efficient and clear notation that is useful in all but the most demanding algorithms, where direct mnemonic code can bypass unneeded structure for maximum performance.

    .if eax == 1
      ; do something
    .elseif eax == 2
      ; do something else
    .else
      ; otherwise do this
    .endif

This of course can be nested in the normal manner for the construction of WndProc() style message processing, and with any of the high level flow control notations you can use C style runtime comparisons for complex condition evaluation.
    Operator    Meaning
    ==          Equal
    !=          Not equal
    >           Greater than
    >=          Greater than or equal to
    <           Less than
    <=          Less than or equal to
    &           Bit test (format: expression & bitnumber)
    !           Logical NOT
    &&          Logical AND
    ||          Logical OR
    CARRY?      Carry bit set
    OVERFLOW?   Overflow bit set
    PARITY?     Parity bit set
    SIGN?       Sign bit set
    ZERO?       Zero bit set

This combined capacity alone makes MASM a formidable tool, yet its true designation is that of a MACRO assembler, a capacity that few would be familiar with. It has a very powerful pre-processor that substantially extends this already formidable capacity. MASM macros are known to be quirky and at a more advanced level they can be reasonably difficult to develop and debug, but there are a large number of well proven, reliable macros available that increase the throughput of reliable code.
For normal integer evaluation.

    switch variable
      case value1
        ; do something here
      case value2
        ; do something else
      default
        ; do any default processing
    endsw

A variation combined with a hand written string evaluation procedure.

    switch$ string_address
      case$ "quoted_text1"
        ; perform action here
        ; if word matches
      case$ "quoted_text2"
        ; perform action here
        ; if word matches
      else$
        ; perform default action
    endsw$

The range of macros to automate tasks in MASM is almost unlimited, and while the result may look like high level code, it is in fact inlined hand written assembler code that improves your code throughput and maintenance without compromising performance.
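C has no switch on strings, so the nearest C equivalent of the string variation above is a chain of comparisons. The sketch below assumes the macro tests each case in order with a string comparison procedure, which the text implies but does not spell out. The function name `dispatch_word` and its return codes are hypothetical, chosen only for illustration.

```c
#include <string.h>

/* Hypothetical dispatcher: returns 1 or 2 for the two known
   words and 0 for the default branch. */
int dispatch_word(const char *word)
{
    if (strcmp(word, "quoted_text1") == 0) {
        return 1;              /* first case matched */
    } else if (strcmp(word, "quoted_text2") == 0) {
        return 2;              /* second case matched */
    }
    return 0;                  /* default action */
}
```

Written this way the cost grows with the number of cases, which is acceptable for the non critical code these macros are intended for.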
While the pseudo high level capacity in MASM is very useful and has a proven track record over time, its real power is in its ability to write the complete Intel instruction set, from conventional integer instructions through floating point instructions and registers to MMX and the later XMM (SSE) instructions, with near complete freedom of architecture. With manually written mnemonic code you have as much or as little structure as you require, and this is one of its great advantages over even the best compilers: the near total freedom to write anything you need without artificially imposed limitations.

Below is a simple algorithm using SSE instructions to XOR a random pad against data to be encrypted. It is written using a normal stack frame as the stack overhead is not a factor in its timing, and the source and pad must be 16 byte aligned for the SSE instructions used. It is unrolled by a factor of 4 and uses non-temporal writes to avoid cache pollution with incoming data. Built as an object module, it can be used in either a MASM program or linked directly into a Microsoft VC application, one of the major targets of MASM as it is supplied with Visual C.
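Before the listing, it may help to pin down what the routine is meant to compute: each byte of the source is XORed in place with the byte at the same offset in the pad. The plain C version below is a reference sketch only, useful for checking the output of the assembler routine. The name `xor_pad_ref` is hypothetical, and it makes no attempt at the alignment, unrolling or non-temporal writes described above.

```c
#include <stddef.h>

/* Reference version: XOR `len` bytes of `pad` into `src` in place. */
void xor_pad_ref(unsigned char *src, const unsigned char *pad, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        src[i] ^= pad[i];      /* same result whatever the block size */
    }
}
```

Because XOR is its own inverse, running the routine a second time with the same pad restores the original data, which makes a convenient round trip test for any faster implementation.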
    ; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    SSExor proc src:DWORD,padd:DWORD,ln:DWORD

        push ebx
        push esi
        push edi

        mov esi, src
        mov edi, padd
        mov edx, -64
        mov ebx, ln
        shr ebx, 6                      ; int divide ln by 64
        jz lbl_tail                     ; no full 64 byte block, only a tail

      align 16
      lbl0:
        add edx, 64
        movdqa xmm0, [esi+edx]          ; read the source
        movdqa xmm1, [esi+edx+16]       ; read the source
        movdqa xmm2, [esi+edx+32]       ; read the source
        movdqa xmm3, [esi+edx+48]       ; read the source

        pxor xmm0, [edi+edx]            ; xor pad to source
        pxor xmm1, [edi+edx+16]         ; xor pad to source
        pxor xmm2, [edi+edx+32]         ; xor pad to source
        pxor xmm3, [edi+edx+48]         ; xor pad to source

        movntdq [esi+edx], xmm0         ; write result back to source
        movntdq [esi+edx+16], xmm1      ; write result back to source
        movntdq [esi+edx+32], xmm2      ; write result back to source
        movntdq [esi+edx+48], xmm3      ; write result back to source

        sub ebx, 1
        jnz lbl0

      lbl_tail:
        add edx, 64                     ; offset of the first unprocessed byte
        mov eax, ln
        sub eax, edx                    ; count of trailing bytes (0 to 63)
        jle lbl2

      align 4
      lbl1:
        movzx ecx, BYTE PTR [edi+edx]   ; xor the tail one byte at a time
        xor [esi+edx], cl
        add edx, 1
        sub eax, 1
        jnz lbl1

      lbl2:
        pop edi
        pop esi
        pop ebx

        ret

    SSExor endp

    ; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

Algorithms written in the normal integer instructions are commonplace and still do the lion's share of work in most applications. Below is an algorithm to perform the mundane task of replacing zero bytes (ASCII NUL) with spaces in large text files that occasionally have embedded zeros that prevent them from being read in a normal editor.
It is written without a stack frame and is unrolled by a factor of 4 to improve its throughput by reducing loop code overhead.

    ; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    OPTION PROLOGUE:NONE
    OPTION EPILOGUE:NONE

    zrep proc ptxt:DWORD,ltxt:DWORD

        mov edx, [esp+4]                ; ptxt
        mov ecx, [esp+8]                ; ltxt
        add edx, ecx                    ; edx points to the end of the text
        neg ecx                         ; ecx counts up from -ltxt to zero
        jmp label0

      quit:
        ret 8

      align 4
      pre:
        mov BYTE PTR [edx+ecx], 32      ; overwrite the zero with a space
        add ecx, 1
        jz quit

      align 4
      label0:
        cmp BYTE PTR [edx+ecx], 0
        je pre
        add ecx, 1
        jz quit

        cmp BYTE PTR [edx+ecx], 0
        je pre
        add ecx, 1
        jz quit

        cmp BYTE PTR [edx+ecx], 0
        je pre
        add ecx, 1
        jz quit

        cmp BYTE PTR [edx+ecx], 0
        je pre
        add ecx, 1
        jnz label0

        ret 8

    zrep endp

    OPTION PROLOGUE:PrologueDef
    OPTION EPILOGUE:EpilogueDef

    ; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

With MASM the sky is the limit. While there are many tasks that are not worth the effort of writing in assembler, where you do need to target the speed of a particular process in an application, MASM is the tool that can produce any algorithm you know enough about to write.
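As a cross check for the zrep routine above, the same task can be written in a few lines of C. This is a reference sketch under the assumption that the routine's contract is simply "replace every zero byte in the buffer with a space". The name `zrep_ref` is hypothetical and is not part of the original code.

```c
#include <stddef.h>

/* Reference version: replace every zero byte in txt[0..len-1]
   with an ASCII space (32). */
void zrep_ref(char *txt, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (txt[i] == 0) {
            txt[i] = 32;       /* ASCII space */
        }
    }
}
```

Unlike the assembler listing, which appears to assume a non-zero length, this version does nothing when `len` is zero, so a caller comparing the two should test with buffers of at least one byte.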
MASM is not for the faint of heart. It is a technically demanding tool that requires a good working knowledge of both the operating system and the mnemonics that the processor supports, but it puts the final control of an algorithm in the hands of the programmer who chooses to master a tool of this type.