Hello World in Assembly
As is customary with almost every language, the first thing people learn is how to print a string of text (usually "Hello World!"). In higher-level languages this is fairly easy to accomplish in relatively few lines of code. For instance, in C++ one could write:
Code:
#include <iostream>
using namespace std;
int main()
{
    cout << "Hello World!" << endl;
    return 0;
}
When it comes to assembly, though, a little more work is involved, depending on the environment. If you were to disassemble the EXE file generated from the above code, you would see tons of assembly code just to do this one simple task.
Back in the 'old days', when MS-DOS reigned supreme and assembly programming was done in 16-bit, you had direct access to the 'MS-DOS API', which let you deal with the hardware in a fairly direct manner.
On the flip side, you had to worry about the structure of your program in memory, which memory model to use, and so on.
For instance, a 16-bit version of Hello World in assembly would look like this:
Code:
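.MODEL Small                    ; small memory model: one code segment, one data segment
.STACK 100h                     ; reserve 256 bytes of stack
.DATA
msg db 'Hello World!',13,10,'$' ; text, CR, LF, then the '$' terminator
.CODE
start:
    mov ax, seg msg             ; segment that holds msg
    mov ds, ax                  ; DS cannot be loaded with an immediate, so go through AX
    mov ah, 09h                 ; DOS function 09h: print '$'-terminated string
    lea dx, msg                 ; DS:DX now points at msg
    int 21h                     ; call DOS
    mov ax, 4C00h               ; AH = 4Ch (terminate), AL = 00h (return code)
    int 21h
end start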
It's definitely a little more cryptic. First the memory model gets set and a stack size gets defined. Then the data block defines the variables we will be using. Strings printed this way need to end with the '$' symbol. The values 13 and 10 are \r (carriage return) and \n (line feed) respectively; together they produce a newline, much like endl in the C++ sample.
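To make the '$' terminator a bit more concrete, here is a rough C++ sketch of how a DOS-style print routine could walk such a string. The names dollar_string and dos_msg are made up for this illustration and are not part of DOS or of any library; the sketch also assumes the buffer really contains a '$'.
Code:
```cpp
#include <cassert>
#include <string>

// Hypothetical helper: collect characters up to (but not including) the
// '$' terminator. Assumes a '$' is present; without one the loop would
// run past the end of the buffer.
std::string dollar_string(const char* p)
{
    std::string out;
    while (*p != '$') {
        out += *p;
        ++p;
    }
    return out;
}

// The same bytes as the msg definition: the text, 13 (CR), 10 (LF), '$'.
const char dos_msg[] = { 'H','e','l','l','o',' ','W','o','r','l','d','!', 13, 10, '$' };

// Small self-check: CR and LF are part of the output, the '$' is not.
void check_dollar_string()
{
    assert(dollar_string(dos_msg) == "Hello World!\r\n");
}
```
One consequence of this scheme is that a string printed this way cannot itself contain a '$' character.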
The 'start:' label defines the start of the program. The next two lines make sure that the data segment register (DS) actually points to the correct spot in memory.
The core of the above is the INT 21h call. It takes a function number in the AH register and acts on it. You can see it gets called twice: once with 09h in AH and once with AX holding the value 4C00h (thus AH = 4Ch). When we call INT 21h with AH = 09h we are telling DOS to print a '$'-terminated string to the command line, with the string's address in the DX register (relative to DS). This is essentially the 'cout' statement in the C++ source.
The second INT 21h call says 'end the program and return what's in AL', which in this case is 0, so this is similar to the 'return 0' statement in the C++ program.
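If it is not obvious why loading 4C00h into AX sets AH to 4Ch and AL to 0, this small C++ sketch shows the split. AH is simply the high byte of AX and AL the low byte; the function names high_byte and low_byte are only chosen for this example.
Code:
```cpp
#include <cassert>
#include <cstdint>

// AH is the upper 8 bits of the 16-bit AX register.
std::uint8_t high_byte(std::uint16_t ax)
{
    return static_cast<std::uint8_t>(ax >> 8);
}

// AL is the lower 8 bits of the 16-bit AX register.
std::uint8_t low_byte(std::uint16_t ax)
{
    return static_cast<std::uint8_t>(ax & 0xFF);
}

// Small self-check using the value from the 16-bit sample.
void check_ax_split()
{
    assert(high_byte(0x4C00) == 0x4C); // function number: terminate program
    assert(low_byte(0x4C00) == 0x00);  // return code handed back to DOS
}
```
So 'mov ax,4C00h' is just a compact way of setting the function number and the return code with one instruction.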
OK, this is all great, but in a sense it is irrelevant for writing assembly nowadays against the Windows API. 32-bit Windows uses the 'flat' memory model, which means that in theory a program can address a full 4 GB of memory through a single 32-bit pointer, whereas under 16-bit DOS a single offset could only reach 64 KB, hence the need for segments. Windows also provides functions for a lot of the things you used to have to talk directly to the hardware for. So in a sense, programming assembly for Windows is a little more 'high-level' than old-school MS-DOS programming.
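For those wondering what segments actually do: in 16-bit real mode the CPU forms an address as segment * 16 + offset, and since the offset is only 16 bits wide a single segment covers 64 KB. A minimal C++ sketch of that arithmetic follows; physical_address is a name picked for this example only.
Code:
```cpp
#include <cassert>
#include <cstdint>

// Real-mode address calculation: shift the segment left by 4 bits
// (multiply by 16) and add the 16-bit offset.
std::uint32_t physical_address(std::uint16_t segment, std::uint16_t offset)
{
    return (static_cast<std::uint32_t>(segment) << 4) + offset;
}

// Small self-check.
void check_physical_address()
{
    // 1234h:0010h and 1235h:0000h name the same byte in memory.
    assert(physical_address(0x1234, 0x0010) == 0x12350);
    assert(physical_address(0x1235, 0x0000) == 0x12350);
    // One segment spans offsets 0000h..FFFFh, i.e. 65536 bytes.
    assert(physical_address(0x1000, 0xFFFF) - physical_address(0x1000, 0x0000) + 1 == 65536);
}
```
In the flat model none of this juggling is needed, because one 32-bit value is the whole address.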
So what if we now want to write a 32-bit Command-Line Version of 'Hello World' for Windows?
The trick here is not to reinvent the wheel or go too 'low level', as your first instinct may be, but to work against the provided Windows API, using the functions in the Windows DLLs. This code is for MASM32 (other assemblers may need slight tweaks):
Code:
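.386                                ; target the 80386 instruction set
.model flat, stdcall                ; flat memory model, stdcall calling convention
option casemap :none                ; identifiers are case-sensitive
include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
include \masm32\include\masm32.inc
includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\masm32.lib
.data
HelloWorld db "Hello World!", 0     ; zero-terminated string
.code
start:
    invoke StdOut, addr HelloWorld  ; print the string to the console
    invoke ExitProcess, 0           ; end the process with exit code 0
end start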
Looking at this code from the top, you'll again see some initialization (the target platform is a 386, we use the 'flat' memory model and the stdcall calling convention, and names are case-sensitive). Then you see a bunch of 'include' and 'includelib' statements. If you're familiar with C++, this is very similar to including a header file and telling the linker to also link against a specific LIB file. In this instance we're including the windows, kernel32 and masm32 files. This will allow us to make Windows system calls.
Then we have the data segment again, where we define our string (notice how this one is terminated with a 0 rather than '$'), after which we find the code segment. Instead of calling interrupts as we did in the 16-bit sample, we now simply invoke existing routines, namely StdOut (a helper from the MASM32 library) and ExitProcess (a Windows API function in kernel32).
As you can probably guess, StdOut takes a pointer to the 'Hello World' string as its parameter, and ExitProcess ends the program with an exit code of 0, since that's what we're passing in.
Note that this is meant to be run from the command line. If you launch it any other way, a console window will just flash by unless you alter the code to wait for input.
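Since the routine only receives a pointer, it has to find the end of the string itself by looking for the terminating 0. The following C++ sketch shows that idea; it is not the actual StdOut source, and zero_terminated_length and hello_text are names invented for this example.
Code:
```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: count the bytes before the terminating 0,
// which is what any routine handed a bare pointer has to do before
// it can write the text out.
std::size_t zero_terminated_length(const char* p)
{
    std::size_t n = 0;
    while (p[n] != 0) {
        ++n;
    }
    return n;
}

// Same content as the HelloWorld definition: the text followed by a 0.
const char hello_text[] = "Hello World!";

// Small self-check: 12 visible characters, the 0 is not counted.
void check_zero_terminated_length()
{
    assert(zero_terminated_length(hello_text) == 12);
    assert(sizeof(hello_text) == 13); // storage includes the terminator
}
```
This is the same convention C and C++ string literals use, which is one reason the Windows samples feel closer to the C++ version than the DOS one did.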
OK, great! But what if we want to show an actual dialog in Windows? The only change we'd have to make is to replace the call to StdOut with something that shows a window. Windows provides a function called MessageBox, which is just what we're looking for. This function lives in user32.dll, though, so we'll need to make sure to include that in our code.
Looking at MSDN (msdn.microsoft.com), we can find that the MessageBox function takes the following parameters:
Code:
int MessageBox(
    HWND hWnd,
    LPCTSTR lpText,
    LPCTSTR lpCaption,
    UINT uType
);
Where hWnd is the owner window: not needed here, so set to NULL
lpText is the pointer to the string to display: addr HelloWorld
lpCaption is the caption for the window, which we just set to the same text for convenience
and uType is a set of flags that defines, among other things, which buttons to show on the dialog: in this case just the 'OK' button (MB_OK).
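The uType value is built by OR-ing constants together. The C++ sketch below uses local stand-ins (kMbOk, kMbOkCancel, kMbIconInformation) for MB_OK, MB_OKCANCEL and MB_ICONINFORMATION so that it compiles without the Windows headers; the numeric values are the ones I believe the headers define, so treat them as an assumption and check windows.inc if it matters. button_part is a helper made up for this example.
Code:
```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins for the MB_* constants (assumed values, see the text above).
const std::uint32_t kMbOk              = 0x00000000; // just an 'OK' button
const std::uint32_t kMbOkCancel        = 0x00000001; // 'OK' and 'Cancel' buttons
const std::uint32_t kMbIconInformation = 0x00000040; // information icon

// Hypothetical helper: the low 4 bits of uType select the button set.
std::uint32_t button_part(std::uint32_t uType)
{
    return uType & 0x0F;
}

// Small self-check: flags from different groups combine with bitwise OR.
void check_message_box_flags()
{
    const std::uint32_t uType = kMbOkCancel | kMbIconInformation;
    assert(uType == 0x41);
    assert(button_part(uType) == kMbOkCancel);
    assert(button_part(kMbOk | kMbIconInformation) == kMbOk);
}
```
Because MB_OK is zero, passing it alone gives the plainest possible dialog: one 'OK' button and no icon.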
The finished code for this would then look like this:
Code:
.386
.model flat, stdcall
option casemap :none
include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
include \masm32\include\user32.inc
includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\user32.lib
.data
HelloWorld db "Hello World!", 0
.code
start:
invoke MessageBox, NULL, addr HelloWorld, addr HelloWorld, MB_OK
invoke ExitProcess, 0
end start
Of course, once this gets assembled and you look at it through a disassembler, you will not see code as nice as this, but that's for another time.
(OK, a hint: you'd see something more like this, but without the nice readable variable names)
Code:
...
push MB_OK
push offset HelloWorld
push offset HelloWorld
push NULL
call MessageBoxA
...
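The pushes appear in reverse because under the stdcall convention the arguments go onto the stack from right to left, so the first parameter is pushed last. Here is a small C++ sketch of that ordering; push_order is a name made up for this example and it only models the order, not a real stack.
Code:
```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: given the arguments in source order (left to right),
// return the order in which they would be pushed (right to left).
std::vector<std::string> push_order(const std::vector<std::string>& args)
{
    return std::vector<std::string>(args.rbegin(), args.rend());
}

// Small self-check using the MessageBox parameter names.
void check_push_order()
{
    const std::vector<std::string> args = { "hWnd", "lpText", "lpCaption", "uType" };
    const std::vector<std::string> expected = { "uType", "lpCaption", "lpText", "hWnd" };
    assert(push_order(args) == expected);
}
```
With the values from the sample that gives MB_OK first, then the caption and text pointers, then NULL, which matches the listing above.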