Reverse a string (16-bit Assembly)
Hey there, welcome back to my blog! Today, I’m diving into the world of 8086 assembly to explore something pretty cool — how to reverse a string. But don’t worry, I’m keeping it chill and beginner-friendly.
Ever wondered about flipping a string in Assembly Language? It’s a handy skill, and Emu8086 is the perfect place to learn it. No worries if you’re just starting out; this post is all about making the process easy to grasp.
So, if you’re curious about turning a string backward in Assembly Language, you’re in for a treat! We’ll go through the whole process step by step, and I promise it’ll be a breeze, even if you’re new to this Assembly stuff. Ready to jump into the world of string reversal in 8086 assembly? Let’s roll!
Before We Start:
- Basic understanding of the 8086 registers (ax, bx, cx, dx, etc.)
- Familiarity with addressing modes and memory access
- Knowledge of basic assembly instructions (mov, inc, dec, cmp, etc.)
These are required to fully understand the logic behind the algorithm. If you are ready, let’s get started.
The Algorithm
To reverse the string, we use a stack-based approach. We push each character of the original string onto the stack, then pop the characters off one by one. Because a stack is Last In, First Out (LIFO), the characters come back in the opposite order from how they were pushed. The popped characters can be stored back into the original string starting from the first position; in the walkthrough in the Code section, we print each one as it is popped instead. Either way, the result is the original string in reverse, which is a nice illustration of how a simple data structure like a stack can solve a problem such as string reversal.
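Here is a minimal sketch of the store-back version of this idea. It is not the code walked through in the rest of this post, just an illustration under a few assumptions: Emu8086/MASM-style syntax, the labels push_chars and pop_chars, a "$" terminator added to the string so DOS function 09h can print it, and DOS function 4Ch to exit. The directives and registers are explained in the Code section.

```asm
; Sketch only: reverse "hello" in place using the stack.
; Labels, the "$" terminator and the exit call are illustrative assumptions.
.model small
.stack 100h

.data
string db "hello", "$"     ; "$" marks the end of the text for DOS function 09h

.code
main:
    mov ax, @data          ; segment address of the data segment
    mov ds, ax             ; ds must be loaded through a general register

    mov si, offset string  ; si -> first character
    mov cx, 5              ; number of characters to reverse
push_chars:
    mov al, [si]           ; load exactly one byte (one character)
    push ax                ; the stack stores words; only al matters here
    inc si                 ; next character
    loop push_chars        ; cx = cx - 1, repeat while cx is not zero

    mov di, offset string  ; di -> first position of the same string
    mov cx, 5
pop_chars:
    pop ax                 ; characters come back in reverse order
    mov [di], al           ; overwrite the string from the start
    inc di
    loop pop_chars

    mov dx, offset string  ; ds:dx -> "$"-terminated text
    mov ah, 09h            ; DOS: print string
    int 21h

    mov ax, 4C00h          ; DOS: terminate program, return code 0
    int 21h
end main
```

Loading a single byte with al keeps the read inside the string, and after the second loop the memory at string holds the reversed text.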
Code
- .model small: This directive sets the memory model for the program. Here it selects the small model, one of the memory models used in x86 assembly programming, which gives the program one code segment and one data segment. The memory model determines how the program's code and data are organized in memory.
- .stack 100h: This directive sets the size of the stack to 100h bytes, which is 256 bytes in decimal. The stack is a region of memory used for storing temporary data and managing procedure calls.
- .data: This directive marks the beginning of the data segment. The data segment is used for declaring and initializing data that the program uses during its execution.
- string db "hello": This line declares a variable named string and initializes it with the text "hello". Here's the breakdown: db ("define byte") reserves data one byte (8 bits) at a time, and "hello" is the actual string data, with each character stored in one byte, so the string takes 5 bytes.
- mov ax, @data: This instruction loads the segment address of the data segment (@data) into the accumulator register (ax). The @data symbol stands for the start of the data segment.
- mov ds, ax: This instruction copies the value in ax, which now holds the address of the data segment, into the data segment register (ds). The 8086 cannot load a segment register directly with an immediate value, which is why the address goes through ax first. This sets up the data segment for accessing data in the program.
- mov si, offset string: This instruction moves the offset of the string variable into the source index register (si). The offset keyword retrieves the offset of a variable or label, and in this case it points si to the beginning of the string.
- mov cx, 5: This instruction sets the loop counter register (cx) to 5, the number of characters in "hello", so the loop that follows runs once per character.
We have already defined a string named ‘hello,’ and our objective is to push each letter onto the stack to reverse the string. The process is quite simple. To reverse the string, we will systematically push the letters ‘h,’ ‘e,’ ‘l,’ ‘l,’ and ‘o’ into the stack, followed by popping them out in reverse order.
I moved offset string into the source index register, so si holds the offset 0000h, which is the address of the letter “h”.
- stackpush: This label marks the beginning of the push loop; the loop stackpush instruction below jumps back to it.
- mov bx, [si]: This instruction moves the 16-bit value (a word) stored at the memory location pointed to by the source index register (si) into the base register (bx). Since each character is one byte, the low byte (bl) receives the current character and the high byte (bh) receives the byte that follows it. On the last iteration, bh picks up whatever byte sits right after the string in memory, which is harmless here because only the low byte is used later.
- push bx: This instruction pushes the value in bx onto the stack. The 8086 stack works with 16-bit words, so each character takes up a full word. The stack is a Last-In-First-Out (LIFO) structure, so the value lands on the top of the stack.
- inc si: This instruction increments si by one, moving it to the next character in the string.
- loop stackpush: This instruction decrements the loop counter (cx) and jumps to the stackpush label if cx is not zero. It repeats the stackpush block until cx becomes zero.
This block pushes the string onto the stack one character at a time. The loop is controlled by the cx register, which was set to 5, so it runs five times and pushes five words, each carrying one character of "hello" in its low byte.
In order to reverse the string, we should pop each letter from the stack. It is really simple to do that:
- mov cx, 5: This instruction resets the loop counter (cx) to 5, because the previous loop left it at zero. The pop loop will therefore also run five times.
- stackpop: This label marks the beginning of the loop.
- pop dx: This instruction pops a 16-bit value from the stack into the dx register. The low byte (dl) now holds the character to print; the high byte (dh) holds the extra byte that came along with it and is ignored.
- mov ah, 02h: This instruction sets the high byte of the ax register to 02h, which selects DOS function 02h (character output).
- int 21h: This instruction calls DOS interrupt 21h, a software interrupt used for various DOS functions. In this case, function 02h is invoked, and it prints the character in the dl register (which was loaded with the popped value).
- loop stackpop: This instruction decrements the loop counter (cx) and jumps to the stackpop label if cx is not zero. It repeats the stackpop block until cx becomes zero.
If we step through the stackpush loop in the emulator, the first iteration loads bx with 6568h: the low byte (bl) is 68h, which corresponds to ‘h’, and the high byte (bh) is 65h, which corresponds to ‘e’. In the second iteration, bl holds 65h (‘e’), and so on.
If we examine the stackpop loop, the first iteration pops 6Fh into the low byte of dx (dl), which corresponds to ‘o’. In the second iteration, dl holds 6Ch, which corresponds to ‘l’, and so on.
Here is the output: the five characters are printed in reverse order, so the emulator screen shows olleh.
Here is the code:
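The listing below is the full program pieced together from the walkthrough above, so treat it as a reconstruction and not a verbatim copy. The .code directive, the main label with its matching end main, and the exit call (DOS function 4Ch) do not appear in the walkthrough; they are assumptions I added so the program assembles and ends cleanly in Emu8086.

```asm
; Reverse "hello" by pushing each character and printing it as it is popped.
; .code, main/end main and the 4Ch exit call are assumptions (see note above).
.model small
.stack 100h

.data
string db "hello"          ; 5 characters, one byte each

.code
main:
    mov ax, @data          ; segment address of the data segment
    mov ds, ax             ; set up ds through ax
    mov si, offset string  ; si -> 'h'
    mov cx, 5              ; string length

stackpush:
    mov bx, [si]           ; bl = current character, bh = the byte after it
    push bx                ; one word per character
    inc si                 ; move to the next character
    loop stackpush         ; repeat until cx reaches zero

    mov cx, 5              ; reset the counter for the second loop

stackpop:
    pop dx                 ; dl = most recently pushed character
    mov ah, 02h            ; DOS: write the character in dl
    int 21h
    loop stackpop

    mov ax, 4C00h          ; DOS: terminate program, return code 0
    int 21h
end main
```

Function 02h leaves cx and dx alone, so the loop counter survives each call to int 21h and the program prints olleh.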
Summary:
Today, I took a shot at breaking down the workings of 8086 assembly, specifically homing in on the cool task of flipping a string. Think of this as just the beginning of a journey that delves into all sorts of interesting stuff, from exploring and tinkering to putting things together and taking them apart.
Now, if the idea of “reversing a string in 8086” seems a bit tangled in tech jargon, no worries! I get that diving into assembly language can be a bit like deciphering a secret code. But hey, I’m here to help. If any part didn’t quite click, just shoot me a question.
You can follow me on:
Twitter: https://twitter.com/lockpin010_
LinkedIn: https://www.linkedin.com/in/ahmetgoker/
Github: https://github.com/0xCD4
Ahmet | Security Researcher | Sociologist