
Assembly Language Answers

  Question Asked By: Sahar Chandler   on May 22 In Assembly Language Category.

  
Question Answered By: Abhishek Singh   on May 22



For 8-bit characters it goes broadly like this (there are many ways to implement it):

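1. Set si to point to the first character of the search string ('cat' in this case). Set di to point to the text to be searched (scasb reads through es:di) and set cx/ecx/rcx to the number of characters left in the text. Clear the direction flag (cld) so the string instructions move forward.

2. mov al,[si]

3. repnz scasb to find the first match of the first character. If the zero flag is not set afterwards, the character does not occur in the rest of the text and there is nothing to replace.

4. Store the address of the matching character somewhere, together with the remaining count. scasb leaves di one character past the match, so the address to store is di - 1.

5. Set di to the stored address and si to the first character of the search string.

6. Set cx/ecx/rcx to the length of the search string.

7. repz cmpsb

8. Check that cx/ecx/rcx is zero and that the last characters matched (zero flag set).

9. If yes, it's a match, so copy 'dog' to the stored address with rep movsb (first set si to the replacement string, di to the stored address, and cx/ecx/rcx to the replacement length). Note that this approach only works if the replacement string is no longer than the original string. If it's longer, you may need to reserve a new block of memory to avoid a buffer overflow. If it's not a match, set di to the stored address plus 1 (plus 2 for 16-bit characters), restore the remaining count, point si back at the search string, and jump to step 2 (mov al,[si]). You also need to check here whether you have reached the end of the string, so that cmpsb never reads past it.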
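The steps above can be modelled in C, which may help when checking the register bookkeeping before writing the assembly. This is only a sketch: `replace_first` is a hypothetical helper, not part of the original answer. It uses `memchr` as a stand-in for `repnz scasb`, `memcmp` for `repz cmpsb`, and `memcpy` for `rep movsb`, and assumes the replacement is no longer than the search string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (name and signature are illustrative only).
   Replaces the first occurrence of pat in text[0..text_len) with rep.
   Returns a pointer to the replaced region, or NULL if pat was not found. */
char *replace_first(char *text, size_t text_len,
                    const char *pat, size_t pat_len,
                    const char *rep, size_t rep_len)
{
    assert(pat_len > 0);
    assert(rep_len <= pat_len);   /* in-place copy must not overrun the match */

    char *p = text;               /* plays the role of di during the scan */
    size_t remaining = text_len;  /* plays the role of cx during the scan */

    while (remaining >= pat_len) {
        /* Like repnz scasb: look for the first pattern byte. The scan stops
           early enough that a full pattern still fits after any hit. */
        char *hit = memchr(p, pat[0], remaining - pat_len + 1);
        if (hit == NULL)
            return NULL;

        /* Like repz cmpsb: compare the whole pattern at the stored address. */
        if (memcmp(hit, pat, pat_len) == 0) {
            /* Like rep movsb: overwrite the match with the replacement. */
            memcpy(hit, rep, rep_len);
            return hit;
        }

        /* Not a match: resume one byte past the stored address. */
        remaining -= (size_t)(hit - p) + 1;
        p = hit + 1;
    }
    return NULL;
}
```

With a replacement of the same length as the search string ('cat' to 'dog') the text is rewritten cleanly. With a shorter replacement, the tail of the old match stays in place (replacing 'cat' with 'ox' leaves 'oxt'), so the rest of the text would have to be shifted left as a separate step.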

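That completes a single replacement. To replace every occurrence, as in sed s/cat/dog/g, loop back to step 1, first setting the text pointer to the position where the search should resume (which position you choose depends on how you want your regex engine to work).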
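A replace-all loop might look like the following C sketch. `replace_all` is a hypothetical helper, and it makes one specific choice about where to resume: directly after each match, so matches never overlap and replaced bytes are never rescanned. Other engines could reasonably choose differently. It again assumes the replacement is no longer than the search string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (illustrative only).
   Replaces every non-overlapping occurrence of pat in text[0..text_len)
   with rep and returns the number of replacements made. */
size_t replace_all(char *text, size_t text_len,
                   const char *pat, size_t pat_len,
                   const char *rep, size_t rep_len)
{
    assert(pat_len > 0);
    assert(rep_len <= pat_len);   /* in-place copy must not overrun the match */

    size_t count = 0;
    size_t pos = 0;

    /* Stop as soon as a full pattern no longer fits in the remaining text. */
    while (pos + pat_len <= text_len) {
        if (memcmp(text + pos, pat, pat_len) == 0) {
            memcpy(text + pos, rep, rep_len);
            pos += pat_len;       /* resume after the match */
            count++;
        } else {
            pos++;                /* resume one character further on */
        }
    }
    return count;
}
```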

For UTF-16 (16-bit characters) replace the following: scasb -> scasw, cmpsb -> cmpsw, movsb -> movsw, al -> ax. The counts in cx/ecx/rcx are then numbers of 16-bit units, not bytes.
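For the 16-bit variant, the main thing to keep straight is that lengths and indices count 16-bit units while addresses still advance in bytes. A small C sketch of the search half, using a hypothetical `find_u16` helper, shows where the factor of two appears:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (illustrative only).
   Returns the index, in 16-bit units, of the first occurrence of pat in
   text, or (size_t)-1 if there is none. All lengths are in units. */
size_t find_u16(const uint16_t *text, size_t text_units,
                const uint16_t *pat, size_t pat_units)
{
    assert(pat_units > 0);

    for (size_t i = 0; i + pat_units <= text_units; i++) {
        /* The first-unit test mirrors scasw; the full compare mirrors cmpsw.
           memcmp takes a byte count, hence the multiplication. */
        if (text[i] == pat[0] &&
            memcmp(text + i, pat, pat_units * sizeof *pat) == 0)
            return i;
    }
    return (size_t)-1;
}
```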

For 32-bit code, replace all references to si with esi and all references to di with edi.

For 64-bit code, replace all references to si with rsi and all references to di with rdi.
