SQL Server, inside the parser: the Get_Gen_Lex procedure
Hello friends,
Well, we have arrived on Mars! The Perseverance probe designed by NASA reached the Red Planet after 203 days!
"The human being by his nature always wants to overcome his limits and go beyond. If curiosity is the great engine that makes humanity progress, science is the engine.
Today is a big day for science!"
To this and to those who do not want to have limits I dedicate this post.
Yes, again about the Parser of SQL Server
Today we talk about another procedure of SqlLang.dll: the Get_Gen_Lex procedure.
This is the second part of a series of posts about the parser and you can find the first part here
Enjoy the reading!
The CParser::Get_Gen_Lex procedure
This procedure, part of the CParser class, is usually called after the GetChar procedure and it is used to classify the character just read.
In many parts of the LGetToken routine for example we have:
Procedure and parameters
Looking at the source below
we can say that the declaration is:
int CParser::Get_Gen_Lex(CCompatLevel param_1,ulong param_2)
The procedure has two parameters, CCompatLevel (passed in CL) and param_2 (passed in EDX), and returns an int32 value in the EAX register.
This procedure returns the type of the character read.
So, let's read together the main part of the Get_Gen_Lex routine
1) As in every routine compiled from C/C++ (as the entire SQL Server is)
we can find at the start of the function the instruction used to save
the rbx register: this instruction is "push rbx".
Then you will find the instruction "sub rsp,20h" that allocates 0x20 bytes of stack space.
At the end of the function we will find the "inverse" instructions: "add rsp,20h", "pop rbx" and "ret".
I highlighted these areas in yellow.
00007fff`63b52ff0 53 push rbx
00007fff`63b52ff1 4883ec20 sub rsp,20h
2) The EDX register is the second parameter of the procedure and contains the value read from the input string.
In the previous post we read the character 'S' (53h) with the GetChar procedure and now this value is passed as param_2.
If the value read is less than 80h
00007fff`63b52ff5 8bda mov ebx,edx
00007fff`63b52ff7 81fa80000000 cmp edx,80h
00007fff`63b52ffd 731a jae sqllang!CParser::Get_Gen_Lex+0x2f (00007fff`63b53019)
If the value read is not equal to 5Ch ("\")
00007fff`63b52fff 83fb5c cmp ebx,5Ch
00007fff`63b53002 0f840a42af00 je sqllang!CParser::Get_Gen_Lex+0x15 (00007fff`64647212)
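Then the procedure reads the type of the character from the sqllang!CharTab zone of memory and stores this value in the EAX register. The rbx register contains the value read, which is used as the index into the table (each entry is 4 bytes wide and only its first byte is loaded).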
00007fff`63b53008 488d0d31954402 lea rcx,[sqllang!CharTab (00007fff`65f9c540)]
00007fff`63b5300f 0fb60499 movzx eax,byte ptr [rcx+rbx*4]
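The addressing mode of the movzx above can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the table bytes are placeholders invented for the demonstration (the real ones live in sqllang!CharTab), and make_demo_table and lookup_type are hypothetical names. What it tries to show is the access pattern: one 4-byte slot per character, of which only the first byte is read and zero-extended.

```python
ENTRY_SIZE = 4        # scale factor in [rcx+rbx*4]
TABLE_ENTRIES = 0x80  # the table is only consulted for values below 80h


def make_demo_table():
    """Build a placeholder table. These are NOT the real CharTab bytes."""
    table = bytearray(TABLE_ENTRIES * ENTRY_SIZE)
    for code in range(TABLE_ENTRIES):
        base = code * ENTRY_SIZE
        table[base] = code % 7                # dummy "type" in the first byte
        table[base + 1:base + 4] = b"\xEE" * 3  # filler, never read by lookup
    return bytes(table)


def lookup_type(table, value):
    """Model of: movzx eax, byte ptr [rcx+rbx*4]."""
    if not 0 <= value < TABLE_ENTRIES:
        raise ValueError("the table is only indexed for values below 80h")
    return table[value * ENTRY_SIZE]


demo_table = make_demo_table()
print(hex(lookup_type(demo_table, 0x53)))  # dummy type stored for 53h
```

The filler bytes are there only to make it visible that the three upper bytes of each slot do not influence the result.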
Finally the original rsp and rbx values are restored (since we are exiting from the procedure) and the value in eax is returned.
00007fff`63b53013 4883c420 add rsp,20h
00007fff`63b53017 5b pop rbx
00007fff`63b53018 c3 ret
This is the normal behaviour of the procedure!
But what does the sqllang!CharTab zone of memory contain?
Sqllang!CharTab zone of memory
This memory zone is 80h x 4 = 200h bytes wide and for each character contains a property called "type of character", here highlighted in orange.
Values from 0 to 20h are non-printable characters and return a type equal to 3 (eax=3).
Values from 21h to 2fh are special characters. Return types 05, 06, 04 and 01.
Values from 30h to 39h are numerical characters. Return type 02.
Values from 3ah to 40h are special characters. Return types 05 and 01.
Values from 41h to 5ah are alphabetical characters (A to Z). Return type 01.
Values from 5bh to 60h are special characters. Return types 05, 04 and 01.
Values from 61h to 7ah are alphabetical characters (a to z). Return type 01.
Values from 7bh to 7fh are special characters. Return type 05.
Briefly:
01 = alphabetical characters from a to z plus _ @ #
02 = numbers from 0 to 9
03 = non-printable characters
04 = these special characters: [ " '
05 = these special characters: { | } ~ \ ] ^ : ; < = > ? ! % & ( ) * + , - . /
06 = $
=> Ooh yes.. remember these values for the next posts...
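To keep these values at hand, the summary can be turned into a small lookup. The sketch below is not the real table: it is built only from the list above, char_type and build_type_map are hypothetical names, and treating the uppercase letters as type 01 is an assumption. Characters that the summary does not mention (for example the backtick) are left unmapped and give None. Remember also that 5Ch never reaches the table, because the procedure handles it separately.

```python
import string

# Type codes exactly as listed in the summary above.
TYPE_BY_CHARS = {
    0x01: string.ascii_lowercase + "_@#",
    0x02: string.digits,
    0x04: "[\"'",
    0x05: "{|}~\\]^:;<=>?!%&()*+,-./",
    0x06: "$",
}


def build_type_map():
    """Map character codes to the type listed in the summary."""
    type_map = {}
    for code in range(0x00, 0x21):       # 03: values from 0 to 20h
        type_map[code] = 0x03
    for type_code, chars in TYPE_BY_CHARS.items():
        for ch in chars:
            type_map[ord(ch)] = type_code
    for ch in string.ascii_uppercase:    # assumption: A-Z are type 01 as well
        type_map[ord(ch)] = 0x01
    return type_map


TYPE_MAP = build_type_map()


def char_type(ch):
    """Return the summary type of a single character, or None if unlisted."""
    return TYPE_MAP.get(ord(ch))


for sample in "s7 [$|":
    print(repr(sample), char_type(sample))
```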
Exceptions:
If the value read is 80h or greater and is equal to 0FFFFFFFFh then eax = 9h
00007fff`63b53019 83fbff cmp ebx,0FFFFFFFFh
00007fff`63b5301c 0f85f941af00 jne sqllang!CParser::Get_Gen_Lex+0x3f (00007fff`6464721b)
00007fff`63b53022 b809000000 mov eax,9
00007fff`63b53027 4883c420 add rsp,20h
00007fff`63b5302b 5b pop rbx
00007fff`63b5302c c3 ret
00007fff`63b5302d 90 nop
00007fff`63b5302e 90 nop
00007fff`63b5302f 90 nop
If the value read is equal to 5Ch then eax = 7h
00007ffd`d6627212 8d43ab lea eax,[rbx-55h] ; 5C - 55 -> 7h
00007ffd`d6627215 4883c420 add rsp,20h
00007ffd`d6627219 5b pop rbx
00007ffd`d662721a c3 ret
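Putting the listings together, the branches seen so far can be summarized with a short Python model. It is a sketch, not the decompiled source: get_gen_lex is a hypothetical name, char_tab is any bytes-like object standing in for sqllang!CharTab, the CCompatLevel parameter is left out because the instructions shown never use it, and the path for values of 80h or more that are not 0FFFFFFFFh is deliberately not modelled.

```python
def get_gen_lex(value, char_tab):
    """Hypothetical model of the branches shown in the disassembly.

    value    -- value read from the input string (EDX)
    char_tab -- bytes-like stand-in for sqllang!CharTab (4 bytes per entry)
    """
    value &= 0xFFFFFFFF              # EDX is a 32-bit register
    if value >= 0x80:                # cmp edx,80h / jae (unsigned compare)
        if value == 0xFFFFFFFF:      # cmp ebx,0FFFFFFFFh
            return 9                 # mov eax,9
        raise NotImplementedError("path not covered in this post")
    if value == 0x5C:                # cmp ebx,5Ch / je
        return value - 0x55          # lea eax,[rbx-55h] -> 7
    return char_tab[value * 4]       # movzx eax,byte ptr [rcx+rbx*4]


# Usage with a placeholder table (not the real CharTab contents).
demo_tab = bytearray(0x80 * 4)
demo_tab[0x53 * 4] = 1               # pretend 53h is classified as type 01
print(get_gen_lex(0x53, demo_tab))
print(get_gen_lex(0x5C, demo_tab))
print(get_gen_lex(0xFFFFFFFF, demo_tab))
```

Note that the comparison with 80h is unsigned, which is why 0FFFFFFFFh (that is, -1 as a signed value) ends up in the "80h or greater" branch.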
And now?
Well, for now no more details about this procedure (I know we did not talk about what happens when the value is 80h or greater and not equal to 0FFFFFFFFh...)
Next time we will talk about another procedure part of the Parser called LGetToken.
Another step inside the parser..
That's all folks but remember:
If you liked this post leave a comment, subscribe to the blog and wait for the next post!
Luca