paulcoding810 · July 14, 2024 04:42
diff --git a/smali-cheatsheet.txt b/smali-cheatsheet.txt
 A little help in Smali

 (To be supplemented)

 # 
 general information
 # 
 Smali
 Types
 Dalvik bytecode has two main type classes, primitive types and reference types. Reference types are objects and arrays, everything else is primitive.
 Primitives are represented by a single letter.

 V - Void - can only be used for return types

 Z - Boolean (logical)

 B - Byte (byte)

 S - Short

 C - Char

 I - Integer

 J - Long (64 bits)

 F - Float (floating)

 D - Double (64 bits)

 Objects take the form Lpackage/name/ObjectName; - where the leading "L" indicates that this is an object type, package/name/ is the package that contains the object, ObjectName is the name of the object, and ";" marks the end of the object name. This is equivalent to package.name.ObjectName in Java. For a more concrete example, Ljava/lang/String; is equivalent to java.lang.String.
 Arrays take the form [I - this is a one-dimensional array of integers, i.e. int[] in Java. For multidimensional arrays, you simply add more "[" characters: [[I = int[][], [[[I = int[][][], etc. (Note: the maximum number of dimensions you can have is 255.)
 You can also have arrays of objects: [Ljava/lang/String; is an array of strings.
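The descriptor rules for types can be sketched in Python. This is only an illustration: descriptor_to_java and PRIMITIVES are hypothetical names, not part of smali or apktool, and the helper does not reject odd cases such as an array of void.

```python
# Hypothetical helper for illustration; not part of smali or apktool.
PRIMITIVES = {
    "V": "void", "Z": "boolean", "B": "byte", "S": "short", "C": "char",
    "I": "int", "J": "long", "F": "float", "D": "double",
}

def descriptor_to_java(desc: str) -> str:
    """Convert one Dalvik type descriptor to Java-style notation."""
    dims = len(desc) - len(desc.lstrip("["))  # number of leading '[' characters
    if dims > 255:
        raise ValueError("arrays can have at most 255 dimensions")
    base = desc[dims:]
    if base in PRIMITIVES:
        name = PRIMITIVES[base]
    elif base.startswith("L") and base.endswith(";") and len(base) > 2:
        name = base[1:-1].replace("/", ".")  # Ljava/lang/String; -> java.lang.String
    else:
        raise ValueError(f"not a valid type descriptor: {desc!r}")
    return name + "[]" * dims

print(descriptor_to_java("Ljava/lang/String;"))   # java.lang.String
print(descriptor_to_java("[[I"))                  # int[][]
print(descriptor_to_java("[Ljava/lang/String;"))  # java.lang.String[]
```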

 Methods
 Methods are always specified in very detailed form, which includes the type that contains the method, the name of the method, the types of the parameters, and the return type. All this information is needed so that the virtual machine can find the correct method and be able to perform static analysis on bytecode.
 They take the form Lpackage/name/ObjectName;->MethodName(III)Z
 In this example, you should recognize Lpackage/name/ObjectName; as a type. MethodName is the name of the method. (III)Z is the signature of the method. III are the parameters (in this case, 3 integers), and Z is the return type (boolean).
 Method parameters are listed one after the other, with no separators between them.
 Here's a more complex example:
 Lpackage/name/ObjectName;->MethodName(I[[IILjava/lang/String;[Ljava/lang/Object;)Ljava/lang/String;
 In Java, this would be
 String MethodName(int, int[][], int, String, Object[])
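As a rough illustration of how a method reference breaks down, the Python sketch below splits one into owner, name, parameters and return type. The names parse_method, to_java and TYPE_RE are made up for this example and are not a real smali API.

```python
import re

# Hypothetical helpers for illustration; not part of smali or apktool.
PRIMITIVES = {
    "V": "void", "Z": "boolean", "B": "byte", "S": "short", "C": "char",
    "I": "int", "J": "long", "F": "float", "D": "double",
}

# One type descriptor: any number of '[' followed by a primitive letter or L...;
TYPE_RE = re.compile(r"\[*(?:[VZBSCIJFD]|L[^;]+;)")

def to_java(desc: str) -> str:
    """Convert a single, already validated descriptor to Java-style notation."""
    dims = len(desc) - len(desc.lstrip("["))
    base = desc[dims:]
    name = PRIMITIVES.get(base) or base[1:-1].replace("/", ".")
    return name + "[]" * dims

def parse_method(ref: str):
    """Split 'Lowner;->name(params)ret' into (owner, name, params, return type)."""
    owner, rest = ref.split("->", 1)
    name, rest = rest.split("(", 1)
    params, ret = rest.split(")", 1)
    tokens = TYPE_RE.findall(params)
    if "".join(tokens) != params:  # parameters have no separators, so they must tile exactly
        raise ValueError(f"malformed parameter list: {params!r}")
    return to_java(owner), name, [to_java(t) for t in tokens], to_java(ret)

ref = ("Lpackage/name/ObjectName;->MethodName"
       "(I[[IILjava/lang/String;[Ljava/lang/Object;)Ljava/lang/String;")
print(parse_method(ref))
```

Because parameters are written back to back, the parser has to consume them greedily one descriptor at a time; the regular expression does exactly that.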

 Fields
 Fields are also always specified in a verbose form that includes the type containing the field, the field name, and the field type. Again, this allows the VM to find the correct field and also perform static analysis on the bytecode.
 They take the form Lpackage/name/ObjectName;->FieldName:Ljava/lang/String;
 It should be pretty obvious - it's the package and object name, the field name and the field type, respectively.
 # 
 Registers
 Introduction
 In Dalvik bytecode, registers are always 32 bits wide and can hold any type of value. Two registers are used to store 64-bit types (long and double).

 Specifying the number of registers in a method
 There are two ways to specify how many registers are available in a method. The .registers directive specifies the total number of registers in the method, while the alternative .locals directive specifies the number of non-parameter registers in the method. With .locals, the total number of registers also includes however many registers are needed to hold the method parameters.

 How method parameters are passed to a method
 When the method is called, the parameters of the method are placed in the last n registers. If the method has 2 arguments and 5 registers (v0-v4), the arguments will be placed in the last 2 registers - v3 and v4.
 The first parameter for non-static methods is always the object on which the method is called (the this object).
 For example, let's say you are writing a non-static method LMyObject;->callMe(II)V. This method has 2 integer parameters, but also has an implicit LMyObject; parameter before both integer parameters, so the method has 3 arguments in total.
 Suppose you specified that there are 5 registers in the method (v0-v4), either with the directive .registers 5 or with the directive .locals 2 (i.e. 2 local registers + 3 parameter registers). When the method is called, the object on which the method is invoked (i.e. the this reference) will be in v2, the first integer parameter will be in v3, and the second integer parameter will be in v4.
 For static methods it is the same, except that there is no implicit this argument.

 Register names
 There are two naming schemes for registers - the usual v# naming scheme and the p# naming scheme for parameter registers. The first register in the p# naming scheme is the first parameter register in the method. So, let's go back to the previous example of a method with 3 arguments and 5 registers in total. The following table shows the usual v# name for each register, followed by the p# name for parameter registers.

 v0 First local register

 v1 Second local register

 v2 p0 First parameter register

 v3 p1 Second parameter register

 v4 p2 Third parameter register
 You can refer to parameter registers by either name - it makes no difference.

 Why parameter registers were introduced
 The p# naming scheme was introduced as a practical matter.
 Let's say you have an existing method with multiple parameters, you add some code to that method, and you find that you need an extra register. You think: "It's okay, I'll just increase the number of registers specified in the .registers directive!"
 Unfortunately, it is not that easy. Remember that method parameters are stored in the last registers of the method. If you increase the number of registers, you change which registers the method arguments are placed in. Therefore, you would have to change the .registers directive and renumber every parameter register.
 But if the p# naming scheme is used to refer to parameter registers throughout the method, you can easily change the number of registers in the method without worrying about renumbering any existing registers.

 Long / Double values
 As mentioned earlier, long and double primitives (J and D respectively) are 64-bit values and require 2 registers. This is important to keep in mind when referring to method arguments. For example, suppose you have a (non-static) method LMyObject;->MyMethod(IJZ)V. The method parameters are LMyObject;, int, long, boolean. Thus, its parameters require 5 registers in total.

 p0 this

 p1 I

 p2, p3 J

 p4 Z

 Also, when you call the method later, you need to specify both registers for any double-wide arguments in the register list of the invoke instruction.
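The mapping between p# and v# names, including the two registers taken by long and double parameters, can be sketched in Python. param_registers is a hypothetical helper written for this cheatsheet; it assumes the method is declared with a .locals count.

```python
# Hypothetical helper for illustration; assumes a ".locals N" style declaration.
def param_registers(locals_count, param_types, is_static=False):
    """Return a list of (p_name, v_name, type) for every parameter register."""
    # Non-static methods get an implicit "this" reference as the first parameter.
    types = list(param_types) if is_static else ["this"] + list(param_types)
    layout, p = [], 0
    for t in types:
        width = 2 if t in ("J", "D") else 1  # long and double take two registers
        for _ in range(width):
            # Parameters occupy the last registers, right after the locals.
            layout.append((f"p{p}", f"v{locals_count + p}", t))
            p += 1
    return layout

# LMyObject;->callMe(II)V with .locals 2
for row in param_registers(2, ["I", "I"]):
    print(row)

# LMyObject;->MyMethod(IJZ)V with .locals 0: p2 and p3 both belong to the long
for row in param_registers(0, ["I", "J", "Z"]):
    print(row)
```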
 # 
 array (arrays)
 array-length vA, vB
 A: Destination register (4 bits)
 B: Array reference-bearing register (4 bits)
 Stores the length (number of entries) of the specified vB array to vA

 fill-array-data vAA, :target
 A: Register containing the array reference (8 bits)
 B: Target label of the array data table
 Fills the array referenced by vAA with the data at :target. The reference must be to an array of primitives, and the data table must match it in type and must not contain more elements than fit in the array. The element width is defined in the table.
 Note on notation: elsewhere in this cheatsheet, a register written with "+" (vX+) is a register pair occupying vX and vX+1, for example v1 and v2.
 Example data table:
 :target
 .array-data 0x2
 0x01 0x02
 0x03 0x04
 .end array-data

 new-array vA, vB, type
 A: Destination register (4 bits)
 B: Size register (4 bits)
 C: Type reference (16 bits), for example [I
 Creates a new array of the specified type and size. The type must be an array type.

 filled-new-array {vA, vB, .., vX}, type
 vA-vX: Argument registers (4 bits each, at most 5 registers)
 B: Type reference (16 bits)
 Creates a new array of the specified type and size, filling it with the supplied contents. The type must be an array type. A reference to the newly created array can be obtained with the move-result-object instruction, immediately after the filled-new-array instruction.

 filled-new-array/range {vA .. vX}, type
 vA .. vX: Range of consecutive registers containing the array contents (first register: 16 bits, register count: 8 bits)
 B: Type reference (16 bits)
 Creates a new array of the specified type, filling it with the supplied contents. The type must be an array type. A reference to the newly created array can be obtained with the move-result-object instruction, immediately after the filled-new-array/range instruction.
 # 
 array accessors
 Legend:
 A (aget): Destination register
 A (aput): Source register
 B: Array reference
 C: Index in the array
 aget vA, vB, vC

 Retrieves the integer value at index vC from the array referenced by vB and stores it in vA

 aput vA, vB, vC

 Stores the integer value from vA in the array referenced by vB at the index of vC
 There are other aget/aput variants; adding a suffix changes the value type. For example: aget-object (gets an object reference).
 -boolean

 -byte

 -char

 -object

 -short

 -wide
 # 
 comparison
 Legend:
 A: Destination register
 B: First source register
 C: Second source register
 B +: First pair of source registers (pair)
 C +: Second pair of source registers (pair)
 cmp-long vA, vB+, vC+
 Compares the long values in the source register pairs and stores the result in vA:
 0 if vB+ == vC+;
 1 if vB+ > vC+;
 -1 if vB+ < vC+.

 cmpg-double vA, vB+, vC+
 Compares the double values in the source register pairs and stores the result in vA:
 0 if vB+ == vC+;
 1 if vB+ > vC+;
 -1 if vB+ < vC+.
 If vB+ or vC+ is not a number (NaN), the result is 1.

 cmpg-float vA, vB, vC
 Compares the float values in the source registers and stores the result in vA:
 0 if vB == vC;
 1 if vB > vC;
 -1 if vB < vC.
 If vB or vC is not a number (NaN), the result is 1.

 cmpl-double vA, vB+, vC+
 Compares the double values in the source register pairs and stores the result in vA:
 0 if vB+ == vC+;
 1 if vB+ > vC+;
 -1 if vB+ < vC+.
 If vB+ or vC+ is not a number (NaN), the result is -1.

 cmpl-float vA, vB, vC
 Compares the float values in the source registers and stores the result in vA:
 0 if vB == vC;
 1 if vB > vC;
 -1 if vB < vC.
 If vB or vC is not a number (NaN), the result is -1.
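A minimal Python model of the comparison results may help: the result is three-way (0 for equal, 1 for greater, -1 for less), and the cmpl/cmpg variants differ only in the value they produce when an operand is NaN. The function names mirror the opcodes but are hypothetical helpers, and Python floats and ints stand in for the Dalvik values.

```python
import math

# Hypothetical Python models of the cmp* instructions, for illustration only.
def cmp_long(b: int, c: int) -> int:
    """0 if b == c, 1 if b > c, -1 if b < c."""
    return (b > c) - (b < c)

def cmpl(b: float, c: float) -> int:
    """cmpl-float / cmpl-double: NaN biases the result to -1."""
    if math.isnan(b) or math.isnan(c):
        return -1
    return (b > c) - (b < c)

def cmpg(b: float, c: float) -> int:
    """cmpg-float / cmpg-double: NaN biases the result to 1."""
    if math.isnan(b) or math.isnan(c):
        return 1
    return (b > c) - (b < c)

print(cmp_long(7, 5), cmp_long(5, 5), cmp_long(3, 5))
print(cmpl(float("nan"), 1.0), cmpg(float("nan"), 1.0))
```

The NaN bias lets a compiler pick the variant for which a comparison involving NaN falls through to the "false" branch of the following if instruction.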
 # 
 const
 const vAA, #+BBBBBBBB
 A: Destination register (8 bits)
 B: 32-bit signed integer constant
 Moves the specified integer constant into the vAA register.

 const/16 vAA, #+BBBB
 A: Destination register (8 bits)
 B: Signed integer (16 bits)
 Moves the literal #+BBBB (sign-extended to 32 bits) into the vAA register.

 const/4 vA, #+B
 A: Destination register (4 bits)
 B: Signed integer (4 bits)
 Moves the specified 4-bit integer constant (sign-extended to 32 bits) into the destination register vA.

 const/high16 vAA, #+BBBB
 A: Destination register (8 bits)
 B: Integer (16 bits)
 Places the 16-bit constant in the upper 16 bits of the vAA register; the lower 16 bits are zero. Commonly used to initialize float values.
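To see why const/high16 suits float values, the Python sketch below checks whether a float's 32-bit pattern has all-zero low bits and, if so, returns the 16-bit value that would be encoded. float_bits and high16_operand are hypothetical helper names; how the literal is spelled in smali text depends on the tool.

```python
import struct

# Hypothetical helpers for illustration; not part of smali or apktool.
def float_bits(value: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of value as an int."""
    return struct.unpack(">I", struct.pack(">f", value))[0]

def high16_operand(value: float):
    """Return the 16-bit operand for const/high16, or None if it cannot be used."""
    bits = float_bits(value)
    if bits & 0xFFFF:          # low 16 bits are not zero: a full const is needed
        return None
    return bits >> 16

print(hex(float_bits(1.0)))       # 0x3f800000
print(hex(high16_operand(1.0)))   # 0x3f80
print(high16_operand(0.1))        # None
```

Many round float values such as 1.0, 2.0 or 0.5 have zero low bits, so they fit the shorter encoding.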

 const-class vAA, Lclass;
 A: Destination register (8 bits)
 class: Class reference
 Moves a reference to the specified class into the vAA destination register. If the specified type is primitive, this stores a reference to the special class of the primitive type.

 const-string vAA, "BBBB"
 A: Destination register (8 bits)
 B: String value
 Moves a reference to the specified string into the vAA destination register.

 const-string/jumbo vAA, "BBBBBBBB"
 A: Destination register (8 bits)
 B: String value
 Moves a reference to the specified string into the vAA destination register.
 jumbo - indicates that the string is referenced by a 32-bit index, which is needed when its index in the string table does not fit in 16 bits

 const-wide/16 vA+, #+BBBB
 # No description yet

 const-wide/high16 vA+, #+BBBB
 # No description yet

 const-wide vA+, #+BBBBBBBBBBBBBBBB
 # No description yet
 # 
 goto
 goto - Unconditional jump to :target.
 goto :target # 8-bit offset

 goto/16 :target # 16-bit offset

 goto/32 :target # 32-bit offset
 Note: goto actually uses +/- offsets from the current instruction. APKTool converts them to labels for readability. If the offset needs a 16-bit value, goto/16 must be used, and if it needs a 32-bit value, goto/32 must be used. When adding new code, it is almost impossible to tell whether goto/16 or goto/32 is required (unless you know the offset for sure). If you are not sure which width is needed, goto/16 can replace any goto, and goto/32 can replace any goto/16 or goto.
 The replacement does not work in the other direction: goto cannot replace goto/16, and goto/16 in turn cannot replace goto/32.
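The choice between the three forms comes down to the signed range of the branch offset, which is counted in 16-bit code units. A small Python sketch, assuming the offset is already known (smallest_goto is a hypothetical name):

```python
# Hypothetical helper for illustration; offsets are in 16-bit code units.
def smallest_goto(offset: int) -> str:
    """Return the shortest goto form whose signed offset field can hold offset."""
    if offset == 0:
        # The Dalvik spec forbids a zero offset for goto and goto/16.
        return "goto/32"
    if -0x80 <= offset <= 0x7F:
        return "goto"        # 8-bit signed offset
    if -0x8000 <= offset <= 0x7FFF:
        return "goto/16"     # 16-bit signed offset
    if -0x80000000 <= offset <= 0x7FFFFFFF:
        return "goto/32"     # 32-bit signed offset
    raise ValueError("offset does not fit in 32 bits")

print(smallest_goto(100))     # goto
print(smallest_goto(-200))    # goto/16
print(smallest_goto(40000))   # goto/32
```

A wider form is always a valid substitute for a narrower one, which is why goto/32 is the safe choice when the offset is unknown.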
 # 
 if
 Legend:
 A: First register to check (integer)
 B: Second register to check (integer)
 target: Target label
 Note: != means not equal
 if-eq vA, vB, :target
 Execution jumps to :target if vA == vB

 if-eqz vA, :target
 Jumps to :target if vA == 0

 if-ge vA, vB, :target
 Jumps to :target if vA >= vB

 if-gez vA, :target
 Jumps to :target if vA >= 0

 if-gt vA, vB, :target
 Jumps to :target if vA > vB

 if-gtz vA, :target
 Jumps to :target if vA > 0

 if-le vA, vB, :target
 Jumps to :target if vA <= vB

 if-lez vA, :target
 Jumps to :target if vA <= 0

 if-lt vA, vB, :target
 Jumps to :target if vA < vB

 if-ltz vA, :target
 Jumps to :target if vA < 0

 if-ne vA, vB, :target
 Jumps to :target if vA != vB

 if-nez vA, :target
 Jumps to :target if vA != 0
 # 
 invoke
 Legend:
 vA-vX: Arguments passed to the method
 class: The name of the class containing the method
 method: The name of the method to call
 R: Return type.
 invoke-direct {vA, .., vX}, Lclass;->method()R
 Calls a non-static direct method (that is, an instance method that by its nature cannot be overridden, namely either a private instance method or a constructor).

 invoke-interface {vA, .., vX}, Lclass;->method()R
 Calls an interface method (that is, a method called on an object whose concrete class is unknown, using a method reference that refers to an interface).

 invoke-static {vA, .., vX}, Lclass;->method()R
 Calls a static method (which is always considered a direct method).

 invoke-super {vA, .., vX}, Lclass;->method()R
 Calls the virtual method of the immediate parent class.

 invoke-virtual {vA, .., vX}, Lclass;->method()R
 Calls a virtual method (a method that is not private, static or final, and is not a constructor).
 Note:
 If the method returns a value (R is not "V" for void), the result must be captured on the very next line by one of the move-result instructions, or it will be lost.

 Instead of listing all the vA-vX arguments, you can pass a range of registers by adding the /range suffix. For example: invoke-direct/range {vA .. vX}, Lclass;->method()R. This can be done with any of the invoke instructions above.
 invoke-direct {v1, v2, v3} is the same as invoke-direct/range {v1 .. v3}
 invoke-direct {v0} is the same as invoke-direct/range {v0 .. v0}

 A common source of errors is using invoke-virtual {vX} instead of invoke-virtual/range {vX .. vX} in methods with a large number of local registers. The non-range form encodes each register number in 4 bits, so it can only reference v0-v15; a register such as v22 requires the /range form.
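The rule for choosing between the plain and /range forms can be sketched in Python. invoke_operands is a hypothetical helper; it relies on the Dalvik encoding limits (at most 5 registers of 4 bits each for the plain form, a consecutive block of at most 255 registers for /range).

```python
# Hypothetical helper for illustration; not part of smali or apktool.
def invoke_operands(registers):
    """Return (suffix, register_list) for an invoke using these v-register numbers."""
    regs = list(registers)
    if any(r < 0 or r > 0xFFFF for r in regs):
        raise ValueError("register numbers must be between 0 and 65535")
    # Plain form: up to 5 registers, each encoded in 4 bits (v0-v15).
    if len(regs) <= 5 and all(r <= 15 for r in regs):
        return "", "{" + ", ".join(f"v{r}" for r in regs) + "}"
    # Range form: any register numbers, but they must be consecutive.
    consecutive = all(b == a + 1 for a, b in zip(regs, regs[1:]))
    if regs and len(regs) <= 255 and consecutive:
        return "/range", f"{{v{regs[0]} .. v{regs[-1]}}}"
    raise ValueError("move the arguments into consecutive registers first")

print(invoke_operands([1, 2, 3]))   # ('', '{v1, v2, v3}')
print(invoke_operands([22]))        # ('/range', '{v22 .. v22}')
```

When neither form fits, the usual fix is to copy the arguments into a consecutive block with the move/from16 family of instructions before the call.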
 # 
 misc
 check-cast vAA, Lclass;
 A: Reference register (8 bits)
 B: Type reference (16 bits)
 Checks whether the object reference in vAA can be cast to the type referenced by class.
 Throws a ClassCastException if this is not possible; otherwise execution continues.

 instance-of vA, vB, Lclass
 A: Destination register (4 bits)
 B: Reference register (4 bits)
 C: Class reference (16 bits)
 # No description yet

 new-instance vAA, Lclass;
 A: Destination register (8 bits)
 B: Type reference
 Creates a new instance of the specified type and places a reference to it in vAA.
 The type must refer to a non-array class.

 nop
 Empty command / No operation

 throw vAA
 A: Exception-bearing register (8 bits)
 Throws the specified exception. The exception object reference is in vAA.
 # 
 move
 Legend:
 A: Destination register (4, 8 or 16 bits)
 B: Source register (4 or 16 bits)
 The "#A: x bits. B: x bits" annotations are not part of the code; they are added only to show the register widths.
 move vA, vB #A: 4 bits. B: 4 bits
 Moves the contents of one non-object register to another.

 move/16 vAAAA, vBBBB #A: 16 bits. B: 16 bits
 Does the same as move, but the source and destination registers are 16 bits wide.

 move/from16 vAA, vBBBB #A: 8 bits. B: 16 bits
 Does the same as move/16, but the destination register is only 8 bits wide.

 move-exception vAA #A: 8 bits
 Saves the just-caught exception to vAA. This must be the first instruction of any exception handler whose caught exception is not to be ignored, and this instruction may only ever occur as the first instruction of an exception handler. (P.S. the tautology is unavoidable.)

 move-object vA, vB #A: 4 bits. B: 4 bits
 Moves the contents of one object-bearing register to another.

 move-object/16 vAAAA, vBBBB #A: 16 bits. B: 16 bits
 Does the same as move-object, but the source and destination registers are 16 bits wide.

 move-object/from16 vAA, vBBBB #A: 8 bits. B: 16 bits
 Does the same as move-object/16, but the destination register is only 8 bits wide.

 move-result vAA #A: 8 bits.
 Moves the single-word, non-object result of the most recent invoke into vAA. This must be the instruction immediately after an invoke whose (single-word, non-object) result is not to be ignored.

 move-result-object vAA #A: 8 bits.
 Moves the object result of the most recent invoke into vAA. This must be the instruction immediately after an invoke or filled-new-array whose (object) result is not to be ignored.

 move-result-wide vA+ #A: 8 bits.
 # No description yet

 move-wide vA+, vB+ #A: 4 bits. B: 4 bits
 # No description yet

 move-wide/16 vA+, vB+ #A: 16 bits. B: 16 bits
 # No description yet

 move-wide/from16 vA+, vBBBB #A: 8 bits. B: 16 bits
 # No description yet
 # 
 operations
 ADD operator - adds the values on either side of the operator
 # 
 add-double vA+, vB+, vC+
 A: Destination register pair (8 bits)
 B: Source register pair 1 (8 bits)
 C: Source register pair 2 (8 bits)
 Calculates vB+ + vC+ and stores the result in vA+

 add-double/2addr vA+, vB+
 A: Source register pair 1 / destination register pair (4 bits)
 B: Source register pair 2 (4 bits)
 Calculates vA+ + vB+ and stores the result in vA+

 add-float vA, vB, vC
 A: Destination register (8 bits)
 B: Source register 1 (8 bits)
 C: Source register 2 (8 bits)
 Calculates vB + vC and stores the result in vA

 add-float/2addr vA, vB
 A: Source register 1 / destination register (4 bits)
 B: Source register 2 (4 bits)
 Calculates vA + vB and stores the result in vA

 add-int vA, vB, vC
 A: Destination register (8 bits)
 B: Source register 1 (8 bits)
 C: Source register 2 (8 bits)
 Calculates vB + vC and stores the result in vA

 add-int/lit8 vA, vB, 0xC
 A: Destination register (8 bits)
 B: Source register (8 bits)
 C: Signed constant value (8 bits)
 Calculates vB + 0xC and stores the result in vA

 add-int/lit16 vA, vB, 0xC
 A: Destination register (4 bits)
 B: Source register (4 bits)
 C: Signed constant value (16 bits)
 Calculates vB + 0xC and stores the result in vA

 add-int/2addr vA, vB
 A: Source register 1 / destination register (4 bits)
 B: Source register 2 (4 bits)
 Calculates vA + vB and stores the result in vA


 AND operator - a bitwise operator that copies a bit into the result if it is set in both operands.

 # 
 # No description yet


 DIV operator - divides the left operand by the right operand

 # 
 # No description yet


 MUL operator - multiplies the values on either side of the operator

 # 
 # No description yet


 OR operator - copies a bit into the result if it is set in either operand.

 # 
 # No description yet


 REM operator - divides the left operand by the right operand and returns the remainder

 # 
 # No description yet


 SHL operator - the value of the left operand is shifted left by the number of bits specified by the right operand.

 # 
 # No description yet


 SHR operator - the value of the left operand is shifted right by the number of bits specified by the right operand.

 # 
 # No description yet


 SUB operator - subtracts the right operand from the left operand

 # 
 # No description yet


 USHR operator - # no description

 # 
 # No description yet


 XOR operator - copies a bit into the result if it is set in one operand, but not in both.

 # 
 # No description yet

 # 
 return
 The return instruction is used to return explicitly from a method; that is, it transfers control back to the caller and instructs the interpreter to stop executing the current method. If the method returns a value, the return instruction names the register that holds it, and that value becomes the return value of the method.
 return vAA
 A: Return value register (8 bits)
 Returns from a method that returns a single-width (32-bit) non-object value, using the value in vAA.

 return-object vAA
 A: Return value register (8 bits)
 Returns from an object-returning method, using the object reference in vAA.

 return-void
 Returns from a void method with no value.

 return-wide vA+
 A: Return value register pair (8 bits)
 Returns a double / long (64-bit) value from vA+.
 # 
 switch
 Legend:
 A: The register that is being checked
 target: Target label of packed-switch table (switches)
 packed-switch vAA, :target
 Implements a switch statement where the case constants are sequential. The instruction uses an index table; vAA indexes into this table to find the instruction offset for a specific case. If vAA falls outside the index table, execution continues with the next instruction (default case). packed-switch is used when the possible vAA values are sequential, regardless of the lowest value.
 Example of a switch table:
 :target
 .packed-switch 0x1 # 0x1 = lowest case value of vAA
 :pswitch_0 # Jump to pswitch_0 if vAA == 0x1
 :pswitch_1 # Jump to pswitch_1 if vAA == 0x2
 .end packed-switch

 sparse-switch vAA, :target
 Implements a switch statement where the case constants are not sequential. The instruction uses a lookup table with case constants and an offset for each case constant. If there is no match in the table, execution continues with the next instruction (default case).
 :target
 .sparse-switch
 0x3 -> :sswitch_1 # Jump to sswitch_1 if vAA == 0x3
 0x65 -> :sswitch_2 # Jump to sswitch_2 if vAA == 0x65
 .end sparse-switch
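
The two table lookups can be modelled in Python to show how the default case arises. packed_switch and sparse_switch are hypothetical functions, and label names stand in for the instruction offsets stored in the real tables.

```python
# Hypothetical models of the two switch lookups, for illustration only.
DEFAULT = "next instruction"

def packed_switch(value, first_key, targets):
    """Index table: targets[i] handles the case first_key + i."""
    index = value - first_key
    if 0 <= index < len(targets):
        return targets[index]
    return DEFAULT  # value falls outside the table

def sparse_switch(value, table):
    """Lookup table: explicit case constant -> target pairs."""
    return table.get(value, DEFAULT)

# Mirrors the packed-switch example: first key 0x1, two targets
print(packed_switch(0x2, 0x1, ["pswitch_0", "pswitch_1"]))   # pswitch_1
print(packed_switch(0x5, 0x1, ["pswitch_0", "pswitch_1"]))   # next instruction

# Mirrors the sparse-switch example
print(sparse_switch(0x65, {0x3: "sswitch_1", 0x65: "sswitch_2"}))  # sswitch_2
```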