Instruction Set
-
Features
-
1 Operands are written destination first, then source.
-
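2 Depending on the size and type of the data a bytecode operates on, some bytecodes take a name suffix to disambiguate them:
● Bytecodes for ordinary 32-bit types take no suffix.
● Bytecodes for ordinary 64-bit types take the -wide suffix.
● Bytecodes for special types take a type-specific suffix, one of -boolean, -byte, -char, -short, -int, -long, -float, -double, -object, -string, or -void.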
-
3 Depending on the bytecode layout and options, some bytecodes take an opcode suffix to disambiguate them. The suffix is separated from the main bytecode name by a slash "/".
-
4 In the instruction descriptions, each letter in an operand's width notation stands for 4 bits.
For example: move-wide/from16 vAA, vBBBB
In this instruction, move is the base bytecode and names the basic operation. wide is the name suffix and indicates that the instruction operates on 64-bit data. from16 is the bytecode suffix (opcode suffix) and indicates that the source is a 16-bit register reference. vAA is the destination register; the destination always comes before the source, and its range is v0~v255. vBBBB is the source register, with the range v0~v65535. Most instructions in the instruction set use registers as destination or source operands. A/B/C/D/E/F/G/H each denote a 4-bit value, which can address registers v0~v15; AA/BB/.../HH denote 8-bit values; AAAA/BBBB/.../HHHH denote 16-bit values.
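To make the width notation concrete, here is a minimal Java sketch that decodes the two 16-bit code units of an instruction such as move-wide/from16 vAA, vBBBB. The class and method names are made up for illustration; the opcode value 0x05 and the "AA|op BBBB" layout (format 22x) come from the Dalvik bytecode reference rather than from this article, so treat them as assumptions to check against that reference.

```java
// Hypothetical helper, for illustration only: decodes one instruction in
// Dalvik format 22x (layout "AA|op BBBB"), e.g. move-wide/from16 vAA, vBBBB.
final class Format22xDemo {
    // Opcode of move-wide/from16 as listed in the Dalvik bytecode reference.
    static final int OP_MOVE_WIDE_FROM16 = 0x05;

    // Low byte of the first code unit: the opcode.
    static int opcode(int unit0) {
        return unit0 & 0xFF;
    }

    // High byte of the first code unit: the 8-bit destination register AA (v0~v255).
    static int regAA(int unit0) {
        return (unit0 >> 8) & 0xFF;
    }

    // The whole second code unit: the 16-bit source register BBBB (v0~v65535).
    static int regBBBB(int unit1) {
        return unit1 & 0xFFFF;
    }

    static String disassemble(int unit0, int unit1) {
        if (opcode(unit0) != OP_MOVE_WIDE_FROM16) {
            throw new IllegalArgumentException("not move-wide/from16");
        }
        return "move-wide/from16 v" + regAA(unit0) + ", v" + regBBBB(unit1);
    }

    public static void main(String[] args) {
        // 0x0305: opcode 0x05, AA = 3. 0x0120: BBBB = 288.
        System.out.println(disassemble(0x0305, 0x0120));
    }
}
```

Two letters (AA) give 8 bits and therefore 256 possible registers; four letters (BBBB) give 16 bits and 65536 possible registers, which is exactly where the v0~v255 and v0~v65535 ranges come from.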
-
-
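Data manipulation instructions
The data manipulation instruction is move, with the prototype "move destination, source". The move instruction takes different suffixes according to the size and type of the data being moved, e.g.:
- "move vA, vB": assigns the value of register vB to register vA; the source and destination register indices are both 4 bits wide.
- "move/from16 vAA, vBBBB": assigns the value of register vBBBB to register vAA; the source register index is 16 bits wide and the destination register index is 8 bits wide.
- "move/16 vAAAA, vBBBB": assigns the value of register vBBBB to register vAAAA; the source and destination register indices are both 16 bits wide.
- "move-wide vA, vB": assigns a register pair (a 64-bit value); the source and destination register indices are both 4 bits wide.
- "move-object vA, vB": assigns the object reference in register vB to register vA; both register indices are 4 bits wide.
- "move-result vAA": assigns the single-word (32-bit) non-object result of the most recent invoke (method call) instruction to register vAA.
- "move-result-wide vAA": assigns the double-word (64-bit) non-object result of the most recent invoke instruction to register pair vAA.
- "move-result-object vAA": assigns the object result of the most recent invoke instruction to register vAA.
- "move-exception vAA": saves the exception that was just caught at run time into register vAA.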
-
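Data definition instructions
Data definition instructions define the constants, strings, classes, and other data used in a program. The base bytecode is const.
- const/4 vA, #+B: sign-extends the literal to 32 bits and assigns it to register vA.
- const/16 vAA, #+BBBB: sign-extends the literal to 32 bits and assigns it to register vAA.
- const vAA, #+BBBBBBBB: assigns the literal to register vAA.
- const/high16 vAA, #+BBBB0000: zero-extends the literal on the right to 32 bits and assigns it to register vAA.
- const-wide/16 vAA, #+BBBB: sign-extends the literal to 64 bits and assigns it to register pair vAA.
- const-wide vAA, #+BBBBBBBBBBBBBBBB: assigns the literal to register pair vAA.
- const-wide/high16 vAA, #+BBBB000000000000: zero-extends the literal on the right to 64 bits and assigns it to register pair vAA.
- const-string vAA, string@BBBB: builds a string from the string index and assigns a reference to it to register vAA.
- const-string/jumbo vAA, string@BBBBBBBB: builds a string from a (larger) string index and assigns a reference to it to register vAA.
- const-class vAA, type@BBBB: obtains a class reference from the type index and assigns it to register vAA.
- const-class/jumbo vAAAA, type@BBBBBBBB: obtains a class reference from the given type index and assigns it to register vAAAA (the opcode of this instruction occupies two bytes, with the value 0x00ff; it was added in Android 4.0).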
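The difference between "sign-extend" and "zero-extend on the right" is easy to mix up, so here is a small Java sketch of what ends up in the register for a few of the const variants. The class and method names are hypothetical and the code only models the arithmetic described in the list above; it is not how a real VM is implemented.

```java
// Hypothetical model of how some const variants expand their literal.
final class ConstDemo {
    // const/4 vA, #+B: the 4-bit literal is sign-extended to 32 bits.
    static int const4(int nibble) {
        return (nibble << 28) >> 28; // arithmetic shift copies the sign bit
    }

    // const/16 vAA, #+BBBB: the 16-bit literal is sign-extended to 32 bits.
    static int const16(int literal) {
        return (short) literal;
    }

    // const/high16 vAA, #+BBBB0000: the literal fills the high 16 bits,
    // the low 16 bits are zero.
    static int constHigh16(int literal) {
        return (literal & 0xFFFF) << 16;
    }

    // const-wide/high16 vAA, #+BBBB000000000000: the literal fills the
    // top 16 bits of a 64-bit value, the remaining 48 bits are zero.
    static long constWideHigh16(int literal) {
        return ((long) (literal & 0xFFFF)) << 48;
    }

    public static void main(String[] args) {
        System.out.println(const4(0xF));   // 4-bit 1111 read as a signed value
        System.out.println(Float.intBitsToFloat(constHigh16(0x3F80)));
    }
}
```

The high16 forms are handy for floating-point constants: many common float and double values have all their significant bits in the top 16 bits, so one short instruction is enough to load them.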
-
Return instructions
A return instruction is the last instruction executed when a function finishes. The base bytecode is return, and there are four return instructions:
- "return-void": returns nothing.
- "return vAA": returns a 32-bit non-object value.
- "return-wide vAA": returns a 64-bit non-object value.
- "return-object vAA": returns an object reference.
-
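Array manipulation instructions
Array operations include reading an array's length, creating an array, filling an array, and reading and writing array elements.
- array-length vA, vB: gets the length of the array in register vB and assigns it to register vA; the length is the number of entries in the array.
- new-array vA, vB, type@CCCC: constructs an array of the given type (type@CCCC) and size (vB) and assigns a reference to it to register vA.
- new-array/jumbo vAAAA, vBBBB, type@CCCCCCCC: same function as the previous instruction, but with larger ranges for the registers and the type index (added in Android 4.0).
- filled-new-array {vC, vD, vE, vF, vG}, type@BBBB: constructs an array of the given type (type@BBBB) and size (vA) and fills in its contents. vA is used implicitly: besides the array size, it also gives the number of arguments. vC~vG are the argument registers used.
- filled-new-array/range {vCCCC, ..., vNNNN}, type@BBBB: same function as the previous instruction, but the argument registers are given as a range through the range opcode suffix; vCCCC is the first argument register and N = A + C - 1.
- filled-new-array/jumbo {vCCCC, ..., vNNNN}, type@BBBBBBBB: same function as the previous instruction, but with larger ranges for the registers and the type index (added in Android 4.0).
- fill-array-data vAA, +BBBBBBBB: fills the array with the given data. vAA holds the array reference, which must refer to an array of a primitive type; the offset +BBBBBBBB points to the data table.
- arrayop vAA, vBB, vCC: reads or writes an element of the array in register vBB. vCC gives the element index, and vAA holds the value read or the value to store. Reads use the aget family and writes use the aput family; the suffix depends on the element type stored in the array. The instructions are aget, aget-wide, aget-object, aget-boolean, aget-byte, aget-char, aget-short, aput, aput-wide, aput-object, aput-boolean, aput-byte, aput-char, and aput-short.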
-
Data conversion instructions
Data conversion instructions convert a value of one type into another, in the format unop vA, vB. Register vB (or register pair vB) holds the value to convert, and the result is stored in register vA (or register pair vA).
- neg-int: negates an int (two's complement)
- not-int: bitwise NOT of an int (one's complement)
- neg-long: negates a long
- not-long: bitwise NOT of a long
- neg-float: negates a float
- neg-double: negates a double
- int-to-long: converts an int to a long
- int-to-float: converts an int to a float
- int-to-double: converts an int to a double
- long-to-int: converts a long to an int
- long-to-float: converts a long to a float
- long-to-double: converts a long to a double
- float-to-int: converts a float to an int
- float-to-long: converts a float to a long
- float-to-double: converts a float to a double
- double-to-int: converts a double to an int
- double-to-long: converts a double to a long
- double-to-float: converts a double to a float
- int-to-byte: converts an int to a byte
- int-to-char: converts an int to a char
- int-to-short: converts an int to a short
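Since Java primitive casts are what the compiler turns into these conversion instructions, plain Java is a convenient way to see how the narrowing conversions behave. The sketch below is only an illustration with hypothetical names; it assumes, as is usual for code compiled from Java, that the Dalvik instructions follow the Java cast rules.

```java
// Hypothetical demo: each method is the Java-level counterpart of one
// conversion instruction from the list above.
final class ConversionDemo {
    // int-to-byte: keeps the low 8 bits and sign-extends them.
    static int intToByte(int v) {
        return (byte) v;
    }

    // int-to-char: keeps the low 16 bits and zero-extends them.
    static int intToChar(int v) {
        return (char) v;
    }

    // int-to-short: keeps the low 16 bits and sign-extends them.
    static int intToShort(int v) {
        return (short) v;
    }

    // long-to-int: keeps the low 32 bits.
    static int longToInt(long v) {
        return (int) v;
    }

    // float-to-int: truncates toward zero; NaN becomes 0 and values
    // outside the int range are clamped to the nearest int bound.
    static int floatToInt(float v) {
        return (int) v;
    }

    public static void main(String[] args) {
        System.out.println(intToByte(0x1FF));
        System.out.println(intToChar(-1));
        System.out.println(floatToInt(Float.NaN));
    }
}
```

Note that int-to-char produces an unsigned 16-bit value, which is why it differs from int-to-short even though both keep 16 bits.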
-
Data operation instructions
Data operation instructions include arithmetic instructions and logical instructions. Arithmetic instructions perform addition, subtraction, multiplication, division, modulo, and shifts on numeric values; logical instructions perform AND, OR, NOT, and XOR on numeric values. Data operation instructions come in four forms (an operation may work on registers or on register pairs; the descriptions below use registers):
- binop vAA, vBB, vCC: operates on registers vBB and vCC and stores the result in register vAA
- binop/2addr vA, vB: operates on registers vA and vB and stores the result in register vA
- binop/lit16 vA, vB, #+CCCC: operates on register vB and the constant CCCC and stores the result in register vA
- binop/lit8 vAA, vBB, #+CC: operates on register vBB and the constant CC and stores the result in register vAA
The last three forms carry an extra opcode suffix (2addr, lit16, or lit8) compared with the first. In all four forms the base bytecode is followed by a data type suffix; for example, -int and -long indicate that the operands are int and long values, respectively. The instructions of the first form are:
- add-type: adds vBB and vCC (vBB + vCC)
- sub-type: subtracts vCC from vBB (vBB - vCC)
- mul-type: multiplies vBB by vCC (vBB * vCC)
- div-type: divides vBB by vCC (vBB / vCC)
- rem-type: takes vBB modulo vCC (vBB % vCC)
- and-type: bitwise AND of vBB and vCC (vBB & vCC)
- or-type: bitwise OR of vBB and vCC (vBB | vCC)
- xor-type: bitwise XOR of vBB and vCC (vBB ^ vCC)
- shl-type: shifts vBB (signed) left by vCC bits (vBB << vCC)
- shr-type: shifts vBB (signed) right by vCC bits (vBB >> vCC)
- ushr-type: shifts vBB (unsigned) right by vCC bits (vBB >>> vCC)
The -type after the base bytecode can be -int, -long, -float, or -double (the bitwise and shift operations exist only for -int and -long). The other three forms work the same way.
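The two right shifts are the part of this table that most often trips people up, so here is a small Java sketch. The method names are hypothetical; the operators are the Java-level counterparts of shr-int, ushr-int, and rem-int, under the assumption that code compiled from Java behaves the same way in bytecode.

```java
// Hypothetical demo of the Java operators behind shr-int, ushr-int and rem-int.
final class ShiftDemo {
    // shr-int: arithmetic (signed) right shift, the sign bit is copied in.
    static int shrInt(int b, int c) {
        return b >> c;
    }

    // ushr-int: logical (unsigned) right shift, zeros are shifted in.
    static int ushrInt(int b, int c) {
        return b >>> c;
    }

    // rem-int: the result takes the sign of the dividend.
    static int remInt(int b, int c) {
        return b % c;
    }

    public static void main(String[] args) {
        System.out.println(shrInt(-16, 2));
        System.out.println(ushrInt(-16, 2));
        System.out.println(remInt(-7, 2));
    }
}
```

For non-negative values the two shifts give the same result; they only differ when the top bit of vBB is set. For int operands only the low five bits of the shift distance are used, so shifting by 33 is the same as shifting by 1.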
-
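Object manipulation instructions
Operations on object instances, such as object creation and type checking.
- new-instance vAA, type@BBBB: constructs a new instance of the given type and assigns the object reference to register vAA; the type must not be an array class.
- instance-of vA, vB, type@CCCC: tests whether the object reference in register vB can be cast to the given type; vA is set to 1 if it can and to 0 otherwise.
- check-cast vAA, type@BBBB: casts the object reference in register vAA to the given type; on success the reference stays in vAA, otherwise a ClassCastException is thrown.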
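At the Java level, instance-of corresponds to the instanceof operator and check-cast to an explicit cast. The sketch below uses hypothetical names and String as an arbitrary target type to show the difference: one produces a 0/1 answer, the other either succeeds or throws.

```java
// Hypothetical demo of the Java constructs behind instance-of and check-cast.
final class TypeCheckDemo {
    // Like instance-of: yields 1 if the reference can be cast, 0 otherwise.
    static int instanceOfString(Object ref) {
        return (ref instanceof String) ? 1 : 0;
    }

    // Like check-cast: returns the same reference or throws ClassCastException.
    static String checkCastString(Object ref) {
        return (String) ref;
    }

    // Helper for the demo: reports whether the cast throws.
    static boolean castFails(Object ref) {
        try {
            checkCastString(ref);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(instanceOfString("text"));
        System.out.println(castFails(Integer.valueOf(1)));
    }
}
```

A null reference is a useful edge case: instanceof reports 0 for it, while a cast of null succeeds without throwing.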
-
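Jump instructions
Jump instructions transfer control from the current address to a given offset. The Dalvik instruction set has three kinds of jump instructions: unconditional jumps (goto), multi-way branches (switch), and conditional jumps (if).
- goto +AA: jumps unconditionally to the given offset; the offset AA must not be 0.
- goto/16 +AAAA: jumps unconditionally to the given offset; the offset AAAA must not be 0.
- goto/32 +AAAAAAAA: jumps unconditionally to the given offset.
- packed-switch vAA, +BBBBBBBB: multi-way branch. Register vAA holds the value tested by the switch, and BBBBBBBB points to an offset table in packed-switch-payload format, whose case values increase in a regular sequence.
- sparse-switch vAA, +BBBBBBBB: multi-way branch. Register vAA holds the value tested by the switch, and BBBBBBBB points to an offset table in sparse-switch-payload format, whose case values follow no regular pattern.
- if-test vA, vB, +CCCC: conditional jump. Compares the values of registers vA and vB and jumps to the offset CCCC if the comparison holds. The offset CCCC must not be 0. The if-test instructions are:
● if-eq: jumps if vA equals vB. In Java: if (vA == vB)
● if-ne: jumps if vA does not equal vB. In Java: if (vA != vB)
● if-lt: jumps if vA is less than vB. In Java: if (vA < vB)
● if-le: jumps if vA is less than or equal to vB. In Java: if (vA <= vB)
● if-gt: jumps if vA is greater than vB. In Java: if (vA > vB)
● if-ge: jumps if vA is greater than or equal to vB. In Java: if (vA >= vB)
- if-testz vAA, +BBBB: conditional jump. Compares register vAA with 0 and jumps to the offset BBBB if the comparison holds. The offset BBBB must not be 0. The if-testz instructions are:
● if-eqz: jumps if vAA is 0. In Java: if (vAA == 0)
● if-nez: jumps if vAA is not 0. In Java: if (vAA != 0)
● if-ltz: jumps if vAA is less than 0. In Java: if (vAA < 0)
● if-lez: jumps if vAA is less than or equal to 0. In Java: if (vAA <= 0)
● if-gtz: jumps if vAA is greater than 0. In Java: if (vAA > 0)
● if-gez: jumps if vAA is greater than or equal to 0. In Java: if (vAA >= 0)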
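The practical difference between the two switch instructions is how the target is looked up. The following Java sketch models that lookup under some assumptions: a packed table is described by its first key plus a list of targets, a sparse table by a sorted key list plus a parallel target list, and a value that matches no case falls through to the next instruction. All names, including the fallThrough parameter, are invented for this example, and a real VM works on the raw payload rather than on Java arrays.

```java
// Hypothetical model of how packed-switch and sparse-switch find a target.
final class SwitchDemo {
    // packed-switch: keys are consecutive starting at firstKey,
    // so the target can be found by direct indexing.
    static int packedTarget(int value, int firstKey, int[] targets, int fallThrough) {
        long index = (long) value - firstKey; // long avoids int overflow
        if (index < 0 || index >= targets.length) {
            return fallThrough;
        }
        return targets[(int) index];
    }

    // sparse-switch: keys are arbitrary but sorted, so a search is needed.
    static int sparseTarget(int value, int[] sortedKeys, int[] targets, int fallThrough) {
        int i = java.util.Arrays.binarySearch(sortedKeys, value);
        return i >= 0 ? targets[i] : fallThrough;
    }

    public static void main(String[] args) {
        // case 10, 11, 12 -> packed; case 1, 50, 1000 -> sparse
        System.out.println(packedTarget(11, 10, new int[] {100, 200, 300}, -1));
        System.out.println(sparseTarget(50, new int[] {1, 50, 1000}, new int[] {7, 8, 9}, -1));
    }
}
```

This is why a Java switch over case values such as 10, 11, 12 usually becomes a packed-switch, while one over 1, 50, 1000 becomes a sparse-switch: a packed table for widely scattered keys would be mostly empty.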
-
Compare instructions
Comparison instructions compare the values in two registers. The basic format is cmpkind-type vAA, vBB, vCC, where type is the type of the compared data (-long, -float, or -double) and kind is the variant, which gives three families: cmpl, cmpg, and cmp. All of them store 1 in vAA if vBB is greater than vCC, 0 if the two are equal, and -1 if vBB is less than vCC. cmpl (compare less) and cmpg (compare greater) are the floating-point variants and differ only when an operand is NaN: cmpl then yields -1 and cmpg yields 1. The plain cmp variant exists only for long values (cmp-long), where NaN cannot occur.
e.g.:
- cmpl-float vAA, vBB, vCC: compares two single-precision floats. Stores 1 in vAA if vBB is greater than vCC, 0 if they are equal, and -1 if vBB is less than vCC or either value is NaN.
- cmpg-float vAA, vBB, vCC: compares two single-precision floats. Stores 1 in vAA if vBB is greater than vCC or either value is NaN, 0 if they are equal, and -1 if vBB is less than vCC.
- cmpl-double vAA, vBB, vCC: compares two double-precision floats, with the same semantics as cmpl-float.
- cmpg-double vAA, vBB, vCC: compares two double-precision floats, with the same semantics as cmpg-float.
- cmp-long vAA, vBB, vCC: compares two long values. Stores 1 in vAA if vBB is greater than vCC, 0 if they are equal, and -1 if vBB is less than vCC.
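As documented in the Dalvik bytecode reference, cmpl and cmpg give the same 1 / 0 / -1 result for ordinary numbers and only disagree when NaN is involved. The Java sketch below models that behavior; the class and method names are hypothetical and the code is a model of the documented semantics, not VM source.

```java
// Hypothetical model of cmpl-float, cmpg-float and cmp-long.
final class CompareDemo {
    // cmpl-float ("lt bias"): a NaN operand yields -1.
    static int cmplFloat(float b, float c) {
        if (b > c) return 1;
        if (b == c) return 0;
        return -1; // b < c, or at least one operand is NaN
    }

    // cmpg-float ("gt bias"): a NaN operand yields 1.
    static int cmpgFloat(float b, float c) {
        if (b < c) return -1;
        if (b == c) return 0;
        return 1; // b > c, or at least one operand is NaN
    }

    // cmp-long: longs have no NaN, so a single variant is enough.
    static int cmpLong(long b, long c) {
        return b < c ? -1 : (b == c ? 0 : 1);
    }

    public static void main(String[] args) {
        System.out.println(cmplFloat(Float.NaN, 1f));
        System.out.println(cmpgFloat(Float.NaN, 1f));
    }
}
```

The two biases let a compiler pick whichever variant makes a following if-testz branch fail when NaN is present, so that a Java comparison such as a < b is false for NaN without any extra check.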
-
Field operation instructions
Field operation instructions read and write object fields, much like the set and get methods in your code. The basic instructions are iput-type, iget-type, sput-type, and sget-type, where type is the data type.
The iput-type and iget-type instructions (prefix i) read and write instance fields:
- iget-byte vA, vB, field_id: reads the field field_id of the object in register vB into register vA
- iput-byte vA, vB, field_id: sets the field field_id of the object in register vB to the value of register vA
- iget-boolean vA, vB, field_id
- iput-boolean vA, vB, field_id
- iget-wide vA, vB, field_id (64-bit fields such as long)
- iput-wide vA, vB, field_id
The sput-type and sget-type instructions (prefix s) read and write static fields; they take no object register:
- sget-byte vAA, field_id
- sput-byte vAA, field_id
- sget-boolean vAA, field_id
- sput-boolean vAA, field_id
- sget-wide vAA, field_id
- sput-wide vAA, field_id
-
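Method call instructions
Most of Dalvik's method call instructions closely resemble their JVM counterparts. There are currently five basic instructions:
- invoke-direct {parameters}, methodtocall: calls a direct method of an instance, i.e. a private method or a constructor. Note that the first element in {} is the current instance, i.e. this; the real arguments come after it. For example, in the instruction invoke-virtual {v3, v1, v4}, Test2.method5:(II)V, v3 is the Test2 instance and v1 and v4 are the method arguments.
- invoke-static {parameters}, methodtocall: calls a static method; everything in {} is a method argument.
- invoke-super {parameters}, methodtocall: calls a superclass method.
- invoke-virtual {parameters}, methodtocall: calls a virtual method of an instance, i.e. an ordinary instance method such as a public or protected one.
- invoke-interface {parameters}, methodtocall: calls an interface method.
Besides these five basic instructions you will also meet invoke-direct/range, invoke-static/range, invoke-super/range, invoke-virtual/range, and invoke-interface/range. The only difference is that the /range forms specify the argument registers as a range; they are used when a call needs more than five argument registers or registers whose numbers do not fit in 4 bits.
To repeat: for a non-static method the structure of {} is {current instance, argument 1, argument 2, ..., argument n}, while for a static method it is {argument 1, argument 2, ..., argument n}.
To obtain the return value of a method call, fetch it with one of the move-result instructions described above.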
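To connect the five invoke instructions to source code, here is a small Java sketch whose comments note which instruction each call would typically compile to. The classes are invented for the example, and the exact instruction chosen can depend on the compiler and toolchain version, so read the comments as the usual case rather than a guarantee.

```java
// Hypothetical classes; comments name the invoke instruction each call
// typically compiles to.
final class InvokeDemo {
    interface Greeter {
        String greet(String name);
    }

    static class Base {
        String describe() {
            return "base";
        }
    }

    static class Impl extends Base implements Greeter {
        private String prefix() {
            return "Hello, ";
        }

        static String shout(String s) {
            return s + "!";
        }

        @Override
        String describe() {
            return "impl/" + super.describe(); // invoke-super
        }

        @Override
        public String greet(String name) {
            return prefix() + name;            // private method: invoke-direct
        }
    }

    static String demo() {
        Impl impl = new Impl();                // constructor: invoke-direct <init>
        Greeter greeter = impl;
        String greeting = greeter.greet("x");  // call through interface: invoke-interface
        String loud = Impl.shout(greeting);    // static method: invoke-static
        return loud + " " + impl.describe();   // ordinary instance method: invoke-virtual
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Every call here that returns a value would be followed by a move-result-object in the smali output, since all the return types are object references.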
-
Synchronization instructions
A synchronized sequence of instructions is usually written as a synchronized block in Java. The JVM supports the semantics of the synchronized keyword through its monitorenter and monitorexit instructions, and Dalvik provides two similar instructions:
- monitor-enter vAA: acquires the lock of the given object
- monitor-exit vAA: releases the lock of the given object
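For reference, this is the kind of Java code that produces the two monitor instructions. The sketch below uses invented names; the comments marking where monitor-enter and monitor-exit appear describe what a compiler typically emits for a synchronized block.

```java
// Hypothetical counter protected by a synchronized block.
final class MonitorDemo {
    private final Object lock = new Object();
    private int counter;

    void increment() {
        synchronized (lock) { // typically compiled to monitor-enter on lock
            counter++;
        }                     // monitor-exit on both normal and exceptional exit
    }

    int get() {
        synchronized (lock) {
            return counter;
        }
    }

    // Starts several threads that all increment the same counter.
    static int runThreads(int threads, int perThread) {
        MonitorDemo demo = new MonitorDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    demo.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return demo.get();
    }

    public static void main(String[] args) {
        System.out.println(runThreads(4, 1000));
    }
}
```

In the decompiled smali you will usually see one monitor-enter but more than one monitor-exit for a block like this, because the compiler adds a catch-all handler that releases the lock before rethrowing.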
-
Exception instructions
- throw vAA: throws the exception object held in register vAA
Smali file details
-
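Each .smali file produced by a decompilation tool corresponds to one Java class. A smali file consists of Dalvik instructions and follows a fixed structure. Smali also has a number of directives that describe the corresponding Java file; all directives start with ".". The commonly used directives are:
- .field: defines a field
- .method ... .end method: defines a method
- .annotation ... .end annotation: defines an annotation
- .implements: declares an interface that the class implements
- .locals: specifies the number of local (non-parameter) registers used in a method
- .registers: specifies the total number of registers used in a method
- .prologue: marks the start of the code in a method
- .line: refers to a line in the Java source file
- .parameter: specifies a method parameter
- .param: same meaning as .parameter, but with a different format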
-
Let's walk through a simple Hello World to explain.
The Java source code is as follows:
public class MainActivity extends AppCompatActivity implements View.OnClickListener {
    private static final String TAG = "MainActivity";
    private TextView tvShowText;
    private static final String HELLO = "HELLO";
    private static final String WORLD = "WORLD";

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        initView();
        setListener();
    }

    private void setListener() {
        tvShowText.setOnClickListener(this);
    }

    private void initView() {
        tvShowText = (TextView) findViewById(R.id.tv_show_text);
    }

    @Override
    public void onClick(View v) {
        Log.d(TAG, "onClick: TextView");
        tvShowText.setText(getText());
    }

    private String getText() {
        return HELLO + WORLD;
    }
}
The decompiled smali file is as follows:
# File header: class declaration
.class public Lorg/professor/helloworld/MainActivity;
# Superclass
.super Landroid/support/v7/app/AppCompatActivity;
# Source file name
.source "MainActivity.java"
# Declares that the class implements the View.OnClickListener interface
# interfaces
.implements Landroid/view/View$OnClickListener;
# String static fields
# static fields
.field private static final HELLO:Ljava/lang/String; = "HELLO"
.field private static final TAG:Ljava/lang/String; = "MainActivity"
.field private static final WORLD:Ljava/lang/String; = "WORLD"
# TextView instance field
# instance fields
.field private tvShowText:Landroid/widget/TextView;
# Constructor
# direct methods
.method public constructor <init>()V
.locals 0 # this method uses no local registers
.prologue # marks the start of the code in the method
.line 9 # corresponds to line 9 of the Java source file
# Call the AppCompatActivity constructor <init>()
invoke-direct {p0}, Landroid/support/v7/app/AppCompatActivity;-><init>()V
# Return instruction; no value is returned here
return-void
.end method # end of the method
.method private getText()Ljava/lang/String;
.locals 1
.prologue
.line 40
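# Load the string "HELLOWORLD" into v0 (HELLO + WORLD is a compile-time constant, so the compiler has already concatenated it)
const-string v0, "HELLOWORLD"
# Return instruction; returns the value in v0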
return-object v0
.end method
.method private initView()V
.locals 1
.prologue
.line 30
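# Load 0x7f0b005e (the resource ID passed to findViewById, R.id.tv_show_text) into v0
const v0, 0x7f0b005e
# Call the method findViewById
invoke-virtual {p0, v0}, Lorg/professor/helloworld/MainActivity;->findViewById(I)Landroid/view/View;
move-result-object v0
# Cast the object reference in v0 to the given type
check-cast v0, Landroid/widget/TextView;
# Store the value of v0 into the tvShowText field of the object in p0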
iput-object v0, p0, Lorg/professor/helloworld/MainActivity;->tvShowText:Landroid/widget/TextView;
.line 31
return-void
.end method
.method private setListener()V
.locals 1
.prologue
.line 26
# Load the tvShowText field of the object in p0 into v0
iget-object v0, p0, Lorg/professor/helloworld/MainActivity;->tvShowText:Landroid/widget/TextView;
# Call setOnClickListener on v0
invoke-virtual {v0, p0}, Landroid/widget/TextView;->setOnClickListener(Landroid/view/View$OnClickListener;)V
.line 27
return-void
.end method
# virtual methods
.method public onClick(Landroid/view/View;)V
.locals 2 # this method uses 2 local registers
.param p1, "v" # Landroid/view/View;
.prologue
.line 35
const-string v0, "MainActivity"
const-string v1, "onClick: TextView"
invoke-static {v0, v1}, Landroid/util/Log;->d(Ljava/lang/String;Ljava/lang/String;)I
.line 36
iget-object v0, p0, Lorg/professor/helloworld/MainActivity;->tvShowText:Landroid/widget/TextView;
invoke-direct {p0}, Lorg/professor/helloworld/MainActivity;->getText()Ljava/lang/String;
move-result-object v1
invoke-virtual {v0, v1}, Landroid/widget/TextView;->setText(Ljava/lang/CharSequence;)V
.line 37
return-void
.end method
.method protected onCreate(Landroid/os/Bundle;)V
.locals 1
.param p1, "savedInstanceState" # Landroid/os/Bundle; # the parameter savedInstanceState
.prologue
.line 19
# Call the superclass method onCreate()
invoke-super {p0, p1}, Landroid/support/v7/app/AppCompatActivity;->onCreate(Landroid/os/Bundle;)V
.line 20
# Load 0x7f04001b (the layout resource ID, R.layout.activity_main) into v0
const v0, 0x7f04001b
# Call the method setContentView()
invoke-virtual {p0, v0}, Lorg/professor/helloworld/MainActivity;->setContentView(I)V
.line 21
invoke-direct {p0}, Lorg/professor/helloworld/MainActivity;->initView()V
.line 22
invoke-direct {p0}, Lorg/professor/helloworld/MainActivity;->setListener()V
.line 23
return-void
.end method