Foreword
When injecting code into an APK, we usually work with decompiled smali code rather than the original Java source, so it is necessary to understand the basics of smali syntax. First, a word about the Dalvik virtual machine: Dalvik is a virtual machine designed by Google specifically for the Android platform. Although Android programs can be written in Java, the Dalvik VM and the Java VM are two different virtual machines. The Dalvik VM is register-based, while the Java VM is stack-based. The Dalvik VM executes its own file format, dex (Dalvik Executable), while the Java VM executes Java bytecode in .class files. The dex format and the register-based design were chosen to keep code compact and efficient on memory-constrained devices.
smali file structure
The following smali code is taken from a test demo (obtained by decompiling the .apk file with apktool). Its purpose is to give a general picture of how a smali file is laid out, which makes the syntax details explained later easier to follow.
.class public abstract Lcom/happy/learnsmali/BaseActivity;
.super Landroidx/appcompat/app/AppCompatActivity;
.source "BaseActivity.kt"
# interfaces
.implements Lcom/happy/learnsmali/action/ActivityAction;
.implements Lcom/happy/learnsmali/action/ClickAction;
.implements Lcom/happy/learnsmali/action/HandlerAction;
.implements Lcom/happy/learnsmali/action/BundleAction;
.implements Lcom/happy/learnsmali/action/KeyboardAction;
# annotations
.annotation system Ldalvik/annotation/MemberClasses;
value = {
Lcom/happy/learnsmali/BaseActivity$Companion;,
Lcom/happy/learnsmali/BaseActivity$OnActivityCallback;
}
.end annotation
.annotation system Ldalvik/annotation/SourceDebugExtension;
value = "SMAP\nBaseActivity.kt\nKotlin\n*S Kotlin\n*F\n+ 1 BaseActivity.kt\ncom/happy/learnsmali/BaseActivity\n+ 2 fake.kt\nkotlin/jvm/internal/FakeKt\n*L\n1#1,179:1\n1#2:180\n*E\n"
.end annotation
# static fields
.field public static final Companion:Lcom/happy/learnsmali/BaseActivity$Companion;
.field public static final RESULT_ERROR:I = -0x2
# instance fields
.field private final activityCallbacks$delegate:Lkotlin/Lazy;
# direct methods
.method public static synthetic $r8$lambda$mAxgPA6JBXhjuhBfNvUeqmKUmlk(Lcom/happy/learnsmali/BaseActivity;Landroid/view/View;)V
.locals 0
invoke-static {p0, p1}, Lcom/happy/learnsmali/BaseActivity;->initSoftKeyboard$lambda-0(Lcom/happy/learnsmali/BaseActivity;Landroid/view/View;)V
return-void
.end method
.method static constructor <clinit>()V
.locals 2
new-instance v0, Lcom/happy/learnsmali/BaseActivity$Companion;
const/4 v1, 0x0
invoke-direct {v0, v1}, Lcom/happy/learnsmali/BaseActivity$Companion;-><init>(Lkotlin/jvm/internal/DefaultConstructorMarker;)V
sput-object v0, Lcom/happy/learnsmali/BaseActivity;->Companion:Lcom/happy/learnsmali/BaseActivity$Companion;
return-void
.end method
.method public constructor <init>()V
# ...
.end method
If you are new to smali, it is normal to find the code above confusing. The sections below analyze it piece by piece. Understanding what these symbols mean lets us get twice the result with half the effort when injecting code into a decompiled APK.
Inheritance, interface, package information in smali
First, let's look at the first few lines:
.class public abstract Lcom/happy/learnsmali/BaseActivity; # .class gives the class path: package name + class name
.super Landroidx/appcompat/app/AppCompatActivity; # .super gives the path of the parent class
.source "BaseActivity.kt" # .source gives the source file name
# interfaces
.implements Lcom/happy/learnsmali/action/ActivityAction;
.implements Lcom/happy/learnsmali/action/ClickAction;
.implements Lcom/happy/learnsmali/action/HandlerAction;
.implements Lcom/happy/learnsmali/action/BundleAction;
.implements Lcom/happy/learnsmali/action/KeyboardAction;
# annotations
.annotation system Ldalvik/annotation/MemberClasses;
value = {
Lcom/happy/learnsmali/BaseActivity$Companion;,
Lcom/happy/learnsmali/BaseActivity$OnActivityCallback;
}
.end annotation
Lines 1-3 define the basic information: the class com/happy/learnsmali/BaseActivity (first line) inherits from androidx/appcompat/app/AppCompatActivity (second line), and this smali file was obtained by decompiling the source file BaseActivity.kt (third line).
Lines 5-9 declare the interfaces implemented by the BaseActivity class:
- com/happy/learnsmali/action/ActivityAction
- com/happy/learnsmali/action/ClickAction
- com/happy/learnsmali/action/HandlerAction
- com/happy/learnsmali/action/BundleAction
- com/happy/learnsmali/action/KeyboardAction
Lines 11-16 hold the MemberClasses annotation, which lists the inner classes of BaseActivity: Companion and OnActivityCallback.
From the header information at the top of the smali file, we can reconstruct roughly the following Java code:
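// Rough Java equivalent reconstructed from the smali header (the original source is Kotlin)
public abstract class BaseActivity extends AppCompatActivity
        implements ActivityAction, ClickAction, HandlerAction, BundleAction, KeyboardAction {
    // Listed in the MemberClasses annotation; their bodies cannot be recovered from the header alone
    class Companion {
        // ...
    }
    class OnActivityCallback {
        // ...
    }
}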
Other methods
# virtual methods (this comment marks the start of the virtual method section)
.method protected onCreate(Landroid/os/Bundle;)V
.locals 1
.param p1, "savedInstanceState" # Landroid/os/Bundle;
.line 10
invoke-super {p0, p1}, Landroid/app/Activity;->onCreate(Landroid/os/Bundle;)V
.line 11
const/high16 v0, 0x7f050000
invoke-virtual {p0, v0}, Lcom/justart/samlidemo/MainActivity;->setContentView(I)V
.line 12
return-void
.end method
- A method starts with .method and ends with .end method;
- The last V in the first line indicates that the return type is void;
- The method parameter Landroid/os/Bundle; indicates that the parameter of onCreate() is of type Bundle;
- .param indicates that the name of that parameter is savedInstanceState;
- Finally, return-void returns from the method without a value, matching the void return type.
Data types
- byte:B
- char:C
- double:D
- float:F
- int:I
- long:J
- short:S
- void:V
- boolean:Z
- array:[XXX
- Object:Lxxx/yyy;
With some JNI background, the data types above are easy to understand. The last two items deserve a closer look.
array:[XXX
Putting [ in front of a type denotes an array of that type; for example, an int array and a byte array are [I and [B.
Object:Lxxx/yyy;
A type starting with L is an object type. For example, a String object is written Ljava/lang/String; (an object type must end with a semicolon), where java/lang is the java.lang package and String is a class in that package.
Readers may wonder at this point: if a class is written Ljava/lang/String;, how is an inner class written in smali? Those who have used Java reflection may immediately think of the $ symbol. Indeed, smali writes Ljava/lang/String$xxx; to indicate that xxx is an inner class of the String class.
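As a quick cross-check of the type table above, the mapping can be sketched in Java. This is only an illustrative sketch: the class name SmaliTypeDescriptors and the method toDescriptor are made up for this example and are not part of any Android or smali tooling.

```java
// Hypothetical helper (not part of any real tool): turns a Java Class object
// into the type descriptor that smali uses for it.
final class SmaliTypeDescriptors {

    static String toDescriptor(Class<?> type) {
        if (type.isArray()) {
            // Arrays: "[" followed by the descriptor of the element type.
            return "[" + toDescriptor(type.getComponentType());
        }
        if (type.isPrimitive()) {
            if (type == byte.class) return "B";
            if (type == char.class) return "C";
            if (type == double.class) return "D";
            if (type == float.class) return "F";
            if (type == int.class) return "I";
            if (type == long.class) return "J";
            if (type == short.class) return "S";
            if (type == boolean.class) return "Z";
            return "V"; // the only primitive left is void
        }
        // Objects: "L" + binary name with '/' separators + ";".
        // Class.getName() already uses '$' for nested classes.
        return "L" + type.getName().replace('.', '/') + ";";
    }
}
```

For example, toDescriptor(int[].class) yields [I, toDescriptor(String.class) yields Ljava/lang/String;, and toDescriptor(java.util.Map.Entry.class) yields Ljava/util/Map$Entry;, matching the array and inner-class rules described above.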
Registers
One of the biggest differences between the Dalvik VM and the JVM is that the Dalvik VM is register-based. What does register-based mean? In my understanding it is somewhat like assembly language: data is stored and passed around through registers. In smali, local registers are named v plus a number (v0, v1, v2, ...), while parameter registers are named p plus a number (p0, p1, p2, ...). Note that p0 is not necessarily the first parameter. In a non-static method, p0 holds this, p1 holds the first parameter, and p2 the second. In a static method, p0 holds the first parameter (a static method has no receiver object). The number of local registers is declared per method with .locals and can be chosen freely, although many instructions can only address low-numbered registers (for example, const/4 requires one of v0-v15).
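The p-register numbering rule can be sketched as a small Java helper. The class ParamRegisters and its assign method are hypothetical names invented for this illustration. The sketch also accounts for one detail not covered in the text above: long (J) and double (D) values occupy two consecutive registers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists which parameter register(s) each argument of a
// method occupies, given its parameter type descriptors.
final class ParamRegisters {

    static List<String> assign(boolean isStatic, String... paramTypes) {
        List<String> out = new ArrayList<>();
        int next = 0;
        if (!isStatic) {
            // Instance methods receive the object itself in p0.
            out.add("p0 = this");
            next = 1;
        }
        for (String type : paramTypes) {
            boolean wide = type.equals("J") || type.equals("D");
            if (wide) {
                // long and double take a pair of registers.
                out.add("p" + next + "/p" + (next + 1) + " = " + type);
                next += 2;
            } else {
                out.add("p" + next + " = " + type);
                next += 1;
            }
        }
        return out;
    }
}
```

For an instance method such as onCreate(Landroid/os/Bundle;)V, assign(false, "Landroid/os/Bundle;") gives p0 = this and p1 = Landroid/os/Bundle;. For a static method the same parameter would land in p0.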
Member variables
Let's continue to introduce the content about member variables:
# static field
.field private static final PREFS_INSTALLATION_ID:Ljava/lang/String; = "installationId"
# ...
# instance field
.field private _activityPackageName:Ljava/lang/String;
Both the static field and the instance field defined above are member variables, and the format is:
.field public/private [static] [final] varName:<type>
Although both are member variables, they differ. The most obvious difference is whether the field belongs to an object: a static field is a class-level concept, while an instance field is an object-level concept.
Where there are member variables, there are reads and writes. In smali, the read instructions include iget, sget, iget-boolean, sget-boolean, iget-object, sget-object, and so on; the write instructions include iput, sput, iput-boolean, sput-boolean, iput-object, sput-object, and so on.
iget / iput read and write instance fields;
sget / sput read and write static fields;
The prefix of the instruction tells you whether it operates on an instance field or a static field. The -object suffix means the field holds an object type; the forms without a suffix (iget, iput, ...) operate on 32-bit primitive values such as int and float. Other primitive types have suffixes of their own; in particular, boolean uses the -boolean suffix.
Here is an example:
const/4 v0, 0x0
iput-boolean v0, p0, Lcom/disney/xx/XxActivity;->isRunning:Z
In the example above, the first instruction loads 0x0 into the local register v0, and the second uses iput-boolean to store the value of v0 into the isRunning member variable of com.disney.xx.XxActivity. In other words, it is equivalent to: this.isRunning = false; (as mentioned above, p0 is this in a non-static method, which here is an instance of com.disney.xx.XxActivity).
static field member variable
sget-object v0, Lcom/disney/xx/XxActivity;->PREFS_INSTALLATION_ID:Ljava/lang/String;
The sget-object instruction reads a static member variable and stores it in the register that immediately follows the opcode. Here, the value of the static member PREFS_INSTALLATION_ID of the class com.disney.xx.XxActivity is loaded into the local register v0.
instance field member variable
iget-object v0, p0, Lcom/disney/xx/XxActivity;->_view:Lcom/disney/common/WMWView;
The iget-object instruction likewise reads a member variable, this time of an object instance, and stores it in the register that immediately follows the opcode. Here, the _view member of the com.disney.xx.XxActivity instance held in p0 is loaded into the local register v0.
Comparing the static field and instance field examples above, the operands follow this format:
<local register>, [<object register>], <class the variable belongs to>->varName:<variable type>
The format of the put instructions is the same as that of the get instructions above, so we can go straight to examples:
const/4 v3, 0x0
iput-object v3, p0, Lcom/disney/xx/XxActivity;->globalIapHandler:Lcom/disney/config/GlobalPurchaseHandler;
Java code representation: this.globalIapHandler = null; (null = 0x0)
.local v0, wait:Landroid/os/Message;
const/4 v1, 0x2
iput v1, v0, Landroid/os/Message;->what:I
Java code representation: wait.what = 0x2; (wait is an instance of Message)
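To connect the get/put instructions back to source code, here is a minimal Java sketch. The class FieldAccessDemo and its members are invented for illustration, and the smali in the comments is only an approximation of what a compiler would typically emit; the exact register numbers may differ.

```java
// Hypothetical class for illustration; the smali in the comments is approximate.
final class FieldAccessDemo {
    // Static field: belongs to the class, accessed with sget-* / sput-*.
    static String installationId = "unset";

    // Instance fields: belong to each object, accessed with iget-* / iput-*.
    boolean isRunning = true;
    int what;

    void stop() {
        // Roughly: const/4 v0, 0x0
        //          iput-boolean v0, p0, LFieldAccessDemo;->isRunning:Z
        this.isRunning = false;
    }

    void setWhat() {
        // Roughly: const/4 v0, 0x2
        //          iput v0, p0, LFieldAccessDemo;->what:I
        this.what = 0x2;
    }

    static String readInstallationId() {
        // Roughly: sget-object v0, LFieldAccessDemo;->installationId:Ljava/lang/String;
        //          return-object v0
        return installationId;
    }

    static void resetInstallationId() {
        // Roughly: const/4 v0, 0x0
        //          sput-object v0, LFieldAccessDemo;->installationId:Ljava/lang/String;
        installationId = null;
    }
}
```

Note how the instance-field instructions carry an extra register (p0 here) naming the object, while the static-field instructions only name the class.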
Function calls
The format of a function signature is:
function(type1type2type3…)RetValue
Note that the parameter types must be written as smali types, with no separators between them. Examples:
helloSmali()V
means void helloSmali()
helloSmali([BI)Z
means boolean helloSmali(byte[], int)
helloSmali(ZLjava/lang/String;[I[I)V
means void helloSmali(boolean, String, int[], int[])
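Reading such signatures by hand gets tedious, so the decoding rule can be sketched in Java. MethodDescriptorReader, readType and toJava are hypothetical names made up for this example; the sketch assumes well-formed input and prints fully qualified class names (java.lang.String rather than String).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns "name(params)ret" in smali notation into a
// Java-style signature. Assumes the input is well formed.
final class MethodDescriptorReader {

    // Reads one type starting at pos[0] and advances pos[0] past it.
    static String readType(String d, int[] pos) {
        int dims = 0;
        while (d.charAt(pos[0]) == '[') { // each '[' adds one array dimension
            dims++;
            pos[0]++;
        }
        char c = d.charAt(pos[0]++);
        String base;
        switch (c) {
            case 'B': base = "byte"; break;
            case 'C': base = "char"; break;
            case 'D': base = "double"; break;
            case 'F': base = "float"; break;
            case 'I': base = "int"; break;
            case 'J': base = "long"; break;
            case 'S': base = "short"; break;
            case 'V': base = "void"; break;
            case 'Z': base = "boolean"; break;
            case 'L': {
                // Object type: everything up to the terminating ';'.
                int end = d.indexOf(';', pos[0]);
                base = d.substring(pos[0], end).replace('/', '.');
                pos[0] = end + 1;
                break;
            }
            default:
                throw new IllegalArgumentException("unknown type character: " + c);
        }
        StringBuilder sb = new StringBuilder(base);
        for (int i = 0; i < dims; i++) {
            sb.append("[]");
        }
        return sb.toString();
    }

    static String toJava(String signature) {
        int open = signature.indexOf('(');
        int close = signature.indexOf(')');
        String name = signature.substring(0, open);
        List<String> params = new ArrayList<>();
        int[] pos = {open + 1};
        while (pos[0] < close) { // parameters are packed with no separators
            params.add(readType(signature, pos));
        }
        pos[0] = close + 1;
        String ret = readType(signature, pos);
        return ret + " " + name + "(" + String.join(", ", params) + ")";
    }
}
```

Calling toJava("helloSmali([BI)Z") returns boolean helloSmali(byte[], int), which agrees with the second example above.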
In smali, functions, like member variables, fall into two kinds; but where member variables are split into static fields and instance fields, functions are split into direct methods and virtual methods. What is the difference? Direct methods are those that can never be overridden: private methods, static methods, and constructors (which is why <clinit> and <init> appear under # direct methods in the first listing). Virtual methods are all other instance methods, that is, public, protected, and package-private ones.
Accordingly, there are several different call instructions: invoke-static, invoke-super, invoke-direct, invoke-virtual, and invoke-interface. In addition there is invoke-XXX/range, which is used when the arguments do not fit the regular form, which can list at most five registers.
invoke-static
invoke-static {}, Lcom/disney/xx/UnlockHelper;->unlockCrankypack()Z
invoke-static calls a static method of a class. The Java equivalent is UnlockHelper.unlockCrankypack(). Note that the braces right after the opcode hold the registers for the call: the instance (if any) followed by the arguments. Since this method takes no parameters and is static, the braces {} are empty. Let's look at another example:
const-string v0, "fmodex"
invoke-static {v0}, Ljava/lang/System;->loadLibrary(Ljava/lang/String;)V
This calls static void System.loadLibrary(String) to load a native .so library; v0 passes the argument "fmodex".
invoke-super
Calls the parent class's implementation of a method; it is typically seen in overridden methods.
invoke-direct
Calls a direct method on an instance, that is, a private method or a constructor, such as:
invoke-direct {p0}, Lcom/disney/xx/XxActivity;->getGlobalIapHandler()Lcom/disney/config/GlobalPurchaseHandler;
Here, GlobalPurchaseHandler getGlobalIapHandler() is a private method defined in the XxActivity class.
invoke-virtual
Calls a virtual method, that is, a public, protected, or package-private instance method.
sget-object v0, Lcom/disney/xx/XxActivity;->shareHandler:Landroid/os/Handler;
invoke-virtual {v0, v3}, Landroid/os/Handler;->removeCallbacksAndMessages(Ljava/lang/Object;)V
Here v0 holds the shareHandler object of type Landroid/os/Handler;, and v3 is the Ljava/lang/Object; argument passed to removeCallbacksAndMessages.
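invoke-xxxxx/range
The regular invoke instructions can list at most five registers (including the this register for instance calls). When a call needs more registers than that, /range is appended to the opcode and the arguments are given as a contiguous register range, for example {v0 .. v5}.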
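The invoke instructions described in this section can be related back to Java source with a small sketch. All names here (Greeter, Base, InvokeKindsDemo) are invented for illustration, and the comments state the instruction a dex compiler would typically choose for each call, not a guaranteed output.

```java
// Hypothetical types for illustration only.
interface Greeter {
    String greet();
}

class Base {
    protected String describe() {
        return "base";
    }
}

final class InvokeKindsDemo extends Base implements Greeter {

    static String banner() {
        return "demo";
    }

    private String secret() {
        return "secret";
    }

    @Override
    protected String describe() {
        // Calling the parent implementation: typically invoke-super {p0}
        return super.describe() + "+child";
    }

    @Override
    public String greet() {
        return "hi";
    }

    String run() {
        String a = banner();   // static method: typically invoke-static {}
        String b = secret();   // private method: typically invoke-direct {p0}
        String c = describe(); // protected method: typically invoke-virtual {p0}
        Greeter g = this;
        String d = g.greet();  // call through an interface type: typically invoke-interface
        return a + "," + b + "," + c + "," + d;
    }
}
```

Creating the object with new InvokeKindsDemo() also involves invoke-direct, since constructors (<init>) are direct methods.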
Some readers may have noticed that the examples above all just call a function and never fetch its return value. In smali code, if the called function returns non-void, the result must be picked up with move-result (for a 32-bit primitive return type), move-result-wide (for long and double), or move-result-object (for an object return type):
const/4 v2, 0x0
invoke-virtual {p0, v2}, Lcom/disney/xx/XxActivity;->getPreferences(I)Landroid/content/SharedPreferences;
move-result-object v1
v1 represents the object of type SharedPreferences returned by calling this.getPreferences(0) method.
invoke-virtual {v2}, Ljava/lang/String;->length()I
move-result v2
v2 represents the int primitive type returned by String.length().
Example analysis
The sections above gave a first look at fields, method definitions, and method calls. The following example analyzes smali syntax further:
.method protected onDestroy()V
.locals 0
.line 79
invoke-super {p0}, Landroidx/appcompat/app/AppCompatActivity;->onDestroy()V
.line 80
invoke-virtual {p0}, Lcom/happy/learnsmali/BaseActivity;->removeCallbacks()V
.line 81
return-void
.end method
This is the familiar onDestroy() function. Look at the first statement in the function, .locals 0, which declares the number of local registers used by this function. Here it is 0 because the method body does not need any local registers (p0 is a parameter register and is not counted). If we add this.exited = true to the method, it has to be changed to:
.method protected onDestroy()V
.locals 1
.line 79
invoke-super {p0}, Landroidx/appcompat/app/AppCompatActivity;->onDestroy()V
.line 80
invoke-virtual {p0}, Lcom/happy/learnsmali/BaseActivity;->removeCallbacks()V
.line 81
const/4 v0, 0x1
iput-boolean v0, p0, Lcom/happy/learnsmali/BaseActivity;->exited:Z
.line 82
return-void
.end method
Because the modified onDestroy() function uses one local register, v0, .locals 0 is changed to .locals 1. You may also have noticed the .line directive, which gives the line number in the original source file that the following smali code corresponds to. When a program crashes while being debugged in Android Studio, the line numbers in the logcat stack trace come from these values. The directive is not required, but it is worth keeping for easier debugging.