[Paper Summary] Demystifying Illegal Mobile Gambling Apps

Demystifying Illegal Mobile Gambling Apps

Introduction

This is a summary of the paper "Demystifying Illegal Mobile Gambling Apps", published at WWW 2021. The authors are Yuhao Gao, Haoyu Wang, Li Li, Xiapu Luo, Guoai Xu, and Xuanzhe Liu.

The paper is a measurement study of illegal gambling apps in China.

Background knowledge

SAN (Subject Alternative Name) is an extension of X.509 certificates that lists the additional host names (and IP addresses) for which the certificate is valid. A single HTTPS certificate can therefore cover many domain names, and domains that share one certificate are likely to be run by the same operator. This is the sense in which the paper uses SAN; it is unrelated to the storage term "Storage Area Network", which shares the acronym.

Autonomous system (AS): On the Internet, an autonomous system is a network unit that decides on its own which routing protocol to use internally. It can be a single network or a group of networks controlled by one or more network administrators, and it is managed as a single unit (for example a university, an enterprise, or a company). An autonomous system is sometimes called a routing domain. Each autonomous system is assigned a globally unique number, the autonomous system number (ASN).

CNAME record (alias record): This record type lets you map multiple names to the same computer. It is typically used for a computer that provides both WWW and MAIL services. For example, a computer named "host.mydomain.com" (A record) provides both WWW and MAIL services. To make these services easier to reach, two aliases (CNAME) can be set for this computer: WWW and MAIL.

Dataset source

The authors first identify illegal online gambling sites and then collect illegal gambling apps from those sites.

Online gambling websites: In cooperation with a major Chinese ISP, the authors obtained the DNS request data of all users in a major city from August 2019 to January 2020, and retrieved gambling domains from this data set by keyword.
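
The keyword retrieval step can be sketched roughly as follows. This is only an illustration, not the authors' pipeline: the log format (one queried domain per record), the keyword list, and the function name `match_gambling_domains` are assumptions made for this example.

```python
from collections import Counter

# Hypothetical keywords; the paper's actual keyword list is not given in this summary.
KEYWORDS = ("casino", "lottery")


def match_gambling_domains(queried_domains, keywords=KEYWORDS):
    """Count DNS queries whose domain name contains any of the keywords."""
    hits = Counter()
    for domain in queried_domains:
        # Normalize case and drop the trailing dot of fully qualified names.
        name = domain.strip().lower().rstrip(".")
        if any(keyword in name for keyword in keywords):
            hits[name] += 1
    return hits


# Made-up sample records standing in for the ISP's DNS request data.
sample_log = [
    "www.example-casino.com.",
    "mail.example.org",
    "WWW.Example-Casino.com",
    "lottery.example.net",
]
candidates = match_gambling_domains(sample_log)
```

A keyword hit only yields a candidate domain; substring matching will also catch unrelated names, so the candidates still need to be verified.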

Illegal gambling apps: A semi-automated method is used to obtain gambling apps from the gambling sites. Some sites offer direct downloads, while others redirect to hidden sites.

Measurement angles

1. Measurement of real-world prevalence of illegal gambling sites

1) Relationship measurement by domain name and application:

The relationship graph contains four kinds of nodes: gambling apps (blue), gambling websites (green), category 1 download services (red), whose domain name is the same as that of the gambling website, and category 2 download services (orange), whose domain name differs from that of the gambling website. An edge between a gambling site and a category 2 download service indicates that the site uses that service to distribute gambling apps. An edge between a download service (category 1 or category 2) and a gambling app indicates that the app was downloaded from that service.
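
The two download-service categories and the edges described above can be sketched like this. It is a simplified illustration under stated assumptions: comparing the last two labels of a host name is a crude stand-in for a proper registered-domain comparison (a public suffix list would be needed for suffixes such as .com.cn), and all host and package names are made up.

```python
def site_key(hostname):
    """Crude stand-in for the registered domain: the last two labels."""
    labels = hostname.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def download_category(gambling_site, download_host):
    """Return 1 if the download service shares the site's domain, else 2."""
    return 1 if site_key(gambling_site) == site_key(download_host) else 2


def build_edges(records):
    """records: iterable of (gambling_site, download_host, app_package)."""
    edges = set()
    for site, host, app in records:
        if download_category(site, host) == 2:
            edges.add((site, host))  # site uses a category 2 service
        edges.add((host, app))       # app was downloaded from this service
    return edges


# Hypothetical observations: one category 1 and one category 2 download.
records = [
    ("www.site-a.example", "dl.site-a.example", "com.example.app1"),
    ("www.site-b.example", "files.cdn-x.example", "com.example.app2"),
]
edges = build_edges(records)
```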

2) Top 10 gambling domain names

3) Abuse of third-party application service channels

The download sites were crawled with a headless browser and then verified manually. Some of the download services turned out to be general-purpose services that gambling apps abuse as download channels.

2. Characteristics of Gambling Apps

1) Website structure

Angle 1: Connected server addresses

Method: First, DroidBot is used to dynamically explore the UI of each gambling app while the app runs on a real phone. TCPdump captures the network traffic during the run to find the server addresses the app connects to. Public servers are then filtered out, and the remaining servers are analyzed. To filter public servers, the authors collect public domain names by dynamically exploring a large set of Android apps, and combine them with Alexa's top 10,000 domain names to filter the domain names collected from the gambling apps.
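
The public-server filtering step might look like the sketch below. It assumes that a host counts as public when it equals, or is a subdomain of, an entry in one of the public lists; the summary does not say how the matching was actually done. `hjcxapix.com` and `w1.vip66888.com` are taken from the example later in this post, while the public lists and the other hosts are made up.

```python
def is_public(host, public_domains):
    """True if the host or any of its parent domains is in the public set."""
    labels = host.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in public_domains for i in range(len(labels)))


def filter_public(observed_hosts, *public_lists):
    """Drop hosts covered by any public list; return the rest, sorted."""
    public = set().union(*public_lists)
    return sorted(h for h in set(observed_hosts) if not is_public(h, public))


# Hypothetical stand-ins for the Alexa top 10,000 list and for the domains
# collected from a large corpus of ordinary Android apps.
alexa_top = {"google.com"}
common_app_domains = {"example-sdk.com"}

observed = ["hjcxapix.com", "www.google.com", "w1.vip66888.com", "cdn.example-sdk.com"]
gambling_servers = filter_public(observed, alexa_top, common_app_domains)
```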

Gambling apps were found to usually connect to many different server addresses (8 on average), and their communication process is comparatively complex, as the following example shows.

The gambling app com.hmobile.core will first connect to hjcxapix.com during app initialization and get a list of app settings. When a user wants to log in, it requests the real login function URL w1.vip66888.com from www.hjcvip.net:844, and sends it the account information. After successful login, a list of gambling games will be returned from w1.vip66888.com. The user can then select a game and the app will request the main game URL gci.hjcvg.com and its resource loading URL gc.vpcdn.com from www.hjcvip.net:844.
Angle 2: Domain Analysis

Distribution of Top Level Domains in Gambling Applications
Top 10 ASNs of gambling application servers

Method: Use Qihoo 360 and VirusTotal to collect the relevant IP addresses, and then use an IP-to-ASN mapping table to obtain the ASN of each address.
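
An IP-to-ASN lookup of this kind can be sketched with Python's standard `ipaddress` module. The mapping table below is entirely hypothetical (documentation address ranges and documentation ASNs); a real table would come from routing data, and the summary does not say which one the authors used.

```python
import ipaddress
from collections import Counter

# Hypothetical mapping table: (prefix, ASN).
PREFIX_TABLE = [
    ("198.51.100.0/24", 64500),
    ("203.0.113.0/24", 64501),
    ("203.0.113.128/25", 64502),
]


def build_table(rows):
    """Parse the textual prefixes once so lookups can reuse them."""
    return [(ipaddress.ip_network(prefix), asn) for prefix, asn in rows]


def ip_to_asn(ip, table):
    """Return the ASN of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(ip)
    best = None
    for net, asn in table:
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, asn)
    return best[1] if best else None


table = build_table(PREFIX_TABLE)
server_ips = ["198.51.100.7", "203.0.113.5", "203.0.113.200", "203.0.113.201"]
asn_counts = Counter(ip_to_asn(ip, table) for ip in server_ips)
top_asns = asn_counts.most_common(10)
```

The linear scan is fine for a sketch; a table with many prefixes would call for a prefix tree.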

Top 10 application server registrars

Top 10 email addresses of application server registrants

2) Malicious behavior

Method: Use VirusTotal to identify malicious apps, and then use AVClass to identify their malware families. 56% of the gambling apps show malicious behavior.
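
To give an idea of how a family name can be derived from many antivirus labels, here is a heavily simplified plurality vote in the spirit of AVClass. It is not AVClass itself: the generic-token list, the `min_votes` threshold, and the sample labels (including the family name "fakefam") are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical list of tokens that carry no family information.
GENERIC = {"android", "androidos", "trojan", "riskware", "adware",
           "malware", "generic", "variant"}


def family_vote(av_labels, min_votes=2):
    """Return the most common non-generic token across labels, or None."""
    votes = Counter()
    for label in av_labels:
        tokens = set(re.split(r"[^a-z0-9]+", label.lower()))
        # Keep informative tokens only; a set means one vote per engine.
        tokens = {t for t in tokens
                  if len(t) >= 4 and not t.isdigit() and t not in GENERIC}
        votes.update(tokens)
    if not votes:
        return None
    family, count = votes.most_common(1)[0]
    return family if count >= min_votes else None


# Made-up labels as different engines might report them for one app.
labels = [
    "Android.Trojan.FakeFam.A",
    "Trojan/AndroidOS.fakefam",
    "Riskware.Android.Gen",
    "Adware.FakeFam.b",
]
family = family_vote(labels)
```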

Top 10 Malicious Gambling App Ranking
3) Abuse of third-party services

Top 10 abused third-party domain names, third-party libraries, and CNAMEs.

4) Payment service

The process of third-party payment and fourth-party payment
Manual analysis of 10 gambling applications
Payment transaction platform
3. Infer the underground activities behind illegal gambling applications, so as to discover more such applications

1) Application cluster

The authors find that many gambling apps share the same UI structure. They therefore propose a cluster analysis of gambling apps based on code-level similarity and developer signatures to analyze the relationships between the apps.

Method: Use FSquaDRA2, an app clone detection tool based on code similarity and resource similarity, to compare all apps pairwise.
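
The idea of pairwise comparison followed by clustering can be sketched as below. This is not FSquaDRA2: it assumes each app is represented by a set of file or code hashes, measures similarity with the Jaccard index, and links any pair above a threshold. The threshold of 0.6 and all app names and hashes are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_apps(features, threshold=0.6):
    """features: dict mapping app name -> set of hashes. Returns clusters."""
    parent = {app: app for app in features}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Link every pair whose similarity reaches the threshold.
    for a, b in combinations(features, 2):
        if jaccard(features[a], features[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for app in features:
        clusters.setdefault(find(app), []).append(app)
    # Largest clusters first, names sorted inside each cluster.
    return sorted((sorted(c) for c in clusters.values()),
                  key=lambda c: (-len(c), c))


features = {
    "app1": {"h1", "h2", "h3", "h4", "h5"},
    "app2": {"h1", "h2", "h3", "h4", "h6"},
    "app3": {"h1", "h2", "h3", "h4", "h5", "h7"},
    "app4": {"x1", "x2"},
}
clusters = cluster_apps(features)
```

Note that app2 and app3 end up in the same cluster even though their direct similarity (4/7) is below the threshold, because both are similar enough to app1.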

Top 10 gambling app clusters. #APP is the number of gambling apps in the cluster, #Cert is the number of developer certificates used in the cluster, %Top-1 Cert is the proportion of apps signed with the most common certificate, #Prefix is the number of distinct package name prefixes, and %Top-1 Prefix is the proportion of apps that share the most common package prefix.
2) Identify new gambling apps

Inference based on HTTPS certificates

Method: Use VirusTotal to collect the latest certificate information for all gambling domain names, extract the SAN (Subject Alternative Name) data from the certificates, and identify related domain names by analyzing this data. A total of 53,749 websites found this way could be accessed successfully. Of these, 1,000 were selected for manual verification, and 961 (96.1%) were found to be gambling websites. Next, a feature provided by VirusTotal was used to trace the apps that communicate with these gambling servers, which identified 16,973 apps in total. Of these, 1,000 were downloaded from VirusTotal for manual verification, and 879 (87.9%) were actually gambling apps.
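
The SAN-based expansion can be sketched as follows. The sketch assumes each certificate is available in the dictionary layout that Python's `ssl.SSLSocket.getpeercert()` returns, where `subjectAltName` is a tuple of `(type, value)` pairs; the authors obtained certificate data from VirusTotal, whose format is not described in this summary. The function names and all domain names are hypothetical.

```python
def san_domains(cert):
    """Extract DNS names from a certificate dict, stripping wildcards."""
    names = set()
    for kind, value in cert.get("subjectAltName", ()):
        if kind != "DNS":
            continue  # skip IP addresses and other entry types
        name = value.lower()
        if name.startswith("*."):
            name = name[2:]
        names.add(name)
    return names


def expand_by_san(known_domains, certs_by_domain):
    """Return domains that share a certificate with a known gambling domain."""
    candidates = set()
    for domain in known_domains:
        cert = certs_by_domain.get(domain)
        if cert:
            candidates |= san_domains(cert)
    return candidates - set(known_domains)


# Made-up certificate for one known gambling domain.
certs_by_domain = {
    "seed.example": {
        "subjectAltName": (
            ("DNS", "seed.example"),
            ("DNS", "*.other.example"),
            ("DNS", "third.example"),
            ("IP Address", "192.0.2.10"),
        )
    }
}
new_candidates = expand_by_san({"seed.example"}, certs_by_domain)
```

As in the paper's workflow, domains found this way are only candidates and still have to be checked, which is what the manual verification of 1,000 sampled websites does.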

Inference based on application developer's signature

Method: Collect the developer signatures of the gambling apps, and then use these signatures to search the apps published on Koodous for other apps signed by the same developers.

Origin blog.csdn.net/Ohh24/article/details/128612139