20189224 2018-2019-2 "Password Security and New Technology" Lessons Learned Report

Course: "Password Security and New Technology"

Class: 1892
Name: Shi Xinyi
Student ID: 20189224
Class Teacher: Johnny
Compulsory / Elective: Elective

1 Course content summary

1.1 Teacher lectures

(1) Web Application Security - Teacher Zhang Jianyi

  • Common web vulnerabilities: SQL injection, cross-site scripting (XSS), cross-site request forgery (CSRF), web information disclosure, privilege issues, logic flaws, vulnerabilities in third-party programs, web server parsing vulnerabilities, weak passwords, SSRF
  • Applications of machine learning to web security: spear phishing (Spear Phishing, Catfishing), watering hole attacks (Watering Holes), lateral movement (Lateral Movement), covert channel detection (Covert Channel Detection), injection attacks (Injection Attacks), web Trojans (Webshell), credential theft (Credential Theft)
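
As an illustration of the first vulnerability in the list, here is a minimal sketch of SQL injection using Python's standard `sqlite3` module. The `users` table, the function names, and the payload are assumptions made up for this example; they do not come from the lecture.

```python
import sqlite3

def make_db():
    # In-memory database with a hypothetical users table (illustrative only).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")
    return conn

def login_vulnerable(conn, name, password):
    # BAD: user input is pasted into the SQL text, so it can change the query.
    query = (
        "SELECT COUNT(*) FROM users "
        f"WHERE name = '{name}' AND password = '{password}'"
    )
    return conn.execute(query).fetchone()[0] > 0

def login_safe(conn, name, password):
    # GOOD: placeholders keep user input as data, never as SQL syntax.
    query = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone()[0] > 0

if __name__ == "__main__":
    conn = make_db()
    payload = "' OR '1'='1"
    print(login_vulnerable(conn, "alice", payload))  # the check is bypassed
    print(login_safe(conn, "alice", payload))        # the payload is just a wrong password
```

The payload closes the quoted string and appends a condition that is always true, which is why the string-built query accepts it while the parameterized query does not.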

(2) Quantum Cryptography - Teacher Sun Ying

  • Background of quantum cryptography:
    symmetric cryptosystems (advantages: fast encryption, suitable for bulk data; disadvantages: key distribution and key management are hard, and there is no signature function);
    public-key cryptosystems (based on hard mathematical problems such as factoring large integers; advantages: they solve the key distribution and management problems and can be used for signatures; disadvantage: slow encryption);
    hybrid cryptosystems (a public-key cryptosystem distributes the session key, and a symmetric cryptosystem encrypts the data);
    security challenges (Shor's algorithm: a polynomial-time quantum algorithm for integer factorization, affecting cryptosystems built on that problem; Grover's algorithm: a fast search algorithm that speeds up key search, affecting symmetric ciphers such as DES and AES)
  • Basic physical concepts related to quantum communication:
    quantum (the difference between adjacent discrete values of a physical quantity in the microscopic world is called the quantum of that quantity; the term also refers to microscopic particles with specific properties, such as photons);
    quantum state (for comparison, classical information: a bit is 0 or 1 and can be represented by a high or low voltage level, etc.);
    quantum information (a qubit has the basis states |0> and |1>, and can also be in a superposition of the two);
    no-cloning theorem (an unknown quantum state cannot be cloned);
    uncertainty principle (an unknown quantum state cannot be measured exactly)
  • Basic model of quantum communication:
    information transfer (a quantum channel and a classical channel are normally used together; the quantum channel carries the quantum carriers, and the classical channel carries classical messages);
    eavesdropping detection (a randomly selected portion of the quantum carriers is used to compare initial and final states as the protocol specifies; eavesdropping inevitably disturbs the quantum states and thus introduces errors; once eavesdropping is found, the communication is terminated and the related data is discarded);
    error correction (correcting the errors in the key);
    privacy amplification (by compressing the key length, the partial key information an eavesdropper may have obtained is reduced to an arbitrarily small amount, yielding a secure key)
  • Research status of quantum communication:
    quantum key distribution, quantum secret sharing, quantum secure direct communication, quantum identity authentication, quantum coin flipping, quantum oblivious transfer;
    improving performance and efficiency: reusable bases, entanglement enhancement, two-photon, two-detector
    improving noise immunity: decoherence-free subspaces, quantum codes
    improving the attack resistance of practical systems: decoy-state, device-independent
    goals: higher rate, longer distance, stronger security
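
The transfer and eavesdropping-detection steps of the basic model can be sketched with a small classical simulation of a BB84-style key distribution. This is a toy model under stated assumptions: the lecture notes do not name BB84, the intercept-resend eavesdropper is one simple attack chosen for illustration, and the whole sifted key is compared here, whereas a real protocol would sacrifice only a random sample.

```python
import random

def bb84(n_photons, eavesdrop, rng):
    """Simulate a BB84-style exchange; return (sifted_key_length, error_rate)."""
    # Alice picks random bits and random bases (0 = rectilinear, 1 = diagonal).
    alice_bits = [rng.randint(0, 1) for _ in range(n_photons)]
    alice_bases = [rng.randint(0, 1) for _ in range(n_photons)]
    bob_bases = [rng.randint(0, 1) for _ in range(n_photons)]

    def measure(bit, prep_basis, meas_basis):
        # Same basis: the bit is read correctly. Different basis: random outcome.
        return bit if prep_basis == meas_basis else rng.randint(0, 1)

    bob_bits = []
    for bit, a_basis, b_basis in zip(alice_bits, alice_bases, bob_bases):
        if eavesdrop:
            # Intercept-resend: Eve measures in a random basis and resends her result.
            e_basis = rng.randint(0, 1)
            bit, a_basis = measure(bit, a_basis, e_basis), e_basis
        bob_bits.append(measure(bit, a_basis, b_basis))

    # Sifting over the classical channel: keep positions where the bases agree.
    sifted = [
        (a, b)
        for a, b, x, y in zip(alice_bits, bob_bits, alice_bases, bob_bases)
        if x == y
    ]
    if not sifted:
        return 0, 0.0
    errors = sum(1 for a, b in sifted if a != b)
    return len(sifted), errors / len(sifted)

if __name__ == "__main__":
    print(bb84(2000, eavesdrop=False, rng=random.Random(1)))
    print(bb84(2000, eavesdrop=True, rng=random.Random(1)))
```

Without an eavesdropper the sifted keys agree exactly; with the intercept-resend eavesdropper roughly a quarter of the sifted bits differ, which is the disturbance the detection step looks for.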

(3) A preliminary study of cryptanalysis and cipher design based on deep learning - Teacher Jin Xin

  • Cryptanalysis and machine learning
    machine learning model: sample x ---> model (function f(x)) ---> label y
    cryptanalysis model: plaintext x ---> key (function f(x)) ---> ciphertext y
    research trend: more and more cryptanalysis methods are starting to use machine learning techniques
  • Introduction to deep learning
    deep learning: combines low-level features into more abstract high-level representations (attribute categories or features), in order to discover distributed feature representations of the data
    deep neural networks: convolutional neural networks (CNN: Convolutional Neural Networks), recurrent neural networks (RNN: Recurrent Neural Networks), generative adversarial networks (GAN: Generative Adversarial Networks)
    deep learning application examples: human segmentation, gesture recognition, face segmentation and automatic makeup, visual question answering, "law dog", defeating human players at Dota 2, code cracking, detecting cheating in exam monitoring, automatic assessment of image aesthetics
  • Deep learning and cryptanalysis: side-channel attacks based on convolutional neural networks, plaintext recovery based on recurrent neural networks, password cracking based on generative adversarial networks, cryptographic primitive identification based on deep neural networks
  • Deep learning and cipher design
        security topics: communication security, computing security, storage security, key establishment and authentication, protection of cryptographic resources
        ---> requirements: secure collaborative communication and processing in an open, converged Internet of Things environment; cloud computing security under collaborative communication and processing; cryptography that stays secure against quantum computing

(4) Information Hiding - Teacher Xia Chao

(5) Blockchain Technology - Teacher Zhang Jianyi

  • Basic concepts of blockchain
    Block: in the Bitcoin network, data is permanently recorded in the form of files, which are called blocks (Block). A block is the unit in which transaction data is recorded, and many transactions are recorded in a single block.
    Chain: blocks are linked together in the manner of a linked list, and each block stores the hash value of the previous block. Only one block has no predecessor: the genesis block (the first block).
  • Blockchain techniques
    Peer-to-peer network: a blockchain can be understood as a decentralized distributed database. The database does not depend on any institution or administrator; its role is to store information. The data is maintained jointly by the nodes of the whole network, and anyone can join the blockchain network and become a node. When data is written at a node, that node broadcasts it to its neighbors, which broadcast it to their neighbors in turn, until the information has reached every node in the network. Finally, all nodes synchronize the data to ensure consistency.
    Consensus mechanisms: PoW, PoS, DPoS
    PKI public-key system and immutable data: data verifiability
    Reward system designed for cooperation: a non-cooperative game in which a cooperative Nash equilibrium is reached
  • The future of blockchain: money, contracts, cross balanced governance
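
The idea that each block stores the hash of the previous block can be sketched in a few lines of Python. This is a minimal illustration, not Bitcoin's actual block format: the field names, the JSON serialization, and the all-zero hash for the genesis block are assumptions made for the example, and there is no consensus or proof of work.

```python
import hashlib
import json

def block_hash(block):
    # Hash the block's contents deterministically (sorted keys).
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def add_block(chain, transactions):
    # Each new block stores the hash of the block before it.
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append(
        {"index": len(chain), "prev_hash": prev_hash, "transactions": transactions}
    )
    return chain

def is_valid(chain):
    # The chain is valid if every stored prev_hash matches the recomputed hash.
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )

if __name__ == "__main__":
    chain = []
    add_block(chain, ["genesis"])
    add_block(chain, ["alice->bob:5"])
    add_block(chain, ["bob->carol:2"])
    print(is_valid(chain))
    chain[1]["transactions"] = ["alice->bob:500"]  # tamper with an old block
    print(is_valid(chain))
```

Changing an old block changes its hash, so the next block's stored `prev_hash` no longer matches. That mismatch is what makes recorded data hard to alter unnoticed.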

(6) Vulnerability Mining Technology - Teacher Wang Zhiqiang

  • Common vulnerability mining techniques:
    manual testing (testers find vulnerabilities by manually analyzing and testing the target program; this is the most primitive mining method);
    patch comparison (vulnerabilities are located by comparing the differences between the code before and after a patch);
    program analysis (static and dynamic);
    binary auditing (when the source code is not available, the binary code is obtained by reverse engineering and security is assessed at the binary-code level);
    fuzzing (large amounts of abnormal test data are fed to the target, which is monitored for abnormal behavior in order to identify vulnerabilities)
  • Progress in vulnerability mining technology
    development direction: AI -> machine learning -> deep learning
    1) binary function identification: existing disassembly and analysis tools identify functions with low accuracy; a recurrent neural network (RNN) can be trained as a model for recognizing functions in binary programs.
    2) test generation: in software vulnerability mining, constructing test inputs with high code coverage or oriented toward specific vulnerabilities improves mining efficiency. Machine learning can be used to guide the generation of higher-quality test input samples.
    3) test screening
    4) path constraint solving: fuzzing focuses on screening test samples that cover new paths to use as seed files, but when mutating a seed file it does not make use of the program's data flow. Path constraint solving with symbolic execution is a more advanced vulnerability mining technique than fuzzing in this respect. Angora uses taint tracking to find the input bytes that affect each conditional branch, then uses gradient descent to solve the path constraints and generate mutated inputs.
    vulnerability exploitation examples
  • Vulnerability mining examples: router protocol vulnerability discovery, NFC vulnerability discovery
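
The fuzzing idea described above (feed the target many abnormal inputs and watch for abnormal behavior) can be sketched as a tiny mutation fuzzer. Everything here is hypothetical: `parse_record` is a toy target with a deliberately planted bug, and the single-byte mutation is the simplest possible strategy, far from what tools such as Angora do.

```python
import random

def parse_record(data):
    # Toy target with a deliberate bug: the first byte is used as an index
    # without being checked against the length of the input.
    length = data[0]
    return data[length]

def mutate(seed, rng):
    # Replace one randomly chosen byte of the seed with a random value.
    mutated = bytearray(seed)
    mutated[rng.randrange(len(mutated))] = rng.randrange(256)
    return bytes(mutated)

def fuzz(target, seed, iterations, rng):
    # Feed mutated inputs to the target and record the ones that raise.
    crashes = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            target(candidate)
        except Exception as exc:
            crashes.append((candidate, type(exc).__name__))
    return crashes

if __name__ == "__main__":
    found = fuzz(parse_record, b"\x03abcd", 500, random.Random(0))
    print(len(found), "crashing inputs, e.g.", found[:1])
```

Each recorded input is a reproducer that a tester can replay against the target. Coverage-guided fuzzers add feedback on top of this loop by keeping the mutated inputs that reach new paths as new seeds.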

1.2 Student reports

(1) The first group (Zhang Yuxiang, Bao Zheng Li): Finding Unknown Malice in 10 Seconds: Mass Vetting for New Threats at the Google-Play Scale

Contributions of the paper:
  • The authors developed a new technique called MassVet, which vets apps at large scale without needing to know what malware looks like or how it behaves.
  • They built the "DiffCom" analysis, whose algorithm maps the salient features of an app's UI structure or of a method's control-flow graph to a value that can be compared quickly.
  • They implemented MassVet on a stream processing engine and evaluated it on nearly 1.2 million apps from 33 app markets around the world, which is the scale of Google Play.
  • The experiments show that the technique vets an app within 10 seconds at a low error rate and outperforms all 54 scanners on VirusTotal (NOD32, Symantec, McAfee, etc.) in detection coverage, capturing more than 100,000 malicious apps, including more than 20 likely zero-day malware and malware with millions of installs.
Work Ideas

The core method


MassVet first processes all the apps on a market and builds two databases: one for view structures and one for methods. Both databases are sorted to support binary search, and they are used to vet new apps submitted to the market. Take a repackaged AngryBird as an example. Once it is uploaded to the market, the preprocessing stage automatically disassembles it into a small representation from which its interface structures and methods can be identified. Their features are mapped to v-cores (centroids of the view graphs) and m-cores (centroids of the control-flow graphs) by computing geometric centers. The app's v-cores are used first, to query the view database by binary search. If a match is found, that is, another app has a UI structure similar to the repackaged AngryBird, the repackaged app is compared with that app at the method level to identify the differences between them. These differences are then analyzed automatically to make sure that they are not ad libraries and are indeed suspicious; if so, they are reported to the market. If nothing is found, MassVet goes on to look up AngryBird's m-cores in the method database.
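
The lookup step above can be sketched as a binary search over a sorted list of core values. This is a loose illustration only: the weighted mean in `centroid` is a stand-in for the paper's v-core and m-core computation, and the app names, values, and tolerance are invented for the example.

```python
import bisect

def centroid(weighted_nodes):
    # Stand-in for a "core": the weighted mean of (position, weight) pairs.
    total_weight = sum(weight for _, weight in weighted_nodes)
    return sum(pos * weight for pos, weight in weighted_nodes) / total_weight

def build_index(core_by_app):
    # Sort once so that every later query can use binary search.
    pairs = sorted((core, app) for app, core in core_by_app.items())
    keys = [core for core, _ in pairs]
    names = [app for _, app in pairs]
    return keys, names

def find_similar(keys, names, core, tolerance):
    # Binary search for all stored cores within +/- tolerance of the query.
    lo = bisect.bisect_left(keys, core - tolerance)
    hi = bisect.bisect_right(keys, core + tolerance)
    return names[lo:hi]

if __name__ == "__main__":
    keys, names = build_index({"AngryBird": 2.50, "Notes": 7.10, "Maps": 4.00})
    print(find_similar(keys, names, 2.51, 0.05))  # a repackaged look-alike
    print(find_similar(keys, names, 5.50, 0.05))  # nothing similar on the market
```

Because the index is sorted, each query costs a logarithmic number of comparisons, which is what makes vetting at market scale feasible.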

(2) The second group (dry Feng, Li Yang): Spectre Attacks: Exploiting Speculative Execution

Summary of the paper:
  • Spectre attacks: attacks that mainly exploit a vulnerability in the CPU's speculative execution, leaving the victim unaware while the attack takes place.
  • Speculative execution: some newer processors can predict which instructions will be executed next and compute them in advance to speed up the whole process. The design idea is to make the high-probability case fast.
  • Deceiving the branch predictor (a code example):

    On entering the if statement, the CPU first checks whether the value of array1_size is in the cache; if not, it fetches it from slow main memory. By our design the cache has been flushed, so array1_size is not cached and always has to be fetched from slow memory. After the fetch, the condition evaluates to true, and the CPU looks up array1[x] and array2[array1[x] * 256] in the cache; normally they are not there either, so they are loaded from slow memory into the cache. After this has run several times with the condition always true, the next time array1_size has to be loaded from slow memory the CPU's speculative execution kicks in so as not to waste clock cycles. It assumes the condition is true, because it was true every time before (make the high-probability case fast), and directly executes the code that follows. This means that even if x is now out of bounds, array1[x] is still likely to be read from memory and array2[array1[x] * 256] is brought into the cache. By the time the CPU discovers the misprediction, we already have the information we need.

  • Attack process:

  • Attack Results:
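
The branch-predictor deception described above can be mimicked with a toy model in Python. This is not an exploit and says nothing about real hardware timing: the `ToyCPU` class, its streak-based predictor, and the set used as a cache are all assumptions invented to show the logic of train, flush, speculate, and probe.

```python
class ToyCPU:
    def __init__(self, memory, array1_size):
        self.memory = memory            # array1 followed by unrelated secret bytes
        self.array1_size = array1_size
        self.cache = set()              # indices of array2 that are currently cached
        self.taken_streak = 0           # crude branch-predictor state

    def victim(self, x):
        # Models: if (x < array1_size) y = array2[array1[x] * 256];
        in_bounds = x < self.array1_size
        predicted_taken = self.taken_streak >= 3
        if in_bounds or predicted_taken:
            # With a trained predictor the body also runs speculatively for an
            # out-of-bounds x; the cache line it touched stays cached even
            # though the architectural result is rolled back.
            self.cache.add(self.memory[x] * 256)
        self.taken_streak = self.taken_streak + 1 if in_bounds else 0

def leak_byte(cpu, secret_index):
    # 1) Train the predictor with in-bounds calls.
    for i in range(5):
        cpu.victim(i % cpu.array1_size)
    # 2) Flush the cache so only the speculative access remains visible.
    cpu.cache.clear()
    # 3) Call the victim with an out-of-bounds index.
    cpu.victim(secret_index)
    # 4) Probe: the one cached array2 index encodes the secret byte.
    (cached_index,) = cpu.cache
    return cached_index // 256

if __name__ == "__main__":
    cpu = ToyCPU(memory=[1, 2, 3, 4, ord("S")], array1_size=4)
    print(chr(leak_byte(cpu, secret_index=4)))
```

In a real attack the probe step is a timing measurement over the entries of array2, since a cached entry loads noticeably faster than an uncached one.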

(3) The third group (Yu Chao, Yang Chenxi): All Your GPS Are Belong To Us: Towards Stealthy Manipulation of Road Navigation Systems

Contributions of the paper:
  • To prove the feasibility of the attack, the authors first run controlled measurements with a portable GPS spoofer that they built and tested on a real car.
  • They design a search algorithm that computes, in real time, the GPS offset and the victim's route.
  • They evaluate the attack extensively with trace-driven simulations (600 taxi traces in Manhattan and Boston), and then validate the complete attack with real-world driving tests (attacking their own car).
  • They conduct a deceptive user study with participants in the United States and China using a driving simulator.
  • Together, these show that stealthy manipulation of road navigation systems is feasible.
Research ideas:
  • Four components: a HackRF One-based front end, a Raspberry Pi, a portable power source, and an antenna; a 10000 mAh power bank powers the whole system.
  • The front end is connected to an antenna with a frequency range of 700 MHz to 2700 MHz, which covers the civilian GPS band L1 (1575.42 MHz).
  • A Raspberry Pi 3B (quad-core 1.2 GHz Broadcom BCM2837 64-bit CPU, 1 GB RAM) is used as the central server.
  • GPS satellite signals are generated by the Wireless Attack Launch Box (WALB) software running on the Raspberry Pi.
  • By controlling the Raspberry Pi, either manually or with a script, real-time GPS location information can be fed in.
The results:

The attack achieved a high success rate (95%). Of the 40 participants, only one US participant and one Chinese participant recognized the attack. The remaining 38 participants all completed the four rounds of driving tasks and followed the navigation to the wrong destination.

Study limitations:
  • Attack limitations: the attack may be less effective in suburban areas, and it does not work on every user
  • User study limitations: the experiments used only one European city, the study was not large in scale, and only one route was tested

(4) The fourth group (Zhang Hongyu, Li Xi Bridge): With Great Training Comes Great Vulnerability: Practical Attacks against Transfer Learning

Contribution of the paper:

The authors propose a new adversarial attack against transfer learning, in which the Teacher model is a white box to the attacker and the Student model is a black box. The attacker knows the internal structure and all the weights of the Teacher model, but does not know the weights or the training dataset of the Student model.

Attack ideas:

  • Feed the target image (a dog) into the Teacher model and capture the output vector at layer K of the Teacher model.
  • Add a perturbation to the source image so that the perturbed source image (i.e., the adversarial sample), when fed into the Teacher model, produces a very similar output vector at layer K.
  • Since each layer of a feedforward network only observes the output of the previous layer, if the adversarial sample's output vector at layer K perfectly matches that of the target image, then no matter how the weights of the layers after layer K change, the sample will be misclassified into the same label as the target image.
Defensive approach:

Modify the Student model by updating its layer weights to find a new local optimum, widening the difference between it and the Teacher model while providing comparable or better classification performance.

(5) The fifth group (Prince hazel, Shi Xinyi): SafeInit: Comprehensive and Practical Mitigation of Uninitialized Read Vulnerabilities

Contributions of the paper:
  • Proposes SafeInit, a compiler-based solution
  • that automatically mitigates reads of uninitialized values by ensuring initialization on both the stack and the heap.
  • Proposes optimizations that reduce the overhead of the solution to a minimal level (<5%) and that can be implemented directly in a modern compiler; the SafeInit prototype is built on clang and LLVM and can be applied to most real C/C++ applications without any additional manual work.
  • Evaluates the solution on CPU-intensive workloads, I/O-intensive workloads, and the Linux kernel, and verifies that it successfully mitigates existing vulnerabilities
LLVM architecture


LLVM can be understood in a narrow sense and a broad sense. LLVM in the broad sense refers to the entire compiler architecture, including the front end, the back end, the optimizer, and its many libraries and modules. LLVM in the narrow sense focuses on the compiler back-end functions (code generation, code optimization, etc.) as a series of modules and libraries. Clang is a C/C++/Objective-C/Objective-C++ compiler written in C++ and based on LLVM. It is a highly modular, lightweight compiler with fast compilation and a small memory footprint, and it is convenient for secondary development. The relationship between LLVM and Clang is as follows: Clang roughly corresponds to the compiler front end, which mainly handles language-specific, machine-independent analysis; the optimizer and back end of the compiler are the LLVM back end mentioned above (LLVM in the narrow sense); and the overall architecture is LLVM in the broad sense.

SafeInit architecture


As the figure shows, after the compiler receives the C/C++ files, its front end converts the source files into the intermediate representation (IR). The initialization pass then runs, the code is optimized by the SafeInit optimizer together with the existing compiler optimizations, dead stores are eliminated, and the result is combined with the hardened allocator to produce the final binary. What SafeInit adds to this process is the initialization of all variables, the optimizer, and the hardened allocator, in order to avoid or mitigate uninitialized values. Finally, the SafeInit optimizer provides non-invasive transformations and optimizations that run alongside the conventional compiler optimizations (themselves modified where necessary), plus a final component that extends the existing "dead store elimination" optimization. Building on the initialization pass and the allocator, these perform more extensive removal of unnecessary initialization code, showing that the runtime overhead of the solution can be minimized.

(6) The sixth group (Guo Kaishi): Manipulating Machine Learning: Poisoning Attacks and Countermeasures for Regression Learning

Contributions of the paper:
  • The first systematic study of poisoning attacks on linear regression models and their countermeasures.
  • Proposes a new optimization framework for poisoning attacks, and a fast statistical attack that requires very little knowledge of the training process.
  • Takes a principled approach to designing a new robust defense algorithm, which largely outperforms existing robust regression methods.
  • Extensively evaluates the proposed attack and defense algorithms on several datasets from health care, loan assessment, and real estate.
  • Demonstrates the real-world implications of poisoning attacks in a health care case study.
System structure:

(7) The seventh group (Yang Jingyi): Convolutional Neural Networks for Sentence Classification

Contribution of the paper:

Text is converted into an image-like matrix so that a CNN can carry out the text classification task, making use of word order and richer semantic features.

Model Introduction

  • Input matrix
    The size of the CNN input matrix depends on two factors: A. the length of the sentence (the number of words it contains); B. the length of each word's embedding. Assuming the input X contains m words and each word embedding (Word Embedding) has length d, the input is a two-dimensional m × d matrix. For "I like this movie very much!", with the embedding length set to 5, the input is a 7 × 5 matrix.
  • Convolution process
    A. filter_size is the height of the convolution kernel, i.e., the number of words it covers, which determines how many adjacent words' ordering is taken into account; the code uses [3, 4, 5].
    B. embedding_size is the dimension of the word vectors. Each convolution produces a column vector representing the features that the kernel extracts from the sentence. There are as many features as there are convolution kernels.
  • Pooling process
    The paper applies max pooling to the features extracted by each filter to reduce their dimension. The result of each convolution becomes a single feature value, and these values together form the final feature vector.
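
The shapes involved in the convolution and pooling steps can be checked with a small pure-Python sketch. It follows the description above (kernels of height 3, 4, and 5 over a 7 × 5 input, then max pooling) but is only an illustration: the random values stand in for trained embeddings and kernels, and the bias term and nonlinearity of a real model are left out.

```python
import random

def conv_feature_map(sentence, kernel):
    # sentence: m x d matrix (one row per word); kernel: h x d matrix that
    # covers h adjacent words. Sliding it down gives m - h + 1 values.
    m, h = len(sentence), len(kernel)
    feature_map = []
    for start in range(m - h + 1):
        window = sentence[start:start + h]
        total = sum(
            w * k
            for word_row, kernel_row in zip(window, kernel)
            for w, k in zip(word_row, kernel_row)
        )
        feature_map.append(total)
    return feature_map

def sentence_features(sentence, kernels):
    # Max pooling: each kernel contributes exactly one value.
    return [max(conv_feature_map(sentence, kernel)) for kernel in kernels]

if __name__ == "__main__":
    rng = random.Random(0)
    m, d = 7, 5  # "I like this movie very much !" with embedding length 5
    sentence = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(m)]
    kernels = [
        [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(h)]
        for h in (3, 4, 5)
    ]
    print([len(conv_feature_map(sentence, k)) for k in kernels])
    print(len(sentence_features(sentence, kernels)))
```

A kernel of height h yields a feature map of length m - h + 1, and pooling reduces each map to one number, so the final feature vector has one entry per kernel regardless of sentence length.
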
Experimental results
  • CNN-rand: all word vectors are randomly initialized and then optimized as parameters during training;
  • CNN-static: all word vectors come directly from Google's unsupervised word2vec tool and are kept fixed;
  • CNN-non-static: all word vectors come from Google's unsupervised word2vec tool, but are fine-tuned during training;
  • CNN-multichannel: a mix of CNN-static and CNN-non-static, i.e., two types of input.

2 Thoughts and experiences

The course "Password Security and New Technology" broadened my horizons and showed me how many security-related technologies and research directions there are. Through this course I found that the integration of artificial intelligence with security technology is far hotter than I had imagined. The teachers' lectures gave us a better understanding of different fields, and the advice Teacher Wang gave with the assignments gave us a new understanding of how to study more effectively. Thank you, teachers, for your hard work! Finally, the task of reading and reproducing a top-conference paper taught me a great deal, from choosing a topic and reading the article to debugging the code. The topic we chose involves architecture and compilers, and the reading was very difficult. By consulting references and with help from other students, we came to understand the paper. Although we did not succeed in reproducing it in the end, I benefited a lot from this study.

3 Suggestions and comments on this course

  • Before class, the topics students want to learn about could be collected
  • The lessons could include a bit more practical demonstration and interaction

Origin www.cnblogs.com/20189224sxy/p/11032722.html