Gaokao score-lookup mini program: a development summary

This article first appeared on my WeChat official account.

Original link: https://mp.weixin.qq.com/s/dIn1YsM_i-o76BVAIrPAhA

Tomorrow is the annual gaokao (college entrance examination), and the number of candidates this year has hit a record high of 10.31 million. As a programmer who sat the exam three years ago, I worked overtime to put together a gaokao score-lookup mini program in time for the exam, as a small gift from a veteran to this year's candidates. For an introduction to the mini program itself, see my earlier article:

https://mp.weixin.qq.com/s/f78npxbhNrwjmtUzX5VhTQ

Today I mainly want to talk about the principles and implementation details.

# Data Sources

The mini program's backend holds nearly 300,000 records in total: the admission scores of all major colleges and universities, for every batch and for both the arts and science tracks, from 2008 to 2017; plus the 2008-2018 batch control lines, from the early-admission batch down to the vocational batch, for the provinces using New Curriculum Standard papers I, II and III as well as the self-proposition provinces. Fairly comprehensive, all told.


All the data were crawled from the admissions pages of the various colleges and the gaokao-related sites of each province. Because of the huge volume, and to speed things up, I used ThreadPoolExecutor from the concurrent.futures module (in the standard library since Python 3.2) to build a thread pool and run multiple crawling tasks concurrently.
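The crawling loop looked roughly like this. This is a minimal sketch: `fetch_one` is a stand-in for the real page scraper, and the URL list is hypothetical, so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_one(url):
    # Stand-in for the real scraper: in the actual project this would
    # download and parse one admissions page. Here it just echoes the URL.
    return {"url": url, "rows": []}

urls = ["https://example.com/page/%d" % i for i in range(10)]  # hypothetical

results = []
# A pool of 8 worker threads crawls the pages concurrently;
# as_completed yields each future as soon as its task finishes.
with ThreadPoolExecutor(max_workers=8) as pool:
    futures = [pool.submit(fetch_one, u) for u in urls]
    for fut in as_completed(futures):
        results.append(fut.result())

print(len(results))
```

Because the futures complete in whatever order the threads finish, the results arrive unordered; that is fine here since every record carries its own identifying fields.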

For the database I chose PostgreSQL, billed as the world's most advanced open source database. All the data live in a database named gaokao, which contains two tables: university (college admission scores) and province (provincial batch control lines).

The `university` table:

| Field | Meaning |
| --- | --- |
| name | school name |
| stu_loc | province the students are recruited from |
| stu_wl | arts or science track |
| pc | admission batch |
| year | year |
| score | average admission score |

The `province` table:

| Field | Meaning |
| --- | --- |
| year | year |
| stu_loc | candidates' province |
| stu_wl | arts or science track |
| pc | batch |
| control | minimum control line of the batch |
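For reference, the two tables can be declared roughly as follows. This is a sketch using sqlite3 so it runs standalone; the production database was PostgreSQL, and the exact column types there may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# university: one row per (school, source province, track, batch, year)
cur.execute("""
    CREATE TABLE university (
        name    TEXT,     -- school name
        stu_loc TEXT,     -- province the students are recruited from
        stu_wl  TEXT,     -- arts or science track
        pc      TEXT,     -- admission batch
        year    INTEGER,  -- year
        score   REAL,     -- average admission score
        PRIMARY KEY (name, stu_loc, stu_wl, pc, year)
    )
""")

# province: minimum control line per province / track / batch / year
cur.execute("""
    CREATE TABLE province (
        year    INTEGER,
        stu_loc TEXT,
        stu_wl  TEXT,
        pc      TEXT,
        control INTEGER,
        PRIMARY KEY (stu_loc, stu_wl, pc, year)
    )
""")
conn.commit()

tables = [r[0] for r in cur.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)
```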

With 300,000 records crawled concurrently from multiple sites, dirty data are unavoidable. Before insertion, incomplete records are filtered out first: for example, if a row destined for the university table is missing the pc field, that record is discarded. The most serious problem is duplicated data. My solution: treat (name, stu_loc, stu_wl, pc, year) as the primary key of the university table, since in practice a college can have only one average admission score per track, per batch, per year. Before each insert, query whether the record already exists; only if it does not, perform the insert and then commit the transaction.
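The check-then-insert logic can be sketched like this. It is illustrated with sqlite3 instead of PostgreSQL so the example runs standalone; the real script used psycopg2 with the same SQL shape, and the sample row is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE university "
            "(name TEXT, stu_loc TEXT, stu_wl TEXT, pc TEXT, year INTEGER, score REAL)")

def insert_row(row):
    # Discard incomplete records: every key field must be present.
    if any(row.get(k) in (None, "") for k in ("name", "stu_loc", "stu_wl", "pc", "year")):
        return False
    # (name, stu_loc, stu_wl, pc, year) acts as the primary key: a college
    # has only one average score per track, per batch, per year.
    cur.execute(
        "SELECT 1 FROM university WHERE name=? AND stu_loc=? AND stu_wl=? AND pc=? AND year=?",
        (row["name"], row["stu_loc"], row["stu_wl"], row["pc"], row["year"]))
    if cur.fetchone():
        return False  # duplicate: skip it
    cur.execute("INSERT INTO university VALUES (?,?,?,?,?,?)",
                (row["name"], row["stu_loc"], row["stu_wl"],
                 row["pc"], row["year"], row.get("score")))
    conn.commit()     # commit only after a successful insert
    return True

row = {"name": "清华大学", "stu_loc": "安徽", "stu_wl": "理科",
       "pc": "本一批", "year": 2017, "score": 673}
print(insert_row(row))  # first insert succeeds
print(insert_row(row))  # exact duplicate is skipped
```

In production, a UNIQUE constraint plus `INSERT ... ON CONFLICT DO NOTHING` would push the same check into PostgreSQL itself, but the explicit query mirrors the approach described above.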

# Building the backend

Once the 300,000 records were in hand, I planned to implement the backend with Flask + PostgreSQL, and I even deployed it on an Alibaba Cloud server. The mini program passed testing in the developer tools, but hit big trouble at release time: a published mini program cannot reach its backend by bare IP address; it must use an ICP-registered domain name. Buying a domain is not much trouble, but the ICP registration is time-consuming, taking more than a week, and there were fewer than five days left before the gaokao. Just as I was at a loss, I stumbled across mini program cloud development. The official introduction reads:

Developers can use cloud development to build WeChat Mini Programs and Mini Games, using cloud capabilities without having to set up a server.

Cloud development gives developers complete native cloud support and WeChat service support, de-emphasizing backend maintenance and operations. Without building a server, you develop your core business logic against the APIs the platform provides, enabling fast release and iteration. This capability is also compatible with the cloud services developers already use; the two are not mutually exclusive.

In other words, as long as you import your data into the mini program's built-in backend, you can access it through the platform's APIs. I had previously looked at third-party backends such as LeanCloud, and did not expect mini programs to integrate these capabilities natively now. You have to hand it to Tencent.

So the next task was mainly to import the data into the mini program's cloud backend, which supports importing JSON or CSV. I wrote a script to export the data from the local database to a JSON file:

import psycopg2
import json

# Connect to the PostgreSQL database; the password is hidden for privacy
conn = psycopg2.connect(database="gaokao", user="postgres", password="*******", host="127.0.0.1", port="5432")
cur = conn.cursor()

cur.execute('select stu_loc,year,stu_wl,pc,control from province')
result = []
query_res = cur.fetchall()
for i in query_res:
	item = {}
	item['stu_loc'] = i[0]
	item['year'] = i[1]
	item['wl'] = i[2]
	item['pc'] = i[3]
	item['score'] = i[4]
	result.append(item)
# indent=2 controls the indentation of the JSON output
# ensure_ascii=False keeps Chinese characters human-readable
with open("province.json", 'w', encoding="utf-8") as f:
	f.write(json.dumps(result, indent=2, ensure_ascii=False))

One pitfall worth explaining: the JSON format the mini program backend expects differs slightly from ordinary JSON. The file must not be wrapped in [ and ], and there must be no commas between the {...} records; each record stands on its own line (essentially the JSON Lines format).


Opening the exported JSON file in Notepad++, this can be solved with find-and-replace: delete the [ and ], and replace every "}," with "}".
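Alternatively, the export script can write the required format directly: dump each record as a standalone JSON object on its own line, with no enclosing array and no separating commas. A sketch, with made-up sample rows in the shape of the province table:

```python
import json

records = [  # made-up sample rows
    {"stu_loc": "安徽", "year": 2017, "wl": "理科", "pc": "本一批", "score": 487},
    {"stu_loc": "安徽", "year": 2017, "wl": "文科", "pc": "本一批", "score": 515},
]

# One JSON object per line: no [ ], no commas between records.
lines = [json.dumps(r, ensure_ascii=False) for r in records]
with open("province.json", "w", encoding="utf-8") as f:
    f.write("\n".join(lines))

print(lines[0])
```

This skips the Notepad++ post-editing step entirely, since each line is already a standalone object the cloud console can import.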

After this modification, import the JSON file through the mini program's backend console, and the backend is basically complete.

# Writing the mini program front end

About the front end, I mainly want to share two lessons. The first is page layout, for example the interface below.

At first I had no idea how to achieve this effect; eventually the idea came from custom modal popups. Initially, the layout areas corresponding to the region/college dropdowns are hidden in the wxml file via hidden = true. When the user taps the region/college dropdown, set its hidden to false; at the same time, if any other dropdown's layout currently has hidden set to false, set it back to true so that all the other layouts are hidden. These true/false values are, of course, modified dynamically via setData() in the js file, which pushes the changed data from the data layer to the view layer.

The second is a quirk (arguably a bug) of the mini program's native cloud development: a single backend query returns at most 20 records. To fetch all matching results, two problems must be solved. The first is easy to think of: after fetching the first 20, skip 20 and take the next 20; for the third batch skip 40; and so on. The second is deadlier: the query API delivers results in an asynchronous callback, so to keep the data complete, the second query must be issued inside the first query's callback, the third inside the second's, and so on. But you cannot know in advance how many queries are needed, and therefore how many levels to nest, plus there is the annoying problem of the same variables being overwritten. This is callback hell. To solve it, we need to turn this asynchronous code into something that reads synchronously, using async/await:

Put runtime.js into a suitable folder in the project, then import it at the top of the page's js file (adjust the relative path to wherever you placed the file):

const regeneratorRuntime = require('./runtime');

runtime.js download: https://github.com/inspurer/CampusPunchcard/blob/master/runtime.js

The following example, distilled from my actual business logic, shows the complete pattern:

// The query may be slow, so it's best to show a loading animation first
wx.showLoading({
  title: 'Loading',
})
// First count how many records match
const countResult = await db.collection('province').where({
  stu_loc: name,
  pc: pici,
}).count()
const total = countResult.total
// The cloud database returns at most MAX_LIMIT (20) records per query,
// so compute how many batches are needed
const batchTimes = Math.ceil(total / MAX_LIMIT)
for (let i = 0; i < batchTimes; i++) {
  // Skip the records already fetched and take the next page;
  // await flattens what would otherwise be nested callbacks
  const res = await db.collection('province').where({
    stu_loc: name,
    pc: pici,
  }).skip(i * MAX_LIMIT).limit(MAX_LIMIT).get()
  for (let j = 0; j < res.data.length; j++) {
    newResult.push({
      code: i * MAX_LIMIT + j,
      name: res.data[j].stu_loc,
      year: res.data[j].year,
      wl: res.data[j].wl,
      pc: res.data[j].pc,
      score: res.data[j].score,
    })
  }
}
// Render the results; hasdataFlag drives the "no data" hint in the view
that.setData({
  hasdataFlag: newResult.length !== 0,
  resultData: newResult,
})
// Hide the loading animation
wx.hideLoading()

Those are the main ideas and lessons from developing this mini program; criticism and suggestions are welcome.


Source: blog.csdn.net/ygdxt/article/details/92011679