Supplements

I have long meant to write a brief summary of what I gained over the past year, but I always found reasons to put it off. On the one hand these things felt insignificant and unsystematic; on the other hand I am preparing for the graduate school entrance exam. Still, the idea of summarizing stayed at the back of my mind and surfaced from time to time, so in the end I "wasted" a weekend writing this post.
That is why this post is titled Supplements.

The blogger is a junior majoring in software engineering who spent most of the past year learning Python web development. This post mainly covers the problems I ran into, along with my impressions and understanding. Now I once again stand at a fork in life.

This post covers the following:

  • Language and framework (Python + Django)
  • Problems encountered with the database
  • Problems encountered in server deployment
  • Envisioned project requirements
  • My understanding of software development teams
  • Outlook

First, the language and framework

Around the end of 2017, I saw Python being highly praised on Zhihu and started learning it with only a partial understanding. Nowadays more and more colleges and universities use it as the introductory programming language for freshmen.

1.1 Python

As an interpreted language, Python carries the label of being inherently slow, but interpretation also involves an intermediate product. Python handles source code in two steps:

  • Step 1: the source code is compiled into bytecode
  • Step 2: the Python virtual machine executes the bytecode (it is interpreted rather than translated ahead of time into machine code)
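To make the intermediate product visible, here is a minimal sketch using the standard dis module. The function add_one is a made-up example, and the exact instructions printed vary between Python versions, so I show no expected output.

```python
import dis


def add_one(x):
    """Hypothetical example function whose bytecode we inspect."""
    return x + 1


# The compiled bytecode is stored on the function's code object as raw bytes.
raw_bytecode = add_one.__code__.co_code

# dis decodes those bytes into named instructions.
instruction_names = [ins.opname for ins in dis.get_instructions(add_one)]

if __name__ == "__main__":
    # Prints a human-readable listing of the bytecode.
    dis.dis(add_one)
```

The listing is what the virtual machine actually executes in step 2.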

If you pay attention, you will notice that when a Python file imports other modules, running it generates a __pycache__ folder. A .pyc file holds the bytecode that the interpreter compiled from a module's source. (In "pyc", "py" stands for Python and "c" for compiled, so a .pyc file is a compiled Python file.) If the source has not been modified, the next run can skip the source-to-bytecode step and load the .pyc file directly. The purpose is to make the program start faster. Python trades away some execution efficiency in exchange for development efficiency.
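The caching step can be reproduced by hand. This is a small sketch, assuming CPython and the standard py_compile module; the module name demo_module and the helper compile_to_pyc are made up for illustration.

```python
import importlib.util
import py_compile
import tempfile
from pathlib import Path


def compile_to_pyc(source_text):
    """Write source_text to a temporary module and compile it to a .pyc file."""
    workdir = Path(tempfile.mkdtemp())
    source_path = workdir / "demo_module.py"  # hypothetical module name
    source_path.write_text(source_text, encoding="utf-8")
    # py_compile.compile returns the path of the bytecode file it wrote.
    pyc_path = Path(py_compile.compile(str(source_path)))
    # cache_from_source reports where the import system looks for that file.
    expected_path = Path(importlib.util.cache_from_source(str(source_path)))
    return pyc_path, expected_path


pyc_path, expected_path = compile_to_pyc("GREETING = 'hello'\n")
print(pyc_path.parent.name)  # the cache directory, normally __pycache__
```

On an import, the interpreter does the same thing automatically and reuses the cached file as long as the source has not changed.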

Python is a dynamically typed language, so when you read only part of the code it is hard to tell what a variable or parameter means or what type it is. Comments and naming conventions are therefore necessary. And when you actually run the code to see what a variable holds, an exception or some other condition may mean it no longer holds the original data type. That is why so many Python libraries use isinstance and type checks to make sure a variable holds the data structure they expect.
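A minimal sketch of that kind of defensive check. The function average_score and its rules are my own example, not taken from any particular library.

```python
def average_score(scores):
    """Return the mean of a non-empty list of numeric scores."""
    if not isinstance(scores, list):
        raise TypeError(f"scores must be a list, got {type(scores).__name__}")
    if not scores:
        raise ValueError("scores must not be empty")
    for value in scores:
        # bool is a subclass of int, so it has to be excluded explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"each score must be a number, got {value!r}")
    return sum(scores) / len(scores)
```

Calling average_score("80,90") fails immediately with a clear message instead of failing somewhere deeper in the program.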

This highlights the importance of coding standards, documentation, and comments.
Software = program + data + documentation = data structures + algorithms + data + documentation

The next question: a variable can point to different objects over time (in Python everything is an object, as "Fluent Python" says). When a variable is rebound from one object to another, the original object may become useless, and its memory should then be freed. To solve this, Python uses reference counting as its default garbage collection mechanism. Each object maintains a field recording how many references point to it; once the count drops to 0, the object is reclaimed immediately and its memory is released. But maintaining this field costs extra memory per object, and it cannot handle circular references, so Python also introduces two further mechanisms: mark-and-sweep and generational collection.
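Both behaviors can be observed from Python itself. This sketch assumes CPython (other implementations manage memory differently); the class Node and the two helper functions are made up for the demonstration.

```python
import gc
import sys
import weakref


class Node:
    """A tiny object that can point at another Node."""

    def __init__(self):
        self.other = None


def refcount_grows_with_new_name():
    """Return how much the reference count changes when a second name is bound."""
    target = Node()
    before = sys.getrefcount(target)
    alias = target  # a second name now refers to the same object
    after = sys.getrefcount(target)
    return after - before


def cycle_needs_collector():
    """Show that a reference cycle survives until the cycle collector runs."""
    gc.disable()  # keep the automatic collector out of the way for the demo
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a  # circular reference
        probe = weakref.ref(a)  # lets us observe a without keeping it alive
        del a, b  # no names are left, but the counts never reach zero
        alive_before = probe() is not None
        gc.collect()  # the cycle detector finds and frees the pair
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        gc.enable()
```

The first function shows plain reference counting; the second shows why counting alone is not enough for cycles.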

If you want to dig deeper, "Fluent Python" can definitely open new doors to Python.

1.2 Coding Standards

PEP 8 recommends four spaces rather than tabs, because different editors may define the width of a tab differently. Code that works when written locally may still break when someone else opens it in another editor. After all, Python has no braces to delimit code blocks and relies on indentation far more than other languages do. Even so, I still have not shaken the habit of using only tabs. A bad habit may take a week to form but several times that effort to break. With inconsistent naming habits and no reasonable comments, after a long time I cannot read my own code.
I vaguely remember reading something along these lines:

The ultimate goal of a program is communication. Beyond implementing functionality and serving users, coding standards exist so that a team can communicate: others can understand what you write, and you can understand what others write.

1.3 Django

People keep learning, and languages and frameworks keep evolving to meet demand, so conflicts between new and old versions are inevitable. Software engineering courses talk about backward compatibility, meaning a new version should stay compatible with older ones. That is not always the case: I ran into incompatibilities with Django 2.x when using xadmin as the back-office system. xadmin was originally built on Django 1.x, and its original author no longer maintains it.

The front end and the back end repeatedly validate the same request data. When a form is submitted, for example, the front end validates the input through tag attributes, the back end validates the data by hand once it receives it, and the model layer validates again before the data is matched against the database. Three rounds of validation look a little unnecessary, but without them someone could use a form field to submit strings that turn into executable script code or steal important information from the host (SQL injection and CSRF).
You cannot be too cautious.
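To illustrate the SQL injection half of that warning, here is a small sketch. It uses the standard sqlite3 module as a stand-in for MySQL so that it runs anywhere; the users table and both helper functions are made up. In Django the ORM builds parameterized queries for you, so this is only about what happens underneath.

```python
import sqlite3


def find_user(conn, username):
    """Look up a user with a parameterized query."""
    # The ? placeholder sends the value separately from the SQL text,
    # so quotes inside username cannot change the statement.
    cursor = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cursor.fetchall()


def find_user_unsafe(conn, username):
    """Anti-example: string formatting lets the input rewrite the query."""
    query = f"SELECT id, username FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany("INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)])

# A classic injection payload typed into a form field.
malicious = "nobody' OR '1'='1"
```

With the payload above, find_user returns no rows, while find_user_unsafe returns every user in the table.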

When learning a framework, the official documentation is definitely the best first-hand source.

Second, the database

Just MySQL ......

2.1 Garbled characters

The largest data set I have handled is about a million records: admission scores for colleges and universities over recent years, scraped by a senior student in the lab with a web crawler. The data reached me as two tables. I wanted to merge them, but the computer ran for several hours without finishing and I eventually gave up. Another problem took two days to solve. Tables exported from the senior's MongoDB database were imported into my local MySQL. Both databases default to UTF-8, yet the Chinese data showed up garbled after the import. I also considered that MySQL's utf8 actually stores at most three bytes per character rather than four, but that turned out not to be the cause. The senior later told me that, to check whether the export was complete and accurate, he had opened the exported file in WPS Office to take a look. That was the problem: Chinese office software defaults to GBK or GB2312 for Chinese text. Converting Chinese characters from UTF-8 to GBK or GB2312 causes no problem, but reading GBK or GB2312 data back as UTF-8 is very likely to produce garbled text.
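The effect is easy to reproduce. A minimal sketch, assuming the file was re-saved as GBK and then read as UTF-8; the sample string is my own.

```python
original = "分数"

utf8_bytes = original.encode("utf-8")
gbk_bytes = original.encode("gbk")  # what a GBK-defaulting editor would save

# Decoding with the same codec that encoded the bytes round-trips cleanly.
round_trip = gbk_bytes.decode("gbk")

# Reading GBK bytes as if they were UTF-8 yields replacement characters.
misread = gbk_bytes.decode("utf-8", errors="replace")

print(repr(misread))
```

The bytes themselves are not damaged; the text only becomes garbled when it is decoded with the wrong codec, which is why checking the file's real encoding before the import is the first thing to try.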

2.2 Designing the table structure

The table structures I have designed for small websites are relatively simple; the tables actually in use number a dozen at most. Whether to define relations between tables, whether a field may be null, what its default is, and how to set the primary key are all questions that surface quite clearly during local development and testing. While implementing and testing features I often add and delete duplicate data. Because foreign keys maintain the referential integrity of the database, a delete can raise an error, and I would often just drop the entire database and re-add the data instead of using cascading deletes. That should be regarded as a bad habit, and it points to two issues:

  • On the one hand, the table structure is poorly designed.
  • On the other hand, the code does not handle exceptions well enough.
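Both issues can be addressed in a few lines. The sketch below uses the standard sqlite3 module instead of MySQL so that it is self-contained; the school and score tables and the helper functions are hypothetical. It declares the foreign key with ON DELETE CASCADE and wraps the delete in exception handling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE school (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE score (
        id INTEGER PRIMARY KEY,
        school_id INTEGER NOT NULL REFERENCES school(id) ON DELETE CASCADE,
        year INTEGER NOT NULL,
        min_score INTEGER NOT NULL DEFAULT 0
    )
    """
)
conn.execute("INSERT INTO school (id, name) VALUES (1, 'Example University')")
conn.executemany(
    "INSERT INTO score (school_id, year, min_score) VALUES (?, ?, ?)",
    [(1, 2017, 520), (1, 2018, 531)],
)


def count_scores(conn):
    """Return the number of rows in the score table."""
    return conn.execute("SELECT COUNT(*) FROM score").fetchone()[0]


def delete_school(conn, school_id):
    """Delete a school; report failure instead of crashing on integrity errors."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("DELETE FROM school WHERE id = ?", (school_id,))
        return True
    except sqlite3.IntegrityError:
        return False
```

Deleting the school removes its score rows as well, so there is no need to drop the whole database to get rid of test data.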

By chance, I saw an internal performance appraisal system that a senior student in the lab had built. Because it is used by government bodies and courts, it demands high reliability. The database is MySQL and, as she told me, there are about a dozen tables, but no relations are defined between them: defining relations demands more database design and handling experience and makes the tables harder to manage. I later read that, in Django models, querying related tables through _set is less efficient than a direct filter query. On the other hand, defining relations makes deeper queries much easier; ModelSerializer in Django REST framework, for example, can serialize related data to a given depth. It depends on the specific problem.
So if I were designing a database:

  • Once a table's primary and foreign keys are designed, try not to change them;
  • Do not add foreign keys lightly. The primary key can be left as the default, but whether a field may be null and what default value it takes need careful consideration.

Third, the server

Last year, when a management system I wrote was about to go online, my teacher gave me the account and password for a CentOS server. It was my first contact with an operating system without a graphical user interface, and I was completely lost. I followed online tutorials, and after changing a configuration file I, er, could no longer connect remotely. Embarrassing.
To fill the gap I bought a year of Alibaba Cloud server time and learned a little Linux. The book I read was "Learn Linux with Brother Bird", a thick volume of which I read only a small part. Many commands I read about, tried once, and then forgot; I remember only some simple ones. As for principles, I remember roles and permissions and how hard and soft links work.

Deploying with CentOS + Django + Nginx raised many problems.

3.1 Styles do not display

After deploying a Django project to Alibaba Cloud (with Nginx as the web server), the admin interface displayed without styles. Inspecting the CSS requests in the browser showed that, although the project had a static directory, it did not contain the corresponding CSS and JS files. The fix is to collect the static files into the project path. Django manages static files differently in local development and in production, so for an online deployment you need to specify a static file path.
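A sketch of the relevant part of settings.py. STATIC_URL and STATIC_ROOT are real Django settings; the project path and the directory name collected_static are assumptions for illustration.

```python
from pathlib import Path

# Hypothetical project location on the server; adjust to the real path.
BASE_DIR = Path("/srv/myproject")

# URL prefix under which the browser requests static files.
STATIC_URL = "/static/"

# Directory that `python manage.py collectstatic` copies all static files into,
# including the admin's CSS and JS.
STATIC_ROOT = BASE_DIR / "collected_static"
```

After running python manage.py collectstatic, Nginx needs a location for /static/ that serves files from the STATIC_ROOT directory.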

3.2 Code changes do not take effect

After deploying the Django project to Alibaba Cloud, I modified the source code through the BT (Pagoda) control panel or directly in a terminal, but the changes had no effect even after restarting Nginx and uWSGI. In the local environment you simply edit the code and restart runserver to see the effect, which is convenient, but on the server it became a problem.

  • On the one hand: if you edit source code directly on the server, Python's strict indentation rules make it easy to introduce an error.
  • On the other hand: locally the server is run with python manage.py runserver, which is suitable only for a test environment. In production, every time you modify the source code you need to restart the services.

After modifying the source code and uploading it to the server, enter the project directory on the server, list the uWSGI processes,
kill all of them, restart uWSGI, and then restart Nginx through the panel or directly from the command line. The changes then take effect.

Fourth, envisioned requirements

There are many features I wanted to build and much I wanted to learn (knowledge is power), but they can only be shelved for now.

  • How to restrict access by anonymous users
  • How to build rights management myself
  • What to do about high concurrency
  • Optimization
  • Data processing
  • Combining with new technology
  • ......

4.1 Rights Management

Although django-admin and xadmin already handle back-office management well, what if you build it yourself? To create a superuser for a Django project, go to the project directory in a terminal and run
python manage.py createsuperuser
and then enter the administrator's username, email, and password. The superuser has all permissions, including adding ordinary administrators and ordinary users. In the user table that Django generates by default, the fields is_staff and is_superuser identify the user's role; each is 0 or 1 and indicates whether the user is an ordinary administrator or a superuser, respectively. xadmin fills in some things that admin lacks: it keeps a separate table that logs back-office user operations and records the IP address of the administrator performing them, which improves security to some extent.

4.2 Access Restrictions

Restricting access by IP address, especially for anonymous users, has a dedicated name in Django REST framework: throttling. Typically, when a user requests a page or clicks a link many times, the first few requests go through, but after several in a row access is restricted and returns to normal only after a certain period. To restrict anonymous users you must work out which requests come from which anonymous user, and the IP address serves as the unique identifier. When a user sends a first request, an entry can be created in a cached dictionary whose key uniquely identifies the user and whose value records the access times. Each request adds its time under that key. With a maximum number of visits and a time window configured, the current time is compared with the earlier access times to decide whether to allow the request.
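The idea fits in a few lines of plain Python. This is my own simplified sketch of the mechanism described above, not Django REST framework's implementation; the class name SimpleThrottle and its parameters are made up, and an in-process dictionary stands in for the cache.

```python
import time


class SimpleThrottle:
    """Allow at most max_requests per IP within a sliding time window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.history = {}  # ip address -> list of recent request timestamps

    def allow(self, ip, now=None):
        """Return True if the request may proceed, False if it is throttled."""
        now = time.time() if now is None else now
        # Keep only the timestamps that still fall inside the window.
        recent = [t for t in self.history.get(ip, []) if now - t < self.window_seconds]
        if len(recent) >= self.max_requests:
            self.history[ip] = recent
            return False
        recent.append(now)
        self.history[ip] = recent
        return True
```

For example, SimpleThrottle(3, 60) lets an IP through three times within a minute, rejects the fourth request, and lets it through again once the older requests have aged out of the window. A real deployment would keep the history in a shared cache rather than in one process's memory.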

4.3 Concurrency

This seems the most remote from me, but out of curiosity I like to simulate cases I have never encountered. Take the school's Office of Academic Affairs site: suppose it gets 20,000+ hits per day, spread over the 20 hours from 6 a.m. to 2 a.m. the next day. Then, following the 80/20 rule, assume the 20,000+ hits are concentrated in 4 hours. That is still fewer than 1.5 requests per second on average. I gave up.
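The back-of-the-envelope figure can be checked directly, using the round number 20,000 for the daily hits:

```python
hits_per_day = 20_000
peak_hours = 4  # 20% of the 20 active hours

# Average request rate if every hit lands inside the peak window.
requests_per_second = hits_per_day / (peak_hours * 3600)

print(round(requests_per_second, 2))  # 1.39
```

Even under this pessimistic assumption the load stays well within what a single small server can handle.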

4.4 Optimization

Simple optimization can be done in many ways.
I remember that when debugging back-end code, refreshing the browser sometimes showed no change. The cause was the browser's page cache, and disabling the cache in the developer console fixed it. When modifying front-end pages, it is often stressed that CSS stylesheets belong in the head while the JS files for dynamic effects go at the bottom of the page. That way, under network latency, the user first gets a simply styled page and the dynamic content loads gradually, which minimizes user dissatisfaction. Browsers also cap the number of simultaneous requests, so when a page includes multiple CSS and JS files the browser requests them in batches. Reducing the number of static files is therefore another approach: short JS statements can be embedded directly in the HTML rather than placed in a new JS file.

  • Minimize connections to the database
  • Add Django's caching mechanism
  • Database indexes
  • ......
  • All of these I have yet to try
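As a first experiment with the database index item, here is a sketch that again uses sqlite3 in place of MySQL. The score table, its contents, and the index name idx_score_school are made up. It compares the query plan before and after adding an index; the plan's exact wording differs between SQLite versions, so I only look for the index name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE score (id INTEGER PRIMARY KEY, school TEXT, year INTEGER, min_score INTEGER)"
)
conn.executemany(
    "INSERT INTO score (school, year, min_score) VALUES (?, ?, ?)",
    [(f"School {i % 100}", 2015 + i % 4, 400 + i % 200) for i in range(1000)],
)


def query_plan(conn):
    """Return SQLite's plan for a lookup by school as one string."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT min_score FROM score WHERE school = ?",
        ("School 42",),
    ).fetchall()
    # The last column of each row holds the human-readable description.
    return " ".join(row[-1] for row in rows)


plan_before = query_plan(conn)  # a full scan of the table
conn.execute("CREATE INDEX idx_score_school ON score (school)")
plan_after = query_plan(conn)  # a search that uses the new index

print(plan_before)
print(plan_after)
```

MySQL offers the same kind of check through its own EXPLAIN statement.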

4.5 Data Processing

In a large supermarket in the United States, diapers and beer were placed together for sale. This may seem strange, but sales of both increased. The case is said to be real, and the behavior pattern behind it is this: young fathers often stop at the supermarket after work to buy diapers for the baby, and some of them also pick up beer for themselves. Mining customers' buying habits for the pattern "customers who buy one product are very likely to buy another" (one event leading to another) yields what are known as association rules. If a data warehouse has accumulated a large volume of customer transaction data, data mining can extract association rules from it to help retailers develop marketing strategies, design pricing, plan promotions, arrange product placement, and segment customers by buying pattern. I wonder whether the Apriori algorithm could serve as the product recommendation algorithm for an e-commerce project. The Internet is now in a boom of data dividends.
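The two basic measures behind association rules are support and confidence. The sketch below computes them for a handful of invented baskets; it is only the measuring step, not the full Apriori algorithm, which additionally searches for all frequent itemsets.

```python
# Invented sample data: each set is one customer's basket.
transactions = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "bread"},
    {"diapers", "beer", "bread"},
    {"milk", "bread"},
]


def support(itemset, transactions):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Among baskets containing antecedent, the fraction also containing consequent.

    Assumes the antecedent appears in at least one basket.
    """
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


rule_support = support({"diapers", "beer"}, transactions)
rule_confidence = confidence({"diapers"}, {"beer"}, transactions)
```

In this sample, three of the five baskets contain both items, and three of the four diaper baskets also contain beer, so the rule "diapers -> beer" has a support of about 0.6 and a confidence of about 0.75.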
The beer-and-diapers story has a follow-up. Supermarkets place small snacks, chewing gum, and candy next to the checkout, so that when a customer's total leaves a small amount of change, there is a good chance the customer picks up some candy instead of taking the change. But now? Mobile payment has removed the hassle of making change, and the goods at the checkout are gradually being ignored.

4.6 New Technology

Data mining is used in shopping malls. Could the now very popular image recognition be applied to school management?
A typical campus management system covers student management, dormitory management, scholarship evaluation, and so on. If scholarships and financial aid are assessed by people in the traditional way, some unfairness is inevitable because human factors get mixed in. Suppose the management system were combined with image recognition: cameras at dormitory doors and in classrooms could check attendance as a reference coefficient, and students' campus card balances and monthly spending could be monitored as a reference for financial aid. The assessment could then be more reasonable, fair, and just.

Last summer a teacher in the lab worked with a pig farm on behavior detection and sound recognition.

These are just bold visions; I hope to move closer to this area as a graduate student.

Fifth, teamwork

5.1 Communication

I remember it was late September to early October last year, when I was helping my teacher build a small management system. It was my first real project and I was nervous. To finish the work I did not go home for the Mid-Autumn Festival or National Day. Still, I kept looking for time to take a break during the holiday and sometimes did not go to the lab. When the teacher did not see me at my seat he would call me over, and basically every two or three days he would talk to me and ask how the work was going. It felt as if he was really pushing me, and at first I was a little resistant.
Actually the problem was mine. I used not to like communicating with the teacher, or rather I had little sense of the need to communicate. I thought I could just do the work, so I focused on myself and tried to solve each day's problems alone. I felt the teacher was busy, so I did not talk to him. As a result the teacher, as the project's manager, did not know my schedule or what I was doing each day.
After learning about software project management, all of this made sense from a higher vantage point. If a project manager does not keep good control of progress, users panic and the whole project can be delayed or even fail. The teacher was right to do so.
And I think that in the future, as a newcomer on a team or as an intern, reporting your work and progress to your direct supervisor every week is very necessary. At the very least it leaves the good impression that you are always learning rather than just coasting.

If technical knowledge is hard power, the ability to communicate is soft power that can be reused in every position.

5.2 Requirements problems

Planning, analysis, design, coding, testing, operation and maintenance.
I have felt first-hand the problems of the waterfall model. If one person is doing a small project, it at least gives you a reference development process to follow from start to finish. But in the waterfall model the output of each stage is the basis of the next, so the dependence between stages is too strong: a problem in one stage can bring down the whole process. Even though each stage has feedback, the model does not suit projects whose requirements change frequently. Also, at the beginning of a project there is no guarantee that users can express their needs clearly and completely, or that the needs will not change substantially, and users only see the actual running software at the delivery stage.

5.3 Full stack

The blogger is now a junior, and many classmates are busy and anxious about internships in the second half of the year. Many training companies are fishing for students, with courses such as full-stack Python development and Java enterprise development. But why is Python the one called full-stack? I used to think that Python on the back end was lightweight compared with Java and that only small companies would use Python + Django as their back-end stack, but I was wrong. A few days ago an acquaintance on cnblogs said he had received an internship offer from Tencent (the "goose factory") doing Python web work with Vue.js and Django REST framework. I had not known about front-end/back-end separation, the model in which the front end and the back end are coded in parallel.

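The first time I heard the term "full stack" was over a meal with my teacher. My first impression was that a full-stack engineer is one person who can stand in for a whole team.
There is also a concrete definition: an engineer who can handle all the work of databases, servers, systems engineering, and clients.
Is that really so? For an undergraduate it may be a bit too hard.
Finer division of roles within a team, with everyone minding their own job, means that a person's value is squeezed and depreciates. Appreciation depends more on self-motivation.
Software engineering values the portability and reusability of code. Map code onto the individuals in a software team: if a back-end developer can, when necessary, help with front-end development or even maintain servers, their competitive advantage clearly stands out.
So a full-stack engineer is someone who can deliver a product independently.
I have read a book about full stack, 《Web全栈工程师的自我修养》 (The Self-Cultivation of a Web Full-Stack Engineer). Its author, Yu Guo, graduated from Xidian University and was a senior UI engineer at Tencent. The book is not thick, but every chapter is excellent, and I must make time to read the book lists recommended in each chapter.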

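Sixth, supplements

I still remember volunteering at the start of term in early March last year to join my teacher's lab, and leaving the lab in early March this year to prepare for the graduate school entrance exam. I feel that last year was the most meaningful year for me. I was lucky to work alongside seniors and teachers far more capable than I am.
It is now nearly June. For about three months I have faced tormenting advanced mathematics and the long, difficult sentences and reading passages of English every day, yet I often miss the feeling of writing code in the lab. The features I wanted were stacked up piece by piece like building blocks. Problems came up and the blocks sometimes fell, but the satisfaction of analyzing a problem until it is finally solved is probably beyond what winning a game can give, and the experience left behind helps me avoid repeating the mistake. Also, I love the feeling of putting on headphones and typing code as if the whole world were mine.

When I first started preparing for the exam I was restless and did not know where to start. A good friend told me to consider defeat before considering victory. Around the same time, a senior narrowly missed her preferred school because of a bit of bad luck in the second-round interview. For a while I dreamed night after night that I had been eliminated in the interview, and subconsciously I kept thinking about what to do, but the answer was always unknown. I do not know what I would do if this last stand fails. Had I stayed in the lab, I would now be grinding through algorithm and interview questions to prepare for internship interviews.

I do not know whether I am suited to graduate study, but right now I really need those three years to read the books I want to read and learn the knowledge and technology I want to learn. The status of a graduate student can also help me reach circles I cannot reach as an undergraduate and meet more capable people.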


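The following is what an expert who did both his bachelor's and master's degrees at HUST said to me when I was a clueless freshman. It still leaves a deep impression on me.

Read more classic books and do more hands-on work. Everyone has their own learning methods and research directions; there is no one-size-fits-all content.

Blogs and video tutorials are personal summaries and will inevitably have gaps and lapses in logic, whereas books that have been revised and reprinted many times basically do not have such problems. You really should read more books, after all

The above is only my limited view. If anything is biased or inappropriate, please bear with me and point it out. After all, the blogger is still a junior: too young, too naive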

Origin www.cnblogs.com/welan/p/10926645.html