Python's most difficult problem

The following is an article about Python that I read recently and found worthwhile, so I am sharing it here.

Unresolved Issues

Every field has one: a problem that has been written off as too difficult and too time-consuming. Merely attempting to solve it raises eyebrows. The community at large tried long ago and moved on, and now only developers on the fringe still work on it. Newcomers take it up mainly because the problem is hard and because solving it would bring considerable recognition. The open question in computer science of whether P = NP is such a problem. An affirmative answer, accompanied by a practical polynomial-time algorithm, could quite literally change the world. Python's most difficult problem is easier than proving P = NP, but it still has no satisfactory solution, and a practical solution would be similarly transformative. That is why it is easy to see why so many people in the Python community focus on the question: "What can be done about the Global Interpreter Lock?"

 


Python under the hood

To understand the GIL and what it implies, we need to start with the fundamentals of Python. Languages such as C++ are compiled languages: the program is fed to a compiler, which parses it according to the grammar of the language, translates it into a language-independent intermediate representation, and finally links it into an executable made of highly optimized machine code. The compiler can optimize the code so deeply because it sees the whole program (or a large, self-contained part of it) at once. That lets it reason about how different language instructions interact and so apply more effective optimizations.
Python, by contrast, is an interpreted language. The program is fed to an interpreter in order to run. The interpreter knows nothing about the program before executing it; it knows only the rules of Python and how to apply those rules dynamically during execution. It has optimizations too, but they are of a rather different kind. Because the interpreter cannot reason well about the program itself, most optimizations are in fact optimizations of the Python interpreter. A faster interpreter naturally means programs run faster "for free". In other words, when the interpreter is optimized, Python programs enjoy the benefit without any changes.
This point is important, so let us emphasize it again. All other things being equal, the execution speed of a Python program is tied directly to the "speed" of the interpreter. No matter how much you optimize your program, its execution speed still depends on how efficiently the interpreter can execute it. This explains why so much work goes into optimizing the Python interpreter. For Python programmers, it is probably the closest thing to a free lunch.
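To make "the interpreter applies Python's rules dynamically" a little more concrete, here is a minimal sketch that lists the bytecode instructions CPython generates for a tiny function, using the standard `dis` module. The function `add_one` is a hypothetical example, not something from the article, and the instruction names differ between CPython versions, so none are assumed.

```python
import dis


def add_one(x):
    # Hypothetical example function; any small function would do.
    return x + 1


# Collect the names of the bytecode instructions CPython generated for add_one.
# These are what the interpreter loop executes one at a time.
opnames = [instruction.opname for instruction in dis.get_instructions(add_one)]

for name in opnames:
    print(name)
```

Even this one-line function becomes several interpreter instructions, and each of them may stand for a good deal of machine-level work.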

The free lunch is over

Or is it? A whole generation of programmers learned to code while Moore's Law delivered hardware speedups on a predictable schedule. If someone wrote slow code, the simplest fix was often to wait for a faster processor. Moore's Law still holds, and will probably continue to hold for a long time, but the way it holds has changed fundamentally. Clock frequencies are no longer climbing to dizzying speeds; instead, the increase in transistor density is spent on multiple cores. Programs that want to take full advantage of new processors must be rewritten to run concurrently.
When most developers hear "concurrency", they immediately think of multithreaded programs. For now, using multiple threads of execution is the most common way to take advantage of multi-core systems. Multithreaded programming is much harder than sequential programming, yet a careful programmer can still exploit the parallelizable parts of the code. The choice of language should hardly matter here, since almost every widely used modern programming language supports multithreaded programming.

Unexpected facts

Now we come to the crux of the problem. To take advantage of multi-core systems, Python must support multiple threads of execution. Because Python is an interpreted language, its interpreter must do this in a way that is both safe and efficient. We all know the problems multithreaded programming can cause. The interpreter must take care not to operate on internally shared data from different threads. At the same time, it must manage user threads so that as much computation as possible is always being done.
So what mechanism protects data when different threads access it at the same time? The answer is the Global Interpreter Lock. The name tells us a lot: it is a lock (a mutex or something similar) that is global from the interpreter's point of view. This is certainly safe, but it has a hidden implication that Python beginners need to know about: in any Python program, no matter how many processors are available, only one thread is ever executing at any given time.

Many people discover this fact by accident. Online discussion groups and message boards are flooded with questions from Python beginners and experts alike: "Why does my new multithreaded Python program run slower than it did with only one thread?" Many people feel confused even asking, because surely a program with two threads should be faster than the same program with one (assuming the work really can be parallelized). In fact, the question is asked so often that Python experts have crafted a standard answer: "Do not use multiple threads. Use multiple processes." But that answer is even more confusing than the question. Can I not use multithreading in Python? How bad can multithreading be in a language as popular as Python, if even the experts advise against it? Have I really missed something?
Unfortunately, nothing has been missed. Because of the way the Python interpreter is designed, using multiple threads to improve performance is at best a difficult task. At worst, it will reduce the speed of your program, sometimes significantly. Any first-year computer science student can tell you what happens when multiple threads compete for a single shared resource. The results are usually not pretty. That said, multithreading does work well in many cases, and it is perhaps a credit to the interpreter implementers and core developers that there are not more complaints about the performance of multithreaded Python.
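The situation described above can be observed with a minimal sketch along the following lines. It times a CPU-bound loop run twice in sequence and then in two threads. The function names and the loop size are illustrative assumptions, and the timings depend on the machine and the interpreter build, so no particular result is claimed.

```python
import threading
import time


def count_down(n):
    # Pure-Python, CPU-bound loop: it never waits on I/O, so the thread
    # running it needs the interpreter for the whole duration.
    while n > 0:
        n -= 1


def time_sequential(n, repeats=2):
    # Run the loop `repeats` times, one after another, in the calling thread.
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def time_threaded(n, repeats=2):
    # Run the same amount of work, but with one thread per loop.
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(repeats)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000  # arbitrary size chosen for illustration
    print(f"sequential: {time_sequential(n):.3f} s")
    print(f"threaded:   {time_threaded(n):.3f} s")
```

On an interpreter with a GIL, the threaded run would not be expected to finish in half the time, because only one of the two threads can execute Python code at any moment.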

So what now? Panic?

So, what can be done? Has the problem been solved? Should we, as Python developers, give up on the idea of using multiple threads to exploit parallelism? Why does the GIL need to guarantee that only one thread runs at a time anyway? Couldn't finer-grained locks be added to protect multiple independent objects from simultaneous access? And why has no one tried anything like that before?
These practical questions have a very interesting answer. The GIL protects access to things such as the current thread state and the heap-allocated objects used for garbage collection. There is nothing special about the Python language, however, that requires a GIL. It is an artifact of this particular implementation. There are other Python interpreters (and compilers) that do not use a GIL. For CPython, however, the GIL has been there almost since the beginning.
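As a small illustration of the kind of per-object state that needs protecting, here is a sketch built on the fact that CPython tracks object lifetimes with reference counts, which `sys.getrefcount` exposes. The helper name `refcount_growth` is hypothetical, and the absolute counts reported by `sys.getrefcount` vary between versions, so only the difference is examined.

```python
import sys


def refcount_growth():
    obj = []                        # a fresh heap-allocated object
    before = sys.getrefcount(obj)
    holder = [obj]                  # the new list stores one more reference to obj
    after = sys.getrefcount(obj)    # holder is still alive at this point
    return after - before


print(refcount_growth())
```

Counters like this one are updated constantly while a program runs, which is part of why guarding each object with its own lock is costly.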
So why not get rid of the GIL? Many people may not know that in 1999, for Python 1.5, an often-cited but poorly understood "free threading" patch by Greg Stein tried to do exactly that. In this patch, the GIL was removed completely and replaced with fine-grained locks. Removing the GIL, however, came at a cost to the execution speed of single-threaded programs: single-threaded execution was about 40% slower. Two threads showed an increase in speed, but beyond that the gains did not grow linearly with the number of cores. Because of the slower execution, the patch was rejected and largely forgotten.

Removing the GIL is hard, let's go shopping!

(Translator's note: "X is hard, let's go shopping!" is an English catchphrase, much like Chinese internet memes. It implies that something is so hard to accomplish that we might as well go and find a third-party product to replace it.)
The "free threading" patch is instructive, however, because it demonstrated a fundamental point about the Python interpreter: the GIL is very hard to remove. Since the time the patch was released, the interpreter has come to depend on even more global state, which makes removing today's GIL harder still. It is worth mentioning that this is precisely why many people become interested in trying to remove the GIL in the first place. Hard problems are fun.
But this may all be a bit misguided. Consider this: suppose we had a magic patch that removed the GIL with no loss of performance for single-threaded Python code. What would happen then? We would get what we always wanted: a threading API that can make use of all the processors at once. So now that we have our wish, is it really a good thing?

Thread-based programming is undoubtedly difficult. Whenever someone thinks they understand everything about how threads work, some new problem quietly appears. Because it is simply too difficult to get right with any reasonable consistency, a number of well-respected language designers and researchers have come out against the threading model. As anyone who has written a multithreaded application can tell you, both developing and debugging one is many times harder than for a single-threaded application. The sequential way programmers tend to think simply does not match the parallel execution model. The GIL, then, has unintentionally protected developers from themselves. Synchronization primitives are still required when using multiple threads, but the GIL does help preserve data consistency between threads.
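A minimal sketch of the point that synchronization primitives are still needed: several threads increment a shared counter, and a `threading.Lock` keeps each read-modify-write step from being interleaved with another thread's. The class `SafeCounter` and the function `run_workers` are hypothetical names chosen for illustration.

```python
import threading


class SafeCounter:
    """A counter whose increments are guarded by an explicit lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # `self.value += 1` is a read, an add, and a write. The lock ensures
        # no other thread runs between those steps.
        with self._lock:
            self.value += 1


def run_workers(num_threads=4, increments_per_thread=10_000):
    counter = SafeCounter()

    def work():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


print(run_workers())  # 4 threads x 10_000 increments each
```

With the lock in place the final value equals the number of threads times the increments per thread. Without it, that result is not guaranteed, GIL or no GIL.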
So it seems that Python's most discussed question may be asking the wrong thing. There is a very good reason why experts recommend multiple processes instead of multiple threads in Python, and it is not to hide shortcomings in Python's thread implementation. Rather, it is to encourage developers to use a safer and more straightforward concurrency model, and to reserve multithreaded development for when it is really necessary. For most people, it is not clear what the best model for parallel programming is. What we do know is that multithreading is probably not it.
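The multi-process recommendation can be sketched with the standard `concurrent.futures.ProcessPoolExecutor`. The functions `sum_of_squares` and `parallel_sums`, the inputs, and the worker count are assumptions made for illustration, not something the article prescribes.

```python
from concurrent.futures import ProcessPoolExecutor


def sum_of_squares(n):
    # CPU-bound work executed inside a worker process. Each process has its
    # own interpreter, so the workers do not contend for a single GIL.
    return sum(i * i for i in range(n))


def parallel_sums(inputs, max_workers=2):
    # Distribute the inputs across a pool of worker processes and collect
    # the results in input order.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(sum_of_squares, inputs))


if __name__ == "__main__":
    # The guard matters: worker processes may re-import this module.
    print(parallel_sums([10, 100, 1000]))
```

The processes share no memory by default. Arguments and results are pickled and passed between them, which is part of what makes this model more straightforward to reason about than shared-state threads.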

As for the GIL, do not think that it just sits there, static and unanalyzed. Antoine Pitrou implemented a new GIL in Python 3.2, with some positive results. This was the first major change to the GIL since 1992. The change is too large to explain fully here, but at a high level, the old GIL counted Python instructions to decide when to give up the lock. As it turns out, a single Python instruction can represent a large amount of work, because instructions do not translate 1:1 into machine instructions. The new GIL instead uses a fixed timeout to tell the current thread to give up the lock. When the current thread holds the lock and a second thread requests it, the current thread is forced to release the lock after 5 ms (that is, every 5 ms the current thread checks whether it needs to release the lock). When work is available, this makes switching between threads more predictable.
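In CPython 3.2 and later, this timeout is exposed as the "switch interval" through `sys.getswitchinterval()` and `sys.setswitchinterval()`. The following minimal sketch reads it, changes it briefly, and restores it. The value 0.001 is an arbitrary choice for illustration, not a recommendation.

```python
import sys

# Read the interval (in seconds) after which a running thread is asked to
# give up the interpreter when another thread is waiting for it.
original = sys.getswitchinterval()
print(f"current switch interval: {original} s")

# Temporarily request more frequent switching (arbitrary illustrative value).
sys.setswitchinterval(0.001)
shortened = sys.getswitchinterval()
print(f"shortened switch interval: {shortened} s")

# Restore the previous setting so the rest of the program is unaffected.
sys.setswitchinterval(original)
```

A shorter interval makes threads take turns more often at the price of more switching overhead. It does not let two threads run Python code at the same time.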
However, it is not a perfect change. Perhaps the most active researcher into how the GIL affects various kinds of workloads is David Beazley. Besides producing what is probably the most thorough study of the GIL before Python 3.2, he has also studied the new GIL implementation and found a number of interesting program profiles for which even the new GIL performs quite poorly. He continues to lead and advance the discussion of the GIL through practical research and published experimental results.

Regardless of how one personally feels about Python's GIL, it remains the language's most difficult technical challenge. Understanding its implications requires a thorough understanding of operating system design, multithreaded programming, C, interpreter design, and the implementation of the CPython interpreter. Those prerequisites alone keep many developers from studying the GIL more thoroughly. Even so, there is no sign that the GIL will be leaving us any time soon. For now, it will continue to confuse and surprise those new to Python, while intriguing those interested in solving very difficult technical problems.
The above is based on my research to date into the Python interpreter. There are other aspects of the interpreter I would also like to write about, but none is better known than the Global Interpreter Lock (GIL). The technical details were researched against the CPython source, but I expect that some of the content here is still inaccurate. If you find an inaccuracy, please let me know so that I can correct it as soon as possible.


 


Origin www.cnblogs.com/heimaguangzhou/p/11696407.html